AI RESEARCH

Bridging the Pose-Semantic Gap: A Cascade Framework for Text-Based Person Anomaly Search

arXiv cs.CV

arXiv:2604.23282v1

Text-based person anomaly search retrieves specific behavioral events from surveillance archives using natural-language queries. Although recent pose-aware methods align geometric structures well, they face a fundamental Pose-Semantic Gap: semantically different actions can share similar skeletal geometries. Multimodal Large Language Models (MLLMs) can reduce this ambiguity, but using them for large-scale retrieval is computationally prohibitive.