AI RESEARCH

LLM Enhanced Action Recognition via Hierarchical Global-Local Skeleton-Language Model

arXiv cs.CV

arXiv:2603.27103v1

Skeleton-based human action recognition has achieved remarkable progress in recent years. However, most existing GCN-based methods rely on short-range motion topologies, which not only struggle to capture long-range joint dependencies and complex temporal dynamics but also limit cross-modal semantic alignment and understanding due to insufficient modeling of action semantics. To address these challenges, we propose a hierarchical global-local skeleton-language model (HocSLM), enabling the large action model to be representative of action semantics.