AI RESEARCH

Enabling Stroke-Level Structural Analysis of Hieroglyphic Scripts without Language-Specific Priors

arXiv CS.CL

arXiv:2601.05508v2 Announce Type: replace-cross Hieroglyphs, as logographic writing systems, encode rich semantic and cultural information within their internal structural composition. Yet, current advanced Large Language Models (LLMs) and Multimodal LLMs (MLLMs) usually remain structurally blind to this information. LLMs process characters as textual tokens, while MLLMs additionally view them as raw pixel grids. Both fall short of modeling the underlying logic of character strokes. Furthermore, existing structural analysis methods are often script-specific and labor-intensive.