AI RESEARCH

Tempus: A Temporally Scalable Resource-Invariant GEMM Streaming Framework for Versal AI Edge

arXiv cs.LG

arXiv:2605.00536v1 Announce Type: cross

Scaling laws for Large Language Models (LLMs) establish that model quality improves with computational scale, yet edge deployment imposes strict constraints on compute, memory, and power. Since General Matrix Multiplication (GEMM) accounts for up to 90% of inference time, efficient GEMM acceleration is critical for edge AI.
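To make the weight of the 90% figure concrete, the sketch below applies Amdahl's law to a workload in which GEMM takes a given share of inference time. It is an illustration only, not part of the Tempus framework: the function name `overall_speedup` and the sample speedup factors are assumptions, and 0.9 is the upper bound quoted above rather than a measured value.

```python
def overall_speedup(gemm_fraction: float, gemm_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only the GEMM share is accelerated.

    gemm_fraction: share of total inference time spent in GEMM (0..1).
    gemm_speedup:  factor by which the GEMM portion alone is accelerated.
    """
    if not 0.0 <= gemm_fraction <= 1.0:
        raise ValueError("gemm_fraction must lie in [0, 1]")
    if gemm_speedup <= 0.0:
        raise ValueError("gemm_speedup must be positive")
    # The non-GEMM share is unchanged; the GEMM share shrinks by gemm_speedup.
    return 1.0 / ((1.0 - gemm_fraction) + gemm_fraction / gemm_speedup)


if __name__ == "__main__":
    # Assumed sample GEMM speedups, with GEMM at the quoted 90% upper bound.
    for s in (2, 10, 100):
        print(f"GEMM x{s}: end-to-end x{overall_speedup(0.9, s):.2f}")
```

With a 90% GEMM share, the end-to-end gain approaches but never reaches 10x, since the remaining 10% of the time is untouched. GEMM is therefore the dominant lever, though not the only one.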