AI RESEARCH

Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models

arXiv CS.LG

arXiv:2512.13618v3 Announce Type: replace-cross

Representing continuous time is a critical and under-explored challenge in modeling temporal event sequences with large language models (LLMs). Various strategies, such as byte-level representations and calendar tokens, have been proposed. However, the optimal approach remains unclear, especially given the diverse statistical distributions of real-world event data, which range from smooth log-normal to discrete, spiky patterns.
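
To make the two named strategies concrete, here is a minimal Python sketch of what a byte-level and a calendar-style encoding of time might look like. It is an illustration under stated assumptions, not the paper's implementation: the function names, the token spellings (`<byte_63>`, `<month_03>`), the choice of a big-endian float32 for the inter-event gap, and the minute-level calendar granularity are all hypothetical.

```python
import struct
from datetime import datetime, timezone


def byte_level_tokens(delta_seconds: float) -> list[str]:
    """Encode an inter-event gap as the 4 bytes of a big-endian float32.

    Assumption: one token per byte, giving a fixed 256-symbol vocabulary.
    """
    raw = struct.pack(">f", delta_seconds)
    return [f"<byte_{b}>" for b in raw]


def calendar_tokens(ts: datetime) -> list[str]:
    """Encode an absolute timestamp as one token per calendar field.

    Assumption: minute-level granularity; seconds are dropped.
    """
    return [
        f"<year_{ts.year}>",
        f"<month_{ts.month:02d}>",
        f"<day_{ts.day:02d}>",
        f"<hour_{ts.hour:02d}>",
        f"<minute_{ts.minute:02d}>",
    ]


if __name__ == "__main__":
    # A gap of 1.0 second between two events.
    print(byte_level_tokens(1.0))
    # An event at 2025-03-14 09:05 UTC.
    print(calendar_tokens(datetime(2025, 3, 14, 9, 5, tzinfo=timezone.utc)))
```

In this sketch the byte-level form always emits four tokens whatever the size of the gap, while the calendar form exposes human-readable fields such as hour of day. Which of these suits a given dataset is the question the abstract leaves open.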