AI RESEARCH

TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

arXiv cs.LG

arXiv:2605.16361v1. We present TailedTS, a large-scale benchmark dataset derived from hourly Wikipedia page view observations throughout 2024, designed specifically to test time series forecasting models under heavy-tailed, zero-inflated, and non-Gaussian conditions. The dataset comprises approximately 24.69B data points spanning roughly 3M unique Wikipedia pages per month, stored in high-efficiency Apache Parquet format.
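
To make "heavy-tailed" and "zero-inflated" concrete, the sketch below measures both properties on a synthetic series. It does not use TailedTS data: the series is simulated, the function names `zero_fraction` and `hill_tail_index` are hypothetical, and the Hill estimator is a standard tail-index estimator chosen here for illustration, not necessarily the method the authors use.

```python
import numpy as np


def zero_fraction(x):
    """Share of observations that are exactly zero (a simple zero-inflation measure)."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x == 0))


def hill_tail_index(x, k):
    """Hill estimate of the tail index alpha from the k largest positive values.

    Smaller alpha means a heavier tail; for alpha < 2 the variance is infinite.
    """
    x = np.asarray(x, dtype=float)
    pos = np.sort(x[x > 0])
    if not 0 < k < pos.size:
        raise ValueError("k must satisfy 0 < k < number of positive observations")
    threshold = pos[-k - 1]  # the (k+1)-th largest value
    top = pos[-k:]           # the k largest values
    return float(k / np.sum(np.log(top / threshold)))


# Synthetic stand-in for hourly view counts (assumption: NOT real TailedTS data).
# Values are continuous Pareto draws with alpha = 1.5 and minimum 1,
# then about 60% of the hours are set to zero.
rng = np.random.default_rng(0)
n = 100_000
views = rng.pareto(1.5, size=n) + 1.0
views[rng.random(n) < 0.6] = 0.0

zf = zero_fraction(views)
alpha_hat = hill_tail_index(views, k=1000)
print(f"zero fraction: {zf:.3f}, Hill tail index: {alpha_hat:.2f}")
```

The choice of `k` trades bias against variance: a small `k` uses only the most extreme values and gives a noisy estimate, while a large `k` reaches into the body of the distribution, where the power-law approximation may no longer hold.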