RoboLab: A High-Fidelity Simulation Benchmark for Analysis of Task Generalist Policies

arXiv:2604.09860v1 Announce Type: cross The pursuit of general-purpose robotics has yielded impressive foundation models, yet simulation-based benchmarking remains a bottleneck due to rapid performance saturation and a lack of true generalization testing. Existing benchmarks often exhibit significant domain overlap between