TRIP-Evaluate: An Open Multimodal Benchmark for Evaluating Large Models in Transportation

arXiv:2605.00907v1 Announce Type: cross Large language models (LLMs) and multimodal large models (MLLMs) are increasingly used for transportation tasks such as regulation question answering, traffic management, engineering review, and autonomous-driving scene reasoning. Yet transportation workflows are rule-intensive, computation-intensive, safety-critical, and inherently multimodal.