TOSSS: a CVE-based Software Security Benchmark for Large Language Models

arXiv:2603.10969v1 Announce Type: new With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software engineers across a wide range of development tasks. As LLMs are increasingly used in software development workflows, a critical question arises: are LLMs good at software security? At the same time, organizations worldwide invest heavily in cybersecurity to reduce exposure to disruptive attacks. The integration of LLMs into software engineering workflows may.