GPT 5.5 scores 1.7% on OpenAI-proof Q&A—an internal benchmark testing performance on real ML problems encountered during the process of research and engineering