"I don't know!": Teaching neural networks to abstain with the HALO-Loss. [R]

r/MachineLearning

Current neural networks have a fundamental geometry problem: if you feed them garbage data, they won't admit that they have no clue. They will confidently hallucinate. This happens because the standard Cross-Entropy loss only approaches 0.0 as the logit margins grow without bound, so the model is pushed to move its features "infinitely" far away from the origin, which leaves it with a jagged latent space. It literally leaves the model with no mathematically sound place to throw its trash. I've been working on a "fix" for this, and as a result I just open-sourced the HALO-Loss.
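To make the "infinitely far" point concrete, here is a minimal sketch in plain Python. It is not the HALO-Loss, just ordinary softmax cross-entropy. The 3-class logit vector and the scale factors are made up for illustration, and scaling the logits stands in for pushing features away from the origin (assuming a linear, bias-free classifier head, where the two are proportional).

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for a single example, via a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Hypothetical logits that already rank the target class (index 0) first.
direction = [1.0, 0.0, 0.0]

# Scaling the logits mimics moving the features further from the origin
# (assumption: linear head without bias).
scales = [1, 2, 4, 8, 16]
losses = [cross_entropy([s * z for z in direction], target=0) for s in scales]

for s, loss in zip(scales, losses):
    print(f"scale={s:>2}  loss={loss:.8f}")
```

The prediction is already correct at every scale, yet the loss keeps shrinking as the scale grows and stays above zero throughout. The only way to keep lowering it is to keep growing the norm, which is the pressure described above.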