Mitigating hallucination [P]

Hi everyone. I'm reposting this because my previous post was deleted (I'm not sure why; maybe the writing quality was too low?). I’ve been working on a lightweight way to reduce hallucinations in LLMs without relying on external judges, extra human labels, or heavy preference-learning pipelines. The basic idea is simple: let a frozen base model generate a “bad” counterfactual answer, then train the adapted model to contrast the correct answer against that bad branch, starting only from the first point where the two diverge.
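
To make the "first point where they diverge" part concrete, here is a minimal sketch in plain Python. It assumes both answers are already tokenized into ID lists and that the shared prefix is simply masked out of whatever contrastive loss is used. The function names (`first_divergence`, `suffix_mask`, `masked_logprob`) and the toy token IDs are hypothetical, made up for illustration, and the actual training objective is not shown here.

```python
from typing import List, Sequence


def first_divergence(good: Sequence[int], bad: Sequence[int]) -> int:
    """Index of the first position where the two token sequences differ.

    If one sequence is a prefix of the other, this returns the shorter length.
    """
    for i, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            return i
    return min(len(good), len(bad))


def suffix_mask(length: int, start: int) -> List[int]:
    """1 for positions at or after `start`, 0 for the shared prefix."""
    return [1 if i >= start else 0 for i in range(length)]


def masked_logprob(token_logprobs: Sequence[float], mask: Sequence[int]) -> float:
    """Sum per-token log-probs over the masked-in positions only."""
    return sum(lp * m for lp, m in zip(token_logprobs, mask))


# Toy example (assumed token IDs): the answers share three tokens, then branch.
good_ids = [5, 8, 13, 21, 34]        # correct answer
bad_ids = [5, 8, 13, 99, 100, 101]   # "bad" branch from the frozen base model

k = first_divergence(good_ids, bad_ids)
good_mask = suffix_mask(len(good_ids), k)
bad_mask = suffix_mask(len(bad_ids), k)
```

With masks like these, any contrastive term would only see tokens after the branch point, so the shared prefix contributes nothing to the comparison between the two answers.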