How Hard a Problem is Alignment? (My Opinionated Answer)

LessWrong AI

Epistemic status: We really need to know. TL;DR: Comparing person-years of effort, I argue that AI Safety seems harder than safety was for steam engines, but probably less hard than the Apollo program or. I discuss why I suspect superalignment might not be super-hard. My has come down over the last half-decade, primarily because of the properties of LLMs and the progress we’ve made in aligning them. I explain why certain previous concerns don’t apply to LLMs, and summarize what I see as the key developments in Alignment. I guesstimate we might be about 10%-20% done.