AI SAFETY & ETHICS
My most common research advice: do quick sanity checks
Alignment Forum
•
Written quickly as part of the Inkhaven Residency. At a high level, research feedback I give to junior research collaborators often can fall into one of three categories: Doing quick sanity checks Saying precisely what you want to say Asking why one time In each case, I think the advice can be taken to an extreme I no longer endorse. Accordingly, I’ve tried to spell out the degree to which you should implement the advice, as well as what “taking it too far” might look like. This piece covers doing quick sanity checks, which is the most common advice I give to junior researchers.