AXRP - the AI X-risk Research Podcast


Feb 4, 2023

How good are we at understanding the internal computation of advanced machine learning models, and do we have any hope of getting better? In this episode, Neel Nanda talks about the sub-field of mechanistic interpretability research, as well as papers he's contributed to that explore the basics of transformer...