“The Field of AI Alignment: A Postmortem, and What To Do About It” by johnswentworth
Dec 26, 2024
johnswentworth, writing on LessWrong, dissects the current state of AI alignment research. He uses the classic metaphor of searching for keys under a streetlight to illustrate how researchers focus on tractable problems while neglecting the ones that actually matter for existential risk. The post then argues for a recruitment overhaul, advocating for recruiting people with stronger technical skills who can make traction on the hard problems. Overall, it challenges the field's existing paradigms and urges it to confront the real challenges in AI safety.
The metaphor of 'street lighting' illustrates how AI alignment research often prioritizes easier problems over the more complex issues that truly threaten humanity's survival.
A shift in recruitment strategies is needed to attract higher-caliber talent with specialized skills capable of effectively addressing the intricate challenges in AI alignment.
Deep dives
The Street Lighting Phenomenon
The concept of 'street lighting' highlights how AI alignment research often focuses on seemingly easy problems rather than the critical, complex issues that threaten humanity's survival. Researchers tend to search for solutions in areas where data and progress are easily measurable, even while acknowledging that these areas may not correspond to the actual challenges at hand. Most AI safety researchers gravitate toward problems that yield tangible outputs, producing a proliferation of work with little impact on alignment itself. The result is a prioritization of approaches that reinforce the status quo and hinder progress on the deeper problems of AI alignment.
Selection Bias in Research
Selection effects play a major role in shaping the focus of AI alignment researchers, favoring those who pursue easier problems regardless of their relevance to the core challenges. A researcher may find motivation and funding in projects that deliver rapid results but ultimately contribute little to solving significant alignment issues. This creates a cycle: individuals who fail to get traction on the hard problems drift away from them for lack of visible progress, while colleagues working on easier problems receive continuous support. As a result, the field as a whole trends toward superficial issues, diverting attention from the pressing, genuinely difficult areas that demand exploration.
Recruitment and Community Dynamics
The recruitment pipeline for AI alignment work currently emphasizes attracting standard STEM graduates, who may not have the skills needed to tackle complex alignment challenges effectively. The author argues that a higher caliber of intellect and experience, akin to that found among physics postdocs, is necessary to engage meaningfully with the hard problems of AI alignment. He also argues for a different environment: one that separates these skilled individuals from mainstream AI safety discourse, allowing for focused and innovative problem-solving. Dedicated venues for discussion among those working on the core challenges would foster progress and shift the field's priorities toward the critical issues.
A policeman sees a drunk man searching for something under a streetlight and asks what the drunk has lost. He says he lost his keys, and they both look under the streetlight together. After a few minutes the policeman asks if he is sure he lost them here, and the drunk replies no, he lost them in the park. The policeman asks why he is searching here, and the drunk replies, "this is where the light is".
Over the past few years, a major source of my relative optimism on AI has been the hope that the field of alignment would transition from pre-paradigmatic to paradigmatic, and make much more rapid progress.
At this point, that hope is basically dead. There has been some degree of paradigm formation, but the memetic competition has mostly been won by streetlighting: the large majority of AI Safety researchers and activists [...]
---
Outline:
(01:23) What This Post Is And Isn't, And An Apology