Into AI Safety

Jacob Haimes
Jan 1, 2024 • 13min

MINISODE: Staying Up-to-Date in AI

In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the future.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Tools
- Feedly
- arXiv Sanity Lite
- Zotero
- AlternativeTo

My "Distilled AI" Folder
- AI Explained YouTube channel
- AI Safety newsletter
- Data Machina newsletter
- Import AI
- Midwit Alignment

Honourable Mentions
- AI Alignment Forum
- LessWrong
- Bounded Regret (Jacob Steinhardt's blog)
- Cold Takes (Holden Karnofsky's blog)
- Chris Olah's blog
- Tim Dettmers' blog
- Epoch blog
- Apollo Research blog
Dec 18, 2023 • 1h 11min

INTERVIEW: Applications w/ Alice Rigg

Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the applications process for research/mentorship programs.

Join the Mech Interp Discord server and attend reading groups at 11:00am on Wednesdays (Mountain Time)!

Check out Alice's website.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
- EleutherAI (join the public EleutherAI Discord server)
- Distill
- Effective Altruism (EA)
- MATS Summer 2023 Retrospective post
- Ambitious Mechanistic Interpretability AISC research plan by Alice Rigg
- SPAR
- Stability AI (during their most recent fundraising round, Stability AI had a valuation of $4B, per Bloomberg)
- Mech Interp Discord server
Dec 11, 2023 • 18min

MINISODE: Program Applications (Winter 2024)

We're back after a month-long hiatus with a podcast refactor and advice on the applications process for research/mentorship programs.

Check out the About page on the Into AI Safety website for a summary of the logistics updates.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
- MATS
- ASTRA Fellowship
- ARENA
- AI Safety Camp
- BlueDot Impact
- Tech with Tim
- Fast.AI's Practical Deep Learning for Coders
- Kaggle
- AlignmentJams
- LessWrong
- AI Alignment Forum
Dec 4, 2023 • 10min

MINISODE: EAG Takeaways (Boston 2023)

This episode is a brief overview of the major takeaways I had from attending EAG Boston 2023, and an update on my plans for the podcast moving forward.

TL;DL
- Starting in early December (2023), I will be uploading episodes on a biweekly basis (day TBD).
- I won't be releasing another episode until then, so that I can build up a cache of episodes.
- During this month (November 2023), I'll also try to get the podcast onto more platforms, set up comments on more platforms, and create an anonymous feedback form.

Links
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
- How to generate research proposals
- Karolina Sarek: How to do research that matters
- Will releasing the weights of future large language models grant widespread access to pandemic agents?

Like the show? Think it could be improved? Fill out this anonymous feedback form to let me know!

Please email all inquiries to intoaisafety@gmail.com.
Nov 27, 2023 • 57min

FEEDBACK: AISC Proposal w/ Remmelt Ellen

In this episode I discuss my initial research proposal for the 2024 Winter AI Safety Camp with Remmelt Ellen, one of the individuals who helps facilitate the program.

The proposal is titled "The Effect of Machine Learning on Bioengineered Pandemic Risk." A doc-capsule of the proposal at the time of this recording can be found at this link.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
- MegaSyn: Integrating Generative Molecule Design, Automated Analog Designer and Synthetic Viability Prediction
- Dual use of artificial-intelligence-powered drug discovery
- Artificial intelligence and biological misuse: Differentiating risks of language models and biological design tools
- Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research
- Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
- unRLHF - Efficiently undoing LLM safeguards
Nov 13, 2023 • 10min

MINISODE: Introduction and Motivation

Welcome to the Into AI Safety podcast! In this episode I explain why I am starting this podcast, what I am trying to accomplish with it, and share a little bit of background on how I got here.

Please email all inquiries and suggestions to intoaisafety@gmail.com.
