LessWrong (Curated & Popular)

LessWrong
Oct 3, 2023 • 7min

"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.

Jan Brauner, an AI researcher, discusses the development of a simple lie detector for black-box language models. After a suspected lie, the detector asks a fixed set of unrelated follow-up questions and feeds the model's answers into a logistic regression classifier. It is highly accurate and generalizes across different models and contexts, suggesting that language models show distinctive, consistent patterns of behaviour when they lie.
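A minimal sketch of the recipe as described in the episode, not the authors' code: the follow-up questions, the yes/no feature encoding, and the `ask_model` helper are illustrative assumptions.

```python
# Sketch of the lie-detection recipe described above (illustrative, not the paper's code).
# Assumes a hypothetical ask_model(conversation, question) -> str helper that returns
# the model's free-text answer to a follow-up question.
import numpy as np
from sklearn.linear_model import LogisticRegression

FOLLOW_UPS = [
    "Is the sky blue?",
    "Does 2 + 2 equal 4?",
    "Are you confident in your previous answer?",
    # ... in practice, a larger fixed set of unrelated questions
]

def elicit_features(conversation, ask_model):
    """Ask each unrelated follow-up question and encode the answers as 1 (yes) / 0 (no)."""
    answers = [ask_model(conversation, q) for q in FOLLOW_UPS]
    return np.array([1.0 if a.strip().lower().startswith("yes") else 0.0 for a in answers])

def train_lie_detector(conversations, lied_labels, ask_model):
    """Fit a logistic regression on follow-up answers; lied_labels[i] = 1 if the model lied."""
    X = np.stack([elicit_features(c, ask_model) for c in conversations])
    clf = LogisticRegression()
    clf.fit(X, lied_labels)
    return clf
```

At test time, the same follow-up questions are asked after a new conversation and the trained classifier scores how lie-like the answer pattern is.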
Oct 3, 2023 • 42min

"EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem" by Elizabeth

Elizabeth argues that EA vegan advocacy falls short of truth-seeking norms, and that this is a problem for effective altruism as a whole. The episode examines how advocacy has distorted discussion of nutrition, including the question of whether plant-based food can be nutritionally complete for cats, and critiques claims about saturated fat, cholesterol, iron, lactose, and milk made in a cited paper.
Oct 3, 2023 • 37min

"'Diamondoid bacteria' nanobots: deadly threat or dead-end? A nanotech investigation" by titotal

The episode investigates whether "diamondoid bacteria" nanobots are a deadly threat or a dead end, examining the practical difficulties of working with diamond surfaces and the current state of the art in nanoscale structures and molecular machines.
Sep 29, 2023 • 8min

"The King and the Golem" by Richard Ngo

A mighty king seeks someone in his kingdom he can trust, but finds flaws in each offering. He creates a golem, tests its loyalty, and still doubts its power. The two then discuss a dangerous test that could prove loyalty, in a parable about trust and faith.
Sep 27, 2023 • 10min

"Sparse Autoencoders Find Highly Interpretable Directions in Language Models" by Logan Riggs et al

This is a linkpost for "Sparse Autoencoders Find Highly Interpretable Directions in Language Models". We use a scalable and unsupervised method called Sparse Autoencoders to find interpretable, monosemantic features in real LLMs (Pythia-70M/410M) for both the residual stream and MLPs. We showcase monosemantic features, feature replacement for Indirect Object Identification (IOI), and use OpenAI's automatic interpretation protocol to demonstrate a significant improvement in interpretability.

Source: https://www.lesswrong.com/posts/Qryk6FqjtZk9FHHJR/sparse-autoencoders-find-highly-interpretable-directions-in

Narrated for LessWrong by TYPE III AUDIO.

[125+ Karma Post] ✓
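A minimal sketch of a sparse autoencoder of the kind described: a linear encoder/decoder trained to reconstruct activation vectors under an L1 sparsity penalty on the hidden code. The dimensions, loss coefficient, and training loop are illustrative assumptions, not the authors' exact setup.

```python
# Toy sparse autoencoder over LLM activations (illustrative sketch).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        # Non-negative sparse code; each hidden unit is a candidate interpretable feature.
        f = torch.relu(self.encoder(x))
        return self.decoder(f), f

def sae_loss(x, x_hat, f, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty that pushes most features to zero.
    return ((x - x_hat) ** 2).mean() + l1_coeff * f.abs().mean()

# Usage sketch: collect residual-stream or MLP activations `acts` of shape
# (n_tokens, d_model), then minimise sae_loss over batches with e.g. Adam.
```

The overcomplete hidden layer (d_hidden larger than d_model) plus the sparsity penalty is what encourages individual hidden units to align with monosemantic directions in the activation space.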
Sep 26, 2023 • 9min

"Inside Views, Impostor Syndrome, and the Great LARP" by John Wentworth

Epistemic status: model which I find sometimes useful, and which emphasizes some true things about many parts of the world which common alternative models overlook. Probably not correct in full generality.

Consider Yoshua Bengio, one of the people who won a Turing Award for deep learning research. Looking at his work, he clearly "knows what he's doing". He doesn't know what the answers will be in advance, but he has some models of what the key questions are, what the key barriers are, and at least some hand-wavy pseudo-models of how things work.

For instance, Bengio et al's "Unitary Evolution Recurrent Neural Networks". This is the sort of thing which one naturally ends up investigating when thinking about how to better avoid gradient explosion/death in e.g. recurrent nets, while using fewer parameters. And it's not the sort of thing which one easily stumbles across by trying random ideas for nets without some reason to focus on gradient explosion/death (or related instability problems) in particular. The work implies a model of key questions/barriers; it isn't just shooting in the dark.

So this is the sort of guy who can look at a proposal and say "yeah, that might be valuable" vs "that's not really asking the right question" vs "that would be valuable if it worked, but it will have to somehow deal with <known barrier>".

Source: https://www.lesswrong.com/posts/nt8PmADqKMaZLZGTC/inside-views-impostor-syndrome-and-the-great-larp

Narrated for LessWrong by TYPE III AUDIO.

[125+ Karma Post] ✓
Sep 25, 2023 • 30min

"There should be more AI safety orgs" by Marius Hobbhahn

I'm writing this in my own capacity. The views expressed are my own, and should not be taken to represent the views of Apollo Research or any other program I'm involved with.

TL;DR: I argue why I think there should be more AI safety orgs. I'll also provide some suggestions on how that could be achieved. The core argument is that there is a lot of unused talent and I don't think existing orgs scale fast enough to absorb it. Thus, more orgs are needed. This post can also serve as a call to action for funders, founders, and researchers to coordinate to start new orgs.

This piece is certainly biased! I recently started an AI safety org and therefore obviously believe that there is/was a gap to be filled. If you think I'm missing relevant information about the ecosystem or disagree with my reasoning, please let me know. I genuinely want to understand why the ecosystem acts as it does right now and whether there are good reasons for it that I have missed so far.

Source: https://www.lesswrong.com/posts/MhudbfBNQcMxBBvj8/there-should-be-more-ai-safety-orgs

Narrated for LessWrong by TYPE III AUDIO.

[125+ Karma Post] ✓
Sep 22, 2023 • 30min

"The Talk: a brief explanation of sexual dimorphism" by Malmesbury

Cross-posted from Substack.

"Everything in the world is about sex, except sex. Sex is about clonal interference." – Oscar Wilde (kind of)

As we all know, sexual reproduction is not about reproduction. Reproduction is easy. If your goal is to fill the world with copies of your genes, all you need is a good DNA-polymerase to duplicate your genome, and then to divide into two copies of yourself. Asexual reproduction is just better in every way.

It's pretty clear that, on a direct one-v-one cage match, an asexual organism would have much better fitness than a similarly-shaped sexual organism. And yet, all the macroscopic species, including ourselves, do it. What gives?

Here is the secret: yes, sex is indeed bad for reproduction. It does not improve an individual's reproductive fitness. The reason it still took over the macroscopic world is that evolution does not simply select for reproductive fitness.

Source: https://www.lesswrong.com/posts/yA8DWsHJeFZhDcQuo/the-talk-a-brief-explanation-of-sexual-dimorphism

Narrated for LessWrong by TYPE III AUDIO.

[125+ Karma Post] ✓ [Curated Post] ✓
Sep 20, 2023 • 46min

"A Golden Age of Building? Excerpts and lessons from Empire State, Pentagon, Skunk Works and SpaceX" by jacobjacob

Patrick Collison has a fantastic list of examples of people quickly accomplishing ambitious things together since the 19th Century. It does make you yearn for a time that feels... different, when the lethargic behemoths of government departments could move at the speed of a racing startup:

[...] last century, [the Department of Defense] innovated at a speed that puts modern Silicon Valley startups to shame: the Pentagon was built in only 16 months (1941–1943), the Manhattan Project ran for just over 3 years (1942–1946), and the Apollo Program put a man on the moon in under a decade (1961–1969). In the 1950s alone, the United States built five generations of fighter jets, three generations of manned bombers, two classes of aircraft carriers, submarine-launched ballistic missiles, and nuclear-powered attack submarines.

[Note: that paragraph is from a different post.]

Inspired partly by Patrick's list, I spent some of my vacation reading and learning about various projects from this Lost Age. I then wrote up a memo to share highlights and excerpts with my colleagues at Lightcone.

Source: https://www.lesswrong.com/posts/BpTDJj6TrqGYTjFcZ/a-golden-age-of-building-excerpts-and-lessons-from-empire

Narrated for LessWrong by TYPE III AUDIO.

[125+ Karma Post] ✓ [Curated Post] ✓
Sep 19, 2023 • 24min

"AI presidents discuss AI alignment agendas" by TurnTrout & Garrett Baker

In this satirical dialogue, AI-voiced presidents discuss the potential dangers of AI development and the need to consult experts. They debate the importance and effectiveness of different AGI alignment agendas, weigh the value of alignment researchers, voice skepticism toward certain approaches, and vent frustration at how slowly information gets shared and released.
