The Nonlinear Library

The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

Latest episodes

Sep 26, 2024 • 33sec

No new episodes will be published here. To keep listening to the EAF & LW, listen to this episode for instructions.

Counterfactuals strike again! The fora have their own official audio channels now, so The Nonlinear Library will no longer publish new episodes since it won't have any counterfactual impact. It's been a good run. We published thousands of episodes and generated a ton of passive impact. But we're not here for the views. We're here for the counterfactual impact. INSTRUCTIONS TO KEEP LISTENING TO THE FORA 1. Search "EA Forum" or "LessWrong" on your podcast player 2. Subscribe to the official channels 3. Go forth. Seek impact. Seek truth.

Sep 22, 2024 • 17min

LW - Augmenting Statistical Models with Natural Language Parameters by jsteinhardt

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Augmenting Statistical Models with Natural Language Parameters, published by jsteinhardt on September 22, 2024 on LessWrong. This is a guest post by my student Ruiqi Zhong, who has some very exciting work defining new families of statistical models that can take natural language explanations as parameters. The motivation is that existing statistical models are bad at explaining structured data. To address this problem, we agument these models with natural language parameters, which can represent interpretable abstract features and be learned automatically. Imagine the following scenario: It is the year 3024. We are historians trying to understand what happened between 2016 and 2024, by looking at how Twitter topics changed across that time period. We are given a dataset of user-posted images sorted by time, $x_1$, $x_2$ ... $x_T$, and our goal is to find trends in this dataset to help interpret what happened. If we successfully achieve our goal, we would discover, for instance, (1) a recurring spike of images depicting athletes every four years for the Olympics, and (2) a large increase in images containing medical concepts during and after the COVID-19 pandemic. How do we usually discover temporal trends from a dataset? One common approach is to fit a time series model to predict how the features evolve and then interpret the learned model. However, it is unclear what features to use: pixels and neural image embeddings are high-dimensional and uninterpretable, undermining the goal of extracting explainable trends. We address this problem by augmenting statistical models with interpretable natural language parameters. The figure below depicts a graphical model representation for the case of time series data. We explain the trends in the observed data [ $x_1$ ... $x_T$] by learning two sets of latent parameters: natural language parameters $\phi$ (the learned features) and real-valued parameters $w$ (the time-varying trends). $\phi$: the natural language descriptions of $K$ different topics, e.g. "depicts athletes competing". $\phi$ is an element of $\Sigma$, the universe of all natural language predicates. $w_t$: the frequency of each of the K topics at the time $t$. If our model successfully recovers the underlying trends, then we can visualize $w$ and $\phi$ below and see that: 1) more pictures contain medical concepts (red) starting from 2020, and 2) there are recurring (blue) spikes of athletes competing. In the rest of this post, we will explain in detail how to specify and learn models with natural language parameters and showcase the model on several real-world applications. We will cover: A warm-up example of a statistical model with natural language explanations A modeling language for specifying natural language parameters Applications of our framework, which can be used to specify models for time series, clustering, and applications. We will go over: A machine learning application that uses our time series model to monitor trends in LLM usage A business application that uses our clustering model to taxonomize product reviews A cognitive science application that uses our classification model to explain what images are more memorable for humans Thanks to Louise Verkin for helping to typeset the post in Ghost format. Warm-up Example: Logistic Regression with Natural Language Parameters Instead of understanding topic shifts across the entire time window of 2016-2024, let's first study a much simpler question: what images are more likely to appear after 2020? The usual way to approach this problem is to, 1. brainstorm some features, 2. extract the real-valued features from each image, and 3. run a logistic regression model on these features to predict the target $Y$ =1 if the image appears after 2020, $Y$ =0 otherwise. More concretely: Step 1: Propose different...

Sep 22, 2024 • 2h 50min

LW - o1-preview is pretty good at doing ML on an unknown dataset by Håvard Tveit Ihle

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: o1-preview is pretty good at doing ML on an unknown dataset, published by Håvard Tveit Ihle on September 20, 2024 on LessWrong. Previous post: How good are LLMs at doing ML on an unknown dataset? A while back I ran some evaluation tests on GPT4o, Claude Sonnet 3.5 and Gemini advanced to see how good they were at doing machine learning on a completely novel, and somewhat unusual dataset. The data was basically 512 points in the 2D plane, and some of the points make up a shape, and the goal is to classify the data according to what shape the points make up. None of the models did better than chance on the original (hard) dataset, while they did somewhat better on a much easier version I made afterwards. With the release of o1-preview, I wanted to quickly run the same test on o1, just to see how well it did. In summary, it basically solved the hard version of my previous challenge, achieving 77% accuracy on the test set on its fourth submission (this increases to 91% if I run it for 250 instead of 50 epochs), which is really impressive to me. Here is the full conversation with ChatGPT o1-preview In general o1-preview seems like a big step change in its ability to reliably do hard tasks like this without any advanced scaffolding or prompting to make it work. Detailed discussion of results The architecture that o1 went for in the first round is essentially the same that Sonnet 3.5 and gemini went for, a pointnet inspired model which extracts features from each point independently. While it managed to do slightly better than chance on the training set, it did not do well on the test set. For round two, it went for the approach (which also Sonnet 3.5 came up with) of binning the points in 2D into an image, and then using a regular 2D convnet to classify the shapes. This worked somewhat on the first try. It completely overfit the training data, but got to an accuracy of 56% on the test data. For round three, it understood that it needed to add data augmentations in order to generalize better, and it implemented scaling, translations and rotations of the data. It also switched to a slightly modified resnet18 architecture (a roughly 10x larger model). However, it made a bug when converting to PIL image (and back to torch.tensor), which resulted in an error. For round four, o1 fixed the error and has a basically working solution, achieving an accuracy of 77% (which increases to 91% if we increase the number of epochs from 50 to 250, all still well within the alloted hour of runtime). I consider the problem basically solved at this point, by playing around with smaller variations on this, you can probably get a few more percentage points without any more insights needed. For the last round, it tried the standard approach of using the pretrained weights of resnet18 and freezing almost all the layers, which is an approach that works well on many problems, but did not work well in this case. The accuracy reduced to 41%. I guess these data are just too different from imagenet (which resnet18 is trained on) for this approach to work well. I would not have expected this to work, but I don't hold it that much against o1, as it is a reasonable thing to try. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Sep 20, 2024 • 7min

EA - The Best Argument is not a Simple English Yud Essay by Jonathan Bostock

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Best Argument is not a Simple English Yud Essay, published by Jonathan Bostock on September 20, 2024 on The Effective Altruism Forum. I was encouraged to post this here, but I don't yet have enough EA forum karma to crosspost directly! Epistemic status: these are my own opinions on AI risk communication, based primarily on my own instincts on the subject and discussions with people less involved with rationality than myself. Communication is highly subjective and I have not rigorously A/B tested messaging. I am even less confident in the quality of my responses than in the correctness of my critique. If they turn out to be true, these thoughts can probably be applied to all sorts of communication beyond AI risk. Lots of work has gone into trying to explain AI risk to laypersons. Overall, I think it's been great, but there's a particular trap that I've seen people fall into a few times. I'd summarize it as simplifying and shortening the text of an argument without enough thought for the information content. It comes in three forms. One is forgetting to adapt concepts for someone with a far inferential distance; another is forgetting to filter for the important information; the third is rewording an argument so much you fail to sound like a human being at all. I'm going to critique three examples which I think typify these: Failure to Adapt Concepts I got this from the summaries of AI risk arguments written by Katja Grace and Nathan Young here. I'm making the assumption that these summaries are supposed to be accessible to laypersons, since most of them seem written that way. This one stands out as not having been optimized on the concept level. This argument was below-aveage effectiveness when tested. I expect most people's reaction to point 2 would be "I understand all those words individually, but not together". It's a huge dump of conceptual information all at once which successfully points to the concept in the mind of someone who already understands it, but is unlikely to introduce that concept to someone's mind. Here's an attempt to do better: 1. So far, humans have mostly developed technology by understanding the systems which the technology depends on. 2. AI systems developed today are instead created by machine learning. This means that the computer learns to produce certain desired outputs, but humans do not tell the system how it should produce the outputs. We often have no idea how or why an AI behaves in the way that it does. 3. Since we don't understand how or why an AI works a certain way, it could easily behave in unpredictable and unwanted ways. 4. If the AI is powerful, then the consequences of unwanted behaviour could be catastrophic. And here's Claude's just for fun: 1. Up until now, humans have created new technologies by understanding how they work. 2. The AI systems made in 2024 are different. Instead of being carefully built piece by piece, they're created by repeatedly tweaking random systems until they do what we want. This means the people who make these AIs don't fully understand how they work on the inside. 3. When we use systems that we don't fully understand, we're more likely to run into unexpected problems or side effects. 4. If these not-fully-understood AI systems become very powerful, any unexpected problems could potentially be really big and harmful. I think it gets points 1 and 3 better than me, but 2 and 4 worse. Either way, I think we can improve upon the summary. Failure to Filter Information When you condense an argument down, you make it shorter. This is obvious. What is not always as obvious is that this means you have to throw out information to make the core point clearer. Sometimes the information that gets kept is distracting. Here's an example from a poster a friend of mine made for Pause AI: When I showed this to ...

Sep 20, 2024 • 2min

LW - Interested in Cognitive Bootcamp? by Raemon

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Interested in Cognitive Bootcamp?, published by Raemon on September 20, 2024 on LessWrong. I'm running more 4-day "Cognitive Bootcamps" over the next couple months (during Lighthaven Eternal September season). DM me if you're potentially interested (either as an individual, or as a team). The workshop is most valuable to people who: control their decisionmaking process (i.e. you decide what projects you or a team work on, rather than working at a day-job on someone else's vision) are either a) confused about planmaking / have a vague sense that they aren't as strategically ambitious as they could be. and/or, b) are at a place where it's natural to spend a few days thinking big-picture thoughts before deciding on their next project. There's a secondary[1] focus on "practice solving confusing problems", which IMO is time well spent, but requires more followup practice to pay off. I wrote about the previous workshop here. Participants said on average they'd have been willing to pay $850 for it, and would have paid $5000 for the ideal, perfectly-tailored-for-them version. My plan is to charge $500/person for the next workshop, and then $1000 for the next one. I'm most excited to run this for teams, who can develop a shared skillset and accompanying culture. I plan to tailor the workshops for the needs of whichever people show up. The dates are not scheduled yet (depends somewhat on when a critical mass of participants are available). DM me if you are interested. The skills being taught will be similar to the sort of thing listed in Skills from a year of Purposeful Rationality Practice and the Feedbackloop-first Rationality sequence. My default curriculum is aiming to teach several interrelated related skills you can practice over four days, that build into a coherent metaskill of "ambitious planning, at multiple timescales." 1. ^ I started this project oriented around "find better feedbackloops for solving confusing problems", and later decided that planmaking was the highest leverage part of the skill tree to focus on. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Sep 19, 2024 • 13min

LW - Laziness death spirals by PatrickDFarley

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Laziness death spirals, published by PatrickDFarley on September 19, 2024 on LessWrong. I've claimed that Willpower compounds and that small wins in the present make it easier to get bigger wins in the future. Unfortunately, procrastination and laziness compound, too. You're stressed out for some reason, so you take the evening off for a YouTube binge. You end up staying awake a little later than usual and sleeping poorly. So the next morning you feel especially tired; you snooze a few extra times. In your rushed morning routine you don't have time to prepare for the work meeting as much as you'd planned to. So you have little to contribute during the meeting. You feel bad about your performance. You escape from the bad feelings with a Twitter break. But Twitter is freaking out. Elon Musk said what? Everyone is weighing in. This is going to occupy you intermittently for the rest of the day. And so on. Laziness has a kind of independent momentum to it. When you're having a day like the above, even if you consciously commit to getting back on track, the rut tends to find its way back to you within a couple of hours. Keep this up for a few days and your sleep is utterly messed up, and you walk around in a fog. Keep it up for a week or two and you're fully off your workout routine. In a month or two, you might have noticeably fallen behind on work; you might be absent from your social life; you might've visibly gained fat or lost muscle; you can no longer feel excited about your personal goals because they're behind a pile of mundane tasks you need to catch up on first. And so on. How do we stop the vicious circle? I'm spiraling! I'm spiraling! When you're in a laziness death spiral, it's hard to do anything deliberate. The first and most important step, which does take some willpower but not a lot, is to acknowledge, "I'm in a laziness death spiral today." If you don't acknowledge it, here's what happens: You vaguely notice you you've been wasting time today; you feel a twinge of guilt, so you quickly decide, "I'm going to turn the rest of the day around, starting right now." And does that work? Often it doesn't! Sure, after a small lapse you can just get back on track, but if enough laziness momentum has built up, a momentary reaction doesn't cut it. Deciding things quickly, in response to negative emotions, is exactly how you got into this situation! You're going to turn it around on a whim? You'll have a different whim in the next hour; what then? You need to take a step back and get your mind outside of the problem. Do what you can The next three sections are three different courses of action you can take to get out of a laziness death spiral. One of them is clearly preferable, but I'm writing the alternatives, too. When you're in a low-willpower state, it's often bad to attempt the very best solution - the farther you reach, the harder you can fall. Building a base of "small wins" is the reliable way to repair your willpower. If you start something lofty and then bail on it, you're doing real damage: logging another willpower failure and associating that "very best solution" with failure. Here are the moves: A) Emergency recovery If you're in a laziness spiral and you need to get out of it right now, there are some measures you can take that, while effective, are not ideal. They are unsustainable, promote bad habits, or are just generally unhealthy. But sometimes the need is there: maybe you have a deadline fast approaching (and the deadline itself isn't enough to snap you into action); maybe your friends or family need you to take care of something today; maybe you were in the middle of an awfully lazy day and a once-in-a-lifetime opportunity came up, and you just can't focus enough to act on it. Disclaimer: I believe that in a well planned life, none of these sho...

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app

The Nonlinear Library

Latest episodes

No new episodes will be published here. To keep listening to the EAF & LW, listen to this episode for instructions.

LW - Augmenting Statistical Models with Natural Language Parameters by jsteinhardt

LW - Glitch Token Catalog - (Almost) a Full Clear by Lao Mein

LW - Investigating an insurance-for-AI startup by L Rudolf L

LW - Applications of Chaos: Saying No (with Hastings Greer) by Elizabeth

LW - Work with me on agent foundations: independent fellowship by Alex Altair

LW - o1-preview is pretty good at doing ML on an unknown dataset by Håvard Tveit Ihle

EA - The Best Argument is not a Simple English Yud Essay by Jonathan Bostock

LW - Interested in Cognitive Bootcamp? by Raemon

LW - Laziness death spirals by PatrickDFarley

The AI-powered Podcast Player