
The Nonlinear Library: LessWrong

Latest episodes

Aug 31, 2024 • 27min

LW - AI for Bio: State Of The Field by sarahconstantin

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI for Bio: State Of The Field, published by sarahconstantin on August 31, 2024 on LessWrong. AI for biotech, particularly with drug discovery applications, has been used for more than a decade, with ambiguous success. But in the era of foundation models we may have experienced a step change in what's possible. I used to work on AI-for-drug-discovery years ago, at Recursion, where we sought to identify phenotypes of genetic diseases visible in microscopic images of cells, and screen for drugs that made the cells visually "look healthy" in the hopes that those drugs would also turn out to be effective against the symptoms of the disease. Circa 2016, we were just beginning to transition from the old-fashioned sort of machine learning based heavily on feature engineering, to the new "deep learning" paradigm with much larger neural nets. "Old-school" machine learning was often accused of being nothing more than logistic regression in fancy VC-funded branding, and there was often some truth to that. When our models worked best, they were picking up human-interpretable phenotypes that a pathologist could probably have described decades ago: something like "this disease causes enlarged nuclei". And, when we first started replacing the old models with deep neural nets, it wasn't clear that the New Hotness was going to work better than the Old Standby. But things have changed. Bigger, better models (often Transformer-based) are everywhere in biotech. They genuinely seem to be changing the state of the art in drug (and biologic) development. And it's past time to do a serious review of what's become available and what it can and can't do. AI optimists who aren't familiar with biotech are often wildly miscalibrated about what AI tools can do even in the best case scenario. The average approved drug in the US costs $879.3 million[1] in R&D expenses (counting the costs of failed drugs), and nearly 90% of that is spent on clinical trials. It's legally, scientifically, and ethically necessary to test drugs on humans to see if they're safe and effective. And while the ballooning cost of running clinical trials is a problem worth tackling in itself[2], it's inherently time- and labor-intensive to run valid experiments on human patients. An AI is never going to "design a drug" that you can give to patients right away. Even if the AI were a perfect all-knowing oracle, pharmaceutical companies would still need to run animal and then human trials. AI for biotech is attempting to automate and improve particular sub-problems within that 10% of costs spent on drug discovery and preclinical research. This is hardly trivial, especially if it enables the development of new classes of drugs that were completely inaccessible before. But it does place AI hype in context. An AI model's value to the drug discovery process is bounded by: the labor cost of the time it saves on more manual processes; the cost it saves on any experiments it can fully replace; the cost of any failed experiments it can prevent from being done altogether; and the value of any new successful therapies that would not even have been attempted without the model. If the model tells you to do something you would probably have done anyway, it's useless. If the model replaces something you would have needed to do manually, it's somewhat useful. 
If the model increases your odds of a successful therapy, it's extremely useful, and if it adds successful therapies it's world-changing. With that paradigm set up, let's dig into the details. This won't be an exhaustive list of models, or an in-depth evaluation of their performance, but an overview of the big, influential, and buzzy and a summary of what they do. Structure Prediction Models One class of AI models with biotech applications tackles one of the most classically fiendish problems in c...
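A rough back-of-the-envelope check on that framing, using only the figures quoted above (an illustration of the arithmetic, not a number from the post):

```latex
0.10 \times \$879.3\,\mathrm{M} \;\approx\; \$88\,\mathrm{M} \quad \text{per approved drug (discovery + preclinical share)}
```

Even a model that eliminated discovery and preclinical spending entirely would leave roughly ninety percent of the average per-drug cost untouched, which is why the list above bounds a model's value by time saved, experiments replaced or avoided, and therapies that would not otherwise be attempted.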
Aug 30, 2024 • 27min

LW - Principles for the AGI Race by William S

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Principles for the AGI Race, published by William S on August 30, 2024 on LessWrong. Crossposted from https://williamrsaunders.substack.com/p/principles-for-the-agi-race Why form principles for the AGI Race? I worked at OpenAI for 3 years, on the Alignment and Superalignment teams. Our goal was to prepare for the possibility that OpenAI succeeded in its stated mission of building AGI (Artificial General Intelligence, roughly able to do most things a human can do), and then proceed on to make systems smarter than most humans. This will predictably face novel problems in controlling and shaping systems smarter than their supervisors and creators, which we don't currently know how to solve. It's not clear when this will happen, but a number of people would throw around estimates of this happening within a few years. While there, I would sometimes dream about what would have happened if I'd been a nuclear physicist in the 1940s. I do think that many of the kind of people who get involved in the effective altruism movement would have joined, naive but clever technologists worried about the consequences of a dangerous new technology. Maybe I would have followed them, and joined the Manhattan Project with the goal of preventing a world where Hitler could threaten the world with a new magnitude of destructive power. The nightmare is that I would have watched the fallout of bombings of Hiroshima and Nagasaki with a growing gnawing panicked horror in the pit of my stomach, knowing that I had some small share of the responsibility. Maybe, like Albert Einstein, I would have been unable to join the project due to a history of pacifism. If I had joined, I like to think that I would have joined the ranks of Joseph Rotblat and resigned once it became clear that Hitler would not get the Atomic Bomb. Or joined the signatories of the Szilárd petition requesting that the bomb only be used after terms of surrender had been publicly offered to Japan. Maybe I would have done something to try to wake up before the finale of the nightmare. I don't know what I would have done in a different time and place, facing different threats to the world. But as I've found myself entangled in the ongoing race to build AGI, it feels important to reflect on the lessons to learn from history. I can imagine this alter ego of myself and try to reflect on how I could take right actions in both this counterfactual world and the one I find myself in now. In particular, what could guide me to the right path even when I'm biased, subtly influenced by the people around me, misinformed, or deliberately manipulated? Simply trying to pick the action you think will lead to the best consequences for the world fails to capture the ways in which your model of the world is wrong, or your own thinking is corrupt. Joining the Manhattan Project, and using the weapons on Japan both had plausible consequentialist arguments supporting them, ostensibly inviting a lesser horror into the world to prevent a greater one. Instead I think the best guiding star to follow is reflecting on principles, rules which apply in a variety of possible worlds, including worlds in which you are wrong. Principles that help you gather the right information about the world. Principles that limit the downsides if you're wrong. 
Principles that help you tell whether you're in a world where racing to build a dangerous technology first is the best path, or you're in a world where it's a hubristic self-delusion. This matches more with the idea of rule consequentialism than pure act consequentialism: instead of making each decision based on what you think is best, think about what rules would be good for people to adopt if they were in a similar situation. My goal in imagining these principles is to find principles that prevent errors of the following forms...
Aug 30, 2024 • 30min

LW - Singular learning theory: exercises by Zach Furman

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Singular learning theory: exercises, published by Zach Furman on August 30, 2024 on LessWrong. Thanks to Jesse Hoogland and George Wang for feedback on these exercises. In learning singular learning theory (SLT), I found it was often much easier to understand by working through examples, rather than by trying to work through the (fairly technical) theorems in their full generality. These exercises are an attempt to collect the sorts of examples that I worked through to understand SLT. Before doing these exercises, you should have read the Distilling Singular Learning Theory (DSLT) sequence, watched the SLT summit YouTube videos, or studied something equivalent. DSLT is a good reference to keep open while solving these problems, perhaps alongside Watanabe's textbook, the Gray Book. Note that some of these exercises cover the basics, which are well-covered in the above distillations, but some deliver material which will likely be new to you (because it's buried deep in a textbook, because it's only found in adjacent literature, etc). Exercises are presented mostly in conceptual order: later exercises freely use concepts developed in earlier exercises. Starred (*) exercises are what I consider the most essential exercises, and the ones I recommend you complete first.
1. *The normal distribution, like most classical statistical models, is a regular (i.e. non-singular[1]) statistical model. A univariate normal model with unit variance and mean $\mu \in \mathbb{R}$ is given by the probability density $p(x|\mu) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}(x-\mu)^2\right)$. Assume the true distribution q(x) of the data is realizable by the model: that is, $q(x) = p(x|\mu_0)$ for some true parameter $\mu_0$.
a. Calculate the Fisher information matrix of this model (note that since we have only a single parameter, the FIM will be a 1x1 matrix). Use this to show the model is regular.
b. Write an explicit expression for the KL divergence K(μ) between q(x) and p(x|μ), as a function of the parameter μ. This quantity is sometimes also called the population loss. [See Example 1.1, Gray Book, for the case of a 2D normal distribution]
c. Using K(μ) from b), give an explicit formula for the volume of "almost optimal" parameters, $V(\epsilon) = \{\mu : K(\mu) < \epsilon\}$.
d. The volume scaling formula for the learning coefficient λ (also known as RLCT[2]) is $\lambda = \lim_{\epsilon \to 0} \frac{\log(V(a\epsilon)/V(\epsilon))}{\log(a)}$ for any $a \neq 1$ [Theorem 7.1, Gray Book]. Using this formula, combined with the expression for V(ϵ) derived in c), calculate the learning coefficient[3]. Given that the model is regular, we expect the learning coefficient to be $\frac{d}{2} = \frac{1}{2}$; compare your answer.
2. *We can make the normal distribution a singular model by changing the parameterization. Let a cubicly-parameterized normal model be the model $p(x|\mu) = \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}(x-\mu^3)^2\right)$. Assume the true parameter is $\mu_0$.
a. Show that the cubicly-parameterized normal model is just as expressive as an ordinary normal model: that is, they both can express all univariate normal distributions.
b. Repeat 1a) with this model; calculate the Fisher information matrix to demonstrate that the model is singular, and find which parameters μ are singular.
c. Repeat 1b) - 1d) to calculate the learning coefficient of this model, for $\mu_0 = 0$ and for $\mu_0 \neq 0$. Recall that the learning coefficient is a volume scaling exponent, such that $V(\epsilon) \propto \epsilon^\lambda$ [4] as $\epsilon \to 0$. Based on this, interpret your results. How does this make the cubicly-parameterized normal model different from the ordinary normal model?
d. Instead of taking $\epsilon \to 0$ to get the learning coefficient, fix a small but nonzero value for ϵ, such as $\epsilon = 0.01$. As we saw from c), the learning coefficient changes discontinuously when $\mu_0 = 0$ - what happens with $V(\epsilon)$ as $\mu_0$ gets close to zero? What changes if you make ϵ smaller or larger? Even though the asymptotic learning coefficien...
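For readers who want a concreteness check before attempting the problems, here is a minimal worked sketch of exercise 1 and of the $\mu_0 = 0$ case of exercise 2, derived only from the standard unit-variance Gaussian KL divergence; treat it as one possible route, not the official solution.

```latex
% Exercise 1: regular normal model
\begin{align*}
K(\mu) &= \mathrm{KL}\big(p(x|\mu_0)\,\|\,p(x|\mu)\big) = \tfrac{1}{2}(\mu - \mu_0)^2, \\
V(\epsilon) &= \mathrm{Vol}\{\mu : K(\mu) < \epsilon\} = 2\sqrt{2\epsilon} \;\propto\; \epsilon^{1/2}, \\
\lambda &= \lim_{\epsilon \to 0} \frac{\log\big(V(a\epsilon)/V(\epsilon)\big)}{\log a}
        = \frac{\log a^{1/2}}{\log a} = \tfrac{1}{2} = \tfrac{d}{2}. \\[4pt]
% Exercise 2 with \mu_0 = 0: cubic parameterization
K(\mu) &= \tfrac{1}{2}(\mu^3 - 0)^2 = \tfrac{1}{2}\mu^6, \qquad
V(\epsilon) = 2\,(2\epsilon)^{1/6} \;\propto\; \epsilon^{1/6}, \qquad
\lambda = \tfrac{1}{6} < \tfrac{d}{2}.
\end{align*}
```

The $\mu_0 \neq 0$ case, and why the volume scaling reverts to $\lambda = \tfrac{1}{2}$ away from the singularity, is left to the exercise itself.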
Aug 30, 2024 • 16min

LW - Nursing doubts by dynomight

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Nursing doubts, published by dynomight on August 30, 2024 on LessWrong. If you ask the internet if breastfeeding is good, you will soon learn that YOU MUST BREASTFEED because BREAST MILK = OPTIMAL FOOD FOR BABY. But if you look for evidence, you'll discover two disturbing facts. First, there's no consensus about why breastfeeding is good. I've seen experts suggest at least eight possible mechanisms: 1. Formula can't fully reproduce the complex blend of fats, proteins and sugars in breast milk. 2. Formula lacks various bio-active things in breast milk, like antibodies, white blood cells, oligosaccharides, and epidermal growth factor. 3. If local water is unhealthy, then the mother's body acts as a kind of "filter". 4. Breastfeeding may have psychological/social benefits, perhaps in part by releasing oxytocin in the mother. 5. Breastfeeding decreases fertility, meaning the baby may get more time before resources are redirected to a younger sibling. 6. Breastfeeding may help mothers manage various post-birth health issues? 7. Infants are often given formula while lying on their backs, which might lead to fluid buildup in the ears and thus temporary hearing loss during a critical development period? 8. Breastfeeding is cheaper?? Second, the evidence for breastfeeding is overwhelmingly observational: It's not based on experiments, but rather looking at the existing population and "observing" that breastfeeding is correlated with having mildly fewer infections (of many kinds) and slightly lower obesity. It may also be correlated with better outcomes in terms of allergies, diabetes, lymphoma, colitis, Crohn's disease, or later IQ. Observational evidence is disturbing because correlations are bad. Even if breastfeeding did nothing, people think it's good, so the same parents who breastfeed more tend to have higher socioeconomic status and provide lots of other goodies too. Babies that wear baby Rolex watches are probably healthier on average. But that's because their parents are rich, not because Rolexes are good for you. Could breastfeeding be like that? Of course, experts are aware of this issue. They try to compensate for it by "controlling" for upstream variables. The most-cited meta-analysis on breastfeeding and IQ collected 18 papers that each controlled for different things, like parental education, social status, or how much social interaction the baby got. The control variables seemed to matter a lot:
Among studies that… | Breastfeeding associated with a…
Did not control for maternal IQ | 4.1 IQ point increase
Controlled for maternal IQ | 2.6 IQ point increase
But what about paternal IQ? Might smarter dads convince mothers to breastfeed more? What if you forgot to control for something, or your data was noisy, or the relationship is nonlinear? (What if smarter babies manipulate their mothers into breastfeeding more?) If any of that happens, then correlations will probably exaggerate the causal impact of breastfeeding. So there's been a small movement in recent years to push back against Big Nurse, to argue that, despite the public health messaging, there is no clear evidence that breastfeeding is beneficial. (See Stuart Ritchie at Science Fictions or Emily Oster at FiveThirtyEight or The Guardian for good versions of this argument.) Naturally, I am sympathetic. Down with groupthink! Down with control variables! 
Down with putting so much pressure on mothers based on weak evidence! Except… Imagine you just gave birth on a desert island - one that for some reason has an unlimited supply of formula. You're considering breastfeeding your baby, but you can't read any studies. What should you do? Well, there's an obvious evolutionary argument. Maybe the epidermal growth factor and obscure mix of fats in breast milk are crucial. Or maybe they aren't. But they're probably not bad...
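To make the confounding worry concrete, here is a toy simulation (all numbers invented, nothing estimated from real data): maternal IQ drives both breastfeeding and child IQ, breastfeeding itself has zero causal effect, and yet a naive comparison shows a "benefit" that shrinks once maternal IQ is controlled for - the same pattern as the table above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy model (made-up numbers): breastfeeding has ZERO causal effect on child IQ,
# but maternal IQ raises both the chance of breastfeeding and the child's IQ.
maternal_iq = rng.normal(100, 15, n)
p_breastfeed = 1 / (1 + np.exp(-(maternal_iq - 100) / 10))  # smarter mothers breastfeed more
breastfed = rng.random(n) < p_breastfeed
child_iq = 60 + 0.4 * maternal_iq + rng.normal(0, 12, n)    # no breastfeeding term at all

# Naive comparison: looks like a real "breastfeeding benefit"
naive_gap = child_iq[breastfed].mean() - child_iq[~breastfed].mean()

# "Controlling" for maternal IQ via linear regression shrinks the gap toward zero
X = np.column_stack([np.ones(n), breastfed.astype(float), maternal_iq])
coefs, *_ = np.linalg.lstsq(X, child_iq, rcond=None)

print(f"naive IQ gap:                          {naive_gap:.2f}")
print(f"gap after controlling for maternal IQ: {coefs[1]:.2f}")
```

The live question in the real literature is whether the controls actually on hand (which rarely include things like paternal IQ) shrink the estimate all the way, or leave it inflated.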
Aug 30, 2024 • 24min

LW - Things I learned talking to the new breed of scientific institution by Abhishaike Mahajan

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Things I learned talking to the new breed of scientific institution, published by Abhishaike Mahajan on August 30, 2024 on LessWrong. Note: this article is sponsored by and cross-posted to the Good Science Project. They also write a fair bit, and their articles were essential reading for writing this essay! Also, this article would not be possible without the hours of discussion/editing help I've had with several people from these institutions, and a few outside of them. Huge shout-out to all of them! Introduction Arcadia Science, Speculative Technologies, FutureHouse, Arc, and Convergent. All of these are a new form of scientific institute. Most are funded entirely by a few billionaires. Most are non-profits. Most of them focus on the life sciences. Most of them have sprung up in just the last few years. They do all also have one common thread: a grand statement. We are an experiment in a new way to do science. And they are! Traditionally, research is conducted in academic or private industry labs - dependent on NIH grants in the former and markets in the latter. Given the (often singular) sources of no-strings-attached funding, these new institutions need not satisfy either the NIH or the markets, allowing them to conduct research in a unique fashion. In one sense, the experimental aspect of these institutions revolves around the focus of the research itself, addressing fields or using methods that the founders - correctly or not - view as underserved/underutilized. But, on a more subtle level, the experimental aspect could be more closely tied to the culture of these organizations. Institutions like Arcadia, FutureHouse, and the rest could be viewed as the production of auteurs - a term from filmmaking for films with such a heavy sense of the director's personal taste that the film is inseparable from the director. This is where the novelty within these institutions primarily lies, in how the founders of the institute wish science was conducted. And wielding billions of dollars, thousands of hours of work, and hundreds of scientists as a means to test whether their theories are correct. Of course, nothing under the sun is truly new. There is an age-old history of scientist dissatisfaction with how 'things are traditionally done', and confidently building new institutions to solve the problems they've seen. Many of these are now household names amongst researchers: Broad Institute, Whitehead Institute, Max Planck Society, Howard Hughes Medical Institute (HHMI), and so on. Each of these was started with similar contrarian mentalities as the current era of institutions. Some of these were more experimental than others, most notably HHMI, which prized itself on its focus on interdisciplinary research above all else. But all were experiments, many of them extraordinarily successful. Yet, the current iteration of new research institutes is still arguably more experimental than its ancestors. While the last generation of institutes was typically tied directly to universities, the current ones (outside of Arc) are independent, allowing them a larger sense of opinionation on how science should be done. But, despite this experimentation, there is relatively little information out there on what's going on inside them. Not in terms of science, but more so the vibes. 
While aspects of these organizations have been written about previously, such as in articles in The Atlantic and Endpoints, they aren't assessing vibes! These other articles are, first and foremost, news-pieces: valuable, but lacking any opinionated observations on the inner workings of the institutions. Nadia Asparouhova's essay on the subject comes closest to this regarding the history of these institutions, but still offers few details on how they practically function. This essay attempts to discuss that missing s...
Aug 30, 2024 • 13min

LW - Solving adversarial attacks in computer vision as a baby version of general AI alignment by stanislavfort

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Solving adversarial attacks in computer vision as a baby version of general AI alignment, published by stanislavfort on August 30, 2024 on LessWrong. I spent the last few months trying to tackle the problem of adversarial attacks in computer vision from the ground up. The results of this effort are written up in our new paper Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness (explainer on X/Twitter). Taking inspiration from biology, we reached state-of-the-art or above state-of-the-art robustness at 100x - 1000x less compute, got human-understandable interpretability for free, turned classifiers into generators, and designed transferable adversarial attacks on closed-source (v)LLMs such as GPT-4 or Claude 3. I strongly believe that there is a compelling case for devoting serious attention to solving the problem of adversarial robustness in computer vision, and I try to draw an analogy to the alignment of general AI systems here. 1. Introduction In this post, I argue that the problem of adversarial attacks in computer vision is in many ways analogous to the larger task of general AI alignment. In both cases, we are trying to faithfully convey an implicit function locked within the human brain to a machine, and we do so extremely successfully on average. Under static evaluations, the human and machine functions match up exceptionally well. However, as is typical in high-dimensional spaces, some phenomena can be relatively rare and basically impossible to find by chance, yet ubiquitous in their absolute count. This is the case for adversarial attacks - imperceptible modifications to images that completely fool computer vision systems and yet have virtually no effect on humans. Their existence highlights a crucial and catastrophic mismatch between the implicit human vision function and the function learned by machines - a mismatch that can be exploited in a dynamic evaluation by an active, malicious agent. Such failure modes will likely be present in more general AI systems, and our inability to remedy them even in the more restricted vision context (yet) does not bode well for the broader alignment project. This is a call to action to solve the problem of adversarial vision attacks - a stepping stone on the path to aligning general AI systems. 2. Communicating implicit human functions to machines The basic goal of computer vision can be viewed as trying to endow a machine with the same vision capabilities a human has. A human carries, locked inside their skull, an implicit vision function mapping visual inputs into semantically meaningful symbols, e.g. a picture of a tortoise into a semantic label tortoise. This function is represented implicitly and while we are extremely good at using it, we do not have direct, conscious access to its inner workings and therefore cannot communicate it to others easily. To convey this function to a machine, we usually form a dataset of fixed images and their associated labels. We then use a general enough class of functions, typically deep neural networks, and a gradient-based learning algorithm together with backpropagation to teach the machine how to correlate images with their semantic content, e.g. how to assign a label parrot to a picture of a parrot. 
This process is extremely successful in communicating the implicit human vision function to the computer, and the implicit human and explicit, learned machine functions agree to a large extent. The agreement between the two is striking. Given how different the architectures are (a simulated graph-like function doing a single forward pass vs the wet protein brain of a mammal running continuous inference), how different the learning algorithms are (gradient descent with backpropagation vs something completely different but still unknown), and how differ...
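To make the "imperceptible modification" failure mode concrete, here is a minimal sketch of the classic one-step FGSM attack (Goodfellow et al., 2015) in PyTorch. It is illustrative only and is not the multi-scale ensembling defense the paper above proposes; `model`, `image`, and `label` are placeholders for any differentiable classifier and a labeled input batch.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, eps=8/255):
    """One-step FGSM: nudge each pixel by +/- eps in the direction that increases the loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)    # loss of the current prediction
    loss.backward()                                # gradient of the loss w.r.t. the pixels
    adversarial = image + eps * image.grad.sign()  # worst-case, nearly invisible perturbation
    return adversarial.clamp(0.0, 1.0).detach()    # keep pixels in the valid range
```

A perturbation of eps = 8/255 is typically invisible to a human yet routinely flips the predicted class of an undefended classifier - exactly the mismatch between the implicit human function and the learned machine function that the post describes.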
Aug 29, 2024 • 50min

LW - AI #79: Ready for Some Football by Zvi

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #79: Ready for Some Football, published by Zvi on August 29, 2024 on LessWrong. I have never been more ready for Some Football. Have I learned all about the teams and players in detail? No, I have been rather busy, and have not had the opportunity to do that, although I eagerly await Seth Burn's Football Preview. I'll have to do that part on the fly. But oh my would a change of pace and chance to relax be welcome. It is time. The debate over SB 1047 has been dominating for weeks. I've now said my peace on the bill and how it works, and compiled the reactions in support and opposition. There are two small orders of business left for the weekly. One is the absurd Chamber of Commerce 'poll' that is the equivalent of a pollster asking if you support John Smith, who recently killed your dog and who opponents say will likely kill again, while hoping you fail to notice you never had a dog. The other is a (hopefully last) illustration that those who obsess highly disingenuously over funding sources for safety advocates are, themselves, deeply conflicted by their funding sources. It is remarkable how consistently so many cynical self-interested actors project their own motives and morality onto others. The bill has passed the Assembly and now it is up to Gavin Newsom, where the odds are roughly 50/50. I sincerely hope that is a wrap on all that, at least this time out, and I have set my bar for further comment much higher going forward. Newsom might also sign various other AI bills. Otherwise, it was a fun and hopeful week. We saw a lot of Mundane Utility, Gemini updates, OpenAI and Anthropic made an advance review deal with the American AISI and The Economist pointing out China is non-zero amounts of safety pilled. I have another hopeful iron in the fire as well, although that likely will take a few weeks. And for those who aren't into football? I've also been enjoying Nate Silver's On the Edge. So far, I can report that the first section on gambling is, from what I know, both fun and remarkably accurate. Table of Contents 1. Introduction. 2. Table of Contents. 3. Language Models Offer Mundane Utility. Turns out you did have a dog. Once. 4. Language Models Don't Offer Mundane Utility. The AI did my homework. 5. Fun With Image Generation. Too much fun. We are DOOMed. 6. Deepfaketown and Botpocalypse Soon. The removal of trivial frictions. 7. They Took Our Jobs. Find a different job before that happens. Until you can't. 8. Get Involved. DARPA, Dwarkesh Patel, EU AI Office. Last two in SF. 9. Introducing. Gemini upgrades, prompt engineering guide, jailbreak contest. 10. Testing, Testing. OpenAI and Anthropic formalize a deal with the US's AISI. 11. In Other AI News. What matters? Is the moment over? 12. Quiet Speculations. So many seem unable to think ahead even mundanely. 13. SB 1047: Remember. Let's tally up the votes. Also the poll descriptions. 14. The Week in Audio. Confused people bite bullets. 15. Rhetorical Innovation. Human preferences are weird, yo. 16. Aligning a Smarter Than Human Intelligence is Difficult. 'Alignment research'? 17. People Are Worried About AI Killing Everyone. The Chinese, perhaps? 18. The Lighter Side. Got nothing for you. Grab your torches. Head back to camp. 
Language Models Offer Mundane Utility Chat with Scott Sumner's The Money Illusion GPT about economics, with the appropriate name ChatTMI. It's not perfect, but he says it's not bad either. Also, did you know he's going to Substack soon? Build a nuclear fusor in your bedroom with zero hardware knowledge, wait what? To be fair, a bunch of humans teaching various skills and avoiding electrocution were also involved, but still pretty cool. Import things automatically to your calendar, generalize this it seems great. Mike Knoop (Co-founder Zapier and Arc Prize): Parent tip: you can upload a ph...
Aug 29, 2024 • 8min

LW - How to hire somebody better than yourself by lukehmiles

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How to hire somebody better than yourself, published by lukehmiles on August 29, 2024 on LessWrong. TLDR: Select candidates heterogeneously, then give them all a very hard test, only continue with candidates that do very well (accept that you lose some good ones), and only then judge on interviews/whatever. I'm no expert but I've made some recommendations that turned out pretty well -- maybe like 5 ever. This post would probably be better if I waited 10 years to write it. Nonetheless, I think my method is far better than what most orgs/corps do. If you have had mad hiring success (judging by what your org accomplished) then please comment! Half-remembered versions of Paul Graham's taste thing and Yudkowsky's Vinge's Law have led some folks to think that judging talent above your own is extremely difficult. I do not think so. Prereqs: It's the kind of position where someone super good at it can generate a ton of value - eg sales/outreach, coding, actual engineering, research, management, ops, ... Lots of candidates are available and you expect at least some of them are super good at the job. You have at least a month to look. It's possible for someone to demonstrate extreme competence at this type of job in a day or two. Your org is trying to do a thing - rather than be a thing. You want to succeed at that thing - ie you don't have some other secret goal. Your goal with hiring people is to do that thing better/faster - ie you don't need more friends or a prestige bump. Your work situation does not demand that you look stand-out competent - ie you don't unemploy yourself if you succeed in hiring well. You probably don't meet the prereqs. You are probably in it for the journey more than the destination; your life doesn't improve if org goals are achieved; your raises depend on you not out-hiring yourself; etc. Don't feel bad - it is totally ok to be an ordinary social creature! Being a goal psycho often sucks in every way except all the accomplished goals. If you do meet the prereqs, then good news, hiring is almost easy. You just need to find people who are good at doing exactly what you need done. Here's the method: do look at performance (measure it yourself); accept noise; don't look at anything else (yet), except that they work hard. Do look at performance: measure it yourself. Make up a test task. You need something that people can take without quitting their jobs or much feedback from you; you and the candidate should not become friends during the test; a timed 8-hour task is a reasonable starting point. Most importantly, you must be able to quickly and easily distinguish good results from very good results. The harder the task, the easier it is to judge the success of top attempts. If you yourself cannot complete the task at all, then congratulations, you now have a method to judge talent far above your own. Take that, folk Vinge's law. Important! Make the task something where success really does tell you they'll do the job well. Not a proxy IQ test or leetcode. The correlation is simply not high enough. Many people think they just need to hire someone generally smart and capable. I disagree, unless your org is very large or nebulous. This task must also not be incredibly lame or humiliating, or you will only end up hiring people lacking a spine. (Common problem.) Don't filter out the spines. 
It can be hard to think of a good test task but it is well worth all the signal you will get. Say you are hiring someone to arrange all your offices. Have applicants come arrange a couple offices and see if people like it. Pretty simple. Say you are hiring someone to build a house. Have contractors build a shed in one day. Ten sheds only cost like 5% of what a house costs, but bad builders will double your costs and timeline. Pay people as much as you can for their time and the...
Aug 28, 2024 • 2min

LW - "Deception Genre" What Books are like Project Lawful? by Double

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "Deception Genre" What Books are like Project Lawful?, published by Double on August 28, 2024 on LessWrong. This post is spoiler-free. I just finished Project Lawful, a really long, really weird book by Eliezer Yudkowsky. The book's protagonist is a knowledgeable and perceptive target. A conspiracy forms around the target to learn from him while keeping him from finding out that helping them is not in the target's best interests. The book is written from the perspective of both the target and the conspiracists. The target notices inconsistencies and performs experiments to test his false reality while also acting in the fabricated reality according to his interests. The conspiracists frantically try to keep the target from catching them or building enough evidence against them that he concludes they have been lying. This is a description of (part of) the plot of Project Lawful. But this could be the description of an entire genre! If the genre doesn't already have a name, it could be the "Deception Genre." Another work in this category would be The Truman Show, which fits the deception and the target's escape within a <2hr movie runtime. Other stories with lying don't really have the same structure. Walter White in Breaking Bad is trying to keep his crimes hidden but isn't constructing a false reality around the cops or his family. Death Note comes close, though Light tries to mislead L about specifically who Kira is and how the Death Note works rather than constructing an entire false reality around L. Many stories about dystopias have the protagonists discover that their realities are false, but fewer of those feature the perspectives of the conspiracists frantically trying to keep the deception running. Do you know any other stories in the Deception Genre? Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Aug 28, 2024 • 4min

LW - things that confuse me about the current AI market. by DMMF

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: things that confuse me about the current AI market., published by DMMF on August 28, 2024 on LessWrong. Paging Gwern or anyone else who can shed light on the current state of the AI market - I have several questions. Since the release of ChatGPT, at least 17 companies, according to the LMSYS Chatbot Arena Leaderboard, have developed AI models that outperform it. These companies include Anthropic, NexusFlow, Microsoft, Mistral, Alibaba, Hugging Face, Google, Reka AI, Cohere, Meta, 01 AI, AI21 Labs, Zhipu AI, Nvidia, DeepSeek, and xAI. Since GPT-4's launch, 15 different companies have reportedly created AI models that are smarter than GPT-4. Among them are Reka AI, Meta, AI21 Labs, DeepSeek AI, Anthropic, Alibaba, Zhipu, Google, Cohere, Nvidia, 01 AI, NexusFlow, Mistral, and xAI. Twitter AI (xAI), which seemingly had no prior history of strong AI engineering, with a small team and limited resources, has somehow built the third smartest AI in the world, apparently on par with the very best from OpenAI. The top AI image generator, Flux AI, which is considered superior to the offerings from OpenAI and Google, has no Wikipedia page, barely any information available online, and seemingly almost no employees. The next best in class, Midjourney and Stable Diffusion, also operate with surprisingly small teams and limited resources. I have to admit, I find this all quite confusing. I expected companies with significant experience and investment in AI to be miles ahead of the competition. I also assumed that any new competitors would be well-funded and dedicated to catching up with the established leaders. Understanding these dynamics seems important because they influence the merits of things like a potential pause in AI development or the ability of China to outcompete the USA in AI. Moreover, as someone with general market interests, the valuations of some of these companies seem potentially quite off. So here are my questions: 1. Are the historically leading AI organizations - OpenAI, Anthropic, and Google - holding back their best models, making it appear as though there's more parity in the market than there actually is? 2. Is this apparent parity due to a mass exodus of employees from OpenAI, Anthropic, and Google to other companies, resulting in the diffusion of "secret sauce" ideas across the industry? 3. Does this parity exist because other companies are simply piggybacking on Meta's open-source AI model, which was made possible by Meta's massive compute resources? Now, by fine-tuning this model, can other companies quickly create models comparable to the best? 4. Is it plausible that once LLMs were validated and the core idea spread, it became surprisingly simple to build, allowing any company to quickly reach the frontier? 5. Are AI image generators just really simple to develop but lack substantial economic reward, leading large companies to invest minimal resources into them? 6. Could it be that legal challenges in building AI are so significant that big companies are hesitant to fully invest, making it appear as if smaller companies are outperforming them? 7. And finally, why is OpenAI so valuable if it's apparently so easy for other companies to build comparable tech? Conversely, why are these no name companies making leading LLMs not valued higher? 
Of course, the answer is likely a mix of the factors mentioned above, but it would be very helpful if someone could clearly explain the structures affecting the dynamics highlighted here. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
