

What can we learn from AI exposure measures?
In a Justified Posteriors first, hosts Seth Benzell and Andrey Fradkin sit down with economist Daniel Rock, assistant professor at Wharton and AI2050 Schmidt Science Fellow, to unpack his groundbreaking research on generative AI, productivity, exposure scores, and the future of work. Through a wide-ranging and insightful conversation, the trio examines how exposure to AI reshapes job tasks and why the difference between exposure and automation matters deeply.
Links to the referenced papers, timestamps, and a lightly edited transcript of our conversation are below:
Timestamps:
[00:08] – Meet Daniel Rock
[02:04] – Why AI? The MIT Catalyst Moment
[04:27] – Breaking Down “GPTs are GPTs”
[09:37] – How Exposed Are Our Jobs?
[14:49] – What This Research Changes
[16:41] – What Exposure Scores Can and Can’t Tell Us
[20:10] – How LLMs Are Already Being Used
[27:31] – Scissors, Wage Gaps & Task Polarization
[38:22] – Specialization, Modularity & the New Tech Workplace
[43:43] – The Productivity J-Curve
[53:11] – Policy, Risk & Regulation
[1:09:54] – Final Thoughts + Call to Action
Show Notes/Media Mentioned:
* “GPTs are GPTs” – Rock et al.’s paper
* https://arxiv.org/abs/2303.10130
* “The Future of Employment: How susceptible are jobs to computerization?” – Frey and Osborne (2013)
* https://www.oxfordmartin.ox.ac.uk/publications/the-future-of-employment
* “AI exposure predicts unemployment risk: A new approach to technology-driven job loss” – Morgan Frank's paper
* https://academic.oup.com/pnasnexus/article/4/4/pgaf107/8104152
* “The Simple Macroeconomics of AI” – Daron Acemoglu
* https://economics.mit.edu/sites/default/files/2024-04/The%20Simple%20Macroeconomics%20of%20AI.pdf
* “The Dynamo and the Computer” – Paul A. David
* “The Productivity J-Curve” – Erik Brynjolfsson, Daniel Rock, and Chad Syverson
* https://www.nber.org/system/files/working_papers/w25148/w25148.pdf
* “Generative AI for Economic Research: Use Cases and Implications for Economists” – Anton Korinek
* https://www.aeaweb.org/content/file?id=21904
* Kremer’s O-ring Theory
* https://fadep.org/wp-content/uploads/2024/03/D-63_THE_O-RING_THEORY.pdf
* 12 Monkeys (film) – Directed by Terry Gilliam
Transcript:
Seth: Welcome to the Justified Posteriors Podcast, the podcast that updates its beliefs about the economics of AI and technology. I'm Seth Benzell, exposed to and exposing myself to the AI since 2015, coming to you from Chapman University in sunny southern California.
Andrey: I'm Andrey Fradkin, riding the J curve of productivity into infinity, coming to you from Cambridge, Massachusetts. Today, we're delighted to have a friend of the show, Daniel Rock, as our inaugural interview guest.
Daniel: Hey, guys.
Andrey: Daniel is an assistant professor of Operations, Information, and Decisions at the Wharton School, University of Pennsylvania, and is also an AI2050 Schmidt Science Fellow.
So he is considered one of the bright young minds in the AI world. And it's a real pleasure to get to talk to him about his work and spicy takes, if you will.
Daniel: Well, it's a pleasure to get to be here. I'm a big fan of what you guys are doing. If I had my intro, I'd say I've been enthusiastic about getting machines to do linear algebra for about a decade.
Andrey: Alright, let's get started with some questions. I think before—
Seth: Firstly, how do you pronounce the acronym? O-I-D (Note, OID is the operations, information, and decisions group at Wharton).
Daniel: This is a big debate between the students and the faculty. We always say O-I-D, and the students say OID.
Seth: So, our very own OID boy. All right, you can ask the serious question.
Andrey: Before we get into any of the specific papers, I think one of the things that distinguishes Daniel from many other academics in our circle is that he took AI very seriously as a subject of inquiry for social sciences very early, before almost anyone else. So, what led you to that? Like, why were you so ahead of everyone else?
Daniel: I'm not sure. Well, it's all relative, I suppose, but there's the very far back answer, which we can talk about later as we talk about labor and AI. And then, there is the sort of core catalyst day. I kind of remember it. So, back at the MIT IDE, where we've all spent time and gotten to know each other, in 2013...
Seth: What is the MIT IDE?
Daniel: The MIT Initiative on the Digital Economy, Erik Brynjolfsson’s research group. I was one of Erik's PhD students. My first year, we had a seminar speaker from the Computer Science and Artificial Intelligence Lab, CSAIL. John Leonard was talking about self-driving cars, and he came out there, and he said, “Look, Google's cheating. They're putting sensors in the road. We're building the real deal: cars that can drive themselves in all sorts of different circumstances. And let me be real with all of you. This is not going to be happening anytime soon. It will be decades.”
And there were other people who were knowledgeable about the subject saying, “No, it's coming in like 5 to 10 years.”
And at that point I thought to myself, “Well, if all these really brilliant people can disagree about what's going to happen, surely there's something cool here to try to understand.”
As you're going through econometrics classes, I wouldn't say econometrics is the same thing as AI. We could debate that, but there's enough of an overlap that I could kind of get my head around the optimization routines and things going on in the backend of the AI models and thought, “Well, this is a cool place to learn a lot and, at the same time, maybe say something that other people haven't dug into yet.”
Andrey: Yeah. Very cool. So, with that, I think maybe you can tell us a little bit about your paper “GPTs are GPTs,” which has had an enormous amount of attention over the years and I think has been quite influential.
Daniel: Yeah, we've been lucky in that sense.
Seth: In two years.
Andrey: That's not... I mean, some version of it was out earlier… No… Or is it? Has it only really been two years?
Daniel: It has been the longest, Andrey. If you and I weren't already sort of bald, it might've been a time period for us to go bald. Yeah, we put it out in March of 2023. I had a little bit of early access to GPT-4. My co-authors can attest to the fact that I rather annoyingly tried to get GPT-4 to delete itself for the first week or two that I had it rather than doing the research we needed to. But yeah, it's only been about two and a half.
Okay, so the paper, as I describe it, at least recently, has kind of got a Dickensian quality to it. There is a pessimistic component, there's an optimistic component, and there's a realistic component to it.
So I'll start with the pessimistic, or... why don't I just start with what we do here first? So we go through O*NET's list of tasks. There are 20,000 tasks in O*NET, and for each one of those tasks, we ask a set of humans who are working with OpenAI, who kind of understand what large language models in general are capable of doing:
Could you cut the time to do this task in half with a large language model with no drop in quality? And there are three answers. One answer is of course not; that's like flipping a burger or something. Maybe we get large language models imbued into robotics technologies at some point in the future, but it's not quite there yet.
Another answer is, of course, you can. This would be like writing an email or processing billing details or an invoice.
And then there's the middle one, which we call E2. So, E0 is no, E1 is yes, and E2 is yes, you could, but we're going to need to build some additional software and systems around it.
So there's a gain to be had there, but it's not like LLMs are the only component of the system. And the reason we pick other software is because there's a pretty deep literature on how software and information technologies generally require a lot of co-invention, a lot of additional processes, and intangible capital. That makes it difficult to deploy those technologies fruitfully.
And we figured, okay, by comparing that E1 category, the yes you can, with an LLM out-of-the-box, to the E2 category (how much do additional systems and innovation get us?), we could say something about whether generative pre-trained transformers, GPTs, are general-purpose technologies: whether they'll be pervasive, whether they improve over time, and whether they necessitate that kind of complementary innovation, whether they change the direction of innovation.
If we can say yes to those three things, then we're in a situation where we get to the pessimistic version of the story. You just can't know what the long-term equilibrium is going to be across different markets as a result of these tools.
So the prognostications that, ‘Oh yes, AI is coming to annihilate all the jobs,’ that the Machine God is imminent, or at least the Economic Machine God is imminent, I think those are a bit premature if you look and say this is a general-purpose technology, because historically general-purpose technologies have been hard to predict at the outset.
So the optimistic side of things is that that impact potential is pervasive. There's a lot of benefit to be had in changing how people work. We use this exposure measure—I'm sure we'll get into this—but exposure is not automation. Exposure is potential for change, and if there's potential for fruitful change, we get more value in lots of different places in the economy.
That's a good story. And if the reviewer is listening to this, thank you very much: one of our reviewers suggested looking at science and innovation tasks and research and development tasks and seeing how those compare to other areas. We found high levels of exposure in those areas, which means there's potential to turbocharge growth, at least temporarily, hopefully longer term, in the economy.
So there's a pessimistic and an optimistic component. On the realistic component, we compare the “yes, you can do it with an LLM” tasks to the “yes, you can, but you need more building” tasks, the set of tasks that get exposed if you build additional systems, if you were to snap your fingers and say, “Hey, we've got everything we need.”
That's much, much bigger than the stuff that's just exposed to LLMs on its own. So the realistic story is we have a lot of work to do as a society in the global economy to bring about the gains of these tools. And it'll probably take a few decades for it all to play out. As much as we think that the changes have been very quick, it has been a fast two years, or slow, depending on who you ask.
Seth: This has been great. Andrey and I are both bursting with questions. I'll let Andrey go first.
Andrey: I want just a quantification. Like, so what percentage of tasks are exposed according to the first definition? What percentage of tasks are exposed according to the second definition, approximately?
Daniel: Yeah, if I recall correctly, about 14% or 15% of tasks (depending on whether you're looking at the human ratings or the GPT-4 ones). GPT-4 and humans tend to agree, by the way. There's some noise there, but if you look at [the] GPT-4 ones, it's about 14% of tasks for E1, the level where it's just LLMs that can help. Now, if you snapped your fingers again and said, “Now it's E2 and E1,” that's about 46% of tasks. I might have my numbers slightly off there, but that's roughly what the numbers were.
Andrey: And did you calculate what share of occupations have 100% of their tasks exposed?
Daniel: There were very few, if any, occupations that were a hundred percent exposed. I think data scientist was up there, and it depends on the measure, so we actually have three different combinations of these scores. The most conservative is saying it's just E1, and then that's it, and the least conservative is E1 and E2.
We score each task that has either one of those labels as one and E0 as zero. And then there's this kind of intermediate one that I like, but my co-authors don't like as much, where E1 gets a one and E2 gets a 0.5. So it depends on what you look at. Mathematicians were highly exposed. My co-author, Pamela, has gotten some angry emails from mathematicians saying, “No, that can't be.”
I will say I use it for building theory now. I use the language models for building theoretical models, and they do a pretty good job. They make some pretty terrible mistakes occasionally, so you do have to check their work, but to go from a verbal sketch of what you're trying to prove to some math that roughly shows what the setup should be, it makes it easier to be a reviewer instead of a doer, as they say.
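As a rough illustration of the three scoring schemes Daniel describes, the sketch below turns task labels into an occupation-level exposure share. The scheme names, the example task labels, and the equal weighting of tasks within an occupation are assumptions made for illustration; they are not details taken from the paper.

```python
# Weight given to each task label under the three schemes described above.
# The scheme names are hypothetical shorthand, not the paper's notation.
SCHEMES = {
    "e1_only": {"E0": 0.0, "E1": 1.0, "E2": 0.0},          # most conservative
    "e1_plus_half_e2": {"E0": 0.0, "E1": 1.0, "E2": 0.5},  # intermediate
    "e1_plus_e2": {"E0": 0.0, "E1": 1.0, "E2": 1.0},       # least conservative
}


def occupation_exposure(task_labels, scheme):
    """Average task score for one occupation (assumes equally weighted tasks)."""
    if not task_labels:
        raise ValueError("need at least one task label")
    weights = SCHEMES[scheme]
    return sum(weights[label] for label in task_labels) / len(task_labels)


# Made-up occupation with four tasks: one E1, two E2, one E0.
example_tasks = ["E1", "E2", "E2", "E0"]
for name in SCHEMES:
    print(name, occupation_exposure(example_tasks, name))
```

For this made-up occupation the three schemes give 0.25, 0.5, and 0.75, which shows how much the headline exposure number depends on how E2 tasks are counted.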
Seth: Sure. All right. Okay. A couple questions from me. The first question is: are we talking literally when we are doing these E1 ratings? Are we talking literally about ChatGPT-4, or are we talking kind of generally about LLMs of approximately that quality? Or are we projecting forward to kind of near-future LLMs?
Daniel: Yeah. It was more the latter. We had a sense of where LLM tools were going to go. I think even looking at this set of tools we have now and GPT-4, they're very similar. There are expanded capabilities. It's kind of been a deepening of their capabilities, but where things were going was somewhat foreseeable, especially for my colleagues and co-authors who had been in the weeds with this.
But that does bring up an important weakness of this approach, which is as soon as you see something really qualitatively different or new capabilities showing up, you have to update the rubrics and the method; you have to rerun stuff. I think arguably the reasoning model paradigm is getting to the point where you probably have to rerun things.
Andrey: Are you considering rerunning things? Is this like an ongoing endeavor or—
Daniel: I'm not sure I'm going to return to writing an academic paper. I feel like I've gone to the well one too many times already with this. But if someone else wants to do it, I'm happy to help them out with it. Erik, Tom Mitchell, and I did something in roughly 2016 looking at supervised machine learning and reached some slightly different conclusions, but now that I've been through this twice, I'm not sure that I want to do it just yet.
Andrey: So this is a question that I wanted to kind of raise, 'cause certainly you guys are not the first to do this sort of exercise, and you've done it before. Frey & Osborne have done it. I remember when I was thinking about these exercises, when I first saw them back in 2017-2018, I was like, “This is an accounting exercise. Is this actually useful?” How do you determine in what sense this type of work...
Seth: To throw another critique of this whole research agenda out there: we talk about Frey and Osborne coming out with one of these a decade ago. You talk about your own SML experiences. I know Morgan Frank has a new paper out in PNAS Nexus that compares about 10 different people's exposure measures.
Daniel: Mm-hmm. Which all do different things. Yeah.
Seth: And they're all... they're all completely different. How should I think about the diversity of these indices?
Daniel: Well, there are different principal components underlying a lot of these different measures. Certainly SML and the GPT scores are very different. And Frey and Osborne, the way they constructed that effectively was...
Seth: Basically.
Daniel: Educated-guess vibes with CS professors for a training set.
I think their goal is to measure which jobs, as a whole, could be computerized. Actually, let me answer Andrey's question a little bit more directly. Like, when you look at these, what are they useful for? Let me start by saying what they're not useful for, because actually some folks have put words in their mouths on this.
Seth: Including Nobel laureates.
Daniel: No Nobel laureates that I know of, but there are some places and some folks who have said things like, “If you're exposed, you're hosed.” And this is what the authors tend to value, I will say...
Seth: with the word hosed. You set them up for that.
Daniel: It's possible that that is the case, but I have not seen any data to conclude that that is the case.
So let me state clearly for the record the things you do not want to predict with exposure scores, the things that exposure scores are not designed to do: economically meaningful outcomes like wages or employment are not things they predict. I'm not trying to say exposure will create unemployment. I'm not saying it'll cause wage loss. I view it as a risk measure. I'm a recovering finance guy. I think it's a risk that can be good or can be bad. Like, we don't really know. It just means there's an opportunity, technically speaking, to change the types of tasks that people are doing and how they do them. So exposed and hosed are possibly orthogonal ideas.
Nevertheless, I think it's worth tracking. Now, what else is it not useful for? Besides failing to predict labor market equilibrium, it's not useful for...
Seth: Breakfast?
Daniel: Can what make you breakfast?
Seth: You're—
Daniel: Scores?
Seth: Do you want to list all the things it's not useful for? Excuse me.
Daniel: Exhaustively, yes, we should. You can't eat the scores either. I wouldn't say it's especially useful for saying for sure that this is going to happen, right? Like, a technical thing that could help someone do a role is not necessarily appropriate socially, legally, or politically.
There's a whole bunch of different places where using LLMs might be inappropriate. One example, a famous one, is Geoff Hinton, who predicted that radiology demand would drop. And I think we are seeing, say, an appropriate example of where a multimodal model would be helpful in radiology.
It could probably pick up a broken bone, but radiologists as data-enabled doctors have a lot of other components to their work, and they interpret difficult cases. If you're going to tell someone about a condition that they've gotten, it's challenging. That's not the sort of thing where you want an LLM just spitting out, “You have this wrong.” That would be terrible bedside manner.
So even if it's theoretically possible, that doesn't necessarily mean it's going to happen. So, turning now to where they are useful. One is for testing this hypothesis, “Are we limited in what we can say?”, which is my favorite application of them, in the sense that we see pervasiveness and complementarity-necessitating exposure throughout the economy.
So we should dial back our confidence in terms of predictions of what will happen. That, I think, is what they were useful for: answering a very specific hypothesis that we had. But then, underneath that...
Seth: So you were able to... the hypothesis is that GPTs are GPTs? They're going to affect everything.
Daniel: Yeah. So the only one of the three conditions that we punt on is whether GPTs improve over time, because that one was obvious. We do have some evidence. But getting beyond that, I think about the first-order changes and where they're most likely to happen. I didn't know that this would be the case when we wrote the paper, but I think those measures that we built tended to predict where people would start adopting large language models, and there have been a few papers validating that empirically.
Seth: That makes perfect sense, right? So it's maybe not a good model of what's going to happen to your job, but it's a good model of where the OpenAI salesman should show up and knock on the door?
Daniel: Yeah, potentially. So you guys discussed this paper earlier on the podcast, but the Anthropic Economic Index, the areas where they thought people were or where they were showing people were using Claude, lined up reasonably well with the areas we thought GPTs and LLMs would show up.
Andrey: Except managerial tasks.
Daniel: Except managerial tasks. Those are happening; it's just not clear. I'm not sure what's going on in that dataset. In my work as a startup co-founder, I use all sorts of large language models for managerial tasks all the time. So we'll see what happens there.
Andrey: I used a large language model for managerial tasks earlier today, so I agree with you.
Daniel: Mm-hmm.
Seth: Right. Seems like these AIs are being used. If you look at the Anthropic index, it really does focus on people using it in these kinds of hobby contexts, which is one of our big takeaways from that episode. So I mean, people don't manage as a hobby, so if a lot of Claude usage is hobby usage, you would expect managerial tasks to be underrepresented.
Daniel: You're saying that with the exception of the technical folks, software engineers, and data scientists, it's just like ripping with this stuff, right? Like, because that's not necessarily a hobby.
Andrey: Ripping with it, and Cursor, I mean. Now we're getting...
Daniel: Sure. Yeah. API use, yeah. Yeah.
Seth: Right, that's the giant use case right now.
Daniel: Yeah, and that one's a great one. It's kind of ironic given our focus on software, but to some extent you can keep doing what you were doing, but just do it way better in software development with these tools. You don't actually have to transform the structure of software engineering too much to just get a very quick benefit, but I think there is a new mode of working and developing with AI-driven tools that has an analogy in that famous “The Dynamo and the Computer” paper. The paper discusses the conversion to electric power. You think of the steam engine, right? For the listeners who aren't aware, it's this giant thing in the middle of the factory, and all these pulleys, levers, and belts come off of that thing, and it powers the whole factory. And then over the next few decades, they realize, ‘Let's modularize that power.’ When we convert to electric power, the first thing to do with electric power is to do the same thing, but, like, a little bit better.
Take a giant dynamo, stick it in the middle of the room, and we're off and running. But eventually they were like, “Well, what if we make that really small?” And then we have lots of little machines all powered by their own little engine. Sort of similar, and I'm seeing this with some large companies: you start with a really monolithic, large technology function in the middle of the company that kind of, like, powers lots of subgroups and builds technology for them, and then something kind of magical happens with these AI models.
You can sit down with a subject matter expert, a product person, and a senior developer to make sure that these people don't hurt themselves as they're building something. And you create these, like, modular, Jeff-Bezos-two-pizza-team versions of work where people have input into a process. And then, rather than throwing that process over the wall to the dev team, waiting three weeks, and seeing them come back with something that doesn't fit, you just develop together and watch the models go. It really ups your cadence, but it opens up all sorts of best-practice shortfalls that can happen.
Like, have you hardened for security properly? The devs know what questions to ask there. So going from a specification to a finished product can be way, way quicker if you redesign how the work goes. It's kind of similar to that steam-power-to-electric thing.
Andrey: I guess maybe a natural place to go from here is that there's kind of this distinction between micro-level, task-level exposure and the macro-level implications. So, how should we be thinking about that? And certainly people have used your micro-level exposure metrics in macroeconomic models, and so…
Seth: Tell us about what that experience was like.
Daniel: People use them in different ways. There are papers that you guys have discussed on the podcast before. If you look at the Simple Macroeconomics of AI paper by Daron Acemoglu, he uses our sort of experimental automation score, which is not, “Could you use an LLM to improve your task output?”
Here it's like, “Could you use an LLM to just straight up do this task without a person involved?” It's a really small proportion of tasks in the economy. That's a five-point scale, so it's our fourth or fifth most intensive automation risk scores. I don't love those scores, to be honest, but they are in a pretty narrow area.
So it's not surprising that we find, or that we read in his paper, I should say, a seven-basis-point-a-year outcome. The OECD has a version where they use the exposure scores, and they get to something like 70 basis points of productivity growth per year. So it's all of one MLA's gains right there.
But per year, I think these are a public good, these scores in some sense, and people bring their models and their priors, too; they're trying to discipline what they believe will happen with the economy with these scores. And they're noisy. I wish there were something more useful for these people to deploy in their models.
But to the extent that we can be helpful, we're really happy that this thing is out there. I just caution folks against viewing exposure as automation, which is a common failure mode, or even leaning on things like automation and augmentation as the choice that we have ahead of us at the macro level.
Like, and Andrey, to your point, the macro-level conclusions, yes. Labor markets are how we share the gains from economic activity primarily across society. And then, when you get down to a micro-level task and you're asking a worker, or a manager-worker combo, “Are you upset if we automate this task or augment this task?”
Either one. It's anything goes. It's about the labor market and the unit of work that's being purchased in the labor market. I could automate something I hate doing and be thrilled with it 'cause I could go spend my time doing other stuff. I could automate my whole job and make myself really sad. Well, maybe really sad, but I'd have to find another job.
I could augment someone and make them thrilled and pay them more, or I could augment them such that they do the work of 10 different people, and then nine people get fired. So I think this augmentation-versus-automation micro-question really does boil down to just exposure and changing work.
And we can't say much more than that. And even though automation and augmentation are like an elegant mathematical framing in these models, I don't think it's something that we can lean on from a policy perspective at the micro-level. It's just like you're going to change what people do.
Seth: Yeah, I'm going to push back on the idea that it's an elegant micro idea, right? Because for exactly the reasons you...
Daniel: Macro-idea, I should say. It's an elegant macro idea. I don't think it's an elegant micro-idea. Yeah.
Seth: Right. But even then, it's kind of... let me put it this way. To me, when people want to distinguish between augmenting and automating technologies, they want to talk about them as somehow separate from the rest of the economy. But as you've been implying, the real reason you can't say a certain technology is automating or augmenting is because that production is embedded in an entire economy.
And that's going to tell you whether, as productivity goes up, you want more or less of that thing. The way I would put it is to use the metaphor of Marshall's scissors, right? So there's a story that's told of the famous economist Alfred Marshall from the University of Cambridge, who was the advisor of John Maynard Keynes. And somebody asked him one day whether it was supply or demand that was more important in setting the price for a certain good.
Seth: Marshall said it's like asking what blade of the scissor is doing the cutting, right?
Daniel: Mm-hmm.
Seth: You can't talk about one without talking about the other if you want to know what the outcome is. And the way I see it, your paper is one blade of the scissors, right? It's the one blade of the scissors that's coming in telling you this job can be changed, but you need to know everything else about the rest of the economy to understand how the job will be changed.
Daniel: That's right.
Seth: And we've talked about examples. There are countless famous examples, from ATMs to the cotton gin, which is an example I like, of jobs getting automated and then demand for that form of labor going up.
Daniel: Right. Yeah. Couldn't agree more. Yeah.
Seth: Now Dan, I do have a micro-take, and I'm interested in whether you buy this, take this prediction about what exposure scores will do to an occupation. This is a somewhat out-of-equilibrium take. This is a partial equilibrium dynamic take, and maybe it'll be smoothed out in the long run, but in the short run, my prediction is that in occupations that are more exposed, there will be more wage polarization at middle-tier firms for that job and less wage polarization at extremely good or extremely bad firms that use that job. Alright, so I've got a kind of a framework here. Are you ready? Can you see where I'm going with this, or are you ready for me to give the reason why?
Daniel: I have some hypotheses about how that could work, but I—yeah—don't leave me hanging here.
Seth: Right. Okay. So should I start with the general equilibrium first, or should I start with the micro level first? Let's work from the bottom. So imagine you've got a job that uses two tasks, right? Task one and task two. They can be gross complements in production, but it's actually not important.
But you need them both there; they can be gross complements, as long as they're not perfect substitutes, right? They can be gross substitutes. That's also fine. I'm a doctor. I need to spend so much time having bedside manner, so much time recognizing the x-ray. I know that's not a perfect example, right? Okay, imagine a technology comes out that allows you to automate one of the two tasks. Okay, well then obviously people who are worse than the technology at the automatable task automate it. And the people who are better than the technology at that task don't automate. I know this is already going to get a little bit off of the way that maybe you think about how things are, but grant me that for a second.
Okay, what happens? People who are bad at task one but good at task two see a big improvement, whereas people who are good at task one and bad at task two see no improvement. Right? And if you're equally good at both, it kind of depends on how good the thing is. Okay. All right, so that's the first step. So where would you get wage polarization from automation? You would tend to get it in jobs where people's skills are anti-correlated. Right, because as we just said, if you're good at one and bad at two, and we automate one, it doesn't help you. But if you're bad at one and good at two and we automate one, it helps you a lot. So you would expect to see wage polarization, an expansion of the wage distribution, for jobs where people's skill levels are anti-correlated. Okay? So now you might say, “Sure, Professor Benzell, that sounds cool, but why would we ever expect in certain settings for skill levels to be anti-correlated?”
Okay, and now I'm going to bring in the O-ring, right? So Kremer has a general equilibrium theory of the economy: the productivity of a firm or whatever is somehow bounded by the limiting, worst agent in the system, right? So this comes from the space shuttle Challenger explosion; the space shuttle explodes. We think it's because of this one faulty part, the faulty O-ring. Okay. What's the general equilibrium implication of this model?
It's basically that you should get people of similar skill levels all concentrated at the same type of firm. So there should be super good firms that have all the high-skilled people, mediocre firms that have all the mediocre people, and bad firms that have all the bad people. How do you get a mediocre person? Most mediocre people are mediocre 'cause they're good at one thing and bad at another thing. So now we come back to my hypothesis, which is that exposure should lead to more polarization at those middle-tier firms. And in fact, I'd love to bring this to some experimental evidence; I'm kind of working with Kyle Myers, a great economist and friend of the show at HBS, on this. Can we predict the experimental outcomes if you introduce AI to a place, and it's exposed to some of the tasks? When do you get that polarization in productivity and wages, and when do you seem to just kind of boost everyone by the same amount?
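Seth's two-task argument can be sketched numerically. The code below is a toy illustration, not a model from any of the papers discussed: it assumes a CES aggregator over the two task skills (one standard way to make tasks imperfect substitutes), assumes the technology performs task one at a fixed quality level, and uses made-up skill numbers.

```python
def ces_output(skill_1, skill_2, rho=-1.0):
    """CES aggregate of two positive task skills.

    Requires rho < 1 and rho != 0, so the tasks are imperfect substitutes.
    With rho = -1 this is the harmonic mean of the two skills.
    """
    return (0.5 * skill_1 ** rho + 0.5 * skill_2 ** rho) ** (1.0 / rho)


def gain_from_automation(skill_1, skill_2, tech_quality, rho=-1.0):
    """Output gain when a worker may hand task one to the technology.

    The worker adopts the technology only if it beats their own task-one skill.
    """
    effective_skill_1 = max(skill_1, tech_quality)
    return ces_output(effective_skill_1, skill_2, rho) - ces_output(skill_1, skill_2, rho)


TECH = 5.0  # hypothetical quality of the technology at task one

# Hypothetical workers: (skill at task one, skill at task two).
workers = {
    "bad at 1, good at 2": (1.0, 9.0),
    "good at 1, bad at 2": (9.0, 1.0),
    "equally good at both": (4.0, 4.0),
}
for label, (s1, s2) in workers.items():
    print(label, round(gain_from_automation(s1, s2, TECH), 3))
```

Under these assumptions, the two workers with anti-correlated skills start out with identical output, but only the one who is weak at the automated task gains, which is the spreading-out of outcomes Seth describes. The balanced worker gains a little because the technology is slightly better than they are at task one.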
Daniel: Okay. So some quick reactions there. So just to immediately hop from automation to exposure… Folks, I guess I'm going to ask you a question that, funnily enough, I was asked by Joe Stiglitz as a grad student. I was lucky enough to get to sit next to him at a lunch. He was like, why do jobs exist?
Like, why are certain tasks bundled together? And honestly, I don't have a great answer other than to gesture sort of vaguely at coordination costs. But within the task shifting that you're discussing, you've got this mediocrity or sort of middling productivity that comes from the fact that some of the things they're good at, and some of them they're not.
It's still really hard to kind of blow apart the job and then reconstitute it with specialization. So I think where it's coming from is like, people are overall high productivity, and then there's a low productivity component, and then there's kind of this middle thing where you've got some CES aggregator that says, “This person is going to be slightly worse than the average of their components.”
Exposure might lift them in some cases and might not affect them in others. So I kind of buy that piece. To move it to the equilibrium framing, though, I think what'll probably happen in a lot of cases is like a mini Baumol cost disease across everything that we do. The areas where we're least productive are going to be the ones that absorb most of our time.
And in the beginning, there'll be a lot of confusion about that because LLMs will make it unclear what the least productive thing is now that you might be really bad at something. Right now, I know I'm really bad at writing, like spec docs for software. Well, now I have a process with Claude where I can write much better spec docs, and I'm not as terrible at it.
But once you get out of this sort of disequilibrium condition, you might end up in a situation that looks a lot like the one we have right now as things settle. But then, the job boundaries have changed. And there are new names for things. I'll give you a small example.
There's a new hot job in Silicon Valley called the Forward Deployed Engineer, where we've got some of these—
Seth: Hazard pay?
Daniel: This is a role at Helix. We've got a forward-deployed engineer looking for more Win Ma shout-outs. She just started.
Seth: Are they waiting for them to call in air support? What's going on?
Daniel: You send them to the customer's site, and they work with customers.
You need really strong interpersonal skills, but you also need engineering skills. That's like a new configuration of work.
Seth: Wasn't that called being a consultant?
Daniel: No, no. Uh,
Andrey: no, no.
Andrey: If they’re a consultant, then you wouldn't be able to pay them as a forward-deployed engineer. Seth, what do you mean? This has nothing to do with what McKinsey would ever do.
Daniel: I'm not sure that calling someone a consultant will… I'm not sure which end of that ends up being cheaper for the firm. But the critical thing here is that's a different mixture of work.
Daniel: Those are some initial reactions.
Andrey: I have reactions too. I think on one level, I'm always a little skeptical of intricate theories like this, when—
Seth: I just have two parts. It has two parts you have to give me.
Andrey: No, no, I mean more so that the, like, first-order question is even about income inequality, right? Like, it's hard to answer, and then you're trying to answer this even more sub-sub question. And I guess where I'll push back is in terms of what the highest firms are, right?
Like, production could be an O-ring within a person, or production can be an O-ring across people, right?
Seth: It turns out that the prediction does not rely on whether the O-ring is within people or across people. As long as the tasks aren't perfect substitutes, what I just described goes through.
Andrey: But I guess what I would think is that if we have specialists in 10 different tasks at a high-end firm, and then one of those tasks gets automated, surely one of those people's jobs will get fully automated, and I know Daniel is not liking automation already. But that person's…
Daniel: I do believe it exists.
Andrey: That person's wage will go down. Right? Creating inequality.
Seth: Yeah. But I have a theory of one of your tasks being automated, not a theory of all of your tasks being automated.
Andrey: That's where my point is. I mean, it's an interesting question. High-end firms have a lot of specialization, maybe perhaps more specialization than lower-end firms. And so then the person is so specialized that if their specialty is very hard, then we might expect a bigger labor market effect for them.
Seth: You might imagine if tasks were organized differently at large firms, this theory would run into issues. Of course, there are omitted variable problems up the wazoo, but I'm intrigued by the idea of looking into whether people's skills in the subtasks that make up their task bundle, which is their job, are positively or negatively correlated. And I do think that that will tell you a lot about what happens when you automate part of the task or part of the job. So now bringing that to the data is complicated, but that's my insight.
Andrey: Saying one more thing, just how much do we expect new firm entry to be the key margin with all of this? Right? We know that organizations are very friction-filled, and adoption decisions even—
Seth: New organizations, new jobs, right? If you slice out half of the task from a job, in the long run it is probably a new job.
Andrey: Yeah, I think both of those. So then, in terms of thinking about existing firms, it's a little for me in general. Or, at least I expect, I'll be wrong; I expect a lot more entry and growth from new companies that are kind of taking advantage of this new production process from the ground up. That's kind of the lesson of the supply-side disruption theory.
Daniel: Yeah, I'd agree with that. I think one of the reasons it takes such a long time for the benefits of sufficiently transformative technologies to show up is that it usually takes a while for the firms that are deploying them well to become economically meaningful. And then they sort of set a standard.
Seth: Right? That's not the margin on your margin. The firms that figure out how to do it grow faster, which is another margin.
Daniel: And I think, agreeing with Andrey, that a lot of them are new entrants. Then it's not like an incumbent will always figure out the answer, or do they have to a lot of the time? Where I would ask you a question then, Seth. Just on the idea that the bundled tasks have some spectrum from super negatively correlated to perfectly correlated individual task productivities.
Why do you think those tasks are bundled together? Because there's some coordination cost benefit? Do you think there's probably some lower bound on how negatively correlated your productivity can be across these different tasks?
'Cause, if you really suck at half your job, you probably can't do that job. I think you probably need weak, positive correlation everywhere.
Seth: Ooh, man. I think for the sorting to happen… So let's take a thousand people who are all doctors, and I agree that you kind of want to think about the step before that, before we get the thousand doctors. But I'm saying now that we have a thousand doctors, some of them are going to be better at task one, and some of them are going to be better at task two. And then you're going to get negative correlation across those abilities in the mediocre firms. Now, you're right; there might be some censoring. If you're so bad at one of the tasks, you don't become a doctor, but I'm saying conditional on having become one.
Daniel: Oh, okay. I could see that. Yeah. The thinking is like a Dr. House situation: everybody hates him, but he is really, really good at the diagnostic side of things. But like if he weren't, then no one would put up with that. He would've just been fired.
Seth: Right? He'd have a higher-paying job and be more productive if he was able to be nice for 10 minutes.
Daniel: He’d probably be an investment banker or something.
Andrey: There's a mirroring here too, like a general phenomenon in digitization, which is like the ability for specialization, for more niche content to do really well, right? So, if you’re only good at a task, and now that all the complementary tasks have been automated away, then you shouldn't be bound by your firm anymore.
Like, you should be able to essentially create your own small business or join the most productive firm as the specialist in that specific area because all your other characteristics don't really matter that much anymore. So Dr. House would be able to essentially, officially run a business, even though he is really bad at organizational things, because all that stuff comes out of the box.
Seth: I think that's why I talked about this theory as being kind of a short-term partial equilibrium theory 'cause in the long run you're reinventing businesses.
But, you said something really interesting, Dan. And maybe I will start to transition us now about the idea that it's going to take time for people to figure out how to use these GPTs, right? The general (that is, chatbots or LLMs), excuse me. What sort of macroeconomic implications does that have? I understand you've written a little bit on this topic.
Daniel: Yeah, right. Erik, Chad Syverson, and I call this the productivity J-curve. I think the dynamic is when you see pretty much any kind of investment, there's an initial outlay period where things are expensive, and then there's a harvesting period later.
There's the famous Robert Solow quote: You see computers everywhere, except in the productivity statistics. People are already starting that with AI; I've seen a number of news articles that say there's no ROI for this. I think the way you kind of square the circle here is, well, in the beginning of a new technology, when everyone realizes, Okay, we're going to take the plunge; you're actually going to invest in this.
You spend a lot of time kind of reconfiguring work, building new business processes, trying to figure out what new products to build, and collecting information, a whole bunch of really expensive stuff that's really hard to quantify. So it doesn't end up in GDP, to the extent that it could, but that's building up a capital asset.
So, output is going to be understated. In the meantime, while we have this, it's going to look like we're putting in more to get less out. Then later that intangible asset is actually there, but not measured, and now it's an input instead of an output. And when it starts to spit off money, then everyone's going to say, “Oh, hey, look at how productive we're being, because it looks like you're getting more as an output for less as input.” Really, it's just that thing paying off. So that tension between the growth rate of investment in this new type of capital and the growth rate of the capital stock that you're missing, that difference, depending on its share in the overall economy, can be meaningful. And we use the stock market to measure it because investors aren't dumb.
On average, they price these assets, or companies wouldn't invest in them, under a roughly efficient markets hypothesis version of the world. And if you're pricing those assets, then you can kind of back out roughly the magnitude of that adjustment you should be making to productivity growth.
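The measurement logic Daniel describes can be illustrated with a two-period toy example. This is a sketch under made-up numbers, not the estimation in the J-curve paper: in the investment phase some measured input is diverted to building an unmeasured intangible asset, and in the harvest phase that asset acts as an unmeasured input. The function names and all figures are assumptions for illustration.

```python
def measured_productivity(goods, measured_inputs):
    """What the statistics see: measured output over measured input."""
    return goods / measured_inputs


def adjusted_productivity(goods, intangible_investment,
                          measured_inputs, intangible_services):
    """Counts intangible investment as output and intangible services as input."""
    return ((goods + intangible_investment)
            / (measured_inputs + intangible_services))


# Hypothetical economy: 100 units of measured input in both periods.
periods = [
    # Investment phase: 20 units of effort build an unmeasured asset.
    {"phase": "investment", "goods": 80.0, "intangible_investment": 20.0,
     "measured_inputs": 100.0, "intangible_services": 0.0},
    # Harvest phase: the asset now contributes 20 units of unmeasured input.
    {"phase": "harvest", "goods": 120.0, "intangible_investment": 0.0,
     "measured_inputs": 100.0, "intangible_services": 20.0},
]

results = {}
for p in periods:
    measured = measured_productivity(p["goods"], p["measured_inputs"])
    adjusted = adjusted_productivity(p["goods"], p["intangible_investment"],
                                     p["measured_inputs"],
                                     p["intangible_services"])
    results[p["phase"]] = (measured, adjusted)
    print(f'{p["phase"]}: measured={measured:.2f}, adjusted={adjusted:.2f}')
```

In this toy economy measured productivity dips below the adjusted figure while the asset is being built and rises above it once the asset pays off, even though the adjusted figure never changes. That pattern is the J shape.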
So it's kind of a fun spin on growth accounting, which I know isn't the reason everybody gets out of bed in the morning—to go account for where the growth is. But—
Seth: Don't underestimate our audience, Dan.
Andrey: Look, I mean, big political debates hinge on the measured rate of GDP growth. So, it's important. How big of an effect did you find in that paper?
Daniel: Oh, I don't remember the exact numbers anymore. It's been a little while. I should look it up. But it's a lot. If I recall correctly, it might be something like 75 basis points a year for some period of time. The overall view is: look, we have good news and bad news. The good news is that the productivity growth rate level is actually a bit higher than we had thought once you account for these hidden assets. The bad news is that the slowdown from 2005 is even bigger than we thought because they were building intangible assets back then too.
Andrey: Well, how do you compare the intangible asset investment? I think this is kind of the key
Seth: Yeah. What's bigger? The invisible teapot or the invisible elephant?
Andrey: Because right now we're getting a lot of intangible investment into learning new production processes with AI, or is the answer just to look at how much the stock market has gone up? Is that the answer?
Daniel: Oh, that's basically it, Seth; you're not too far off. We do a hedonic regression. If we were to look at, say, the R&D assets, because this one's kind of mature, you don't really see too much from R&D on its own, but we can see if a dollar of R&D investment capitalized is actually worth a dollar and 10 cents in market value. We assume that there is 10 cents of intangible correlate value there.
Or if you really wanna be pedantic about it, it's 10 cents of intangible correlate combined with quasi rents from the fact that you can integrate R&D investment better for productive purposes than your competitors could. And then I'm going to wave my hands and say, But that's actually an asset, so it's an intangible asset too.
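The market value regression Daniel describes can be sketched in a few lines. The example below is a toy, assuming a single regressor and synthetic firm data built so that each capitalized R&D dollar is associated with $1.10 of market value; the actual work uses a richer hedonic specification, and none of these numbers come from it.

```python
def ols_slope(x, y):
    """Slope of a one-variable least-squares fit of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance


# Synthetic firms (made-up numbers): capitalized R&D and market value,
# where market value = 50 of other assets + 1.10 per dollar of R&D capital.
rd_capital = [10.0, 20.0, 30.0, 40.0]
market_value = [61.0, 72.0, 83.0, 94.0]

value_per_rd_dollar = ols_slope(rd_capital, market_value)

# Anything above 1.0 is read as the intangible correlate (plus quasi-rents)
# that travels with a dollar of measured R&D capital.
intangible_correlate = value_per_rd_dollar - 1.0

print(f"market value per R&D dollar: {value_per_rd_dollar:.2f}")
print(f"implied intangible correlate: {intangible_correlate:.2f}")
```

The coefficient here is a price-like association, not a causal effect: it says what the market pays for firms holding a dollar of the measured asset, which is the interpretation Daniel defends later in the conversation.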
Seth: Right. This is something. I mean, I remember us spending lots of time back in the day in the MIT IDE break room, having a cup of coffee looking out over the Charles Jerome, walking by the Aour, locked in these intense conversations about just how do you measure these intangible assets?
They seem so essential to everything, yet they are literally the latent vaporware. They're our generation's TFP, if you will.
Andrey: I don't know. I think the principle I obviously agree with, right? Like you have these investments that are not easily measurable, and they surely should be counted in some way. But it's not obvious to me. If the rate of intangible investment were constant over time, then it's a constant adjustment, and we don't really have to think very much about how the world works. But then I think measuring the intangibles is kind of tricky, because I think about market cap, and not only are you already talking about rents, but to me competition is so important there, right? You don't gain market cap just because you're doing investment. You gain market cap because you have market power in the future.
Seth: Yeah, but now you have to think about it. Why would you ever pay an adjustment cost in a perfectly competitive economy? You never make the adjustment cost, right?
Andrey: Well, I would say that there are different degrees of market power that can exist, or you can have your kind of standard monopolistic competition model where everyone's kind of keeping up to keep up, but then you can have companies like your Googles and whatever, who clearly don't think that the right model of the world is that.
Yeah, and I guess the other thing is I will always be skeptical of firm value regressions. I think the endogeneity issues are fatal, but I don't know.
Daniel: Yeah, I disagree with you there, that it is just—
Seth: You just died. You were just killed.
Daniel: I feel so devastated.
Andrey: Yeah.
Daniel: No, I think where I disagree is, I think Tim Bresnahan put it this way. He is just like, “Well, everything's an asset here, including the capacity to generate rents, so it's just an interpretation question more than anything else.”
And you can bound things, right? Like, when you go and run some of these regressions, you're not saying, I think that an additional unit of AI investment causes this market cap. There's the endogeneity; it's predictive. It's like, “Here's a price on this thing”; it's not at all saying if you are…
Seth: Here's a model: there's only room for one social media platform. So whoever got there first planted their flag on that land. They didn't make an intangible investment. They just planted their flag first.
Daniel: Right. That's what I'm saying too. It's like they planted the flag first, and now it's worth 10 bucks. But I'm not saying if you were to just go up…
Seth: 10 bucks. Which seems marginal…
Daniel: Oh, yeah. Oh, you're talking about the marginal versus inframarginal differences. And the way you deal with that, as opposed to how you do in any structural models, is you assume it away and say that marginal equals average q for some of these.
But it's not like when you run these regressions that you get coefficients of a thousand; you get coefficients of like somewhere between 4 and 12. So, is it unsatisfying—
Seth: That—you get 4 and 12—what?
Daniel: Oh, if I were to say… regress market value on measures of IT capital, the multiplier I get, and this has been sort of stable in weird ways for 20 years, the coefficients you get are somewhere between 1 dollar of IT investment correlated with like 4 dollars of market value on the low end to like 12 dollars of market value on the high end. And it's that which bounds the debate. It's not saying this is infinitely valuable. There's this enormous intangible asset that's the entire economy.
And then it's also not saying it's nothing. So I think that imposing some assumptions, which you can absolutely question, and I think we all should to try to get better models, imposing some assumptions and doing the best you can is a way to learn something as opposed to, like, just throwing our hands up.
But yeah, I agree with you that the causal interpretation of these things is not correct.
Seth: You then—so okay, the useful question—are we in the bad part of the J-curve?
Daniel: Which part's good and which part's bad?
Seth: The good part is when you're going to get more growth down the line than it looks like you have now.
Daniel: We are in the hard work investment stage of the J-curve.
Seth: Okay.
Daniel: I don't think we're in the—we're anywhere close, at least not for AI. I don't think we're anywhere close to the harvesting side yet.
Seth: But you think the GDP is on the underestimated side, which is what I mean by the good side.
Daniel: Yeah, I would say very modestly, GDP is underestimated right now.
Seth: Very modestly, 1%, 2%,
Daniel: I think that's because I'm probably ambitious. But what's GD?
Seth: Order of magnitude, 1%.
Daniel: Yeah. So where it's tough is like the parts of AI investment that are happening right now, I think, are actually fairly well captured by GDP. We're seeing a huge amount of CapEx in data centers and GPUs, and those things are priced pretty well.
But eventually people are going to question, how do you make someone responsible for hallucinations that the models might make or come up with good policies that get people to create good outcomes there? That's a hard thing to do. I don't think we're like anywhere close to scratching the surface with that.
Andrey: I guess the intangible investments now are more about how we go about teaching using ChatGPT. 'cause that's not going to be measured in a change in labor inputs, but it's something that is not going to materialize until we actually figure out how to teach people more effectively.
No, it's not clear that that was ever a GPT build. But, if we were a regular for-profit firm at the university, that's—
Daniel: Yeah. So, that stuff will take a while… I don't know… I don't think even if we stopped—
Seth: Of all the people who actually do work in the economy, are the people you're referring to—
Daniel: Right. And in particular the AI researchers—if AI researchers stopped building new LLM tools and making these things better today, we would still have quite a while to actually integrate this and put them to their best use. So that's kind of a bummer.
Seth: Then let me ask it that way. So if you don't wanna give me a percentage rate of intangible investments either—below average—do we need to spend a hundred percent of GDP over the course of the next 20 years in order to take these advantages cumulatively? How many intangible investments do we have in front of us? Do you have a sense of the order of magnitude of that?
Daniel: I don't know how deep the well goes. No. But it might be quite a lot.
Seth: One thing related to this, I was thinking about when we were talking about part one, is you've got these two measures of jobs: AI exposure, one of which is “just the LLM” and one of which is the LLM plus software tools. Didn't you tell us that you can use LLMs to make software tools?
Daniel: Oh yeah. It's, it's totally recursive. But the reason we pick up on software tools is because it also requires the changing of business practices and these organizational things.
Seth: So that's the way to do it. Can we play that game then? Can we look at the wedge between E1 and E2 as telling us something about the size of the adjustment costs needed or the intangible assets needed?
Daniel: I don't think it gives you that, to be honest. Sorry, Seth. I know my tools are unsatisfying here. That's a good research question, though. I think actually, the market value regressions that Andrey hates are more likely to get you a ballpark for that.
Seth: Do any sorts of policies or ideas come out of the J-curve? Should we be somehow subsidizing intangible investment? Do you think this is happening at a socially suboptimal rate? I mean, you would expect that, like any innovation, you'd expect there to be positive externalities as people copy and learn from each other.
Daniel: I don't have any evidence to suggest it happens at all, that there's an externality here that needs some sort of correction. Where I could see some policy considerations, and obviously I'm not in charge of any of these things, so take what I have to say with a grain of salt, as you would for anything else I say.
Daniel: I think when it comes to monetary policy and thinking about how hot or cold the economy is, it may be helpful to know how much intangible asset creation is happening because it's a compositional shift. And you might think that the economy is in a recession when it's actually doing quite well, at least in certain pockets.
There’s a distribution of gains question here that's pretty important. Like who creates the intangible capital versus who benefits from it versus who's just like, shut out of that part of the economy altogether. But I think on average you might want to know if your growth rate is actually, in real terms, two-and-a-half percent versus one-and-a-quarter percent or something.
Andrey: And I guess you would look at the stock market. So if we have kind of this case where the stock market is going up, but GDP is not going up as much. Maybe, you'd be like, “That's okay on some margin.”
Daniel: The stock market is an increasingly less useful tool, sadly, because there are fewer public firms, and there are other reasons that those large firms would be different than the rest of the economy. It's just a quick thing to do. So it's easy to get those market values and start to pull that info.
But I think the ideal thing to do is to have an actual sense of how these assets are priced. Like you could look at M&A and costs for whole software firms. Sadly, you can't shave off a tiny piece of your digital culture and market it and sell it to someone to get a little bit of a value indication.
But I think much more complete data would give you a sense of what these assets are being valued at. It could be helpful. That's if you're willing to buy into a premise that I more or less do, which is that on the margin, either these asset markets or securities markets are doing a pretty good job.
If you think that there's some sort of bias in them that prevents you from actually sorting 'em out. Like, let's say everything is priced in terms of e-commerce, and I mean, obviously there's no hype factor in crypto, but yeah. Let's assume a wild assumption, for a second, that crypto is not priced at its actual long-term fundamental value and you were using crypto prices to back out the value of all illicit trade around the world. You might mistake illicit trade assets as being super valuable in that case, if those crypto coins are a claim on future illicit trade value, so…
Andrey: What—what?
Daniel: I'm probably saying too much?
Seth: The stock market may look really good, but the companies are building evil products, so don't—
Daniel: Right?
Seth: —welfare growth.
Andrey: Well, this is—
Daniel: Yeah.
Andrey: Daron has the point of view that all the AI innovation is for making social media more addictive.
Daniel: All right. Which is, in my view of the world, an asset.
Andrey: What about “GPTs are GPTs”? Does that have any policy implications or, I guess, any follow-on work that you have on that?
Seth: I understand you've looked at how firms differ by these exposure measures.
Daniel: One of the conclusions there. So, if you were to look at the exposure of firms against their quantities of tech workers, there's a little bit of a mechanical relationship here because tech workers are highly exposed. But there is a difference across companies, like whatever exposure measure you want to use.
And the reason we do that, Seth, is precisely 'cause of what you brought up. You can use these tools to build better technology. So in some sense those companies might have a good reason to run away in performance. But like the differences from low to high exposure and entity measures across firms are not nearly as big as the differences from E1 to E2 to E1 + E2.
Those are really big. So, every company could benefit if they went and started actually trying to transform if they knew what a good direction to transform would be. So that was kind of one of the points I think from a policy perspective. I have a hard time separating what Tyler Cowen would call mood affiliation from what I think are good policies, but I'll just spit them out as some things I think are good to do.
There are a few risks with these tools that scare me. The virology community, I think, should be fairly concerned about using turbocharged models to manufacture COVID or something. Or like, God forbid, some degrowth person decides that they want to kill half of humanity and go full Thanos.
Seth: That’s the plot to 12 Monkeys.
Daniel: It is, and 12 Monkeys would be a bad reality to face. But aside from that, I think there's just so much drudgery, so much additional work that these things could do for us, and a lot of gains to be had. So my preference is not to regulate these models in any kind of aggressive way; I think it's to figure out what they're good for and to develop with them.
Not to say you can't mitigate other risks like bias. That MechaHitler thing with Grok was terrible. There are going to be bumps in the road along the way, but they're not the kind that would say to me, “Oh, we should do like a six-month pause of development.” None of that really scares me yet.
Seth: Not in favor of bombing the data centers?
Daniel: No, I'm not, but I'm not a fan of Harry Potter fanfiction either. So I don't know. Maybe it's just correlated beliefs.
Seth: So you brought up bioterror in particular—
Daniel: Yeah.
Seth: As we speak, AI is being used en masse in warfare for identifying targets for terminals and target acquisition by missiles and drones. Increasingly in Ukraine, we're seeing use of automated ground vehicles for transporting resources to the front and for evac. People often go to these super sort of—I'm not gonna say 12 Monkeys is bizarre, but it's a pretty weird movie if you've ever seen it. Why do we have to appeal to that rather than just using AI to make murder bots?
Daniel: I mean, to some extent the murder bot thing doesn't scare me that much. Human beings doing those things is also bad. I think the issue people have with those applications often will be scaling evil individuals, which is a serious concern, or just issues with war in general, which I understand.
But, if it's gonna happen, we're kind of caught in a prisoner's dilemma there, which is what freaks me out.
Seth: My near-term AI worry is: I have a drone hanging out downtown, a suicide drone that just hangs out somewhere in Manhattan and waits for the particular person to walk out. And then I target and assassinate people untraceably, right? That seems like it's here, as opposed to “I use AI to build a lab to make a super disease, blah, blah, blah.” That's got a lot of steps in it.
Andrey: Untraceable, Seth? I guess my presumption is these sorts of actions do tend to be traced. In fact, AI is a way to trace people, right? So this is kind of one where, as with many AI questions, it's a defensive and an offensive technology.
Seth: So does it favor the offense or the defense? Intuitively you would think that AI would favor the offense, right? We think about these super weapons like Daniel brought up. But if you actually look at Ukraine, it seems to create this transparent battlefield where no one can even march to the front and in some ways seems to favor the defense. It's gonna take a long, long time to play out.
Daniel: Yeah, you guys would know the answer to this. I'm gonna butcher this quote, but who's that sci-fi writer who said that like, the job of a sci-fi storyteller is not to predict the driving cars but to predict the traffic jam or whatever? I think that—
Andrey: I don't remember who it is.
Daniel: Yeah, I think that's kind of the idea here. I think here that we want to predict what the traffic jams are. I think the—
Seth: Frederik Pohl
Daniel: There we go—I should remember that. The reason the bio-risk stuff scares me so much is 'cause we just had a test of this and what one virus does to society and how damaging that can be.
And I think, Seth, what you're bringing up is what I alluded to; it's like the scaling. One really bad long-term trend in technology is just like making individuals more powerful.
Seth: Andrey and I just read a book. We just read a sci-fi novel that's masquerading as political economy. It argues that AI is all about individual disempowerment, that we're gonna get the God machine that's built by the state in the project, and it's going to 1984 us constantly. That's radical human disempowerment.
Daniel: Right. So if our response to individuals becoming much more powerful with technology is to expand the surveillance and control capacities of the state, and we get a loss of freedom, I think that's a genuine worry. In a general equilibrium framework, those things do freak me out for sure. But writing emails with LLMs just does not.
There's somewhere in between where we should start worrying, and I don't think I'm at that point yet.
Andrey: What about things like transparency requirements that you oftentimes hear written about, reporting requirements, and registrations with the state? Do you have any opinions about those types of policies?
Daniel: I don't like 'em. I'll shop my book here a little bit. Like they're terrible for startups, right? Like any compliance burden you stick on startups, even if we specifically might be okay, the ecosystem suffers as a result, and they do a lot of the work to discover things. So, there's a big trade-off, and this happens in the privacy debate too with GDPR and what Europe's trying to do. Politically, no one's willing to acknowledge that there is a compliance burden and competition trade-off. So if you're willing to hold firms to account in really expensive ways, you're gonna get monopoly power.
And that may be okay. You may decide we don't want competition with this super private data that could get out to everybody. Likewise with LLMs or AI regulation. If you don't want this to be an oligopoly situation, you probably need to make it so it's easy for people to build and develop.
And I'm fine with whatever choice policy makers wanna make, so long as they're taking that trade-off into account. I mean, they're elected officials. They're trying to make those choices on behalf of all of us. If we don't like them, we can vote them out.
Seth: Using the AI to manipulate us to have the beliefs that they want us to have.
Andrey: Is there anything you wanna tell us before we wrap up?
Daniel: No, I thought this was a great discussion with you guys, as always. It's a pleasure to get to join you, especially as your first conversation-based guest. But, as a fan, it's kind of exciting for me as well. So please keep it up. Listen to Justified Posteriors, folks.
I would say the message I would have for listeners and economists, maybe in the audience as well, is just that I think these tools are really valuable in our work. I kind of joke—I got a model that I'm building where it shows that lower types are going to use LLMs more for assignments.
And then, of course, I'm using LLMs to help me build the model. So infer what you want about my type from that, but I think it...
Seth: You've got this. You're assuming everybody has to be equally good at everything, but you can just be good at one thing and bad at another.
Daniel: Yeah, I would never claim to be a good modeler, but it does help me get my thoughts straight.
Seth: I think you could be a modeler.
Daniel: I'll leave that one alone. But I would just encourage folks to kind of be their own R&D department. As Ethan Mollick says, “Play around with these things.” I think when I talk with computer scientists, they get upset with me because I'm a little bit too pessimistic about what the models will do long-term. When I talk with economists, the modal disagreement point is the other direction, where folks don't think it's gonna be a big enough deal. So I would say, get out there, play with these things, and learn how they work. And Anton Korinek has got a great paper on using AI in your own work, so check that one out too.
Andrey: All right. Well, awesome.
Seth: I can't think of a better place to end it.
Andrey: Listeners, please do comment and subscribe and stay tuned for more exciting episodes.
Daniel: Thanks, guys.
Seth: And if you are a super fan, you too might one day be a guest on the Justified Posteriors podcast.