
Pablos: So what happens right now in scientific research is, if you’re going to do a research study on something like “are M&Ms bad for you?”, it’s impossible to do that study. You have to be very specific and ask a much more fine-grained question like “how many M&Ms does it take to kill a mouse?” or to cause a mouse to vomit. You just have to be very specific because that’s something testable. You could test that: you can get multiple mice, you can feed them enough M&Ms that they eventually vomit. The whole research study can be done that way.

And so when you read scientific research studies, that’s typically what you’re looking at: some very narrowly defined thing that they believe is correlated to a much more significant or bigger effect, but you can’t test the whole thing. You can’t ask questions like, “does this thing cause cancer?”
You can ask questions like, “does this amount of exposure to this thing over this much time cause this specific type of cancer in this type of rat?” Things like that. So that’s great and all because it means we’re structuring tests that we can actually perform, but the downside is that for most people, what they would actually like to know is “do M&Ms cause cancer?” or “how many of them is too much?”, things like that.
Getting those answers is often not straightforward from scientific literature. And so the way that we usually try to compensate for that is to do what’s called a meta-analysis. A meta-analysis is where somebody will go and dig up all of the studies on a given topic, combine them, and try to say, “across a hundred studies involving M&Ms and cancer, this is kind of what happened,” to just give you a general sense of whether or not the effect you’re interested in is happening. A good example of this is chiropractic, which is largely debunked.
A lot of people get pissed off at me talking about it because it can be a deluxe placebo, but very few clinical trials are performed. It’s hard to do them. Different practitioners have different effectiveness levels anyway. And so the problem is it’s hard to run those studies, but even if you do, you can’t find any indication that chiropractic actually cures anything. So this is a case where we don’t have good research, and the only way to try and get to the bottom of it is with a meta-analysis, where you find the studies that have been done, combine their results, and try to say whether or not chiropractic works.
People, there’s no point arguing with me if you’re listening and you think chiropractic is great. Go nuts. I encourage you not to do that, but, whatever, do your own thing. The point is, the only way you could get a reasonable answer is with this kind of meta-analysis. Now, a meta-analysis is very time-consuming and difficult to perform and often isn’t getting done, but what it really involves is just going and reading a bunch of studies. Well, it turns out that’s what an LLM is really fucking good at. So right now we’re in a stunted position, because one of the big problems with OpenAI and ChatGPT is they’ve crippled ChatGPT. It doesn’t read scientific literature, and even if it does, it’s not really allowed to comment on it.
So they’ve crippled the thing to keep you from talking to it about anything that might be health related and stuff like that. What you would really want an LLM to do, and one of the things it would be really good at, is doing ad hoc meta-analysis. So you could just say, “Hey, I feel like I’m getting a cold, should I take zinc?”
There’s people marketing zinc for that purpose. We’ve all been told to take zinc, but I don’t fucking know if that’s an old wives’ tale.
Ash: It’s like echinacea, zinc, doesn’t matter, it’s all those things.
Pablos: I don’t have time to go read every scientific research study, but I bet collectively we have that answer, and so if I could just ask an LLM…
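What Pablos is describing, reduced to its crudest form, is just stuffing the relevant abstracts into a prompt and asking the model to weigh the evidence. Here is a minimal sketch, assuming a local folder of plain-text study abstracts and an OpenAI-style chat API; the folder name, the question, and the model choice are placeholders, not anything the hosts actually built:

```python
# Rough sketch: ad-hoc "meta-analysis" by putting study abstracts into one LLM prompt.
# Assumes abstracts saved as .txt files and an OPENAI_API_KEY in the environment.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

question = "Does taking zinc shorten the common cold?"
abstracts = [p.read_text() for p in Path("studies/zinc").glob("*.txt")]

prompt = (
    f"Question: {question}\n\n"
    "Below are abstracts from published studies. Summarize the overall weight of "
    "evidence, note sample sizes and conflicting results, and cite abstracts by number.\n\n"
    + "\n\n".join(f"[{i + 1}] {text}" for i, text in enumerate(abstracts))
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

This obviously inherits every weakness of the underlying studies and of the model, which is exactly the reliability problem Ash raises next.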
Ash: Wasn’t IBM’s Watson at some point pretty good? Watson Health actually had all this.
Pablos: That’s probably what they were trying to do.
Ash: They were doing it, and they were doing pretty well. They weren’t using a full LLM. That was the whole breakthrough.
Pablos: They were kind of in the pre-LLM days. It was LM. It was just LM. It wasn’t LLM.
Ash: Just language models. And they were taking huge amounts of data, but what they had were their own normalized structures underneath. So that was the difference, right? They didn’t let the structure form itself. But what you’re saying is true.
Pablos: You’re right, and we could probably build a Watson for health in a weekend now using Stable Diffusion or something. It would be way better. You would just basically load it up with all the research, let it go nuts, and then let people ask questions like, “Hey, should I be taking zinc?”
Ash: The problem is the reliability score.
Pablos: Oh no, it’d be terrible, but it’s already horrific. Right now, we’re just going off superstition. I mean, literally, that question of “should I take zinc?” You’re gonna get as many answers as people you ask, because somebody’s Chinese grandmother said you should be taking echinacea instead.
Ash: You should listen to my first class. The first part was about what is known as “triangulation of information truth”: what is the provenance of data? Then you have to figure out, how do you weigh it? LLMs are fantastic, like you said, because they can take all your source inputs.
So go back to signals analysis, or analytics for intelligence again; we’ll just lean on that for a moment. Truth is great if you’re playing with mathematics: you get QED and you call it a day, for the most part. But for other things, truth is murkier. Take your zinc example. There’s some balance between how much it really did, whether it was an emotional support protocol, whether it helped you because you were convinced your grandmother was right, or whatever was actually happening physically inside you, right?
We can be scientific about it, but it comes back to source and information. Pick a really, really dangerous topic, and we won’t go there, but let’s just pick Gaza for one second. How do you find what’s really happening? Well, you hear a lot, and someone’s like, “well, I read it in the Wall Street Journal.”
I read it here. I heard it there. I took Al Jazeera. I did Breitbart. Whatever you picked. The question was, did you do it in all the languages? Did you listen to a local radio station? Did you find someone’s signal data from nearby? What was happening? Did the bomb go off, or did this happen?
If you look at information, just like you’re looking at these scientific papers, the question becomes the weighting factor. We as humans, I think one of the things we know how to maybe do, at least a good analyst should be able to do, is try to give weighting based on time and location and stuff.
And I think the large language models have to start to put in context again. I think they have to add one more dimension.
Pablos: For sure. And I think you touched on the other thing, which is that right now all this information is floating around without tracking provenance. Interestingly, in scientific research you at least have citations, which are a lightweight form of provenance. It’s a start, but ultimately, the way these things all need to be built, not only the LLMs for doing meta-analysis but really every knowledge graph, is off of assertions that are tracked. You keep track of provenance: okay, the sky is blue, well, who said the sky is blue?
Where did you get that from? And that way, whenever you’re ingesting some knowledge, it’s coming with a track record. That’s how we’re going to solve news online, eventually.
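Tying Ash’s weighting point to Pablos’s provenance point: one toy way to picture a knowledge graph built from tracked assertions is a record per claim that carries who said it, when, and from where, plus a crude score. Everything below, the field names, the decay rule, the example sources, is illustrative, not anyone’s actual design:

```python
# Toy sketch of an assertion store where every claim carries its provenance
# and gets a crude weight from source trust and recency.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Assertion:
    claim: str             # e.g. "the sky is blue"
    source: str            # who said it: URL, DOI, person
    observed_at: datetime  # when the claim was made
    location: str | None   # where it was reported from, if known
    base_trust: float      # prior trust in the source, 0..1

    def weight(self, now: datetime) -> float:
        """Crude score: source trust discounted by the age of the claim."""
        age_days = (now - self.observed_at).days
        recency = 1.0 / (1.0 + age_days / 365.0)
        return self.base_trust * recency

facts = [
    Assertion("the sky is blue", "https://example.org/atmospherics",
              datetime(2023, 5, 1, tzinfo=timezone.utc), "Seattle", 0.9),
    Assertion("the sky is red", "random-blog",
              datetime(2024, 1, 2, tzinfo=timezone.utc), None, 0.2),
]

now = datetime.now(timezone.utc)
for f in sorted(facts, key=lambda a: a.weight(now), reverse=True):
    print(f"{f.weight(now):.2f}  {f.claim}  <- {f.source}")
```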
Ash: Kind of like the Google Scholar score or whatever.
I go back to my partner, to Palle, right? So Palle actually has a patent. It’s probably expiring soon, so for those of you who want to do this, we should go do it. He owns webprovenance.com, and he owns the patent on how you check provenance. One of the things that came out of the BlackDuck software stuff was that at BlackDuck, we needed to know who created something. Do you remember the Sun Microsystems and IBM lawsuit over Java? If you’re a compiler theorist, then you know that just because West Side Story takes place in New York, you could probably say, well, doesn’t it sound exactly like Romeo and Juliet? Maybe you change the variables, but it’s the same stuff. And the idea was that when we were looking at open source, the interesting thing is you’re trying to figure out where this little rogue piece of code, this little GPL or LGPL infection, came from.
You need to find it. So it’s one thing to talk about the combinatorials, but the other was to find it. And then Palle was like, well, I can do something cooler. He said, remember Brewster Kahle’s Wayback Machine, the original Alexa project? If you could go in and take all that data, he’s like, I could pretty much tell you who killed JFK.
You can find the provenance of almost any information. He wrote this wild algorithm for it. I’d love to see some of that incorporated into the LLM stuff. And again, we would happily work with anyone out there if you’re willing; this has been a project we’ve been looking at for the better part of 15 years.
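A back-of-the-envelope version of that “where did this claim first appear?” idea can be hacked together against the public Wayback Machine CDX API: list archived snapshots of some candidate pages, oldest first, and record the earliest one that contains the phrase. The phrase and candidate URLs below are placeholders, and this is nothing like Palle’s actual algorithm, just a sketch of the shape of the problem:

```python
# Naive provenance probe: find the earliest archived snapshot containing a phrase,
# using the Wayback Machine CDX API (snapshots come back oldest-first by default).
import requests

PHRASE = "aliens built the pyramids"                      # hypothetical rumor
CANDIDATE_URLS = ["example.com/blog", "example.org/forum"]  # hypothetical sources

earliest = None
for url in CANDIDATE_URLS:
    rows = requests.get(
        "http://web.archive.org/cdx/search/cdx",
        params={"url": url, "output": "json", "filter": "statuscode:200", "limit": 50},
        timeout=30,
    ).json()
    for timestamp, original in ((r[1], r[2]) for r in rows[1:]):  # rows[0] is the header
        snapshot = requests.get(
            f"http://web.archive.org/web/{timestamp}/{original}", timeout=30
        )
        if PHRASE.lower() in snapshot.text.lower():
            if earliest is None or timestamp < earliest[0]:
                earliest = (timestamp, original)
            break  # oldest-first, so the first hit for this URL is its earliest

print("Earliest archived appearance:", earliest)
```

The hard part, which this skips entirely, is choosing the candidate pages in the first place; that is where having “the entire web” matters.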
Pablos: Well, Stability might pick it up. They love that kind of stuff. That would be a huge coup for them.
Ash: Well, we should have this conversation offline, but it’s interesting. It’s an incredibly cool algorithm. He was a compiler theorist anyway, an algorithmist, at Thinking Machines. So he always wondered where the info came from.
And I sat there and said, hey, we should find a way. And I remember the stunt I wanted was to figure out if there were aliens. And he’s like, what do you mean? I’m like, well, who started that rumor? Where did it happen? Right? So, imagine you could take any rumor, and I can tell you how it started.
Pablos: That’s so cool.
Ash: Wouldn’t that be the coolest thing ever?
Pablos: So important.
Ash: Yeah, and we need that.
Pablos: That is super important. I’ve seen somewhere a map that somebody made of where all the UFO sightings are reported, and like 98 percent of them are in the United States. I think the rest of the world doesn’t even have UFOs as a notion. It’s not even a thing for them.
Ash: It’s cause we have no healthcare. Look, all I know is, years ago, we just didn’t have enough data. Years ago, we couldn’t do it. We were looking at the Wayback Machine, and I was like, well, who can we go to to get all the data?
Can we get the entire web? Today, large language models have already stolen all the data. They already have it. So if you have enough of the data, we could definitely help you figure out the algorithm to go backwards, and it’s complicated.
Pablos: That’s super exciting.
Ash: He actually patented it himself because he was trying to figure out if he could do it without a patent attorney. So that was his project: can I make a patent? And his patent is on provenance. So I think it would be a big coup if they could pull it off. Can you imagine? You could just type in: who started this? Where did this first start?
Pablos: Dude, that’s crazy cool.
Ash: It’s super cool.
Pablos: I’m kind of always on a rant about this, but we need a variety of models. The LLM is the beginning, not the end. It’s a thing that you need, but the way we’re doing it now actually kind of sucks and requires a lot of brute force, and there’s so many things that it’s not good for.
Ash: And it’s so susceptible to that. What did I do in my life? Psych warfare is all about information corruption. Dude, you corrupt a large language model, and that thing is convinced that the sky is red at that point.
Pablos: Exactly, well, I’ve been thinking about that. Why don’t I just…
Ash: Corrupt it?
We’re bad hackers.
Pablos: I can fire up 100,000 blogs written by an LLM that all just talk about my prowess with the ladies.
Ash: Exactly.
Pablos: And the next thing you know, all the future LLMs will be trained on a massive amount of data that indicates that, Pablos is the man.
Why wouldn’t we do that?
Ash: At the end of the day, the LLMs are basically superstition. There you go. I’ve just said it.
Pablos: Right. They’re superstition.
There you go.
Ash: LLMs are superstition.
They’re based on some concept of something they derived because they took a whole lot of information from a lot of grandmothers.
Pablos: And that’s the thing. What’s posted on the internet is all that they know.
It’s driving me crazy.
Ash: Worse, it’s only the people who have given them permission, and the quality sources are going to start cutting them off. So all they’ve got are the people who are generating rumors that they’ve seen UFOs.
Pablos: Well, that’s all true for the LLMs made in America.
Ash: Yeah, so the American LLMs know where the UFOs are.
Pablos: Japan decided that copyright doesn’t apply to training LLMs. So the most powerful LLMs, for now, are gonna be in Japan.
Sign me up.
Ash: Even better, that means Japanese information…
Pablos: That’s probably true, learn Japanese.
Ash: Which, think of it, if I wanted to build my 100,000 blogs generating your prowess, I’m gonna do it all in Japanese. I’ll do kanji, hiragana, and katakana. I’ll give it to them in all three formats. You could crush it.
I would love to see any of these. I think that should be our ask for everyone.
Pablos: Yeah.
Ash: If someone wants to run with it, go build it.
Pablos: Yeah, people, build this shit.
Ash: Tell us. We can help you commercialize. We will find you.
Recorded on January 8, 2024