Speaker 2
No, I'm not buying a Mac Studio. Did Apple accidentally make the world's best AI desktop?
Speaker 1
I think yes. I think they accidentally realized, and I'm half joking but also not really, that they're making the best AI computers. Yeah.
Speaker 2
I don't think that was the goal when they started sketching out Apple Silicon for the Mac, probably eight years ago now, because we're five years into it, so eight, ten years ago. But they kind of fell into it, I think. Now, I think they've embraced it since then, right? Like, they're not continuing to be surprised by it. But this has come to light especially with the Mac Studio and the M3 Ultra. Y'all have been writing about this on MacStories, so what's the deal here?
Speaker 1
The deal is that benchmarks came out for the new Mac Studio with the M3 Ultra, and it looks like a beast of a machine. I especially want to point people to the review slash benchmarks by Max Weinbach at Creative Strategies. They have done an excellent job comparing the performance of the new M3 Ultra Mac Studio running on-device, local large language models, so models downloaded and running locally on the computer, between the Mac Studio and a PC setup using the latest high-end Intel desktop CPU and the NVIDIA RTX 5090, currently the best consumer GPU in the world.
Speaker 2
That is very expensive.
Speaker 1
It's a very expensive GPU, and a very expensive setup all around. And the most surprising aspect to me is what happens in a vanilla setup, meaning you just download the model and run it non-optimized. Not to get too much into the weeds, but there are plenty of ways for programmers and developers to optimize a large language model for an NVIDIA GPU; yet if you just set it up and run it vanilla, out of the box, the M3 Ultra absolutely obliterates the 5090. To the point where, in this story, and I wanted to look up the specs again, they're using the 256-gigabyte version of the Mac Studio with the Ultra, so they're actually not even using the top-of-the-line version. With these specs, they have the 32-core CPU, the 80-core GPU, and that unified memory; 192 gigabytes of those 256 can be allocated as VRAM. They have a 4-terabyte SSD. If you look at the numbers in the story, which is going to be linked in the show notes, and scroll to the LLM performance table, you're going to see that, for example, they're running one of the Qwen models. Qwen is the large language model created by Alibaba, which, according to reports, is going to be one of the partners for Apple Intelligence in China. Running the 32-billion-parameter, 4-bit version of Qwen, the RTX 5090 can output almost 16 tokens per second. I know that tokens are not necessarily words, but for the sake of simplicity and clarity on this episode, think of 16 instances of the word hello; the word hello tends to be a single token. So imagine that every second, the large language model on a 5090 can spit out 16 hellos. Hello, hello, hello, like really fast. It's very eager to say hello. But the M3 Ultra in the Mac Studio can spit out 33 of them per second.
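To make the gap concrete, here's a quick back-of-the-envelope sketch in Python using the two throughput figures quoted above. The 500-word answer length is just an illustrative assumption, not a number from the benchmark:

```python
# Throughput figures quoted from the Creative Strategies benchmark:
# Qwen 32B (4-bit), vanilla setups, tokens generated per second.
rtx_5090_tps = 16.0   # RTX 5090
m3_ultra_tps = 33.0   # M3 Ultra Mac Studio

def speedup(a_tps: float, b_tps: float) -> float:
    """How many times faster A generates tokens than B."""
    return a_tps / b_tps

print(f"M3 Ultra vs. RTX 5090: {speedup(m3_ultra_tps, rtx_5090_tps):.2f}x")
# 33 / 16 = 2.0625, i.e. "basically twice the performance"

# If one token is roughly one short word like "hello", a 500-word
# answer takes (assumed answer length, for illustration only):
for name, tps in [("M3 Ultra", m3_ultra_tps), ("RTX 5090", rtx_5090_tps)]:
    print(f"{name}: ~{500 / tps:.0f} seconds for a 500-word answer")
```

That ratio is all "obliterates" means here: same model, same quantization, roughly half the wait per token on the Mac.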
And that's basically twice the performance, from a computer that is essentially a chunky Mac mini. So yeah, the story goes on to explain that a lot of the work here is being done by the open-source framework Apple created for developers, called MLX. It's an open-source framework that allows developers to fine-tune and run large language models specifically for Apple Silicon, and specifically for the unified memory architecture of Apple Silicon. And when you combine those gains and optimizations with models you just want to run out of the box, you get significantly better performance on a Mac Studio compared to a 5090. Which leads into the bigger topic: is Apple making the best AI computers for consumers and tinkerers right now? Potentially, yes.
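If you want to measure a tokens-per-second number like the ones above yourself, the shape of the benchmark is simple: time a generation call, count the output tokens. The sketch below is a generic, runnable harness with a dummy generator standing in for the model; on an Apple Silicon Mac you would swap in a real generation call, for example from the mlx-lm package that builds on MLX. The function names here are my own, not from MLX:

```python
import time
from typing import Callable, List

def measure_tokens_per_second(generate_fn: Callable[[str, int], str],
                              tokenize_fn: Callable[[str], List[str]],
                              prompt: str, max_tokens: int) -> float:
    """Time one generation call and report output tokens per second."""
    start = time.perf_counter()
    output = generate_fn(prompt, max_tokens)
    elapsed = time.perf_counter() - start
    return len(tokenize_fn(output)) / elapsed

# Stand-in "model" so this sketch runs anywhere: it fakes inference
# latency and emits one "hello" per requested token. In real use you
# would plug in an actual LLM generation function instead.
def dummy_generate(prompt: str, max_tokens: int) -> str:
    time.sleep(0.05)  # pretend inference latency
    return " ".join(["hello"] * max_tokens)

tps = measure_tokens_per_second(dummy_generate, str.split, "Say hello", 100)
print(f"~{tps:.0f} tokens/sec (dummy model)")
```

Whitespace splitting is a crude stand-in for a real tokenizer, but for a throughput comparison the same counting scheme on both machines is what matters.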
Speaker 2
Potentially, yes. There are a couple of things here that are interesting to me. Reading through this, I read through Apple's open-source MLX stuff, and I couldn't help but think about things like OpenCL, which, back in like 2013 with the trash can Mac Pro, Apple was saying: OpenCL, this is how we're going to compute on GPUs. And it just didn't really take off. I mean, it did in some circles, but not hugely. Or think about Metal, right? Apple's graphics framework, which by all accounts seems to be very good and very performant, and Apple is basically having to twist the arms of video game companies to port their games to use Metal. But here, there are a couple of distinct differences. One, because all of these researchers and all this work happen in the open, Apple has been forced to publish these things in a way that they probably didn't with OpenCL and probably haven't with Metal. And two, even if there is a hill to climb to adopt Apple's frameworks, the hardware is so compelling that people will do it. With Metal, yeah, you can really make your game sing on an iPhone, and the market factor is that there are a bunch of iPhones, and people with iPhones spend money on apps. But here, if you can make your model really sing on Mac hardware using Apple's open-source frameworks, you could be saving significantly in terms of budget when you're putting these things together. And, you know, the 5090 is an interesting card because it can do AI stuff, but it's also marketed to consumers for gaming. When you get into NVIDIA's AI-specific hardware, the stuff these AI companies are buying and putting in big racks in data centers, it's a different ball game in terms of price. And you can't just get some of that stuff: you've got to have contracts, you've got to have minimum buys, and there are waiting times. Whereas you can just order a pretty nice Mac Studio on your phone and get it on your doorstep in, you know, ten days. That's amazing.
Speaker 1
I think, yeah. For context, an NVIDIA H100, which is the NVIDIA Tensor Core GPU used in data centers, usually goes for about $30,000. So that gives you some context.
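For a rough sense of scale, a one-line comparison. Only the $30,000 H100 figure comes from the episode; the Mac Studio price below is my own assumed round number for illustration:

```python
# The H100 figure is quoted in the episode; the Mac Studio price is an
# assumed round number for a well-specced configuration, for illustration.
h100_price = 30_000
mac_studio_price = 4_000  # assumption, not a quoted figure

print(f"One H100 buys roughly {h100_price // mac_studio_price} Mac Studios")
```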
Speaker 2
That's a lot of Mac Studios.
I just think it's so interesting. And they talked about this on Upgrade this week as well; Jason had John Gruber on, and they were talking about this. And I think there are kind of two levels to the conversation. There's the framework-and-model level that we've been at. But then there's the conversation of: if Apple wants to continue to entice these companies, is Apple Intelligence a stumbling block for that? Like, should Apple take all of this and turn the Mac into not only the best AI computer for researchers and developers, but the best AI computer for consumers? That, I think, is a very different conversation, and one where I'm just not sure where it lands. Look, if you're serious about AI, you're just going to ignore Apple Intelligence and download these models on your Mac, or run the ChatGPT app like a lot of us do, or you're going to find ways to make it work for you. I don't think this means that Apple's Apple Intelligence efforts are suddenly going to evaporate. Now, to be clear, they haven't been great yet, and they're behind, and it seems like they're going to fall more behind with the Siri stuff. So I don't know. Do you view those as two different things? What are your thoughts there?
Speaker 1
I think... I keep coming back to this idea that if Apple is falling behind, and by falling behind we mean they don't have a comparable suite of services when it comes to AI that they can offer to users and to developers... if that's the case, I think one potential avenue for them to get out of this problem would be a two-fold approach. First: well, we're going to make the best possible hardware that consumers and developers can buy, right? And anecdotally speaking, it does seem to me like everybody who's really into AI, whether heavy users of AI products or developers of AI software, is just using a Mac these days. So that's the hardware route: to say, we're going to make the best possible hardware. And sure, maybe NVIDIA captured the market in data centers, but we're going to capture the market at home. We're going to make sure that the developers who may be using expensive data centers all have Macs when it comes to their desks. I would wager that 90% of OpenAI employees and engineers are all using MacBook Pros or desktop Macs. That's the scenario that I think Apple would be more than happy to be in. But at the same time, and I had an article on MacStories about this this morning, when it comes to software, I think there's one way to mitigate this narrative and this problem right now, and it would be for Apple to say: well, everybody's building AI-powered apps and software features these days, and we, as Apple and with Apple Intelligence, are being completely left out of the conversation, because all of these developers are building against APIs from Google, from OpenAI, from Anthropic, from DeepSeek, and they're not using any of ours. They're using our developer tools to build and run the apps, but they're not using our APIs. And all of that user data and all of those API calls are not going to us; they're going to somebody else.
Speaker 1
I think if you're Apple, and this is the article that I posted, there is an interesting thought of Apple saying: well, what if we regain control of that developer angle and instead offered Apple Intelligence as sort of a middleman, an intermediary between developers and those third-party AI tools? So I had this article where I imagined: what would an Apple Intelligence SDK as a bridge to ChatGPT or Claude or Gemini look like, and why should Apple do it? And I think it's interesting to think about Apple in this way. Apple saying: well, instead of bringing your own API key for ChatGPT, you can just keep working with the Apple SDK. We're going to give you APIs that normalize the usage of Claude or ChatGPT. They're all going to be integrated. They're going to be available in UIKit, in SwiftUI, in Swift, in all the languages that we support. They're going to be supported in all of our frameworks. You can build for Apple platforms, and they're going to work everywhere. You don't need to bring your own API key for a third-party provider. You can just be a member of the Apple Developer Program, and we're going to give you some API calls for free on a monthly basis. Otherwise, we're going to sell you a subscription for additional AI integrations, and again, we're going to act as the middleman, as an aggregator of sorts. Those subscriptions are going to be more affordable than if you go as an individual, as an indie developer, or as a small company and sign up for the standard business or enterprise plans from OpenAI, Anthropic, Google, and so forth. So a potential idea would be for Apple to say: we already have control of the hardware, because everybody's just using Macs these days.
But while we're building our own large language model, we may regain control of the software aspect by saying: you can build with Apple Intelligence, you can use those third-party APIs, but you're going to go through us, and it's going to be better for you because it's going to be cheaper, and you're going to get a privacy guarantee from Apple. That's something I keep thinking about, big picture, right now.
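To make the middleman idea concrete, here is a small hypothetical sketch. None of these types or method names are real Apple APIs; `IntelligenceSDK`, the provider wrappers, and the free-call quota are all invented purely to illustrate the aggregator shape being described:

```python
# Hypothetical sketch of the "Apple Intelligence as middleman" idea.
# Every name here is invented for illustration, not a real Apple API.
from abc import ABC, abstractmethod
from typing import Dict

class Provider(ABC):
    """One third-party model behind a normalized interface."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class ClaudeProvider(Provider):      # imagined wrapper, not a real SDK
    def complete(self, prompt: str) -> str:
        return f"[claude] {prompt}"

class ChatGPTProvider(Provider):     # imagined wrapper, not a real SDK
    def complete(self, prompt: str) -> str:
        return f"[chatgpt] {prompt}"

class IntelligenceSDK:
    """Single entry point: the platform vendor meters usage, enforces a
    free monthly quota, and brokers the third-party API calls."""
    FREE_MONTHLY_CALLS = 100         # assumption, for illustration

    def __init__(self, providers: Dict[str, Provider]):
        self.providers = providers
        self.calls_used = 0

    def complete(self, provider: str, prompt: str) -> str:
        if self.calls_used >= self.FREE_MONTHLY_CALLS:
            raise RuntimeError("free quota exceeded: subscription required")
        self.calls_used += 1
        return self.providers[provider].complete(prompt)

sdk = IntelligenceSDK({"claude": ClaudeProvider(), "chatgpt": ChatGPTProvider()})
print(sdk.complete("claude", "Summarize this note"))
```

The point of the sketch is the shape: the developer writes against one interface, and swapping models is a string argument rather than a new vendor contract and API key.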
Speaker 2
Exciting times for the Mac. What a time.
Speaker 1
What did Gruber call it on Upgrade? A happy accident. And I think it is. I think it is. And, look, I don't know if Apple is actually going to say: you know what, instead of rolling your own integration with ChatGPT, use the Apple Intelligence SDK. But I'll give you this much: they themselves are doing it for Siri. They're falling back to ChatGPT, and to Google to an extent, to supplement the functionality that they cannot offer by themselves alone. So if Apple does it for themselves, is there a scenario in which they can do it for others? And I think there is.