
📆 ThursdAI - Nov 6, 2025 - Kimi’s 1T Thinking Model Shakes Up Open Source, Apple Bets $1B on Gemini for Siri, and Amazon vs. Perplexity!
ThursdAI - The top AI news from the past week
Automating ThursdAI with n8n: Alex's Workflow Build
Alex outlines building an n8n workflow to aggregate tweets, fetch links, summarize with LLMs, and deliver show notes and channel announcements.
Hey, Alex here!
Quick note: while preparing for this week, I posted on X that I couldn’t remember such a quiet week in AI since I started doing ThursdAI regularly. Then, 45 minutes before the show started, Kimi dropped a SOTA open-source reasoning model, turning a quiet week into an absolute banger.
Besides Kimi, we covered Anthropic’s updated thinking on MCP and had Kenton Varda from Cloudflare as a guest to talk about Code Mode, chatted about the latest Windsurf and Cursor updates, and covered OpenAI’s insane deals.
Also, because it was a quiet week, I figured I’d use the opportunity to create an AI-powered automation. I used n8n for that and shared it on the stream, so if you’re interested in automating with AI with relatively little code, this episode is for you. Let’s dive in!
Kimi K2 Thinking is Here and It’s a 1 Trillion Parameter Beast! (X, HF, Tech Blog)
Let’s start with the news that got everyone’s energy levels skyrocketing right as we went live. Moonshot AI dropped Kimi K2 Thinking, an open-source, 1 trillion-parameter Mixture-of-Experts (MoE) model, and it’s an absolute monster.
This isn’t just a numbers game; Kimi K2 Thinking is designed from the ground up to be a powerful agent. With only around 32 billion active parameters during inference, it packs a massive 256,000-token context window and insane tool-calling capacity: they’re claiming it can handle 200-300 sequential tool calls without any human intervention.
The benchmarks are just as wild. On Humanity’s Last Exam (HLE), they’re reporting a score of 44.9%, beating out both GPT-5 and Claude 4.5 Thinking. While it doesn’t quite top the charts on SWE-bench Verified, it’s holding its own against the biggest closed-source models out there. Seeing an open-source model compete at this level is incredibly exciting.
During the show, we saw some truly mind-blowing demos, from a beautiful interactive visualization of gradient descent to a simulation of a virus attacking cells, all generated by the model. The model’s reasoning traces, which are exposed through the API, also seem qualitatively different from other models, showing a deep and thoughtful process. My co-hosts and I were blown away. The weights and a very detailed technical report are available on Hugging Face, so you can dive in and see for yourself. Shout out to the entire Moonshot AI team for this incredible release!
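If you want to poke at K2 Thinking yourself, the quickest route is an OpenAI-compatible endpoint. Below is a minimal sketch going through OpenRouter (the same route we used to swap it into my n8n workflow later in the show); note that the model slug and the exact shape of the returned reasoning trace are my assumptions, so check OpenRouter’s model list before running it.

```typescript
// Minimal sketch: querying Kimi K2 Thinking via OpenRouter's
// OpenAI-compatible chat completions API (Node 18+, run as an ES module).
// The model slug is an assumption; verify it on openrouter.ai/models.
const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "moonshotai/kimi-k2-thinking", // assumed slug
    messages: [
      {
        role: "user",
        content: "Build a single-file HTML visualization of gradient descent.",
      },
    ],
  }),
});

const data = await res.json();
// If the provider exposes the reasoning trace, it arrives alongside the
// regular message content; dump the whole first choice to inspect it.
console.log(JSON.stringify(data.choices[0], null, 2));
```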
Other open-source updates from this week
* HuggingFace released an open-source “Smol Training Playbook” on training LLMs; it’s a 200+ page interactive beast with visualizations and deep dives into pretraining, datasets, post-training and more! (HF)
* Ai2 launches OlmoEarth — foundation models + open, end-to-end platform for fast, high-resolution Earth intelligence (X, Blog)
* LongCat-Flash-Omni — open-source omni-modal system with millisecond E2E spoken interaction, 128K context and a 560B ScMoE backbone (X, HF, Announcement)
Big Tech’s Big Moves: Apple, Amazon, and OpenAI
The big companies were making waves this week, starting with a blockbuster deal that might finally make Siri smart. Apple will reportedly pay Google around $1 billion per year to license a custom 1.2 trillion-parameter version of Gemini to power a revamped Siri.
This is a massive move. The Gemini model will run on Apple’s Private Cloud Compute, keeping user data walled off from Google, and will handle Siri’s complex summarizer and planner functions. After years of waiting for Apple to make a significant move in GenAI, it seems they’re outsourcing the heavy lifting for now while they work to catch up with their own in-house models. As a user, I don’t really care who builds the model, as long as Siri stops being dumb!
In more dramatic news, Perplexity revealed that Amazon sent them a legal threat to block their Comet AI assistant from shopping on Amazon.com. This infuriated me. My browser is my browser, and I should be able to use whatever tools I want to interact with the web. Perplexity took a strong stand with their blog post, “Bullying is Not Innovation,” arguing that user agents are distinct from scrapers and act on behalf of the user with their own credentials. An AI assistant is just that—an assistant. It shouldn’t matter if I ask my wife or my AI to buy something for me on Amazon. This feels like a move by Amazon to protect its ad revenue at the expense of user choice and innovation, and I have to give major props to Perplexity for being so transparent and fighting back.
Finally, OpenAI continues its quest for infinite compute, announcing a multi-year strategic partnership with AWS. This comes on top of massive deals with NVIDIA, Microsoft, Oracle, and others, bringing their total compute commitments into the trillions of dollars. It’s getting to the point where OpenAI seems “too big to fail,” as any hiccup could have serious repercussions for the entire tech economy, which is now heavily propped up by AI investment. Sam clarified in a recent post on X that OpenAI doesn’t want to be too big to fail, and that the recent talk of the US government backstopping OpenAI’s infrastructure was a miscommunication taken out of context. 🤔
Coding with AI: The Evolution of MCP and New Dev Tools
This week, we kicked off a new segment on the show: Coding with AI! We realized we talk about AI coding a LOT, so we decided to give it a dedicated corner. And we started with a fascinating development in the world of agentic tooling. Anthropic published a blog post arguing that the standard way of using the Model Context Protocol (MCP) — by loading full tool definitions into the context window — is inefficient.
Their solution? Have LLMs write code to interact with tools instead. This approach can slash token usage by over 98% in some cases. This idea sounded familiar, and that’s because Cloudflare had already explored it with a feature called “Code Mode.” We were lucky enough to have Kenton Varda, one of the authors of the Code Mode post and head of engineering for Cloudflare Workers, join us to discuss this shift.
Kenton explained that LLMs are trained on vast amounts of code, making it a more “native language” for them than the artificial construct of tool calls. By generating code, agents can chain multiple tool calls together, process intermediate results, and operate much more efficiently without sending everything back through the neural network. While MCP still provides crucial standardization for discovering and authorizing tools, this “code execution” pattern seems to be the way forward for building more powerful and scalable agents.
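To make the pattern concrete, here’s a hedged sketch of the kind of script an agent might emit inside a code-execution sandbox. The `tools` bindings are hypothetical stand-ins for whatever MCP-backed functions the runtime injects, not Anthropic’s or Cloudflare’s actual API; the point is that intermediate results stay inside the sandbox instead of round-tripping through the model.

```typescript
// Hypothetical agent-emitted script in a code-execution sandbox.
// `tools` is a stand-in for runtime-injected, MCP-backed bindings;
// real binding names and shapes depend on the platform.
declare const tools: {
  listIssues(repo: string): Promise<{ id: number; title: string }[]>;
  summarize(text: string): Promise<string>;
  postMessage(channel: string, text: string): Promise<void>;
};

export async function run(): Promise<void> {
  // One script chains three tool calls. The raw issue list never
  // re-enters the model's context window, which is where the claimed
  // ~98% token savings come from.
  const issues = await tools.listIssues("example-org/example-repo");
  const digest = await tools.summarize(
    issues.map((i) => `#${i.id}: ${i.title}`).join("\n"),
  );
  await tools.postMessage("#announcements", digest);
}
```

Compare that to classic tool calling, where each of those three results would be serialized back into the conversation and the model would have to read it all before deciding on the next call.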
Windsurf’s Codemaps and Cursor’s multi-agent executions
In other coding-with-AI news, Windsurf has shipped an incredible feature called Codemaps. It uses their SWE-1.5 model to (quickly) generate a Codemap that explains a codebase to you visually: what starts where and goes where. It’s really useful for understanding a new codebase, or re-understanding one you’ve already forgotten! You can even chat with a Codemap to check whether your overall system design is solid. A great addition that I’m sure will help many folks adopt Windsurf!
And Cursor, another popular AI-native IDE, released a super-performant in-IDE browser and a wild multi-agent feature that queries multiple LLMs in parallel and then synthesizes their answers.
This Week’s Tutorial
I finally got around to building some serious automations for ThursdAI, and folks, n8n has been a game-changer. What used to take me 30+ minutes of manual work now happens automatically in the background.
Here’s what I built: A Telegram bot that takes Twitter/X links, fetches the tweets and all linked content, uses AI agents to extract and summarize the information, and then posts it to our announcement channel and my notes app. The coolest part? I built this whole thing in about 4 hours with the help of Atlas browser and GPT-5 literally telling me what to do at each step.
During the show, we even live-tested swapping out GPT-4o-mini for Kimi K2 - it took literally 30 seconds to connect via OpenRouter. I went through my nodes and explained how it all works on the show, so if you’ve wanted to learn about n8n, check it out starting around 01:13:00. If you want to see how my automation turned out, it will be posting all my links to the new Telegram channel t.me/thursdai_news (expect it to be messy at first as I’m testing out the automation).
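If you’re curious what the glue code in a workflow like this looks like, here’s a sketch of the link-extraction step written as an n8n Code node. The field names and regex are my own illustration rather than a copy of my production node, so inspect what your Telegram trigger actually emits and adjust.

```typescript
// Sketch of an n8n Code node that pulls X/Twitter status links out of an
// incoming Telegram message. n8n injects `$input` at runtime; it's
// declared here only so the sketch type-checks on its own.
declare const $input: {
  first(): { json: { message?: { text?: string } } };
};

const text = $input.first().json.message?.text ?? "";

// Match x.com / twitter.com status links in the message body.
const links =
  text.match(/https?:\/\/(?:x|twitter)\.com\/\w+\/status\/\d+/g) ?? [];

// Emit one item per link so the downstream fetch, summarize, and post
// nodes can process each tweet independently.
return links.map((url) => ({ json: { url } }));
```

From there, an HTTP Request node fetches each link’s content, an AI Agent node does the summarizing, and a Telegram node posts the result to the channel.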
Robotics - Xpeng’s “Iron” humanoid: big vibes, few specs
Another week, another humanoid robot that is supposedly “coming” in 2026!
A humanoid from Xpeng went viral this week, marketed as “the most human‑like” robot with soft skin, bionic muscles, customizable sexes (yes, really, they have a woman humanoid), something called a VLT brain, and a 2026 production goal. Here’s what we didn’t get: a spec sheet. No DOF, speed, payload, compute TOPS, battery capacity, runtime, or safety pathway. No pricing, manufacturing strategy, or clear target markets. In other words: lots of sizzle, no steak.
Apparently, there were folks who thought Xpeng pulled an Elon and put a human in a robot suit, which prompted the CEO to do the “we’ll cut open part of the soft skin to expose the robot underneath so you don’t think we’re lying” stunt. Which, I agree, was very effective.
But if Xpeng is serious, the next thing we’ll see should be a crisp engineering document: joints, actuation, sensors, compute, and a locomotion/manipulation demo with independent measurements. Until then, treat this as a branding salvo and a reminder that the humanoid category is still sorting itself into “industrial payload first” versus “human likeness first” approaches.
Voice & Audio
Maya‑1: open‑source voice design from natural language
We highlighted Maya‑1, a 3B Llama‑backboned TTS system designed to generate voices from natural language descriptions. Instead of picking from a menu, you describe the voice—age, accent, affect—and Maya conjures it. It supports real‑time streaming and over twenty “emotion tags.” The quality is compelling for its size and the Apache 2 license will make a lot of builders happy. There’s a growing middle class of TTS: tiny but expressive, good enough for in‑app narrators, prototyping, and even stylized content when you don’t want the constraints of commercial voice marketplaces.
Inworld TTS: a new leader on independent rankings
We also listened to Inworld’s latest, which currently tops the Artificial Analysis TTS leaderboard. It’s not open source, but the combo of expressivity, speed (sub‑250 ms), and multilingual support puts it firmly in the “commercially viable at scale” tier alongside the usual suspects. If you need SaaS TTS today and care about emotional range, add this to your shortlist. Pricing on their site targets production use rather than hobbyist tinkering, but the quality speaks for itself.
Whew! For a week that started slow, it certainly ended with a bang. It just goes to show you can never count AI out. We’re seeing open source continue to push the boundaries, big tech making landscape-defining moves, and agentic AI becoming more powerful and accessible every day.
As always, thanks for tuning in. If you’re going to be at the AI.engineer conference in New York, please hit me up—I’d love to meet you.
TL;DR and Show Notes + Links
* Hosts and Guests
* Alex Volkov - AI Evangelist & Weights & Biases (@altryne)
* Co-hosts - @WolframRvnwlf @yampeleg @nisten
* Kenton Varda @ Cloudflare (@KentonVarda)
* Open Source LLMs
* Smol Training Playbook — a 200+ page, end-to-end guide to reliably pretrain and operate LLMs (X, Announcement)
* Ai2 launches OlmoEarth — foundation models + open, end-to-end platform for fast, high-resolution Earth intelligence (X, Blog)
* Moonshot AI releases Kimi K2 Thinking — an open-source 1T-parameter MoE agent with 256K context and huge tool-calling capacity (X, HF, Blog, Arxiv)
* LongCat-Flash-Omni - 560B (27B active) omni-modal model (text, audio, video inputs)
* Big CO LLMs + APIs
* Apple will pay roughly $1B/year to license a custom 1.2 trillion‑parameter Google Gemini model to power a revamped Siri (X, Announcement)
* Perplexity says Amazon issued a legal threat to block Comet AI assistants from shopping on Amazon (X, Blog)
* AWS announces multi-year strategic infrastructure partnership with OpenAI to power ChatGPT inference, training, and agentic AI (X)
* Robotics
* Xpeng unveils ‘Iron’ humanoid claiming ‘most human-like’ design with soft skin, bionic muscles, VLT brain and a 2026 production plan (X)
* Coding with AI
* Anthropic shows how running MCP-connected tools as code slashes token use and scales agents (X, Blog)
* Windsurf Codemaps — AI‑annotated, navigable maps of your codebase powered by SWE-1.5 (Fast) and Sonnet 4.5 (Smart) (X, Announcement)
* Conversation with Kenton Varda (@KentonVarda) from Cloudflare about MCP and Code Mode
* Cursor added an in-IDE browser - very performant!
* Audio & Video
* Maya-1 - Open-source voice generation model.
* Inworld TTS - new #1 on the Artificial Analysis TTS benchmark.
* Tools & Gadgets
* Sandbar launches Stream — a voice-first personal assistant — and Stream Ring, a wearable ‘mouse for voice’, available for preorder (X, Blog)
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe


