
#049 BAML: The Programming Language That Turns LLMs into Predictable Functions
How AI Is Built
Enhancing Intent Classification in LLMs
This episode explores symbol tuning in large language models to improve intent classification and reduce bias introduced by category naming. The discussion emphasizes concise, neutral language in prompts, advocating streamlined communication to improve model understanding and efficiency.
Nicolay here,
I think by now we are done marveling at the latest benchmark scores of the models. It doesn’t tell us much anymore that the latest generation outscores the previous by a few percentage points.
If you don’t know how the LLM performs on your task, you are just duct taping LLMs into your systems.
If your LLM-powered app can’t survive a malformed emoji, you’re shipping liability, not software.
Today, I sat down with Vaibhav (co-founder of Boundary) to dissect BAML—a DSL that treats every LLM call as a typed function.
It’s like swapping duct-taped Python scripts for a purpose-built compiler.
Vaibhav advocates for building primitives from first principles.
One principle stood out: LLMs are just functions; build like that from day one. Wrap them, test them, and loop in a human only where it counts.
Once you adopt that frame, reliability patterns fall into place: fallback heuristics, model swaps, classifiers—same playbook we already use for flaky APIs.
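That playbook can be sketched in a few lines. This is my own illustration, not BAML’s implementation: the model clients here are hypothetical stand-ins, and the point is only the shape of the pattern (retry, swap models, fall back to a deterministic heuristic).

```python
def classify(text, models, heuristic, retries=2):
    """Try each model in order with retries; fall back to a heuristic."""
    for model in models:
        for _ in range(retries):
            try:
                return model(text)
            except Exception:
                continue  # transient failure: retry, then swap models
    return heuristic(text)  # deterministic last resort, no LLM involved


# Stand-in clients: the first always fails, the second succeeds.
def broken_model(text):
    raise TimeoutError("model unavailable")

def working_model(text):
    return "refund_request"

label = classify("I want my money back", [broken_model, working_model],
                 heuristic=lambda t: "unknown")
```

Exactly the structure you would use for any flaky downstream API: the LLM is just another dependency that sometimes fails.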
We also cover:
- Why JSON constraints are the wrong hammer—and how Schema-Aligned Parsing fixes it
- Whether “durable” should be a first-class keyword (think async/await for crash-safety)
- Shipping multi-language AI pipelines without forcing a Python microservice
- Token-bloat surgery, symbol tuning, and the myth of magic prompts
- How to keep humans sharp when 98% of agent outputs are already correct
💡 Core Concepts
- Schema-Aligned Parsing (SAP)
- Parse first, panic later. The model can handle Markdown, half-baked YAML, or rogue quotes—SAP puts it into your declared type or raises. No silent corruption.
- Symbol Tuning
- Labels eat tokens and often don’t help accuracy (in some cases they even hurt). Rename PasswordReset to C7 and keep the description human-readable.
- Durable Execution
- Durable execution refers to a computing paradigm where program execution state persists despite failures, interruptions, or crashes. It ensures that operations resume exactly where they left off, maintaining progress even when systems go down.
- Prompt Compression
- Every extra token is latency, cost, and entropy. Axe filler words until the prompt reads like assembly. If output degrades, you cut too deep—back off one line.
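To make Schema-Aligned Parsing concrete, here is a toy coercion sketch in Python. It is not BAML’s actual parser (which handles far more, per the episode: Markdown fences, half-baked YAML, rogue quotes); it only shows the core idea of coercing loose output into a declared type or raising.

```python
import json
import re

def sap_parse(raw: str, schema: dict) -> dict:
    """Coerce loose LLM output into a declared type, or raise.

    Illustration only: the real Schema-Aligned Parsing is far more general.
    """
    # Strip a Markdown code fence if the model wrapped its answer in one.
    m = re.search(r"```(?:json)?\s*(.*?)\s*```", raw, re.DOTALL)
    data = json.loads(m.group(1) if m else raw)
    out = {}
    for field, typ in schema.items():
        val = data[field]  # missing field raises KeyError: no silent corruption
        if typ is list and not isinstance(val, list):
            val = [val]                      # scalar -> single-element list
        elif typ is not list and isinstance(val, list):
            if len(val) != 1:
                raise TypeError(f"cannot coerce {val!r} to {typ.__name__}")
            val = val[0]                     # one-element list -> scalar
        if not isinstance(val, typ):
            raise TypeError(f"bad cast for {field!r}: {val!r}")
        out[field] = val
    return out

# The model fenced its JSON and returned a bare scalar for a list field.
parsed = sap_parse('```json\n{"intent": "C7", "tags": "billing"}\n```',
                   {"intent": str, "tags": list})
```

Note that the coercion itself is deterministic code, no LLM in the loop: either the output fits the declared type or you get an exception.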
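Symbol tuning in practice looks something like the following sketch (the labels and prompt wording are my own, hypothetical examples): the model sees neutral symbols plus human-readable descriptions, and your code maps the symbol back to the real label.

```python
LABELS = {
    "C0": "PasswordReset -- user cannot access their account",
    "C1": "BillingDispute -- user contests a charge",
    "C2": "FeatureRequest -- user asks for new functionality",
}

def build_prompt(message: str) -> str:
    """Prompt with symbolic category names; descriptions stay readable."""
    options = "\n".join(f"{sym}: {desc}" for sym, desc in LABELS.items())
    return (f"Classify the message into exactly one category.\n"
            f"{options}\n\nMessage: {message}\nAnswer with the symbol only.")

def decode(symbol: str) -> str:
    """Map the model's symbol back to the real label."""
    return LABELS[symbol.strip()].split(" -- ")[0]
```

The symbols carry no semantic baggage, so the model can’t be biased by an evocative label name, and the output costs two tokens instead of a phrase.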
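Durable execution can be sketched as checkpointing after every step, so a rerun skips work that already completed. This is a minimal illustration of the idea only; real durable-execution systems (Temporal, for example) also persist inputs and handle concurrency, and BAML’s proposed `durable` keyword would hide this machinery from you.

```python
import json
import os

def run_durable(steps, state_file="checkpoint.json"):
    """Run named steps, persisting progress so a crash resumes where it left off."""
    done = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            done = json.load(f)              # resume from the last checkpoint
    for name, fn in steps:
        if name in done:
            continue                         # completed before the crash: skip
        done[name] = fn()
        with open(state_file, "w") as f:
            json.dump(done, f)               # checkpoint after every step
    return done
```

Rerunning the same pipeline after a crash replays the checkpoint file instead of the work itself.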
📶 Connect with Vaibhav:
📶 Connect with Nicolay:
- Newsletter
- X / Twitter
- Bluesky
- Website
- My agency Aisbach (for AI implementations / strategy)
⏱️ Important Moments
- New DSL vs. Python Glue [00:54]
- Why bolting yet another microservice onto your stack is cowardice; BAML compiles instead of copies.
- Three-Nines on Flaky Models [04:27]
- Designing retries, fallbacks, and human overrides when GPT eats dirt 5% of the time.
- Native Go SDK & OpenAPI Fatigue [06:32]
- Killing thousand-line generated clients; typing go get instead.
- “LLM = Pure Function” Mental Model [15:58]
- Replace mysticism with f(input) → output; unit-test like any other function.
- Tool-Calling as a Switch Statement [18:19]
- Multi-tool orchestration boils down to switch(action) {…}—no cosmic “agent” needed.
- Sneak Peek—durable Keyword [24:49]
- Crash-safe workflows without shoving state into S3 and praying.
- Symbol Tuning Demo [31:35]
- Swapping verbose labels for C0, C1 slashes token cost and bias in one shot.
- Inside SAP Coercion Logic [47:31]
- Int arrays to ints, scalars to lists, bad casts raise—deterministic, no LLM in the loop.
- Frameworks vs. Primitives Rant [52:32]
- Why BAML ships primitives and leaves the “batteries” to you—less magic, more control.
🛠️ Tools & Tech Mentioned
📚 Recommended Resources
🔮 What's Next
Next week, we continue digging into getting generative AI into production, talking with Paul Iusztin.
💬 Join The Conversation
Follow How AI Is Built on YouTube, Bluesky, or Spotify.
If you have any suggestions for future guests, feel free to leave them in the comments or write me (Nicolay) directly on LinkedIn, X, or Bluesky, or at nicolay.gerold@gmail.com.
I will be opening a Discord soon to get you guys more involved in the episodes! Stay tuned for that.
♻️ Here's the deal: I'm committed to bringing you detailed, practical insights about AI development and implementation. In return, I have two simple requests:
- Hit subscribe right now to help me understand what content resonates with you
- If you found value in this post, share it with one other developer or tech professional who's working with AI
That's our agreement - I deliver actionable AI insights, you help grow this. ♻️