Marc Brooker, VP and Distinguished Engineer at AWS, dives into how agentic workflows are reshaping database infrastructure. He shares insights on why agents demand serverless, elastic databases and discusses how data models are evolving, with vectors and RAG taking their place alongside relational databases. Marc explores the significance of tools like DSQL for managing global agent workloads and highlights real-world applications, such as agent-driven SQL fuzzing. He also emphasizes the need for improved identity and authorization in our evolving data landscape.
Duration: 52:58
AI Snips
INSIGHT
Agents Make Data The Core
Agents transform database needs by making data access central to AI workflows.
They drive new operational requirements like elasticity, rapid provisioning, and simplified ops.
ADVICE
Make Databases Fast And Elastic
Provision databases in seconds and pay-as-you-go to support spiky agent workloads.
Prefer serverless, automated patching, and elastic pricing to handle ephemeral or long-lived agent databases.
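As a rough illustration of the provision-in-seconds, pay-as-you-go pattern, here is a minimal sketch using boto3's RDS API to create an Aurora Serverless v2 PostgreSQL cluster. The cluster name, credential handling, and capacity bounds are illustrative placeholders, not recommendations from the episode.

    # Sketch: create an Aurora Serverless v2 (PostgreSQL) cluster that scales
    # between 0.5 and 4 ACUs, so a spiky agent workload only pays for the
    # capacity it actually uses. Names and capacity bounds are placeholders;
    # engine version and networking settings are omitted for brevity.
    import boto3

    rds = boto3.client("rds")

    rds.create_db_cluster(
        DBClusterIdentifier="agent-scratch-db",      # hypothetical cluster name
        Engine="aurora-postgresql",
        MasterUsername="agent_admin",
        ManageMasterUserPassword=True,               # let RDS manage the secret
        ServerlessV2ScalingConfiguration={
            "MinCapacity": 0.5,                      # scale near zero when idle
            "MaxCapacity": 4,                        # cap spend on bursts
        },
    )

    # Serverless v2 capacity is attached through an instance of the special
    # "db.serverless" instance class.
    rds.create_db_instance(
        DBInstanceIdentifier="agent-scratch-db-1",
        DBClusterIdentifier="agent-scratch-db",
        Engine="aurora-postgresql",
        DBInstanceClass="db.serverless",
    )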
INSIGHT
Vector Is A Tool, Not A Replacement
Vector databases are a powerful new tool but not a replacement for other models.
Relational, graph, document, and key-value stores remain essential alongside vectors.
Summary
In this episode of the Data Engineering Podcast Marc Brooker, VP and Distinguished Engineer at AWS, talks about how agentic workflows are transforming database usage and infrastructure design. He discusses the evolving role of data in AI systems, covering newer approaches like vectors and RAG alongside traditional relational databases. Marc explains why agents require serverless, elastic, and operationally simple databases, and how AWS solutions like Aurora and DSQL address these needs with features such as rapid provisioning, automated patching, geo-distribution, and support for spiky usage. The conversation covers topics including tool calling, improved model capabilities, state in agents versus stateless LLM calls, and the role of Lambda and AgentCore for long-running, session-isolated agents. Marc also touches on the shift from local MCP tools to secure, remote endpoints, the rise of object storage as a durable backplane, and the need for better identity and authorization models. The episode highlights real-world patterns like agent-driven SQL fuzzing and plan analysis, while identifying gaps in simplifying data access, hardening operations for autonomous systems, and evolving serverless database ergonomics to keep pace with agentic development.
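To make the tool-calling and plan-analysis patterns above more concrete, here is a minimal sketch of a database tool an agent could call to fetch a Postgres execution plan. The function name, tool schema, and connection string are assumptions for illustration rather than anything prescribed in the episode; a production version would also need read-only credentials and query sanitization.

    # Sketch: expose an "explain_query" tool to an agent so the model can ask
    # for a Postgres execution plan and reason about it, rather than running
    # arbitrary SQL directly. The DSN and tool schema are placeholders.
    import psycopg2

    def explain_query(sql: str) -> str:
        """Return the execution plan for a query without executing it."""
        conn = psycopg2.connect("dbname=agents host=localhost")  # placeholder DSN
        try:
            with conn.cursor() as cur:
                cur.execute("EXPLAIN " + sql)
                return "\n".join(row[0] for row in cur.fetchall())
        finally:
            conn.close()

    # A JSON-schema style tool description, in the shape most chat-completion
    # APIs accept, so the model knows when and how to call the function.
    EXPLAIN_TOOL = {
        "name": "explain_query",
        "description": "Get the execution plan for a SQL query without running it.",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    }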
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management
Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Your host is Tobias Macey and today I'm interviewing Marc Brooker about the impact of agentic workflows on database usage patterns and how they change the architectural requirements for databases
Interview
Introduction
How did you get involved in the area of data management?
Can you describe what the role of the database is in agentic workflows?
There are numerous types of databases, with relational being the most prevalent. How does the type and purpose of an agent inform the type of database that should be used?
Anecdotally I have heard about how agentic workloads have become the predominant "customers" of services like Neon and Fly.io. How would you characterize the different patterns of scale for agentic AI applications? (e.g. proliferation of agents, monolithic agents, multi-agent, etc.)
What are some of the most significant impacts on workload and access patterns for data storage and retrieval that agents introduce?
What are the categorical differences in that behavior as compared to programmatic/automated systems?
You have spent a substantial amount of time on Lambda at AWS. Given that LLMs are effectively stateless, how does the added ephemerality of serverless functions impact design and performance considerations around having to "re-hydrate" context when interacting with agents?
What are the most interesting, innovative, or unexpected ways that you have seen serverless and database systems used for agentic workloads?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on technologies that are supporting agentic applications?
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.