

Generative AI in the Real World
O'Reilly
In 2023, ChatGPT put AI on everyone’s agenda. Now, the challenge will be turning those agendas into reality. In Generative AI in the Real World, Ben Lorica interviews leaders who are building with AI. Learn from their experience to help put AI to work in your enterprise.

Aug 29, 2025 • 35min
Tom Smoker on Getting Started with GraphRAG
Join Ben Lorica and Tom Smoker for a discussion of GraphRAG, one of the hottest topics of the last few months. GraphRAG goes a step beyond RAG to make the output of language models more consistent, accurate, and explainable. But what is a graph? A graph is a way of structuring data. In the end, it’s the structure that’s important, along with the work you do to create that structure.
Points of Interest
0:15: GraphRAG is RAG with a knowledge graph. Do you have a stricter definition?
1:00: A lot of what I do is the R in RAG: retrieve. Retrieval is better if you have structured data. I’ve yet to find a definition for GraphRAG. You want to bring in structured data.
2:03: At the end of the day, the lesson is structure. Sometimes structure is a SQL database. Don’t lose hope if you don’t have a knowledge graph.
2:49: A knowledge graph is a knowledge base and a list of axioms (rules). The knowledge base is just a word connected to another word through a third word. Fundamentally, the benefit comes from the list of triples. The value is in having extracted and defined those triples.
4:01: Knowledge graphs are cool again. What are your two favorite examples of GraphRAG in production?
4:57: My examples are people who are structuring their data so that it’s consistent. Then you can bring it into a context window and do something with it.
5:18: LinkedIn and Pinterest are the best examples of existing graph structures that work.
5:35: A new application is a veterinary radiology example. Without GraphRAG, the LLM kept recommending conditions specific to Labradors, not bulldogs. GraphRAG controlled the problem.
6:37: The underlying data was almost exclusively text. It’s difficult to build up a consistent dataset for veterinary radiology because animals move.
7:12: My favorite examples: Google uses their data commons to build a Q&A application. Metaphor Data: The starting point is structured data, then they create a second graph from the first graph that maps technical terms to business terms. Then they construct a social graph based on who is using the data.
9:41: Structured data can be the basis for a graph.
10:06: Unstructured data is valuable, but you need a way to navigate and categorize unstructured data.
11:04: Where are we on GraphRAG? Do you still have to explain what GraphRAG is?
11:28: More people know about it, but I have to explain it more than I did previously. Exactly what are we referring to? Most people want accuracy in the beginning; the value is often that it is more explainable. People may have seen a fantastic example, but what they haven’t seen is the iterative process in schema design. The upfront cost of these systems is nontrivial.
13:13: What are the key bottlenecks? How do I get a knowledge graph?
13:23: The biggest question is: Do you need a graph in the first place? There’s a whole spectrum. It’s in most people's interest to stop before they get to the end.
14:01: For people who come to us brand-new, we say, “You should try vector RAG first. If that doesn’t work, there’s a lot of good that structuring data can provide.”
15:01: If the chunks are structured, and a lot of the work is done up front, then it’s possible to navigate through structured information. At that point, you get value out of vector RAG. Academic papers have to follow a certain structure. If you spend time making sure you know what the chunks are, where they’re split and why, and they’re labeled, you can get a lot of value.
16:43: What are some of your pointers about how to get started?
16:47: The knowledge base is often a compressed representation. That means fewer tokens. That means better rate limits and less cost. So some people want a graph to help scale. That’s one start. Another is the desire for a system to be explainable. Getting that information into a structured representation and tracing back that structured representation can be very useful.
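The 2:49 description of a knowledge base as "a word connected to another word through a third word" can be made concrete with a short Python sketch. This is only an illustration under stated assumptions: the `Triple` type, the helper names, and the placeholder facts are all hypothetical and do not come from the episode or from any particular GraphRAG system.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: a subject connected to an object through a predicate."""
    subject: str
    predicate: str
    obj: str


# Placeholder facts, invented purely for illustration.
KNOWLEDGE_BASE = [
    Triple("labrador", "is_a", "dog_breed"),
    Triple("bulldog", "is_a", "dog_breed"),
    Triple("condition_a", "affects", "labrador"),
    Triple("condition_b", "affects", "bulldog"),
]


def facts_about(entity: str, triples: list[Triple]) -> list[Triple]:
    """Return every triple that mentions the entity as subject or object."""
    return [t for t in triples if entity in (t.subject, t.obj)]


def to_context(triples: list[Triple]) -> str:
    """Render triples as short lines that could be placed in a context window."""
    return "\n".join(f"{t.subject} {t.predicate} {t.obj}" for t in triples)


bulldog_context = to_context(facts_about("bulldog", KNOWLEDGE_BASE))
print(bulldog_context)
```

Because retrieval here is a filter over explicit triples, the text handed to the model mentions only facts tied to the requested entity, and each line can be traced back to the triple it came from.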

Aug 28, 2025 • 30min
Robert Nishihara on AI and the Future of Data
Robert Nishihara is one of the creators of Ray and cofounder of Anyscale, a platform for high-performance distributed data analysis and artificial intelligence. Ben Lorica and Robert discuss the need for data for the next generation of AI, which will be multimodal. What kinds of data will we need to develop models for video and multimodal data? And what kinds of tools will we use to prepare that data?
Points of Interest
1:06: Are we running out of data?
1:35: There is a paradigm shift in how ML is thinking about AI. The innovation is on the data side: finding data, evaluating sources of data, curating data, creating synthetic data, filtering low-quality data. People are curating and processing data using AI. Filtering out low-quality data or unimportant image data is an AI task.
5:02: A lot of the tools were aimed at warehouses and lakehouses. Now we increasingly have more unstructured multimodal data. What's the challenge for tooling?
5:44: Lots of companies have lots of data. They get value out of data by running SQL queries on structured data, but structured data is limited. The real insight is in unstructured data, which will be analyzed using AI. Data will shift from SQL-centric to AI-centric. And tooling for multimodal data processing is almost nonexistent.
8:23: In part of the pipeline, you might be able to use CPUs instead of GPUs.
8:44: Data processing is not just running inference with an LLM. You might want to decompress video, re-encode video, find scene changes, transcribe, or classify. Some stages will be GPU bound, some will be memory bound, some will be CPU bound. You will want to be able to aggregate these different resources.
10:03: Most likely, with this kind of data, it's assumed you will have to go distributed and scale out. There is no choice but to scale the computation.
10:46: In the past, we were only using structured data. Now we have multimodal data. We are only scratching the surface of what we can do with video, so people weren't collecting it as much. We will now collect more data.
11:41: We need to enable training on 100 times more data.
12:43: ML infrastructure teams are now on the critical path.
13:52: Companies at the cutting edge have been doing this, but nearly every company has its own data about its specific business that it can use to improve its platform. The value is there. The challenge is the tooling and the infrastructure.
15:15: There's another interesting angle around data and scale: experimentation. You will have to run experiments. Data processing is part of experimentation.
16:18: Customization isn't just at the level of the model. There are decisions to be made at every stage of the pipeline: what to collect, how to chunk, how to embed, how to do retrieval, what model to use, what data to use to fine tune. There are so many decisions to make. To iterate quickly, you need to try different choices and evaluate how they work. Companies should overinvest in evals early.
17:29: If you don't have the right foundation, these experiments will be impossible.
18:23: What's the next data type to get popular?
18:42: Image data will be ubiquitous. People will do a lot with PDFs. Video will be the most challenging. Video combines images and audio; text can be in video too. But the data size is enormous. There are modeling challenges around video understanding. There's so much information in video that isn't being mined.
22:50: Companies aren't saying that scaling laws are over, but scaling is slowing down. What's happening?

Aug 27, 2025 • 27min
Getting Ahead of the Curve with Claire Vo
In this episode, Ben Lorica talks with Claire Vo, chief product officer at LaunchDarkly and founder of ChatPRD. AI gives us a new set of tools that make everyone more productive and efficient. Those tools will allow more experimentation; they will allow more people to participate in product development; and they will create new opportunities for startups. As Claire says, this new tooling lets everyone get more ambitious, and if you start now, you’re on the leading edge. Lean in to the opportunities.
Points of Interest
0:25: ChatPRD is an AI copilot for product managers and people who build products. The goal is to make people who need to generate ideas and build out requirements more efficient.
1:15: It improves the quality of product work: it’s an on-demand coach or colleague.
2:05: In a hybrid world, there needs to be some kind of artifact describing what we want to build. No matter the culture, you should try to make high-quality documents to improve the thinking.
3:44: We ingest your product documents for two reasons: to have context on what you’ve built and what matters, and to inform style and quality.
5:13: To become a 100x PM you need to embrace tools and accelerate your work. It’s learning how to scale and do your best in a highly efficient way. Getting 2–3 days back in your week.
7:17: Will the programming language of the future be natural language? You will still have to think and describe things as a software engineer or a product manager.
7:54: My favorite users are engineers who don’t have product managers, sales people who get customer requests, and even founders who can’t afford a product manager.
8:41: In frontier models, I’d like to see up-to-date training data. The killer feature is performance. The models need to support a workflow that requires speed. Models need more control over output mechanisms than they have now, so users don’t have to massage output.
10:38: There isn’t capability parity between the models, so you have to make trade-offs between performance, features, API support, latency, user experience, and streaming.
11:05: Always design your application to be model agnostic. LaunchDarkly allows engineers to decouple the configuration and release of their code from deploying in production.
12:14: With AI, prompts become feature flags. You can measure things like latency and token count, and make informed decisions about what works best.
13:21: It’s important to have the ability to experiment in classic software development. That matters even more with nondeterministic software, because the ability to predict output goes down. You need to think about instrumentation from the beginning.
14:37: I have been through a couple of technology waves, but this one has stopped me in my tracks. The difference between what is possible and what is not possible is unbelievable. I could have built the product from my startup 10 years ago before lunchtime.
16:01: People need to prepare to be expected to do more because the ability to do more is powered by these tools and automations. People should educate themselves on how to automate tasks in their current job, and they should add additional skills like the ability to code.
16:42: The shape of organizations will change. The triad of the product manager, engineering lead, and design lead will collapse into an individual. Individual contributors will become more efficient.
17:35: Everyone can get more ambitious. There won’t be less to do. More people will be empowered to do more things and have bigger impact.
18:44: Everything requires a radical cultural shift inside companies. It can feel scary. You need to set the aspiration and why it matters; you need to organize among motivated individuals and reward the behavior you want to see; new organizations will fall out of the centers of gravity around people who are operating in an AI-native way.
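The 12:14 idea that "prompts become feature flags" can be sketched roughly as follows. This is a minimal illustration, not LaunchDarkly's SDK or ChatPRD's implementation: the in-memory `PROMPT_FLAGS` dict, the `fake_model` stub, and the whitespace-based token count are all assumptions made so the example runs on its own.

```python
import time
from dataclasses import dataclass

# Hypothetical flag store. A real system would fetch these values from a
# feature-flag service; a plain dict stands in for it here.
PROMPT_FLAGS = {
    "summary-prompt": {
        "control": "Summarize the document in one paragraph.",
        "concise": "Summarize the document in two sentences.",
    }
}


@dataclass
class CallMetrics:
    variant: str
    latency_seconds: float
    token_count: int


def fake_model(prompt: str, document: str) -> str:
    """Stand-in for an LLM call so the sketch runs without a network."""
    return f"[{prompt}] {document[:40]}"


def run_variant(flag: str, variant: str, document: str) -> CallMetrics:
    """Call the model with the prompt the flag selects and record metrics."""
    prompt = PROMPT_FLAGS[flag][variant]
    start = time.perf_counter()
    output = fake_model(prompt, document)
    latency = time.perf_counter() - start
    # Whitespace splitting is a crude proxy for a real tokenizer.
    return CallMetrics(variant, latency, len(output.split()))


results = [
    run_variant("summary-prompt", variant, "Quarterly roadmap notes ...")
    for variant in PROMPT_FLAGS["summary-prompt"]
]
for r in results:
    print(r.variant, f"{r.latency_seconds:.6f}s", r.token_count)
```

Keeping the prompt text behind a flag name means a variant can be swapped or rolled back without redeploying code, and the recorded latency and token counts give a basis for comparing variants.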

Aug 26, 2025 • 46min
The Future of Programming with Matt Welsh
Join us for a conversation between Ben Lorica and Matt Welsh, cofounder of Fixie.ai, former engineer at Apple and Google, and one of Mark Zuckerberg’s professors at Harvard. Learn how AI is changing computing. Whether it’s in C or a human language, programming is telling a computer what you want it to do, but AI opens up new classes of things that we can ask it to do. It’s not just simplifying (or replacing) coding; it’s creating new opportunities and new kinds of applications that we couldn’t imagine two or three years ago.
Points of Interest
0:00: Introduction.
2:38: The changing nature of programming. What will replace programming?
3:07: Ultimately, the idea of writing a program will be replaced by telling a language model what you want to do. The language model will do what you want directly.
5:03: I can do things I couldn’t imagine doing; for example, summarize a transcript or find bios of speakers and relevant papers.
7:01: There’s a whole new field of kinds of computation we couldn’t do before.
7:48: People in fields like medicine used to have to ask computer scientists to do things for them. Now, you don’t have to get a computer scientist to translate an idea into reality.
11:30: What is missing from the current tooling?
11:40: It’s way too hard for people without programming ability to integrate language models into their workflows. Ultimately, AI needs to be deeply integrated into products and the OS.
13:45: Are people in the UX community inventing new ways to interact?
14:40: We are very embedded in a web/mobile-based way of thinking about interacting. AI changes the ways we interact with computers; for example, voice.
16:07: There’s a lot of information encoded into voice that you miss when you encode it into text.
18:15: What about programming itself?
18:30: Programming is changing radically. At Fixie, we mandated that employees have access to ChatGPT and similar tools.
20:34: What is the role of testing and QA?
21:28: People will struggle to find the right trade-offs. We’re not throwing out all of the processes we’ve developed, like testing and code reviews.
25:25: Every company can train AI to scale their best engineers.
25:55: We’re being sloppy as an industry. Curation of good code and good documents will be important. We don’t just need more data, we need better data.
28:23: What is Aryn doing?
29:17: When people wanted to use AI models to ask questions about their data, they started with simple processes: break text into chunks, store the chunks in a vector database, and at question time, feed them back into the prompt.
30:10: We need the ability to extract data from unstructured documents. The structure is there, but it’s hidden. The first part of Aryn: How do you extract the structure inherent in documents?
32:46: The second part of Aryn: A Python framework, Sycamore, lets you build ETL pipelines from these documents. ETL does things like normalize location information.
35:45: Another part of the Aryn stack is LLM-powered unstructured analytics (LUNA) that allows you to make queries based on the unstructured data in the documents.
37:34: The future of programming is using language models as computers to perform computation that would be difficult to express in a programming language.
38:22: People are talking about GraphRAG, which is RAG with knowledge graphs, but how do you get a knowledge graph? Can Aryn help with that?
39:15: Yes, we’re effectively doing knowledge graph construction. But once you have the right underlying structure, you may not need knowledge graphs at all.
40:50: Are tools for evaluating AI lagging behind development tools?
41:16: The meaning of “evaluation” is often not well-defined.
43:03: Evaluation will come down to establishing trust.
43:32: We need tools that will allow people to collaborate early on evaluations. You need to give people tools that help them understand what’s happening.
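The "simple process" described at 29:17 (chunk the text, store the chunks, retrieve at question time, feed them into the prompt) might look like the sketch below. It is deliberately naive and is not Aryn's or Sycamore's API: the word-count "embedding" stands in for a real embedding model, the Python list stands in for a vector database, and the sample document is invented.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a real system would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_prompt(question: str, store: list[tuple[str, Counter]], k: int = 1) -> str:
    """Retrieve the k most similar chunks and place them ahead of the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    context = "\n".join(chunk for chunk, _ in ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {question}"


# Invented sample document, for illustration only.
document = (
    "The warehouse in Oakland ships orders on weekdays. "
    "Refunds are processed within ten business days of a return."
)
store = [(chunk, embed(chunk)) for chunk in chunk_text(document)]
prompt = build_prompt("How long do refunds take?", store)
print(prompt)
```

Note that the fixed-size splitter cuts the second sentence in two, so the retrieved chunk ends mid-sentence. That is the kind of lost structure the 30:10 point is about.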

Aug 25, 2025 • 35min
Kingsley Ndoh on Improving Cancer Care with AI
What can AI do to improve healthcare? Kingsley Ndoh, founder of Hurone AI, talks with Ben Lorica about how Hurone is making cancer care more effective for people who are underserved by the medical system. He discusses how AI can streamline the medical process, both helping doctors to treat patients more effectively and making clinical trials more diverse.
Points of Interest
0:36: What motivated you to apply AI to cancer care? What problems are you trying to solve?
1:39: We need environments for training AI models that are effective for all populations.
2:31: Current oncology solutions serve advanced healthcare systems, leaving community oncology centers and international markets underserved.
3:31: Lack of diversity in clinical trials means we don’t have full evidence on the efficacy of drugs.
5:00: What is an oncologist?
6:10: Cancer is a very complex disease; every cancer is different and has its own solutions.
6:43: What advantages do you bring as a domain expert?
7:11: I’ve been a physician taking care of patients. I understand clinical workflows in Nigeria and the US. I’ve also been an entrepreneur since I was in high school. I’ve also worked in the global oncology space with governments and pharma companies. That network is very important.
9:15: What was the situation before Gukiza [Hurone’s app]? What does Gukiza enable today?
9:44: Gukiza makes care more accessible to patients and optimizes workflows for oncologists. Patients may have to travel long distances to see an oncologist; they may have side effects or even emergencies that are avoidable; data about events may be lost.
12:53: Gukiza streamlines the process; it’s a two-way system that can be used standalone. There is a HIPAA-compliant API that can be integrated into major electronic medical records systems. Patients aren’t limited to an app; there is an API for WhatsApp, Telegram, and text messaging.
14:13: Patients can describe their problems. Clinicians can click a button and generate a response that they can review and send to the patient. Clinicians can also call patients, do clinical summaries, and see how patients are progressing.
17:08: One should think about this as a copilot. The app makes suggestions; the physician makes the decision.
17:35: There are definitely risks. We are building our model and fine-tuning it to ensure that hallucination is limited. But there is still a final human review.
18:40: What if I want to use the system in a completely new country? What does it take to get the system into a viable, usable state?
19:41: We conform to the country’s guidelines for the management of patients. Cancer care is usually based on established guidelines. In the US, we have NCCN guidelines. To make sure guidelines are responsive to different regions, the NCCN looked at evidence from research done in different countries to harmonize guidelines. That gave birth to the resource-stratified guidelines for regions like Sub-Saharan Africa. We don’t need to customize a lot.
21:38: We are also building agreements for access to de-identified cancer data. As we scale, it will get better.
24:02: Health data is the most sensitive data in the world, but also the most abundant. Compared to other industries, healthcare is lagging behind. But many regions are looking for disruption and innovation and are willing to be flexible to work with us.
25:20: Our solution isn’t a magic bullet, but it will shift the needle.
26:12: We are excited about LLMs with text and images. But before LLMs, people were excited about computer vision. What models are you using?
27:10: We’re relying on LLMs and NLP. There are established startups with computer vision for radiology and pathology; we are partnering with those companies. The major data we collect is genomic data. We are also incorporating wearable device data with things like geolocation, sleep patterns, heart rates, etc.
28:28: Social determinants of health data are also important: ZIP code, employment status, activities, food.

Aug 22, 2025 • 35min
Putting AI in the Hands of Farmers with Rikin Gandhi
Rikin Gandhi, CTO of Digital Green, talks with Ben Lorica about using generative AI to help farmers in developing countries become more productive. Farmer.Chat integrates information from training videos, sources of weather and crop information, and other data sources in a multimodal app that farmers can use in real time.
Points of Interest
0:45: Digital Green helps farmers become more productive. Two years ago, Digital Green developed Farmer.Chat, an app that uses generative AI to put local-language training videos together with weather data, market information, and other data.
2:09: Our primary data source is our library of 10,000 videos in 40 languages that have been produced by farmers. We integrate additional sources for weather and market information. More recently, we’ve added information support tools.
3:38: We have a smartphone app. Users who only have feature phones can call into a number and interact with a bot.
5:00: Prior to Farmer.Chat, our work was primarily offline: videos shown on mobile projectors to an in-person audience. Sending content to phones flips the paradigm: rather than attending a video, farmers can ask questions relevant to their situation.
6:40: When did you realize that generative AI opened up new possibilities? It was a gradual transition from offline videos on projectors. COVID didn’t allow us to get groups of farmers together. And more farmers came online in the same period.
8:17: We had a deterministic bot before Farmer.Chat. But users had to traverse a tree to get the information they wanted. That tree was challenging to create and difficult to use.
9:33: With GPT-3, we saw that we could move away from the complexity and cost of using a deterministic bot.
11:15: Did ChatGPT alert you to more possibilities? ChatGPT has scoured open internet knowledge. Farmers are looking for location- and time-specific information. Even in the earliest version of ChatGPT, we saw that it had a lot of this information. Putting this world together with our video was powerful.
13:07: Accuracy, precision, and recall are all important. Are you fine-tuning and using RAG to make sure you are accurate? We had problems with hallucinations even within our knowledge base. We implemented reranking and filtering, which reduced hallucinations to <1%. We’ve created a golden Q&A set.
16:01: People are now talking about GraphRAG, the use of knowledge graphs for RAG. Can you create a knowledge graph because you know your data so well? A lot of concepts in agriculture are related; for example, crop calendars for how crops develop. We’re trying to build those relations into the system.
17:05: We are leveraging agentic orchestration for the overall pipeline. Based on the user’s query, we may be able to answer questions directly rather than go through the RAG pipeline.
18:44: Your situation is inherently multimodal: video, speech-to-text, voice; is this a challenge? We’re now using tools like GPT Vision to get descriptive metadata about what’s in videos. It becomes part of the database. We began with text queries; we added voice support. And now people can take a photo of a crop or an animal.
21:04: Foundation models are becoming multimodal. What’s your user interface today? What are you moving towards? We started with messaging apps that the users already use. We’re plugging the bot into that ecosystem. We’re migrating towards a reality that isn’t text first: putting video first so farmers can speak and take a video. For many farmers, this is the first time they’ve interacted with a bot. Autoprompts are important so they know that it has weather and locale-specific information.
23:57: What are specific challenges around AI: privacy, security, and ethics? Agriculture is often a sensitive subject. There’s a lot of personally identifiable information. We try to mask that information so it’s not used to train models. Farmers need to be able to trust that their information won’t be taken away from them.
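The "reranking and filtering" step mentioned at 13:07 is not described in detail, so the following is only a generic sketch of the technique, not Digital Green's implementation. The word-overlap scorer stands in for a real reranking model, the `min_score` threshold is an arbitrary assumption, and the candidate passages are invented placeholders rather than agronomic advice.

```python
import re


def overlap_score(query: str, passage: str) -> float:
    """Toy reranker: fraction of query words that also appear in the passage."""
    q = set(re.findall(r"[a-z]+", query.lower()))
    p = set(re.findall(r"[a-z]+", passage.lower()))
    return len(q & p) / len(q) if q else 0.0


def rerank_and_filter(
    query: str, passages: list[str], min_score: float = 0.5
) -> list[tuple[float, str]]:
    """Score each passage, drop weak matches, and return the rest best-first."""
    scored = [(overlap_score(query, p), p) for p in passages]
    kept = [(score, p) for score, p in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Invented placeholder passages, standing in for first-stage retrieval results.
candidates = [
    "Sow maize at the start of the rainy season.",
    "Store harvested grain in dry, sealed bags.",
    "Maize needs weeding three weeks after sowing.",
]

ranked = rerank_and_filter("when to sow maize", candidates)
unanswerable = rerank_and_filter("tractor repair", candidates)
print(ranked)
print(unanswerable)
```

When the filter leaves nothing, as in the second query, an application can decline to answer instead of passing weak context to the model, which is one way such a step can cut down on hallucinated answers.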

Aug 21, 2025 • 34min
Adopting AI in the Enterprise with Timothy Persons
Timothy Persons of PricewaterhouseCoopers (PwC) talks with Ben Lorica about adoption of AI in the enterprise. They discuss the challenges enterprises experience, including the need to change corporate culture. To succeed, it’s important to focus on solving well-defined problems rather than just doing something cool with AI. Good data strategies and data governance are essential. Persons also highlights the importance of training and education for everyone in the organization and the need to create safe environments where people can experiment.
Points of Interest
0:00: Introduction.
1:00: We are seeing an uptick in adoption of AI in the enterprise. CEOs are planning to adopt AI and pursue business reinvention. Many companies are still kicking the tires. There is more adoption in the backend where risks are lower.
3:36: AI budgets are on an upward trend. It is not a small spend and there’s a tendency to underestimate cost.
4:54: What are some of the key challenges that enterprises face when they go to deployment?
5:10: It’s all about trust and culture: getting employees and executives comfortable with the technology. That implies upskilling and internal conversations.
7:09: What is a data strategy for generative AI?
7:37: Companies need data governance, which must be more than a well-written policy document. Governance means operationalizing the policy. Once you focus on quality data and abide by governance, you have the foundation for a good future.
9:26: How do you measure that you’re delivering ROI? How do you evaluate so that you know your LLM-backed application is ready to go?
10:50: ROI: We need to separate the R from the D in R&D. For R, ROI doesn’t work well. But when you cross from R to D and investments scale, you need to think about ROI.
12:15: Evaluation: We can measure LLMs today. But what does that mean in the context of the problem you’re solving? AI in autonomous vehicles is different from AI in medical systems.
13:58: Companies need to invest in educating the workforce. Upskilling is not just for expertise; it is also for interdisciplinarity. Changing organizational culture means changing the way organizations communicate and partner.
15:38: People underestimate the importance of creating a good user experience. Design thinking is needed. Focus on end-user experience and work back from that.
16:59: What are some of the most common use cases for AI?
17:17: In the back office, you often have a corpus of information customized to your situation. You can build fit-for-purpose chatbots for key support functions. The best lawyers can’t read everything possible in the corpus or keep up with all the regulatory changes coming in.
21:11: AI will increase the value of labor investments. It will expedite the L&D curve for new employees. It will improve users’ lives. And AI is getting much better. We’ve only seen the floor, not the ceiling.
24:38: Do you have a checklist or a playbook to help companies prioritize use cases?
24:57: Companies need to think “What problems do I need to solve?” Think from a problem-centric approach.
27:32: Are there best practices for sharing learning across different groups?
28:17: We’ve seen centers of excellence rise. Sharing what didn’t work is important. GenAI is very democratizing; not everyone needs a PhD. When companies reward sharing, including what didn’t work, it really engenders collective learning and great ideas.
30:15: What have leading companies done to prepare their workforces?
30:31: PwC made a major investment in MyAI, which was focused on the ability to get AI into the hands of users, down to entry-level interns. It was an intentional L&D process that was focused on AI. We gave people the tools and a safe space to use them.
32:43: It’s learning by doing, and it’s fun. And it can be customized to a company or a firm.
33:03: If we didn’t provide a controlled environment, our people would go out into an uncontrolled environment.

Aug 21, 2025 • 40min
Learning How to Do AI Effectively with Alfred Spector
Alfred Spector has been a leader in AI and machine learning at Google, IBM, and Two Sigma. He is now a visiting scholar at MIT, an advisor at Blackstone, and coauthor of the textbook Data Science in Context. Alfred talks with Ben Lorica about what people developing with AI need to be successful. Succeeding with AI is about more than just a model. We need to think about the application and its context. We need humanities and social sciences in addition to technology. Alfred also discusses the AI skills gap, resistance to adopting AI, “hybrid intelligence,” and the calls to regulate AI.
Points of Interest
0:00: Intro
0:54: What do we need to do to apply generative AI effectively?
2:10: Why did you end up writing the book Data Science in Context?
3:14: Data science is about more than the model. More than "just get some data and hope."
8:22: Ethics alone isn't enough.
11:08: Students need a good basis in economics, political science, history, and literature. We have to think more broadly than "which ad gets the most clicks."
14:20: There's an AI literacy and skills gap, particularly outside Silicon Valley.
15:43: Companies should be probing opportunities.
16:20: Is there resistance to adopting AI? Fear of displacement or distrust?
18:18: Most people think there is more to do than people to do the work.
19:21: To what extent are companies trying to come up with an overarching vision for AI?
19:51: For some companies, GenAI will be formative. Others need to kick the tires and put together a road map.
21:35: Internal applications can be more fault tolerant. Keep employees in the loop; don't be lazy.
23:12: Prior to ChatGPT, the barrier to entry was higher. AI is now very developer friendly.
24:13: What level of data science or ML knowledge should companies have?
25:01: There are two categories of expertise; broad perspective on products and services.
28:25: It may take a long time to evaluate whether an application can be deployed.
29:07: With agents, the stakes are higher.
30:07: Hybrid intelligence will be a coalition that includes AI.
32:38: Even task-specific agents can break. Agents are fragile. Humans aren't fast but are good at dealing with things we haven't encountered before.
33:43: Regulate uses of technology, not technologies.

Aug 19, 2025 • 28min
Andrew Ng on where AI is headed. It’s about agents.
Andrew Ng is one of the pioneers of modern AI. He was Google Brain’s founding technical lead, Coursera’s founder, Baidu’s Chief Scientist, DeepLearning.ai’s founder, a Professor at Stanford, and much more. Andrew talks with Ben Lorica about scaling AI, agents, the future of open source AI, and openness among AI researchers. Have you experienced an “agentic moment” when you’re surprised and thrilled by AI’s ability to generate a plan and then to enact that plan? You will.
Points of Interest
0:00: Introduction
1:00: Advancing AI required scaling up. Better algorithms weren’t the issue.
2:57: Just as we needed GPUs and other new hardware for training, we may need new hardware for inference.
3:18: People are pushing data-centric AI forward. Engineering the data is important, maybe even more important than engineering the model.
4:41: The idea of agents has been around for a while. What’s new here?
6:00: Agentic workflows let AI work iteratively, which yields a huge improvement in performance.
8:01: Agents can be used for Robotic Process Automation (RPA), but it’s much bigger than that. We will experience “agentic moments” when we see AI that plans and executes a task without human intervention.
10:42: Do you anticipate new agentic applications that weren’t possible before?
12:21: What are the risks of training on copyright-free datasets? Will using copyright-free datasets degrade performance?
15:05: AI is a tool; I dispatch it to do things for me. I don’t see it as a different “species.”
16:17: How do we know when an application is ready to release? What are best practices for enterprise use?
17:18: It’s still very early. We need more work on evaluation. It’s easy to build applications, but when you build an app in a week, it’s hard to spend 10 weeks evaluating it.
19:14: A lot of people build an application on one LLM, but won’t switch because evaluation is hard.
20:12: Are you concerned that Meta is the only consistent supplier of open source language models?
22:10: The cost of training is falling. The decrease in the cost of training means that the ability to train large models will become open to more players.
26:15: The AI community seems less open than it was, and more dominated by commercial interests. Is it possible that the next big innovation won’t get published?
26:50: We’re starting to see papers about alternatives to transformers. It’s very difficult to keep technical ideas secret for a long time.

Aug 19, 2025 • 34min
Democratizing AI with Gwendolyn Stripling
Gwendolyn Stripling, author of Low-Code AI, talks about the democratization of AI, the primacy of data, the future of data science, and the coming of agents. It’s easy to think that AI is all about algorithms and models, but it’s not; it’s really about understanding the business use case and the data that can be applied to that use case. We’re only beginning to have tools for the rest of the job: collecting, preparing, and exploring the data to find out what’s relevant to your business. Looking ahead, Gwendolyn sees generative AI automating even more of the workload. But focusing on the data, and collecting, understanding, and interpreting it, will always be the human part of the job.
Points of Interest
0:57: What’s the boundary between no-code and low-code?
3:10: Using the minimum amount of code necessary to achieve your goal.
4:09: Low-code reduces the heavy lifting. But what if you want to learn about AI and ML?
6:35: Learning ML isn’t about the tools; it’s about the business case and the data.
7:55: What made you think about exposing more people to low-code AI?
11:21: The key to all of this is the use case and then the data.
14:32: What if I primarily use SQL?
15:30: Is there an equivalent of AutoML for data collection and preparation?
16:50: Generative AI looks like it will be able to help prepare data.
19:22: How did the release of ChatGPT and other LLMs affect your book?
24:00: Is there a low-code or no-code approach to RAG?
26:30: The GenAI pipeline is becoming completely automated.
26:49: The word of 2024 is agents. A lot of what can be automated will be automated.
28:00: A lot of people are sharing lessons and best practices. That makes this an exciting time.
29:17: Looking ahead five years, what will data scientists and ML engineers do?


