Ed Donner, Co-founder and CTO of Nebula.io, shares insights from a career that took him from JPMorgan Chase to AI entrepreneurship. He delves into the role of an AI engineer and the exciting job opportunities in the field. Ed explains how to choose a large language model effectively, introduces key AI terms like RAG and agentic AI, and discusses the value of blending real and synthetic data for testing. He and host Jon Krohn even pit LLMs against each other in a competitive game, showcasing an innovative approach to AI evaluation.
AI engineers are in high demand as their role combines data science, software engineering, and machine learning to implement AI solutions.
Choosing a large language model often starts with closed-source options for prototyping before transitioning to open-source models based on specific needs.
Techniques like fine-tuning, Retrieval-Augmented Generation (RAG), and agentic AI are crucial for developing effective and proactive AI systems.
Deep dives
The Role of AI Engineer in Today's Workforce
AI engineers, sometimes referred to as LLM engineers, are in high demand, rivaling the number of job openings for data scientists. This new role is a hybrid of data science, software engineering, and machine learning engineering, requiring professionals to understand each of these aspects of AI. With approximately 4,000 job openings in the U.S. alone, the growth in this area reflects the increasing reliance on machine learning and AI technologies across industries. Organizations seek individuals who can design and implement AI solutions effectively.
Selecting the Right LLM for Applications
When choosing a large language model (LLM), AI engineers typically start with closed-source models like GPT-4 for prototyping before considering open-source options. Closed-source models are often preferable for their robust performance during initial testing and prototyping. However, engineers might transition to open-source models for applications that require handling proprietary data, tighter privacy controls, or reduced inference costs. This careful selection process ensures that the chosen model aligns with the technical and business requirements of the task at hand.
Techniques Used in AI Engineering
AI engineering employs various techniques, such as fine-tuning models with domain-specific data, Retrieval-Augmented Generation (RAG), and agentic AI for creating proactive systems. Fine-tuning allows models to be tailored to specific tasks or industries, improving overall effectiveness. RAG enhances model responses by augmenting them with relevant context retrieved from external data sources, which improves accuracy and functionality. Agentic AI, on the other hand, empowers systems to operate autonomously and proactively, responding to user needs before they ask.
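To make the RAG idea concrete, here is a minimal sketch of the retrieve-then-augment pattern. Everything in it is illustrative: the documents are made up, the word-overlap relevance score stands in for a real embedding-based retriever, and the assembled prompt would normally be sent to an LLM rather than printed.

```python
def score(query: str, doc: str) -> int:
    """Toy relevance score: number of words the query and document share."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the user's question with retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Fine-tuning adapts a model with domain-specific data.",
    "RAG augments prompts with retrieved context from external sources.",
    "Agentic AI lets systems act autonomously and proactively.",
]
prompt = build_prompt("How does RAG change prompts?", docs)
```

In a production system, the toy `score` function would be replaced by vector similarity over document embeddings, but the overall shape (retrieve relevant context, then prepend it to the prompt) stays the same.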
Evaluating Model Performance with Benchmarks
Evaluating the effectiveness of AI models requires reference to various benchmarks, such as GPQA for testing expert-level knowledge and MMLU Pro for assessing language understanding. The BIG-Bench Hard (BBH) benchmark is particularly useful for probing capabilities that models may not yet possess but that are essential for solving complex problems. These benchmarks help AI engineers select the model best suited to their specific application, guiding the development process. Benchmark results can be found on platforms like Hugging Face and serve as critical data points in model evaluation.
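One simple way to turn leaderboard numbers into a selection decision is to weight each benchmark by how much it matters for the use case. The sketch below is purely illustrative: the model names, scores, and weights are invented for the example, not real leaderboard results.

```python
# Hypothetical benchmark scores (invented for illustration, not real results).
scores = {
    "model-a": {"MMLU-Pro": 72.1, "GPQA": 48.3, "BBH": 82.5},
    "model-b": {"MMLU-Pro": 69.4, "GPQA": 55.0, "BBH": 77.2},
}

# Weight each benchmark by its importance to the target application.
weights = {"MMLU-Pro": 0.5, "GPQA": 0.3, "BBH": 0.2}

def weighted_score(benchmarks: dict[str, float]) -> float:
    """Combine per-benchmark scores into one weighted figure of merit."""
    return sum(weights[name] * value for name, value in benchmarks.items())

best = max(scores, key=lambda m: weighted_score(scores[m]))
```

The weights encode the engineering judgment: an application heavy on multi-step reasoning might up-weight BBH, while one serving expert domains might up-weight GPQA.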
Deployment Strategies for AI Solutions
Deploying AI models into production can take several forms, from using serverless platforms like modal.com to deploying via containerized solutions such as Docker and Kubernetes. AI engineers often create end-to-end solutions that incorporate both training optimizations and inference choices tailored to their business cases. The choice of deployment method can vary, with some teams focusing on building complete pipelines while others simply provide models for use by different teams. Ultimately, the deployment strategy will depend on organizational needs and the specific complexities of the AI systems being built.
Ed Donner co-founded the AI-driven recruitment platform Nebula.io with The SuperDataScience Podcast's host, Jon Krohn. Ed and Jon reminisce about how they launched their company and discuss the growing opportunities for data scientists, how to choose an LLM, and today's top technical terms in AI.
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
(11:15) What an AI engineer does
(19:23) Defining today's key terms in AI: RAG, fine-tuning, and agentic AI
(27:09) How to select an LLM
(49:41) Pitting LLMs against each other in a game
(53:14) What to do once you’ve selected an AI model