

Evolving Responsibilities in AI Data Management
Feb 16, 2025
Bartosz Mikulski, an MLOps engineer with a background in data engineering, discusses AI data management. He highlights the crucial role of data testing in AI applications, especially with the rise of generative AI. Bartosz covers the need for specialized datasets and the skills data engineers need to transition into AI. He also addresses challenges such as frequent data reprocessing and handling unstructured data, which show how responsibilities in this field are evolving.
Data Testing in AI
- Data testing is more crucial in AI than in traditional software development, especially where the evaluation dataset (test dataset) is concerned.
- Multiple-step AI applications require separate test datasets for each step and the entire workflow.
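The idea of separate test datasets per step plus one for the whole workflow can be sketched roughly as follows. This is only an illustration, not something shown in the episode: the two steps (`normalize_question`, `route`), the datasets, and the exact-match `accuracy` metric are all hypothetical stand-ins for real model calls and real evaluation metrics.

```python
from typing import Callable, Dict, List, Tuple

Dataset = List[Tuple[str, str]]  # (input, expected output) pairs


# Hypothetical steps of a two-step application; a real system would call a model here.
def normalize_question(text: str) -> str:
    return " ".join(text.lower().split())


def route(question: str) -> str:
    return "billing" if "invoice" in question else "general"


def workflow(text: str) -> str:
    # The whole workflow is the steps chained together.
    return route(normalize_question(text))


def accuracy(fn: Callable[[str], str], dataset: Dataset) -> float:
    # Exact-match accuracy; assumed here only to keep the sketch short.
    hits = sum(1 for given, expected in dataset if fn(given) == expected)
    return hits / len(dataset)


# One dataset per step, so a failure can be traced to the step that caused it.
STEP_DATASETS: Dict[str, Tuple[Callable[[str], str], Dataset]] = {
    "normalize_question": (
        normalize_question,
        [("  Where is my INVOICE? ", "where is my invoice?"),
         ("Hello   there", "hello there")],
    ),
    "route": (
        route,
        [("where is my invoice?", "billing"),
         ("hello there", "general")],
    ),
}

# A separate dataset for the entire workflow, from raw input to final output.
END_TO_END: Dataset = [
    ("  Where is my INVOICE? ", "billing"),
    ("Hello   there", "general"),
]


def run_evaluation() -> Dict[str, float]:
    report = {name: accuracy(fn, data) for name, (fn, data) in STEP_DATASETS.items()}
    report["workflow"] = accuracy(workflow, END_TO_END)
    return report
```

Keeping the per-step scores next to the end-to-end score makes it easier to tell whether a regression comes from one step or from how the steps interact.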
Test Data for RAG and Agents
- Prepare test datasets for every step in a Retrieval Augmented Generation (RAG) application, including user input, queries, and responses.
- For AI agents, datasets must include expected tools, parameters, and queries for comprehensive testing.
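As a rough sketch of what an agent test case might contain, the snippet below stores the expected tools and parameters alongside the user input and compares them with what the agent actually did. The class names, the `search_orders` tool, and the strict in-order comparison are assumptions made for illustration; they do not come from the episode or from any particular agent framework.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToolCall:
    tool: str
    parameters: Dict[str, object] = field(default_factory=dict)


@dataclass
class AgentTestCase:
    user_input: str
    expected_calls: List[ToolCall]


def check_tool_calls(case: AgentTestCase, actual_calls: List[ToolCall]) -> List[str]:
    """Return a list of mismatches; an empty list means the case passed."""
    problems: List[str] = []
    if len(actual_calls) != len(case.expected_calls):
        problems.append(
            f"expected {len(case.expected_calls)} calls, got {len(actual_calls)}"
        )
    # Compare calls position by position (assumes call order matters).
    for i, (exp, act) in enumerate(zip(case.expected_calls, actual_calls)):
        if exp.tool != act.tool:
            problems.append(f"call {i}: expected tool {exp.tool!r}, got {act.tool!r}")
        elif exp.parameters != act.parameters:
            problems.append(f"call {i}: parameters differ for tool {exp.tool!r}")
    return problems


# Hypothetical test case: the tool name and parameters are invented.
case = AgentTestCase(
    user_input="Show my orders from last week",
    expected_calls=[ToolCall("search_orders", {"period": "last_week"})],
)

passing = check_tool_calls(case, [ToolCall("search_orders", {"period": "last_week"})])
failing = check_tool_calls(case, [ToolCall("search_orders", {"period": "today"})])
```

The same shape could be used for a RAG step by storing the user input, the expected retrieval query, and the expected response in place of tool calls.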
AI Team Responsibilities
- AI engineers often handle all AI-related tasks, but existing data engineering, data science, and MLOps teams can adapt to generative AI.
- Generative AI adds the challenge of working with text input and output.