Jeremie Dreyfuss, Head of AI Research and Development at Intel, shares his expertise on scaling machine learning solutions and AI infrastructure. He delves into the challenges of data collection, emphasizing the need for robust systems in high-stakes environments. The discussion touches on deploying ML models and the transformative impact of large language models. Jeremie also explores strategies for effective AI development, highlighting the importance of balancing user needs with sophisticated algorithms. His insights illuminate the path forward for enterprises navigating the AI landscape.
Effective collaboration between data scientists and hardware engineers is essential for integrating AI algorithms into Intel's hardware ecosystem.
The rapid advancements in large language models have prompted Intel to explore new opportunities in AI, enhancing their competitive edge in the NLP domain.
Deep dives
Diverse Machine Learning Production Models at Intel
Machine learning production at Intel varies significantly between projects and teams, reflecting the complexity of the organization. One key area involves algorithms integrated directly into hardware, such as those found in Intel computers, which necessitates collaboration between data scientists and hardware engineers. Another involves analyzing terabytes of testing data weekly with tools like Spark to determine which tests are most effective at catching potential bugs. Additionally, some projects deploy machine learning models in real-time environments, further highlighting the spectrum of production models used throughout the company.
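The test-ranking idea described above can be sketched in a few lines. This is a toy illustration, not Intel's actual pipeline: in practice the aggregation runs in Spark over terabytes of logs, and the field names here are invented for the example.

```python
from collections import defaultdict

def rank_tests(results):
    """Rank tests by the fraction of runs in which they caught a bug.

    `results` is an iterable of (test_name, caught_bug) pairs --
    a small stand-in for the weekly test logs described above.
    """
    runs = defaultdict(int)
    catches = defaultdict(int)
    for name, caught in results:
        runs[name] += 1
        if caught:
            catches[name] += 1
    # Highest bug-catch rate first.
    return sorted(
        ((name, catches[name] / runs[name]) for name in runs),
        key=lambda pair: pair[1],
        reverse=True,
    )

logs = [("t1", True), ("t1", False), ("t2", True), ("t2", True), ("t3", False)]
print(rank_tests(logs))  # t2 caught bugs in every run, so it ranks first
```

At scale, the same group-and-divide aggregation maps directly onto a Spark `groupBy` over the raw test logs.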
Challenges in Data Collection and Testing
Effective data collection is a major challenge for Intel’s engineering teams, whose established processes must be integrated with new systems. Engineers provide test data from both physical tests and simulations, and establishing APIs for reliable data transfer is crucial to the workflow. Teams analyze testing results to identify under-tested areas of code and recommend targeted tests to improve validation. The ability to gather and effectively use data across multiple teams is critical for improving the overall testing process.
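The "find under-tested areas" step can be illustrated with a minimal sketch. The coverage map, area names, and threshold below are all hypothetical stand-ins, assuming only that each code area can be tagged with how many test runs exercised it.

```python
def under_tested(coverage, threshold=10):
    """Return code areas whose test-hit count falls below `threshold`.

    `coverage` maps a code area to the number of test runs that
    exercised it -- a toy stand-in for aggregated validation results.
    """
    return sorted(area for area, hits in coverage.items() if hits < threshold)

# Illustrative area names, not Intel's actual schema.
cov = {"alu": 120, "cache_ctrl": 3, "io_bridge": 42, "power_mgmt": 7}
print(under_tested(cov))  # candidates for targeted new tests
```

The output feeds the recommendation step: areas returned here are the ones to aim new, targeted tests at.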
Adapting to Rapid Changes in Machine Learning Needs
As Intel’s hardware evolves, so too must the machine learning models used to validate and test it. The unpredictable nature of hardware development requires retraining models frequently, often discarding previous models after use. This ongoing cycle emphasizes the need for a robust deployment system that automatically handles data ingestion, model training, and inference to accommodate fluctuating requirements. With every new hardware version, the impact on the overall testing approach highlights the importance of flexibility and adaptability in machine learning practices.
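The ingest-train-infer cycle described above can be sketched as a small pipeline object. This is a minimal sketch under stated assumptions: the stage implementations, method names, and "discard the old model on every hardware version" trigger are illustrative, not Intel's actual system.

```python
class RetrainingPipeline:
    """Minimal sketch of an automated ingest -> train -> infer loop.

    Each new hardware version triggers a fresh model and discards the
    old one, mirroring the disposable-model cycle described above.
    """

    def __init__(self, ingest, train):
        self.ingest = ingest  # callable: version -> training data
        self.train = train    # callable: data -> model (a callable)
        self.model = None

    def on_new_hardware_version(self, version):
        data = self.ingest(version)    # pull this version's test data
        self.model = self.train(data)  # replace the previous model entirely
        return self.model

    def predict(self, x):
        if self.model is None:
            raise RuntimeError("no model trained for current hardware yet")
        return self.model(x)

# Toy stages: the "model" just shifts inputs by the mean of ingested data.
pipeline = RetrainingPipeline(
    ingest=lambda version: [1.0, 2.0, 3.0],
    train=lambda data: (lambda x: x + sum(data) / len(data)),
)
pipeline.on_new_hardware_version("rev_b")
print(pipeline.predict(0.0))  # 2.0
```

The point of the structure is that nothing outside the two injected callables needs to change when requirements fluctuate: swapping the ingestion source or the training routine leaves the deployment loop intact.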
Leveraging Large Language Models (LLMs) in AI Solutions
The emergence of large language models has encouraged a significant shift in how Intel approaches artificial intelligence projects, especially within NLP. LLMs have demonstrated the capability to deliver results that traditionally took considerably more time to train and refine, making them an appealing option. The rapid improvement allows Intel to pursue uncharted areas and enhance their overall market offering. However, as the initial excitement fades, the focus may shift back toward core structured data problems, reflecting the need to continuously assess technology's business impact.
In this episode, Dean speaks with Jeremie Dreyfuss, Head of AI Research and Development at Intel, about the evolving role of AI in the enterprise. Jeremie shares insights into scaling machine learning solutions, the challenges of building AI infrastructure, and the future of AI-driven innovation in large organizations. Learn how enterprises are leveraging AI for efficiency, the latest advancements in AI research, and the strategies for staying competitive in a rapidly changing landscape.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Introduction and Overview
00:55 Challenges of Data Collection and Infrastructure
05:00 Optimizing Test Recommendations
14:42 Tips for Deploying Entire ML Pipelines
21:19 The Impact of Large Language Models (LLMs)
25:30 How to Decide About LLM Investment in the Enterprise
29:29 Evaluating Models and Using Synthetic Data
35:34 Choosing the Right Tools for ML and LLM Projects
45:21 The Beauty of Small Data in Machine Learning
48:22 Recommendations for the Audience
➡️ Jeremie Dreyfuss on LinkedIn – https://www.linkedin.com/in/jeremie-dreyfuss/
🌐 Check Out Our Website! https://dagshub.com
Social Links:
➡️ LinkedIn: https://www.linkedin.com/company/dagshub
➡️ Twitter: https://x.com/TheRealDAGsHub
➡️ Dean Pleban: https://x.com/DeanPlbn