Evolving MLOps Platforms for Generative AI and Agents with Abhijit Bose - #714
Jan 13, 2025
Abhijit Bose, Head of Enterprise AI and ML Platforms at Capital One, shared insights into the evolution of the company's generative AI platform. He discussed the transition to a platform-centric approach in finance and the challenges of integrating generative AI into existing MLOps workflows. Bose delved into optimizing Llama models for improved customer service and the role of Kubernetes in enhancing machine learning workflows. He also highlighted the significance of cloud architecture in AI experimentation and the new skill sets required for thriving in the generative AI landscape.
Capital One employs a platform-centric approach to AI, integrating both classic ML methods and generative AI to enhance operational efficiency.
The company emphasizes robust observability and governance systems to address unique challenges presented by generative AI in production environments.
Deep dives
Capital One's AI Infrastructure and Strategy
Capital One is deeply embedding AI into its operations, utilizing proprietary solutions across its tech stack to enhance customer experiences. The company's head of enterprise AI and ML platforms emphasizes their commitment to developing both classic machine learning methods and generative AI applications. Over the past four years, the firm has rebuilt its machine learning stack, enabling data scientists and engineers to efficiently deploy critical models like credit approval and fraud detection. This strong focus on AI supports day-to-day operational needs while creating opportunities for rapid advances in financial technology.
The Importance of a Platform Approach
Capital One operates with a strong platform-centric mindset, which sets it apart as a tech-focused financial institution. Centralizing AI capabilities allows the organization to leverage expensive resources like GPUs across various teams effectively, ensuring optimal utilization. Furthermore, this strategic approach fosters a robust data culture, streamlining governance and compliance procedures necessary for operating in the financial sector. High user satisfaction is evident, as their machine learning platform boasts the highest net promoter score among the company's platforms, validating the effectiveness of their user-focused improvements.
Integrating Traditional Machine Learning with Generative AI
Capital One is balancing its investments between traditional machine learning techniques and newer generative AI solutions, recognizing the relevance of both methodologies for different business challenges. While some tasks are still best served by established ML strategies, others, particularly in customer servicing and fraud detection, benefit from generative AI's capabilities. This dual investment not only addresses immediate needs but also brings renewed executive attention to traditional techniques, potentially accelerating their evolution. Ultimately, the company aims to apply both methodologies effectively to meet evolving customer demands.
Observability and Infrastructure Challenges for Generative AI
Implementing generative AI poses unique challenges, particularly concerning observability in production environments. Traditional model monitoring processes must evolve to accommodate new issues such as 'hallucinations' from language models and complex agentic workflows that interact with external tools. Capital One has recognized the need for comprehensive logging and governance systems, integrating these requirements directly into their existing platforms. This careful balance between scientific rigor and engineering practices enables the company to build resilient infrastructure, ensuring that their generative AI applications are not only effective but also reliable.
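To make the logging-and-governance idea concrete, here is a minimal, hypothetical sketch of what per-call observability for a generative AI application might look like. Everything in it — the `GenAITrace` record, the `logged_call` wrapper, and the naive groundedness check used as a stand-in for hallucination detection — is an illustrative assumption, not a description of Capital One's actual tooling.

```python
import json
import time
import uuid
from dataclasses import dataclass, field, asdict

@dataclass
class GenAITrace:
    """One record per model call: enough context to audit and monitor later."""
    trace_id: str
    model: str
    prompt: str
    response: str = ""
    tool_calls: list = field(default_factory=list)   # agentic steps, if any
    latency_ms: float = 0.0
    flags: list = field(default_factory=list)        # e.g. potential hallucination markers

def logged_call(model_fn, model_name, prompt, known_facts=()):
    """Wrap any model call with timing, structured logging, and a toy groundedness check."""
    trace = GenAITrace(trace_id=str(uuid.uuid4()), model=model_name, prompt=prompt)
    start = time.perf_counter()
    trace.response = model_fn(prompt)
    trace.latency_ms = (time.perf_counter() - start) * 1000
    # Toy check: flag responses that reference none of the supplied known facts.
    if known_facts and not any(f.lower() in trace.response.lower() for f in known_facts):
        trace.flags.append("ungrounded_response")
    print(json.dumps(asdict(trace)))  # stand-in for a real log/telemetry sink
    return trace

# Usage with a stubbed model function:
trace = logged_call(
    lambda p: "Your balance is $120.",
    "stub-llm",
    "What is my balance?",
    known_facts=["$120"],
)
```

The point of the sketch is the shape of the record, not the check itself: capturing prompt, response, tool calls, latency, and quality flags per call is what lets traditional monitoring evolve to cover hallucinations and agentic workflows.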
Today, we're joined by Abhijit Bose, head of enterprise AI and ML platforms at Capital One, to discuss the evolution of the company's approach and insights on generative AI and platform best practices. In this episode, we dig into the company's platform-centric approach to AI, and how they've been evolving their existing MLOps and data platforms to support the new challenges and opportunities presented by generative AI workloads and AI agents. We explore their use of cloud-based infrastructure—in this case on AWS—to provide a foundation upon which they then layer open-source and proprietary services and tools. We cover their use of Llama 3 and open-weight models, their approach to fine-tuning, their observability tooling for generative AI applications, their use of inference optimization techniques like quantization, and more. Finally, Abhijit shares his outlook on agentic workflows in the enterprise, the application of OpenAI o1-style reasoning in models, and the new roles and skill sets required in the evolving generative AI landscape.