Morningstar Intelligence Engine with Aravind Kesiraju - Weaviate Podcast #111!
Jan 8, 2025
Join Aravind Kesiraju, Principal Software Engineer at Morningstar, as he shares insights on the development of the Morningstar Intelligence Engine. The conversation covers no-code/low-code AI applications and how to build advanced financial chatbots, including how to integrate diverse data sources and optimize language models for financial tasks. It also traces the evolution of Retrieval-Augmented Generation (RAG) data pipelines and the challenges of managing sensitive financial information while improving chatbot performance with intelligent question classification.
The Morningstar Intelligence Engine offers a no-code, low-code platform to build advanced AI applications, enhancing user interaction and tool integration.
Innovations in RAG pipelines at Morningstar focus on chunking strategies and automated ingestion, improving the efficiency and accuracy of data retrieval.
Deep dives
Overview of the Morningstar Intelligence Engine
The Morningstar Intelligence Engine is designed to facilitate the creation of next-generation AI applications through a no-code, low-code platform. It provides an API-driven solution that goes beyond simple chatbot interactions, allowing users to build AI-based agents and incorporate custom tools. The engine supports capabilities such as text-to-SQL operations and the integration of various data sources, enhancing its functionality for both internal teams and external customers. This setup helps streamline workflows and enables users to efficiently process and retrieve relevant information from vast datasets.
Data Ingestion and Vector Databases
The platform utilizes a robust pipeline to ingest a variety of data types, including research content and financial documents sourced from Morningstar. This ingestion process involves generating embeddings that are stored in a vector database, enabling semantic searches and accurate retrieval of relevant information based on user queries. The initial setup used an early vector database, which allowed the team to recognize the technology's potential, ultimately leading to partnerships with open-source projects. Ongoing enhancements in ingestion techniques have been informed by iterative re-evaluation of data processing strategies to optimize efficiency.
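As a rough illustration of the embed, store, and search flow described above, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: `embed` is a toy word-count stand-in for a real embedding model (so it matches on shared words, not on meaning), and `InMemoryVectorStore` is a hypothetical class, not the vector database or pipeline Morningstar uses.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical store: keeps (id, text, vector) tuples in a list."""

    def __init__(self) -> None:
        self._items: list[tuple[str, str, Counter]] = []

    def add(self, doc_id: str, text: str) -> None:
        # Ingestion step: compute the vector once and keep it with the document.
        self._items.append((doc_id, text, embed(text)))

    def search(self, query: str, k: int = 3) -> list[str]:
        # Query step: embed the query and rank stored documents by similarity.
        q = embed(query)
        scored = [(cosine(q, vec), doc_id) for doc_id, _, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc_id for _, doc_id in scored[:k]]


store = InMemoryVectorStore()
store.add("fund", "Quarterly fund performance report for equity funds")
store.add("bond", "Bond market yield outlook")
top = store.search("equity fund performance", k=1)
```

A production system would swap the word-count vectors for dense embeddings from a model and the list scan for an approximate nearest-neighbor index, but the ingest and query steps keep the same shape.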
The evolution of the Retrieval-Augmented Generation (RAG) pipelines at Morningstar has included innovations in chunking strategies, with a focus on keeping semantically related content together for better answer generation. Initial techniques involved simple document reading and fixed-size chunking, which have since advanced to overlapping chunks that give the language model more complete context. Developing a systematic approach to pipeline automation has helped streamline the frequent ingestion of new content, which in turn improves the system's retrieval accuracy. Continuous experimentation with chunking and re-ranking methods aims to refine the process and boost overall performance.
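The step from fixed-size chunks to overlapping chunks can be sketched in a few lines. This is a generic character-window version under assumed parameters (`chunk_size`, `overlap`); the episode summary does not state the sizes or units Morningstar uses, and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


overlapping = chunk_text("abcdefghij", chunk_size=4, overlap=2)
fixed = chunk_text("abcdefghij", chunk_size=4, overlap=0)
```

With `overlap=0` this reduces to plain fixed-size chunking. A nonzero overlap repeats the tail of each chunk at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing and embedding more text.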
Evaluating and Ensuring Compliance in AI-driven Systems
Morningstar has implemented a comprehensive evaluation framework to monitor the performance of its Generative AI applications, which assesses accuracy, context relevance, and overall response quality. This framework also integrates human evaluations alongside automated assessments to enhance reliability, especially in sensitive areas such as financial advice where errors can have significant consequences. Compliance measures are built into the system to guard against data leaks and system vulnerabilities, thereby securing sensitive information and ensuring that only entitled users have access to specific data queries. As part of an ongoing commitment to safety, the system encourages continuous feedback and improvements based on user interactions.
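One common way to make sure only entitled users see specific data is to filter retrieved chunks against the user's entitlements before anything reaches the language model. The sketch below assumes that approach; the `Chunk` type, the `required_entitlement` field, and the entitlement names are all hypothetical, and the summary does not describe how Morningstar actually enforces access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    required_entitlement: str  # hypothetical label attached at ingestion time


def filter_entitled(chunks: list[Chunk], user_entitlements: set[str]) -> list[Chunk]:
    """Keep only the chunks whose entitlement the user holds."""
    return [c for c in chunks if c.required_entitlement in user_entitlements]


retrieved = [
    Chunk("Public market commentary", "public"),
    Chunk("Premium analyst report excerpt", "premium_research"),
]
visible = filter_entitled(retrieved, {"public"})
```

Filtering before prompt assembly means restricted text never enters the model's context, so it cannot surface in a generated answer.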
Hey everyone! I am SUPER EXCITED to publish the 111th Weaviate Podcast with Aravind Kesiraju from Morningstar! Aravind is a Principal Software Engineer who has led the development behind the Morningstar Intelligence Engine! There are so many interesting aspects to this, and if you are building Agentic systems that would benefit from a high-quality financial retrieval API, you should check this out right now! The podcast dives into all sorts of ingredients that went into building this system: from custom RAG data pipelines with content management system integrations and embedding task queues, to exploring new chunking strategies, tool marketplaces, ReAct Agents, Text-to-SQL, and all sorts of other things!