Thibaut Labarre, an engineering lead at AngelList with a background in natural language processing, discusses how AngelList uses large language models, including news article classification for investor dashboards. The conversation also covers the challenges of prompt engineering, the importance of involving domain experts, and the ethical concerns of using AI models to read legal texts.
Podcast summary created with Snipd AI
Quick takeaways
AngelList Relay is a tool that utilizes OpenAI models to extract key information from investment documents, providing investors with automated organization and tracking of important data.
AngelList ran into scalability and cost challenges when using OpenAI models and addressed them by partnering with Azure, using both the Azure and OpenAI APIs for routing and rate limit management.
While AI models significantly improve accuracy and efficiency, human intervention and expertise are crucial in the AI implementation process, particularly in verifying and interpreting information from legal documents and investments.
Deep dives
Introducing AngelList Relay: Using OpenAI Models to Extract Information from Investment Documents
AngelList has recently launched AngelList Relay, a new tool that leverages OpenAI models to extract key terms and information from investment documents. The tool allows investors to stay informed about their investments by automatically organizing and tracking important data such as investment amounts, valuations, and terms. By forwarding or uploading investment documents to Relay, users can easily view and monitor the progress of their investments, including updates from the invested companies. The implementation of OpenAI models has significantly improved the accuracy and efficiency of extracting information from legal documents, streamlining the process and saving time for investors. AngelList is also exploring the potential of training its own models to further enhance the capabilities of Relay.
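As a rough illustration of what "extracting key terms" can look like in code, the sketch below parses a model's JSON reply into a typed record and rejects replies with missing or mistyped fields. The `InvestmentTerms` class, its field names, and the JSON shape are assumptions made for this example; the episode does not describe Relay's actual schema.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InvestmentTerms:
    """Hypothetical record for values pulled from one investment document."""
    investment_amount_usd: float
    valuation_usd: Optional[float] = None
    terms: List[str] = field(default_factory=list)


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_model_reply(reply: str) -> InvestmentTerms:
    """Parse a JSON reply from a model; raise ValueError on a bad shape."""
    data = json.loads(reply)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    amount = data.get("investment_amount_usd")
    if not _is_number(amount):
        raise ValueError("investment_amount_usd must be a number")

    valuation = data.get("valuation_usd")
    if valuation is not None and not _is_number(valuation):
        raise ValueError("valuation_usd must be a number or null")

    terms = data.get("terms", [])
    if not isinstance(terms, list) or not all(isinstance(t, str) for t in terms):
        raise ValueError("terms must be a list of strings")

    return InvestmentTerms(
        investment_amount_usd=float(amount),
        valuation_usd=None if valuation is None else float(valuation),
        terms=list(terms),
    )
```

Validating the reply before storing it keeps a malformed model response from silently reaching an investor's dashboard.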
Challenges and Scaling Issues with OpenAI APIs
One challenge AngelList faced in using OpenAI models is the scalability and cost of API calls. The team had difficulty obtaining access to high-scale models and raising rate limits to handle a larger volume of documents. They addressed this by partnering with Azure and using both the Azure and OpenAI APIs for routing and rate limit management. Looking ahead, the team is considering further cost optimization, as well as building and training proprietary models to reduce dependence on third-party vendors.
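The summary does not say how requests are routed between the two providers. One simple possibility is ordered fallback: try one backend and move to the next only when the first is throttled. The sketch below shows that idea with plain callables. The `RateLimited` exception, `complete_with_fallback`, and the stand-in backends are all hypothetical names for illustration; no real SDK calls are shown.

```python
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical signal: a backend raises this when its provider throttles the call."""


def complete_with_fallback(prompt: str, backends: Sequence[Callable[[str], str]]) -> str:
    """Try each backend in order and return the first successful reply.

    Only RateLimited triggers a fallback; any other exception propagates.
    """
    for backend in backends:
        try:
            return backend(prompt)
        except RateLimited:
            continue  # this provider is throttled, try the next one
    raise RuntimeError(f"all {len(backends)} backends are rate limited")


# Stand-in backends for illustration; real ones would wrap each provider's client.
def throttled_backend(prompt: str) -> str:
    raise RateLimited("quota exhausted")


def healthy_backend(prompt: str) -> str:
    return f"reply to: {prompt}"


reply = complete_with_fallback("Summarize this SAFE.", [throttled_backend, healthy_backend])
```

A production router would likely also track per-provider quotas and retry with backoff; this sketch covers only the fallback step.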
Balancing Automation and Human-in-the-Loop in AI Implementation
AngelList emphasizes the importance of human intervention and expertise in the AI implementation process, particularly in the context of legal documents and investments. While the use of OpenAI models significantly improves accuracy and efficiency, human review and verification are essential to ensure the correct extraction and interpretation of information. The legal documents serve as the ultimate source of truth, and having humans in the loop enables the verification and validation of the model's output. The AI models act as powerful tools to assist human users in processing large amounts of text and identifying relevant information, contributing to increased productivity and reduced errors.
Ethical Considerations and Risk Mitigation
When it comes to using AI models in the finance industry, AngelList treats the legal documents as the ultimate source of truth. The models extract key terms and information, but any discrepancy is resolved against the documents themselves. Cross-referencing the model's output with the original documents mitigates the risk of incorrect information or hallucinations. Ongoing monitoring and improvement of the models remain important, but the current implementation already saves significant time and improves accuracy compared with manual interpretation of legal documents.
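One basic form of cross-referencing is to check that every extracted value actually appears in the source text and to flag anything that does not for human review. The sketch below is a minimal version of that check, assuming extracted values are short strings quoted from the document; `find_unsupported_fields` and the sample field names are hypothetical, and the episode does not describe AngelList's actual verification logic.

```python
import re
from typing import Dict, List


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_unsupported_fields(document_text: str, extracted: Dict[str, str]) -> List[str]:
    """Return the names of extracted fields whose value is not found in the document.

    Empty values are treated as unsupported, since they cannot be verified.
    """
    haystack = _normalize(document_text)
    unsupported = []
    for name, value in extracted.items():
        needle = _normalize(value)
        if not needle or needle not in haystack:
            unsupported.append(name)
    return unsupported


document = (
    "The Investor agrees to pay a Purchase Amount of\n"
    "$25,000 with a Valuation Cap of $10,000,000."
)
model_output = {"purchase_amount": "$25,000", "valuation_cap": "$12,000,000"}
flagged = find_unsupported_fields(document, model_output)
```

Substring matching is deliberately crude: it catches values the model invented outright, but it cannot tell whether a value that does appear was attached to the right field, which is where human review still matters.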
Future Plans: Expanding Capabilities and Ownership of Models
AngelList's future plans involve expanding the capabilities of AngelList Relay, such as training bespoke models for individual customers and exploring the potential of utilizing their own proprietary models. They aim to provide even more functionality and advanced capabilities, allowing users to interact with their investment data in novel ways. While emphasizing the value of off-the-shelf models in their current implementation, AngelList recognizes the potential benefits of owning and fine-tuning their models for specific use cases. This would provide more flexibility, scalability, and cost optimization, further enhancing the capabilities and value of their AI-powered tools.
MLOps Coffee Sessions #171 with Thibaut Labarre, Using Large Language Models at AngelList, co-hosted by Ryan Russon.
We are now accepting talk proposals for our next LLM in Production virtual conference on October 3rd. Apply to speak here: https://go.mlops.community/NSAX1O
// Abstract
Thibaut addressed the constraints of a previous system, achieving scalability and cost efficiency. Drawing on expertise in AngelList investing and natural language processing, he refined news article classification for investor dashboards. Central to this work is AngelList Relay, a platform that automates parsing and offers key insights to investors. Thibaut reflects candidly on challenges such as collaborating with Azure OpenAI and working around rate limits. The conversation highlights the strategic importance of prompt engineering and of empowering domain experts to drive ongoing improvement.
// Bio
Thibaut Labarre is an engineering lead with a background in Natural Language Processing (NLP). Currently, Thibaut focuses on unlocking the potential of Large Language Model (LLM) technology at AngelList, enabling everyone within the organization to become prompt engineers on a quest to streamline and automate the infrastructure for Venture Capital.
Prior to that, Thibaut began his journey at Amazon as an intern where he built Heartbeat, a state-of-the-art NLP tool that consolidates millions of data points from various feedback sources, such as product reviews, customer contacts, and social media, to provide valuable insights to global product teams. Over the span of seven years, he expanded his internship project into an organization of 20 engineers.
He received an M.S. in Computational Linguistics from the University of Washington.
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: https://www.angellist.com/venture/relay
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Ryan on LinkedIn: https://www.linkedin.com/in/ryanrusson/
Connect with Thibaut on LinkedIn: https://www.linkedin.com/in/thibautlabarre/
Timestamps:
[00:00] Thibaut's preferred beverage
[00:50] Takeaways
[04:05] Please like, share, and subscribe to our MLOps channels!
[04:44] A huge fan of Isaac Asimov
[07:20] Thibaut Labarre background
[09:13] AngelList as an organization
[10:50] AI sense of building
[12:29] System trade-offs
[15:20] OpenAI's limitation
[16:31] Human in the loop
[17:22] Classifying relevance
[18:09] Fight for value
[19:37] Added value
[22:10] Exploring efficient ways to automate tasks
[24:20] Investing in off-the-shelf models
[27:56] AngelList Relay
[30:49] News article and investment document classification technology
[32:39] Back-end tech
[34:09] Prompt layer
[35:28] Prompt layer as a living
[37:04] Foreseeing no human intervention
[39:00] Blocking hallucinations
[40:33] Challenges
[43:49] Investments in other models besides OpenAI
[45:20] Integration with other models
[46:28] Ethical concerns when
[48:37] OpenAI breaking Prompts
[50:46] Wrap up