Open Source Startup Podcast

E134: Making Complex Data RAG-Ready with Unstructured

May 20, 2024
Brian Raymond, Founder & CEO of Unstructured, discusses the importance of data preparation in NLP, creating a single API endpoint for handling diverse data formats, transitioning from open source to commercial success, engaging with government design partners, and the value of world-class design & marketing for open source companies.
37:06

Podcast summary created with Snipd AI

Quick takeaways

  • Unstructured was founded to address the lack of tools for data preprocessing in NLP projects, focusing on making complex data RAG-ready for vector databases.
  • Starting as an open source project, Unstructured gained widespread usage and valuable feedback, guiding platform development and market strategy.

Deep dives

Origin of Unstructured Idea and Need for Data Tooling

The idea for Unstructured grew out of a gap in tooling for natural language processing (NLP) projects: resources for model development were abundant, but there was little on the data side. Companies struggled with hard-coded preprocessing pipelines, which hurt data readiness for tasks like labeling and inference. Primer AI's experience highlighted this challenge, leading Brian to envision a solution focused solely on preparing data for easy integration into NLP applications and knowledge graphs.
