
Brian Raymond

CEO / Founder of Unstructured providing technology for Weaviate

Top 5 podcasts with Brian Raymond

Ranked by the Snipd community
12 snips
May 23, 2023 • 43min

Unstructured with Brian Raymond - Weaviate Podcast #48!

Hey everyone, thank you so much for watching the 48th episode of the Weaviate Podcast!! This is a SUPER exciting one, welcoming Brian Raymond, the CEO / Founder of Unstructured! Unstructured is a perfect complementary technology for Weaviate, helping people get their unstructured data into Weaviate! The podcast dives into the nuances of this task, but it generally revolves around Unstructured's abstraction of Partitioning, Cleaning, and Staging! Unstructured is making groundbreaking innovations in using Visual Document Layout models for Partitioning, for example identifying which part of a PDF is the header, body, image caption, and so on. Cleaning then describes removing pesky details like extra whitespace or odd characters. Staging then describes transformations such as formatting a text chunk with its metadata into the JSON for a Weaviate object upload!

I really hope you find this podcast interesting! We are publishing a blog post as well, showing an example of how to use Unstructured to get PDF data into Weaviate. Please check it out and let us know if it works for your data and how we can improve it! The blog post can be found on weaviate.io, and we will be managing discussions around it in the Weaviate Slack as well as in Unstructured's! Thank you so much for listening!

Check out Unstructured here! https://www.unstructured.io/

Chapters
0:00 Welcome Brian!!
0:27 What is Unstructured?
5:42 Why now? New Advancements in Unstructured
8:02 Thoughts on Data Connectors Hub
10:55 PDFs to Weaviate with Unstructured
13:53 State-of-the-Art in OCR and Document Parsing
16:10 How to get the data from Weaviate.io?
18:06 Foundation Models from Unstructured
20:45 Evaporate-Code+
23:15 CSV, Parquet, JSON transformations in Staging
25:08 Cleaning Bricks
28:02 Visual Document Examples
30:45 Text Chunking with Metadata
33:25 Knowledge Graphs with Goldman Sachs example
39:10 LLM Hallucinations
42:10 Announcements from Brian!
May 20, 2024 • 37min

E134: Making Complex Data RAG-Ready with Unstructured

Brian Raymond, Founder & CEO of Unstructured, discusses the importance of data preparation in NLP, creating a single API endpoint for handling diverse data formats, transitioning from open source to commercial success, engaging with government design partners, and the value of world-class design & marketing for open source companies.
Feb 24, 2024 • 31min

Unlocking $25M: Unstructured's CEO Brian Raymond on Data Prep for LLMs

CEO Brian Raymond discusses data preparation for Large Language Models, challenges faced in preprocessing data for AI applications, developing a single API for data processing, handling different document types, transitioning from open source to a commercial API, monetization strategy, the influence of working with the government, and the importance of analytics.
Jan 28, 2024 • 48min

Episode 13: Open-source panel with Anton Troynikov, Brian Raymond, and Harrison Chase

Open-source leaders in AI, Anton Troynikov, Brian Raymond, and Harrison Chase, discuss topics such as chatbot development with LangSmith, usage of Chroma in AI applications, building open-source and commercial products, limitations of vector search in AI retrieval, favorite AI companies and TV portrayals, and the importance of agility in developing language models.
Aug 3, 2023 • 36min

ETL for LLMs

Founder of Unstructured, Brian Raymond, discusses challenges in data preprocessing for NLP solutions, efficient file processing architecture for data extraction, innovative data engineering solutions, comparison of connector capabilities in Airbyte and Fivetran, and evolution of ETL pipelines for Large Language Models.