
LLM tech stack, Open source startups, Building products for the government | Brian Raymond, cofounder and CEO of Unstructured

Infinite Curiosity Pod with Prateek Joshi


How to Build a Data Pre-Processing Pipeline

Unstructured is a company that pre-processes data for machine learning pipelines. The process involves chunking text, tokenization, and configuring the JSON schema. It also includes last-mile work to ensure files are immediately consumable downstream.
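
The steps named above (chunking, tokenization, and shaping output to a JSON schema) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not Unstructured's actual implementation or API; the function names (`chunk_text`, `tokenize`, `to_records`, `to_jsonl`), the record fields, and the `max_chars` parameter are all hypothetical choices made for the example.

```python
import json
import re


def chunk_text(text, max_chars=200):
    """Group paragraphs into chunks of roughly max_chars characters.

    Assumption: paragraphs are separated by blank lines. A single paragraph
    longer than max_chars is kept whole rather than split mid-sentence.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current} {para}".strip()
    if current:
        chunks.append(current)
    return chunks


def tokenize(text):
    """Naive tokenizer: lowercased word runs and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def to_records(text, source, max_chars=200):
    """Turn raw text into a list of dicts that all share one fixed set of keys."""
    records = []
    for index, chunk in enumerate(chunk_text(text, max_chars)):
        tokens = tokenize(chunk)
        records.append(
            {
                "source": source,          # hypothetical field: where the text came from
                "chunk_index": index,      # position of the chunk in the document
                "text": chunk,
                "tokens": tokens,
                "token_count": len(tokens),
            }
        )
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines so a downstream step can stream them."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

Calling `to_jsonl(to_records(raw_text, source="report.txt"))` would yield one JSON object per chunk. Keeping every record on the same set of keys is one simple way to make the output immediately consumable by a downstream step; a production pipeline would likely use a model-specific tokenizer and a formally validated schema instead.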
