
Weaviate Podcast
Join Connor Shorten as he interviews machine learning experts and explores Weaviate use cases from users and customers.
Latest episodes

Nov 29, 2023 • 56min
Rudy Lai on Tactic Generate - Weaviate Podcast #78!
Hey everyone! Thank you so much for watching the 78th episode of the Weaviate podcast featuring Rudy Lai, the founder and CEO of Tactic Generate! Tactic Generate has developed a user experience around applying LLMs in parallel to multiple documents, or even folders / collections / databases. Rudy discussed the user research that led the company in this direction and how he sees the opportunities in building AI products with new LLM and Vector Database technologies! I hope you enjoy the podcast, as always more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
Learn more about Tactic Generate here: https://tactic.fyi/generative-insights/
Weaviate Podcast #69 with Charles Pierse: https://www.youtube.com/watch?v=L_nyz1xs9AU
Chapters
0:00 Welcome Rudy!
0:48 Story of Tactic Generate
7:45 Finding Common Workflows
19:30 Multiple Document RAG UIs
26:14 Parallel LLM Execution
32:40 Aggregating Parallel LLM Analysis
38:25 Pretty Reports
44:28 Research Agents

Nov 20, 2023 • 50min
RAGAS with Jithin James, Shahul Es, and Erika Cardenas - Weaviate Podcast #77!
Hey everyone, thank you so much for watching the 77th Weaviate Podcast on RAGAS, featuring Jithin James, Shahul ES, and Erika Cardenas! RAGAS is one of the hottest rising startups in Retrieval-Augmented Generation! RAGAS began its journey with the RAGAS score, a matrix of evaluations for generation and retrieval. Generation is evaluated on Faithfulness (is the response grounded in the context) as well as Relevancy (is the response useful). Retrieval is then evaluated on Precision (How many of the search results are relevant to the question?) and Recall (How many of the relevant search results are captured in the retrieved results?). Now, the super novel thing about this is that an LLM is used to determine these metrics. So we circumvent painstaking manual labeling effort with the RAGAS score! This podcast dives into the development of the RAGAS score as well as how RAG application builders should think about the knobs to tune for optimizing their RAGAS score: embedding models, chunking strategies, hybrid search tuning, rerankers, ... ?!? We also discussed tons of exciting directions for the future such as fine-tuning smaller LLMs for these metrics, agents that use tuning APIs, and long context RAG!
Check out the docs here for getting started with RAGAS! https://docs.ragas.io/en/latest/getstarted/index.html#get-started
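To make the retrieval half of the score concrete, here is a minimal sketch of Precision and Recall as described above, with the relevance judgment delegated to a judge function. This is not the RAGAS library's API or its exact formulas: the names `retrieval_precision`, `retrieval_recall`, and `keyword_judge` are hypothetical, and the keyword check only stands in for the LLM call that would make the judgment in practice.

```python
from typing import Callable, Sequence

# Hypothetical judge signature: (question, passage) -> True if the passage is relevant.
# In an LLM-based evaluation this would wrap a model call; here it is a stand-in.
Judge = Callable[[str, str], bool]


def retrieval_precision(question: str, retrieved: Sequence[str], judge: Judge) -> float:
    """Share of the retrieved passages that the judge marks as relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for passage in retrieved if judge(question, passage))
    return hits / len(retrieved)


def retrieval_recall(
    question: str, retrieved: Sequence[str], corpus: Sequence[str], judge: Judge
) -> float:
    """Share of the relevant passages in the corpus that were actually retrieved."""
    relevant = [passage for passage in corpus if judge(question, passage)]
    if not relevant:
        return 0.0
    captured = sum(1 for passage in relevant if passage in retrieved)
    return captured / len(relevant)


def keyword_judge(question: str, passage: str) -> bool:
    """Assumption: a trivial keyword match replaces the LLM relevance judgment."""
    return "vector" in passage.lower()


corpus = [
    "Weaviate is a vector database.",
    "Vector search uses embeddings.",
    "Bananas are yellow.",
    "HNSW is a vector index.",
]
question = "How does vector search work?"
retrieved = [corpus[0], corpus[2]]  # one relevant hit, one irrelevant hit

precision = retrieval_precision(question, retrieved, keyword_judge)
recall = retrieval_recall(question, retrieved, corpus, keyword_judge)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Swapping `keyword_judge` for a function that prompts an LLM is the step that removes the manual labeling effort; the arithmetic around it stays the same.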
Chapters
0:00 Welcome Jithin and Shahul!
0:44 Welcome Erika!
0:56 RAGAS, Founding Story
2:38 Weaviate + RAGAS integration plans
4:44 RAG Knobs to Tune
25:50 RAG Experiment Tracking
34:52 LangSmith and RAGAS
38:55 LLM Evaluation
40:25 RAGAS Agents
44:00 Long Context RAG Evaluation

Nov 14, 2023 • 59min
Patrick Lewis on Retrieval-Augmented Generation - Weaviate Podcast #76!
Hey everyone, I am SUPER excited to present our 76th Weaviate Podcast featuring Patrick Lewis, an NLP Research Scientist at Cohere! Patrick has had an absolutely massive impact on Natural Language Processing with AI and Deep Learning! Especially notable for the current climate in AI and Weaviate is that Patrick is the lead author of the original "Retrieval-Augmented Generation" paper!! Patrick has contributed to many other profoundly impactful papers in the space as well such as DPR, Atlas, Task-Aware Retrieval with Instruction, and many many others! This was such an illuminating conversation, here is a quick overview of the chapters in the podcast!
1. Origin of RAG - Patrick explains the build-up that led to the RAG paper: AskJeeves, IBM Watson, and the conceptual shift to retrieve-read in mainstream connectionist approaches to AI.
2. Atlas - Atlas shows that a much smaller LLM when paired with Retrieval-Augmentation can still achieve competitive few-shot and zero-shot task performance. This is super impactful because this few-shot and zero-shot capability has been a massive evangelist for AI broadly, and the fact that smaller Retrieval-Augmented models can do this is massive for economically unlocking these applications.
Teasing apart some architectural details of RAG:
3. Fusion In-Decoder - Interesting encoder-decoder transformer design in which each document + the query is encoded separately, then the encoded representations are concatenated and passed to the decoder.
4. End-to-End RAG - How to think about jointly training an embedding model and an LLM augmented with retrieval?
5. Query Routers - How to route queries to, say, SQL or Vector DBs? (More nuance on this later with Multi-Index Retrieval)
6. ConcurrentQA - Super interesting work on the privacy of multi-index routers. For example, if you ask "Who is the father of our new CEO" - the public query about the father may reveal private information about the new CEO.
7. Multi-Index Retrieval
8. New APIs for LLMs
9. Self-Instructed Gorillas
10. Task-Aware Retrieval with Instructions
11. Editing Text, EditEval and PEER
12. What future direction excites you the most?
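The Fusion In-Decoder idea from item 3 can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the paper's model: `toy_encode` is a hypothetical stand-in that maps each token to a random vector (a real encoder is a transformer stack), and `cross_attention` is a single unprojected attention step standing in for the decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL = 8  # toy hidden size, chosen arbitrarily for the sketch


def toy_encode(tokens, table):
    """Hypothetical stand-in encoder: one vector per token, shape (len(tokens), D_MODEL)."""
    return np.stack([table.setdefault(tok, rng.normal(size=D_MODEL)) for tok in tokens])


def fuse_passages(query, passages, table):
    """Encode each (query + passage) pair separately, then concatenate along the sequence axis."""
    encoded = [toy_encode(f"{query} {passage}".split(), table) for passage in passages]
    return np.concatenate(encoded, axis=0)


def cross_attention(decoder_state, memory):
    """One decoder-side attention step over the fused encoder outputs."""
    scores = memory @ decoder_state / np.sqrt(D_MODEL)
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory


table = {}
query = "who wrote rag"
passages = ["lewis wrote rag", "atlas is small"]

memory = fuse_passages(query, passages, table)  # (3 + 3) tokens per pair, 2 pairs
context = cross_attention(rng.normal(size=D_MODEL), memory)
print(memory.shape, context.shape)
```

The point of the design is visible in the shapes: encoding cost grows per passage because each pair is encoded on its own, while the decoder attends over all passages at once through the concatenated memory.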
Links:
Learn more about Patrick Lewis: https://www.patricklewis.io/
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: https://arxiv.org/abs/2005.11401
Atlas: https://arxiv.org/pdf/2208.03299.pdf
Fusion In-Decoder: https://arxiv.org/pdf/2007.01282.pdf
Chapters
0:00 Welcome Patrick Lewis!
0:36 Origin of RAG
5:20 Atlas
10:43 Fusion In-Decoder
17:50 End-to-End RAG
27:05 Query Routers
32:05 ConcurrentQA
37:30 Multi-Index Retrieval
40:05 New APIs for LLMs
41:50 Self-Instructed Gorillas
44:35 Task-Aware Retrieval with Instructions
52:00 Editing Text, EditEval and PEER
55:35 What future direction excites you the most?

Nov 8, 2023 • 50min
Tanmay Chopra on Emissary - Weaviate Podcast #75!
Hey everyone! Thank you so much for watching the 75th Weaviate Podcast featuring Tanmay Chopra! The podcast details Tanmay's incredible career in Machine Learning from TikTok to Neeva and now building his own startup, Emissary! Tanmay shared some amazing insights into Search AI such as how to process Temporal Queries, how to think about diversity in Retrieval, and Query Recommendation products! We then dove into the opportunity Tanmay sees in fine-tuning LLMs and knowledge distillation that motivated Tanmay to build Emissary! I thought Tanmay's analogy of GPT-4 to 3D printers was really interesting, tons of great nuggets in here! I really hope you enjoy the podcast, as always more than happy to answer any questions or discuss any ideas with you related to the content in the podcast!
Chapters
0:00 Welcome Tanmay!
0:23 Early Career Story
2:02 TikTok
4:10 Neeva
8:45 Temporal Queries
11:40 Retrieval Diversity
17:22 Query Recommendation
23:20 Emissary, starting a company!
30:20 A Simple API for Custom Models
35:42 GPT-4 = 3D Printer?

Nov 7, 2023 • 57min
Simba Khadder on FeatureForm - Weaviate Podcast #74!
Hey everyone! Thank you so much for watching the 74th Weaviate Podcast featuring Simba Khadder, the CEO and Co-Founder of FeatureForm! To begin, "features" broadly describe the inputs that machine learning models use to produce outputs, or predictions. Feature stores orchestrate the construction of features, from transformations for tabular machine learning models such as XGBoost, to chunking for vector embedding inference, and now features for LLM inference in RAG. Right out of the gate, Simba really opened my eyes to the role that feature engineering plays in RAG. Touching on this again at the very end under the "Exciting future for RAG with Features" chapter, Simba describes how we can use more advanced features to provide better context to LLMs. In addition to these insights on RAG, there are so many nuggets in the podcast, Simba is a world class professional when it comes to building distributed systems, production scale recommendation systems, and more! I learned so much from chatting with Simba, I hope you enjoy listening to the podcast! As always we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
FeatureForm: https://www.featureform.com/
Highly Recommend!! Simba Khadder at the CMU DB Seminar series: https://www.youtube.com/watch?v=ZsWa6XiBc-U
FeatureForm and Weaviate demo! https://docs.featureform.com/providers/weaviate
Chapters
0:00 Simba Khadder
0:35 RAG and Feature Stores
4:30 Experience building Recommendation Systems
9:47 The End-to-End Feature Lifecycle
15:08 Virtual Feature Store Orchestration
26:45 RAG Evaluation
31:27 Feature Engineering
34:15 LLM Tuning and Features
39:55 Streaming Features
51:15 Data Drift Detection
54:20 Exciting future for RAG with Features

Nov 6, 2023 • 52min
Charles Packer on MemGPT - Weaviate Podcast #73!
Charles Packer, lead author of MemGPT at UC Berkeley, discusses the concept of explicit memory management in GPT models, the use of prompts to handle memory limitations, interrupts in retrieval augmented generation (RAG), achieving ideal running speed in high parameter models, fine-tuning MemGPT for long conversations, search actions pagination, role-playing language models, and the future integration of memory in chatbot platforms.

Nov 1, 2023 • 50min
Madelon Hulsebos on Tabular Machine Learning - Weaviate Podcast #72!
Hey everyone! Thank you so much for watching the 72nd episode of the Weaviate Podcast with Madelon Hulsebos!! Madelon is one of the world's experts on Machine Learning with Tables and Tabular-Structured Data, this was such an eye-opening conversation! We discussed all sorts of topics from the relationship of tabular data and embeddings, to searching through tables, semantic joins, more complex Text-to-SQL, using machine learning for query execution, using tabular data in search and recommendation reranking, and many more! This was easily one of the most knowledge packed episodes of the Weaviate podcast so far, please don't hesitate to leave any questions or ideas you have related to the content discussed!
You can learn more about Madelon's incredible research career and publications / talks here: https://www.madelonhulsebos.com/ - papers such as GitTables are listed there!
Another nice nugget from the podcast - Madelon introduced me to the BIRD-SQL benchmark, which really expanded my understanding of Text-to-SQL (https://arxiv.org/pdf/2305.03111.pdf).
Chapters
0:00 Welcome Madelon!
0:58 Tabular Data and Embeddings
3:10 Tabular Representation Learning
5:48 Semantic Type Detection
9:50 Pandas as an LLM Tool
11:52 Table-Based Question Answering and Text-to-SQL
19:35 Joins with Machine Learning
21:38 Query Execution with Machine Learning
22:45 Graph Neural Networks
24:07 XGBoost
28:28 Merging Tables
32:10 Fact Representation
35:50 GPT-4V and Tables
39:00 Metadata in Embeddings
42:45 Table Retrieval in Weaviate
46:25 Exciting future directions!!

Oct 26, 2023 • 56min
Vibs Abhishek on Alltius AI - Weaviate Podcast #71!
Hey everyone! Thank you so much for watching the 71st Weaviate Podcast with Vibs Abhishek! Vibs is the CEO and Founder of Alltius AI, as well as a professor at UC Irvine business school! In order to tame the somewhat chaotic emerging landscape of RAG and LLM applications, Alltius has settled on 3 core pillars of Knowledge, Skills, and Deployment Channels! Vibs further explained how he sees the distinction between Assistants and Agents and many more topics important to Enterprise deployment of RAG applications such as reducing hallucinations and employing classifiers to route skills and knowledge sources! I learned so much from this conversation, I hope you enjoy the podcast!
Alltius KNO Plus Demo Video: https://www.loom.com/share/fcfe516b75ea4f069b1a8d6a3510fa4c?sid=5f43317f-c20b-4dd9-91d3-2cde993fd91f
Chapters
0:00 Welcome Vibs
0:22 Background
2:30 Alltius’ UI for Assistants
7:15 The Knowledge Pillar
12:05 SQL Router and Intent Management
14:10 Classifying a Pipeline / Skill
17:30 Flexibility of Zero-Shot versus Fine-Tuning
21:00 The Channels Pillar
23:00 Connecting the Warehouse / Lakehouse
24:50 Assistant versus Agent
28:30 MemGPT
31:25 Offline LLM Research
35:50 Multi-Agent Role-Playing Assistants
39:25 From Clicks to Conversations
44:10 CEO / Professor and Evolution of the Field

Oct 24, 2023 • 31min
MemGPT Explained!
Thank you so much for watching our paper summary video on MemGPT! MemGPT is a super exciting new work bridging together concepts in how Operating Systems manage memory and LLMs!
Links:
Paper: https://arxiv.org/pdf/2310.08560.pdf
Andrej Karpathy on Operating Systems and LLMs: https://twitter.com/karpathy/status/1707437820045062561
Run LLM Podcast with Charles Packer: https://www.youtube.com/watch?v=4aOLxPdx1Dg
SciPhi: https://github.com/SciPhi-AI/sciphi/tree/main
Our perspectives on Database Agents that WRITE to Vector Databases: https://weaviate.io/blog/generative-feedback-loops-with-llms
Chapters
0:00 Introduction to MemGPT
2:45 MemGPT Architecture
6:15 Operating System for LLMs
11:48 Types of Context and Storage
15:42 Control Flow
18:00 Experiments
22:04 Future Work
24:46 Personal Takeaways
30:34 Thank you for watching!

Oct 18, 2023 • 55min
Kevin Cohen on Neum AI - Weaviate Podcast #70!
Hey everyone! Thank you so much for watching the 70th episode of the Weaviate podcast with Neum AI CTO and Co-Founder Kevin Cohen! I first met Kevin when he was debugging an issue with his distributed node utilization and have since learned so much from him about how he sees the space of Data Ingestion, also commonly referred to as ETL for LLMs! There are so many interesting parts to this, from the general flow of data connectors, chunkers and metadata extractors, and embedding inference, to the last mile of importing the vectors to a Vector DB such as Weaviate! I really loved how Kevin broke down the distributed messaging queue and system design for orchestrating data ingestion at massive scale such as dealing with failures and optimizing the infrastructure as code setup. We also discussed things like new use cases with quadrillion scale vector indexes and the role of knowledge graphs in all this! I really hope you enjoy the podcast, please check out this amazing article below from Neum AI!
https://medium.com/@neum_ai/retrieval-augmented-generation-at-scale-building-a-distributed-system-for-synchronizing-and-eaa29162521
Chapters
0:00 Check this out!
1:18 Welcome Kevin!
1:58 Founding Neum AI
6:55 Data Ingestion, End-to-End Overview
9:10 Chunking and Metadata Extraction
14:20 Embedding Cache
16:57 Distributed Messaging Queues
22:15 Embeddings Cache ELI5
25:30 Customizing Weaviate Kubernetes
38:10 Multi-Tenancy and Resource Allocation
39:20 Billion-Scale Vector Search
45:05 Knowledge Graphs
52:10 Y Combinator Experience