
Weaviate Podcast
Join Connor Shorten as he interviews machine learning experts and explores Weaviate use cases from users and customers.
Latest episodes

Jan 15, 2024 • 31min
DSPy and ColBERT with Omar Khattab! - Weaviate Podcast #85
Omar Khattab, a leading scientist in AI and NLP, discusses the concept of LLM programs and program optimization with DSPy. He explores program components such as a query writer, retriever, reranker, and answer generator, and the potential of DSPy for optimizing prompts. The podcast also delves into language models and DSPy modules, compilers for program synthesis, and the power of ColBERT for contextual awareness and document scoring.
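As a rough illustration of the kind of LLM program discussed in this episode, below is a minimal DSPy-style pipeline with a query writer, retriever, and answer generator (the rerank step is omitted for brevity). This is a hedged sketch, not code from the episode; it follows DSPy's public module and signature conventions, but exact details may differ across versions.

```python
import dspy

class QueryWriterRAG(dspy.Module):
    """Toy LLM program: write a search query, retrieve passages, then answer."""

    def __init__(self, k: int = 3):
        super().__init__()
        self.write_query = dspy.ChainOfThought("question -> search_query")
        self.retrieve = dspy.Retrieve(k=k)  # uses whatever retriever dspy.settings configures
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question: str) -> dspy.Prediction:
        search_query = self.write_query(question=question).search_query
        passages = self.retrieve(search_query).passages
        return self.generate_answer(context=passages, question=question)

# A DSPy compiler (e.g. dspy.teleprompt.BootstrapFewShot) could then optimize the
# prompts and demonstrations of each module against a small training set.
```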

Dec 21, 2023 • 42min
Subjectivity in AI with Dan Shipper: AI-Native Databases #4
Hey everyone! Thank you so much for watching the fourth and final episode of the AI-Native Database series with Dan Shipper! This was another epic one! Dan has had an absolutely remarkable career, creating and selling a company and now co-founding Every, where he serves as CEO! Every is an incredibly forward-looking business focused on online content, with an amazing newsletter, a community of writers and thinkers, an AI note-taking app, and more! I think Dan brings a very unique perspective to the series, and to the Weaviate Podcast broadly, because of his experience with writers and his understanding of how writers are going to use these new technologies! We heavily discussed the role of personality, or subjectivity, in AI, amongst many other topics! I really hope you enjoy the podcast. As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
Read writings from Dan Shipper on Every: https://every.to/@danshipper
Chapters
0:00 AI-Native Databases
0:58 Welcome Dan Shipper!
1:37 GPT-4 is a Reasoning Engine
8:40 Subjectivity in LLMs
12:14 AI in Note Taking
16:38 The opinions of LLMs
25:50 Cookbooks for you
31:16 Overdrive in LLMs
34:50 Tweaking the voice of AI
40:45 Multi-Agent Personalities

Dec 20, 2023 • 40min
Humans and AI with John Maeda: AI-Native Databases #3
Hey everyone! Thank you so much for watching the 3rd episode of the AI-Native Database series featuring John Maeda and Bob van Luijt! This one dives into how humans perceive AI, from anthropomorphization to doomsday-scenario thinking, and how important an understanding of how AI actually works is to engineering these systems. Bob and John discuss the evolution of the Design in Tech Report, three categories of design, and many other topics! I hope you enjoy the podcast! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
Links:
Design in Tech Report: https://designintech.report/
3 Kinds of Design: https://qz.com/1585165/john-maeda-on-the-importance-of-computational-design
Microsoft Semantic Kernel: https://github.com/microsoft/semantic-kernel
Chapters
0:00 AI-Native Databases
0:58 Welcome John Maeda!
1:35 Design in Tech Report
4:07 Anthropomorphizing AI
15:30 3 Types of Design
19:30 The ChatGPT Shift
22:58 Explaining Technology
32:54 Impact of AI on the Creative Industries
39:00 Semantic Kernel

Dec 19, 2023 • 45min
Structure in Data with Paul Groth: AI-Native Databases #2
Hey everyone! Thank you so much for watching the second episode of AI-Native Databases with Paul Groth! This was another epic one, diving deep into the role of structure in our data! Beginning with Knowledge Graphs and LLMs, there are two perspectives: LLMs for Knowledge Graphs (using LLMs to extract relationships or predict missing links) and Knowledge Graphs for LLMs (providing factual information in RAG). There is another intersection that sits in the middle of both, which is using LLMs to query Knowledge Graphs, e.g. Text-to-Cypher/SPARQL/... From there, I think the conversation evolves in a really fascinating way, exploring the ability to structure data on the fly. Paul says "Unstructured data is now becoming a peer to structured data"! I think that in addition to RAG, Generative Search is another underrated use case -- where we use LLMs to summarize search results or parse out their structure. Super interesting ideas. I hope you enjoy the podcast! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
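As a rough sketch of the "LLMs to query Knowledge Graphs" intersection mentioned above, here is what a minimal text-to-Cypher step could look like. This is purely illustrative and not from the episode; the prompt, the model choice, and the toy schema are all assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Toy graph schema; a real system would introspect this from the knowledge graph.
SCHEMA = "(:Person {name})-[:WORKS_AT]->(:Company {name}), (:Person)-[:AUTHORED]->(:Paper {title})"

def text_to_cypher(question: str) -> str:
    """Ask an LLM to translate a natural-language question into a Cypher query."""
    prompt = (
        f"Graph schema: {SCHEMA}\n"
        f"Write a single Cypher query that answers: {question}\n"
        "Return only the Cypher, with no explanation."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder; any capable LLM works here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

# print(text_to_cypher("Which papers were authored by people who work at Weaviate?"))
```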
Learn more about Professor Groth's research here: https://scholar.google.com/citations?...
Knowledge Engineering using Large Language Models: https://arxiv.org/pdf/2310.00637.pdf
How Much Knowledge Can You Pack into the Parameters of a Language Model? https://arxiv.org/abs/2002.08910
Chapters
0:00 AI-Native Databases!
0:58 Welcome Paul!
1:25 Bob’s overview of the series
2:30 How do we build great datasets?
4:28 Defining Knowledge Graphs
7:15 LLM as a Knowledge Graph
15:18 Adding CRUD Support to Models
28:10 Database of Model Weights
32:50 Structuring Data On-the-Fly

Dec 18, 2023 • 1h 15min
Self-Driving Databases with Andy Pavlo: AI-Native Databases #1
Hey everyone! Thank you so much for watching the first episode of AI-Native Databases with Andy Pavlo! This was an epic one! We began by explaining the "Self-Driving Database" and all the opportunities to optimize DBs with AI and ML, both at the low level and in how we query and interact with them. We also discussed new opportunities with DBs + LLMs, such as bringing the data to the model (e.g. ROME, MEMIT, GRACE) in addition to bringing the model to the data (e.g. RAG). We also discussed the subjective "opinion" of these models and much more!
I hope you enjoy the podcast! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast! This one means a lot to me. Andy Pavlo's CMU DB course was one of the most impactful resources in my personal education, and I love the vision for the future outlined by OtterTune! It was amazing to see Etienne Dilocker featured in the ML for DBs, DBs for ML series at CMU. I am so grateful to Andy for joining the Weaviate Podcast!
Links:
CMU Database Group on YouTube: https://www.youtube.com/@CMUDatabaseGroup/videos
Self-Driving Database Management Systems - Pavlo et al. - https://db.cs.cmu.edu/papers/2017/p42-pavlo-cidr17.pdf
Database of Databases: https://dbdb.io/
Generative Feedback Loops: https://weaviate.io/blog/generative-feedback-loops-with-llms
Weaviate Gorilla: https://weaviate.io/blog/weaviate-gorilla-part-1
Chapters
0:00 AI-Native Databases
0:58 Welcome Andy
1:58 Bob’s overview of the series
3:20 Self-Driving Databases
8:18 Why isn’t there just 1 Database?
12:46 Collaboration of Models and Databases
20:05 LLM Schema Tuning
23:44 The Opinion of the System
28:20 PyTorchDB - Moving the Data to the Model
33:30 Database APIs
38:15 Learning to operate Databases
42:54 Vector DBs and the DB Hype Cycle
51:38 SQL in Weaviate?
1:07:40 The Future of DBs
1:14:00 Thank you Andy!

Dec 14, 2023 • 55min
Weaviate 1.23 Release Podcast with Etienne Dilocker!
Hey everyone! Thank you so much for watching the Weaviate 1.23 Release Podcast with Weaviate Co-Founder and CTO Etienne Dilocker! Weaviate 1.23 is a massive step forward for managing multi-tenancy with vector databases. For most RAG and vector DB applications, you will have an uneven distribution in the number of vectors per user: some users have 10k docs, others 10M+! Weaviate now offers a flat index with binary quantization, so you can efficiently balance between an HNSW graph for the 10M-doc users and brute-force search, which is all you need for the 10k-doc users!
Weaviate also comes with some other "self-driving database" features, like lazy shard loading for faster startup times with multi-tenancy and automatic resource limiting via GOMEMLIMIT, plus other details Etienne shares in the podcast!
I am also beyond excited to present our new integration with Anyscale (@anyscalecompute)! Anyscale has amazing pricing for serving and fine-tuning popular open-source LLMs. At the time of this release we are integrating Llama 70B/13B/7B, Mistral 7B, and Code Llama 34B into Weaviate, but we expect much further development: adding support for fine-tuned models, the super cool new function-calling models Anyscale announced yesterday, and other models such as diffusion and multimodal models!
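To make the flat index + BQ and multi-tenancy features concrete, here is a minimal sketch of what creating such a collection could look like with the v4 Python client. Treat it as illustrative only: the collection name is made up, and exact configuration helpers may differ between client versions.

```python
import weaviate
from weaviate.classes.config import Configure

client = weaviate.connect_to_local()  # adjust connection/auth for your deployment

# A multi-tenant collection using the flat vector index with binary quantization (BQ),
# a good fit for tenants whose vector counts are small enough for brute-force search.
client.collections.create(
    name="Document",  # hypothetical collection name
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    vector_index_config=Configure.VectorIndex.flat(
        quantizer=Configure.VectorIndex.Quantizer.bq()
    ),
)

client.close()
```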
Chapters
0:00 Weaviate 1.23
1:08 Lazy Shard Loading
8:20 Flat Index + BQ
33:15 Default Segments for PQ
38:55 AutoPQ
42:20 Auto Resource Limiting
46:04 Node Endpoint Update
47:25 Generative Anyscale
Links:
Etienne Dilocker on Native Multi-Tenancy at the AI Conference in SF:
https://www.youtube.com/watch?v=KT2RFMTJKGs
Etienne Dilocker in the CMU DB Series:
https://www.youtube.com/watch?v=4sLJapXEPd4
Self-Driving Databases by Andy Pavlo: https://www.cs.cmu.edu/~pavlo/blog/2018/04/what-is-a-self-driving-database-management-system.html

Nov 29, 2023 • 56min
Rudy Lai on Tactic Generate - Weaviate Podcast #78!
Hey everyone! Thank you so much for watching the 78th episode of the Weaviate Podcast featuring Rudy Lai, the founder and CEO of Tactic Generate! Tactic Generate has developed a user experience around applying LLMs in parallel to multiple documents, or even folders, collections, and databases. Rudy discussed the user research that led the company in this direction and how he sees the opportunities in building AI products with new LLM and vector database technologies! I hope you enjoy the podcast. As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
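The parallel-LLMs-over-documents pattern Rudy describes can be sketched generically. The snippet below is not Tactic Generate's implementation, just a minimal illustration of fanning one prompt out over many documents concurrently; the model name and prompt are placeholders.

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()  # assumes OPENAI_API_KEY is set

async def analyze_document(document: str, question: str) -> str:
    # One LLM call per document, all sharing the same instruction.
    response = await client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model
        messages=[{"role": "user", "content": f"{question}\n\nDocument:\n{document}"}],
    )
    return response.choices[0].message.content

async def analyze_collection(documents: list[str], question: str) -> list[str]:
    # Fan the same question out over a folder/collection of documents in parallel.
    return list(await asyncio.gather(*(analyze_document(d, question) for d in documents)))

# results = asyncio.run(analyze_collection(docs, "What risks does this contract mention?"))
```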
Learn more about Tactic Generate here: https://tactic.fyi/generative-insights/
Weaviate Podcast #69 with Charles Pierse: https://www.youtube.com/watch?v=L_nyz1xs9AU
Chapters
0:00 Welcome Rudy!
0:48 Story of Tactic Generate
7:45 Finding Common Workflows
19:30 Multiple Document RAG UIs
26:14 Parallel LLM Execution
32:40 Aggregating Parallel LLM Analysis
38:25 Pretty Reports
44:28 Research Agents

Nov 20, 2023 • 50min
RAGAS with Jithin James, Shahul Es, and Erika Cardenas - Weaviate Podcast #77!
Hey everyone, thank you so much for watching the 77th Weaviate Podcast on RAGAS, featuring Jithin James, Shahul Es, and Erika Cardenas! RAGAS is one of the hottest rising startups in Retrieval-Augmented Generation! RAGAS began its journey with the RAGAS score, a matrix of evaluations for generation and retrieval. Generation is evaluated on Faithfulness (is the response grounded in the context?) and Relevancy (is the response useful?). Retrieval is then evaluated on Precision (how many of the search results are relevant to the question?) and Recall (how many of the relevant search results are captured in the retrieved results?). The super novel thing about this is that an LLM is used to compute these metrics, so the RAGAS score circumvents painstaking manual labeling effort! This podcast dives into the development of the RAGAS score as well as how RAG application builders should think about the knobs to tune for optimizing their RAGAS score: embedding models, chunking strategies, hybrid search tuning, rerankers, and more! We also discussed tons of exciting directions for the future, such as fine-tuning smaller LLMs for these metrics, agents that use tuning APIs, and long-context RAG!
Check out the docs here for getting started with RAGAS! https://docs.ragas.io/en/latest/getstarted/index.html#get-started
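To make the metric matrix concrete, here is a minimal sketch of scoring a single RAG result with RAGAS, based on the getting-started docs linked above. The example row is made up, and column names (e.g. ground_truths) may differ between RAGAS versions.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision, context_recall

# One row per question: the retrieved contexts, the generated answer,
# and a reference answer (needed for context recall). All values are made up.
data = Dataset.from_dict({
    "question": ["What does binary quantization do?"],
    "contexts": [["Binary quantization compresses each vector dimension down to a single bit."]],
    "answer": ["It compresses vectors to one bit per dimension to save memory."],
    "ground_truths": [["Binary quantization stores each vector dimension as one bit."]],
})

# An LLM (e.g. via the OpenAI API) acts as the judge for each metric,
# so no manual relevance labels are required.
result = evaluate(data, metrics=[faithfulness, answer_relevancy, context_precision, context_recall])
print(result)
```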
Chapters
0:00 Welcome Jithin and Shahul!
0:44 Welcome Erika!
0:56 RAGAS, Founding Story
2:38 Weaviate + RAGAS integration plans
4:44 RAG Knobs to Tune
25:50 RAG Experiment Tracking
34:52 LangSmith and RAGAS
38:55 LLM Evaluation
40:25 RAGAS Agents
44:00 Long Context RAG Evaluation

Nov 14, 2023 • 59min
Patrick Lewis on Retrieval-Augmented Generation - Weaviate Podcast #76!
Hey everyone, I am SUPER excited to present our 76th Weaviate Podcast featuring Patrick Lewis, an NLP Research Scientist at Cohere! Patrick has had an absolutely massive impact on Natural Language Processing with AI and Deep Learning! Especially notable for the current climate in AI and at Weaviate is that Patrick is the lead author of the original "Retrieval-Augmented Generation" paper! Patrick has contributed to many other profoundly impactful papers in the space as well, such as DPR, Atlas, Task-Aware Retrieval with Instructions, and many others! This was such an illuminating conversation; here is a quick overview of the chapters in the podcast!
1. Origin of RAG - Patrick explains the build-up that led to the RAG paper: AskJeeves, IBM Watson, and the conceptual shift to retrieve-then-read in mainstream connectionist approaches to AI.
2. Atlas - Atlas shows that a much smaller LLM, when paired with Retrieval-Augmentation, can still achieve competitive few-shot and zero-shot task performance. This is super impactful because few-shot and zero-shot capability has been a massive driver of excitement for AI broadly, and the fact that smaller Retrieval-Augmented models can do this is massive for economically unlocking these applications.
Teasing apart some architectural details of RAG:
3. Fusion In-Decoder - An interesting encoder-decoder transformer design in which each document is encoded separately together with the query, and the resulting encodings are then concatenated and fused in the decoder.
4. End-to-End RAG - How to think about jointly training an embedding model and an LLM augmented with retrieval?
5. Query Routers - How do you route queries to, say, SQL or vector DBs? (More nuance on this later with Multi-Index Retrieval; a toy routing sketch follows this list.)
6. ConcurrentQA - Super interesting work on the privacy of multi-index routers. For example, asking "Who is the father of our new CEO?" may leak private information about the new CEO through the public sub-query about their father.
7. Multi-Index Retrieval
8. New APIs for LLMs
9. Self-Instructed Gorillas
10. Task-Aware Retrieval with Instructions
11. Editing Text, EditEval and PEER
12. What future direction excites you the most?
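As promised in point 5, here is a toy sketch of an LLM-based query router that decides whether a question should go to a SQL database or a vector index. It is purely illustrative; the routing prompt, labels, and model are assumptions, not anything Patrick described verbatim.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

ROUTER_PROMPT = (
    "Classify the user question as one of: SQL (aggregations/filters over structured tables) "
    "or VECTOR (semantic search over unstructured text). Answer with a single word.\n\n"
    "Question: {question}"
)

def route_query(question: str) -> str:
    """Return 'SQL' or 'VECTOR' to pick the downstream index."""
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model
        messages=[{"role": "user", "content": ROUTER_PROMPT.format(question=question)}],
    )
    label = response.choices[0].message.content.strip().upper()
    return "SQL" if "SQL" in label else "VECTOR"

# route_query("How many orders shipped last month?")              -> likely 'SQL'
# route_query("Find papers about retrieval-augmented generation") -> likely 'VECTOR'
```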
Links:
Learn more about Patrick Lewis: https://www.patricklewis.io/
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks: https://arxiv.org/abs/2005.11401
Atlas: https://arxiv.org/pdf/2208.03299.pdf
Fusion In-Decoder: https://arxiv.org/pdf/2007.01282.pdf
Chapters
0:00 Welcome Patrick Lewis!
0:36 Origin of RAG
5:20 Atlas
10:43 Fusion In-Decoder
17:50 End-to-End RAG
27:05 Query Routers
32:05 ConcurrentQA
37:30 Multi-Index Retrieval
40:05 New APIs for LLMs
41:50 Self-Instructed Gorillas
44:35 Task-Aware Retrieval with Instructions
52:00 Editing Text, EditEval and PEER
55:35 What future direction excites you the most?

Nov 8, 2023 • 50min
Tanmay Chopra on Emissary - Weaviate Podcast #75!
Hey everyone! Thank you so much for watching the 75th Weaviate Podcast featuring Tanmay Chopra! The podcast details Tanmay's incredible career in Machine Learning, from TikTok to Neeva, and now building his own startup, Emissary! Tanmay shared some amazing insights into Search AI, such as how to process Temporal Queries, how to think about diversity in Retrieval, and Query Recommendation products! We then dove into the opportunity Tanmay sees in fine-tuning LLMs and knowledge distillation, which motivated him to build Emissary! I thought Tanmay's analogy of GPT-4 to 3D printers was really interesting; there are tons of great nuggets in here! I really hope you enjoy the podcast. As always, we are more than happy to answer any questions or discuss any ideas with you related to the content in the podcast!
Chapters
0:00 Welcome Tanmay!
0:23 Early Career Story
2:02 TikTok
4:10 Neeva
8:45 Temporal Queries
11:40 Retrieval Diversity
17:22 Query Recommendation
23:20 Emissary, starting a company!
30:20 A Simple API for Custom Models
35:42 GPT-4 = 3D Printer?