
Weaviate Podcast
Join Connor Shorten as he interviews machine learning experts and explores Weaviate use cases from users and customers.
Latest episodes

Oct 4, 2023 • 1h 9min
Charles Pierse on Tactic Generate - Weaviate Podcast #69!
Hey everyone! Thank you so much for watching the 69th episode of the Weaviate Podcast, featuring Charles Pierse from Tactic! Tactic has recently launched Tactic Generate, an incredible UI for conducting research across multiple documents. I think there is a massive opportunity to pair prompts and LLM workflows with user interfaces and take a more holistic user-experience perspective, and Tactic Generate has done an incredible job of that. Please take a look at the link below! I had such a fun conversation catching up with Charles (he was our 2nd Weaviate Podcast guest!). I hope you enjoy the podcast!
Tactic Generate: https://tactic.fyi/generative-insights/
Chapters
0:00 Tactic Generate
1:40 Welcome Charles!
2:38 Charles’ work at Tactic
4:40 LLMs comparing documents
9:10 LLM Chaining
17:30 Discovering LLM Chains
20:28 Moats in ML Products
28:48 Fine-Tuning vs. RAG
34:30 Fine-Tuning Search Models
39:45 Skepticism on RLHF
41:52 Gorilla, Integrations, and CRM
45:40 Query Routers
47:55 CRM and Tree-of-Thoughts
55:54 Graph Embeddings
1:02:20 Llama CPP / GGML
1:04:28 What are you looking forward to most in AI?

Sep 20, 2023 • 52min
Weights and Biases on Fine-Tuning LLMs - Weaviate Podcast #68!
Hey everyone! Thank you so much for watching the 68th episode of the Weaviate Podcast! We are super excited to welcome Morgan McGuire, Darek Kleczek, and Thomas Capelle! This was such a fun discussion, beginning with how they see the space of fine-tuning: why you would want to do it, the available tooling, the intersection with RAG, and more!
Check out W&B Prompts! https://wandb.ai/site/prompts
Check out the W&B Tiny Llama Report! https://wandb.ai/capecape/llamac/reports/Training-Tiny-Llamas-for-Fun-and-Science--Vmlldzo1MDM2MDg0
Chapters
0:00 Tiny Llamas!
1:53 Welcome!
2:22 LLM Fine-Tuning
5:25 Tooling for Fine-Tuning
7:55 Why Fine-Tune?
9:55 RAG vs. Fine-Tuning
12:25 Knowledge Distillation
14:40 Gorilla LLMs
18:25 Open-Source LLMs
22:48 Jonathan Frankle on W&B
23:45 Data Quality for LLM Training
25:55 W&B for Data Versioning
27:25 Curriculum Learning
29:28 GPU Rich and Data Quality
30:30 Vector DBs and Data Quality
32:50 Tuning Training with Weights & Biases
35:47 Training Reports
42:28 HF Collections and W&B Sweeps
44:50 Exciting Directions for AI

Sep 13, 2023 • 1h 1min
Farshad Farahbakhshian and Etienne Dilocker on Weaviate and AWS - Weaviate Podcast #67!
Hey everyone! Thank you so much for watching the 67th Weaviate Podcast, announcing Weaviate on the AWS Marketplace! This was one of my favorite podcasts to date, with a deep dive into the details of running RAG applications in the cloud, our general understanding of LLM fine-tuning and RAG, and a really interesting discussion on VPCs and Hybrid SaaS! I hope you find the podcast useful. As always, we are more than happy to answer any questions or discuss any ideas you have about the content presented in the podcast!
Learn more here: https://aws.amazon.com/marketplace/seller-profile?id=seller-jxgfug62rvpxs
As well as here: https://weaviate.io/developers/weaviate/installation/aws-marketplace
Chapters
0:00 Welcome Farshad
0:38 Weaviate’s Journey to AWS
2:05 Retrieval-Augmented Generation and Vector DBs
3:44 Running AI in the Cloud
9:40 Fine-Tuning LLMs vs. RAG
10:30 Skill vs. Knowledge (Lawyer Example)
14:28 Continual Learning of LLMs
16:50 Searching through multiple sources
19:58 Hybrid Search controlled by LLMs
22:10 Classes versus Filters
25:00 SQL and Vector Search
25:55 Favorite RAG Use Cases
31:55 Cloud Benchmarking
37:00 Price Performance
38:20 Tuning HNSW
42:15 Horizontal Scalability on AWS Marketplace
47:00 Privacy Requirements
54:45 Weaviate Hybrid SaaS
59:00 AWS Marketplace

Sep 12, 2023 • 4min
Hybrid SaaS in Weaviate Explained!
Hey everyone! Here is a clip from our newest Weaviate Podcast with Farshad Farahbakhshian, Gen AI specialist at AWS, and Etienne Dilocker, CTO and Co-Founder of Weaviate! The podcast announces Weaviate on the AWS Marketplace and is packed with info on running Weaviate in the cloud, such as this clip explaining how Hybrid SaaS works! I hope you find the clip useful. We are more than happy to answer any questions you have about the content in this clip!
Chapters
0:00 Quick Intro for Context
0:29 Etienne Dilocker on Hybrid SaaS

Sep 7, 2023 • 1h 5min
David Garnitz on VectorFlow - Weaviate Podcast #66!
Hey everyone! Thank you so much for watching the 66th Weaviate Podcast with David Garnitz, the creator of VectorFlow! VectorFlow (open-sourced on GitHub and linked below) is a new tool for ingesting data into vector databases such as Weaviate! There is quite an interesting end-to-end stack emerging at the ingestion layer: retrieving data from sources such as Slack, Salesforce, GitHub, Google Drive, and Notion; chunking the text (perhaps with visual document layout parsers like what Unstructured is imagining); extracting metadata (say, the "age" of an NBA player, as in the Evaporate-Code+ research); sending that data off to embedding model inference, which opens a can of worms from inference acceleration to load balancing; and finally, importing the vectors themselves into Weaviate! I learned so much from this conversation. I really hope you enjoy listening, and please check out VectorFlow below!
VectorFlow: https://github.com/dgarnitz/vectorflow
Chapters
0:00 VectorFlow on GitHub!
0:52 Welcome David Garnitz!
1:17 VectorFlow, Founding Vision
2:00 Billions of Vectors in Weaviate!
4:20 End-to-end data importing
6:30 Metadata Extraction in Vector Database Flows
10:15 Vectorizing 100s of millions to billions of chunks
15:58 Fine-Tuning Embedding Models
23:50 Zero-Shot Models in Metadata and Chunking
36:36 Vector + SQL
42:45 Self-Driving Databases
49:23 Generative Feedback Loop REST API
51:38 GPT Cache
55:55 Building VectorFlow

Aug 31, 2023 • 1h 7min
Ofir Press on AliBi and Self-Ask - Weaviate Podcast #65!
Hey everyone! Thank you so much for watching the 65th Weaviate Podcast! I am SUPER excited to publish my conversation with Ofir Press! Ofir has done incredible work pioneering AliBi attention and Self-Ask prompting, and I learned so much from speaking with him! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
+Huge Congratulations on your Ph.D. Ofir!
AliBi Attention: https://arxiv.org/abs/2108.12409
Self-Ask Prompting: https://arxiv.org/abs/2210.03350
Ofir Press on YouTube: https://www.youtube.com/@ofirpress
Chapters
0:00 Welcome Ofir Press
0:41 Large Context LLMs
12:38 Quadratic Complexity of Attention
19:12 AliBi Attention, Visual Demo!
24:53 Recency Bias in LLMs
28:57 RAG in Long Context LLM Training
36:27 Self-Ask Prompting
46:07 Chain-of-Thought and Self-Ask
50:47 Gorilla LLMs
58:42 New Directions for New Training Data

Aug 30, 2023 • 49min
Shishir Patil and Tianjun Zhang on Gorilla - Weaviate Podcast #64!
Hey everyone! Thank you so much for watching the 64th Weaviate Podcast with Shishir Patil and Tianjun Zhang, co-authors of Gorilla: Large Language Models Connected with Massive APIs! I learned so much about Gorilla from Shishir and Tianjun: the APIBench dataset, the continually evolving APIZoo, how the models are trained with Retrieval-Aware Training and Self-Instruct training data, how the authors think about fine-tuning LLaMA-7B models for tasks such as this, and much more! I hope you enjoy the podcast! As always, I am more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
Please check out the paper here! https://arxiv.org/abs/2305.15334
Chapters
0:00 Welcome Shishir and Tianjun
0:25 Gorilla LLM Story
1:50 API Examples
7:40 The APIZoo
10:55 Gorilla vs. OpenAI Funcs
12:50 Retrieval-Aware Training
19:55 Mixing APIs, Gorilla for Integration
25:12 LLaMA-7B Fine-Tuning vs. GPT-4
29:08 Weaviate Gorilla
33:52 Gorilla and Baby Gorillas
35:40 Gorilla vs. HuggingFace
38:32 Structured Output Parsing
41:14 Reflexion Prompting for Debugging
44:00 Directions for the Future

Aug 17, 2023 • 1h 5min
Nils Reimers on Cohere Search AI - Weaviate Podcast #63!
Nils Reimers, AI researcher, discusses the collaboration between Weaviate and Cohere, temporal queries, metadata extraction, long document representation, and future directions for Retrieval-Augmented Generation in the Weaviate Podcast. They also explore the challenges of search analysis, fine-tuning language models, and user preferences in search.

Aug 9, 2023 • 56min
Atai Barkai on PodcastGPT - Weaviate Podcast #62!
Hey everyone! Thank you so much for watching the 62nd Weaviate Podcast with Atai Barkai! We are stepping into the meta with this one: a podcast about podcasts! Podcasts are one of the biggest opportunities for new technologies, starting with Whisper's ability to transcribe audio to text and advances in speaker diarization. The questions to be explored are: what vector database and LLM applications can we build with this data, and what is the future of podcasting with these new technologies? I had so much fun discussing all these ideas with Atai! As always, we are more than happy to answer any questions or discuss any ideas you have about the content discussed in the podcast! Thank you so much for watching!
Chapters
0:00 Welcome Atai!
1:04 TawkitAI and PodcastGPT!
2:20 Chat with Podcast
PodcastGPT - https://www.podcastgpt.ai/
Tawkit AI - https://twitter.com/tawkitapp
Weaviate Podcast Search Demo!
https://github.com/weaviate/weaviate-podcast-search

Aug 3, 2023 • 49min
Rohit Agarwal on Portkey - Weaviate Podcast #61!
Hey everyone! Thank you so much for watching the 61st episode of the Weaviate Podcast! I am beyond excited to publish this one! I first met Rohit at the Cal Hacks event hosted by UC Berkeley, where we had a debate about the impact of Semantic Caching! Rohit taught me a ton about the topic, and I think it's going to be one of the most impactful early applications of Generative Feedback Loops! Rohit is building Portkey, a SUPER interesting LLM middleware that does things like load balancing between LLM APIs. As discussed in the podcast, there are all sorts of opportunities in this space, whether it be routing to tool-specific LLMs, handling different cost / accuracy requirements, or orchestrating multiple models in the HuggingGPT sense. It was amazing chatting with Rohit; this was the best dive into LLMOps I have personally been a part of! As always, we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast!
Check out Portkey here! https://portkey.ai/blog
Chapters
0:00 Introduction
0:24 Portkey, Founding Vision
2:20 LLMOps vs. MLOps
4:00 Inference Hosting Options
7:05 3 Layers of LLM Use
8:35 LLM Load Balancers
12:45 Fine-Tuning LLMs
17:08 Retrieval-Aware Tuning
21:16 Portkey Cost Savings
23:08 HuggingGPT
26:28 Semantic Caching
32:40 Frequently Asked Questions
34:00 Embeddings vs. Generative Tasks
35:30 AI Moats, GPT Wrappers
39:56 Unlocks from Cheaper LLM Inference