Weaviate Podcast

Latest episodes

Oct 19, 2022 • 45min

Jonathan Frankle on MosaicML Cloud

Weaviate Podcast #26. Thank you so much for watching the 26th episode of the Weaviate Podcast! This is another really special episode! Jonathan Frankle is one of the world's experts in Deep Learning and is making incredible advances at MosaicML in efficient Deep Learning training. The headline event is the release of MosaicML Cloud and a set of new cost estimates for training GPT language models at different scales (linked below). Jonathan explains that these numbers are a baseline, and he predicts the cost could get as low as $100K as the team finds further efficiency optimizations. This story has already played out in ResNet ImageNet training, where MosaicML has demolished expectations of how fast we can train these models, and it seems highly likely they will do the same for large language model costs. Jonathan and I also discussed the general space of Language Models and their applications, especially their role as Databases in systems like the Weaviate Vector Search Engine. We also discussed Self-Ask, Chain-of-Thought Prompting, and tool use in Language Models. I had an awesome time picking Jonathan's brain about these topics and I hope you all enjoy the podcast. I am more than happy to answer any questions or entertain any ideas or discussion! Thanks again for watching! Blog post: GPT-3 Quality for less than $500K - https://www.mosaicml.com/blog/gpt-3-q...
Oct 6, 2022 • 45min

Erik Bernhardsson and Etienne Dilocker on Vector Search in Production.

Weaviate Podcast #25. Thank you so much for watching the 25th episode of the Weaviate Podcast! This is a really special episode with Erik Bernhardsson! Erik is one of the early thought leaders on Approximate Nearest Neighbor (ANN) Search, creating the ANNOY library at Spotify. Erik shared incredible insights about vector search at Spotify such as the role of Offline and Online Machine Learning inference and the role of multi-stage re-ranking pipelines. Erik has also done massively impactful work on benchmarking ANN algorithms! We really hope you enjoy the podcast and would be thrilled to answer any questions you have about the conversation topics! Thanks again for watching!
Sep 8, 2022 • 1h 7min

Weaviate v1.15 Release with Etienne Dilocker and Dirk Kulawiak

Weaviate Podcast #24. Weaviate v1.15 Release! Thank you so much for checking out the Weaviate podcast. Here is a summary of what is new in Weaviate 1.15:

1. Cloud-native backups – allows you to configure your environment to create backups – of selected classes or the whole database – straight into AWS S3, GCS or the local filesystem.
2. Reduced memory usage – we found new ways to optimize memory usage, reducing RAM usage by 10-30%.
3. Better control over Garbage Collector – with the introduction of GOMEMLIMIT we gained more control over the garbage collector, which significantly reduced the chances of OOM kills for your Weaviate setups.
4. Faster imports for ordered data – by extending the Binary Search Tree structure with a self-balancing Red-black tree, we were able to speed up imports from O(n) to O(log n).
5. More efficient filtered aggregations – thanks to an optimization in a library that reads binary data, filtered aggregations are now 10-20x faster and require a lot less memory.
6. Two new distance metrics – with the addition of Hamming and Manhattan distance metrics, you can choose the metric (or a combination of metrics) to best suit your data and use case.
7. Two new Weaviate modules – with the Summarization module, you can summarize any text on the fly, while with the HuggingFace module, you can use compatible transformers from HuggingFace.
8. Other improvements and bug fixes – it goes without saying that with every Weaviate release, we strive to make Weaviate more stable – through bug fixes – and more efficient – through many optimizations.

Please check out this awesome blog post from Sebastian Witalec and the team describing these further - https://weaviate.io/blog/2022/09/Weav....
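
For readers unfamiliar with the two distance metrics named in item 6, here is a minimal Python sketch of their textbook definitions. It is only an illustration, not Weaviate's implementation; the function names `manhattan_distance` and `hamming_distance` are hypothetical and are not part of any Weaviate API.

```python
from typing import Sequence


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute per-dimension differences (L1 distance)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming_distance(a: Sequence[float], b: Sequence[float]) -> int:
    """Number of dimensions in which the two vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    # Illustrative vectors only; not taken from the episode.
    print(manhattan_distance([1, 2, 3], [4, 0, 3]))
    print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
```

Manhattan distance reflects how far apart the values are in each dimension, while Hamming distance only counts how many dimensions differ, which tends to suit binary or categorical vectors.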
Aug 31, 2022 • 1h 2min

Ori Ram on Learning to Retrieve Passages without Supervision

Weaviate Podcast #23. Thank you so much for watching the 23rd episode of the Weaviate Podcast! This episode dives into a new technique for Self-Supervised retrieval in NLP with some incredible results!
Aug 10, 2022 • 58min

Yaoshiang Ho on Masterful AI

Weaviate Podcast #22. Thank you so much for watching the 22nd Weaviate Podcast with Yaoshiang Ho! Yaoshiang is a Co-Founder of Masterful AI, a company doing incredible work in the Computer Vision model training and deployment space (https://www.masterfulai.com/). I really hope you enjoy this podcast. Yaoshiang and I went deep into some of the cutting-edge Computer Vision algorithms such as Noisy Student, SimCLR, and Barlow Twins, as well as the broader topic of Semi-Supervised Learning, in which we have a small labeled dataset and a large unlabeled dataset. I am so excited about model training tools like Masterful and their integration with the Weaviate Vector Search Indexes and other Database features! I am more than happy to answer any questions or host any discussion on topics mentioned in the podcast! Please like and subscribe to support more content like this!
Jul 27, 2022 • 53min

Laura Ham on Weaviate User Experience

Weaviate Podcast #21. Thank you for watching the 21st Weaviate Podcast with Laura Ham! Laura Ham has worked on Weaviate at SeMI Technologies for a little over 5 years. She has had a heavy influence on everything from the GraphQL User Experience design to the Graph data model and the creation of educational content! I really enjoyed this podcast. Please see the list of topics under “chapters”! Here are some examples of recent coding tutorial videos Laura has made on “How to add custom modules to Weaviate” and integrations of Weaviate with Jina AI and Haystack.
Jul 13, 2022 • 47min

Tuana Celik on Question Answering with Haystack

Weaviate Podcast #20. Tuana Celik, a Developer Advocate at Deepset, presented many exciting ideas around Question Answering! We began with her Game of Thrones Question Answering Demo on HuggingFace Spaces and went on to discuss all things QA, from Extractive to Abstractive approaches, the benefits of Retrieve-then-Read, and Zero-Shot Generalization, to give a quick preview. For our Weaviate users, please check out this demo from Laura Ham on how to use Haystack QA in tandem with the Weaviate Vector Search Database: https://www.youtube.com/watch?v=Bkoza.... I really hope you enjoy this podcast. Please don't forget to check the Chapters to see if any topics are particularly relevant to what you are working on! Please also see the links below with referenced content in the podcast!
Jul 8, 2022 • 48min

Etienne Dilocker on Weaviate v1.14 Release!

Weaviate Podcast #19. SeMI Technologies Co-Founder and CTO Etienne Dilocker returns to the Weaviate podcast to describe what's new with Weaviate v1.14! Please see the chapter outlines if you would like to skip ahead to the update most relevant to you! Please also see this blog post led by Sebastian Witalec describing the new changes to Weaviate! Weaviate v1.14 Blog Post: https://weaviate.io/blog/2022/07/Weav...
Jun 28, 2022 • 58min

Vincent D. Warmerdam on Applications of Nearest Neighbor Search

Weaviate Podcast #18. Thank you for watching the 18th Weaviate Podcast with Vincent D. Warmerdam! Vincent is an engineer at spaCy working on exciting tools such as Prodigy! Vincent describes how nearest neighbor search can aid in tasks such as Data De-Duplication and Data Labeling! Vincent shared many interesting ideas, from representations of text and challenges with annotator disagreement to lessons from hosting data labeling workshops to train data scientists, and many more!
May 31, 2022 • 1h 7min

Kyle Lo on Scientific Literature Mining

Weaviate Podcast #17. Thank you for watching the 17th Weaviate Podcast with Kyle Lo! Vector Search enables us to find semantically similar items in large collections. Scientific Literature Mining is an extremely interesting case of this where we search through enormous collections of scientific papers to find evidence and ideas. Kyle has an extremely impressive resume in this application domain, tackling tasks such as Question Answering, Summarization, Fact Verification, and more! We really hope you enjoy the lessons Kyle describes from building these systems. Further, we hope that this inspires excitement in your own Vector Search applications!
