

Weaviate Podcast
Weaviate
Join Connor Shorten as he interviews machine learning experts and explores Weaviate use cases from users and customers.
Episodes

Nov 17, 2022 • 53min
Maarten Grootendorst on BERTopic
Weaviate Podcast #28. Thank you so much for watching the 28th Weaviate Podcast! This episode features Maarten Grootendorst, developer of the BERTopic Python library and an active evangelist of this exciting cluster analysis technology. (Maarten has written some incredible articles here: https://medium.com/@maartengrootendorst) In this podcast, Maarten did an incredible job explaining how BERTopic works, with particular details such as k-Means clustering vs. HDBSCAN, Semi-Supervised topic modeling, Dynamic topic modeling, and many more! I was amazed at Maarten's expertise in the finer details of these algorithms! We are extremely excited about adding BERTopic to Weaviate. If you are interested in contributing to the discussion, please see this proposal: https://github.com/semi-technologies/...

Oct 26, 2022 • 44min
Michael Goin on Neural Magic
Weaviate Podcast #27. Thank you so much for watching the 27th episode of the Weaviate Podcast! This is truly one of my favorite podcasts we have published so far. I think the way Weaviate and Neural Magic fit together is really exciting! Michael did an amazing job explaining the concepts behind how Neural Magic delivers and tests inference acceleration, as well as the vision for the future of Deep Learning with Sparsity and CPU inference. I really hope you enjoy the podcast, and I am more than happy to answer any questions or entertain any ideas/discussion! Thanks again for watching! Weaviate users can begin using Neural Magic's text vectorization pipeline as a custom text2vec-transformers Docker image here: cshorten/experimental-text2vec-neuralmagic. Please note this is an experimental build; we will be releasing the full integration with thorough testing very soon!

Oct 19, 2022 • 45min
Jonathan Frankle on MosaicML Cloud
Weaviate Podcast #26. Thank you so much for watching the 26th episode of the Weaviate Podcast! This is another really special episode! Jonathan Frankle is one of the world's experts in Deep Learning and is making incredible advances at MosaicML in efficient Deep Learning training. The headline event is the release of MosaicML Cloud and a set of new cost estimates for training GPT language models at different scales (linked below). Jonathan explains that these numbers are a baseline, and he predicts they could get as low as $100K as the team finds further efficiency optimizations. This story has already played out in the realm of ResNet ImageNet training, where MosaicML has demolished expectations of how fast we can train these models, and it seems highly likely they will do the same for large language model costs. Jonathan and I also discussed the general space of Language Models and their applications, especially their role as Databases in things like the Weaviate Vector Search Engine. We also discussed Self-Ask, Chain-of-thought Prompting, and tool use in Language Models. I had an awesome time picking Jonathan's brain about these topics and I hope you all enjoy the podcast. I am more than happy to answer any questions or entertain any ideas / discussion! Thanks again for watching! Blog post: GPT-3 Quality for less than $500K - https://www.mosaicml.com/blog/gpt-3-q...

Oct 6, 2022 • 45min
Erik Bernhardsson and Etienne Dilocker on Vector Search in Production
Weaviate Podcast #25. Thank you so much for watching the 25th episode of the Weaviate Podcast! This is a really special episode with Erik Bernhardsson! Erik is one of the early thought leaders on Approximate Nearest Neighbor (ANN) Search, creating the ANNOY library at Spotify. Erik shared incredible insights about vector search at Spotify such as the role of Offline and Online Machine Learning inference and the role of multi-stage re-ranking pipelines. Erik has also done massively impactful work on benchmarking ANN algorithms! We really hope you enjoy the podcast and would be thrilled to answer any questions you have about the conversation topics! Thanks again for watching!

Sep 8, 2022 • 1h 7min
Weaviate v1.15 Release with Etienne Dilocker and Dirk Kulawiak
Weaviate Podcast #24. Weaviate v1.15 Release! Thank you so much for checking out the Weaviate podcast. Here is a summary of what is new in Weaviate 1.15:
1. Cloud-native backups – allows you to configure your environment to create backups – of selected classes or the whole database – straight into AWS S3, GCS or the local filesystem.
2. Reduced memory usage – we found new ways to optimize memory usage, reducing RAM usage by 10-30%.
3. Better control over Garbage Collector – with the introduction of GOMEMLIMIT we gained more control over the garbage collector, which significantly reduced the chances of OOM kills for your Weaviate setups.
4. Faster imports for ordered data – by extending the Binary Search Tree structure with a self-balancing Red-black tree, we were able to speed up imports from O(n) to O(log n).
5. More efficient filtered aggregations – thanks to an optimization in a library for reading binary data, filtered aggregations are now 10-20x faster and require a lot less memory.
6. Two new distance metrics – with the addition of Hamming and Manhattan distance metrics, you can choose the metric (or a combination of metrics) to best suit your data and use case.
7. Two new Weaviate modules – with the Summarization module, you can summarize any text on the fly, while with the HuggingFace module, you can use compatible transformers from HuggingFace.
8. Other improvements and bug fixes – it goes without saying that with every Weaviate release, we strive to make Weaviate more stable – through bug fixes – and more efficient – through many optimizations.
Please check out this awesome blog post from Sebastian Witalec and the team describing these further: https://weaviate.io/blog/2022/09/Weav...
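For readers unfamiliar with the two distance metrics named in item 6 of the v1.15 summary, here is a minimal plain-Python sketch of their textbook definitions. It is an illustration only, not Weaviate's implementation; the function names and sample vectors are hypothetical, and Hamming distance is assumed here to count differing vector positions.

```python
from typing import Sequence


def manhattan_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance: sum of absolute per-dimension differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming_distance(a: Sequence[float], b: Sequence[float]) -> int:
    """Hamming distance: number of positions at which the vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Hypothetical sample vectors, chosen only for illustration.
a = [1, 0, 3, 4]
b = [1, 2, 3, 1]

print(manhattan_distance(a, b))  # |1-1| + |0-2| + |3-3| + |4-1| = 5
print(hamming_distance(a, b))    # positions 1 and 3 differ, so 2
```

Manhattan distance is sensitive to how far apart the values are in each dimension, while Hamming distance only records whether they differ, which is why the latter is usually paired with binary or categorical vectors.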

Aug 31, 2022 • 1h 2min
Ori Ram on Learning to Retrieve Passages without Supervision
Weaviate Podcast #23. Thank you so much for watching the 23rd episode of the Weaviate Podcast! This episode dives into a new technique for Self-Supervised retrieval in NLP with some incredible results!

Aug 10, 2022 • 58min
Yaoshiang Ho on Masterful AI
Weaviate Podcast #22. Thank you so much for watching the 22nd Weaviate Podcast with Yaoshiang Ho! Yaoshiang is a Co-Founder of Masterful AI, a company doing incredible work in the Computer Vision model training and deployment space (https://www.masterfulai.com/). I really hope you enjoy this podcast. Yaoshiang and I went deep into some of the cutting edge Computer Vision algorithms such as Noisy Student, SimCLR, and Barlow Twins, as well as the broader topic of Semi-Supervised Learning, in which we have a small labeled dataset and a large unlabeled dataset. I am so excited about model training tools like Masterful and their integration with the Weaviate Vector Search Indexes and other Database features! I am more than happy to answer any questions or host any discussion on topics mentioned in the podcast! Please like and subscribe to support more content like this!

Jul 27, 2022 • 53min
Laura Ham on Weaviate User Experience
Weaviate Podcast #21. Thank you for watching the 21st Weaviate Podcast with Laura Ham! Laura Ham has worked on Weaviate at SeMI Technologies for a little over 5 years. She has had a heavy influence on everything from the GraphQL User Experience design to the Graph data model and the creation of educational content! I really enjoyed this podcast. Please see the list of topics under “chapters”! Here are some examples of recent coding tutorial videos Laura has made on “How to add custom modules to Weaviate” and integrations of Weaviate with Jina AI and Haystack.

Jul 13, 2022 • 47min
Tuana Celik on Question Answering with Haystack
Weaviate Podcast #20. Tuana Celik, a Developer Advocate at Deepset, presented many exciting ideas around Question Answering! We began with her Game of Thrones Question Answering Demo on HuggingFace Spaces and went on to discuss all things QA, from Extractive to Abstractive QA, the benefits of Retrieve-then-Read, and Zero-Shot Generalization, to give a quick preview. For our Weaviate users, please check out this demo from Laura Ham on how to use Haystack QA in tandem with the Weaviate Vector Search Database: https://www.youtube.com/watch?v=Bkoza.... I really hope you enjoy this podcast. Please don't forget to check the Chapters to see if any topics are particularly relevant to what you are working on! Please also see the links below with referenced content in the podcast!

Jul 8, 2022 • 48min
Etienne Dilocker on Weaviate v1.14 Release!
Weaviate Podcast #19. SeMI Technologies Co-Founder and CTO Etienne Dilocker returns to the Weaviate podcast to describe what's new with Weaviate v1.14! Please see the chapter outlines if you would like to skip ahead to the update most relevant to you! Please also see this blog post led by Sebastian Witalec describing the new changes to Weaviate! Weaviate v1.14 Blog Post: https://weaviate.io/blog/2022/07/Weav...