Weaviate Podcast

CEO Han Xiao From Jina AI

Mar 15, 2022
Han Xiao, Founder and CEO of Jina AI, shares insights into the evolving world of neural search. He discusses the early experiences at Zalando and Tencent that fueled his interest in the technology. Han then covers building effective neural search pipelines, including hierarchical embeddings for images and the DocumentArray data structure. He outlines the foundations of Jina Hub and how developers can publish their workflows to it. Lastly, he touches on the challenges of running an open-source company and the future of multimodal search.
ANECDOTE

From Frankenstein Models To Microservices

  • Han Xiao describes building early "Frankenstein" neural search models at Zalando and later refactoring Tencent's Elasticsearch into microservices.
  • These experiences led him to design Jina as a microservice-first neural search framework and, later, to found Jina AI.
INSIGHT

Two Pillars And The Devil In The Details

  • Neural search needs two pillars: strong representations from ML models and fast vector retrieval/storage systems.
  • Preprocessing and postprocessing steps (segmentation, hierarchical embeddings, score aggregation) are where much of the search quality is decided.
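
To make the two pillars and the postprocessing step concrete, here is a minimal sketch in plain Python. It is not Jina's implementation: the toy two-dimensional vectors stand in for model embeddings, the brute-force loop stands in for a vector index, and the names `INDEX`, `cosine`, and `search` are hypothetical. Each document is segmented into chunks, every chunk is scored against the query, and the chunk scores are aggregated back to the parent document by taking the maximum.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical index: one entry per chunk (segment), stored with its parent id.
# Real embeddings would come from an ML model; these are toy values.
INDEX = [
    ("doc_a", [1.0, 0.0]),
    ("doc_a", [0.0, 1.0]),
    ("doc_b", [0.7, 0.7]),
]

def search(query_chunks, index, top_k=2):
    """Score every query chunk against every indexed chunk (brute force),
    then keep the best chunk score per parent document."""
    best = {}
    for q in query_chunks:
        for parent_id, vec in index:
            score = cosine(q, vec)
            best[parent_id] = max(best.get(parent_id, -1.0), score)
    # Rank parent documents by their aggregated score, highest first.
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

results = search([[1.0, 0.1]], INDEX)
for parent_id, score in results:
    print(parent_id, round(score, 3))
```

Max aggregation is only one choice. Summing or averaging chunk scores ranks documents differently, which is one reason these details affect search quality.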
INSIGHT

Recursive Document Structure Matters

  • Documents should be recursive and nested: documents contain sub-documents and nearest-neighbor relations at multiple levels.
  • This hierarchical structure, with vertical nesting (sub-documents) and horizontal links (nearest neighbors), simplifies matching and score aggregation for complex objects.
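
The recursive structure described above can be sketched with a small dataclass. This is an illustration under stated assumptions, not the Jina or DocArray API: the `Doc` class, its `chunks` and `matches` fields, and the `walk` and `aggregate` helpers are hypothetical names chosen for this example. Nesting goes through `chunks`, nearest-neighbor links go through `matches`, and the same traversal works at any depth.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

@dataclass
class Doc:
    """Hypothetical recursive document: sub-documents plus scored neighbors."""
    id: str
    chunks: List["Doc"] = field(default_factory=list)                 # vertical nesting
    matches: List[Tuple["Doc", float]] = field(default_factory=list)  # horizontal links

def walk(doc: Doc, depth: int = 0) -> Iterator[Tuple[int, Doc]]:
    """Yield (depth, node) for the document and all nested chunks, depth first."""
    yield depth, doc
    for chunk in doc.chunks:
        yield from walk(chunk, depth + 1)

def aggregate(doc: Doc) -> Dict[str, float]:
    """Sum match scores found at every level, keyed by the matched document id."""
    totals: Dict[str, float] = {}
    for _, node in walk(doc):
        for matched, score in node.matches:
            totals[matched.id] = totals.get(matched.id, 0.0) + score
    return totals

other, third = Doc("other"), Doc("third")
p1 = Doc("p1", matches=[(other, 0.5)])
p2 = Doc("p2", matches=[(other, 0.25), (third, 0.5)])
page = Doc("page", chunks=[p1, p2])

print([(depth, node.id) for depth, node in walk(page)])
print(aggregate(page))
```

Because every node has the same shape, matching at the chunk level and rolling scores up to the root uses one traversal rather than special cases per level.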