How Keenious uses Weaviate to Enable Semantic Search through 60M+ Academic PUBs
Dec 13, 2021
Connor Shorten and Charles Pierse discuss how Keenious uses Weaviate for semantic search over academic publications. They explore neuro-symbolic search and serendipity in academic paper search, the role of ontologies and knowledge graphs, and the versatility and modularity of Weaviate. They also discuss the HNSW algorithm and pre-filtering, as well as the potential of automated scientific reviewing as a SaaS product.
Keenious aims to make scientific literature exploration fun and serendipitous, returning not just keyword matches but also interesting related results.
Keenious uses context-based search to help users discover relevant research beyond direct queries, mimicking the way people associate ideas.
The Weaviate search engine provides scalable, modular vector search, which suits it to large-scale vector search applications.
Deep dives
Keenious: A Scientific Literature Mining Tool
Keenious is a scientific literature mining tool for searching the scientific literature and finding relevant papers and information. It is a plug-in for Google Docs and Microsoft Word: users highlight text and query Keenious for relevant ideas. The tool takes a holistic approach to research, catering to researchers from undergraduates to PhDs. Keenious aims to make exploration fun and serendipitous, returning not just exact keyword matches but also interesting results that may not contain those keywords. It can handle documents of any size and supports searching by author, topic, and institution.
The Power of Neuro-Symbolic Search
Keenious employs neuro-symbolic search, combining keyword-based search with semantic, context-based search. The tool uses heuristics to infer user intent, distinguishing strict keyword searches from more exploratory searches for broader concepts. With that intent in hand, Keenious offers context-based search that helps users discover relevant research beyond direct queries. This approach mimics the way people associate ideas, where concepts are connected by context rather than solely by shared keywords. The goal is to help users explore and find research they might not have found through traditional query-based search.
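The episode summary does not say which heuristics Keenious uses, so the following is only a minimal sketch of the general idea: guess from the query's shape how much weight keyword matching should get, then blend a keyword score with a semantic score. The function names, thresholds, and weights are all hypothetical and chosen purely for illustration.

```python
import re


def keyword_weight(query: str) -> float:
    """Hypothetical heuristic: weight in [0, 1] given to keyword matching."""
    if re.search(r'"[^"]+"', query):
        return 0.9  # a quoted phrase suggests a strict keyword search
    n_words = len(query.split())
    if n_words <= 3:
        return 0.7  # short queries read like keyword lookups
    if n_words >= 30:
        return 0.1  # a long highlighted passage suggests exploration
    return 0.4      # in between: lean towards semantic matching


def blend(keyword_score: float, semantic_score: float, query: str) -> float:
    """Combine the two scores using the intent-dependent weight."""
    w = keyword_weight(query)
    return w * keyword_score + (1.0 - w) * semantic_score
```

With these made-up thresholds, a two-word query is scored mostly on keyword overlap, while a highlighted paragraph is scored almost entirely on semantic similarity.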
Challenges and Potential in Knowledge Graph Embeddings
Handling a large knowledge graph poses scalability and optimization challenges. Building the knowledge graph and generating embeddings requires custom infrastructure and efficient pipelines, and holding entities and relations in memory is both memory-intensive and time-consuming. Research is exploring ways to work with smaller data sets and to optimize the graph embedding process, but these challenges remain. Nonetheless, knowledge graph embeddings show promise for rich search, letting users explore relationships and run context-based searches across different kinds of entities.
Weaviate: Enabling Efficient Vector Search
The Weaviate search engine provides scalable, modular vector search, making it a suitable choice for use with knowledge graph embeddings. Its API supports plug-and-play components, so embedding models and indexes can be swapped easily. Weaviate is designed for practical business use cases, offering features such as post-filtering and pre-filtering to refine search results. Its distributed design supports scaling to large vector search applications, and its modular architecture leaves room for future upgrades and new indexing strategies as the vector search field advances.
Easy scalability and a plug-and-play search engine are a major benefit for the company
The podcast discusses the advantage of a search engine that scales easily and takes a plug-and-play approach. Being able to rely on the search engine means the team does not have to rebuild or re-engineer the entire system, which saves time and resources. The speaker highlights that moving to a new index takes only a small change, such as a few lines of code or an update to a YAML config. This flexibility is described as a significant win for the company's scaling efforts.
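To illustrate what "a config update to switch indexes" can look like in general, here is a minimal sketch in which the index implementation is chosen by a single config value. This is not Weaviate's API or Keenious's code: `FlatDotIndex`, `FlatL2Index`, `INDEXES`, and `build_index` are hypothetical names, and both indexes are brute-force stand-ins for real index types.

```python
import math
from typing import Dict, List, Sequence


class FlatDotIndex:
    """Brute-force index that ranks by dot product (higher is better)."""

    def __init__(self) -> None:
        self._items: Dict[str, Sequence[float]] = {}

    def add(self, doc_id: str, vector: Sequence[float]) -> None:
        self._items[doc_id] = vector

    def search(self, query: Sequence[float], k: int) -> List[str]:
        def score(i: str) -> float:
            return sum(q * x for q, x in zip(query, self._items[i]))
        return sorted(self._items, key=score, reverse=True)[:k]


class FlatL2Index(FlatDotIndex):
    """Brute-force index that ranks by Euclidean distance (lower is better)."""

    def search(self, query: Sequence[float], k: int) -> List[str]:
        def dist(i: str) -> float:
            return math.dist(query, self._items[i])
        return sorted(self._items, key=dist)[:k]


# Hypothetical registry: the config value selects the implementation.
INDEXES = {"flat_dot": FlatDotIndex, "flat_l2": FlatL2Index}


def build_index(config: dict):
    """Create the index named in the config; calling code stays unchanged."""
    return INDEXES[config["index_type"]]()


def top_hit(config: dict) -> str:
    index = build_index(config)
    index.add("a", [2.0, 2.0])
    index.add("b", [1.0, 0.0])
    return index.search([1.0, 0.0], k=1)[0]
```

Calling `top_hit({"index_type": "flat_dot"})` and `top_hit({"index_type": "flat_l2"})` runs the same application code against two different indexes; only the config value differs.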
The index's pre-search filtering is praised for its elegance and effectiveness
The podcast explores pre-search filtering in the index and emphasizes its elegance and efficiency. Unlike a post-search filter, which discards non-matching results after the vector search has run, a pre-search filter first builds an allow list, the set of entities eligible for inclusion, and restricts the search to it, so the search is more likely to return the desired results. The speaker contrasts this with the option of building an in-house post-search filtering system and prefers the pre-search approach. Pre-filtering is described as crucial for guaranteeing a sufficient number of relevant results when building a search engine.
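The difference is easiest to see on a toy example. The sketch below uses a brute-force cosine search as a stand-in for the real index (it is not HNSW and not Weaviate code), and the corpus, the `year` metadata, and the helper names are invented for illustration. Post-filtering takes the top k first and then drops non-matching results, while pre-filtering builds the allow list first and searches only within it.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k, allowed=None):
    """Brute-force nearest neighbours; `allowed` is an optional allow list of ids."""
    ids = [i for i in vectors if allowed is None or i in allowed]
    return sorted(ids, key=lambda i: cosine(query, vectors[i]), reverse=True)[:k]


# Toy corpus: paper id -> embedding, plus a metadata field to filter on.
vectors = {
    "p1": [1.0, 0.0],
    "p2": [0.9, 0.1],
    "p3": [0.8, 0.2],
    "p4": [0.1, 0.9],
    "p5": [0.0, 1.0],
}
year = {"p1": 2015, "p2": 2016, "p3": 2021, "p4": 2020, "p5": 2021}

query = [1.0, 0.0]
k = 2
recent = {i for i, y in year.items() if y >= 2020}  # the allow list

# Post-filter: search first, then drop results that fail the filter.
post_filtered = [i for i in top_k(query, vectors, k) if i in recent]

# Pre-filter: restrict the search to the allow list up front.
pre_filtered = top_k(query, vectors, k, allowed=recent)
```

Here the two nearest papers overall are both older than 2020, so the post-filtered list ends up empty, whereas the pre-filtered search still returns k results drawn from the allowed papers.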
Weaviate Podcast #2. Join Connor Shorten (Henry AI Labs) and Charles Pierse (Keenious) for the second Weaviate vector search engine podcast. During the show, they discuss how Keenious uses Weaviate and, more broadly, all things NLP!