

Building a Unified NLP Framework at LinkedIn with Huiji Gao - #481
May 6, 2021
Huiji Gao, Senior Engineering Manager at LinkedIn, shares his passion for building sophisticated NLP tools, like the open-source DeText framework. He discusses how DeText revolutionized LinkedIn’s approach to model training and its broad applications across the company. The conversation highlights the synergy between DeText and LiBERT, optimized for LinkedIn's data. They delve into the challenges of model evaluation, the importance of user interaction in enhancing performance, and techniques for document ranking optimization.
AI Snips
Limitations of Keyword Matching
- LinkedIn's older search systems relied on keyword matching and skip-gram embeddings.
- These methods struggled to capture the contextual relationships between words, limiting search accuracy; the toy sketch below illustrates the limitation.
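The following is a toy illustration of that limitation: with static skip-gram-style vectors, a word's representation is the same in every context, and simple averaging ignores word order entirely. The vocabulary and vectors are made-up assumptions, not anything from LinkedIn's systems.

```python
import numpy as np

# Pretend skip-gram vectors: one fixed vector per word, independent of context (toy values).
static_vectors = {
    "bank": np.array([0.9, 0.1]),
    "river": np.array([0.2, 0.8]),
    "account": np.array([0.7, 0.3]),
}

def embed_query(words):
    """Average static word vectors, as a keyword/skip-gram baseline would."""
    return np.mean([static_vectors[w] for w in words], axis=0)

# "bank" should mean different things in these two queries, but its vector is
# identical in both, so the surrounding context is lost.
print(embed_query(["river", "bank"]), embed_query(["bank", "account"]))

# Averaging also ignores word order: both orderings produce the exact same embedding.
print(np.allclose(embed_query(["river", "bank"]), embed_query(["bank", "river"])))  # True
```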
DeText in a Distributed Context
- DeText is built to run in distributed training and serving environments.
- Its design prioritizes efficient training and inference, even with computationally heavy encoders like BERT; a minimal ranking sketch follows below.
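As a rough illustration of the kind of ranking setup described here (separate query and document encoders feeding interaction features into a scoring layer), below is a small PyTorch sketch. It is a hypothetical toy model, not DeText's actual API; a simple embedding average stands in for the CNN/LSTM/BERT encoders DeText supports.

```python
import torch
import torch.nn as nn

class TwoTowerRanker(nn.Module):
    """Toy representation-based ranker: encode query and document separately,
    combine interaction features, and score each query-document pair."""

    def __init__(self, vocab_size: int = 10_000, dim: int = 128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)   # stand-in for a real text encoder
        self.scorer = nn.Sequential(nn.Linear(dim * 3, 64), nn.ReLU(), nn.Linear(64, 1))

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(token_ids)

    def forward(self, query_ids: torch.Tensor, doc_ids: torch.Tensor) -> torch.Tensor:
        q, d = self.encode(query_ids), self.encode(doc_ids)
        # Interaction features between the query and document representations.
        features = torch.cat([q, d, q * d], dim=-1)
        return self.scorer(features).squeeze(-1)        # one relevance score per pair

# Toy token ids for one query scored against three candidate documents (hypothetical data).
model = TwoTowerRanker()
query = torch.randint(0, 10_000, (3, 6))    # the query, repeated once per candidate
docs = torch.randint(0, 10_000, (3, 40))
scores = model(query, docs)
print("ranked doc indices:", scores.argsort(descending=True).tolist())
```

Because the document tower is independent of the query, document embeddings can be precomputed offline, which is one common way to keep online inference cheap even when the encoder itself is expensive.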
LiBERT: A LinkedIn-Specific BERT Model
- LinkedIn trained its own BERT model, LiBERT, on LinkedIn data.
- This approach yields semantic representations better suited to LinkedIn's own text, along with a flexible model structure; a sketch of domain-specific pretraining follows below.
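Pretraining a BERT variant on in-domain text generally amounts to running the masked-language-model objective over that corpus. The sketch below shows the idea using the Hugging Face transformers library on made-up, LinkedIn-flavored sentences; the base checkpoint, corpus, and hyperparameters are illustrative assumptions, not LiBERT's actual recipe.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Domain-style text a pretraining corpus might contain (hypothetical examples).
corpus = [
    "Senior machine learning engineer with experience in search ranking.",
    "Hiring a data scientist to build recommendation systems at scale.",
]

# Tokenize, then let the collator apply random masking for the MLM objective.
examples = [tokenizer(text, truncation=True, max_length=64) for text in corpus]
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
batch = collator(examples)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
outputs = model(**batch)            # loss is computed against the masked tokens
outputs.loss.backward()
optimizer.step()
print(f"MLM loss on this batch: {outputs.loss.item():.3f}")
```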