

Building a Unified NLP Framework at LinkedIn with Huiji Gao - #481
May 6, 2021
Huiji Gao, Senior Engineering Manager at LinkedIn, shares his passion for building sophisticated NLP tools, like the open-source DeText framework. He discusses how DeText revolutionized LinkedIn’s approach to model training and its broad applications across the company. The conversation highlights the synergy between DeText and LiBERT, optimized for LinkedIn's data. They delve into the challenges of model evaluation, the importance of user interaction in enhancing performance, and techniques for document ranking optimization.
AI Snips
Limitations of Keyword Matching
- LinkedIn's older search systems relied on keyword matching and skip-gram embeddings.
- These methods struggled to capture the contextual relationships between words, limiting search accuracy; the toy sketch below illustrates the limitation.
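The following is a toy illustration of that limitation: with static skip-gram-style vectors, a word's representation is the same in every context, and simple averaging ignores word order entirely. The vocabulary and vectors are made-up assumptions, not anything from LinkedIn's systems.

```python
import numpy as np

# Pretend skip-gram vectors: one fixed vector per word, independent of context (toy values).
static_vectors = {
    "bank": np.array([0.9, 0.1]),
    "river": np.array([0.2, 0.8]),
    "account": np.array([0.7, 0.3]),
}

def embed_query(words):
    """Average static word vectors, as a keyword/skip-gram baseline would."""
    return np.mean([static_vectors[w] for w in words], axis=0)

# "bank" should mean different things in these two queries, but its vector is
# identical in both, so the surrounding context is lost.
print(embed_query(["river", "bank"]), embed_query(["bank", "account"]))

# Averaging also ignores word order: both orderings produce the exact same embedding.
print(np.allclose(embed_query(["river", "bank"]), embed_query(["bank", "river"])))  # True
```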
DeText in a Distributed Context
- DeText is built to run in distributed training and serving environments.
- Its design prioritizes efficient training and inference, even with computationally heavy encoders like BERT; a minimal ranking sketch follows below.
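As a rough illustration of the kind of ranking setup described here (separate query and document encoders feeding interaction features into a scoring layer), below is a small PyTorch sketch. It is a hypothetical toy model, not DeText's actual API; a simple embedding average stands in for the CNN/LSTM/BERT encoders DeText supports.

```python
import torch
import torch.nn as nn

class TwoTowerRanker(nn.Module):
    """Toy representation-based ranker: encode query and document separately,
    combine interaction features, and score each query-document pair."""

    def __init__(self, vocab_size: int = 10_000, dim: int = 128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)   # stand-in for a real text encoder
        self.scorer = nn.Sequential(nn.Linear(dim * 3, 64), nn.ReLU(), nn.Linear(64, 1))

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(token_ids)

    def forward(self, query_ids: torch.Tensor, doc_ids: torch.Tensor) -> torch.Tensor:
        q, d = self.encode(query_ids), self.encode(doc_ids)
        # Interaction features between the query and document representations.
        features = torch.cat([q, d, q * d], dim=-1)
        return self.scorer(features).squeeze(-1)        # one relevance score per pair

# Toy token ids for one query scored against three candidate documents (hypothetical data).
model = TwoTowerRanker()
query = torch.randint(0, 10_000, (3, 6))    # the query, repeated once per candidate
docs = torch.randint(0, 10_000, (3, 40))
scores = model(query, docs)
print("ranked doc indices:", scores.argsort(descending=True).tolist())
```

Because the document tower is independent of the query, document embeddings can be precomputed offline, which is one common way to keep online inference cheap even when the encoder itself is expensive.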
LiBERT: A LinkedIn-Specific BERT Model
- LinkedIn trained its own BERT model, LiBERT, on LinkedIn data.
- This approach yields semantic representations better suited to LinkedIn's own text, along with a flexible model structure; a sketch of domain-specific pretraining follows below.
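Pretraining a BERT variant on in-domain text generally amounts to running the masked-language-model objective over that corpus. The sketch below shows the idea using the Hugging Face transformers library on made-up, LinkedIn-flavored sentences; the base checkpoint, corpus, and hyperparameters are illustrative assumptions, not LiBERT's actual recipe.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Domain-style text a pretraining corpus might contain (hypothetical examples).
corpus = [
    "Senior machine learning engineer with experience in search ranking.",
    "Hiring a data scientist to build recommendation systems at scale.",
]

# Tokenize, then let the collator apply random masking for the MLM objective.
examples = [tokenizer(text, truncation=True, max_length=64) for text in corpus]
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
batch = collator(examples)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
outputs = model(**batch)            # loss is computed against the masked tokens
outputs.loss.backward()
optimizer.step()
print(f"MLM loss on this batch: {outputs.loss.item():.3f}")
```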