

NLP Highlights
Allen Institute for Artificial Intelligence
**The podcast is currently on hiatus. For more active NLP content, check out the Holistic Intelligence Podcast linked below.**
Welcome to the NLP highlights podcast, where we invite researchers to talk about their work in various areas in natural language processing. All views expressed belong to the hosts/guests, and do not represent their employers.
Welcome to the NLP highlights podcast, where we invite researchers to talk about their work in various areas in natural language processing. All views expressed belong to the hosts/guests, and do not represent their employers.
Episodes
Mentioned books

Aug 8, 2018 • 36min
64 - Neural Network Models for Sentence Pair Tasks, with Wuwei Lan and Wei Xu
Best reproduction paper at COLING 2018, by Wuwei Lan and Wei Xu.
This paper takes a bunch of models for sentence pair classification (including paraphrase identification, semantic textual similarity, natural language inference / entailment, and answer sentence selection for QA) and compares all of them on all tasks. There's a very nice table in the paper showing the cross product of models and datasets, and how by looking at the original papers this table is almost empty; Wuwei and Wei fill in all of the missing values in that table with their own experiments. This is a very nice piece of work that helps us gain a broader understanding of how these models perform in diverse settings, and it's awesome that COLING explicitly asked for and rewarded this kind of paper, as it's not your typical "come look at my shiny new model!" paper.
Our discussion with Wuwei and Wei covers what models and datasets the paper looked at, why the datasets can be treated similarly (and some reasons for why maybe they should be treated differently), the differences between the models that were tested, and the difficulties of reproducing someone else's model.
https://www.semanticscholar.org/paper/Neural-Network-Models-for-Paraphrase-Semantic-and-Lan-Xu/6c990c162816bff2133a8e0ed9719bd0f87ae9d9

Aug 2, 2018 • 30min
63 - Neural Lattice Language Models, with Jacob Buckman
TACL 2018 paper by Jacob Buckman and Graham Neubig.
Jacob tells us about marginalizing over latent structure in a sentence by doing a clever parameterization of a lattice with a model kind of like a tree LSTM. This lets you treat collocations as multi-word units, or allow words to have multiple senses, without having to commit to a particular segmentation or word sense disambiguation up front. We talk about how this works and what comes out. One interesting result that comes out of the sense lattice: learning word senses from a language modeling objective tends to give you senses that capture the mode of the "next word" distribution, like uses of "bank" that are always followed by "of". Helpful for local perplexity, but not really what you want if you're looking for semantic senses, probably.
https://www.semanticscholar.org/paper/Neural-Lattice-Language-Models-Buckman-Neubig/f36b961ea5106c19c341763bd9942c1f09038e5d

Jul 30, 2018 • 31min
62 - Sounding Board: A User-Centric and Content-Driven Social Chatbot, with Hao Fang
NAACL 2018 demo paper, by Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, and Mari Ostendorf
Sounding Board was the system that won the 2017 Amazon Alexa Prize, a competition to build a social chatbot that interacts with users as an Alexa skill. Hao comes on the podcast to tell us about the project. We talk for a little bit about how Sounding Board works, but spend most of the conversation talking about what these chatbots can do - the competition setup, some example interactions, the limits of current systems, and how chatbots might be more useful in the future. Even the best current systems seem pretty limited, but the potential future uses are compelling enough to warrant continued research.
https://www.semanticscholar.org/paper/Sounding-Board%3A-A-User-Centric-and-Content-Driven-Fang-Cheng/b540fd427a02b19c6ea55dd7d9758ebf15ec3965

Jul 23, 2018 • 31min
61 - Neural Text Generation in Stories, with Elizabeth Clark and Yangfeng Ji
NAACL 2018 Outstanding Paper by Elizabeth Clark, Yangfeng Ji, and Noah A. Smith
Both Elizabeth and Yangfeng come on the podcast to tell us about their work. This paper is an extension of an EMNLP 2017 paper by Yangfeng and co-authors that introduced a language model that included explicit entity representations. Elizabeth and Yangfeng take that model, improve it a bit, and use it for creative narrative generation, with some interesting applications. We talk a little bit about the model, but mostly about how the model was used to generate narrative text, how it was evaluated, and what other interesting applications there are of this idea. The punchline is that this model does a better job at generating coherent stories than other generation techniques, because it can track the entities in the story better.
We've been experimenting with how we record the audio, trying to figure out how to get better audio quality. Sadly, this episode was a failed experiment, and there is a background hiss that we couldn't get rid of. Bear with us as we work on this...
https://www.semanticscholar.org/paper/Generation-in-Stories-Using-Entity-Representations-Clark-Ji/56df50601975f6065b8acc0a08c169aaecad97bc

Jun 28, 2018 • 29min
60 - FEVER: a large-scale dataset for Fact Extraction and VERification, with James Thorne
NAACL 2018 paper by James Thorne, Andreas Vlachos, Christos Christodoulopoulos, and Arpit Mittal
James tells us about his paper, where they created a dataset for fact checking. We talk about how this dataset relates to other datasets, why a new one was needed, how it was built, and how well the initial baseline does on this task. There are some interesting side notes on bias in dataset construction, and on how "fact checking" relates to "fake news" ("fake news" could mean that an article is actively trying to deceive or mislead you; "fact checking" here is just determining if a single claim is true or false given a corpus of assumed-correct reference material). The baseline system does quite poorly, and the lowest-hanging fruit seems to be in improving the retrieval component that finds relevant supporting evidence for claims.
There's a workshop and shared task coming up on this dataset: http://fever.ai/. The shared task test period starts on July 24th - get your systems ready!
https://www.semanticscholar.org/paper/FEVER%3A-a-Large-scale-Dataset-for-Fact-Extraction-Thorne-Vlachos/7b1f840ecfafb94d2d9e6e926696dba7fad0bb88

Jun 12, 2018 • 35min
59 - Weakly Supervised Semantic Parsing With Abstract Examples, with Omer Goldman
ACL 2018 paper by Omer Goldman, Veronica Latcinnik, Udi Naveh, Amir Globerson, and Jonathan Berant
Omer comes on to tell us about a class project (done mostly by undergraduates!) that made it into ACL. Omer and colleagues built a semantic parser that gets state-of-the-art results on the Cornell Natural Language Visual Reasoning dataset. They did this by using "abstract examples" - they replaced the entities in the questions and corresponding logical forms with their types, labeled about a hundred examples in this abstracted formalism, and used those labels to do data augmentation and train their parser. They also used some interesting caching tricks, and a discriminative reranker.
https://www.semanticscholar.org/paper/Weakly-supervised-Semantic-Parsing-with-Abstract-Goldman-Latcinnik/5aec2ab5bf2979da067e2aa34762b589a0680030

Jun 8, 2018 • 47min
58 - Learning What’s Easy: Fully Differentiable Neural Easy-First Taggers, with André Martins
EMNLP 2017 paper by André F. T. Martins and Julia Kreutzer
André comes on the podcast to talk to us the paper. We spend the bulk of the time talking about the two main contributions of the paper: how they applied the notion of "easy first" decoding to neural taggers, and the details of the constrained softmax that they introduced to accomplish this. We conclude that "easy first" might not be the right name for this - it's doing something that in the end is very similar to stacked self-attention, with standard independent decoding at the end. The particulars of the self-attention are inspired by "easy first", however, using a constrained softmax to enforce some novel constraints on the self-attention.
https://www.semanticscholar.org/paper/Learning-What's-Easy%3A-Fully-Differentiable-Neural-Martins-Kreutzer/252571243aa4c0b533aa7fc63f88d07fd844e7bb

Jun 5, 2018 • 32min
57 - A Survey Of Cross-lingual Word Embedding Models, with Sebastian Ruder
Upcoming JAIR paper by Sebastian Ruder, Ivan Vulić, and Anders Søgaard.
Sebastian comes on to tell us about his survey. He creates a typology of cross-lingual word embedding methods, and we discuss why you might use cross-lingual embeddings (low-resource languages in particular), what information they capture (semantics? syntax? both?), how the methods work (lots of different ways), and how to evaluate the embeddings (best when you have an extrinsic task to evaluate on).
https://www.semanticscholar.org/paper/A-survey-of-cross-lingual-embedding-models-Ruder/3dbd28c63a7807280c9531735c715d4598024166

Apr 4, 2018 • 30min
56 - Deep contextualized word representations, with Matthew Peters
NAACL 2018 paper, by Matt Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Chris Clark, Kenton Lee, and Luke Zettlemoyer.
In this episode, AI2's own Matt Peters comes on the show to talk about his recent work on ELMo embeddings, what some have called "the next word2vec". Matt has shown very convincingly that using a pre-trained bidirectional language model to get contextualized word representations performs substantially better than using static word vectors. He comes on the show to give us some more intuition about how and why this works, and to talk about some of the other things he tried and what's coming next.
https://www.semanticscholar.org/paper/Deep-contextualized-word-representations-Peters-Neumann/4b17597b856c087f109381ce77d60d9017cb6f9a

Mar 28, 2018 • 32min
55 - Matchbox: Dispatch-driven autobatching for imperative deep learning, with James Bradbury
In this episode, we take a more systems-oriented approach to NLP, looking at issues with writing deep learning code for NLP models. As a lot of people have discovered over the last few years, efficiently batching multiple examples together for fast training on a GPU can be very challenging with complex NLP models. James Bradbury comes on to tell us about Matchbox, his recent effort to provide a framework for automatic batching with pytorch. In the discussion, we talk about why batching is hard, why it's important, how other people have tried to solve this problem in the past, and what James' solution to the problem is.
Code is available here: https://github.com/salesforce/matchbox