Jay Alammar, renowned AI educator and researcher at Cohere, discusses the latest developments in large language models (LLMs) and their applications in industry. Jay shares his expertise on retrieval-augmented generation (RAG), semantic search, and the future of AI architectures.
MLST is sponsored by Brave:
The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval-augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.
Cohere Command R model series: https://cohere.com/command
Jay Alammar:
https://x.com/jayalammar
Buy Jay's new book here!
Hands-On Large Language Models: Language Understanding and Generation
https://amzn.to/4fzOUgh
TOC:
00:00:00 Introduction to Jay Alammar and AI Education
00:01:47 Cohere's Approach to RAG and AI Re-ranking
00:07:15 Implementing AI in Enterprise: Challenges and Solutions
00:09:26 Jay's Role at Cohere and the Importance of Learning in Public
00:15:16 The Evolution of AI in Industry: From Deep Learning to LLMs
00:26:12 Expert Advice for Newcomers in Machine Learning
00:32:39 The Power of Semantic Search and Embeddings in AI Systems
00:37:59 Jay Alammar's Journey as an AI Educator and Visualizer
00:43:36 Visual Learning in AI: Making Complex Concepts Accessible
00:47:38 Strategies for Keeping Up with Rapid AI Advancements
00:49:12 The Future of Transformer Models and AI Architectures
00:51:40 Evolution of the Transformer: From 2017 to Present
00:54:19 Preview of Jay's Upcoming Book on Large Language Models
Disclaimer: This is the fourth video from our Cohere partnership. We were not told what to say in the interview, and didn't edit anything out from the interview. Note also that this combines several previously unpublished interviews with Jay into one: the earlier one was shot at Tim's house in Aug 2023, and the more recent one in Toronto in May 2024.
Refs:
The Illustrated Transformer
https://jalammar.github.io/illustrated-transformer/
Attention Is All You Need
https://arxiv.org/abs/1706.03762
The Unreasonable Effectiveness of Recurrent Neural Networks
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Neural Networks in 11 Lines of Code
https://iamtrask.github.io/2015/07/12/basic-python-network/
Understanding LSTM Networks (Chris Olah's blog post)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Luis Serrano's YouTube Channel
https://www.youtube.com/channel/UCgBncpylJ1kiVaPyP-PZauQ
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
https://arxiv.org/abs/1908.10084
GPT (Generative Pre-trained Transformer) models
https://jalammar.github.io/illustrated-gpt2/
https://openai.com/research/gpt-4
BERT (Bidirectional Encoder Representations from Transformers)
https://jalammar.github.io/illustrated-bert/
https://arxiv.org/abs/1810.04805
RoPE (Rotary Position Embedding)
https://arxiv.org/abs/2104.09864 (Linked paper discussing rotary embeddings)
Grouped Query Attention
https://arxiv.org/pdf/2305.13245
RLHF (Reinforcement Learning from Human Feedback)
https://openai.com/research/learning-from-human-preferences
https://arxiv.org/abs/1706.03741
DPO (Direct Preference Optimization)
https://arxiv.org/abs/2305.18290