Yannic Kilcher Videos (Audio Only)

Yannic Kilcher

I make videos about machine learning research papers, programming, and issues of the AI community, and the broader impact of AI in society.

Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF

If you want to support me, the best thing to do is to share out the content :)

If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

Episodes

Mentioned books

Aug 28, 2023 • 1h 2min

RWKV: Reinventing RNNs for the Transformer Era (Paper Explained)

#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs. Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc OUTLINE: 0:00 - Introduction 1:50 - Fully Connected In-Person Conference in SF June 7th 3:00 - Transformers vs RNNs 8:00 - RWKV: Best of both worlds 12:30 - LSTMs 17:15 - Evolution of RWKV's Linear Attention 30:40 - RWKV's Layer Structure 49:15 - Time-Parallel vs Sequence Mode 53:55 - Experimental Results & Limitations 58:00 - Visualizations 1:01:40 - Conclusion Paper: https://arxiv.org/abs/2305.13048 Code: https://github.com/BlinkDL/RWKV-LM Abstract: Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability. We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs. Our approach leverages a linear attention mechanism and allows us to formulate the model as either a Transformer or an RNN, which parallelizes computations during training and maintains constant computational and memory complexity during inference, leading to the first non-transformer architecture to be scaled to tens of billions of parameters. Our experiments reveal that RWKV performs on par with similarly sized Transformers, suggesting that future work can leverage this architecture to create more efficient models. This work presents a significant step towards reconciling the trade-offs between computational efficiency and model performance in sequence processing tasks. Authors: Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Xiangru Tang, Bolun Wang, Johan S. Wind, Stansilaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 28, 2023 • 29min

Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting. OUTLINE: 0:00 - Introduction 1:20 - From Chain-of-Thought to Tree-of-Thought 11:10 - Formalizing the algorithm 16:00 - Game of 24 & Creative writing 18:30 - Crosswords 23:30 - Is this a general problem solver? 26:50 - Ablation studies 28:55 - Conclusion Paper: https://arxiv.org/abs/2305.10601 Abstract: Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL. Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 28, 2023 • 16min

OpenAI suggests AI licenses (US Senate hearing on AI regulation w/ Sam Altman)

#ai #openai #gpt4 US Senate hearing on AI regulation. MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4 Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 28, 2023 • 39min

[ML News] Geoff Hinton leaves Google | Google has NO MOAT | OpenAI down half a billion

Aug 28, 2023 • 25min

The biggest week in AI (GPT-4, Office Copilot, Google PaLM, Anthropic Claude & more)

#mlnews #gpt4 #copilot Your weekly news all around the AI world Check out W&B courses (free): https://wandb.courses/ OUTLINE: 0:00 - Intro 0:20 - GPT-4 announced! 4:30 - GigaGAN: The comeback of Generative Adversarial Networks 7:55 - ChoppedAI: AI Recipes 8:45 - Samsung accused of faking space zoom effect 14:00 - Weights & Biases courses are free 16:55 - Data Portraits 18:50 - Data2Vec 2.0 19:50 - Gated Models on Hugging Face & huggingface.js 22:05 - Visual ChatGPT 23:35 - Bing crosses 100 million daily active users 24:50 - Casual Conversations Dataset 25:50 - Anthropic AI Safety Research 27:30 - Magnushammer & more advances in AI-assisted math 30:30 - LLaMA license change PR 32:00 - Self-Instruct dataset 33:35 - PaLM-E: Multimodal Pathways 35:45 - USM: Universal Speech Model 37:25 - GILGEN: Grounded Text-to-Image 39:55 - Fruit Fly Connectome released References: https://www.heise.de/news/GPT-4-kommt-naechste-Woche-und-es-wird-multimodal-Vorankuendigung-von-Microsoft-7540383.htmlhttps://mingukkang.github.io/GigaGAN/https://www.choppedai.com/https://www.reddit.com/r/Android/comments/11nzrb0/samsung_space_zoom_moon_shots_are_fake_and_here/https://imgur.com/ULVX933https://imgur.com/9XMgt06https://imgur.com/9kichAphttps://imgur.com/RSHAz1lhttps://imgur.com/PIAjVKphttps://imgur.com/xEyLajWhttps://imgur.com/3STX9mZhttps://imgur.com/ifIHr3Shttps://imgur.com/bXJOZgIhttps://dataportraits.org/https://arxiv.org/abs/2303.03919https://arxiv.org/pdf/2303.03919.pdfhttps://ai.facebook.com/blog/ai-self-supervised-learning-data2vec/https://github.com/facebookresearch/fairseq/tree/main/examples/data2vechttps://huggingface.co/docs/hub/models-gatedhttps://huggingface.co/abouthttps://github.com/huggingface/huggingface.js?utm_source=pocket_readerhttps://github.com/microsoft/visual-chatgpthttps://arxiv.org/abs/2303.04671https://github.com/microsoft/visual-chatgpt/blob/main/visual_chatgpt.pyhttps://huggingface.co/spaces/RamAnanth1/visual-chatGPThttps://www.engadget.com/microsoft-bing-crossed-100-million-daily-active-users-080138371.htmlhttps://ai.facebook.com/blog/casual-conversations-v2-dataset-measure-fairness/https://ai.facebook.com/datasets/casual-conversations-v2-dataset/https://www.anthropic.com/index/core-views-on-ai-safetyhttps://arxiv.org/abs/2303.04488https://arxiv.org/pdf/2303.04488.pdfhttps://arxiv.org/abs/2303.04910https://arxiv.org/pdf/2303.04910.pdfhttps://twitter.com/astro_wassim/status/1633645134934949888https://ai.papers.bar/paper/ede58b1ebca911ed8f9c3d8021bca7c8https://arxiv.org/pdf/2303.03192.pdfhttps://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misusehttps://knightcolumbia.org/blog/the-llama-is-out-of-the-bag-should-we-expect-a-tidal-wave-of-disinformationhttps://github.com/facebookresearch/llama/pull/184https://huggingface.co/datasets/yizhongw/self_instructhttps://openai.com/policies/terms-of-usehttps://palm-e.github.io/https://pickapic.io/https://ai.googleblog.com/2023/03/universal-speech-model-usm-state-of-art.htmlhttps://arxiv.org/abs/2303.01037https://github.com/BlinkDL/RWKV-LM?utm_source=pocket_readerhttps://gligen.github.io/https://github.com/microsoft/GLIPhttps://arxiv.org/abs/2301.07093https://huggingface.co/spaces/gligen/demohttps://www.sciencealert.com/the-first-ever-complete-map-of-an-insect-brain-is-truly-mesmerizinghttps://en.wikipedia.org/wiki/Tidal_locking Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :)

Aug 28, 2023 • 34min

GPT-4 is here! What we know so far (Full Analysis)

#gpt4 #chatgpt #openai References: https://openai.com/product/gpt-4https://openai.com/research/gpt-4https://cdn.openai.com/papers/gpt-4.pdf Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 28, 2023 • 43min

This ChatGPT Skill will earn you $10B (also, AI reads your mind!)

#mlnews #chatgpt #llama ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner! ERRATA: It's a 4090, not a 4090 ti 🙃 OUTLINE: 0:00 - Introduction 0:20 - GTC 23 on March 20 1:55 - ChatGPT API is out! 4:50 - OpenAI becomes more business-friendly 7:15 - OpenAI plans for AGI 10:00 - ChatGPT influencers 12:15 - Open-Source Prompting Course 12:35 - Flan UL2 20B 13:30 - LLaMA weights leaked 15:50 - Mind-Reading from fMRI 20:10 - Random News / Helpful Things 25:30 - Interview with Bryan Catanzaro Participate in the GTC Raffle: https://ykilcher.com/gtc References: GTC 23 on March 20 https://www.nvidia.com/gtc/https://ykilcher.com/gtc ChatGPT API is out! https://twitter.com/gdb/status/1630991925984755714https://openai.com/blog/introducing-chatgpt-and-whisper-apishttps://twitter.com/greggyb/status/1631121912679002112https://www.haihai.ai/chatgpt-api/ OpenAI becomes more business-friendly https://twitter.com/sama/status/1631002519311888385https://techcrunch.com/2023/02/21/openai-foundry-will-let-customers-buy-dedicated-capacity-to-run-its-ai-models/?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAFL1O8s22qBsEtytYZWR7O2VlTa9nAGhdZPFfeQfZCDWjkNBIac7WlDikRNLEH1tqSszUN02ouqRyyCsShDa1kQyUbiApD1IUPfgmHXZxgIMFxr8bwr8BuBa7sK55dYqMRFFbE7YILuBn_rmj7aJI1tp7GAXubODfCUaKvOkoOYjhttps://www.bain.com/vector-digital/partnerships-alliance-ecosystem/openai-alliance/ OpenAI plans for AGI https://openai.com/blog/planning-for-agi-and-beyond ChatGPT influencers https://www.youtube.com/watch?v=4kp7oVTu9Ckhttps://www.youtube.com/watch?v=k13v8jp8H5ohttps://www.linkedin.com/posts/eniascailliau_create-an-online-course-100-ai-ugcPost-7036969935796891648-H_uj/https://www.linkedin.com/posts/linasbeliunas_must-know-ai-tools-ugcPost-7035700089947836416-Qri4/https://twitter.com/LinusEkenstam/status/1629879567514238976https://www.linkedin.com/posts/imarpit_50-awesome-chatgpt-prompts-ugcPost-7036905788631646209-2CU-/ Open-Source Prompting Course https://learnprompting.org/ Flan UL2 20B https://www.yitay.net/blog/flan-ul2-20bhttps://huggingface.co/google/flan-ul2 LLaMA weights leaked https://github.com/facebookresearch/llama/pull/73https://github.com/facebookresearch/llama/pull/73/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5https://github.com/ChristopherKing42https://open-assistant.io/dashboard Mind-Reading from fMRI https://sites.google.com/view/stablediffusion-with-brain/?s=09https://www.nature.com/articles/s41562-022-01516-2?utm_content=animation Random News https://www.wired.com/story/alphabet-layoffs-hit-trash-sorting-robots/https://huggingface.co/blog/fast-mac-diffusershttps://pyribs.org/https://twitter.com/rowancheung/status/1630569844654460928https://pimeyes.com/enhttps://cacti-framework.github.io/https://twitter.com/bhutanisanyam1/status/1630980866775330819https://www.linkedin.com/in/bryancatanzaro/ Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app