

Yannic Kilcher Videos (Audio Only)
Yannic Kilcher
I make videos about machine learning research papers, programming, issues of the AI community, and the broader impact of AI on society.
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
If you want to support me, the best thing to do is to share the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar (preferred to Patreon): https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Episodes

May 3, 2021 • 51min
Multimodal Neurons in Artificial Neural Networks (w/ OpenAI Microscope, Research Paper Explained)
#openai #clip #microscope
OpenAI conducts an extensive investigation into the inner workings of their recent CLIP model via faceted feature visualization and finds amazing things: some neurons in the last layer respond to distinct concepts across multiple modalities, meaning they fire for photographs, drawings, and signs depicting the same concept, even when the images look nothing alike. Through manual examination, the authors identify and investigate neurons corresponding to persons, geographical regions, religions, emotions, and much more. In this video, I go through the publication and then present my own findings from digging around in the OpenAI Microscope.
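For intuition, here is a minimal sketch of the activation-maximization idea that feature visualization (and the paper's faceted variant) builds on: start from noise and follow the gradient that increases one unit's activation. Everything concrete is a placeholder here: an ImageNet ResNet stands in for CLIP's image tower, and the layer and unit are arbitrary choices, not the paper's setup.

```python
# Hedged sketch: basic activation maximization (not the paper's exact
# faceted procedure; model, layer, and unit are placeholder choices).
import torch
import torchvision.models as models

model = models.resnet50(weights="IMAGENET1K_V1").eval()

activation = {}
def hook(module, inputs, output):
    activation["feat"] = output

# Hook an arbitrary late layer and pick an arbitrary channel.
handle = model.layer4.register_forward_hook(hook)
unit = 42

# Start from noise and ascend the gradient of the unit's mean activation.
img = torch.randn(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)
for _ in range(256):
    opt.zero_grad()
    model(img)
    # Negative because we maximize; the paper adds regularizers and
    # "facets" (e.g. steering toward faces or text) on top of this.
    loss = -activation["feat"][0, unit].mean()
    loss.backward()
    opt.step()

handle.remove()  # img now (crudely) shows what unit 42 responds to
```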
OUTLINE:
0:00 - Intro & Overview
3:35 - OpenAI Microscope
7:10 - Categories of found neurons
11:10 - Person Neurons
13:00 - Donald Trump Neuron
17:15 - Emotion Neurons
22:45 - Region Neurons
26:40 - Sparse Mixture of Emotions
28:05 - Emotion Atlas
29:45 - Adversarial Typographic Attacks
31:55 - Stroop Test
33:10 - My Findings in OpenAI Microscope
33:30 - Superman Neuron
33:50 - Resting B*tchface Neuron
34:10 - Trash Bag Neuron
35:25 - God Weightlifting Neuron
36:40 - Organ Neuron
38:35 - Film Spool Neuron
39:05 - Feather Neuron
39:20 - Spartan Neuron
40:25 - Letter E Neuron
40:35 - Cleanin Neuron
40:45 - Frown Neuron
40:55 - Lion Neuron
41:05 - Fashion Model Neuron
41:20 - Baseball Neuron
41:50 - Bride Neuron
42:00 - Navy Neuron
42:30 - Hemp Neuron
43:25 - Staircase Neuron
43:45 - Disney Neuron
44:15 - Hillary Clinton Neuron
44:50 - God Neuron
45:15 - Blurry Neuron
45:35 - Arrow Neuron
45:55 - Trophy Presentation Neuron
46:10 - Receding Hairline Neuron
46:30 - Traffic Neuron
46:40 - Raised Hand Neuron
46:50 - Google Maps Neuron
47:15 - Nervous Smile Neuron
47:30 - Elvis Neuron
47:55 - The Flash Neuron
48:05 - Beard Neuron
48:15 - Kilt Neuron
48:25 - Rainy Neuron
48:35 - Electricity Neuron
48:50 - Droplets Neuron
49:00 - Escape Neuron
49:25 - King Neuron
49:35 - Country Neuron
49:45 - Overweight Men Neuron
49:55 - Wedding Neuron
50:05 - Australia Neuron
50:15 - Yawn Neuron
50:30 - Bees & Simpsons Neuron
50:40 - Mussels Neuron
50:50 - Spice Neuron
51:00 - Conclusion
Paper: https://distill.pub/2021/multimodal-n...
My Findings: https://www.notion.so/CLIP-OpenAI-Mic...
My Video on CLIP: https://youtu.be/T9XSU0pKX2E
My Video on Feature Visualizations & The OpenAI Microscope: https://youtu.be/Ok44otx90D4
Abstract:
In 2005, a letter published in Nature described human neurons responding to specific people, such as Jennifer Aniston or Halle Berry. The exciting thing wasn’t just that they selected for particular people, but that they did so regardless of whether they were shown photographs, drawings, or even images of the person’s name. The neurons were multimodal. As the lead author would put it: "You are looking at the far end of the transformation from metric, visual shapes to conceptual... information." We report the existence of similar multimodal neurons in artificial neural networks. This includes neurons selecting for prominent public figures or fictional characters, such as Lady Gaga or Spiderman. Like the biological multimodal neurons, these artificial neurons respond to the same subject in photographs, drawings, and images of their name.
Authors: Gabriel Goh, Nick Cammarata, Chelsea Voss, Shan Carter, Michael Petrov, Ludwig Schubert, Alec Radford, Chris Olah

May 3, 2021 • 16min
Machine Learning PhD Survival Guide 2021 | Advice on Topic Selection, Papers, Conferences & more!
#machinelearning #phd #howto
This video is advice for new PhD students in the field of Machine Learning in 2021 and beyond. The field has shifted dramatically in the last few years, and navigating grad school can be very hard, especially when you're as clueless as I was when I started. The video is a personal account of my mistakes and what I've learned from them. If you already have several published papers and know what to do, this video is not for you. However, if you are not even sure where to start, how to select a topic, or what goes in a paper, you might benefit from it, because that's exactly how I felt.
Main Takeaways:
- Select niche topics rather than hype topics
- Write papers that can't be rejected
- Don't be discouraged by bad reviews
- Take reviewing & teaching seriously
- Keep up your focus
- Conferences are for networking
- Internships are great opportunities
- Team up with complementary skills
- Don't work too hard
OUTLINE:
0:00 - Intro & Overview
1:25 - Thesis Topic Selection
4:25 - How To Publish Papers
5:35 - Dealing With Reviewers
6:30 - How To Be A Reviewer
7:40 - Take Teaching Seriously
8:30 - Maintain Focus
10:20 - Navigating Conferences
12:40 - Internships
13:40 - Collaborations
14:55 - Don't Forget To Enjoy
Transcript: https://www.notion.so/Yannic-Kilcher-...
Credits to Lanz for editing
Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/
BiliBili: https://space.bilibili.com/1824646584
If you want to support me, the best thing to do is to share the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannickilcher
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

May 3, 2021 • 24min
PAIR AI Explorables | Is the problem in the data? Examples on Fairness, Diversity, and Bias.
In the recurring debate about bias in Machine Learning models, there is a growing argument that "the problem is not in the data", often citing the influence of other choices such as the loss function or the network architecture. In this video, we take a look at PAIR's AI Explorables through the lens of whether or not the bias problem is a data problem.
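As a concrete reference point, here is a tiny sketch of the group-fairness quantities the "Measuring Fairness" explorable visualizes: positive rate (demographic parity) and true/false positive rates (equalized odds) under one shared decision threshold. The data, and the bias injected into the score, are entirely synthetic assumptions for illustration.

```python
# Synthetic illustration of threshold-based group-fairness metrics.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, n)   # two demographic groups
label = rng.integers(0, 2, n)   # true condition
# Assumed bias: the score is slightly inflated for group 1.
score = 0.5 * label + 0.1 * group + rng.normal(0, 0.25, n)
pred = score > 0.5              # one shared decision threshold

for g in (0, 1):
    m = group == g
    ppr = pred[m].mean()                 # positive rate (demographic parity)
    tpr = pred[m & (label == 1)].mean()  # true positive rate
    fpr = pred[m & (label == 0)].mean()  # false positive rate
    print(f"group {g}: positive rate {ppr:.2f}, TPR {tpr:.2f}, FPR {fpr:.2f}")
```

The point the explorable makes is that such criteria generally cannot all be equalized at once by tuning the threshold when base rates differ between groups.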
OUTLINE:
0:00 - Intro & Overview
1:45 - Recap: Bias in ML
4:25 - AI Explorables
5:40 - Measuring Fairness Explorable
11:00 - Hidden Bias Explorable
16:10 - Measuring Diversity Explorable
23:00 - Conclusion & Comments
AI Explorables: https://pair.withgoogle.com/explorables/

May 3, 2021 • 48min
DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
#dreamcoder #programsynthesis #symbolicreasoning
Classic Machine Learning struggles with few-shot generalization on tasks where humans easily generalize from just a handful of examples, such as sorting a list of numbers. Humans do this by coming up with a short program, or algorithm, that explains the few data points in a compact way. DreamCoder emulates this by performing neurally guided search over a language of primitives, a library, that it builds up over time. It can thereby iteratively construct more and more complex programs by building on its own abstractions, solving more and more difficult tasks in a few-shot manner by generating very short programs that fit the few given data points. The resulting system not only generalizes quickly but also delivers an explainable solution to its problems in the form of a modular and hierarchical learned library. Combining this with classic Deep Learning for low-level perception is a very promising future direction.
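To make the loop concrete, here is a deliberately tiny, runnable caricature of the wake and abstraction phases: brute-force enumeration stands in for neural guided search, and a solved task's program is compressed into a new named library primitive so related tasks later need shallower searches. All names are invented for illustration; this is not the API of the linked ellisk42/ec code.

```python
# Toy caricature of DreamCoder's wake + abstraction phases.
from itertools import product

library = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
}

def compose(fns):
    def run(x):
        for f in fns:
            x = f(x)
        return x
    return run

def search(examples, max_depth=3):
    # Wake phase: find a short program explaining the few examples.
    # (Brute force here; DreamCoder uses neurally guided enumeration.)
    for depth in range(1, max_depth + 1):
        for names in product(library, repeat=depth):
            prog = compose([library[n] for n in names])
            if all(prog(x) == y for x, y in examples):
                return names
    return None

task = [(1, 4), (3, 8), (0, 2)]      # few-shot spec of y = 2 * (x + 1)
solution = search(task)
print("found:", solution)            # ('inc', 'double')

# Abstraction phase: promote the found program to a library primitive,
# so future searches can build on it at lower depth.
if solution is not None:
    library["inc_then_double"] = compose([library[n] for n in solution])
```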
OUTLINE:
0:00 - Intro & Overview
4:55 - DreamCoder System Architecture
9:00 - Wake Phase: Neural Guided Search
19:15 - Abstraction Phase: Extending the Internal Library
24:30 - Dreaming Phase: Training Neural Search on Fictional Programs and Replays
30:55 - Abstraction by Compressing Program Refactorings
32:40 - Experimental Results on LOGO Drawings
39:00 - Ablation Studies
39:50 - Re-Discovering Physical Laws
42:25 - Discovering Recursive Programming Algorithms
44:20 - Conclusions & Discussion
Paper: https://arxiv.org/abs/2006.08381
Code: https://github.com/ellisk42/ec
Abstract:
Expert problem-solving is driven by powerful languages for thinking about problems and their solutions. Acquiring expertise means learning these languages -- systems of concepts, alongside the skills to use them. We present DreamCoder, a system that learns to solve problems by writing programs. It builds expertise by creating programming languages for expressing domain concepts, together with neural networks to guide the search for programs within these languages. A "wake-sleep" learning algorithm alternately extends the language with new symbolic abstractions and trains the neural network on imagined and replayed problems. DreamCoder solves both classic inductive programming tasks and creative tasks such as drawing pictures and building scenes. It rediscovers the basics of modern functional programming, vector algebra and classical physics, including Newton's and Coulomb's laws. Concepts are built compositionally from those learned earlier, yielding multi-layered symbolic representations that are interpretable and transferrable to new tasks, while still growing scalably and flexibly with experience.
Authors: Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, Joshua B. Tenenbaum

May 2, 2021 • 34min
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ML Research Paper Explained)
#nerf #neuralrendering #deeplearning
View synthesis is a tricky problem, especially when only a sparse set of images is given as input. NeRF embeds an entire scene into the weights of a feedforward neural network, trained by backpropagation through a differentiable volume rendering procedure, and achieves state-of-the-art view synthesis. It includes view-direction dependence and is able to capture fine structural details, as well as reflection effects and transparency.
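The two load-bearing ideas are compact enough to sketch: the positional encoding applied to input coordinates, and the volume-rendering quadrature that composites predicted densities and colors along a ray. Below is a minimal stand-in with an untrained toy MLP; the view-direction input and hierarchical sampling from the paper are omitted.

```python
# Minimal sketch of NeRF's positional encoding + volume rendering.
import torch

def positional_encoding(x, num_freqs=10):
    # gamma(p) = (sin(2^0 pi p), cos(2^0 pi p), ..., up to 2^(L-1))
    freqs = 2.0 ** torch.arange(num_freqs) * torch.pi
    angles = x[..., None] * freqs                  # (..., 3, L)
    return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(-2)

def render_ray(mlp, origin, direction, t_near=2.0, t_far=6.0, n=64):
    t = torch.linspace(t_near, t_far, n)
    pts = origin + t[:, None] * direction          # n sample points on the ray
    out = mlp(positional_encoding(pts))            # (n, 4): RGB + density
    rgb, sigma = out[:, :3].sigmoid(), out[:, 3].relu()
    delta = torch.diff(t, append=t[-1:] + 1e10)    # distances between samples
    alpha = 1.0 - torch.exp(-sigma * delta)        # per-segment opacity
    # Transmittance T_i = prod_{j<i} (1 - alpha_j), then alpha-composite.
    trans = torch.cumprod(torch.cat([torch.ones(1), 1 - alpha[:-1]]), dim=0)
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(dim=0)     # expected pixel color

# Toy, untrained stand-in for the paper's 8-layer MLP.
mlp = torch.nn.Sequential(
    torch.nn.Linear(60, 128), torch.nn.ReLU(), torch.nn.Linear(128, 4)
)
color = render_ray(mlp, torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]))
```

Because every step is differentiable, a photometric loss against training pixels backpropagates through this compositing into the MLP weights, which is the entire training signal.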
OUTLINE:
0:00 - Intro & Overview
4:50 - View Synthesis Task Description
5:50 - The fundamental difference to classic Deep Learning
7:00 - NeRF Core Concept
15:30 - Training the NeRF from sparse views
20:50 - Radiance Field Volume Rendering
23:20 - Resulting View Dependence
24:00 - Positional Encoding
28:00 - Hierarchical Volume Sampling
30:15 - Experimental Results
33:30 - Comments & Conclusion
Paper: https://arxiv.org/abs/2003.08934
Website & Code: https://www.matthewtancik.com/nerf
My Video on SIREN: https://youtu.be/Q5g3p9Zwjrk
Abstract:
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location (x,y,z) and viewing direction (θ,ϕ)) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays and use classic volume rendering techniques to project the output colors and densities into an image. Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses. We describe how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrate results that outperform prior work on neural rendering and view synthesis. View synthesis results are best viewed as videos, so we urge readers to view our supplementary video for convincing comparisons.
Authors: Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
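In symbols, the rendering model the abstract describes is the classic emission-absorption integral along a camera ray r(t) = o + t d, together with the discrete quadrature the network is actually trained through:

```latex
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t),\mathbf{d})\,dt,
\qquad T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)

\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right)\mathbf{c}_i,
\qquad T_i = \exp\!\Big(-\sum_{j<i} \sigma_j \delta_j\Big),
\quad \delta_i = t_{i+1} - t_i
```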

May 2, 2021 • 39min
DINO: Emerging Properties in Self-Supervised Vision Transformers (Facebook AI Research Explained)
#dino #facebook #selfsupervised
Self-Supervised Learning is the final frontier in Representation Learning: getting useful features without any labels. Facebook AI's new system, DINO, combines advances in Self-Supervised Learning for Computer Vision with the new Vision Transformer (ViT) architecture and achieves impressive results without any labels. Attention maps can be directly interpreted as segmentation maps, and the obtained representations can be used for image retrieval and zero-shot k-nearest-neighbor (k-NN) classifiers.
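The core training step is compact enough to sketch in the spirit of the paper's own pseudocode; the tiny linear "networks", temperatures, and momentum values below are illustrative placeholders, not the published configuration.

```python
# Hedged sketch of one DINO update: EMA teacher, centered + sharpened
# teacher targets, cross-entropy across two augmented views.
import torch
import torch.nn.functional as F

def dino_step(student, teacher, center, x1, x2,
              tau_s=0.1, tau_t=0.04, ema=0.996, center_m=0.9):
    s1, s2 = student(x1), student(x2)
    with torch.no_grad():
        t1, t2 = teacher(x1), teacher(x2)   # teacher gets no gradients

    def ce(t_out, s_out):
        # Centering (collapse avoidance) + low temperature (sharpening).
        t = F.softmax((t_out - center) / tau_t, dim=-1)
        return -(t * F.log_softmax(s_out / tau_s, dim=-1)).sum(-1).mean()

    loss = (ce(t1, s2) + ce(t2, s1)) / 2    # each view predicts the other
    loss.backward()                         # optimizer step omitted

    with torch.no_grad():
        # Teacher weights: exponential moving average of the student.
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(ema).add_((1 - ema) * ps)
        # Running estimate of the teacher-output center.
        center.mul_(center_m).add_((1 - center_m) * torch.cat([t1, t2]).mean(0))
    return loss

# Toy usage: two augmented "views" of the same batch.
student, teacher = torch.nn.Linear(8, 16), torch.nn.Linear(8, 16)
teacher.load_state_dict(student.state_dict())
x = torch.randn(4, 8)
dino_step(student, teacher, torch.zeros(16), x, x + 0.1 * torch.randn_like(x))
```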
OUTLINE:
0:00 - Intro & Overview
6:20 - Vision Transformers
9:20 - Self-Supervised Learning for Images
13:30 - Self-Distillation
15:20 - Building the teacher from the student by moving average
16:45 - DINO Pseudocode
23:10 - Why Cross-Entropy Loss?
28:20 - Experimental Results
33:40 - My Hypothesis why this works
38:45 - Conclusion & Comments
Paper: https://arxiv.org/abs/2104.14294
Blog: https://ai.facebook.com/blog/dino-paws-computer-vision-with-self-supervised-transformers-and-10x-more-efficient-training
Code: https://github.com/facebookresearch/dino
My Video on ViT: https://youtu.be/TrdevFK_am4
My Video on BYOL: https://youtu.be/YPfUiOMYOEE
Abstract:
In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, we make the following observations: first, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets. Second, these features are also excellent k-NN classifiers, reaching 78.3% top-1 on ImageNet with a small ViT. Our study also underlines the importance of momentum encoder, multi-crop training, and the use of small patches with ViTs. We implement our findings into a simple self-supervised method, called DINO, which we interpret as a form of self-distillation with no labels. We show the synergy between DINO and ViTs by achieving 80.1% top-1 on ImageNet in linear evaluation with ViT-Base.
Authors: Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin

May 2, 2021 • 37min
Why AI is Harder Than We Think (Machine Learning Research Paper Explained)
#aiwinter #agi #embodiedcognition
The AI community has gone through regular cycles of AI Springs, in which rapid progress gave rise to massive overconfidence, high funding, and overpromising, followed by the promises going unfulfilled and the field sliding into periods of disillusionment and underfunding, called AI Winters. This paper examines the reasons for the repeated periods of overconfidence and identifies four fallacies that people commit when they see rapid progress in AI.
OUTLINE:
0:00 - Intro & Overview
2:10 - AI Springs & AI Winters
5:40 - Is the current AI boom overhyped?
15:35 - Fallacy 1: Narrow Intelligence vs General Intelligence
19:40 - Fallacy 2: Hard for humans doesn't mean hard for computers
21:45 - Fallacy 3: What we call things matters
28:15 - Fallacy 4: Embodied Cognition
35:30 - Conclusion & Comments
Paper: https://arxiv.org/abs/2104.12871
My Video on Shortcut Learning: https://youtu.be/D-eg7k8YSfs
Abstract:
Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment ("AI spring") and periods of disappointment, loss of confidence, and reduced funding ("AI winter"). Even with today's seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this paper I describe four fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I conclude by discussing the open questions spurred by these fallacies, including the age-old challenge of imbuing machines with humanlike common sense.
Authors: Melanie Mitchell