Yannic Kilcher Videos (Audio Only)

Yannic Kilcher
May 3, 2021 • 51min

Multimodal Neurons in Artificial Neural Networks (w/ OpenAI Microscope, Research Paper Explained)

#openai #clip #microscope

OpenAI conducts a deep investigation into the inner workings of its recent CLIP model via faceted feature visualization and finds amazing things: some neurons in the last layer respond to distinct concepts across multiple modalities, meaning they fire for photographs, drawings, and signs depicting the same concept, even when the images are vastly different. Through manual examination, the authors identify and investigate neurons corresponding to persons, geographical regions, religions, emotions, and much more. In this video, I go through the publication and then present my own findings from digging around in the OpenAI Microscope.

OUTLINE:
0:00 - Intro & Overview
3:35 - OpenAI Microscope
7:10 - Categories of found neurons
11:10 - Person Neurons
13:00 - Donald Trump Neuron
17:15 - Emotion Neurons
22:45 - Region Neurons
26:40 - Sparse Mixture of Emotions
28:05 - Emotion Atlas
29:45 - Adversarial Typographic Attacks
31:55 - Stroop Test
33:10 - My Findings in OpenAI Microscope
33:30 - Superman Neuron
33:50 - Resting B*tchface Neuron
34:10 - Trash Bag Neuron
35:25 - God Weightlifting Neuron
36:40 - Organ Neuron
38:35 - Film Spool Neuron
39:05 - Feather Neuron
39:20 - Spartan Neuron
40:25 - Letter E Neuron
40:35 - Cleanin Neuron
40:45 - Frown Neuron
40:55 - Lion Neuron
41:05 - Fashion Model Neuron
41:20 - Baseball Neuron
41:50 - Bride Neuron
42:00 - Navy Neuron
42:30 - Hemp Neuron
43:25 - Staircase Neuron
43:45 - Disney Neuron
44:15 - Hillary Clinton Neuron
44:50 - God Neuron
45:15 - Blurry Neuron
45:35 - Arrow Neuron
45:55 - Trophy Presentation Neuron
46:10 - Receding Hairline Neuron
46:30 - Traffic Neuron
46:40 - Raised Hand Neuron
46:50 - Google Maps Neuron
47:15 - Nervous Smile Neuron
47:30 - Elvis Neuron
47:55 - The Flash Neuron
48:05 - Beard Neuron
48:15 - Kilt Neuron
48:25 - Rainy Neuron
48:35 - Electricity Neuron
48:50 - Droplets Neuron
49:00 - Escape Neuron
49:25 - King Neuron
49:35 - Country Neuron
49:45 - Overweight Men Neuron
49:55 - Wedding Neuron
50:05 - Australia Neuron
50:15 - Yawn Neuron
50:30 - Bees & Simpsons Neuron
50:40 - Mussels Neuron
50:50 - Spice Neuron
51:00 - Conclusion

Paper: https://distill.pub/2021/multimodal-n...
My Findings: https://www.notion.so/CLIP-OpenAI-Mic...
My Video on CLIP: https://youtu.be/T9XSU0pKX2E
My Video on Feature Visualizations & The OpenAI Microscope: https://youtu.be/Ok44otx90D4

Abstract: In 2005, a letter published in Nature described human neurons responding to specific people, such as Jennifer Aniston or Halle Berry. The exciting thing wasn't just that they selected for particular people, but that they did so regardless of whether they were shown photographs, drawings, or even images of the person's name. The neurons were multimodal. As the lead author would put it: "You are looking at the far end of the transformation from metric, visual shapes to conceptual... information." We report the existence of similar multimodal neurons in artificial neural networks. This includes neurons selecting for prominent public figures or fictional characters, such as Lady Gaga or Spiderman. Like the biological multimodal neurons, these artificial neurons respond to the same subject in photographs, drawings, and images of their name.

Authors: Gabriel Goh, Nick Cammarata, Chelsea Voss, Shan Carter, Michael Petrov, Ludwig Schubert, Alec Radford, Chris Olah
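The faceted feature visualization behind these findings boils down to activation maximization: optimize an input, by gradient ascent, until a chosen neuron fires strongly. A minimal sketch on a single hypothetical linear neuron (the real method optimizes images through CLIP with heavy regularization; the weights and sizes here are made up for illustration):

```python
import numpy as np

# Toy feature visualization: find the input that maximally activates
# one neuron under a norm constraint, by gradient ascent on the input.
rng = np.random.default_rng(0)
w = rng.normal(size=16)       # the neuron's weights (hypothetical)

x = np.zeros(16)              # start from a blank "image"
for _ in range(100):
    grad = w                  # gradient of the activation w @ x w.r.t. x
    x = x + 0.1 * grad        # ascent step
    x = x / max(np.linalg.norm(x), 1e-8)   # project back to the unit sphere

# x converges to the direction of w: the input this neuron "wants to see"
print(np.dot(x, w / np.linalg.norm(w)))    # cosine similarity, approaches 1.0
```

For a real network the gradient is obtained by backpropagation through the model rather than read off in closed form, and the regularizers (and the "facets" of the paper) steer the optimization toward human-interpretable images.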
May 3, 2021 • 16min

Machine Learning PhD Survival Guide 2021 | Advice on Topic Selection, Papers, Conferences & more!

#machinelearning #phd #howto

This video is advice for new PhD students in the field of Machine Learning in 2021 and beyond. The field has shifted dramatically in the last few years, and navigating grad school can be very hard, especially when you're as clueless as I was when I started. The video is a personal account of my mistakes and what I've learned from them. If you already have several published papers and know what to do, this video is not for you. However, if you are not even sure where to start, how to select a topic, or what goes in a paper, you might benefit from this video, because that's exactly how I felt.

Main Takeaways:
- Select niche topics rather than hype topics
- Write papers that can't be rejected
- Don't be discouraged by bad reviews
- Take reviewing & teaching seriously
- Keep up your focus
- Conferences are for networking
- Internships are great opportunities
- Team up with complementary skills
- Don't work too hard

OUTLINE:
0:00 - Intro & Overview
1:25 - Thesis Topic Selection
4:25 - How To Publish Papers
5:35 - Dealing With Reviewers
6:30 - How To Be A Reviewer
7:40 - Take Teaching Seriously
8:30 - Maintain Focus
10:20 - Navigating Conferences
12:40 - Internships
13:40 - Collaborations
14:55 - Don't Forget To Enjoy

Transcript: https://www.notion.so/Yannic-Kilcher-...
Credits to Lanz for editing

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yann...
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-ki...
BiliBili: https://space.bilibili.com/1824646584

If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
May 3, 2021 • 24min

PAIR AI Explorables | Is the problem in the data? Examples on Fairness, Diversity, and Bias.

In the recurring debate about bias in Machine Learning models, there is a growing argument saying that "the problem is not in the data", often citing the influence of various choices like loss functions or network architecture. In this video, we take a look at PAIR's AI Explorables through the lens of whether or not the bias problem is a data problem.

OUTLINE:
0:00 - Intro & Overview
1:45 - Recap: Bias in ML
4:25 - AI Explorables
5:40 - Measuring Fairness Explorable
11:00 - Hidden Bias Explorable
16:10 - Measuring Diversity Explorable
23:00 - Conclusion & Comments

AI Explorables: https://pair.withgoogle.com/explorables/
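The tension the Measuring Fairness explorable demonstrates interactively can be reproduced in a few lines: even when two groups receive positive predictions at the same rate, their true-positive rates can differ, so different fairness metrics disagree about the same classifier. All data below is made up for illustration:

```python
# Hypothetical predictions and labels for two groups "a" and "b".
preds  = [1, 1, 0, 0, 1, 0, 1, 0]
labels = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

def rate(vals):
    return sum(vals) / len(vals) if vals else 0.0

def group_metrics(g):
    idx = [i for i, x in enumerate(groups) if x == g]
    # Equal opportunity looks at the true-positive rate per group.
    tpr = rate([preds[i] for i in idx if labels[i] == 1])
    # Demographic parity looks at the positive-prediction rate per group.
    ppr = rate([preds[i] for i in idx])
    return tpr, ppr

for g in ("a", "b"):
    print(g, group_metrics(g))
# Both groups get positives at rate 0.5 (parity holds), yet group "a"
# has TPR 1.0 while group "b" has TPR 2/3 (equal opportunity fails).
```

This is the data-versus-model point in miniature: which of these numbers you choose to equalize is a modeling decision, not something the data dictates.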
May 3, 2021 • 48min

DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning

#dreamcoder #programsynthesis #symbolicreasoning

Classic Machine Learning struggles with few-shot generalization on tasks where humans can easily generalize from just a handful of examples, such as sorting a list of numbers. Humans do this by coming up with a short program, or algorithm, that explains the few data points in a compact way. DreamCoder emulates this via neurally guided search over a language of primitives, a library, that it builds up over time. By doing this, it can iteratively construct more and more complex programs by building on its own abstractions, and therefore solve more and more difficult tasks in a few-shot manner by generating very short programs that fit the few given data points. The resulting system not only generalizes quickly but also delivers an explainable solution to its problems in the form of a modular and hierarchical learned library. Combining this with classic Deep Learning for low-level perception is a very promising future direction.

OUTLINE:
0:00 - Intro & Overview
4:55 - DreamCoder System Architecture
9:00 - Wake Phase: Neural Guided Search
19:15 - Abstraction Phase: Extending the Internal Library
24:30 - Dreaming Phase: Training Neural Search on Fictional Programs and Replays
30:55 - Abstraction by Compressing Program Refactorings
32:40 - Experimental Results on LOGO Drawings
39:00 - Ablation Studies
39:50 - Re-Discovering Physical Laws
42:25 - Discovering Recursive Programming Algorithms
44:20 - Conclusions & Discussion

Paper: https://arxiv.org/abs/2006.08381
Code: https://github.com/ellisk42/ec

Abstract: Expert problem-solving is driven by powerful languages for thinking about problems and their solutions. Acquiring expertise means learning these languages -- systems of concepts, alongside the skills to use them. We present DreamCoder, a system that learns to solve problems by writing programs. It builds expertise by creating programming languages for expressing domain concepts, together with neural networks to guide the search for programs within these languages. A "wake-sleep" learning algorithm alternately extends the language with new symbolic abstractions and trains the neural network on imagined and replayed problems. DreamCoder solves both classic inductive programming tasks and creative tasks such as drawing pictures and building scenes. It rediscovers the basics of modern functional programming, vector algebra and classical physics, including Newton's and Coulomb's laws. Concepts are built compositionally from those learned earlier, yielding multi-layered symbolic representations that are interpretable and transferrable to new tasks, while still growing scalably and flexibly with experience.

Authors: Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, Joshua B. Tenenbaum
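The wake phase described above amounts to searching for a short program, built from library primitives, that explains a handful of input-output examples. A minimal unguided sketch (real DreamCoder uses a neural network to guide this search and grows the library through abstraction; the primitives here are invented for illustration):

```python
from itertools import product

# Tiny stand-in for DreamCoder's wake phase: enumerate compositions of
# library primitives, shortest first, and return the first program that
# is consistent with all the given examples.
PRIMITIVES = {
    "inc":    lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def run(program, x):
    """Apply the named primitives left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x

def synthesize(examples, max_len=3):
    for length in range(1, max_len + 1):             # shortest programs first
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return program
    return None                                      # no program found

# Few-shot task: f(x) = (x + 1) * 2, given only three examples.
print(synthesize([(1, 4), (2, 6), (3, 8)]))  # -> ('inc', 'double')
```

The abstraction ("sleep") phase would then add frequently useful compositions like `inc∘double` to `PRIMITIVES` as new single-step primitives, which is what lets later searches reach deeper programs at the same search depth.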
May 2, 2021 • 34min

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (ML Research Paper Explained)

#nerf #neuralrendering #deeplearning

View synthesis is a tricky problem, especially when only a sparse set of images is given as input. NeRF embeds an entire scene into the weights of a feedforward neural network, trained by backpropagation through a differentiable volume rendering procedure, and achieves state-of-the-art view synthesis. It includes directional dependence and is able to capture fine structural details, as well as reflection effects and transparency.

OUTLINE:
0:00 - Intro & Overview
4:50 - View Synthesis Task Description
5:50 - The fundamental difference to classic Deep Learning
7:00 - NeRF Core Concept
15:30 - Training the NeRF from sparse views
20:50 - Radiance Field Volume Rendering
23:20 - Resulting View Dependence
24:00 - Positional Encoding
28:00 - Hierarchical Volume Sampling
30:15 - Experimental Results
33:30 - Comments & Conclusion

Paper: https://arxiv.org/abs/2003.08934
Website & Code: https://www.matthewtancik.com/nerf
My Video on SIREN: https://youtu.be/Q5g3p9Zwjrk

Abstract: We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location (x, y, z) and viewing direction (θ, ϕ)) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays and use classic volume rendering techniques to project the output colors and densities into an image. Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses. We describe how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrate results that outperform prior work on neural rendering and view synthesis. View synthesis results are best viewed as videos, so we urge readers to view our supplementary video for convincing comparisons.

Authors: Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
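The volume rendering step that NeRF backpropagates through is a simple quadrature: each sample along a camera ray contributes its color, weighted by the sample's opacity and by the transmittance left over after the samples in front of it. A sketch for a single grayscale ray with made-up sample values (in the real model, the densities `sigmas` and colors `colors` come from the MLP queried at each 5D coordinate):

```python
import math

def render_ray(sigmas, colors, deltas):
    """Composite samples along one ray front to back.

    sigmas: volume densities at each sample
    colors: emitted radiance at each sample (scalar here for simplicity)
    deltas: distances between adjacent samples
    """
    color, transmittance = 0.0, 1.0
    for sigma, c, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        color += transmittance * alpha * c       # weighted contribution
        transmittance *= 1.0 - alpha             # light surviving past it
    return color

# A nearly opaque first sample hides everything behind it:
print(render_ray([1e9, 1.0], [0.8, 0.3], [1.0, 1.0]))  # -> 0.8
```

Every operation here is differentiable in `sigmas` and `colors`, which is why a set of posed images suffices to train the network by backpropagation.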
May 2, 2021 • 39min

DINO: Emerging Properties in Self-Supervised Vision Transformers (Facebook AI Research Explained)

#dino #facebook #selfsupervised

Self-Supervised Learning is the final frontier in Representation Learning: getting useful features without any labels. Facebook AI's new system, DINO, combines advances in Self-Supervised Learning for Computer Vision with the new Vision Transformer (ViT) architecture and achieves impressive results without any labels. Attention maps can be directly interpreted as segmentation maps, and the obtained representations can be used for image retrieval and zero-shot k-nearest-neighbor (k-NN) classification.

OUTLINE:
0:00 - Intro & Overview
6:20 - Vision Transformers
9:20 - Self-Supervised Learning for Images
13:30 - Self-Distillation
15:20 - Building the teacher from the student by moving average
16:45 - DINO Pseudocode
23:10 - Why Cross-Entropy Loss?
28:20 - Experimental Results
33:40 - My Hypothesis why this works
38:45 - Conclusion & Comments

Paper: https://arxiv.org/abs/2104.14294
Blog: https://ai.facebook.com/blog/dino-paws-computer-vision-with-self-supervised-transformers-and-10x-more-efficient-training
Code: https://github.com/facebookresearch/dino
My Video on ViT: https://youtu.be/TrdevFK_am4
My Video on BYOL: https://youtu.be/YPfUiOMYOEE

Abstract: In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, we make the following observations: first, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets. Second, these features are also excellent k-NN classifiers, reaching 78.3% top-1 on ImageNet with a small ViT. Our study also underlines the importance of momentum encoder, multi-crop training, and the use of small patches with ViTs. We implement our findings into a simple self-supervised method, called DINO, which we interpret as a form of self-distillation with no labels. We show the synergy between DINO and ViTs by achieving 80.1% top-1 on ImageNet in linear evaluation with ViT-Base.

Authors: Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
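The self-distillation step in the paper's pseudocode can be sketched numerically: the teacher's output is centered and sharpened, the student is trained with cross-entropy against it, and the teacher's weights track the student by exponential moving average. Dimensions, temperatures, and logits below are illustrative, not the paper's settings:

```python
import numpy as np

def softmax(x, temp):
    z = np.exp((x - x.max()) / temp)
    return z / z.sum()

def dino_loss(student_logits, teacher_logits, center,
              t_student=0.1, t_teacher=0.04):
    # Teacher output is centered (to avoid collapse to a constant) and
    # sharpened with a low temperature; student matches it via cross-entropy.
    t = softmax(teacher_logits - center, t_teacher)
    s = softmax(student_logits, t_student)
    return -(t * np.log(s + 1e-12)).sum()          # H(teacher, student)

def ema_update(teacher_w, student_w, momentum=0.996):
    # The teacher has no gradients; it is a moving average of the student.
    return momentum * teacher_w + (1 - momentum) * student_w

logits = np.array([2.0, 0.5, -1.0])
# Loss is near-zero when the student already agrees with the teacher:
print(dino_loss(logits, logits, center=np.zeros(3)))
```

In the real method, student and teacher see different augmented crops of the same image, the center itself is an EMA of teacher outputs over the batch, and the EMA update runs over all network parameters rather than a scalar.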
May 2, 2021 • 37min

Why AI is Harder Than We Think (Machine Learning Research Paper Explained)

#aiwinter #agi #embodiedcognition

The AI community has gone through regular cycles of AI Springs, in which rapid progress gave rise to massive overconfidence, high funding, and overpromising, followed by those promises going unfulfilled and periods of disenchantment and underfunding called AI Winters. This paper examines the reasons for the repeated periods of overconfidence and identifies four fallacies that people commit when they see rapid progress in AI.

OUTLINE:
0:00 - Intro & Overview
2:10 - AI Springs & AI Winters
5:40 - Is the current AI boom overhyped?
15:35 - Fallacy 1: Narrow Intelligence vs General Intelligence
19:40 - Fallacy 2: Hard for humans doesn't mean hard for computers
21:45 - Fallacy 3: How we call things matters
28:15 - Fallacy 4: Embodied Cognition
35:30 - Conclusion & Comments

Paper: https://arxiv.org/abs/2104.12871
My Video on Shortcut Learning: https://youtu.be/D-eg7k8YSfs

Abstract: Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment ("AI spring") and periods of disappointment, loss of confidence, and reduced funding ("AI winter"). Even with today's seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected. One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this paper I describe four fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I conclude by discussing the open questions spurred by these fallacies, including the age-old challenge of imbuing machines with humanlike common sense.

Authors: Melanie Mitchell
