Yannic Kilcher Videos (Audio Only)

Yannic Kilcher
undefined
Jan 28, 2022 • 13min

IT ARRIVED! YouTube sent me a package. (also: Limited Time Merch Deal)

LIMITED TIME MERCH DEAL: http://store.ykilcher.com Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Jan 28, 2022 • 19min

[ML News] ConvNeXt: Convolutions return | China regulates algorithms | Saliency cropping examined

#mlnews #convnext #mt3 Your update on what's new in the Machine Learning world! OUTLINE: 0:00 - Intro 0:15 - ConvNeXt: Return of the Convolutions 2:50 - Investigating Saliency Cropping Algorithms 9:40 - YourTTS: SOTA zero-shot Text-to-Speech 10:40 - MT3: Multi-Track Music Transcription 11:35 - China regulates addictive algorithms 13:00 - A collection of Deep Learning interview questions & solutions 13:35 - Helpful Things 16:05 - AlphaZero explained blog post 16:45 - Ru-DOLPH: HyperModal Text-to-Image-to-Text model 17:45 - Google AI 2021 Review References: ConvNeXt: Return of the Convolutions https://arxiv.org/abs/2201.03545 https://github.com/facebookresearch/C... https://twitter.com/giffmana/status/1... https://twitter.com/wightmanr/status/... https://twitter.com/tanmingxing/statu... Investigating Saliency Cropping Algorithms https://openaccess.thecvf.com/content... https://vinayprabhu.github.io/Salienc... https://vinayprabhu.medium.com/on-the... https://vinayprabhu.github.io/Salienc... YourTTS: SOTA zero-shot Text-to-Speech https://github.com/coqui-ai/TTS?utm_s... https://arxiv.org/abs/2112.02418?utm_... https://coqui.ai/?utm_source=pocket_m... https://coqui.ai/blog/tts/yourtts-zer... MT3: Multi-Track Music Transcription https://arxiv.org/abs/2111.03017 https://github.com/magenta/mt3 https://huggingface.co/spaces/akhaliq... https://www.reddit.com/r/MachineLearn... China regulates addictive algorithms https://technode.com/2022/01/05/china... https://qz.com/2109618/china-reveals-... A collection of Deep Learning interview questions & solutions https://arxiv.org/abs/2201.00650?utm_... https://arxiv.org/pdf/2201.00650.pdf Helpful Things https://docs.deepchecks.com/en/stable... https://github.com/deepchecks/deepchecks https://docs.deepchecks.com/en/stable... https://www.dagshub.com/ https://www.dagshub.com/docs/index.html https://www.dagshub.com/blog/launchin... https://bayesiancomputationbook.com/w... https://mlcontests.com/ https://github.com/Yard1/ray-skorch https://github.com/skorch-dev/skorch https://www.rumbledb.org/?utm_source=... https://github.com/DarshanDeshpande/j... https://github.com/s3prl/s3prl AlphaZero explained blog post https://joshvarty.github.io/AlphaZero... Ru-DOLPH: HyperModal Text-to-Image-to-Text model https://github.com/sberbank-ai/ru-dolph https://colab.research.google.com/dri... Google AI 2021 Review https://ai.googleblog.com/2022/01/goo... Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
undefined
Jan 24, 2022 • 1h 23min

Dynamic Inference with Neural Interpreters (w/ author interview)

#deeplearning #neuralinterpreter #ai This video includes an interview with the paper's authors! What if we treated deep networks like modular programs? Neural Interpreters divide computation into small modules and route data to them via a dynamic type inference system. The resulting model combines recurrent elements, weight sharing, attention, and more to tackle both abstract reasoning, as well as computer vision tasks. OUTLINE: 0:00 - Intro & Overview 3:00 - Model Overview 7:00 - Interpreter weights and function code 9:40 - Routing data to functions via neural type inference 14:55 - ModLin layers 18:25 - Experiments 21:35 - Interview Start 24:50 - General Model Structure 30:10 - Function code and signature 40:30 - Explaining Modulated Layers 49:50 - A closer look at weight sharing 58:30 - Experimental Results Paper: https://arxiv.org/abs/2110.06399 Guests: Nasim Rahaman: https://twitter.com/nasim_rahaman Francesco Locatello: https://twitter.com/FrancescoLocat8 Waleed Gondal: https://twitter.com/Wallii_gondal Abstract: Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution. However, they are less capable of systematic generalization to data drawn from unseen but related distributions, a feat that is hypothesized to require compositional reasoning and reuse of knowledge. In this work, we present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules, which we call \emph{functions}. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. The proposed architecture can flexibly compose computation along width and depth, and lends itself well to capacity extension after training. To demonstrate the versatility of Neural Interpreters, we evaluate it in two distinct settings: image classification and visual abstract reasoning on Raven Progressive Matrices. In the former, we show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner. In the latter, we find that Neural Interpreters are competitive with respect to the state-of-the-art in terms of systematic generalization Authors: Nasim Rahaman, Muhammad Waleed Gondal, Shruti Joshi, Peter Gehler, Yoshua Bengio, Francesco Locatello, Bernhard Schölkopf Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Jan 21, 2022 • 1h 9min

Noether Networks: Meta-Learning Useful Conserved Quantities (w/ the authors)

#deeplearning #noether #symmetries This video includes an interview with first author Ferran Alet! Encoding inductive biases has been a long established methods to provide deep networks with the ability to learn from less data. Especially useful are encodings of symmetry properties of the data, such as the convolution's translation invariance. But such symmetries are often hard to program explicitly, and can only be encoded exactly when done in a direct fashion. Noether Networks use Noether's theorem connecting symmetries to conserved quantities and are able to dynamically and approximately enforce symmetry properties upon deep neural networks. OUTLINE: 0:00 - Intro & Overview 18:10 - Interview Start 21:20 - Symmetry priors vs conserved quantities 23:25 - Example: Pendulum 27:45 - Noether Network Model Overview 35:35 - Optimizing the Noether Loss 41:00 - Is the computation graph stable? 46:30 - Increasing the inference time computation 48:45 - Why dynamically modify the model? 55:30 - Experimental Results & Discussion Paper: https://arxiv.org/abs/2112.03321 Website: https://dylandoblar.github.io/noether... Code: https://github.com/dylandoblar/noethe... Abstract: Progress in machine learning (ML) stems from a combination of data availability, computational resources, and an appropriate encoding of inductive biases. Useful biases often exploit symmetries in the prediction problem, such as convolutional networks relying on translation equivariance. Automatically discovering these useful symmetries holds the potential to greatly improve the performance of ML systems, but still remains a challenge. In this work, we focus on sequential prediction problems and take inspiration from Noether's theorem to reduce the problem of finding inductive biases to meta-learning useful conserved quantities. We propose Noether Networks: a new type of architecture where a meta-learned conservation loss is optimized inside the prediction function. We show, theoretically and experimentally, that Noether Networks improve prediction quality, providing a general framework for discovering inductive biases in sequential problems. Authors: Ferran Alet, Dylan Doblar, Allan Zhou, Joshua Tenenbaum, Kenji Kawaguchi, Chelsea Finn Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Jan 20, 2022 • 1h 24min

This Team won the Minecraft RL BASALT Challenge! (Paper Explanation & Interview with the authors)

#minerl #minecraft #deeplearning The MineRL BASALT challenge has no reward functions or technical descriptions of what's to be achieved. Instead, the goal of each task is given as a short natural language string, and the agent is evaluated by a team of human judges who rate both how well the goal has been fulfilled, as well as how human-like the agent behaved. In this video, I interview KAIROS, the winning team of the 2021 challenge, and discuss how they used a combination of machine learning, efficient data collection, hand engineering, and a bit of knowledge about Minecraft to beat all other teams. OUTLINE: 0:00 - Introduction 4:10 - Paper Overview 11:15 - Start of Interview 17:05 - First Approach 20:30 - State Machine 26:45 - Efficient Label Collection 30:00 - Navigation Policy 38:15 - Odometry Estimation 46:00 - Pain Points & Learnings 50:40 - Live Run Commentary 58:50 - What other tasks can be solved? 1:01:55 - What made the difference? 1:07:30 - Recommendations & Conclusion 1:11:10 - Full Runs: Waterfall 1:12:40 - Full Runs: Build House 1:17:45 - Full Runs: Animal Pen 1:20:50 - Full Runs: Find Cave Paper: https://arxiv.org/abs/2112.03482 Code: https://github.com/viniciusguigo/kair... Challenge Website: https://minerl.io/basalt/ Paper Title: Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft Abstract: Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signals unless it is defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined, task with performance metrics that drives the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, together with an estimated odometry map, are then combined into a state-machine designed based on human knowledge of the tasks that breaks them down in a natural hierarchy and controls which macro behavior the learning agent should follow at any instant. We compare this hybrid intelligence approach to both end-to-end machine learning and pure engineered solutions, which are then judged by human evaluators. Codebase is available at this https URL. Authors: Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Bharat Prakash Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
undefined
Jan 16, 2022 • 1h 24min

This Team won the Minecraft RL BASALT Challenge! (Paper Explanation & Interview with the authors)

#minerl #minecraft #deeplearning The MineRL BASALT challenge has no reward functions or technical descriptions of what's to be achieved. Instead, the goal of each task is given as a short natural language string, and the agent is evaluated by a team of human judges who rate both how well the goal has been fulfilled, as well as how human-like the agent behaved. In this video, I interview KAIROS, the winning team of the 2021 challenge, and discuss how they used a combination of machine learning, efficient data collection, hand engineering, and a bit of knowledge about Minecraft to beat all other teams. OUTLINE: 0:00 - Introduction 4:10 - Paper Overview 11:15 - Start of Interview 17:05 - First Approach 20:30 - State Machine 26:45 - Efficient Label Collection 30:00 - Navigation Policy 38:15 - Odometry Estimation 46:00 - Pain Points & Learnings 50:40 - Live Run Commentary 58:50 - What other tasks can be solved? 1:01:55 - What made the difference? 1:07:30 - Recommendations & Conclusion 1:11:10 - Full Runs: Waterfall 1:12:40 - Full Runs: Build House 1:17:45 - Full Runs: Animal Pen 1:20:50 - Full Runs: Find Cave Paper: https://arxiv.org/abs/2112.03482 Code: https://github.com/viniciusguigo/kair... Challenge Website: https://minerl.io/basalt/ Paper Title: Combining Learning from Human Feedback and Knowledge Engineering to Solve Hierarchical Tasks in Minecraft Abstract: Real-world tasks of interest are generally poorly defined by human-readable descriptions and have no pre-defined reward signals unless it is defined by a human designer. Conversely, data-driven algorithms are often designed to solve a specific, narrowly defined, task with performance metrics that drives the agent's learning. In this work, we present the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge: Learning from Human Feedback in Minecraft, which challenged participants to use human data to solve four tasks defined only by a natural language description and no reward function. Our approach uses the available human demonstration data to train an imitation learning policy for navigation and additional human feedback to train an image classifier. These modules, together with an estimated odometry map, are then combined into a state-machine designed based on human knowledge of the tasks that breaks them down in a natural hierarchy and controls which macro behavior the learning agent should follow at any instant. We compare this hybrid intelligence approach to both end-to-end machine learning and pure engineered solutions, which are then judged by human evaluators. Codebase is available at this https URL. Authors: Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Bharat Prakash Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
undefined
Jan 7, 2022 • 42min

Full Self-Driving is HARD! Analyzing Elon Musk re: Tesla Autopilot on Lex Fridman's Podcast

#tesla #fsd #elon Watch the original podcast: https://www.youtube.com/watch?v=DxREm... An analysis of Elon's appearance on Lex Fridman. Very interesting conversation and a good overview of past, current, and future versions of Tesla's Autopilot system. OUTLINE: 0:00 - Intro 0:40 - Tesla Autopilot: How hard is it? 9:05 - Building an accurate understanding of the world 16:25 - History of Tesla's neural network stack 26:00 - When is full self-driving ready? 29:55 - FSD 11: Less code, more neural networks 37:00 - Auto-labelling is essential 39:05 - Tesla Bot & Discussion Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Jan 5, 2022 • 54min

Player of Games: All the games, one algorithm! (w/ author Martin Schmid)

#playerofgames #deepmind #alphazero Special Guest: First author Martin Schmid (https://twitter.com/Lifrordi) Games have been used throughout research as testbeds for AI algorithms, such as reinforcement learning agents. However, different types of games usually require different solution approaches, such as AlphaZero for Go or Chess, and Counterfactual Regret Minimization (CFR) for Poker. Player of Games bridges this gap between perfect and imperfect information games and delivers a single algorithm that uses tree search over public information states, and is trained via self-play. The resulting algorithm can play Go, Chess, Poker, Scotland Yard, and many more games, as well as non-game environments. OUTLINE: 0:00 - Introduction 2:50 - What games can Player of Games be trained on? 4:00 - Tree search algorithms (AlphaZero) 8:00 - What is different in imperfect information games? 15:40 - Counterfactual Value- and Policy-Networks 18:50 - The Player of Games search procedure 28:30 - How to train the network? 34:40 - Experimental Results 47:20 - Discussion & Outlook Paper: https://arxiv.org/abs/2112.03178 Abstract: Games have a long history of serving as a benchmark for progress in artificial intelligence. Recently, approaches using search and learning have shown strong performance across a set of perfect information games, and approaches using game-theoretic reasoning and learning have shown strong performance for specific imperfect information poker variants. We introduce Player of Games, a general-purpose algorithm that unifies previous approaches, combining guided search, self-play learning, and game-theoretic reasoning. Player of Games is the first algorithm to achieve strong empirical performance in large perfect and imperfect information games -- an important step towards truly general algorithms for arbitrary environments. We prove that Player of Games is sound, converging to perfect play as available computation time and approximation capacity increases. Player of Games reaches strong performance in chess and Go, beats the strongest openly available agent in heads-up no-limit Texas hold'em poker (Slumbot), and defeats the state-of-the-art agent in Scotland Yard, an imperfect information game that illustrates the value of guided search, learning, and game-theoretic reasoning. Authors: Martin Schmid, Matej Moravcik, Neil Burch, Rudolf Kadlec, Josh Davidson, Kevin Waugh, Nolan Bard, Finbarr Timbers, Marc Lanctot, Zach Holland, Elnaz Davoodi, Alden Christianson, Michael Bowling Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Jan 5, 2022 • 42min

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

#glide #openai #diffusion Diffusion models learn to iteratively reverse a noising process that is applied repeatedly during training. The result can be used for conditional generation as well as various other tasks such as inpainting. OpenAI's GLIDE builds on recent advances in diffusion models and combines text-conditional diffusion with classifier-free guidance and upsampling to achieve unprecedented quality in text-to-image samples. Try it yourself: https://huggingface.co/spaces/valhall... OUTLINE: 0:00 - Intro & Overview 6:10 - What is a Diffusion Model? 18:20 - Conditional Generation and Guided Diffusion 31:30 - Architecture Recap 34:05 - Training & Result metrics 36:55 - Failure cases & my own results 39:45 - Safety considerations Paper: https://arxiv.org/abs/2112.10741 Code & Model: https://github.com/openai/glide-text2im More diffusion papers: https://arxiv.org/pdf/2006.11239.pdf https://arxiv.org/pdf/2102.09672.pdf Abstract: Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human evaluators for both photorealism and caption similarity, and often produces photorealistic samples. Samples from a 3.5 billion parameter text-conditional diffusion model using classifier-free guidance are favored by human evaluators to those from DALL-E, even when the latter uses expensive CLIP reranking. Additionally, we find that our models can be fine-tuned to perform image inpainting, enabling powerful text-driven image editing. We train a smaller model on a filtered dataset and release the code and weights at this https URL. Authors: Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, Mark Chen Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Jan 5, 2022 • 26min

[ML News] DeepMind builds Gopher | Google builds GLaM | Suicide capsule uses AI to check access

#mlnews #gopher #glam Your updates on everything going on in the Machine Learning world. Sponsor: Weights & Biases https://wandb.me/yannic OUTLINE: 0:00 - Intro & Overview 0:20 - Sponsor: Weights & Biases 3:05 - DeepMind releases 3 papers on large language models 11:45 - Hugging Face Blog: Training CodeParrot from scratch 14:25 - Paper: Pre-Training vision systems with noise 15:45 - DeepMind advances Quantum Mechanics 16:45 - GoogleAI trains GLaM: 1 Trillion Parameters Mixture of Experts Model 18:45 - Colin Raffel calls for building ML models like we build Open-Source software 22:05 - A rebuke of the hype around DeepMind's math paper 24:45 - Helpful Things 32:25 - Suicide Capsule plans AI to assess your mental state before use 35:15 - Synthesia raises 50M to develop AI avatars Weights & Biases Embedding Projector https://twitter.com/_ScottCondron/sta... https://docs.wandb.ai/ref/app/feature... https://wandb.ai/timssweeney/toy_data... DeepMind releases 3 papers on large language models https://deepmind.com/blog/article/lan... https://arxiv.org/pdf/2112.04426.pdf https://kstatic.googleusercontent.com... https://arxiv.org/pdf/2112.04359.pdf https://deepmind.com/research/publica... Hugging Face Blog: Training CodeParrot from scratch https://huggingface.co/blog/codeparro... Paper: Pre-Training vision systems with noise https://mbaradad.github.io/learning_w... DeepMind advances Quantum Mechanics https://deepmind.com/blog/article/Sim... https://storage.googleapis.com/deepmi... https://github.com/deepmind/deepmind-... GoogleAI trains GLaM: 1 Trillion Parameters Mixture of Experts Model https://ai.googleblog.com/2021/12/mor... Colin Raffel calls for building ML models like we build Open-Source software https://colinraffel.com/blog/a-call-t... A rebuke of the hype around DeepMind's math paper https://arxiv.org/abs/2112.04324?s=09 Helpful Things https://twitter.com/huggingface/statu... https://docs.cohere.ai/prompt-enginee... https://github.blog/2021-12-08-improv... https://huggingface.co/blog/data-meas... https://huggingface.co/spaces/hugging... https://blogs.microsoft.com/ai-for-bu... https://techcommunity.microsoft.com/t... https://github.com/minitorch/minitorc... https://minitorch.github.io/ https://pandastutor.com/ https://pandastutor.com/vis.html https://github.com/IAmPara0x/yuno https://colab.research.google.com/dri... https://www.reddit.com/r/MachineLearn... https://www.drivendata.org/competitio... https://www.reddit.com/r/MachineLearn... https://www.uttt.ai/ https://arxiv.org/abs/2112.02721?utm_... https://arxiv.org/pdf/2112.02721.pdf https://github.com/GEM-benchmark/NL-A... https://www.reddit.com/r/MachineLearn... Suicide Capsule plans AI to assess your mental state before use https://www.swissinfo.ch/eng/sci-tech... Synthesia raises 50M to develop AI avatars https://techcrunch.com/2021/12/08/syn... https://www.synthesia.io/ Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app