Yannic Kilcher Videos (Audio Only) cover image

Yannic Kilcher Videos (Audio Only)

Latest episodes

undefined
Mar 28, 2022 • 49min

Author Interview - Typical Decoding for Natural Language Generation

#deeplearning #nlp #sampling This is an interview with first author Clara Meister. Paper review video hereé https://youtu.be/_EDr3ryrT_Y Modern language models like T5 or GPT-3 achieve remarkably low perplexities on both training and validation data, yet when sampling from their output distributions, the generated text often seems dull and uninteresting. Various workarounds have been proposed, such as top-k sampling and nucleus sampling, but while these manage to somewhat improve the generated samples, they are hacky and unfounded. This paper introduces typical sampling, a new decoding method that is principled, effective, and can be implemented efficiently. Typical sampling turns away from sampling purely based on likelihood and explicitly finds a trade-off between generating high-probability samples and generating high-information samples. The paper connects typical sampling to psycholinguistic theories on human speech generation, and shows experimentally that typical sampling achieves much more diverse and interesting results than any of the current methods. Sponsor: Introduction to Graph Neural Networks Course https://www.graphneuralnets.com/p/int... OUTLINE: 0:00 - Intro 0:35 - Sponsor: Introduction to GNNs Course (link in description) 1:30 - Why does sampling matter? 5:40 - What is a "typical" message? 8:35 - How do humans communicate? 10:25 - Why don't we just sample from the model's distribution? 15:30 - What happens if we condition on the information to transmit? 17:35 - Does typical sampling really represent human outputs? 20:55 - What do the plots mean? 31:00 - Diving into the experimental results 39:15 - Are our training objectives wrong? 41:30 - Comparing typical sampling to top-k and nucleus sampling 44:50 - Explaining arbitrary engineering choices 47:20 - How can people get started with this? Paper: https://arxiv.org/abs/2202.00666 Code: https://github.com/cimeister/typical-... Abstract: Despite achieving incredibly low perplexities on myriad natural language corpora, today's language models still often underperform when used to generate text. This dichotomy has puzzled the language generation community for the last few years. In this work, we posit that the abstraction of natural language as a communication channel (à la Shannon, 1948) can provide new insights into the behaviors of probabilistic language generators, e.g., why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, and do so in a simultaneously efficient and error-minimizing manner; they choose each word in a string with this (perhaps subconscious) goal in mind. We propose that generation from probabilistic models should mimic this behavior. Rather than always choosing words from the high-probability region of the distribution--which have a low Shannon information content--we sample from the set of words with information content close to the conditional entropy of our model, i.e., close to the expected information content. This decision criterion can be realized through a simple and efficient implementation, which we call typical sampling. Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, typical sampling offers competitive performance in terms of quality while consistently reducing the number of degenerate repetitions. Authors: Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell Links: Merch: store.ykilcher.com TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF
undefined
Mar 28, 2022 • 49min

Typical Decoding for Natural Language Generation (Get more human-like outputs from language models!)

#deeplearning #nlp #sampling Modern language models like T5 or GPT-3 achieve remarkably low perplexities on both training and validation data, yet when sampling from their output distributions, the generated text often seems dull and uninteresting. Various workarounds have been proposed, such as top-k sampling and nucleus sampling, but while these manage to somewhat improve the generated samples, they are hacky and unfounded. This paper introduces typical sampling, a new decoding method that is principled, effective, and can be implemented efficiently. Typical sampling turns away from sampling purely based on likelihood and explicitly finds a trade-off between generating high-probability samples and generating high-information samples. The paper connects typical sampling to psycholinguistic theories on human speech generation, and shows experimentally that typical sampling achieves much more diverse and interesting results than any of the current methods. Sponsor: Fully Connected by Weights & Biases https://wandb.ai/fully-connected OUTLINE: 0:00 - Intro 1:50 - Sponsor: Fully Connected by Weights & Biases 4:10 - Paper Overview 7:40 - What's the problem with sampling? 11:45 - Beam Search: The good and the bad 14:10 - Top-k and Nucleus Sampling 16:20 - Why the most likely things might not be the best 21:30 - The expected information content of the next word 25:00 - How to trade off information and likelihood 31:25 - Connections to information theory and psycholinguistics 36:40 - Introducing Typical Sampling 43:00 - Experimental Evaluation 44:40 - My thoughts on this paper Paper: https://arxiv.org/abs/2202.00666 Code: https://github.com/cimeister/typical-... Abstract: Despite achieving incredibly low perplexities on myriad natural language corpora, today's language models still often underperform when used to generate text. This dichotomy has puzzled the language generation community for the last few years. In this work, we posit that the abstraction of natural language as a communication channel (à la Shannon, 1948) can provide new insights into the behaviors of probabilistic language generators, e.g., why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, and do so in a simultaneously efficient and error-minimizing manner; they choose each word in a string with this (perhaps subconscious) goal in mind. We propose that generation from probabilistic models should mimic this behavior. Rather than always choosing words from the high-probability region of the distribution--which have a low Shannon information content--we sample from the set of words with information content close to the conditional entropy of our model, i.e., close to the expected information content. This decision criterion can be realized through a simple and efficient implementation, which we call typical sampling. Automatic and human evaluations show that, in comparison to nucleus and top-k sampling, typical sampling offers competitive performance in terms of quality while consistently reducing the number of degenerate repetitions. Authors: Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell Links: Merch: store.ykilcher.com TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :)
undefined
Mar 25, 2022 • 49min

One Model For All The Tasks - BLIP (Author Interview)

#blip #interview #salesforce Paper Review Video: https://youtu.be/X2k7n4FuI7c Sponsor: Assembly AI https://www.assemblyai.com/?utm_sourc... This is an interview with Junnan Li and Dongxu Li, authors of BLIP and members of Salesforce research. Cross-modal pre-training has been all the rage lately in deep learning, especially training vision and language models together. However, there are a number of issues, such as low quality datasets that limit the performance of any model trained on it, and also the fact that pure contrastive pre-training cannot be easily fine-tuned for most downstream tasks. BLIP unifies different tasks and objectives in a single pre-training run and achieves a much more versatile model, which the paper immediately uses to create, filter, clean and thus bootstrap its own dataset to improve performance even more! OUTLINE: 0:00 - Intro 0:40 - Sponsor: Assembly AI 1:30 - Start of Interview 2:30 - What's the pitch? 4:40 - How did data bootstrapping come into the project? 7:10 - How big of a problem is data quality? 11:10 - Are the captioning & filtering models biased towards COCO data? 14:40 - Could the data bootstrapping be done multiple times? 16:20 - What was the evolution of the BLIP architecture? 21:15 - Are there additional benefits to adding language modelling? 23:50 - Can we imagine a modular future for pre-training? 29:45 - Diving into the experimental results 42:40 - What did and did not work out during the research? 45:00 - How is research life at Salesforce? 46:45 - Where do we go from here? Paper: https://arxiv.org/abs/2201.12086 Code: https://github.com/salesforce/BLIP Demo: https://huggingface.co/spaces/Salesfo... Abstract: Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. In this paper, we propose BLIP, a new VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP effectively utilizes the noisy web data by bootstrapping the captions, where a captioner generates synthetic captions and a filter removes the noisy ones. We achieve state-of-the-art results on a wide range of vision-language tasks, such as image-text retrieval (+2.7% in average recall@1), image captioning (+2.8% in CIDEr), and VQA (+1.6% in VQA score). BLIP also demonstrates strong generalization ability when directly transferred to video-language tasks in a zero-shot manner. Code, models, and datasets are released at this https URL. Authors: Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
undefined
Mar 25, 2022 • 47min

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding&Generation

The podcast discusses a paper on BLIP, a technique for bootstrapping data sets in vision and language pre-training. They explore the benefits of cross-modal pre-training and the issues it faces. The paper introduces BLIP, a more versatile model that creates and improves its own dataset. The podcast covers the paper's contributions, model architecture, and how data flows through the model. They also discuss captioning and filtering bootstrapping, as well as fine-tuning the model for downstream tasks.
undefined
Mar 22, 2022 • 33min

[ML News] AI Threatens Biological Arms Race

#mlnews #gtc22 #ithaca GTC Registration Link: https://ykilcher.com/gtc Your regular updates on what's going on in the ML world! OUTLINE: 0:00 - Intro 0:20 - Register to Nvidia GTC and win a 3090! 4:15 - DeepMind's Ithaca deciphers Lost Ancient Texts 6:45 - Drug discovery model turns toxic 10:00 - Gary Marcus: Deep Learning is hitting a wall 19:40 - GopherCite: Backing up answers with citations 22:40 - Yoshua Bengio appointed knight of the legion of honour 23:00 - Meta AI tags parody account of Yoshua Bengio 23:40 - Building games using just natural language 24:55 - YOU.com adds writing assistant 25:45 - Horace He: How to brrr 26:35 - Karpathy: Reproducing Yann LeCun's 1989 paper 27:50 - Pig grunt emotion classifier 28:20 - AI annotates protein domain functions 29:40 - Atwood & Carmack: 10k self-driving car bet 30:50 - Helpful Things References: Register to GTC and win a 3090! https://twitter.com/NVIDIAEU/status/1... https://www.nvidia.com/gtc/keynote/?n... https://www.nvidia.com/gtc/?ncid=ref-... https://www.nvidia.com/gtc/keynote/ https://www.nvidia.com/gtc/training/ https://developer.nvidia.com/nvidia-o... DeepMind deciphers Lost Ancient Texts https://deepmind.com/blog/article/Pre... https://www.nature.com/articles/s4158... https://github.com/deepmind/ithaca https://ithaca.deepmind.com/?job=eyJy... Drug discovery model turns toxic https://www.theverge.com/2022/3/17/22... https://www.nature.com/articles/s4225... Gary Marcus: Deep Learning is hitting a wall https://nautil.us/deep-learning-is-hi... https://www.youtube.com/watch?v=fVkXE... GopherCite: Backing up answers with citations https://deepmind.com/research/publica... Yoshua Bengio appointed knight of the legion of honour https://mila.quebec/en/professor-yosh... Meta AI tags parody account https://twitter.com/MetaAI/status/150... Building games using just natural language https://andrewmayneblog.wordpress.com... YOU.com adds writing assistant https://you.com/search?q=how%20to%20w... Horace He: How to brrr https://horace.io/brrr_intro.html Karpathy: Reproducing Yann LeCun's 1989 paper https://karpathy.github.io/2022/03/14... Pig grunt emotion classifier https://science.ku.dk/english/press/n... AI annotates protein domain functions https://ai.googleblog.com/2022/03/usi... https://google-research.github.io/pro... Atwood & Carmack: 10k self-driving car bet https://blog.codinghorror.com/the-203... Helpful Things https://github.com/recognai/rubrix https://twitter.com/taiyasaki/status/... https://github.com/mosaicml/composer?... https://mujoco.org/ https://mujoco.readthedocs.io/en/late... https://github.com/deepmind/mctx?utm_... https://padl.ai/ https://github.com/LaihoE/did-it-spill https://pytorch.org/blog/pytorch-1.11... Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :)
undefined
Mar 21, 2022 • 57min

Active Dendrites avoid catastrophic forgetting - Interview with the Authors

#multitasklearning #biology #neuralnetworks This is an interview with the paper's authors: Abhiram Iyer, Karan Grewal, and Akash Velu! Paper Review Video: https://youtu.be/O_dJ31T01i8 Check out Zak's course on Graph Neural Networks (discount with this link): https://www.graphneuralnets.com/p/int... Catastrophic forgetting is a big problem in mutli-task and continual learning. Gradients of different objectives tend to conflict, and new tasks tend to override past knowledge. In biological neural networks, each neuron carries a complex network of dendrites that mitigate such forgetting by recognizing the context of an input signal. This paper introduces Active Dendrites, which carries over the principle of context-sensitive gating by dendrites into the deep learning world. Various experiments show the benefit in combatting catastrophic forgetting, while preserving sparsity and limited parameter counts. OUTLINE: 0:00 - Intro 0:55 - Sponsor: GNN Course 2:30 - How did the idea come to be? 7:05 - What roles do the different parts of the method play? 8:50 - What was missing in the paper review? 10:35 - Are biological concepts viable if we still have backprop? 11:50 - How many dendrites are necessary? 14:10 - Why is there a plateau in the sparsity plot? 20:50 - How does task difficulty play into the algorithm? 24:10 - Why are there different setups in the experiments? 30:00 - Is there a place for unsupervised pre-training? 32:50 - How can we apply the online prototyping to more difficult tasks? 37:00 - What did not work out during the project? 41:30 - How do you debug a project like this? 47:10 - How is this related to other architectures? 51:10 - What other things from neuroscience are to be included? 55:50 - Don't miss the awesome ending :) Paper: https://arxiv.org/abs/2201.00042 Blog: https://numenta.com/blog/2021/11/08/c... Link to the GNN course (with discount): https://www.graphneuralnets.com/p/int... Authors: Abhiram Iyer, Karan Grewal, Akash Velu, Lucas Oliveira Souza, Jeremy Forest, Subutai Ahmad Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher
undefined
Mar 21, 2022 • 1h 5min

Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments (Review)

#multitasklearning #biology #neuralnetworks Catastrophic forgetting is a big problem in mutli-task and continual learning. Gradients of different objectives tend to conflict, and new tasks tend to override past knowledge. In biological neural networks, each neuron carries a complex network of dendrites that mitigate such forgetting by recognizing the context of an input signal. This paper introduces Active Dendrites, which carries over the principle of context-sensitive gating by dendrites into the deep learning world. Various experiments show the benefit in combatting catastrophic forgetting, while preserving sparsity and limited parameter counts. OUTLINE: 0:00 - Introduction 1:20 - Paper Overview 3:15 - Catastrophic forgetting in continuous and multi-task learning 9:30 - Dendrites in biological neurons 16:55 - Sparse representations in biology 18:35 - Active dendrites in deep learning 34:15 - Experiments on multi-task learning 39:00 - Experiments in continual learning and adaptive prototyping 49:20 - Analyzing the inner workings of the algorithm 53:30 - Is this the same as just training a larger network? 59:15 - How does this relate to attention mechanisms? 1:02:55 - Final thoughts and comments Paper: https://arxiv.org/abs/2201.00042 Blog: https://numenta.com/blog/2021/11/08/c... ERRATA: - I was made aware of this by https://twitter.com/ChainlessCoder: "That axon you showed of the pyramidal neuron, is actually the apical dendrite of the neuron". Sorry, my bad :) Abstract: A key challenge for AI is to build embodied systems that operate in dynamically changing environments. Such systems must adapt to changing task contexts and learn continuously. Although standard deep learning systems achieve state of the art results on static benchmarks, they often struggle in dynamic scenarios. In these settings, error signals from multiple contexts can interfere with one another, ultimately leading to a phenomenon known as catastrophic forgetting. In this article we investigate biologically inspired architectures as solutions to these problems. Specifically, we show that the biophysical properties of dendrites and local inhibitory systems enable networks to dynamically restrict and route information in a context-specific manner. Our key contributions are as follows. First, we propose a novel artificial neural network architecture that incorporates active dendrites and sparse representations into the standard deep learning framework. Next, we study the performance of this architecture on two separate benchmarks requiring task-based adaptation: Meta-World, a multi-task reinforcement learning environment where a robotic agent must learn to solve a variety of manipulation tasks simultaneously; and a continual learning benchmark in which the model's prediction task changes throughout training. Analysis on both benchmarks demonstrates the emergence of overlapping but distinct and sparse subnetworks, allowing the system to fluidly learn multiple tasks with minimal forgetting. Our neural implementation marks the first time a single architecture has achieved competitive results on both multi-task and continual learning settings. Our research sheds light on how biological properties of neurons can inform deep learning systems to address dynamic scenarios that are typically impossible for traditional ANNs to solve. Authors: Abhiram Iyer, Karan Grewal, Akash Velu, Lucas Oliveira Souza, Jeremy Forest, Subutai Ahmad
undefined
Mar 17, 2022 • 36min

Author Interview - VOS: Learning What You Don't Know by Virtual Outlier Synthesis

#deeplearning #objectdetection #outliers An interview with the authors of "Virtual Outlier Synthesis". Watch the paper review video here: https://youtu.be/i-J4T3uLC9M Outliers are data points that are highly unlikely to be seen in the training distribution, and therefore deep neural networks have troubles when dealing with them. Many approaches to detecting outliers at inference time have been proposed, but most of them show limited success. This paper presents Virtual Outlier Synthesis, which is a method that pairs synthetic outliers, forged in the latent space, with an energy-based regularization of the network at training time. The result is a deep network that can reliably detect outlier datapoints during inference with minimal overhead. OUTLINE: 0:00 - Intro 2:20 - What was the motivation behind this paper? 5:30 - Why object detection? 11:05 - What's the connection to energy-based models? 12:15 - Is a Gaussian mixture model appropriate for high-dimensional data? 16:15 - What are the most important components of the method? 18:30 - What are the downstream effects of the regularizer? 22:00 - Are there severe trade-offs to outlier detection? 23:55 - Main experimental takeaways? 26:10 - Why do outlier detection in the last layer? 30:20 - What does it take to finish a research projects successfully? Paper: https://arxiv.org/abs/2202.01197 Code: https://github.com/deeplearning-wisc/vos Abstract: Out-of-distribution (OOD) detection has received much attention lately due to its importance in the safe deployment of neural networks. One of the key challenges is that models lack supervision signals from unknown data, and as a result, can produce overconfident predictions on OOD data. Previous approaches rely on real outlier datasets for model regularization, which can be costly and sometimes infeasible to obtain in practice. In this paper, we present VOS, a novel framework for OOD detection by adaptively synthesizing virtual outliers that can meaningfully regularize the model's decision boundary during training. Specifically, VOS samples virtual outliers from the low-likelihood region of the class-conditional distribution estimated in the feature space. Alongside, we introduce a novel unknown-aware training objective, which contrastively shapes the uncertainty space between the ID data and synthesized outlier data. VOS achieves state-of-the-art performance on both object detection and image classification models, reducing the FPR95 by up to 7.87% compared to the previous best method. Code is available at this https URL. Authors: Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li Links: Merch: store.ykilcher.com TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Mar 14, 2022 • 36min

VOS: Learning What You Don't Know by Virtual Outlier Synthesis (Paper Explained)

#vos #outliers #deeplearning Sponsor: Assembly AI Check them out here: https://www.assemblyai.com/?utm_sourc... Outliers are data points that are highly unlikely to be seen in the training distribution, and therefore deep neural networks have troubles when dealing with them. Many approaches to detecting outliers at inference time have been proposed, but most of them show limited success. This paper presents Virtual Outlier Synthesis, which is a method that pairs synthetic outliers, forged in the latent space, with an energy-based regularization of the network at training time. The result is a deep network that can reliably detect outlier datapoints during inference with minimal overhead. OUTLINE: 0:00 - Intro 2:00 - Sponsor: Assembly AI (Link below) 4:05 - Paper Overview 6:45 - Where do traditional classifiers fail? 11:00 - How object detectors work 17:00 - What are virtual outliers and how are they created? 24:00 - Is this really an appropriate model for outliers? 26:30 - How virtual outliers are used during training 34:00 - Plugging it all together to detect outliers Paper: https://arxiv.org/abs/2202.01197 Code: https://github.com/deeplearning-wisc/vos Abstract: Out-of-distribution (OOD) detection has received much attention lately due to its importance in the safe deployment of neural networks. One of the key challenges is that models lack supervision signals from unknown data, and as a result, can produce overconfident predictions on OOD data. Previous approaches rely on real outlier datasets for model regularization, which can be costly and sometimes infeasible to obtain in practice. In this paper, we present VOS, a novel framework for OOD detection by adaptively synthesizing virtual outliers that can meaningfully regularize the model's decision boundary during training. Specifically, VOS samples virtual outliers from the low-likelihood region of the class-conditional distribution estimated in the feature space. Alongside, we introduce a novel unknown-aware training objective, which contrastively shapes the uncertainty space between the ID data and synthesized outlier data. VOS achieves state-of-the-art performance on both object detection and image classification models, reducing the FPR95 by up to 7.87% compared to the previous best method. Code is available at this https URL. Authors: Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li Links: Merch: store.ykilcher.com TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannick... Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
undefined
Mar 10, 2022 • 1h 37min

Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents

#deepmind #rl #society This is an in-depth paper review, followed by an interview with the papers' authors! Society is ruled by norms, and most of these norms are very useful, such as washing your hands before cooking. However, there also exist plenty of social norms which are essentially arbitrary, such as what hairstyles are acceptable, or what words are rude. These are called "silly rules". This paper uses multi-agent reinforcement learning to investigate why such silly rules exist. Their results indicate a plausible mechanism, by which the existence of silly rules drastically speeds up the agents' acquisition of the skill of enforcing rules, which generalizes well, and therefore a society that has silly rules will be better at enforcing rules in general, leading to faster adaptation in the face of genuinely useful norms. OUTLINE: 0:00 - Intro 3:00 - Paper Overview 5:20 - Why are some social norms arbitrary? 11:50 - Reinforcement learning environment setup 20:00 - What happens if we introduce a "silly" rule? 25:00 - Experimental Results: how silly rules help society 30:10 - Isolated probing experiments 34:30 - Discussion of the results 37:30 - Start of Interview 39:30 - Where does the research idea come from? 44:00 - What is the purpose behind this research? 49:20 - Short recap of the mechanics of the environment 53:00 - How much does such a closed system tell us about the real world? 56:00 - What do the results tell us about silly rules? 1:01:00 - What are these agents really learning? 1:08:00 - How many silly rules are optimal? 1:11:30 - Why do you have separate weights for each agent? 1:13:45 - What features could be added next? 1:16:00 - How sensitive is the system to hyperparameters? 1:17:20 - How to avoid confirmation bias? 1:23:15 - How does this play into progress towards AGI? 1:29:30 - Can we make real-world recommendations based on this? 1:32:50 - Where do we go from here? Paper: https://www.pnas.org/doi/10.1073/pnas... Blog: https://deepmind.com/research/publica... Abstract: The fact that humans enforce and comply with norms is an important reason why humans enjoy higher levels of cooperation and welfare than other animals. Some norms are relatively easy to explain; they may prohibit obviously harmful or uncooperative actions. But many norms are not easy to explain. For example, most cultures prohibit eating certain kinds of foods and almost all societies have rules about what constitutes appropriate clothing, language, and gestures. Using a computational model focused on learning shows that apparently pointless rules can have an indirect effect on welfare. They can help agents learn how to enforce and comply with norms in general, improving the group’s ability to enforce norms that have a direct effect on welfare. Authors: Raphael Köster, Dylan Hadfield-Menell, Richard Everett, Laura Weidinger, Gillian K. Hadfield, Joel Z. Leibo Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://discord.gg/4H8xxDF BitChute: https://www.bitchute.com/channel/yann... LinkedIn: https://www.linkedin.com/in/ykilcher BiliBili: https://space.bilibili.com/2017636191 If you want to support me, the best thing to do is to share out the content :)

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app