The paper 'Think Before You Speak' introduces a technique for training language models with pause tokens. The idea is to let the model manipulate additional hidden vectors before committing to its next token: by appending learnable pause tokens to the input, the model gets extra computation steps before it must answer. This offers an alternative to few-shot chain-of-thought prompting, with the extra "thinking" happening in hidden states rather than in generated text, and empirical results show improved performance on reasoning tasks.
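As a rough illustration of the mechanism, here is a minimal PyTorch sketch of the two ingredients, assuming a generic decoder-only model; the token id, the number of pauses, and all names are illustrative rather than taken from the paper's code:

```python
import torch
import torch.nn as nn

PAUSE_ID = 50257   # hypothetical id for a new, learnable <pause> token
N_PAUSES = 10      # how many pause tokens to append before the answer

def add_pauses(prompt_ids: torch.Tensor) -> torch.Tensor:
    """Append <pause> tokens so the model gets extra forward positions
    (extra hidden vectors to manipulate) before it must commit to an
    answer token."""
    pauses = torch.full((prompt_ids.size(0), N_PAUSES), PAUSE_ID,
                        dtype=prompt_ids.dtype)
    return torch.cat([prompt_ids, pauses], dim=1)

def pause_loss(logits, targets, pause_mask):
    """Next-token cross-entropy with positions that predict <pause>
    masked out: the model is never trained to emit pauses, only to use
    the extra computation they provide."""
    per_token = nn.functional.cross_entropy(
        logits.flatten(0, 1), targets.flatten(), reduction="none")
    keep = (~pause_mask).flatten().float()
    return (per_token * keep).sum() / keep.sum()
```

At inference time the same pause tokens are appended, and the model's outputs at those positions are simply discarded.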
The paper 'Analogical Prompting' presents a new prompting approach that outperforms few-shot chain-of-thought prompting. The method asks the model to recall relevant examples that can inform its response to a given problem: instead of being handed explicit reasoning steps or few-shot exemplars, the model retrieves suitable examples from its own internal knowledge. The approach shows strong potential for improved performance by leveraging the model's ability to associate relevant, self-generated examples and apply them to new problems.
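The prompt structure is simple enough to sketch directly. The template below paraphrases the paper's idea (self-generated exemplars, then a step-by-step solution); the exact wording, function name, and example problem are illustrative:

```python
# Analogical prompting: instead of supplying hand-picked few-shot
# examples, instruct the model to recall related problems on its own.

ANALOGICAL_TEMPLATE = """\
# Problem:
{problem}

# Instructions:
Recall three relevant and distinct problems you have solved before.
For each, describe the problem and explain its solution.
Then solve the initial problem step by step.
"""

def analogical_prompt(problem: str) -> str:
    """Build a single-turn prompt that makes the model generate its own
    exemplars before answering."""
    return ANALOGICAL_TEMPLATE.format(problem=problem)

if __name__ == "__main__":
    print(analogical_prompt(
        "A train travels 120 km in 1.5 hours. At the same speed, "
        "how long does it take to travel 200 km?"))
```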
The paper 'Streaming LLM' addresses the memory constraints of long-running generation. The authors observe that language models dump a large share of attention onto the first few tokens, which act as "attention sinks"; by permanently keeping those initial tokens in the key-value cache alongside a sliding window of recent tokens, a model can generate over effectively unbounded streams with a fixed memory budget and stable quality, without retraining. This does not extend how much the model can attend to at once, but it removes the hard stop that a finite cache otherwise imposes on streaming use.
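The cache-eviction policy is easy to sketch. The class below is a toy illustration, assuming one key/value object per token position; the class name, parameter names, and default sizes are invented, and the real method also re-assigns positions within the rolling cache, which this sketch omits:

```python
from collections import deque

class SinkCache:
    """Keep the first few 'attention sink' tokens forever, plus a
    sliding window of recent tokens, so the KV cache stays bounded
    while generation streams indefinitely."""

    def __init__(self, n_sinks: int = 4, window: int = 1020):
        self.n_sinks = n_sinks
        self.sinks = []                     # initial tokens, never evicted
        self.recent = deque(maxlen=window)  # rolling window, auto-evicts

    def append(self, kv):
        """Add the key/value pair for one newly generated token."""
        if len(self.sinks) < self.n_sinks:
            self.sinks.append(kv)
        else:
            self.recent.append(kv)

    def view(self):
        """What attention sees: the sinks followed by recent tokens."""
        return self.sinks + list(self.recent)
```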
The paper 'Ring Attention' presents a redesign of how transformer computation is distributed across devices. Each device keeps a block of queries resident while key-value blocks are passed around a ring of devices, overlapping communication with blockwise attention computation. Because no device ever needs to hold the whole sequence, per-device memory stays constant and the usable context length scales with the number of devices. The technique still computes exact attention, with no approximation shortcuts; it improves the structure of the computation rather than changing the model, potentially paving the way for far longer contexts.
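The exactness rests on flash-attention-style online softmax accumulation as key-value blocks arrive. Below is a single-process simulation of that arithmetic in PyTorch; the shapes, names, and the serial loop standing in for actual ring communication are all illustrative:

```python
import torch

def ring_attention(q, k, v, n_devices: int):
    """Each 'device' owns one query block; KV blocks rotate around the
    ring. A running max and denominator keep the result exactly equal
    to full softmax attention over the whole sequence."""
    d = q.size(-1)
    qs = q.chunk(n_devices)
    ks, vs = list(k.chunk(n_devices)), list(v.chunk(n_devices))

    outs = []
    for i, qi in enumerate(qs):
        m = torch.full((qi.size(0), 1), float("-inf"))  # running max
        den = torch.zeros(qi.size(0), 1)                # running denominator
        num = torch.zeros_like(qi)                      # running numerator
        for step in range(n_devices):
            j = (i + step) % n_devices          # block arriving this step
            scores = qi @ ks[j].T / d ** 0.5
            m_new = torch.maximum(m, scores.max(-1, keepdim=True).values)
            scale = torch.exp(m - m_new)        # rescale old accumulators
            p = torch.exp(scores - m_new)
            num = num * scale + p @ vs[j]
            den = den * scale + p.sum(-1, keepdim=True)
            m = m_new
        outs.append(num / den)
    return torch.cat(outs)

# Sanity check against full attention:
q, k, v = (torch.randn(16, 8) for _ in range(3))
full = torch.softmax(q @ k.T / 8 ** 0.5, dim=-1) @ v
assert torch.allclose(ring_attention(q, k, v, 4), full, atol=1e-5)
```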
Trey Kollmer returns to discuss the latest AI research revelations with Nathan Labenz. They explore how new techniques could shave 10% off global compute needs, how analogical prompting beats few-shot prompting, and how compressed historical records can increase LLMs' memory and retention abilities. If you need an ERP platform, check out our sponsor NetSuite: http://netsuite.com/cognitive.
SPONSORS: Shopify | NetSuite | Omneky
Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and millions of other entrepreneurs across 175 countries. From their all-in-one e-commerce platform to their in-person POS system – wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts – from blog posts to product descriptions – using AI. Sign up for a $1/month trial period: https://shopify.com/cognitive
NetSuite has 25 years of experience providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform ✅ head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.
Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.
RECOMMENDED PODCAST:
Every week, Byrne Hobart, investor and writer of the popular newsletter The Diff, and co-host Erik Torenberg discuss today’s major inflection points in technology, business, and markets – and help listeners build a diversified portfolio of trends and ideas for the future. Subscribe to “The Riff” with Byrne Hobart and Erik Torenberg: https://link.chtbl.com/theriff
TIMESTAMPS:
(00:00:00) - Episode Preview
(00:01:11) - Paper: Think Before You Speak
(00:03:13) - Multimodal models for combining vision and language
(00:04:19) - Backspace Paper
(00:06:25) - Chain of thought prompting for step-by-step reasoning
(00:09:14) - Backspacing in language models to correct mistakes
(00:12:05) - Attention sinks for expanding context length
(00:12:41) - Paper: Large Language Models as Analogical Reasoners
(00:15:24) - Pause tokens for language models to "think"
(00:18:23) - Analogical prompting to recall relevant examples
(00:20:52) - Long context windows for language models
(00:23:20) - Markdown works best for OpenAI
(00:24:23) - Ring attention to break memory constraints
(00:26:15) - Paper: Streaming LLM
(00:27:46) - Potential for superhuman performance with longer contexts
(00:31:01) - Dynamic context window adjustment at runtime
(00:33:53) - Retention and memory capabilities for transformers
(00:37:12) - Planning algorithms combined with memory and scale
(00:39:49) - Paper: Ring Attention
(00:42:35) - Executive assistant prompting and critique
(00:45:23) - Self-RAG for language models to find own examples
(00:48:02) - Timelines and predictions for future capabilities
(00:50:37) - Applications like analyzing long texts and scripts
(00:53:15) - Local versus global attention in transformers
(00:55:59) - Architectural changes versus just training adjustments
(00:58:41) - Pre-training strategies like random start points
This show is produced by Turpentine: a network of podcasts, newsletters, and more, covering technology, business, and culture — all from the perspective of industry insiders and experts. We’re launching new shows every week, and we’re looking for industry-leading sponsors — if you think that might be you and your company, email us at erik@turpentine.co.