Limitation of Long-Term Sequence Handling in Text

Handling long sequences of text appears to strain the model's coherence. Although the model is trained on long text, it struggles to carry information forward indefinitely. The average chat length observed is 1,467 tokens, which is short compared to today's context windows. The model can go somewhat beyond this length, but not dramatically. Long-sequence handling is the least proven aspect, and that uncertainty may affect downstream inferences and speculation.

In this episode, Nathan does an emergency pod deep dive into Mamba, a new state space model architecture. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period.

We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com.

SPONSORS:

Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and millions of other entrepreneurs across 175 countries. From their all-in-one e-commerce platform to their in-person POS system, wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts, from blog posts to product descriptions, using AI. Sign up for a $1/month trial period: https://shopify.com/cognitive

Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with the click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off.

NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform, head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.

X/SOCIAL:

@labenz (Nathan)

@CogRev_Podcast (Cognitive Revolution)

TIMESTAMPS:

(00:00:00) - Episode Preview

(00:03:15) - Inventing something better than the transformer

(00:05:17) - Examination of human cognition

(00:15:44) - Sponsor: Shopify

(00:27:00) - Weaknesses of the transformer

(00:30:16) - Sponsor: NetSuite | Omneky

(00:33:40) - Giving AI memory

(00:39:32) - State Space Model Revolution

(00:55:00) - SRAM and High Bandwidth Memory

(01:36:16) - Block State and Hyena models

(02:04:00) - Advancing AI safety and interpretability

This show is produced by Turpentine: a network of podcasts, newsletters, and more, covering technology, business, and culture — all from the perspective of industry insiders and experts. We’re launching new shows every week, and we’re looking for industry-leading sponsors — if you think that might be you and your company, email us at erik@turpentine.co.