Exploring the Computational Capabilities of LLMs
This chapter examines the unexpected computational abilities of large language models (LLMs). It highlights how these models tackle reasoning tasks akin to 2D vision challenges, revealing an intuitive grasp of the structural elements of problems despite operating on 1D text.
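To make the 1D framing concrete, here is a minimal sketch of one plausible way an ARC-style grid can be serialised into plain text for an LLM. The one-digit-per-cell encoding and newline row delimiter are illustrative assumptions, not any specific competitor's format.

```python
# Minimal sketch: flattening a 2D ARC-style grid into 1D text and back.
# One digit per cell and newline-delimited rows are assumptions made
# for illustration only.

def grid_to_text(grid: list[list[int]]) -> str:
    """Serialise a 2D grid of colour indices (0-9) into a single string."""
    return "\n".join("".join(str(cell) for cell in row) for row in grid)

def text_to_grid(text: str) -> list[list[int]]:
    """Invert the serialisation: one character per cell, one line per row."""
    return [[int(ch) for ch in line] for line in text.splitlines()]

example = [[0, 0, 3],
           [0, 3, 0],
           [3, 0, 0]]
assert text_to_grid(grid_to_text(example)) == example
```

Once flattened this way, vertical adjacency becomes a fixed offset in the token stream, so any 2D structure the model exploits has to be inferred rather than given, which is what makes the observation above non-trivial.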
Daniel Franzen and Jan Disselhoff, the "ARChitects", are the official winners of the ARC Prize 2024. Filmed at Tufa Labs in Zurich, they revealed how they achieved a remarkable 53.5% accuracy by utilising large language models (LLMs) in creative new ways. Discover their innovative techniques, including depth-first search for token selection, test-time training, and a novel augmentation-based validation system. Their results were extremely surprising.
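The depth-first search over token choices is worth unpacking. The toy sketch below shows the general idea under stated assumptions: a stand-in next-token distribution replaces real model logits, and the probability cutoff of 0.05 is an arbitrary illustrative value, not the threshold discussed in the episode (see 3.1 in the TOC).

```python
import math

# Toy stand-in for an LLM's next-token distribution. A real system would
# query model logits here; the tiny vocabulary and probabilities below
# are illustrative assumptions.
def next_token_probs(prefix: tuple[str, ...]) -> dict[str, float]:
    if len(prefix) >= 3:
        return {"<eos>": 1.0}
    return {"a": 0.6, "b": 0.3, "<eos>": 0.1}

def dfs_sample(prefix=(), logp=0.0, min_logp=math.log(0.05)):
    """Depth-first enumeration of completions whose total probability
    stays above a cutoff; branches below the threshold are pruned."""
    results = []
    for token, p in sorted(next_token_probs(prefix).items(),
                           key=lambda kv: -kv[1]):
        child_logp = logp + math.log(p)
        if child_logp < min_logp:
            continue  # prune: extending this branch only lowers probability
        if token == "<eos>":
            results.append(("".join(prefix), math.exp(child_logp)))
        else:
            results.extend(dfs_sample(prefix + (token,), child_logp, min_logp))
    return results

for seq, prob in dfs_sample():
    print(f"{seq!r}: p={prob:.3f}")
```

Unlike greedy decoding or beam search, this enumerates every completion whose total probability clears the floor, which suits ARC: candidate output grids are short, and a separate check, such as the augmentation-based validation mentioned above, can pick among them afterwards.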
SPONSOR MESSAGES:
***
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!
https://centml.ai/pricing/
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier, focused on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers, and host events in Zurich.
Go to https://tufalabs.ai/
***
Jan Disselhoff
https://www.linkedin.com/in/jan-disselhoff-1423a2240/
Daniel Franzen
https://github.com/da-fr
ARC Prize: http://arcprize.org/
TRANSCRIPT AND BACKGROUND READING:
https://www.dropbox.com/scl/fi/utkn2i1ma79fn6an4yvjw/ARCHitects.pdf?rlkey=67pe38mtss7oyhjk2ad0d2aza&dl=0
TOC
1. Solution Architecture and Strategy Overview
[00:00:00] 1.1 Initial Solution Overview and Model Architecture
[00:04:25] 1.2 LLM Capabilities and Dataset Approach
[00:10:51] 1.3 Test-Time Training and Data Augmentation Strategies
[00:14:08] 1.4 Sampling Methods and Search Implementation
[00:17:52] 1.5 ARC vs Language Model Context Comparison
2. LLM Search and Model Implementation
[00:21:53] 2.1 LLM-Guided Search Approaches and Solution Validation
[00:27:04] 2.2 Symmetry Augmentation and Model Architecture
[00:30:11] 2.3 Model Intelligence Characteristics and Performance
[00:37:23] 2.4 Tokenization and Numerical Processing Challenges
3. Advanced Training and Optimization
[00:45:15] 3.1 DFS Token Selection and Probability Thresholds
[00:49:41] 3.2 Model Size and Fine-tuning Performance Trade-offs
[00:53:07] 3.3 LoRA Implementation and Catastrophic Forgetting Prevention
[00:56:10] 3.4 Training Infrastructure and Optimization Experiments
[01:02:34] 3.5 Search Tree Analysis and Entropy Distribution Patterns
REFS
[00:01:05] Winning ARC 2024 solution using 12B param model, Franzen, Disselhoff, Hartmann
https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf
[00:03:40] Robustness of analogical reasoning in LLMs, Melanie Mitchell
https://arxiv.org/html/2411.14215
[00:07:50] Re-ARC dataset generator for ARC task variations, Michael Hodel
https://github.com/michaelhodel/re-arc
[00:15:00] Analysis of search methods in LLMs (greedy, beam, DFS), Chen et al.
https://arxiv.org/html/2408.00724v2
[00:16:55] Language model reachability space exploration, University of Toronto
https://www.youtube.com/watch?v=Bpgloy1dDn0
[00:22:30] GPT-4 guided code solutions for ARC tasks, Ryan Greenblatt
https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt
[00:41:20] GPT tokenization approach for numbers, OpenAI
https://platform.openai.com/docs/guides/text-generation/tokenizer-examples
[00:46:25] DFS in AI search strategies, Russell & Norvig
https://www.amazon.com/Artificial-Intelligence-Modern-Approach-4th/dp/0134610997
[00:53:10] Paper on catastrophic forgetting in neural networks, Kirkpatrick et al.
https://www.pnas.org/doi/10.1073/pnas.1611835114
[00:54:00] LoRA for efficient fine-tuning of LLMs, Hu et al.
https://arxiv.org/abs/2106.09685
[00:57:20] NVIDIA H100 Tensor Core GPU specs, NVIDIA
https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/
[01:04:55] Original MCTS in computer Go, Yifan Jin
https://stanford.edu/~rezab/classes/cme323/S15/projects/montecarlo_search_tree_report.pdf