
ThursdAI - The top AI news from the past week

Latest episodes

Sep 10, 2023 • 54min

🔥🎙️ ThursdAI Sunday special - Extending LLaMa to 128K context window (2 orders of magnitude) with YaRN [Interview with authors]

This is a free preview of a paid episode. To hear more, visit sub.thursdai.news

Happy Sunday everyone, I am very excited to bring you this interview with the folks who took LLaMa 2 and made it LLoooooongMa! They extended LLaMa 2's context window from 4,000 to a whopping 128,000 tokens (Yarn-Llama-2-13b-128k on Hugging Face), and also came up with a paper called YaRN (Efficient Context Window Extension of Large Language Models), showing that YaRN not only requires 10x fewer tokens to create these long contexts, but also 2.5x fewer training steps! And the models generalize, so there's now no need to collect extremely long sequences (think book-length sequences) for the models to understand those context lengths.

I have also decided to do something different (which took me half of Sunday, so I can't promise and am not committing to this format), but for the premium subscribers, you can now watch this interview with running karaoke-style subtitles and improved audio! This will be uploaded to YouTube in a week, but aren't you glad you subscribed and are getting this first? Here's a teaser preview.

And here are the chapters for your convenience (the only thing that's AI generated 😂):

0:00 - Introduction
3:08 - Discussion of extending LLaMa 2's context length from 4,000 tokens to 128,000 tokens using the YaRN method
8:23 - Explanation of RoPE scaling for positional encodings in transformers
13:21 - How the RoPE scaling idea allows for longer context through positional interpolation
18:51 - Using in-context learning to train models on shorter sequences but still handle long contexts
25:18 - Sourcing long-form data like books to train 128K token models
31:21 - Whether future models will natively support longer contexts
37:33 - New model from Adept with 16K context using RoPE scaling
42:46 - Attention is quadratic - need better algorithms to make long context usable
49:39 - Open source community pushing state of the art alongside big labs
52:34 - Closing thoughts

As always, the full (manually edited) transcription (and this time a special video version!) is reserved for the premium subscribers. I promise it'll be worth it, so why not... y'know? Skip a cup of coffee from SB and support ThursdAI?
Sep 7, 2023 • 29min

ThursdAI Sep 7 - Falcon 180B 🦅 , 🔥 Mojo lang finally here, YaRN scaling interview, Many OSS models & more AI news

Hey ya'll, welcome to yet another ThursdAI, this is Alex coming at you every ThursdAI, including a live recording this time! Which was incredible: we chatted about Falcon 180B, had a great interview at the end with 3 authors of the YaRN scaling paper and LLongMa 128K context, had 3 breaking news items in the middle, Mojo🔥 was released, Adept released a LLaMa-comparable OSS model, and friend of the pod @reach_vb showed an open ASR leaderboard on Hugging Face! We also covered an incredible tiny model called StarCoder 1B that was finetuned by a friend of the pod (who joined the space to talk to us about it!)

As always, you can listen to the whole 3 hour long-form conversation (raw, unedited) on our Zealous page (and add it to your podcatcher via this RSS), and this short-form pod is available on Apple, Spotify and everywhere.

ThursdAI - Hey, if you enjoy these, how about subscribing for real? Would love to do this full time! Every paid subscriber is like a dear friend 🧡

TL;DR of all topics covered
* Open Source LLM
* Falcon 180B announced by TIIUAE (Announcement, Demo)
* YaRN scaling paper - scaling LLaMa to 128K context (link)
* OpenHermes-13B from @teknium1 (link)
* Persimmon-8B from Adept.AI (link)
* Starcoder-1B-sft from @abacaj (link)
* Big Co LLMs + API updates
* OpenAI first ever Dev conference (link)
* Claude announces a $20/mo Claude Pro tier (link)
* Modular releases Mojo🔥 with 68,000x improvement over python (Link)
* Vision
* Real time deepfake with FaceFusion (link)
* HeyGen released AI avatars and AI video translation with lipsync (link, translation announcement)
* Voice
* Open ASR (automatic speech recognition) leaderboard from HuggingFace (link)
* Tools
* LangChain Hub (re) launched
* Open Interpreter (Announcement, Github)

Open Source LLM

🦅 Falcon 180B - The largest open source LLM to date (Announcement, Demo)

The folks at the "Technology Innovation Institute" have open sourced the huge Falcon 180B and put it up on Hugging Face. Having previously open sourced Falcon 40B, the folks from TIIUAE have given us a huge model that beats (base) LLaMa 2 on several evaluations, if just slightly, by a few percentage points. It's huge: it was trained on 3.5 trillion tokens, weighs above 100GB as a file, and requires 400GB for inference.

Some folks were not as impressed with Falcon's performance, given that its parameter count is 2.5x that of LLaMa 2 (and it likely took longer to train), while the relative benchmarks are just a few percentage points higher than LLaMa. It also boasts an embarrassingly low context window of just 2K tokens, and code was just 5% of its dataset, even though we already know that more code in the dataset makes models smarter!

Georgi Gerganov is already running this model on his M2 Ultra because he's the GOAT, and co-host of ThursdAI spaces, Nisten, was able to run this model CPU-only with just 4GB of RAM 🤯 We're waiting for Nisten to post a Github on how to run this monstrous model on just a CPU, because it's incredible! However, given the Apache 2 license and the fine-tuning community's excitement about improving these open models, it's an incredible feat, and we're very happy that this was released!
The complete open sourcing also matters in terms of geopolitics. This model was developed in the UAE, while in the US the export of A100 GPUs to the Middle East has been banned and folks are talking about regulating foundational models. A release of this size and parameter count, coming out of the United Arab Emirates for free, is definitely going to add to the discussion of whether to regulate AI, open source, and the fine-tuning of huge models!

YaRN scaling LLaMa to 128K context window

Last week, just in time for ThursdAI, we posted about the release of Yarn-Llama-2-13b-128k, a whopping 32x improvement in context window size on top of the base LLaMa, from the folks at Nous Research, Enrico Shippole, and @theemozilla, with the help of EleutherAI. This week, they released the YaRN: Efficient Context Window Extension of Large Language Models paper, which uses Rotary Position Embeddings to stretch the context windows of transformer-attention-based LLMs significantly.

We had friends of the pod Enrico Shippole, theemozilla (Jeff) and Bowen Peng on the twitter space, and a special interview with them will be released on Sunday. If you're interested in scaling and stretching context windows, definitely subscribe for that episode, it was incredible!

It's great to see that their work is already applied in several places, including CodeLLaMa (which was released with 16K - 100K context). The bottleneck is now compute: context windows can be stretched and the models are able to generalize from smaller datasets, so the next models are predicted to ship with ever-longer context windows, with the practical limit depending on your hardware's memory.

Persimmon-8B from AdeptAI (announcement, github)

AdeptAI, the company behind Act-1, a foundational model for AI agents that drives a browser, and whose co-founders include original transformers paper authors, have dropped a ThursdAI surprise: a fresh (read: not a LLaMa clone) model! They released a completely open source model called Persimmon-8B, with a full Apache 2 license, a 16K context window (using custom RoPE scaling methods) and some interesting inference speedups with C++. A very interesting 8B model that can fit on most consumer hardware, with additional tricks and a huge context window, is definitely welcome! Another interesting point: they have 70K unused embeddings for multimodal extensions! Can't wait to see what that's about!

Starcoder-1B-sft - tiny model that's great at code

Anton Bacaj (@abacaj) has finetuned StarCoder to achieve some incredible results for such a tiny model! Remember the first item, a whopping 180B parameter Falcon? Well, this is just a 1B parameter model, finetuned on a sampled dataset of 65K code examples, that's outperforming Falcon, LLaMa 2, Palm-2 (and Persimmon) on coding tasks, and runs on your device so fast that it's hard to read! It boasts an incredible 39% on HumanEval and 31% on MBPP (Anton reran and updated the MBPP score later) and can run locally. Friend of the pod Xenova has already ported this model to transformers.js and it'll soon run in your browser!
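If you want to poke at a tiny code model like this yourself before the browser version lands, here's a minimal sketch using the standard Hugging Face transformers pipeline. The repo id is an assumption (check Anton's Hugging Face profile for the exact name); everything else is vanilla transformers usage.

```python
# Minimal sketch: run a tiny code model locally with the transformers pipeline.
# The repo id below is an assumption - check the author's HF profile for the exact name.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="abacaj/starcoderbase-1b-sft",  # hypothetical repo id
)

prompt = 'def fibonacci(n: int) -> int:\n    """Return the n-th Fibonacci number."""\n'
out = generator(prompt, max_new_tokens=64, do_sample=False)
print(out[0]["generated_text"])
```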
OpenHermes-13B from @teknium1 (link)

Our friend Teknium1 (who we interviewed a few weeks ago) released OpenHermes on top of LLaMa 2, but this time it's a completely open model and dataset, marking the first time that Hermes models have been open! OpenHermes was trained on 242,000 entries of primarily GPT-4 generated data from open datasets across the AI landscape, including:
* GPTeacher - General Instruct, Roleplay v1, Roleplay v2, and Code Instruct Datasets, by Teknium
* WizardLM (v1, evol_instruct 70k), by WizardLM Team/nlpxucan
* Airoboros GPT-4 (v1.0), by JonDurbin
* Camel-AI's domain expert datasets, by the Camel-AI Team
* CodeAlpaca, by Sahil2801
* GPT4-LLM and Unnatural Instructions, by Microsoft

Check it out folks!

Big Co LLM + API updates

Modular finally ships Mojo 🔥 (Announcement)

I just knew that Mojo would finally be shipped during ThursdAI, and in fact, this was a great #BreakingNews moment on the twitter space! Modular and its co-founder Chris Lattner (author of LLVM, MLIR, Swift and many other things) have finally released their Mojo 🔥 language for AI. Mojo 🔥 is like Python++: it includes strong types and full interoperability with the Python ecosystem, can run basic vanilla Python, and has so much more in it. But the main thing Modular is claiming is a whopping 68,000x improvement over vanilla Python! You didn't misread this: a 68,000x improvement, when using all of the Modular inference compilers, virtualization tricks and compilation improvements. It's incredible.

The beauty of Mojo is that it meets developers where they are and allows them to adopt new features to achieve high performance gradually. By combining the best of dynamic and static languages, Mojo can deliver performance up to 68,000 times faster than Python today. That's quite a leap! If you want to delve deeper into Mojo's origin story, you can find more information in their documentation. But for now, let me highlight a few key benefits that Mojo offers.

Firstly, Mojo allows you to write everything in one language, merging the usability of Python with the systems programming features that typically require developers to rely on C, C++, or CUDA. This means that both research and deployment teams can work within a common codebase, streamlining the workflow from research to production.

Secondly, Mojo unlocks Python's performance potential. While Python is widely used, it may not be the best tool for high-performance or specialized hardware tasks. Mojo bridges that gap by enabling high performance on CPUs and providing support for exotic accelerators like GPUs and ASICs. With Mojo, you can achieve performance levels on par with C++ and CUDA.

Thirdly, and this is a big one, Mojo seamlessly integrates with the entire Python ecosystem. You can leverage the extensive library collection available in Python while making use of Mojo's features and performance benefits. This means you can easily combine libraries like NumPy and Matplotlib with your Mojo code - talk about flexibility!

Finally, Mojo allows you to upgrade your AI workloads effortlessly. By tightly integrating with the Modular AI Engine, Mojo empowers you to extend your AI workloads with custom operations, including pre-processing and post-processing operations, as well as high-performance mathematical algorithms. You can even integrate kernel fusion, graph rewrites, shape functions, and more.
Mojo is all about expanding the possibilities! Mojo's playground has been around since May and I have a deep dive here, but you should really watch the full interview (over 3 hours) on everything from why they chose to be a Python superset to why Chris thinks the community will pick it up. It's an incredible watch and will make you excited about Mojo!

WebGPU ships with support for FP16 in Chromium

Chrome shipped WebGPU back in April of '23, after years of development. It allows high performance 3D graphics (and of course, transformers inference) in the browser and on the web! However, for model inference, GPU access is not enough; you also need to be able to run smaller models. Well, one way to make models smaller is to run them in fp16 format: by essentially cutting the numerical precision of the weights in half, we can use much smaller (read: compressed) models with a slight loss in accuracy. Friends of the pod Nisten and Xenova (transformers.js author) have given us an update that new fp16 support has shipped in Chromium nightly, allowing much, much smaller models to be run client-side!

OpenAI first dev conference (Announcement)

OpenAI has announced their first developer focused conference, happening in SF on November 6th! In person only (with the keynote streamed to all), and they also said that they won't make any model announcements like GPT-5 😂 But we'll all expect at least a few API updates!

Vision

FaceFusion 1.1.0 - a deepfake faceswapper (Announcement, Github)

We all know deepfakes are here, I mean, don't we? But did you know that it's now super easy to swap your face into an image or a video? FaceFusion does just that: an incredibly fast way to deepfake someone's face into an image or a video with a few clicks. It works on CPU (I couldn't make it work on GPU, but it's possible) and shows incredible results! Want to enjoy Steve Buscemi dancing around as Harry Styles? 3 clicks and 10 minutes and you get this 🔥

Friend of the pod CocktailPeanut has made it incredibly easy to install with just 1 click with his pinokio.computer app, which I use and love! FaceFusion also has a webcam mode that can deepfake any image onto a webcam stream for a lot of fun on zoom calls! (which I wasn't able to test for some reason)

HeyGen launches their deep AI face creator

Many of us used 11Labs to clone voices, but what if you could clone a voice AND an image of a person, with just 2 minutes of their recording? That's what HeyGen are claiming to be able to do, and we've previously reported that their incredibly realistic AI avatar generation from videos/images + voice really blew us away. HeyGen just launched their service and you can sign up and get a few minutes for free. Here's a sample (with the CEO avatar; they couldn't make my own due to some launch day errors). The video you see on top is just that: the CEO of HeyGen thanking you for reading this week's ThursdAI!

Voice

ASR leaderboard + New top ASR model from Nvidia

I love doing ThursdAI, and one of the things I love most is folks sending me stuff they worked on, and then coming to ThursdAI to chat about it. Friend of the pod Vaibhav (VB) Srivastav, who's an incredible dev rel at HuggingFace focusing on audio, has shipped a new Open-ASR (automatic speech recognition) leaderboard on Hugging Face! It shows the top ASR models like Whisper and a newcomer, Nvidia FastConformer, which I didn't even know existed, and which is now topping Whisper for English speech-to-text tasks!
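To make the leaderboard concrete, here's a minimal sketch of running one of the listed models locally with the transformers ASR pipeline. I'm using Whisper here because it ships in transformers; the FastConformer models live in NVIDIA's NeMo toolkit, so treat this as the general idea rather than a benchmark setup, and the audio path is a placeholder.

```python
# Minimal sketch: transcribe an audio file with an open ASR model via transformers.
# "audio.wav" is a placeholder path; whisper-small is one of the leaderboard models.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",
    chunk_length_s=30,  # chunk long files to fit the model's 30-second window
)

result = asr("audio.wav")
print(result["text"])
```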
HuggingFace leaderboards like these are definitely a boon for the open source industry, as they allow all of us to easily select open source models, and also let the open source community start racing towards the top while we all benefit!

Tools

Open Interpreter (Announcement, Github)

One tool that I've used this week, and it is incredible, is Open Interpreter from @heyitskillian. It's incredibly easy to install and run, and behaves like OpenAI Code Interpreter (renamed to Advanced Data Analysis) but on your computer, and is able to do things like control your apps, lower volume, edit images/files and tons more.

pip install open-interpreter

And that's it! Give it a try (and you have to approve each command that it runs). It's a great agent, and hopefully we'll get Killian to chat with us about it on the next ThursdAI!

LangChain Hub has launched (link)

If you're into LangChain, and even if you aren't, the weight LangChain carries in the AI engineer industry is undeniable! They have a connector for everything, tons of folks use them, and they have raised a bunch of funding. They have just launched their new LangChain Hub and it's exciting! Many folks are sharing their best prompts and ways to work with LangChain on there, with upvotes and shareable links!

Also, worth noting that our friends swyx and Alessio from Latent Space have recently released an episode with Harrison on Latent Space, and it's WELL worth listening to (and reading), as swyx did a deep dive into LangChain, its naysayers and everything in between! Check it out below:

Thank you, see you next time (with some incredible personal news I'll have to share)

ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
Aug 25, 2023 • 1h 8min

ThursdAI Aug 24 - Seamless Voice Model, LLaMa Code, GPT3.5 FineTune API & IDEFICS vision model from HF

Hey everyone, this week has been incredible (isn't every week?), and as I'm writing this, I had to pause and go check out breaking news about LLaMa Code, which was literally released on ThursdAI as I was writing the summary! I think Meta deserves their own section in this ThursdAI update 👏

A few reminders before we dive in: we now have a website (thursdai.news) which has all the links to Apple, Spotify, and full recordings with transcripts, and will soon have a calendar you can join to never miss a live space!

This whole thing would not have been possible without Yam, Nisten, Xenova, VB, Far El, LDJ and other expert speakers from different modalities who join and share their expertise from week to week, and there's a convenient way to follow all of them now!

TL;DR of all topics covered
* Voice
* Seamless M4T Model from Meta (demo)
* Open Source LLM
* LLaMa 2 - code from Meta
* Vision
* IDEFICS - A multi modal text + image model from Hugging Face
* AI Art & Diffusion
* 1 year of Stable Diffusion 🎂
* IdeoGram
* Big Co LLMs + API updates
* GPT 3.5 Finetuning API
* AI Tools & Things
* Cursor IDE

Voice

Seamless M4T - a multi-lingual, multi-tasking, multi-modality voice model

To me, the absolute most mindblowing news of this week was Meta open sourcing (not fully, not commercially licensed) SeamlessM4T. This is a multi-lingual model that takes speech (and/or text) and can generate the following:
* Text
* Speech
* Translated Text
* Translated Speech

In a single model! For comparison's sake, it takes a whole pipeline with Whisper and other translators in targum.video, not to mention much bigger models, and not to mention I don't actually generate speech! This incredible news got me giddy and excited so fast, not only because it simplifies and unifies so much of what I do into one model, makes it faster, and opens up additional capabilities, but also because I strongly believe in the vision that language barriers should not exist, and that's why I built Targum. Meta apparently also believes in this vision, and gave us an incredible new power unlock that understands 100 languages and does so multilingually without effort.

Language barriers should not exist.

Definitely check out the discussion in the podcast, where VB from the open source audio team at Hugging Face goes deeper into the exciting implementation details of this model.

Open Source LLMs

🔥 LLaMa Code

We were patient and we got it! Thank you Yann!

Meta released LLaMa Code, a LLaMa fine-tuned on coding tasks, including fill-in-the-middle completion tasks, which is what Copilot does: not just autocompleting code, but taking into account what surrounds the code it needs to generate. Available in 7B, 13B and 34B sizes, the largest model beats GPT-3.5 on HumanEval, a benchmark for coding tasks (you can try it here). In an interesting move, they also separately released Python-finetuned versions, for Python code specifically.

Another incredible thing is that it supports a 100K context window of code, which is a LOT of code. However, that's unlikely to be very useful in open source because of the compute required. They also give us instruction fine-tuned versions of these models, and recommend using them, since those are finetuned to be helpful to humans rather than just autocompleting code.

Boasting impressive numbers, this is of course just the beginning; the open source community of finetuners is salivating! This is what they were waiting for. Can they finetune these new models to beat GPT-4?
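One more note on that 100K context claim: in recent transformers releases, LLaMa-family configs expose a rope_scaling knob, so you can experiment with positional-interpolation style context stretching yourself. This is a hedged sketch of the generic transformers option ("linear" or "dynamic" scaling), not Meta's or the YaRN authors' exact recipe, and the model id is illustrative.

```python
# Minimal sketch: load a LLaMa-family model with RoPE scaling to stretch its context.
# transformers >= 4.31 ships "linear" and "dynamic" scaling; YaRN itself needs the
# authors' code or a checkpoint already trained with it. The model id is illustrative.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"

config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {"type": "linear", "factor": 4.0}  # ~4k trained positions -> ~16k

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```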
🤔 Nous update

Friends of the pod LDJ and Teknium1 are releasing the latest 70B model in their Nous Hermes line 👏
* Nous-Puffin-70B

We're waiting on metrics, but it potentially beats ChatGPT on a few tasks! Exciting times!

Vision & Multi Modality

IDEFICS, a new 80B model from Hugging Face, was released after a year's effort, and is quite good. We love vision multimodality here on ThursdAI; we've been covering it since we saw that GPT-4 demo! IDEFICS is an effort by Hugging Face to create a foundational model for multimodality, and it is currently the only visual language model of this scale (80 billion parameters) that is available in open access. It's made by fusing the vision transformer CLIP-VIT-H-14 and LLaMa 1; I bet a LLaMa 2 version is coming soon as well! And the best thing: it's openly available and you can use it in your code with the Hugging Face transformers library!

It's not perfect of course, and can hallucinate quite a bit, but it's quite remarkable that we get these models weekly now, and this is just the start!

AI Art & Diffusion

Stable Diffusion is 1 year old

Has it been a year? Wow. For me personally, Stable Diffusion is what started this whole AI fever dream. SD was the first model I actually ran on my own GPU, the first model I learned how to... run, and use without relying on APIs. It made me way more comfortable with juggling models, learning what weights were, and well, here we are :) I now host a podcast and have a newsletter and I'm part of a community of folks who do the same, train models, discuss AI engineer topics and teach others!

Huge thank you to Emad, the Stability AI team, my friends there, and everyone else who worked hard on this. Hard to imagine how crazy of a pace we've been on since the first SD 1.4 release, and how incredibly realistic the images are now compared to what we got then and got excited about! 🎂

IdeoGram joins the AI art race

IdeoGram, a new text-to-image model from ex-Googlers (announcement), is the new kid on the block. It's not open source (unless I missed it), it boasts significant text capabilities, and really great image quality. It also has a remix ability, and is available on the web, unlike... MidJourney!

Big Co LLMs + API updates

OpenAI pairs with ScaleAI to let enterprises finetune and run finetuned GPT-3.5 models!

This is an interesting time for OpenAI to dive into fine-tuning, as open source models inch closer and closer to GPT-3.5 on several metrics with each week. Reminder: if you finetune a GPT-3.5 model, you need to provide your own data to OpenAI, but then you also have to pay them for essentially hosting a model just for you, which means it's not going to be cheap. Use as much prompting as humanly possible before you consider doing the above fine-tuning, and you may be able to solve your task much better and cheaper.
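For reference, kicking off such a fine-tune is a short script against the OpenAI Python SDK (the 0.27-era API at the time of writing); the file name is a placeholder, and each JSONL line has to be in the chat-message format OpenAI documents.

```python
# Minimal sketch: start a GPT-3.5 fine-tuning job with the OpenAI Python SDK (0.27-era API).
# "train.jsonl" is a placeholder; each line is a {"messages": [...]} chat example.
import openai

openai.api_key = "sk-..."  # your API key

training_file = openai.File.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

job = openai.FineTuningJob.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",
)
print(job.id)  # poll this job; the result is a hosted model only your org can call
```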
Agents

The most interesting thing to me in the world of agents actually came from an IDE! I installed Cursor, the new AI-infused VSCode clone, imported my VSCode settings, and off we went! It can use your own GPT-4 keys if you don't want to send them your code or pay, it embeds your whole repo for easy import and code understanding, and it does so much more, like adding a button to every error in the console to "debug" it, and a "new AI project" feature, which builds you a template just by typing a few words!

Our friends Alessio and swyx have interviewed the founder of Cursor on their podcast; a strong recommendation to check that episode out! After using Cursor for just a few days, I don't want to go back to VSCode, and I'm even considering... maybe pausing my Copilot subscription 🤯

That's all for today folks! I wish you all a great week, and we'll see you in the next ThursdAI 🫡

Thank you for reading ThursdAI - Recaps of the most high signal AI weekly spaces. This post is public, so feel free to share it with a friend? Let's get to 1K readers 🔥 This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
Aug 20, 2023 • 52min

🎙️ThursdAI - LLM Finetuning deep dive, current top OSS LLMs (Platypus 70B, OrctyPus 13B) authors & what to look forward to

This is a free preview of a paid episode. To hear more, visit sub.thursdai.news

Brief outline for your convenience:
[00:00] Introduction by Alex Volkov
[06:00] Discussing the Platypus models and data curation process by Ariel, Cole and Nataniel
[15:00] Merging Platypus with the OpenOrca model by Alignment Labs
* Combining strengths of Platypus and OpenOrca
* Achieving a state-of-the-art 13B model
[40:00] Mixture of Experts (MoE) models explanation by Prateek and Far El
[47:00] Ablation studies on different fine-tuning methods by Teknium

Full transcript is available for our paid subscribers 👇 Why don't you become one?

Here's a list of folks and models that appear in this episode; please follow all of them on X:
* ThursdAI cohosts - Alex Volkov, Yam Peleg, Nisten Tajiraj
* Garage Baind - Ariel, Cole and Nataniel (platypus-llm.github.io)
* Alignment Lab - Austin, Teknium (Discord server)
* SkunkWorks OS - Far El, Prateek Yadav, Alpay Ariak (Discord server)
* Platypus2-70B-instruct
* Open Orca Platypus 13B

I am recording this on August 18th, which marks the one month birthday of the LLaMa 2 release from Meta. It was the first commercially licensed large language model of its size and quality, and we want to thank the great folks at MetaAI: Yann LeCun, BigZuck and the whole FAIR team. Thank you guys. It's been an incredible month since it was released.

We have seen a Cambrian explosion of open source communities who make this world better, ever since LLaMa 1. For example, llama.cpp by Georgi Gerganov is an incredible example of how the open source community comes together: this one guy, over a weekend, took the open source weights and made them run on CPUs, and much, much faster. Mark Zuckerberg even talked about this, how amazingly the open source community has adopted LLaMa, and how Meta is now adopting many of those techniques and developments back to run their own models cheaper and faster. And so it's been exactly one month since LLaMa 2 was released.

And literally every ThursdAI since then, we have covered a new state of the art open source model, all based on LLaMa 2, that topped the open source model charts on Hugging Face. Many of these top models were fine tuned by Discord organizations of super smart folks who just like to work together in the open and open source their work. Many of them are great friends of the pod: Nous Research, with whom we had a special episode a couple of weeks back, Teknium1, who seems to be part of every org, and Alignment Labs and GarageBaind being the latest folks topping the charts.

I'm very excited not only to bring you an interview with Alignment Labs and GarageBaind, but also to give you a hint of two additional very exciting efforts that are happening in some of these Discords. I also want to highlight how many of those folks do not have data scientist backgrounds. Some of them do, so we had a few PhDs or PhD-student folks, but some of them studied all this at home with the help of GPT-4. And some of them even connected via the ThursdAI community and space, which I'm personally very happy about.

So this special episode has two parts. In the first part we're going to talk with Ariel, Cole and Nataniel, currently known as GarageBaind. Get it? bAInd, GarageBaind, because they're doing AI in their garage.
I love it 🔥 They now hold the record for the best performing open source model, called Platypus2-70B-Instruct. And then, joining them is Austin from Alignment Labs, the authors of OpenOrca, also a top performing model, and we'll talk about how they merged and joined forces and trained the best performing 13B model, called Open Orca Platypus 13B, or Orctypus 13B.

This 13B parameter model comes very close to the base LLaMa 70B. So, I will say this again: just one month after LLaMa 2 was released by the great folks at Meta, we now have a 13 billion parameter model, which is way smaller and cheaper to run, that comes very close to the performance benchmarks of a way bigger, very expensive to train and run 70B model. And I find it incredible. And we've only just started; it's been a month.

In the second part you will hear about two additional efforts. One is run by Far El, Prateek and Alpay from the SkunksWorks OS Discord, and is an effort to bring everyone an open source mixture of experts model (you'll hear about what mixture of experts is). The other is run by friend of the pod Teknium, previously a chart topper himself with the Nous Hermes models and many others, to figure out which of the fine tuning methods is the most efficient, fast and cheap to run.

You will hear several mentions of LoRAs, which stands for Low-Rank Adaptation. These are methods that keep the huge weights of LLaMa and other models frozen, and retrain, fine tune and align some specific parts of them with new data, a method we know from the diffusion world. It's now being applied to the LLM world and showing great promise in how fast, easy, and cheap it is to fine tune these huge models, with significantly less hardware cost and time. Specifically, Nataniel Ruiz, the guy who helped Ariel and Cole train Platypus, and the co-author of DreamBooth, StyleDrop and many other diffusion methods, mentioned that it takes around five hours on a single A100 GPU to fine tune the 13B parameter model. That, if you can find an A100 GPU, is around $10. That's incredible.

I hope you enjoy listening and learning from these great folks, and please don't forget to check out our website at thursdai.news for all the links, socials and podcast feeds.

Brief outline for your convenience:
[00:00] Introduction by Alex Volkov
[06:00] Discussing the Platypus models and data curation process by Ariel, Cole and Nataniel
[15:00] Merging Platypus with the OpenOrca model by Alignment Labs
* Combining strengths of Platypus and OpenOrca
* Achieving a state-of-the-art 13B model
[40:00] Mixture of Experts (MoE) models explanation by Prateek and Far El
[47:00] Ablation studies on different fine-tuning methods by Teknium

Full transcript is available for our paid subscribers 👇 Why don't you become one?
Aug 17, 2023 • 17min

ThursdAI Aug 17 - AI Vision, Platypus tops the charts, AI Towns, Self Alignment 📰 and a special interview with Platypus authors!

Hey everyone, this is Alex Volkov, the host of ThursdAI. Welcome to yet another recap of yet another incredibly fast-paced week. I want to start with a ThursdAI update: we now have a new website, http://thursdai.news, and a new dedicated twitter account, @thursdai_pod, as we build up the ThursdAI community and brand a bit more.

As always, a reminder that ThursdAI is a weekly X space, a newsletter and 2(!) podcasts: the short form (Apple, Spotify) and the unedited long-form spaces recordings (RSS, Zealous page) for those who'd like the nitty gritty details (and are on a long drive somewhere).

Open Source LLMs & Finetuning

Honestly, the speed with which LLaMa 2 finetunes are taking over state of the art performance is staggering. We literally talk about a new model every week that's topping the LLM benchmark leaderboard, and it hasn't even been a month since LLaMa 2's release day 🤯 (July 18 for those who are counting)

Enter Platypus 70B (🔗)

Platypus 70B-instruct is currently the highest ranked open source LLM, alongside other Platypus versions. We had the great pleasure to chat with new friends of the pod Ariel Lee and Cole Hunter (and long time friend of the pod Nataniel Ruiz, co-author of DreamBooth and StyleDrop, which we've covered before) about this incredible effort to finetune LLaMa 2, the open dataset they curated and released as part of this effort, and how quickly and cheaply it is possible to train a smaller 13B version of Platypus (just 5 hours on a single A100 GPU ≈ $6 on Lambda 🤯).

We had a great interview with Garage BAIND, the authors of Platypus, and we'll be posting that in a special Sunday episode of ThursdAI, so make sure you are subscribed to receive it when it drops.

Open Orca + Platypus = OrctyPus 13B? (🔗)

We told you about OpenOrca just last week, from our friends at @alignment_lab, and not only is Platypus the best performing 70B model, the open source community also came through with an incredible merge and collaboration to bring you the best 13B model, which is a merge between OpenOrca and Platypus. This 13B model is now very close to the original LLaMa 70B on many of the metrics, LESS THAN A MONTH after the initial open source release. It's quite a remarkable achievement and we salute the whole community for this immense effort 👏 Also, accelerate! 🔥

Join the SkunksWorks

Speaking of fast moving things: in addition to the above interview, we had a great conversation with folks from the so-called SkunksWorks OS discord, namely Far El, Prateek Yadav, Alpay Ariak, Teknium and Alignment Labs, plus our recurring guest hosts Yam Peleg and Nisten, and covered two very exciting community efforts, both happening within the SkunksWorks Discord.

The first effort is called MoE, open mixture of experts, which is an open source attempt at replicating the Mixture of Experts architecture, widely credited as a reason GPT-4 is so much better than GPT-3. The second effort is called ablation studies, which Teknium is leading to understand, once and for all, what is the best, cheapest and highest quality way to finetune open source models, whether it's QLoRA, LoRAs, or a full finetune (see the sketch right after this section for what the LoRA flavor looks like in practice).

If you're interested in any of these, either by helping directly or by providing resources such as GPU compute, please join the SkunksWorks discord. They will show you how to participate, even if you don't have prior finetuning knowledge!
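If the LoRA / QLoRA / full-finetune distinction is new to you, here's a minimal sketch of the LoRA flavor using Hugging Face's PEFT library: the base weights stay frozen and only small low-rank adapter matrices are trained. The model id and hyperparameters are illustrative, not what the SkunksWorks folks are actually running.

```python
# Minimal sketch: wrap a frozen base model with small trainable LoRA adapters (PEFT).
# Model id and hyperparameters are illustrative only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # which attention projections get adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base weights
```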
And we'll keep you apprised of the results once they release any updates!

Big Co LLMs + API updates

In our Big Co corner, we start with an incredible paper from MetaAI, announcing:

Self-Alignment with Backtranslation method + Humpback LLM - MetaAI

Summarized briefly (definitely listen to the full episode and @yampeleg's detailed overview of this method), it's a way for an LLM to be trained, in an unsupervised way, to create high quality datasets for itself, using only a small amount of initial "seed" data from a high quality dataset. Think of it this way: fine-tuning a model requires a lot of "question → response" data in your dataset, and back-translation proposes "response → question" dataset generation, coming up with novel ways of asking "what would a potential instruction be that would make an LLM generate this result?"

This results in a model that effectively learns to learn better and to create its own datasets without humans (well, at least human labelers) in the loop. Here is some more reading material on X for reference.
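To make the idea concrete, here's a purely conceptual sketch of one back-translation round. This is my paraphrase of the loop, not MetaAI's code; seed_model and quality_score stand in for a seed-finetuned LLM and the self-curation scoring step.

```python
# Conceptual sketch of instruction back-translation (not the paper's actual code).
# seed_model: an LLM finetuned on a small seed set of (instruction -> response) pairs.
# unlabeled_responses: high-quality human-written text gathered from the web.

def backtranslation_round(seed_model, unlabeled_responses, quality_score, threshold=4):
    candidates = []
    for response in unlabeled_responses:
        # Backward step: guess which instruction would have produced this response.
        instruction = seed_model.generate(
            "Write the instruction that the following text would be a great answer to:\n\n"
            + response
        )
        candidates.append({"instruction": instruction, "response": response})

    # Self-curation: keep only the pairs the model itself rates as high quality.
    curated = [pair for pair in candidates if quality_score(seed_model, pair) >= threshold]
    return curated  # finetune on seed data + curated pairs, then repeat
```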
OpenAI new JS SDK (X link)

OpenAI has partnered with Stainless API to release a major new version 4 of their TS/JS SDK, with the following incredible DX improvements for AI engineers:
* Streaming responses for chat & completions
* Carefully crafted TypeScript types
* Support for ESM, Vercel edge functions, Cloudflare workers, & Deno
* Better file upload API for Whisper, fine-tune files, & DALL·E images
* Improved error handling through automatic retries & error classes
* Increased performance via TCP connection reuse
* Simpler initialization logic

The most exciting part for me is that it's now very easy to get started with AI projects and get streaming on the incredible Cloudflare Workers platform (Targum is part of the first Cloudflare Workers launchpad but is not affiliated, we're just superfans 🫶)

Vision & Multi Modality

There's been some really cool stuff happening in computer vision and multi-modal AI recently. First up, a new method called 3D Gaussian Splatting shows an incredibly clear and smooth way to generate 3D scenes from just a few images. Compared to neural radiance fields (NeRFs), Gaussian splatting produces much smoother results without the grainy voxel artifacts NeRFs often have, and it achieves this improved quality without sacrificing the speed and performance of NeRFs. So Gaussian splatting gives a big boost in realism compared to NeRF renderings, while maintaining real-time speeds and cleaning up those "clouds".

Supervision from Roboflow (and Piotr)

Btw, our own friend of the pod and AI vision expert @skalskiP (who reviewed Gaussian Splatting for us) is also having a crazy ThursdAI week, with their open source library called Supervision, a computer vision toolkit, trending #2 on Github 👏

Apple stepping up their Vision (not the headset) Transformer game

Apple has open sourced ml-fastvit, their general purpose Vision Transformer model, which they claim runs at ~1ms on mobile devices, with code and pre-trained weights available on Github 🔥 This is great to see from Apple ML teams: not only the open sourcing itself, but also preparing all of us for the world of spatial computers (Vision Pro coming, remember?) and the many new computer-vision-heavy apps that will be available at those incredible speeds. This is also great for on-device inference, running these models in node / on the edge (as friend of the pod @visheratin demonstrated with WebAI).

Additional updates included Nvidia releasing a web playground for NeVa, which is their MLLM (Multimodal LLM, get used to seeing this term everywhere; you can play with it here), and Link-Context learning for MLLMs.

Agents

OpenAI also announced that Global Illumination is joining OpenAI; that team is CEO'd by the creator of the Instagram stories algorithm and a feed contributor, and the team is behind a massive open world Minecraft clone. Will we see OpenAI release agents into that world? We know that they are working on agents.

A16Z - AI Town (🔗)

Speaking of agents roaming free and interacting, we covered the open sourcing of Smallville just last week ↴ and now we see a new open source framework called AI Town for running and letting agents roam and interact with each other, from the Andreessen Horowitz AI division. AI Town (Github) is a web framework written in TypeScript, built to run and get customized with different LLMs (even open source ones) in mind, and you can see the AI agents running around in a live demo here.

This ThursdAI was so packed with great information that it's really worth listening to the whole recording; you can do this on our Zealous page, via RSS and on twitter (all those links can always be found on thursdai.news). If you found this valuable, join our community and let your friends know? This is a great way to support us, as well as participating in the discussion on social; tag #thursdAI on anything you feel is worthwhile for us to summarize. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
Aug 10, 2023 • 16min

ThursdAI Aug 10 - Deepfakes get real, OSS Embeddings heating up, Wizard 70B tops the charts and more!

Hey everyone, welcome to yet another ThursdAI update! As always, I'm your host, Alex Volkov, and every week ThursdAI is a twitter space with a panel of experts, guests and AI enthusiasts who join to get up to date with the incredibly fast pace of AI updates, learn together, and listen to subject matter experts on several of the topics. Pssst, this podcast is now available on Apple, Spotify and everywhere via RSS, and a new long-form, raw and uncut, full spaces recording podcast is coming soon!

ThursdAI is supported by readers, and I promised my wife I'd ask: if you find this valuable, why not upgrade your subscription so I can keep this going, get better equipment, and produce higher quality shows?

I started noticing that our update spaces are split into several themes, and figured to start separating the updates into these themes as well. Do let me know in the comments if you have feedback, a preference, or specific things to focus on.

LLMs (Open Source & Proprietary)

This section will include updates pertaining to Large Language Models, proprietary (GPT-4 & Claude) and open source ones, APIs and prompting.

Claude 1.2 instant in Anthropic API (source)

Anthropic has released a new version of Claude Instant, a very, very fast model of Claude with a 100K context window; a very capable model that's now better at code tasks, and most of all, very, very fast! Anthropic is also getting better at giving access to these models, so if you've waited on their waitlist for a while and still don't have access, DM me (@altryne) and I'll try to get you API access as a member of the ThursdAI community.

WizardLM-70B V1.0 tops OSS charts (source)

WizardLM 70B from WizardLM is now the top dog in open source AI, featuring the same license as LLaMa and much, much better code performance than base LLaMa 2; it's now the top performing code model that also does other LLM-y things. Per friend of the pod and finetuner extraordinaire Teknium, this is the best HumanEval (coding benchmark) score we've seen in a LLaMa-based open source model 🔥

Also from Teknium, btw: a recent evaluation of the Alibaba Qwen 7B model we talked about last ThursdAI actually showed that LLaMa 7B is a bit better; however, Qwen should also be evaluated on tool selection and agent use, and we're waiting for those metrics to surface and will update!

Embeddings Embeddings Embeddings

It seems that in open source embeddings, we're now getting state of the art models (read: require no internet access) every week! In just the last few months:
* Microsoft open-sourced E5
* Alibaba open-sourced General Text Embeddings
* BAAI open-sourced FlagEmbedding
* Jina open-sourced Jina Embeddings

And now we have a new benchmark, MTEB, and a new leaderboard from Hugging Face (who else?) to always know which model is currently leading the pack, with a new winner from this week: BGE (large, base and small (just 140MB)).

Embedding models are very important for many AI applications: RAG (retrieval augmented generation) products, semantic search and vector DBs, and the faster, smaller and more offline they are, the better the whole field of AI tools will get, including much more capable, offline agents. 🔥 Worth noting that text-ada-002, the OpenAI embedding API, is now ranked 13 on the above MTEB leaderboard!
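Because these are just open weights on the Hub, trying one takes a few lines with sentence-transformers. A minimal sketch (the BGE repo id is what the leaderboard listed at the time of writing, so double-check it):

```python
# Minimal sketch: local semantic search with an open embedding model (no API calls).
# Requires `pip install sentence-transformers`; the repo id may have changed, so verify it.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-large-en")

docs = [
    "ThursdAI covers the top AI news of the week.",
    "Superconductors usually need extremely low temperatures.",
]
query = "What does ThursdAI talk about?"

doc_vecs = model.encode(docs, normalize_embeddings=True)
query_vec = model.encode(query, normalize_embeddings=True)

# Cosine similarity (a plain dot product here, since the vectors are normalized)
print(util.cos_sim(query_vec, doc_vecs))
```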
Open Code Interpreter 👏

While we're on the agents topic, we had the privilege to chat with a new friend of the pod, Shroominic, who told us about his open source project called codeinterpreter-api, an open source implementation of Code Interpreter. We had a great conversation about this effort, the community push, the ability of this open version to install new packages, access the web, run offline, and have multiple open source LLMs run it, and we expect to hear more as this project develops! If you're not familiar with OpenAI Code Interpreter, we talked about it at length when it first came out here, and it's probably the best "AI agent" that many folks have access to right now.

Deepfakes are upon us!

I want to show you this video, and you tell me: if you saw this outside an AI newsletter, would you have been able to tell it's AI generated? This video was generated automatically when I applied to the HeyGen waitlist, and then I registered again and tried to get AI Joshua to generate an ultra realistic ThursdAI promo vid haha. I've played with many tools for AI video generation and never saw anything come close to this quality, and can't wait for this to launch!

While this is a significant update for many folks in terms of how good deepfakes can look (and it is! Just look at it: reflections, HQ, lip movement is perfect, just incredible), this isn't the only progress data point in this space. Play.ht announced version 2.0, which sounds incredibly natural, increased model size 10x and the dataset to more than 1 million hours of speech across multiple languages, accents, speaking styles and emotions, and claims to have sub-1s latency and to fake your voice with a sample of only... 3 seconds! 🤯

So have you and your loved ones chosen a code word to authenticate over the phone? Or switched to a verifiable communication style? While those of us with multiple accents don't yet have to worry, everyone should stop believing any video or voice sample from now on; it's just inevitable that all of that will be deepfaked, and we should start coming up with ways to authenticate content.

If you made it this far, and any of the above was new/important to you, why not support this pod/newsletter/community? If you'd like to sponsor us more directly, please ping me at altryne [at] gmail.com. I'm also open to consulting and, if you're a great company, Developer Relations positions :)

Finally, we talked for a whopping 2 hours on the spaces, and that whole conversation can be heard on our Zealous page, which has transcripts, AudioGrams of key moments, and space summarizations! The long-form space recordings can be added to your podcatcher separately if you'd prefer the "ThursdAI raw feed", by using this RSS link, and will come as its own podcast very soon! Thanks to our friends at Zealous.

Thank you, Alex Volkov. Host, ThursdAI - Recaps of the most high signal AI weekly spaces. CEO @ Targum.video. AI Consultant with free slots (Lets Talk). This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
Aug 4, 2023 • 26min

ThursdAI Aug 3 - OpenAI, Qwen 7B beats LLaMa, Orca is replicated, and more AI news

Hi, today's episode is published on a Friday; it's been a busy week with at least 4 twitter spaces, countless DMs and research!

OpenAI announces UX updates
* Example prompts: No more staring at a blank page!
* Suggested replies: ChatGPT automatically synthesizes follow up questions. Then you just click a button.
* GPT-4 by default: When starting a new chat as a Plus user, ChatGPT will remember your previously selected model!
* Uploading multiple files is now supported in the Code Interpreter beta for all Plus users.
* Stay logged in: You'll no longer be logged out every 2 weeks, and if you do, there's a sweet new welcome page!
* Keyboard shortcuts: Work faster with shortcuts. Try ⌘ (Ctrl) + / to see the complete list.

ThursdAI - I stay up to date so you don't have to

Alibaba releases Qwen-7B
* Trained with high-quality pretraining data. Qwen-7B was pretrained on a self-constructed, large-scale, high-quality dataset of over 2.2 trillion tokens. The dataset includes plain text and code, and covers a wide range of domains, both general and professional.
* Strong performance. In comparison with models of similar size, Qwen-7B outperforms the competitors on a series of benchmark datasets evaluating natural language understanding, mathematics, coding, etc.
* Better support for languages. The new tokenizer, based on a large vocabulary of over 150K tokens, is more efficient compared with other tokenizers. It is friendly to many languages and helps users further finetune Qwen-7B to extend its understanding of a particular language.
* Support for 8K context length. Both Qwen-7B and Qwen-7B-Chat support a context length of 8K, which allows inputs with long contexts.
* Support for plugins. Qwen-7B-Chat is trained with plugin-related alignment data, and thus is capable of using tools, including APIs, models, databases, etc., and of acting as an agent.

This is an impressive jump in open source capabilities, less than a month after LLaMa 2's release!
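Here's a minimal sketch of what trying Qwen-7B-Chat locally looks like, going by the model card: it ships custom code on the Hub (hence trust_remote_code), and chat() is Qwen's own remote-code helper rather than a standard transformers method, so treat the exact call as an assumption.

```python
# Minimal sketch of Qwen-7B-Chat usage, adapted from memory of the model card (treat as an assumption).
# trust_remote_code pulls Qwen's custom modeling/chat code from the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat", trust_remote_code=True
).eval()

# chat() returns the reply plus the updated conversation history.
response, history = model.chat(
    tokenizer, "Summarize this week's AI news in one sentence.", history=None
)
print(response)
```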
GTE-large, a new embedding model, outperforms OpenAI's ada-002

If you've used or built any "chat with your documents" app, or have used a vector database, chances are you've used OpenAI's ada-002: it's the most common embedding model (the model that turns text into embeddings for vector similarity search). This model is now ousted by an open source (read: free) one called GTE-large, with improvements over ada across most parameters!

OpenOrca 2 preview

Our friends from AlignmentLab, including Teknium and LDJ, have discussed the release of OpenOrca 2! If you're interested in the type of finetuning work these folks do, we had a special interview w/ NousResearch on the pod a few weeks ago. OpenOrca tops the charts as the best performing 13B model 👏

HyperWrite releases a personal assistant

You know how much we love agents on ThursdAI, and we're waiting for this field to materialize; I personally am waiting for an agent that can summarize all the links and screenshots for this summary, and... we're not there yet! But we're coming close, and our friends from HyperWrite have released their browser-controlling agent on ThursdAI. Talk about a full day of releases!

I absolutely love the marketing trick they used, where one of the examples of how it works is "upvote us on Product Hunt", and it actually did work for me, and found out that I had already upvoted.

Superconductor continues

I was absolutely worried that I wouldn't make it to this ThursdAI, or wouldn't know what to talk about, because, well, I've become a sort of host, information hub, and interviewer of folks about LK-99. Many people around the world seem interested in its properties, replication attempts, and understanding this new and exciting thing. We talked about this briefly, but if it interests you (and I think it absolutely should), please listen to the recording below.

ThursdAI - See ya next week! Don't forget to subscribe, and if you are already subscribed and get value, upgrading will help me buy the proper equipment to make this a professional endeavor and pay for the AI tools! 🫡 This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
Jul 30, 2023 • 50min

🧪 LK99 - The superconductor that can change the world, and the K-drama behind it!

This is a free preview of a paid episode. To hear more, visit sub.thursdai.news

First of all, let me address this from the get go: I'm not a material scientist! I am pretty good at finding information in twitter's incredibly noisy info stream (hey, this is how I bring you AI updates every ThursdAI). Since LK-99 is potentially groundbreaking and revolutionary, I've compiled a twitter list of everyone who I found credible, interested, and a source of new information, and there are now over 1.5K followers of this list alone!

Since this clearly is interesting to a lot of you, I reached out to a few prominent people on this list and asked them to join a twitter space, to try and stitch together an update on the current state of LK-99: replication attempts, history and lore, as it stands a week after the original paper's release. If you found this interesting, and you're the type of person who wants to stay up to date, feel free to subscribe and keep this Substack alive!

First of all, let's do some level setting. Superconductors are real; we've used them in MRI machines, for example, but the currently available superconductors need extremely low temperatures and high pressures to work, and the promise of a room temperature, ambient pressure superconductor is the holy grail of energy use. For a breakdown of what superconductors are, and what they can mean for the world, I strongly recommend this thread from Andrew Cote (published presciently a full two weeks before the LK-99 paper) or watch this incredible breakdown:

July 22nd, the LK-99 arXiv day!

On July 22nd, two papers describing the "world's first room temperature superconductor" were uploaded to arXiv:

2307.12008 - Sukbae Lee, Ji-Hoon Kim, Young-Wan Kwon (submitted by Kwon)

and after 2 hours and 20 minutes another paper was uploaded:

2307.12037 - Sukbae Lee, Jihoon Kim, Hyun-Tak Kim, Sungyeon Im, SooMin An, Keun Ho Auh (submitted by Hyun-Tak Kim)

You may notice that the first two authors on both papers are Sukbae Lee and Ji-Hoon Kim; in fact, LK stands for Lee and Kim, and the 99 in the LK-99 name stands for 1999, the year they started research on this. You may also notice that YW Kwon, who submitted the first paper, is not included on the second one, and in fact is no longer part of the Quantum Energy Research Institute (aka QCentre), where he was the CTO (he's no longer listed on the site).

If this shakes out and the superconductor is replicated, there's definitely going to be a Netflix series on the events that led YW Kwon to release the paper, after he was no longer affiliated with QCentre, with limited information. So let's try to connect the dots (a LOT of this connecting happened on the ground by Seo Sanghyeon and his friends, and was translated by me).
Their original coverage has a LOT of details and is available in Korean here.

Let's go back to the 90s

On the LinkedIn page of Ji-Hoon Kim (the page turned blank shortly before my writing this), JH Kim showed that he started working on this back in 1999, when they estimated they had a material that contained a "very small amount of superconductivity". Together with Sukbae Lee, he established QCentre in 2018 to complete the work of their Professor Emeritus of Chemistry at Korea University, the late Choi Dong-Sik (1943-2017), who apparently first proposed the LK-99 material (following the 1986 bonanza of the discovery of high temperature superconductors by IBM researchers).

Fast forward to 2017: a wish expressed in a last will and testament starts everything again

Professor Choi passed away, and in his will requested follow-up research on ISB theory and LK-99. The Quantum Energy Research Institute was then established by Lee and Kim (LK), and they continued their work on this material. In 2018, there was a potential breakthrough, which could have been an accident that led to the discovery of the process behind LK-99. Here's a snippet of Seo Sanghyeon explaining this:

Kwon Young-Wan, the ex-CTO

Kwon is a Research Professor at Korea University & KIST, the third author on the first arXiv paper, and its submitter. He was previously the CTO, but at the time the paper hit arXiv he had not been affiliated with QCentre for "some months", according to an interview with Lee. He uploaded a paper naming only 3 authors (Lee, Kim and himself), and then surprisingly presented LK-99 research at the MML2023 international conference held in Seoul a few days later. We haven't yet found a video recording; however, a few reports mention him asking for an interpreter, and talking about bringing samples without a demonstration or proper equipment.

Enter Hyun-Tak Kim

H.T. Kim is probably the most cited and well-known professor in academia among the folks involved. See his Google Scholar profile, with a D-index of 43, 261 publications and 11,263 citations. He's a heavy hitter, and he is the submitter and a listed author of paper number 2, submitted to arXiv 2 hours and 20 minutes after paper number 1 above. In the second paper, he's listed as the third author (and the submitter to arXiv), and his contribution is acknowledged like so:

An author, Hyun-Tak Kim (H. T. Kim),'s knowledge on mechanisms of both superconductivity and the metal-insulator (gap-nogap) transition highly contributed to writing the mechanism part. The knowledge was acquired over 20 years by processes of performing national projects including project [Grant 2017-0-00830] funded by Institute for Information and Communications Technology Promotion (IITP) in MSIT of Korea government in ETRI. H. T. Kim left ETRI on Nov. of 2022.

In the first paper H.T. is not acknowledged, and is only mentioned in reference no. 52, to his paper from 2021.

Ok, enough about the people, Alex! Does the rock levitate?

In January, the QCentre youtube channel uploaded an unlisted video that showed magnetic properties of LK-99, and another video, with partial levitation, is widely shared on social media. The partial levitation shown is attributed to the Meissner effect and is a supposed proof of room temperature superconductivity. However, these two videos are inconclusive and are not enough for us to take QCentre's claims at face value.
The scientific community, having been stung by a recent incident surrounding a supposed room temperature superconductor where the evidence was apparently falsified (Dias et al.), is not so easily swayed. Adding to that, the mess around the multiple papers showing different theories, the lack of peer review or independent replication, the surprise publication, and a rushed follow-up publication all make people wonder: what is going on here? This doesn’t seem like a fabricated attempt.

Summary of replication attempts so far (Sun, Jul 30)

Given the importance of this discovery and the “relative” triviality of replication (common enough materials, a process that is not extremely complex, but kids, do not try this at home), we can bet that “furnaces in solid-state materials labs around the world have been cooking yesterday and today to try to reproduce” [Science Magazine].

We have reports from China that supplies of lead apatite are running dry as many labs quietly try to replicate. There are additional reports from India, where Dr. V.P.S. Awana, Chief Scientist at CSIR-NPL, and his team are trying to replicate, with results expected as early as tomorrow (Monday, Jul 31); he has been emailing with Lee.

In addition to this, we’ve had Andrew McCalip from Varda Space live-tweeting and Twitch-streaming his “Meissner effect or bust” campaign to reproduce LK-99, while the world watches (Andrew joined the space as well) and provides ideas, materials and an outpouring of support for this gung-ho, almost cowboy effort. We’ve also heard from folks at MIT that professors who want to remain anonymous, and who went to MML2023, are in contact with the team and are trying to test the material.

Replication failure is … not a failure?

Discussing the replication attempts with experts on stage, we all concluded that there are likely 2 ways for the world to know whether LK-99 is a superconductor:
* Replication succeeds and scientists analyze the replicated sample
* The QCentre team provides a sample, and some very smart independent folks put it under a microscope, run a magnetism analysis and a bunch of other measurements, and confirm that it’s a superconductor at room temperature.

While we wait for either of those, I encourage you to check out the resources, the space recording, and the list of folks I’ve collected to stay in the loop! Here’s a list of relevant links:
* Paper 1 DOI
* Paper 2 Arxiv
* Paper 3 Arxiv
* New Scientist Interview
* ChosunBiz Interview (Korean)
* Yonhap Interview (Korean)
* Twitter List

And the list of folks who participated in the space, give them a follow:
* Alex Volkov (@altryne)
* Seo Sanghyeon (@sanxiyn)
* Ate-a-Pi (@8teAPi)
* Andrew McCalip (@andrewmccalip)
* Andrew Cote (@Andercot)
* Ely Rabani (@radsci)
* Robotbeat (@Robotbeat)
* Marsh Ray (@marshray)
* Ben (@BenShindel)
* Ken Condon (@KenCondon1)
* Jesus (@jesuslares_me)
* Danielle Fong (@DanielleFong)

For your convenience, attached is an AI transcription of the space with speakers and timestamps (may be off by a few minutes):

[00:02:40] Alex Volkov (@altryne): Hello. Hello, everyone. There's a lot of you here, and I wanna welcome a shoot for up on stage while we wait for a few more guests, and then we can get started. Thank you so much for taking the time joining us. as you're as interested as all of us in this very exciting, very confusing, very potentially groundbreaking news. So I wanna introduce 2 folks up on stage 2 folks up on stage already, and bringing up another one just now. And hey, Andrew.
Hey.[00:03:18] Alex Volkov (@altryne): Hey, How are you guys?[00:03:23] Ben (@BenShindel):Doing well. How are you?[00:03:27] Alex Volkov (@altryne): A little bit you know, the palms are a little bit sweaty. This is a insane turnout. Twitter is indeed a public space on because that we have. And, hopefully, spaces or two spaces, whatever they call it now, will hold. And I only invite Sam here to speak as well. Hey, Tobias. How are you?[00:03:51] Ate-a-Pi (@8teAPi):I'm good. I'm good. So good to good to, you know, hear from you guys in person, Alex. Thanks for putting the space together.[00:04:00] Alex Volkov (@altryne): Thirdly. Andrew, we're gonna introduce Andrew, but many folks who are here already follow you and and follow your work. How how's your evening going, Andrew?[00:04:12] Andrew McCalip (@andrewmccalip):Lee, this has been a wild ride. Thanks for putting all this together. It's gonna be great to get all the information in one place for the first time. This is my first time experiencing the full volume of the Internet, and just been a a lot of fun to see all the positivity around the progress.[00:04:29] Alex Volkov (@altryne): That's great. So I'll do my best that, you know, Mother think this. I will maybe preface this that I am not a scientist. Many of the terms that we'll hear today in the space I've heard for the first time a couple of days ago. What I am is a Twitter for many, many years, and I have collected a a list of folks who I I personally wanted to follow to kinda see the updates as they roll out, and we've seen many, many things roll out very quick. with a lot of confusion and different replication attempts from different places. And I just compiled the list for myself. I started following.[00:05:08] Alex Volkov (@altryne): 8 to buy had incredible incredible content diving into the the timeline. I found I I wanna introduce thank you. Am I saying this right? I think you need to hit the the mute button in a mute. If this is your first time talking on RESTASIS. let me know if you're able to do that. And if not, we'll try to solve this. And out as I was collecting folks, And I I started seeing that Andrew started doing their application attempts and even doing Twitch.[00:05:46] Seo Sanghyeon (@sanxiyn):Can you hear me?[00:05:47] Alex Volkov (@altryne): Can you hear me? We can hear you. Hey, Sam Kim. How are you?[00:05:57] Seo Sanghyeon (@sanxiyn):It it it's the noon in South Korea, and I'm fine.[00:06:01] Alex Volkov (@altryne): the afternoon. Right?[00:06:03] Seo Sanghyeon (@sanxiyn):It's 1. Yes. Yes. It's the 1 PM.[00:06:06] Alex Volkov (@altryne): Awesome. And so I was just doing an introduction maybe as you were telling up, you maybe not heard some of it. However, folks in the audience who followed this kind of thread and how we came to be here I have a a thread that I'll post on top here that has all the folks from the Twitter list that I forgot. And San Kyung and his his team is basically the reason for the space. Me and Nathan kind of found Sunqun. Am I saying Sunqun correctly? Is that is that the right way to say this?[00:06:41] Seo Sanghyeon (@sanxiyn):My name is. Your your, yeah, your pronunciation is not actually not.[00:06:48] Alex Volkov (@altryne): Okay. I'll I'll turn my best to put months at the at the right names. And so we both me and 8 to 5, a a 34 in Saint Kyung, who's in Seoul currently, and definitely speaks the language we don't speak, and so there's a lot of insight and translation. 
And so, yeah, I guess we'll will get started, so feel free to present yourself, and then talk a little bit about your last few days and how you came around getting in this topic. and then how kinda what you found so far.[00:07:28] Seo Sanghyeon (@sanxiyn):I I didn't really expect to to speak.[00:07:30] Alex Volkov (@altryne): That's okay. That's okay.[00:07:32] Seo Sanghyeon (@sanxiyn):That's put me put me on the spot. Yeah.[00:07:34] Alex Volkov (@altryne): I don't wanna put you on the spot, but give us maybe a brief summary.[00:07:44] Ate-a-Pi (@8teAPi):Maybe maybe do you do you want me to help Sanyon?[00:07:47] Seo Sanghyeon (@sanxiyn):Yes, please. Okay. You you have read my right top, so maybe maybe you can explain what's going on.[00:07:57] Ate-a-Pi (@8teAPi):Okay. So I'm I'm just gonna I'm just gonna just to preface everything, I I'm writing a work of fiction. So all of you guys are just participating in an experiment. So but I'm trying to keep everything to kinda, like, factual and trying to interpret what what is kind of happening on the ground. Right? Shyam is much more factual, and he he has actually been doing a primary source work. So he's been actually digging up the actual Korean language science papers. He's been sitting down with friends They've kinda, you know, summarized and kind of tried to understand what's going on.[00:08:36] Ate-a-Pi (@8teAPi):And he's really the one that's, you know, put together this that that the you know, the the the mentor, you know, whose name, I think, in some transliterations comes out to TS's chair, some Donsick He the mentor was basically in superconductors in this idea of this kind of 1 dimensional super and he had this theory.[00:09:00] Seo Sanghyeon (@sanxiyn):That so the name is che. che. Oh, sure. Yeah. Yeah. Yeah. He was a a professor in the Korean University's Department of Chemistry.[00:09:13] Ate-a-Pi (@8teAPi):Yeah. And and so he he had this idea, this theory, and he had graduate students. and one of those graduate students was Lee, and Lee kind of took up the mantle of this this theory. And then they, you know, tied up with who was an experiment list.[00:09:37] Ate-a-Pi (@8teAPi):And then they kinda discovered this trace this coast of a trace of a material in 1990 And at that point, what happens is having discovered this trace, their path kind of diverge this, and Kim, the experimentalist, goes on to do a masters, not in superconductors. So he does his masters in something else, and then he does the battery materials kind of PhD, and he graduates in 2008.[00:10:12] Ate-a-Pi (@8teAPi):while Lee continues on the superconductor path, does experimental any when he publishes his PhD. It's both a theory and synthesis of superconductors. And then he graduates, and then he he goes to work as a science adjunct professor, which we which we just found out. Like, a computer science adjunct professor, and he's there for about, you know, 4, 5 5 years. He doesn't publish. And and I'm guessing at this point, he kinda gets, like, you know, cashier out of out of academia completely, and he sets up a consulting firm, basically, Q Center.[00:10:50] Ate-a-Pi (@8teAPi):And they start taking on consulting work. And and then, again, the timeline is a little bit unclear on whether or not they continue to work on on the on on the product on what they discovered. 
And what happens then is in 2017, Chey Dongksik passes.[00:11:18] Ate-a-Pi (@8teAPi):And as he passes, he he gets his former students together, and he asked them to finish off what they started to find this superconducting material that they saw a ghost of a trace of in 1999. And he passes, and they have no money. basically. Song Young has done, again, primary source research, and, you know, the the office space is basically, like, like, a two story building, you know, somewhere in the you know, in in Seoul. It's a very modern kind of office. They don't have much money.[00:11:57] Ate-a-Pi (@8teAPi):My guess my guess is that they need Kim. because KIM is the experimentalist, and I'm guessing also that none of the theory works at this point. The only thing that they have to go on is that they actually did find something in 1999. And Kim, I'm guessing, is also quite practical because he didn't do he didn't pursue the superconductors for the PhD. Right? Because he's quite practical, he's like, dude, you get me money. I'll join you. You don't have money. I'm not joining you for your wild goose, Jason. Right?[00:12:36] Ate-a-Pi (@8teAPi):So Lee goes out and he recruits Kwan. And Kwan is kind of like you know, he's he's a US PhD. He has a research university, you know, position. recruit them, and they get funding. And I think I think Sam Young, you were you were saying that Kwon is the one on the, you know, National Science Foundation of Korea's like you know, list, like, grant. Right? I I think that's what you said.[00:13:08] Seo Sanghyeon (@sanxiyn):So the paper mentions the public grant from South Korea. called the National Resource Foundation, which is like National Science Foundation in United States. And Korn is listed as a primary invest mitigate our PI, if then.[00:13:25] Ate-a-Pi (@8teAPi):Right?[00:13:26] Alex Volkov (@altryne): Mhmm.[00:13:27] Ate-a-Pi (@8teAPi):Yeah. Yeah. That's right. Okay. So he he's the PI. So they recruit him as the PI, and Jade Kim, who is, you know, Lee's partner, basically leaves his very comfortable position as a research director in a hearing aid test.[00:13:44] Seo Sanghyeon (@sanxiyn):Yeah.[00:13:44] Alex Volkov (@altryne): Yeah. Yes.[00:13:45] Seo Sanghyeon (@sanxiyn):Yes. Yeah. Hearing aid Yeah. I Or the eye test there? Yeah. Yeah. For the ISER tech and in manufacture, the battery is specialized for the hearing aid. code. It is a medical device. They have a different standard from other batteries. And company a small business in South Korea, but seems competitive worldwide.[00:14:13] Alex Volkov (@altryne): So he leaves his let me let me -- Yeah. Go ahead. Just real quick and to give folks a quick summary. The main paper that we saw the explosion from that was published on July 22nd, so a week and and almost a day we're, like, almost 8 days into this. The three people that you you just said, besides the first professor, Choi or chair or Troy and or several places write it separately. So the the three people, SoftBank, Jihoon Kim, which is the LK in LK 99, right, Lee and Kim. And the third person you just mentioned is Young Wan, Kwan. Yes.[00:14:52] Alex Volkov (@altryne): Those are the the 3 authors on the paper that kind of was published on our side out of the blue. 8 days ago. Please continue.[00:15:03] Ate-a-Pi (@8teAPi):Right. And then so at this at this point, they're in 2017, And, you know, Lee goes out and does the fundraising. He recruits Kwan, who's the research professor, Kwon is basically he's on the paper. 
He he's he's the principal investigator on the grant, but he's still a professor at university. So he's basically, I'm guessing, like, a day a day in the, you know, in the office at Q Center, very modest place. I think the grand size is pretty small, and they get this ESR machine.[00:15:41] Ate-a-Pi (@8teAPi):And again, from what I can tell, the ESR machine only came knows how to use it. Because none of the other people are actually synthetic, you know, synthesis people. They're all like theory guys, Kuan is a physicist. And Kim himself, JH Kim himself, he's looking for something which you have to know what you're looking for, right? Because that's what he says in his LinkedIn. He's like, I'm looking for some if you don't know what you're looking for, then forget about it. Right?[00:16:19] Ate-a-Pi (@8teAPi):But he he knows what he's looking for, and they refine, they refine, and they refine, and he keeps doing experiments. He keeps refining the experiment, and he goes through, like, a 1000 iterations. And somehow, starting in 2018, somehow, By the middle of 2018, they find it. So that that's a surprising thing for me because they've I I I suspect they they've been working on it you know, before or, you know, Jay and Lee had a breakthrough on their theory, so they knew how to narrow the workspace down. But somehow in at the end of the day, Kim is the one grinding.[00:16:58] Ate-a-Pi (@8teAPi):Through that 1000 experiments, finally, to get, you know, a sample that works.[00:17:03] Seo Sanghyeon (@sanxiyn):And then they start by -- No. No.[00:17:05] Alex Volkov (@altryne): No.[00:17:05] Ate-a-Pi (@8teAPi):No.[00:17:05] Alex Volkov (@altryne): No.[00:17:05] Seo Sanghyeon (@sanxiyn):No. No. No. No. No. No? So so besides the two papers, there is a paper published in April returning query. And In their own words, they describe what what prompted their breakthrough in 2018.[00:17:27] Seo Sanghyeon (@sanxiyn):and it said that so so they are putting the material in a quartz tube And because they called it to best courts to cancel and Brooke, And the material left after the breaking of the glass was had the property they wanted. So so it was an accidental discovery.[00:18:02] Ate-a-Pi (@8teAPi):So can can you repeat that? Like, they what what happened? They put it in the quartz tube, and the quartz tube broke accidentally?[00:18:10] Seo Sanghyeon (@sanxiyn):Yes.[00:18:10] Alex Volkov (@altryne): Yes. Yes.[00:18:11] Seo Sanghyeon (@sanxiyn):I see. And and And that what's the breakthrough in 2018? I see. It's what I'm saying.[00:18:19] Alex Volkov (@altryne): Yeah. I just wanna confirm what I hear. The breaking of the course you led to the incidental discovery. This is this is the the breakthrough as it's written in the first paper in Korea? Yes. Yes. Okay. So I'll just call ASAP, I'll just give it back for some logistics. Folks, if you look up on on top of the space, there's a few tweets we're pinning. And as we go along, we're gonna add some information on top of this. The 3rd the third we pin from dystopian breaker has a link to the original kind of Korean paper. So please go ahead, Datapai.[00:18:54] Seo Sanghyeon (@sanxiyn):So so quick -- Okay. point.[00:18:56] Alex Volkov (@altryne): Yeah.[00:18:56] Ely Rabani (@radsci):Go ahead. Go ahead. This this could be important because, you know, as as soon as you expose it to the atmosphere, your getting hydration. And hydration, you know, might be harmful, might be helpful. 
From this, like, little account, it seems like it it it either didn't do anything or was helpful. But, like, no what temperature it was at when it broke, and and things like that could could actually be really pertinent.[00:19:30] Ate-a-Pi (@8teAPi):Yeah. So, absolutely, like so it's not they he does do the 1000 experiments, but the 1000 experiments, whether that gets him there or not, at one point in the experiment, the quartz tube breaks, that gets them there. They get lucky. Right? So they get they get lucky. And then after that, things proceed pretty quick They isolate they isolate it, and then they they get the crystallization. They start working on the papers. They start on the patents, and they start also trying to figure out the chemical vapor deposition process. They seem to have made some way some headway on the chemical vapor deposition process.[00:20:06] Ate-a-Pi (@8teAPi):And then, you know, sometime around September 2021, something start happening. Quant takes a position, sabbatical at, I think, Seoul University at that point. I'm not sure whether that means he's putting more time in the office or not. And then that fast forwards to yeah. Go go ahead, Sunggham.[00:20:33] Seo Sanghyeon (@sanxiyn):No. No.[00:20:33] Alex Volkov (@altryne): No.[00:20:33] Ate-a-Pi (@8teAPi):You go ahead. Okay. So that fast forward about March 2023 when basically the international patent has been filed. And Kuan leaves the team at this time. I'm not sure when Kim comes on board. That's not very to me at what point Yum Tuck comes on board.[00:20:57] Ate-a-Pi (@8teAPi):So I'm guessing it's after the nature, the nature paper gets dinged in 2020, And and and, you know, the the other thing that strikes me also is that every single person on the team is very aware of every single hoax in superconductors to date. Right? They they they all know the space well, They've seen every single hoax before. They know they know what the hoaxes look like. They know what to look for. They know what diamagmatism is. So I I I don't think yeah.[00:21:29] Seo Sanghyeon (@sanxiyn):Go ahead. So the date is So the day before the yesterday, Andrew McCully posted on his Twitter the translation of the Korean paper at Doctor Lloyd. Is that correct? And can can you so so how did you translate and can Can you say something about it?[00:21:59] Alex Volkov (@altryne): Andrew, I think he's Frank to you. So I can just ring to you. You posted a translated paper also. Right?[00:22:08] Andrew McCalip (@andrewmccalip):Yes. Now that was just a machine translation from Google. That was just a very cursory translation.[00:22:19] Seo Sanghyeon (@sanxiyn):Okay.[00:22:19] Ate-a-Pi (@8teAPi):So in basically, quantity is team in March, and then you have the kind of papers being released, you know, haphazardly. The next the next point that of them is that they had started releasing the papers app as early, like, late last week.[00:22:42] Alex Volkov (@altryne): And and then and then we have -- And by the way, I think it's it's important to highlight by Kwan, the guy who's no longer affiliated with with QCenter. Like, this this sole endeavor a business venture that's funded for for this for this purpose. Kwan is no longer affiliated with that. 
We've seen Sankyo posted an interview in Korea from Friday where I think both of the and Kim say that Kwan, the guy who published the first paper, is no longer affiliated.[00:23:12] Alex Volkov (@altryne): there were some speculation as to maybe the limit of three people on the paper is the limit of the Nobel Prize or 2 or 3 authors. I don't have this confirmed, but this is speculation going around. And it's important to note like, both of them say that the paper was not ready when it was released, and it was released by Juan, the guy who left the first paper. 2 hours later, 2 than 20 minutes later, another paper gets released in the in the same archive with, I wouldn't say, 5 authors. not including Kwan. Right?[00:23:48] Ate-a-Pi (@8teAPi):So Lee -- Yeah. And -- The user the the user name is TumTuk team, the the college professor from, you know, Virginia is the username who who pushes the r archive paper at that Yeah.[00:24:04] Seo Sanghyeon (@sanxiyn):Chantakim is a big name with the 18 days of 45, and If you look at the paper, there is an error message in Korean saying that Bloomberg could not be found. It is a neutral error message when you did the some of the typesetting wrong.[00:24:27] Seo Sanghyeon (@sanxiyn):And You just don't probably see the room temperature, sugar conductor paper with the error deaths that had to bookmark cannot be found if you are following if you are in not in emergency.[00:24:52] Alex Volkov (@altryne): So so it does feel to us at least from the summary so far that the paper that Quang released has different information than than the second paper, and the second paper feels like it was released in the Harry and included more people that currently work at Q Center, including Hyundai Kim. And Sonja, owner, you this question. You mentioned his h h score or something score. Can can you explain the importance of that score for him talking?[00:25:20] Seo Sanghyeon (@sanxiyn):creates someone else to the explanation.[00:25:24] Ate-a-Pi (@8teAPi):Okay. So so the h score is, you know, because we have a web web savvy audience here. It's kind of like a page rank for, you know, researchers. It shows you how influential how influential the researcher was, and so a higher score means that more people have been citing your paper.[00:25:45] Ben (@BenShindel):Go ahead, Ben. Yeah. More precisely. So, like, an h index of, say, 40 means you have 40 papers that each have 40 citations or more. That's a little tricky to understand. So, like, if I get another paper that has only 30 citations, it won't affect my h index at all. I have to get a 41st paper that has 41 citations to to to make it rise.[00:26:07] Alex Volkov (@altryne): So I think it's it's safe to say HUNTAKIM, the guy who submitted the second paper, potentially haphazardly. Correct? Like, we're we're we're saying there's 2 hours after the first one. So likely prompted by these events is a well well sighted very well sighted scientist with a very high kind of confidence score. It's not like a random person of the street that decide that there's now a superconductor of room temperature and, you know, verified it.[00:26:41] Seo Sanghyeon (@sanxiyn):Okay. Sorry for being side tracked, but I just checked the the motion related to Korean paper or not to talk through it by Andrew. And on the page 5, we clearly said that the quartz tube was destroyed due to internal pressure during rapid cooling of reaction and etcetera. So I think, in fact, nobody really read ready carefully. 
It is it is just there about the quartz tube once destroyed.[00:27:19] Ate-a-Pi (@8teAPi):Yeah. So I think I think it's yeah. Definitely, like, probably the the rest of us are are are not very close readers. of of that paper.[00:27:29] Seo Sanghyeon (@sanxiyn):So so We can we can continue on after the upload to the archive.[00:27:42] Ate-a-Pi (@8teAPi):Indeed. So okay. So they they they it goes into our our archive, and then all of the events of the last week happen you know, I don't think any of us expected any of the events to happen. So we've all just been kind of, like, following along and seeing what happens next. I had no idea that there was a metallics conference in South Korea, and I I definitely had, like, no idea that you know, one of the authors would show up there, and it gets posted on Twitter. And so and then and then Seung Young points it out on the FM Korea Football message board.[00:28:20] Ate-a-Pi (@8teAPi):And so we translate, you know, what the audience reaction was in in in a bad translation to get -- So -- -- whatever message was across.[00:28:30] Alex Volkov (@altryne): -- mind let me interject here because this is around the that I found out about this. Alex, frozen coffee. Alex, I forgot his nickname. We invited him here. He posted a a very long Twitter thread that got the attention of the algorithm and then boosted of this room template ambin pressure, superconductor paper from Korea. I think he only started talking about the first paper, and then after the second paper also came out. And I think at this point, or somewhere around there. Andrew, you found out about this. What what did you first hear about, you know, Twitter drama around LK 90 Right?[00:29:08] Alex Volkov (@altryne): And, Andrew, feel free to at least produce you know, introduce yourself officially and BARDA and how you're interacting with this.[00:29:16] Andrew McCalip (@andrewmccalip):Yeah. So I was just cruising the Internet at night, and this came across. I think my my Twitter feed And so I I'm incredibly curious. This is something that has been a bit of a a hobby for me. And so I was always interested in superconductors, so it it caught my attention. I'm a mechanical engineer. So full disclosure. I am not a subject matter expert. I am simply an aerospace engineer that has a lot of curiosity and some assets at his disposal.[00:29:50] Andrew McCalip (@andrewmccalip):And so reading this paper, it it struck me just the simplicity of of the process. And so I realized that I probably had the ability to replicate with full fidelity, the process that was described in the paper. And so that within about 30 minutes, I I realized I should simply start down this road that Twitter was already picking up at the time.[00:30:21] Andrew McCalip (@andrewmccalip):There's some conversations going back and forth and the it was the classic scenario where on every superconductor discussion, there is the same conversation that happens over and over again. And this synthesis appeared so simple that it seemed that the most expedient thing was to simply test it physically. And so my my work is very receptive of of after hours projects. I'm I'm known as the the guy that has really aggressive hobbies, let's say.[00:30:57] Andrew McCalip (@andrewmccalip):And so I'm always in the back doing something interesting with materials or automation. So within 30 minutes of reading the paper, I had ticked off orders to various chemical suppliers. I've reached out to overseas vendors. to try to procure a couple of the the elements. 
And so it was just kind of an offhand comment that I made on Twitter and and then the ball really started rolling, and I realized that everyone wanted to see this this made.[00:31:32] Andrew McCalip (@andrewmccalip):And so it was just supposed to be a a a fun little project, but I was really overwhelmed by the the response. Everyone wanted to to see this done. I think there's this incredible curiosity, there's this incredible drive. People wanna see, like, incredible things happen for the the the human race. And so something if this magnitude pops up, everyone's motivated to drop everything and investigate. And I think that's where we're at.[00:32:08] Alex Volkov (@altryne): And I think you met the algorithm at the right place where folks were excited about the future and think this could bring a lot of changes around the future, and you started saying, hey. You know? Here's a here's a direct approach. Let's try to replicate this. And I I wanna just highlight the fact the the materials involved in creating this. And the process, some folks say and please talk about this. Some folks say that has been an attempt at a hoax, it wouldn't be as simple. They wouldn't have released a simple instruction manual kind of quote, unquote simple that many labs around the work they replicate given the materials and and the right equipment. Right?[00:32:48] Ely Rabani (@radsci):So -- Yeah.[00:32:48] Alex Volkov (@altryne): So -- -- straightforwardness of this potentially shows some stuff.[00:32:51] Ely Rabani (@radsci):So this this is a good time for for a PSA. I mean, I know that that Andrew is well aware of this, and and and many of peep of the people who've been following it. But in case anybody who's listening isn't. The these compounds in vapor form at any rate are are highly talked music, and you you have to know lab safety. If you're gonna start trying to experiment with them, you need things like, a glove box and, you know, all kinds of PPE, a fume hood, everything else. Taking risks with this kind of thing is just really not worth it.[00:33:31] Alex Volkov (@altryne): I I I can't stress that. Absolutely. Don't try this at home.[00:33:36] Andrew McCalip (@andrewmccalip):kids definitely. Yeah. Absolutely. There's a lot of chatter in the beginning in the first couple hours about this can be replicated in a garage And, you know, I thought it was interesting. I thought maybe we've got the opportunity to to do it safely. we've got all the right equipment. We've got, you know, the the 1,000,000 of dollars of equipment that support our spacecraft business. that allow us to do some of these things safely. And so I thought Twitter wants to live vicariously through somebody why not do this?[00:34:12] Andrew McCalip (@andrewmccalip):I ended up being in sort of an interesting middle ground because I'm not in academia. I'm also not trying to commercialize any part of this tech. really just doing it for fun because it's incredibly interesting. So I've got no skin in the game except for making this work in a transparent manner. and then getting the materials into the hands of the experts.[00:34:34] Andrew McCalip (@andrewmccalip):So I thought if we can leverage some of our equipment and some of our, you know, very smart people that we have, to speed this timeline up, I didn't see anybody in the United States being vocal about trying to do replication there are so many stories coming out of other parts of the world that all the labs, there must be thousands of furnaces burning right now trying to replicate this. 
But I wanted to get material into the hands of some local experts in California.[00:35:09] Andrew McCalip (@andrewmccalip):And so that's really our our goal is, you know, can we can we sort of be the face of of the Internet do this experiment in a safe manner and then help advance the science and be sort of a forcing function to to doing this replication.[00:35:27] Alex Volkov (@altryne): So, Andrew, just before just a a small pause before you continue, I want to ask the other, Andrew, here. The Andrew code, if if you're able to unmute and and and talk us if you're available about the potential reasons why all of Twitter jumped on this. Andrew Kot, you had a thread on room temperature superconductors. About 2 weeks before this, like, almost a permanent is kind of a threat. And could you give us some summary first of all, feel free to introduce yourself, but also some summary of what this means if this replicates, what this means for the world.[00:36:07] Alex Volkov (@altryne): Applications, you know, give us, like, some excitement of what happens if this is an actual ambient pressure in room temperature superconductor? Andrew? Does not look like Andrew is Oh, hey.[00:36:33] Andrew Cote (@Andercot):Sorry. My my audio cut out for a second. I I missed the prompt. Oh, here you are. Let you only -- Sure. Yeah. Thanks. Thanks very much.[00:36:44] Alex Volkov (@altryne): So so folks so so I I explained to folks your thread about MBN, you know, pressure room temperature superconductors that you've offered, what, 2 weeks before the paper came out. And then suddenly, this dropped. And I wanted you to highlight some of the potential applications of superconductors and give us some of the highlights of what happens in this replicating. This is an actual, you know, real thing.[00:37:08] Andrew Cote (@Andercot):Yeah. Sure. So it's kind of a funny thing. Yeah. I put that thread out there 7 weeks before this story broke. You know, just I have worked with this kind of stuff in in a few different areas now, so it's very, you know, superconducting radio frequency cavities are standard technology in accelerator physics to fill these to work in.[00:37:31] Andrew Cote (@Andercot):Like, my first job in physics was actually in a condensed matter lab using a a scanning tunneling microscope to look at, you know, electronic structures of potential high temperature superconductors So this has always been sort of like a holy grail of material science, like sort of a holy grail of applied physics. It's one of these properties it's one of these materials where the bulk properties come from its quantum mechanical behavior. And and, you know, when quantum mechanics and its effects escape the realm of the very tiny, it can really manifest as as magical phenomenon at our scale in the world of the kind of the bulk matter or the big stuff.[00:38:10] Andrew Cote (@Andercot):So, you know, superconductors are used currently today, You know, it's it's they've reached engineering applicability through decades of continuous refinements and improvements. And and some of the biggest things to think about in what lets these things get used in industrial applications is their ability to superconducts at higher and higher temperatures And, also most also importantly, is to operate at higher and higher background magnetic field strengths. 
And so the way to think about this is that a superconductor, it's allowing current to move through it with zero resistance, but it also perfectly spells magnetic fields.[00:38:48] Andrew Cote (@Andercot):And there's an operating point of these materials where it's basically the current density and the temperature and the magnetic field kind of put the bounds or the performance envelope on the material. So some conductors can carry tons of current, but they can't exist in a very high field. And so, you know, those are hard to make as useful. You can use them for carrying, like, electricity, which is awesome, but often what you really wanna do is generate very strong magnetic fields. So I think maybe the most familiar to the most people here would be, like an MRI machine. Right?[00:39:27] Andrew Cote (@Andercot):Magnetic resonance imaging. So the idea there is you're generating very high strength field, and magnetic fields are measured in Tesla, for example. So just for just for context, you know, 3 Tesla is a is a pretty strong field, and that's what is about the strength using an MRI. So, you know, MRIs use these cryogenically cooled magnets, or or they're not I don't think cryogenically cooled. They're actually often just copper, but they do have cooling. But they generate this high strength field, and then, you know, it kind of sets all these little protons in your body spinning and dancing in a little, you know, kind of radiating energy.[00:40:03] Andrew Cote (@Andercot):And then you have a pickup coil, which is like an antenna, and the antenna is trying to pick up that energy and kinda reconstruct what's going on in your body. And this is how we can get, like, a really high detailed, high fidelity, three-dimensional image of what's going on inside someone without any invasive surgery. So it's, like, you know, MRIs are a real kind of amazing breakthrough in medical imaging. Superconductors if they could work without cryogenics would really simplify and make cheaper and more available, high resolution, high fidelity, three d images of people's bodies.[00:40:35] Andrew Cote (@Andercot):not just for making the magnetic fields, but also for picking up the signal emitted by the protons that get put into motion by the field in the first place. So it's kind of, like, one sort of off the shelf example. I think another one that's kind of under the radar, we don't think about it's not just in carrying electricity without resistance, which is useful for long range, like energy transmission, that kind of stuff. But if you look at the national grid, I mean, only 5, 7 percent of energy total, which is still significant, but it's, you know, single digit percentage ends up, you know, burning as weight You're suddenly muffled.[00:41:11] Alex Volkov (@altryne): I don't think yeah. You're suddenly a voice like your -- Oh, better.[00:41:18] Andrew Cote (@Andercot):Now it's better. Okay. Sorry about that. Yeah. So just gonna say so, you know, National Grid Scale Energy Production. Right? So trans transmitting the energy to its endpoint consumption, there's a bit of waste heat along the way. But what's what's also important to think about is how that energy is produced. It's produced also using high strength magnetic fields. And I was looking into this. There's a a experiment where these guys used sort of more modern high temperature superconducting tape to, you know, retrofit a large DC generator then it had, like, a 36 percent power improvement, right, which is pretty substantial. 
That's that's a that's a serious win.[00:41:58] Andrew Cote (@Andercot):Yeah. So there's there's, you know, sort of thousands of places this stuff could be used that would really just, like you know, it would either greatly improve the performance efficiency, reduce the cost, increase the accessibility of what we think of as, like, high technology like MRIs or particle accelerators. But it would also just decrease the cost of basic things like electricity generation and distribution And that's just the beginning. Right? So, you know, this kind of stuff there's a really good analogy here actually with the transistor, you know, for for years, scientists, then electrical engineers and physicists, they had this idea of a transistor. Right?[00:42:35] Andrew Cote (@Andercot):If only we could have some kind of simple, reliable, current model supplier. We could design all these wonderful things. We could design all these different kinds of logic functions and so forth. And so there was this search for the transistor people were searching for something that could do that, and they had anticipated all the places it could be used ahead of time. And it wasn't until at Bell labs, you know, a very kind of funny crossover here. One of the guys that's on the patent for the transistor is John Bardine. and John Bardeen's actually the only guy to win 2 Nobel Prizes. 1 was for the transistor. The other was for the theory of superconductivity, right, which is Barting Cooper Schiffer Theory, BCS.[00:43:14] Andrew Cote (@Andercot):So, again, it's one of it's one of those things where, you know, physicists, scientists, engineers kinda thought about this for a long time, realize this be amazing. And there's been this, you know, really complicated random walk through the configuration space of possible materials, right, which is so high dimensional. There's so many things you can construct. So I think it's I'm very optimistic about the field in general. I think one thing to think about with this particular result there's so much artisanal craft and and mastery that goes into producing these materials in a reliable, consistent way You know, science people don't often recognize. It's a lot of art involved too. Right?[00:43:52] Andrew Cote (@Andercot):Like like, things that are reduced to expert practice us and know how. And so I'd I'd just be cautious on, you know, jumping to conclusions either on this particular result, if it's if it's valid right now. But, also, if some labs can't fail to reproduce it, it doesn't actually rule it out entirely. I I think there's scientists that have traveled to Korea to work with the original authors. I look closely at that. You know, I'd also you know, I my internal odds are kind of like a 1 in 6 chance, this pans out, and it and it could be big.[00:44:21] Andrew Cote (@Andercot):But that doesn't mean that it's the end of the search or the end of the race, and I'm and I'm also optimistic that Getting people to understand what the massive long term and large scale social benefits of this kind of discovery could be could help direct a lot more basic science research towards this field. You know, I think we spend a lot of things on, like, how to make smartphone cameras better and not a lot of things on and not as much as we could spend on things like high temperature superconductors. 
And this is a final example.[00:44:48] Andrew Cote (@Andercot):I mean, so right now, you know, I work as a accelerator engineer, accelerator is a type of magnetic confinement fusion reactor The reason the company I work for can't exist, and and the reason there is this current burn and boom in nuclear fusion, is because we've engineered these high temperature superconductors to work in higher and higher magnetic fields, at at higher and higher temperatures. And and the big economic breakthrough there came when we can have these superconductors that can work at liquid nitrogen temperatures, right, which is 77 kelvin. And it's a lot cheaper to make liquid nitrogen and run that kind of cryogenics than it like liquid helium at, like, 4 Kelvin.[00:45:24] Andrew Cote (@Andercot):So, you know, we're already reaping some of the benefits of this sort of tech stack maturing over time. And I think really just getting started in terms of, like, the hunt for promising materials. I mean, I'm hoping this results in positive publicity and more effort, more energy, put into the field. I think if this doesn't pan out as the thing, you know, don't give up hope. Right? I mean, this is a long term game. Science sees by starts and stops. There's no fundamental physics here that's impossible. Right? There's no physical principle that says this can't work. Right? This isn't like a a momentumless or massless propulsion drive like the EM drive.[00:46:04] Andrew Cote (@Andercot):isn't, like, superluminal neutrinos. Right? Those things kind of break laws of physics. This is very much in the realm of, yeah, physically possible. seems seems very you know, in my mind, seems likely there could be something out there given the complexity of state space of electronic structures and given how you know, how large that space of exploration can be. And, yeah, so I think I'm just kind of you know, this is a great time to be interested in material science to appreciate basic science research and educating ourselves on on how good the future can be. You know, I think there's a lot of narratives right now in society and cultural in general. that kinda say, like, you know, you know, we we can't solve our way out of our biggest problems today. Right?[00:46:43] Andrew Cote (@Andercot):And and I'm very much on the other side of that debate. I think we can. I think it's through efforts like this. I think it's through people like Andrew at Varda that are willing to do stuff in their backyard or their garage or their fact or their their work workplace on their extra time. You know? I mean, this is the kind of this is the the let's build mentality. Right? And so I think we can build our way out of the world's greatest problems, and I its fundamental scientific advances like this discovery could be that that kind of paved the way out of there too. So, yeah, overall, very optimistic.[00:47:11] Andrew McCalip (@andrewmccalip):Andrew? That that's incredibly well said. That is an incredibly well balanced viewpoint. So how would you advise people to absorb the the next week of the new cycle? I mean, we're very much on a you know, we're we're back dead. We're back type of hype cycle. So how do you advise people to think about the results that they're seeing knowing that this is a a very difficult thing to replicate when it just because it a negative result is shown in a lab that doesn't mean it's not physically possible.[00:47:49] Andrew McCalip (@andrewmccalip):It's very difficult to prove the negative here. 
So tell us how we should absorb the new cycle coming up in the next few days.[00:47:59] Ate-a-Pi (@8teAPi):So I I I I I I might I might say something about that. I think I think this is basically tacit knowledge transfer, and you Kim Kim seems to have been this kind of, like, artisanal, like, you know, experiment list. So you need people to actually sit there in the lab with this guy, and he needs to demonstrate to them. And they need to pick up and and there might be things that he does, which he didn't write down. That that's the like, my my take on it given that He is the experiment list. He's the synthesis on on the team.[00:48:38] Ate-a-Pi (@8teAPi):Given that the team seems to have been only, like, 5 or 6 people, is that this guy is the maybe the only person in the world as of, like, you know, 18 months ago. I'm guessing that, you know, he managed to transfer some of that to the JungTux team. So I'm guessing that at at least one more one more team on on earth has this now. And I'm guessing that this knowledge transfer is now happening to a couple more people. So so you need to see this progress maybe 2 or 3 cycles for, like, a bunch of other people to have learned the skill, and then that's when that's when things get interesting.[00:49:14] Seo Sanghyeon (@sanxiyn):I mean, you don't really need to replicate to to verify this. There, the the team can just the team has the working samples. they can adjust the samples to the laps around the world.

Hey, the rest of the episode is for paid subscribers to ThursdAI. I encourage you to subscribe or upgrade your subscription to access it; there’s almost 2 more hours of in-depth conversation, stitching together of facts, with experts on materials science, physics and electrical engineering, and MIT folks chiming in. It’s really a great space; around 25K folks have listened to it on Twitter so far.
Jul 27, 2023 • 19min

🎙️ThursdAI - Jul 27: SDXL1.0, Superconductors? StackOverflowAI and Frontier Model Forum

⏰ Breaking news, ThursdAI is now on Apple Podcasts and in this RSS! So use your favorite pod-catcher to subscribe, or hit this button right here: Our friends at Zealous have provided an incredible platform for us to generate these awesome video podcasts from audio or from Twitter spaces, so if you prefer a more visual format, our deep thanks go to them! P.S - You can find the full 2 hour space with speakers on our Zealous page and on Twitter

Here’s a summary of the main things that happened in AI since last ThursdAI:

🧑‍🎨 Stability.ai releases SDXL 1.0
* Generates stunning 1024 x 1024px images
* High photorealism
* Supports hands and text
* Different (simpler?) prompting required
* Fine-tunes very well!
* Supports LoRAs, ControlNet, in-painting and outcropping, and the whole ecosystem built around SD
* The refiner is a separate piece that adds high-quality detail
* Available on DreamStudio, GitHub, ClipDrop and HuggingFace
* Also available with the incredible ComfyUI and can be used in a free Colab! (A minimal diffusers sketch is included at the end of these show notes.)

Image credit goes to Thibaud

Superconductors on Hugging Face? What?

Honestly, this has nothing immediate to do with AI updates, but, if it pans out, it’s so revolutionary that it will affect AI too! Here’s what we know about LK-99 so far:
* 2 papers released on arXiv (and Hugging Face, haha) in the span of several hours
* The first AND second papers both make the extraordinary claim of solving ambient superconductivity
* An ambient pressure, room temperature superconductive material called LK-99
* A straightforward process with a clear replication manual and fairly common materials
* The papers lack rigor, potentially due to being rushed out or due to a fight for credit for a Nobel prize
* The science is potentially sound, and is being “baked and reproduced in multiple labs” per Science magazine

Potential effects of room temperature superconductivity on AI:

While many fields (all?) can benefit from the incredible applications of superconductors (think 1000x batteries), the field of AI will benefit as well if the result above replicates.
* Production of GPUs and CPUs is power-constrained and could benefit
* GPUs/CPUs themselves are power-constrained while running inference
* GPT-4 is great but consumes more power (training and inference) than previous models, making it hard to scale
* Local inference is also power-restricted, so running local models (and local walking robots) could explode with superconductivity
* Quantum computing is going to have a field day if this is true
* So will fusion reactors (which need superconductors to keep the plasma in place)

As we wait for labs to reproduce, I created a Twitter list of folks who are following closely, feel free to follow along!

AI agents: protocol, discussion and the state of agents for July 2023
* Participated in an e2b space with tons of AI builders (Full space and recap coming soon!)
* Many touted AI agents as a category and discussed their own frameworks
* Folks came up and talked about their needs from the agent protocol proposed by e2b
* Agents need to be able to communicate with other agents/sub-agents
* Task payloads, artifacts and task completion can be async (think receiving a response email from a colleague)
* The ability to debug (with time travel), trace and reproduce an agent run
* Deployment, running and execution environment issues
* Reliable reporting of task completion, and evaluation, is hard

Frontier Model Forum
* OpenAI, Anthropic, Google, and Microsoft are forming the Frontier Model Forum to promote safe and responsible frontier AI.
* The Forum will advance AI safety research, identify best practices, share knowledge on risks, and support using AI for challenges like climate change.
* Membership is open to organizations developing frontier models that demonstrate safety commitment.
* The Forum will focus on best practices, AI safety research, and information sharing between companies and governments.
* Some have expressed concern that this could enable regulatory capture by the “Big LLM” shops that can use their lobbying power to stop innovation.

StackOverflow AI - “The reports of my death have been greatly exaggerated”

Stack Overflow has been in the news lately, after a graphic of its decline in traffic went viral. They have publicly disputed that information, claiming they moved to a different measurement method and didn’t update the webpage, but then also… announced OverflowAI!
* AI search and aggregation of answers + the ability to follow up in natural language
* Helps with drafting questions
* AI answers with a summary, and citations with the ability to “extend” and adjust for your coding level
* VSCode integration!
* Focusing on “validated and trusted” content
* Not only for SO code: Stack Overflow for Teams will also embed other sources (like your company Confluence) and will give you attributed answers and tagging abilities on external content

This has been an insane week in terms of news (👽 anyone?) and superconductors and AI releases! As always, I’m grateful for your attention! Forward this newsletter to 1 friend as a favor to me if you learned something new? Or alternatively, retweet us on Twitter for bigger reach! Thank you! See you next ThursdAI (and on Sunday when I release the State Of Agents recap 😅)

ThursdAI - Get in on this, and share w/ 1 friend 🫡

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
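A quick, hands-on footnote to the SDXL 1.0 section above: here is a minimal sketch of driving the base + refiner pair with Hugging Face diffusers. To be clear, this is my own illustration rather than code from the episode or from Stability.ai; the model IDs match the public 1.0 release on Hugging Face, while the step count and refiner strength are arbitrary values you should tune.

```python
# Minimal SDXL 1.0 example with Hugging Face diffusers (illustrative sketch).
# Assumes diffusers >= 0.19 and a CUDA GPU with enough VRAM for fp16 weights.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

# The base model generates the 1024x1024 image.
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

# The refiner is a separate checkpoint that adds high-frequency detail.
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

prompt = "a photo of a puffin on a volcanic rock, golden hour, 35mm film"

# First pass: base model. Second pass: the refiner polishes the decoded image;
# `strength` controls how much the refiner is allowed to change it.
image = base(prompt=prompt, num_inference_steps=40).images[0]
image = refiner(prompt=prompt, image=image, strength=0.3).images[0]
image.save("sdxl_sample.png")
```

The ComfyUI workflows and the free Colab mentioned above wire up essentially the same base-then-refine flow, just as a visual node graph instead of Python.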
Jul 23, 2023 • 37min

ThursdAI - Special Episode, interview with Nous Research and Enrico Shippole, fine-tuning LLaMa 2, extending its context and more

Hey there, welcome to this special edition of ThursdAI. This episode features an interview with Nous Research, a group of folks who fine-tune open source large language models to make them better. If you are interested in hearing how fine-tuning an open source model works, how datasets are prepared, how context is scaled, and more, tune in! You will hear from Karan, Teknium and LBJ from Nous Research, and from Enrico, who worked alongside them. To clarify, Enrico goes in depth into the method called Rope Scaling, a clever hack that extends the context length of LLaMa models significantly, and into his project LLongMa, an extended version of LLaMa with an 8,000-token context window. The first voice you will hear is Alex Volkov, the host of ThursdAI, who doesn’t usually have a lisp, but for some reason, during the recording, Twitter spaces decided to mute all the S sounds.

Links and acknowledgments:
* Nous Research - https://nousresearch.com/ (@nousresearch)
* Redmond Puffin 13b - First LLaMa Finetune
* LLongMa - LLaMa finetune with 8K context (by Enrico, emozilla and KaioKenDev)
* Nous-Hermes-Llama2-13b-GPTQ - Hermes Finetune was released after the recording 🎊

Psst, if you like this, why don’t you subscribe? Or if you are subscribed, consider a paid subscription to support #ThursdAI

Show transcription with timestamps:

Alex Volkov - targum.video (@altryne)[00:00:55] Yeah. That's awesome. So I guess with this, maybe, Karan, if you if you are able to, can you you talk about Nous research and how kind of how it started and what the what are you guys doing, and then we'll dive into the kind of, you know, Hermes and and Puffin and the methods and and all of it.karan (@karan4d)[00:01:16] Absolutely. Nous research. I mean, I I myself and many other of us are just, like, enthusiasts that we're fine tuning models like, you know, GPTJ or GPT 2. And, you know, we all are on Twitter. We're all on Discord, and kind of just found each other and had this same mentality of we wanna we wanna make these models. We wanna kinda take the power back from people like OpenAI and anthropic. We want stuff to be able to run easy for everyone. And a lot of like minds started to show up.karan (@karan4d)[00:01:50] I think that Technium's addition initially to Nous research, Jim, kinda showing up. And himself, I and human working on compiling the Hermes dataset was really what came to attract people when Hermes came out. I think we just have a really strong and robust, like, data curation thesis in terms of that. And I think that have just some of the most talented people who have come to join us and just volunteer and work with us on stuff. And I absolutely must say, I can see in the in the listeners is our compute provider, Redmond AI.karan (@karan4d)[00:02:30] And, you know, none of this none of these models would be possible without Redmond's generous sponsorship for us to be able to deliver these things lightning fast, you know, without making us through a bunch of hoops just a a total total pleasure to work with. So I would I have to shell and say, you know, I highly recommend everyone check out Redmond as because they really make our project possible.Alex Volkov - targum.video (@altryne)[00:02:52] Absolutely. So shout out to Redmond AI and folks give them a follow. They're the the only square avatar in the audience. Go take them out. And, Karan, thanks for that. I wanna just do a mic check for teknium. Teknium. Can you speak now? Can you? Can I hear you?Teknium (e/λ) (@Teknium1)[00:03:08] Yeah.
My phone died right when you were introducing me earlier.Alex Volkov - targum.video (@altryne)[00:03:10] Yep. What's up, Eric? -- sometimes on Twitter basis. Welcome, Technium. So briefly, going back to question. I don't know if you heard it. What besides the commercial and kind of the the contact window, what kind of caught your eye in the llama, at least the base until you guys started, or have you also, like, the other guys not had a second to play with the base model and dove into fine tuning directly?Teknium (e/λ) (@Teknium1)[00:03:35] Yeah. The only thing that really caught my eye was the chat model and how horribly RLHF it was.Alex Volkov - targum.video (@altryne)[00:03:41] Yeah. I've seen some conversations about and kind of the point of Ira, RLHF as well. And okay. So so now that we've introduced Neus research, sorry, I wanna talk to you guys about what you guys are cooking. Right? The we've seen, the the Hermes model before this was, like, loved it as one of the, you know, the best fine tunes that I've seen at least and the the the most performing ones. Could you guys talk about the process to get to the Hermes model, the previous one? and then give us things about what coming soon?karan (@karan4d)[00:04:16] Teknium, you got this one. man.Teknium (e/λ) (@Teknium1)[00:04:22] Yeah. It was basically I saw Alpaca, and I wanted to make it like, remake it with GPT 4, and then from there and just pretty much exclusively included anything that was GPT 4 only, and that was the beginning of the thesis for that. Going forward, though, We still have a lot of low quality data, I think, in Hermes data set that can be cleaned out, and then there's a lot of new data sets that have come out that I wanna start merging into there. also wanna move to something like chat ML or even Vikura format so that we can do some multi turn stuff. It's not very great, long chat.Alex Volkov - targum.video (@altryne)[00:05:03] Yeah.karan (@karan4d)[00:05:03] Within within within the Hermes dataset, you know, a lot of it is public available stuff that's particularly GPT 4. Of course, Technium's massive GP teacher dataset. We also have a bunch of GPT 4 data we had generate that we didn't release necessarily just yet, as well as an instruction set that's particularly focused on tasks like Python, transformers, linguistics, very small dataset of that. That's inside Hermes that, you know, we don't really talk about much, but figure that we'll put some exposure to right now on the spaces. And yeah.Alex Volkov - targum.video (@altryne)[00:05:42] That's awesome. And so the previous Hermes was released on top of LAMA 1, and for many folks, know, obviously, they couldn't use this for different commercial points. And now that this model relates, what the models that you guys release, are you thinking about the license of them? And could you talk about, like, the availability of folks using them in commercial standing now that, you know, the the back of it is commercially available.LDJ (@Dogesator)[00:06:07] Mhmm. I think we have puffin licensed us MIT I'll have to doublecheck on our own own model. I think that's right, Curran, right, or Tech?karan (@karan4d)[00:06:18] Yeah. I think so either that or Apache 2 point Like, if if the base model is commercially usable, you know, the stuff we put out is you're good to go. It's -- Yeah.LDJ (@Dogesator)[00:06:29] So And, like, in our announcements, I put in kind of, you know, one of the main things. It's it's commercially available. the first Nous as far as I think yeah. 
I'm pretty sure it's the first commercially available Nous model that's released, and a big data differentiator from Hermes is the fact that, like Tech was saying, Hermes is pretty much all single-turn data. And it surprisingly can do pretty decent at multi-turn conversations when you actually use it. But then Puffin is almost kind of, like, a 180, where the vast majority is really long-context multi-turn data.

LDJ (@Dogesator) [00:07:09] Oh — can you guys hear me? Okay, something's up with that. Okay. Yeah. So Puffin is a vast majority multi-turn data, GPT-4 specifically, and a lot of it is actually real human conversations with GPT-4 that go on for — some of them — 4K, 6K context, like, even all the way up to the max 8K context length of GPT-4. And then we took those few thousand conversations of real humans interacting with GPT-4. And then after that — I'm not sure if you've heard of it, but a lot of people have probably heard of Camel AI.

LDJ (@Dogesator) [00:07:46] So they have the physics, biology, chemistry, and mathematics datasets. And then within those, there's a bunch of subtopics that you can go through. And I just pretty much spent a good few days curating, just handpicking the right subtopics — like differential geometry, logic problems, optimization problems — a bunch of different GPT-4 examples and responses from those different subtopics. And then I specifically added those in certain ways to the Puffin dataset.

Alex Volkov - targum.video (@altryne) [00:08:17] Awesome. So just recapping for the audience, maybe. The Puffin model — I think the official name is Redmond Puffin 7B, or, sorry, 13B. Yes. This is the model that you guys fine-tuned, and one of the first, if maybe not the first, fine-tune of LLaMa v2 that's now publicly available, like you said, maybe with an MIT license, on Hugging Face, and I think you even added the GGML quantized version. Correct? Mhmm. So folks can go and download that and already start playing with this. So first of all, thank you for contributing to the open source. That's great to see. And the speed with which you guys fine-tuned this is also great to see.

Alex Volkov - targum.video (@altryne) [00:08:55] And maybe now that we've introduced this — maybe this is repeating a bit — could you speak about the difference? The difference in the dataset, in the tasks that you fine-tune for. Like, what is the actual difference between Hermes, or the Hermes that's coming out, and the Puffin model? What would people use them for differently? That's the question.

Teknium (e/λ) (@Teknium1) [00:09:21] The Puffin model will definitely be better at multi-turn stuff. That's for sure. Yeah.

nisten (@nisten) [00:09:28] So if you want to do anything like OpenAI — I'll paste the link above to the GGML version of it, because I'm gonna test it thoroughly — but I really think, because you guys have used GPT-4, high quality, multi-turn conversations, this can have actual, like, practical use for whoever wants to use it, either as, like, something that tells you about the documentation on the site or walks a user through. In other words, this should be better than Hermes for, like, customer service stuff, which is just one example.

nisten (@nisten) [00:10:08] Anyway, yeah, I'm gonna try.
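For readers who want to try the kind of subtopic hand-picking LDJ describes above, here is a minimal sketch. The CAMEL subject datasets are public on Hugging Face, but the repo id, the column names ("sub_topic", "message_1", "message_2") and the subtopic strings used here are my assumptions about the schema rather than anything confirmed in the conversation — check the dataset cards before relying on them.

```python
from datasets import load_dataset

# Hypothetical subtopic names — the real ones live in the dataset card.
WANTED_SUBTOPICS = {"differential geometry", "logic", "optimization"}

def pick_subtopics(repo_id: str) -> list[dict]:
    """Keep only rows whose sub_topic matches one of the hand-picked subjects.

    Column names ("sub_topic", "message_1", "message_2") are assumed, not verified.
    """
    ds = load_dataset(repo_id, split="train")
    picked = []
    for row in ds:
        sub_topic = str(row.get("sub_topic", "")).lower()
        if any(w in sub_topic for w in WANTED_SUBTOPICS):
            picked.append({"prompt": row["message_1"], "response": row["message_2"]})
    return picked

# e.g. the math subject set; physics / chemistry / biology would work the same way
math_subset = pick_subtopics("camel-ai/math")
```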
I'll paste the link above.

karan (@karan4d) [00:10:14] It's likely better for production use alongside, like, stuff that you have with a retrieval pipeline, like with LangChain, etcetera. Like, I would believe that without even testing it, you know, or just talking, of course. But, you know, even though with this LIMA technique of small examples we can get, like, a really good model that does really well.

karan (@karan4d) [00:10:41] The thing about the Hermes dataset, and just its size and the various types of data and topics that are in there — I think you get a totally different, like, role play or storytelling experience or completion experience with Hermes. Personally, I feel that way.

Alex Volkov - targum.video (@altryne) [00:11:01] Awesome.

Teknium (e/λ) (@Teknium1) [00:11:01] And on that: another thing about the Puffin dataset is that it does go up to, like, 8K, and Enrico here has been doing a ton of work on extending LLaMa's context.

Alex Volkov - targum.video (@altryne) [00:11:13] Right. So I wanna give an introduction and then introduce Enrico and talk about this real quick. Right? LLaMa version 1 was released with, again, 2,000 tokens in the context window. And then many folks, including KaioKenDev and Emozilla, right, and some other folks, I think, were involved in bringing some of the quote-unquote tricks about what eventually ended up being named RoPE scaling, if I'm not mistaken. And we followed this, and we've talked about it on a previous ThursdAI. And LLaMa v2 was released with 4,000 tokens in the context window.

Alex Volkov - targum.video (@altryne) [00:11:52] And, you know, we're now so used to kind of Claude and the 16K GPT-3.5 that 4K didn't seem like a lot. And then many folks were wondering — and, meanwhile, Enrico was working on — whether or not the RoPE scaling method would apply to the next LLaMa, and it looks like it did. So I wanna introduce Enrico, Enrico Shippole — I hope I'm saying this right. Welcome to the stage. Hopefully you can unmute and this space works for you. And the second fine-tune that I saw released was also backed by Nous, by Nous Research, and this was the extended context version, what's called LLongMa.

Alex Volkov - targum.video (@altryne) [00:12:28] So Enrico, welcome up to the stage, and feel free to introduce yourself, your affiliation with Nous, and LLongMa with the context window.

Enrico Shippole (@EnricoShippole) [00:12:38] Hello. So I'm actually an independent researcher. I'm sponsored by Stability AI, EleutherAI, and a few other different organizations, including Nous now. I work with different people like Tanishq from MedARC, Aaron Komatsuzaki, who also is from EleutherAI and Duck AI, and John Nay from Nomos AI. So I have a lot of affiliations with a bunch of different organizations, including Together — we're starting a project right now with them.

Alex Volkov - targum.video (@altryne) [00:13:13] That's so great to hear, and welcome to ThursdAI. Can you talk to us a little bit about the RoPE scaling method, how you were able to, like, fine-tune this quickly, and how the results look so far? I wasn't able to run this myself, but hopefully, yeah, talk to us about it.

Enrico Shippole (@EnricoShippole) [00:13:34] Okay.
So initially — the thing is, I actually was hoping that Emozilla, Bowen, and KaioKenDev would all have been able to make it, because it was kind of an equal parts effort on, like, all fronts from each of us. Initially, I had trained some PaLM models at 8,000 context length about 4 months ago based on the xPos paper, which did rotary embedding scaling initially — they were one of the first people to do it. They based their methodology off of Ofir Press's ALiBi.

Enrico Shippole (@EnricoShippole) [00:14:11] I would imagine that most people are pretty familiar with Ofir Press and his work on the ALiBi positional bias that's been used in a wide range of models now. So Emozilla and I came into contact based off of the work that he had seen me doing with the PaLM models, scaling those to 8,000 context length — pretraining, not fine-tuning. So what we had initially done is basically take a section of C4 and different datasets that had examples that were all over 8,000 context length, and pretrain on them packed together,

Enrico Shippole (@EnricoShippole) [00:14:50] with a beginning-of-string and end-of-string token to help with, like, the attention masking portion of that. After he had seen that, Emozilla actually came into contact with KaioKenDev — I believe that is how you pronounce it. KaioKenDev had also been following Ofir Press's research. He had started working on his own version of scaling the rotary embeddings, I believe based off of both ALiBi and xPos.

Enrico Shippole (@EnricoShippole) [00:15:22] And what he found is that by scaling the max positional embeddings and the rotary embedding from something like 2048, which you would initially train with — he scaled it up to 8000, or 8192. And he found that by applying, like, an interpolation to the encoding — by scaling basically the positional index in the rotary embedding — you were able to essentially turn down the frequency window in RoPE by, like, a factor of 0.25.

Enrico Shippole (@EnricoShippole) [00:16:01] The scaling depends on the length that you're trying to extrapolate to and the initial context length that the model was trained with. So if you were training with LLaMa 2, which had a context window of 4096, and you wanted to do the linear interpolation positional scaling to something like 8192, then you would use a scaling factor of 0.5. If you were trying to do it from 2048, which is what the original LLaMa was trained with, and you wanted to scale it to 8192, then you would use a scaling factor of 0.25.

Enrico Shippole (@EnricoShippole) [00:16:39] So basically, after we had done all of this, Meta released a paper around the same time that KaioKenDev had released his blog post. They both found very similar findings. They had shown in the Meta paper that you only had to fine-tune for 1,000 steps with the linear positional interpolation scaling to be able to get the benefit of doing a full pretrain at a context window of 8192.

Enrico Shippole (@EnricoShippole) [00:17:13] So this is actually a big step, because it shows that you no longer need to pretrain right off the bat at a longer context length. You're able to do the fine-tuning on essentially a lower-resource computational budget and still be able to get the, like, greater results of the longer context window.
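To make the scaling factors Enrico just walked through concrete, here is a minimal sketch of rotary embeddings with linear positional interpolation — the "two lines of code" idea. The function name and signature are mine, not the LLongMa code; the only point it shows is that positions get multiplied by original_context / target_context (0.5 for 4096 → 8192, 0.25 for 2048 → 8192) before the RoPE frequencies are computed.

```python
import torch

def rope_cache_with_interpolation(seq_len: int, head_dim: int,
                                  original_ctx: int, target_ctx: int,
                                  base: float = 10000.0):
    """Rotary embedding cos/sin cache using linear positional interpolation.

    Positions are compressed by original_ctx / target_ctx, so a sequence of
    target_ctx tokens is squeezed back into the positional range the model
    saw during pretraining. Illustrative sketch, not the LLongMa code.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    scale = original_ctx / target_ctx                    # e.g. 4096 / 8192 = 0.5
    positions = torch.arange(seq_len).float() * scale    # the interpolation step
    freqs = torch.outer(positions, inv_freq)
    return freqs.cos(), freqs.sin()

# LLaMa 2 style extension: 4096 -> 8192 with scaling factor 0.5
cos, sin = rope_cache_with_interpolation(seq_len=8192, head_dim=128,
                                         original_ctx=4096, target_ctx=8192)
```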
I know a lot of the major AI companies — just from my work and personal research with many of them — had been doing staged scaling of the context window during training.

Enrico Shippole (@EnricoShippole) [00:17:46] So basically, when pretraining, they would separate the initial examples from a dataset into multiple stages.

Enrico Shippole (@EnricoShippole) [00:17:54] So anything that is under the window of 2048 you'd separate from the initial dataset, then you take things between 2048 and 4096, then 4096 and 8192, and you would basically chunk the datasets into those different parts. You'd first initially train on the 2048 chunk of the data, then you would train on the data between 2048 and 4096, and then you would do the same thing from 4096 to 8192, or if you want to scale that, to 16K or 32K context length. But what we have shown now, with both the Meta paper and this work, is that you don't even need to go through that extensive pretraining and staged process. You can just go from a context length of 2048 to 8192,

Enrico Shippole (@EnricoShippole) [00:18:47] and scale the rotary embeddings by whatever factor you want to use. So like I was saying, if you're going from 2048 to 8192, you'd be using a scaling factor of 0.25. It only needs two lines of code to be able to do that. In the LLongMa post, I had provided an example of scaling the rotary embeddings. The code was written by Emozilla, or Jeff.

Enrico Shippole (@EnricoShippole) [00:19:15] After all these experiments, we then came into contact with Bowen, who had worked a lot on the dynamic NTK scaling with Emozilla, and he had also done NTK-by-parts, which we're currently training a lot of models on. So we have the LLongMa 1 models, trained on the OpenLLaMA series — like, the suite of those models that use the linear interpolation scaling.

Enrico Shippole (@EnricoShippole) [00:19:45] We now have the LLaMa 2 models, or the LLongMa 2 suite, which is what we're calling it, again trained on the linear interpolation scaling. And then we have another suite of models coming out very soon that uses the NTK-by-parts dynamic scaling. That was really specialized by Bowen, so I do not wanna speak on his behalf. It'd probably be good to get him to talk about it in another one of these.

Alex Volkov - targum.video (@altryne) [00:20:14] Absolutely. So let's get in touch after this and set it up. Thank you for the very in-depth explanation, because we did cover kind of the RoPE scaling and how KaioKenDev started this in his blog post, and how it iterated from there. So it's great to actually hear from the folks who are doing this. Just for the audience, I've attached Enrico's tweet about LLongMa 2, which is currently trained at 8K context length.

Alex Volkov - targum.video (@altryne) [00:20:47] And Enrico, you told us that we may see even double that. So could you talk about the next version?

Enrico Shippole (@EnricoShippole) [00:20:56] Okay. So the initial training process of doing this, up to a context length of 8192, can be done, basically, with DeepSpeed ZeRO-2 and activation checkpointing, and you're able to fit the model on an A100 80-gigabyte node. Now, we are working on the process of scaling it both to 16K and 32K.
This requires a different methodology during training: you either need to use DeepSpeed ZeRO-3 or fully sharded data parallelism.

Enrico Shippole (@EnricoShippole) [00:21:35] Both of those are very similar, for people who aren't aware. Basically, you're just sharding the optimizer states and the model states across, like, different nodes. You can also use things like tensor parallelism to help with the scaling as well. And then we're going to be basically just adjusting the scaling factor again. We've already collected a large quantity of data at 16K context length, and we're going to be doing the fine-tuning to 16K and releasing those models soon. All of this compute is sponsored by Stability AI.

Enrico Shippole (@EnricoShippole) [00:22:12] They've been very generous with helping with a lot of the independent research.

Alex Volkov - targum.video (@altryne) [00:22:17] So I wanna shout out Stability AI for not only giving, you know, the world Stable Diffusion, but also participating in this kind of next wave of AI. Many folks kinda coined the Stability AI moment when they released Stable Diffusion — I wanna say 1.4 — back then, almost a year ago now, and many folks are saying the same about the LLaMa 2 release, now that it's commercially open source and folks can start, like, doing things — you know, for-profit companies can join in. So we definitely wanna shout out Stability for the effort here. And, Enrico, thank you. And, folks, please follow Enrico, and we'll stay tuned.

Alex Volkov - targum.video (@altryne) [00:22:56] I wanna ask Karan and Teknium, and other folks from Nous, about the efforts that Enrico was talking about — the longer context windows. How would they kinda interplay with the stuff that you're working on with Hermes, with Puffin? Are the efforts interchangeable? Are we gonna see them building on top of each other?

karan (@karan4d) [00:23:16] So I think LDJ can definitely speak to this, but I'd like to happily say that once we did LLongMa 1 on the first LLaMa generation of models, we already had Puffin 2K, 4K, and 8K for that -- Yeah. -- already prepared and ready. So as the LLongMa models for 13B are released, we will also be doing equivalent Puffin fine-tunes, and potentially Hermes fine-tunes. We can talk a little bit more about the future of Hermes a little bit later, though.

LDJ (@Dogesator) [00:23:51] Yeah. I mean, I was pretty much going to say the same thing, but kind of elaborate on that — about how, before, with LLongMa v1 and everything, and during the development of LLongMa, there was actually, like, you know, of course, me, Enrico — who's usually just called Conceptofmind — and Emozilla. Like, we've all kinda been rubbing shoulders a lot together and just kinda working closely, you know, in the same Discord and whatnot. And it's like, hey, you know, we're working on this, like, experimental LLongMa thing — hey, you wanna try, like, fine-tuning? And then the plan just kind of ended up being, like, okay, we're just gonna have this Puffin thing.

LDJ (@Dogesator) [00:24:31] The Puffin dataset already contains a ton of high-context conversational data from GPT-4 and, like, high quality human data. So it's, like, the perfect fit to have something that's high-context capable be fine-tuned on that. And then LLaMa 2 came out, and it's like, oh yeah, let's get this out ASAP, and then we'll figure out what we're gonna do later.

Alex Volkov - targum.video (@altryne) [00:24:58] Yeah. Great.
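For readers wondering what the ZeRO-3 / FSDP setup Enrico describes looks like in practice, here is a rough sketch using the Hugging Face Trainer with a DeepSpeed ZeRO-3 config passed as a dict. Every specific value (batch size, run name, the "auto" placeholders) is illustrative — these are not the settings the LLongMa team used — and FSDP (`fsdp="full_shard"`) would be an equally valid route.

```python
from transformers import TrainingArguments

# Minimal ZeRO-3 config: shard optimizer states, gradients and parameters
# across GPUs. Values are placeholders, not the LLongMa settings.
zero3_config = {
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

args = TrainingArguments(
    output_dir="llongma-16k-sketch",      # hypothetical run name
    per_device_train_batch_size=1,
    gradient_checkpointing=True,          # the activation checkpointing mentioned above
    bf16=True,
    deepspeed=zero3_config,               # accepts a dict or a path to a JSON file
)
```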
And it's just great to see, you know, how many opportunities like this there are with open source, where the stuff that we're able to run now and iterate on is building on top of each other. It's just incredible, and this is maybe a watershed moment. And I wanna thank all of you for being here. I wanna let the other folks who are usually here on ThursdAI ask a question or two of our Nous visitors. Yam and Nisten, if you have a question for Nous or for Enrico, go ahead. I will start with Yam.

Alex Volkov - targum.video (@altryne) [00:25:29] I know you — if you ask the super deep technical stuff, it will, like, fly over the audience's heads, so maybe save that for the DMs with LDJ and Enrico. But yeah, of course, the stuff that we haven't covered and that's interesting to Nous, feel free — as it pertains to LLaMa 2, it's gonna be very interesting, I think, for everyone.

nisten (@nisten) [00:25:47] Just to quickly clarify, you guys fine-tuned the plain model, right? Not the chat one.

Teknium (e/λ) (@Teknium1) [00:25:55] Yep. The base model. We didn't fine-tune the chat one at all.

Alex Volkov - targum.video (@altryne) [00:26:00] Actually, to maybe continue this — sorry for interrupting, just one sec. To continue this question: there are the models that were released by Meta, where you have to, like, register and get the email and everything, and then they put some stuff on Hugging Face, and those models were delineated with, like, dash HF. Have you guys used the Hugging Face ones or the Meta ones, and do you guys know the difference? I heard from somebody that, like, maybe one doesn't work as well.

Teknium (e/λ) (@Teknium1) [00:26:30] The ones on Hugging Face are in FP16 and the original LLaMa 2 models are in BF16, but we tested the difference between the two models at Carper, and there's such a negligible difference in their quality that it's irrelevant. We trained on the Hugging Face FP16 ones.

Alex Volkov - targum.video (@altryne) [00:26:52] Sorry, Karan, for interrupting. Go ahead.

karan (@karan4d) [00:26:56] No. All good.

Alex Volkov - targum.video (@altryne) [00:26:58] I totally forgot what I interrupted with — that's not it. Okay. Nisten, if you have a follow-up question for Karan, feel free, and if not, then Yam, if you have anything that you wanna ask the fine folks from Nous, feel free as well.

Yam Peleg (@Yampeleg) [00:27:17] Yeah. Sure. First, thank you for what you're doing, guys. You're really making a difference for everyone. There aren't many demos online, so anyone that didn't try Hermes, I highly encourage you to try it. I don't know why there aren't any — okay, I know why, demos cost money — but just try it. Okay? And now I've got a question, because from my experience, if you train on the open datasets of Hermes, you get a significantly lower quality model. Now, I'm fine if you don't release the datasets, don't get me wrong.

Yam Peleg (@Yampeleg) [00:27:54] I just wanted to ask: is there anything else besides the data that is different? What tips can you give for, I don't know, someone else that wants to train a high quality model, besides having high quality data?

Teknium (e/λ) (@Teknium1) [00:28:08] Everyone underestimates this. Yeah, the hyperparameters can make a key difference. LDJ knows that very well, because we had to do a ton of different tests to find the final ones for the Puffin model.
But I'm not sure if those are on the model card for Hermes. If they're not, I can put them up. And Karan can probably talk about the Nous datasets that weren't made public.

karan (@karan4d) [00:28:38] Yeah. We've got, like, maybe around 50K items of data, like, versus, like, 300K total instructions there that are not released. And to be frank with you, about 45K of them is just more GPT-4, like, Alpaca-style instructions. The 5,000 or so — the, like, 4,500 of them — compose this dataset we've been working on that, you know, at this point, I'm pretty comfortable talking about. We call it the pdactyl dataset.

karan (@karan4d) [00:29:14] I won't speak on everything that's in it, but essentially — and I don't know if this is the thing that made the big difference — it's, like, the one place where I guess we deviate from just using the open datasets plus more GPT-4 instructions. It's got some transformers instructions, some linguistics instructions, some calculus 1 instructions, etcetera. It seems to be pretty good.

Teknium (e/λ) (@Teknium1) [00:29:41] Also, Yam, do you have links or anything to the models that were trained with just the makeup of the datasets that were public from Hermes? Because I haven't actually seen that before.

Yam Peleg (@Yampeleg) [00:29:57] Again, can you repeat that?

Teknium (e/λ) (@Teknium1) [00:29:58] You didn't hear? Do you have any links to the models that trained with just the open datasets from Hermes that you could share with me later?

Yam Peleg (@Yampeleg) [00:30:06] No, no. It's just from my experiments -- Oh, okay. -- on training. Pretty much following the same idea of: let's take only GPT-4 from all the open datasets. And the model that you get is different, for sure. And it might be the hyperparameters, you know.

Teknium (e/λ) (@Teknium1) [00:30:25] Another thing that we did too is pretty extensive, like, cleaning. We did do deduplication. We removed things like URLs — like, any response that had a URL in it we removed, in case it was gonna be, like, hallucinated URLs. And a set of, like, maybe 8 different filtering processes too, that might have made our data quality higher.

LDJ (@Dogesator) [00:30:48] So, "as an AI language model"?

nisten (@nisten) [00:30:51] For anybody -- what do you say -- for anybody in the audience: hyperparameters are just like the settings on the oven. So it looks here like the ingredients were all okay, but Yam messed something up -- Yeah. -- and the model came out half-baked.

LDJ (@Dogesator) [00:31:08] So we're gonna have to check that out.

LDJ (@Dogesator) [00:31:10] I'm a big proponent personally of hyperparameter optimization being underrated right now, like, in -- Yeah. -- the current space. And that's something I've kind of focused on a lot, specifically for things like Puffin, and just trying to help others around Nous with stuff like trying to optimize what they're doing. And even just something like what you just said about the settings for the oven — I mean, double the amount of time you're putting something in the oven, and it's not gonna come out twice as good. It's not even gonna come out 10% as good. It's gonna come out worse, you know?

LDJ (@Dogesator) [00:31:45] And although it depends, like, what is your baseline for how much time you're putting it in the oven, and all these different variables that kind of are dependent on each other and affect each other. So it's definitely something you kind of have to build an intuition about to some degree.
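The cleaning passes Teknium mentions — deduplication, dropping responses that contain URLs, and stripping "as an AI language model" boilerplate — are easy to sketch. The snippet below is an illustrative reconstruction of that kind of pipeline, not the actual Nous filtering code; the field names ("instruction", "response") are placeholders.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def clean_examples(examples: list[dict]) -> list[dict]:
    """Rough sketch of the filtering described above: exact dedup, drop responses
    with URLs (to avoid training on hallucinated links), drop refusal boilerplate.
    The real pipeline reportedly had ~8 passes; this shows only three of them."""
    seen = set()
    kept = []
    for ex in examples:
        key = (ex["instruction"].strip(), ex["response"].strip())
        if key in seen:
            continue                                     # exact duplicate
        if URL_RE.search(ex["response"]):
            continue                                     # response contains a URL
        if "as an ai language model" in ex["response"].lower():
            continue                                     # refusal / boilerplate style
        seen.add(key)
        kept.append(ex)
    return kept

# Usage: cleaned = clean_examples(raw_rows), where raw_rows is a list of
# {"instruction": ..., "response": ...} dicts.
```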
And then the other end is, really, I feel like there has to be more investment and more time and energy invested into actual tools that make hyperparameter optimization easier for people that are doing these things.

Yam Peleg (@Yampeleg) [00:32:13] Yeah, yeah. And the thing is that the models are really big, so it's really expensive to run them. So you have a trade-off of how much compute you're investing in searching hyperparameters rather than actually using it for training. But I completely agree. So, one last question, actually, too.

Teknium (e/λ) (@Teknium1) [00:32:33] Actually, one thing before we go on. Something great about the Puffin dataset is that it's just, like, 3,000 or so examples, I believe. And so it makes tuning a lot less expensive, because you can finish the whole training in just a couple of hours. So, like, with Hermes, if we wanted to try full ablations, and dozens of them, it would take weeks to do.

LDJ (@Dogesator) [00:32:55] Yeah. Well, to be fair, it's not like it only takes a couple hours on one GPU. We use A100 80-gigabyte GPUs. So yeah.

Teknium (e/λ) (@Teknium1) [00:33:04] Courtesy of Redmond.

Alex Volkov - targum.video (@altryne) [00:33:05] Thank you, Redmond.

Enrico Shippole (@EnricoShippole) [00:33:08] Mhmm. I should also probably clarify that when doing the context length extrapolation, we're doing it on a billion tokens and 64 80-gigabyte A100s.

Yam Peleg (@Yampeleg) [00:33:20] Oof. Mhmm.

Alex Volkov - targum.video (@altryne) [00:33:23] Yeah, Yam is getting overexcited. Alright, folks, I wanna maybe ask one last thing and then we'll move on to the regular ThursdAI update cadence. But I will say that, like, folks from Nous Research and Enrico and some others here — thank you so much for coming up and giving us kind of the insight into how this actually happens. LLaMa 2 just released, you know, a few days ago, and you guys are already pumping out, like, open source fine-tuned models, and it's great to see. And just so you know, there's always a stage for you here to come in and announce things.

Alex Volkov - targum.video (@altryne) [00:33:53] And if you do wanna announce, like, a release or something, maybe just, you know, right now — Karan and Teknium and some folks, I would love to hear, like, when the next Hermes is coming?

karan (@karan4d) [00:34:06] Before we say that, I just would like to clarify something about Hermes. So we have the original Hermes dataset on LLaMa 2 as something that we will release, but also a sequel to the Hermes dataset, Hermes 2. There will be a distinction between these two, and you'll see the former come out first and the latter come out after. But as for release, etcetera, I will absolutely let Teknium take the stage with those final words.

Teknium (e/λ) (@Teknium1) [00:34:36] So the training is nearly done. At least, it was about 2.8 epochs out of 3 a few hours ago, so it might be done already. Before I release it, though — unlike Puffin: we wanted Puffin out, like, the same day that LLaMa 2 came out, so we didn't run any benchmarks. And we had to put all the compute we had on Hermes immediately after we were done with that.
So we don't have any compute to do any benchmarks on Puffin until Hermes is done.

Teknium (e/λ) (@Teknium1) [00:35:06] But before I release Hermes, I do wanna do, like, a full range of benchmarks and stuff like that to make sure everything's good, and have a pretty detailed model card. But that should probably only take the rest of tonight at the most, so probably tomorrow morning would be when Hermes comes out.

Alex Volkov - targum.video (@altryne) [00:35:22] That's soon, folks. And you heard it here first — definitely follow Teknium, Karan, Enrico, LDJ, and the rest of the Nous Research folks, and stay tuned. Enrico, go ahead.

Enrico Shippole (@EnricoShippole) [00:35:34] Yes, I just wanted to piggyback off of Teknium's comment a little bit. So we did do pretty extensive evaluation of the LLaMa 2 8K models. We had run different things on perplexity using GovReport and a couple of different other datasets to make sure that the length extrapolation in the context was working properly. We did passkey retrieval. We also did a lot of extensive human evaluation, which took a little bit. I had wanted to get the LLaMa 2 8K models out yesterday, but we decided to push it back one day.

Enrico Shippole (@EnricoShippole) [00:36:08] And what we were doing is feeding in research papers and seeing if it could pull out even, like, relevant pieces of information from across the context length. And so far, it has been quite successful. So we're still running more evals, but the ones so far have shown that there's been, like, no performance degradation, no matter what context length you're basically using with these extended models.

Alex Volkov - targum.video (@altryne) [00:36:32] That sounds great. And now that, you know, LLongMa 2 is out and the next versions are gonna come out as well, I'm sure that some other folks will also contribute to this research and tell you, like, from their own experiences and vibes. So, yeah, I wanna thank folks again — this has been very illuminating, and we're very glad to have you. And, obviously, the stage is yours whenever you want to come here; we appreciate you. And you guys are welcome to stay tuned and kinda chime in on the rest of the updates. And with that, I think, for folks in the audience, we're moving to the next thing.

ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
