Perplexity AI recently passed one million installs across its iOS and Android mobile apps. The success is attributed to brand marketing and a focus on a fast, reliable app experience. The app has improved steadily based on user feedback and has become a preferred tool for many.
pplx-online is an LLM API without a knowledge cutoff: the models can draw on live web information rather than being limited to a fixed training corpus, which makes them attractive for developers who need up-to-date answers. The API is fast and returns accurate, grounded results. Two versions are available: 7B for faster responses and 70B for higher-quality outputs. Perplexity also plans to build on this with mixture-of-experts (MoE) models.
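For concreteness, here is a minimal sketch of calling pplx-online. The endpoint URL, model names, and payload shape assume Perplexity's OpenAI-compatible chat-completions interface and should be checked against the current API docs.

```python
# Minimal sketch of a pplx-online request; endpoint and model names are
# assumptions based on Perplexity's OpenAI-compatible API at the time.
import os
import requests

API_URL = "https://api.perplexity.ai/chat/completions"  # assumed endpoint

payload = {
    "model": "pplx-70b-online",  # or "pplx-7b-online" for lower latency
    "messages": [
        {"role": "system", "content": "Answer concisely, citing recent web information."},
        {"role": "user", "content": "What did Perplexity announce this week?"},
    ],
}

headers = {
    "Authorization": f"Bearer {os.environ['PPLX_API_KEY']}",
    "Content-Type": "application/json",
}

response = requests.post(API_URL, json=payload, headers=headers, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the interface mirrors OpenAI's chat completions, existing OpenAI client code can usually be pointed at the Perplexity base URL with only the model name changed.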
Cerebras Systems, known for its massive wafer-scale processors, has been making strides in language models. It has released open-source GPT models such as Cerebras-GPT, which demonstrate compute-optimal scaling, as well as the BTLM language model. Cerebras has also been collaborating with Group 42 (G42) in the UAE to train large Arabic language models, the latest releases being the Jais 13-billion and 30-billion-parameter models. These models have shown impressive quality and performance, competitive with other state-of-the-art models in both Arabic and English. Cerebras' focus on low-resource and multilingual language models is gaining attention and contributing to the wider AI community.
Cerebras Systems stands out for its hardware design. Rather than dicing a wafer into individual chips, Cerebras uses the entire wafer as a single processor, which it argues yields a more cost-effective and efficient system. The approach delivers high performance while remaining as easy to program as GPU systems, and native support for unstructured sparsity enables efficient sparse matrix multiplications across a range of workloads.
Cerebras is actively exploring several research directions. Because the hardware supports unstructured sparsity, it can skip work for individual zero weights rather than requiring structured sparsity patterns, which makes loading and processing large-scale models more efficient. The team is also focusing on weight streaming for language models and other workloads, and is pushing toward larger clusters and scale-out for more demanding applications.
Sparsity techniques, such as weight sparsity and sparsity patterns that change over the course of training, are gaining popularity, building on research such as the lottery ticket hypothesis. Sparse models offer performance improvements while retaining strong representational capacity.
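As a concrete illustration of weight sparsity, here is a minimal sketch of magnitude-based pruning in PyTorch; the model, sparsity level, and mask handling are illustrative rather than Cerebras' actual implementation.

```python
import torch
import torch.nn as nn


def magnitude_prune(model: nn.Module, sparsity: float = 0.9) -> dict:
    """Zero out the smallest-magnitude weights, returning {param_name: 0/1 mask}."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:  # skip biases and norm parameters
            continue
        keep = max(1, int(param.numel() * (1.0 - sparsity)))
        # threshold = magnitude of the smallest weight we keep
        threshold = param.abs().flatten().kthvalue(param.numel() - keep + 1).values
        mask = (param.abs() >= threshold).float()
        with torch.no_grad():
            param.mul_(mask)  # zero pruned weights in place
        masks[name] = mask
    return masks


model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
masks = magnitude_prune(model, sparsity=0.9)
print({name: 1.0 - m.mean().item() for name, m in masks.items()})  # achieved sparsity per layer
```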
A technique for efficiently approximating per-example gradient norms is being used to measure training dynamics. Estimating the norms statistically keeps the computation cheap and avoids needing very large batches just to get stable measurements, and the resulting statistics help with choosing appropriate batch sizes and learning rates.
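The quantity being estimated is the per-example gradient norm. The sketch below computes it exactly with PyTorch's torch.func utilities, just to make the definition concrete; the efficient approximation discussed in the episode avoids this full per-sample computation.

```python
# Exact per-example gradient norms via the standard vmap(grad(...)) recipe.
# Model, loss, and data are illustrative placeholders.
import torch
import torch.nn as nn
from torch.func import functional_call, grad, vmap

model = nn.Linear(128, 10)
loss_fn = nn.CrossEntropyLoss()
params = {name: p.detach() for name, p in model.named_parameters()}


def per_sample_loss(params, x, y):
    logits = functional_call(model, params, (x.unsqueeze(0),))
    return loss_fn(logits, y.unsqueeze(0))


# Map the per-sample gradient function over the batch dimension.
per_sample_grads = vmap(grad(per_sample_loss), in_dims=(None, 0, 0))

x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))
grads = per_sample_grads(params, x, y)  # dict of [batch, *param_shape] tensors

# L2 norm of each example's full gradient, concatenated across parameters.
norms = torch.sqrt(sum(g.flatten(1).pow(2).sum(dim=1) for g in grads.values()))
print(norms.shape)  # torch.Size([32])
```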
Sparse models can also be produced by pruning after dense pre-training and then retraining to recover the model's quality. The aim of these techniques is to find sparse models that are cheaper to deploy and make serving large language models more efficient.
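A minimal sketch of that recipe: take a fixed sparsity mask (for example, the one produced by magnitude_prune above) and re-apply it after every optimizer step so pruned weights stay at zero during retraining. The optimizer, schedule, and data loading are placeholders.

```python
import torch


def retrain_sparse(model, masks, dataloader, loss_fn, lr=1e-4, epochs=1):
    """Fine-tune a pruned model while keeping its sparsity pattern fixed."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in dataloader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            # Re-apply the masks so weights pruned earlier remain exactly zero.
            with torch.no_grad():
                for name, param in model.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])
    return model
```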
Lightning Studios is a new development environment that lets AI engineers build AI at scale in a user-friendly way. It provides a cloud-based workspace where developers can create, test, and deploy models without complex setup or infrastructure management.
We are running an end-of-year listener survey! Please let us know any feedback you have, what episodes resonated with you, and guest requests for 2024! Survey link here.
We can’t think of a more Latent-Space-y way to end 2023 than with a mega episode featuring many old and new friends recapping their biggest news, achievements, and themes and memes of the year!
We previously covered the Best Papers of NeurIPS 2023, but the other side of NeurIPS as an industry-friendly conference is all the startups that show up to hire and to promote their latest and greatest products and papers! As a startup-friendly podcast, we of course were ready with our mics to talk to everyone we could track down.
In lieu of an extended preamble, we encourage you to listen and click through all the interviews and show notes, all of which have been curated to match the references mentioned in the episode.
Timestamps & Show Notes
* [00:01:26] Jonathan Frankle - Chief Scientist, MosaicML/Databricks
* see also the Mosaic/MPT-7B episode
* $1.3B MosaicML x Databricks acquisition
* [00:22:11] Lin Qiao - CEO, Fireworks AI
* [00:38:24] Aman Sanger - CEO, Anysphere (Cursor)
* see also the Cursor episode
* Tweet: Request-level memory-based KV caching
* Tweet: GPT-4 grading and Trueskill ratings for rerankers
* [00:51:14] Aravind Srinivas - CEO, Perplexity
* 1m app installs on iOS and Android
* pplx-online api 7b and 70b models
* Shaan Puri/Paul Graham Fierce Nerds story
* [01:04:26] Will Bryk - CEO, Metaphor
* “Andrew Huberman may have singlehandedly ruined the SF social scene”
* [01:12:49] Jeremy Howard - CEO, Answer.ai
* see also the End of Finetuning episode
* Jeremy’s podcast with Tanishq Abraham, Jess Leao
* Announcing Answer.ai with $10m from Decibel VC
* Laundry Buddy, Nov 2023 AI Meme of the Month
* [01:37:13] Joel Hestness - Principal Scientist, Cerebras
* CerebrasGPT, all the Cerebras papers we discussed
* [01:56:34] Jason Corso - CEO, Voxel51
* Open Source FiftyOne project
* [02:02:39] Brandon Duderstadt - CEO, Nomic.ai
* [02:12:39] Luca Antiga - CTO, Lightning.ai
* Pytorch Lightning, Lightning Studios, LitGPT
* [02:29:46] Jay Alammar - Engineering Fellow, Cohere