Mistral has unveiled Codestral Mamba, a 7B-parameter model built on the Mamba architecture and optimized for code generation, offering fast, linear-time inference on long inputs and released as open source under the Apache 2.0 license.
Mistral AI and NVIDIA have collaborated to launch Mistral NeMo 12B, an enterprise-grade AI model optimized for NVIDIA hardware. It was trained on the NVIDIA DGX Cloud AI platform and is fully open source under the Apache 2.0 license, with FP8-aware training to reduce memory requirements at inference time.
Stability AI faced criticism for Stable Diffusion 3's restrictive license agreement, which led some community platforms to ban content made with the model. The company responded by updating the terms to grant free use for research, non-commercial, and limited commercial purposes, with free access for individuals and businesses under $1 million in annual revenue.
Hugging Face has introduced SmolLM, a family of small language models in three sizes (135M, 360M, and 1.7B parameters) designed to run on local devices such as laptops and phones, surpassing existing models in their size categories. The models are fully open source, trained on newly curated datasets, and released under permissive licensing terms.
Businesses making over $1 million in annual revenue must instead negotiate a paid license or build alternatives in-house for commercial use; the tiered terms aim to keep access broad while sustaining the company's business model.
FlashAttention techniques speed up LLM training and inference by optimizing how attention is implemented on the GPU. FlashAttention-3 brings new optimizations specifically for NVIDIA Hopper GPUs: it utilizes up to 75% of the GPU's theoretical maximum throughput, yielding a 1.5 to 2 times speedup over the previous version.
Training AI models involves enormous numbers of matrix multiplications, which account for most of the compute in attention mechanisms and feedforward layers. Special functions like softmax contribute few FLOPs but can still bottleneck training because they are limited by memory bandwidth rather than arithmetic. Techniques like FlashAttention therefore minimize data movement between the GPU's large, slow HBM and its small, fast on-chip SRAM, fusing the attention steps so intermediate results never leave the chip, which yields significant speedups.
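To make that memory argument concrete, here is a minimal NumPy sketch of the tiled, online-softmax idea behind FlashAttention. It is not the actual FlashAttention-3 kernel (which is hand-tuned CUDA for Hopper); the function names, block size, and tensor shapes are illustrative assumptions:

import numpy as np

def naive_attention(Q, K, V):
    # Materializes the full (n x n) score matrix -- this is the
    # slow-memory traffic that FlashAttention avoids.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    return (P / P.sum(axis=-1, keepdims=True)) @ V

def tiled_attention(Q, K, V, block=64):
    # Processes K/V one tile at a time, keeping running softmax
    # statistics so the full score matrix never exists at once.
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((n, d))
    m = np.full((n, 1), -np.inf)   # running row-wise max
    l = np.zeros((n, 1))           # running softmax denominator
    for j in range(0, K.shape[0], block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T * scale                              # scores for this tile only
        m_new = np.maximum(m, S.max(axis=-1, keepdims=True))
        P = np.exp(S - m_new)                             # tile-local numerators
        correction = np.exp(m - m_new)                    # rescale earlier partial sums
        l = l * correction + P.sum(axis=-1, keepdims=True)
        O = O * correction + P @ Vj
        m = m_new
    return O / l

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
assert np.allclose(naive_attention(Q, K, V), tiled_attention(Q, K, V), atol=1e-6)

The trick is that the running max m and normalizer l let each tile's partial output be rescaled on the fly, so the n x n attention matrix is never written out to slow memory; on a GPU, the same structure lets each tile live entirely in on-chip SRAM.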
Whistleblowers have raised concerns that OpenAI allegedly prevented staff from reporting safety risks to regulators. Agreements requiring employees to waive their federal rights to whistleblower compensation and to seek the company's prior consent before disclosing information are claimed to violate federal law. Such provisions could hinder transparency and regulatory oversight of AI safety concerns.
The US government is considering more stringent restrictions on China's semiconductor industry. Proposed measures include invoking the foreign direct product rule to pressure companies in Japan and the Netherlands to limit business with Chinese entities. The goal is to tighten oversight of technology transfers that could pose national security risks.
Google and Microsoft are providing Chinese companies access to Nvidia chips via cloud services hosted outside China. Because export controls restrict shipping the chips to China but not renting them remotely, the arrangement complies with the letter of US regulations while still letting Chinese firms benefit from the GPU technology, underscoring how difficult international technology transfers are to police.
Our 175th episode with a summary and discussion of last week's big AI news!
With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)
In this episode of Last Week in AI, hosts Andrey Kurenkov and Jeremie Harris explore recent AI advancements including OpenAI's release of GPT-4o Mini and Mistral's open-source models, covering their impacts on affordability and performance. They delve into enterprise tools for compliance, text-to-video models like Haiper 1.5, and YouTube Music enhancements. The conversation further addresses AI research topics such as the benefits of numerous small expert models, novel benchmarking techniques, and advanced AI reasoning. Policy issues including U.S. export controls on AI technology to China and internal controversies at OpenAI are also discussed, alongside Elon Musk's supercomputer ambitions and OpenAI's Prover-Verifier Games initiative.
Read our text newsletter and comment on the podcast at https://lastweekin.ai/
If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.
Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Timestamps + links: