The New Stack Podcast cover image

The New Stack Podcast

Latest episodes

undefined
Jun 6, 2024 • 18min

Who’s Keeping the Python Ecosystem Safe?

Mike Fiedler, a PyPI safety and security engineer at the Python Software Foundation, prefers the title “code gardener,” reflecting his role in maintaining and securing open source projects. Recorded at PyCon US, Fiedler explains his task of “pulling the weeds” in code—handling unglamorous but crucial aspects of open source contributions. Since August, funded by Amazon Web Services, Fiedler has focused on enhancing the security of the Python Package Index (PyPI). His efforts include ensuring that both packages and the pipeline are secure, emphasizing the importance of vetting third-party modules before deployment.One of Fiedler’s significant initiatives was enforcing mandatory two-factor authentication (2FA) for all PyPI user accounts by January 1, following a community awareness campaign. This transition was smooth, thanks to proactive outreach. Additionally, the foundation collaborates with security researchers and the public to report and address malicious packages.In late 2023, a security audit by Trail of Bits, funded by the Open Technology Fund, identified and quickly resolved medium-sized vulnerabilities, increasing PyPI's overall security. More details on Fiedler's work are available in the full interview video.Learn more from The New Stack about PyPl:PyPl Strives to Pull Itself Out of TroubleHow Python Is EvolvingPoisoned Lolip0p PyPI PackagesJoin our community of newsletter subscribers to stay on top of the news and at the top of your game. 
undefined
May 30, 2024 • 23min

How Training Data Differentiates Falcon, the LLM from the UAE

The name "Falcon" for the UAE’s large language model (LLM) symbolizes the national bird's qualities of courage and perseverance, reflecting the vision of the Technology Innovation Institute (TII) in Abu Dhabi. TII, launched in 2020, addresses AI’s rapid advancements and unintended consequences by fostering an open-source approach to enhance community understanding and control of AI. In this New Stack Makers, Dr. Hakim Hacid, Executive Director and Acting Chief Researcher, Technology Innovation Institute emphasized the importance of perseverance and innovation in overcoming challenges. Falcon gained attention for being the first truly open model with capabilities matching many closed-source models, opening new possibilities for practitioners and industry. Last June, Falcon introduced a 40-billion parameter model, outperforming the LLaMA-65B, with smaller models enabling local inference without the cloud. The latest 180-billion parameter model, trained on 3.5 trillion tokens, illustrates Falcon’s commitment to quality and efficiency over sheer size. Falcon’s distinctiveness lies in its data quality, utilizing over 80% RefinedWeb data, based on CommonCrawl, which ensures cleaner and deduplicated data, resulting in high-quality outcomes. This data-centric approach, combined with powerful computational resources, sets Falcon apart in the AI landscape. Learn more from The New Stack about Open Source AI: Open Source Initiative Hits the Road to Define Open Source AI  Linus Torvalds on Security, AI, Open Source and TrustTransparency and Community: An Open Source Vision for AI Join our community of newsletter subscribers to stay on top of the news and at the top of your game. 
undefined
May 22, 2024 • 36min

Out with C and C++, In with Memory Safety

Crash-level bugs continue to pose a significant challenge due to the lack of memory safety in programming languages, an issue persisting since the punch card era. This enduring problem, described as "the Joker to the Batman" by Anil Dash, VP of developer experience at Fastly, is highlighted in a recent episode of The New Stack Makers. The White House has emphasized memory safety, advocating for the adoption of memory-safe programming languages and better software measurability. The Office of the National Cyber Director (ONCD) noted that languages like C and C++ lack memory safety traits and are prevalent in critical systems. They recommend using memory-safe languages, such as Java, C#, and Rust, to develop secure software. Memory safety is particularly crucial for the US government due to the high stakes, especially in space exploration, where reliability standards are exceptionally stringent. Dash underscores the importance of resilience and predictability in missions that may outlast their creators, necessitating rigorous memory safety practices.Learn more from The New Stack about Memory Safety:White House Warns Against Using Memory-Unsafe Languages Can C++ Be Saved? Bjarne Stroupstrup on Ensuring Memory SafetyBjarne Stroupstrup's Plan for Bringing Safety to C++Join our community of newsletter subscribers to stay on top of the news and at the top of your game.  
undefined
May 16, 2024 • 21min

How Open Source and Time Series Data Fit Together

In the push to integrate data into development, time series databases have gained significant importance. These databases capture time-stamped data from servers and sensors, enabling the collection and storage of valuable information. InfluxDB, a leading open-source time series database technology by InfluxData, has partnered with Amazon Web Services (AWS) to offer a managed open-source service for time series databases. Brad Bebee, General Manager of Amazon Neptune and Amazon Timestream highlighted the challenges faced by customers managing open-source Influx database instances, despite appreciating its API and performance. To address this, AWS initiated a private beta offering a managed service tailored to customer needs. Paul Dix, Co-founder and CTO of InfluxData joined Bebee, and highlighted Influx's prized utility in tracking measurements, metrics, and sensor data in real-time. AWS's Timestream complements this by providing managed time series database services, including TimesTen for Live Analytics and Timestream for Influx DB. Bebee emphasized the growing relevance of time series data and customers' preference for managed open-source databases, aligning with AWS's strategy of offering such services. This partnership aims to simplify database management and enhance performance for customers utilizing time series databases. Learn more from The New Stack about time series databases:What Are Time Series Databases, and Why Do You Need Them?Amazon Timestream: Managed InfluxDB for Time Series Data Install the InfluxDB Time-Series Database on Ubuntu Server 22.04Join our community of newsletter subscribers to stay on top of the news and at the top of your game. 
undefined
May 9, 2024 • 18min

Postgres is Now a Vector Database, Too

Amazon Web Services (AWS) has introduced PG Vector, an open-source tool that integrates generative AI and vector capabilities into PostgreSQL databases. Sirish Chandrasekaran, General Manager of Amazon Relational Database Services, explained at Open Source Summit 2024 in Seattle that PG Vector allows users to store vector types in Postgres and perform similarity searches, a key feature for generative AI applications. The tool, developed by Andrew Kane and offered by AWS in services like Aurora and RDS, originally used an indexing scheme called IVFFlat but has since adopted Hierarchical Navigable Small World (HNSW) for improved query performance. HNSW offers a graph-based approach, enhancing the ability to find nearest neighbors efficiently, which is crucial for generative AI tasks. AWS emphasizes customer feedback and continuous innovation in the rapidly evolving field of generative AI, aiming to stay responsive and adaptive to customer needs. Learn more from The New Stack about Vector Databases Top 5 Vector Database Solutions for Your AI Project Vector Databases Are Having a Moment – A Chat with Pinecone Why Vector Size Matters Join our community of newsletter subscribers to stay on top of the news and at the top of your game. https://thenewstack.io/newsletter/ 
undefined
May 2, 2024 • 18min

Valkey: A Redis Fork with a Future

Former Redis core contributor Madelyn Olson, Google's Ping Xie, and Oracle's Dmitry Polyakovsky discuss Valkey, a Redis fork challenging Redis' new license. They highlight the importance of open, permissive licenses for tech companies, Valkey's industry backing, plans for incremental updates, module development in Rust, and maintaining compatibility for a robust ecosystem.
undefined
Apr 25, 2024 • 23min

Kubernetes Gets Back to Scaling with Virtual Clusters

A virtual cluster, described by Loft Labs CEO Lukas Gentele at Kubecon+ CloudNativeCon Paris, is a Kubernetes control plane running inside a container within another Kubernetes cluster. In this New Stack Makers episode, Gentele explained that this approach eliminates the need for numerous separate control planes, allowing VMs to run in lightweight, quickly deployable containers. Loft Labs' open-sourced vcluster technology enables virtual clusters to spin up in about six seconds, significantly faster than traditional Kubernetes clusters that can take over 30 minutes to start in services like Amazon EKS or Google GKE.The integration of vCluster into Rancher at KubeCon Paris enables users to manage virtual clusters alongside real clusters seamlessly. This innovation addresses challenges faced by companies managing multiple applications and clusters, advocating for a multi-tenant cluster approach for improved sharing and security, contrary to the trend of isolated single-tenant clusters that emerged due to complexities in cluster sharing within Kubernetes. Learn more from The New Stack about virtual clusters: Vcluster to the Rescue Navigating the Trade-Offs of Scaling Kubernetes Dev Environments Managing Kubernetes Clusters for Platform Engineers Join our community of newsletter subscribers to stay on top of the news and at the top of your game. https://thenewstack.io/newsletter/
undefined
Apr 22, 2024 • 29min

How Giant Swarm Is Helping to Support the Future of Flux

When Weaveworks, known for pioneering "GitOps," shut down, concerns arose about the future of Flux, a critical open-source project. However, Puja Abbassi, Giant Swarm's VP of Product, reassured Alex Williams, Founder and Publisher of The New Stack at Open Source Summit in Paris that Flux's maintenance is secure in this episode of The New Makers podcast. Giant companies like Microsoft Azure and GitLab have pledged support. Giant Swarm, an avid Flux user, also contributes to its development, ensuring its vitality alongside related projects like infrastructure code plugins and UI improvements. Abbassi highlighted the importance of considering a project's sustainability and integration capabilities when choosing open-source tools. He noted Argo CD's advantage in UI, emphasizing that projects like Flux must evolve to meet user expectations and avoid being overshadowed. This underscores the crucial role of community support, diversity, and compatibility within the Cloud Native Computing Foundation's ecosystem for long-term tool adoption.Learn more from The New Stack about Flux:  End of an Era: Weaveworks Closes Shop Amid Cloud Native Turbulence Why Flux Isn't Dying after WeaveworksJoin our community of newsletter subscribers to stay on top of the news and at the top of your game.   
undefined
Apr 11, 2024 • 38min

AI, LLMs and Security: How to Deal with the New Threats

The use of large language models (LLMs) has become widespread, but there are significant security risks associated with them. LLMs with millions or billions of parameters are complex and challenging to fully scrutinize, making them susceptible to exploitation by attackers who can find loopholes or vulnerabilities. On an episode of The New Stack Makers, Chris Pirillo, Tech Evangelist and Lance Seidman, Backend Engineer at Atomic Form discussed these security challenges, emphasizing the need for human oversight to protect AI systems.One example highlighted was malicious AI models on Hugging Face, which exploited the Python pickle module to execute arbitrary commands on users' machines. To mitigate such risks, Hugging Face implemented security scanners to check every file for security threats. However, human vigilance remains crucial in identifying and addressing potential exploits.Seidman also stressed the importance of technical safeguards and a culture of security awareness within the AI community. Developers should prioritize security throughout the development life cycle to stay ahead of evolving threats. Overall, the message is clear: while AI offers remarkable capabilities, it requires careful management and oversight to prevent misuse and protect against security breaches.Learn more from The New Stack about AI and security:  Artificial Intelligence: Stopping the Big Unknown in Application, Data Security Cyberattacks, AI and Multicloud Hit Cybersecurity in 2023Will Generative AI Kill DevSecOps?  Join our community of newsletter subscribers to stay on top of the news and at the top of your game. 
undefined
Apr 4, 2024 • 29min

How Kubernetes Faces a New Reality with the AI Engineer

The Kubernetes community primarily focuses on improving the development and operations experience for applications and infrastructure, emphasizing DevOps and developer-centric approaches. In contrast, the data science community historically moved at a slower pace. However, with the emergence of the AI engineer persona, the pace of advancement in data science has accelerated significantly. Alex Williams, founder and publisher of The New Stack co-hosted a discussion with Sanjeev Mohan, an independent analyst, which highlighted the challenges faced by data-related tasks on Kubernetes due to the stateful nature of data. Unlike applications, restarting a database node after a failure may lead to inconsistent states and data loss. This discrepancy in pace and needs between developers and data scientists led to Kubernetes and the Cloud Native Computing Foundation initially overlooking data science. Nevertheless, Mohan noted that the pace of data engineers has increased as they explore new AI applications and workloads. Kubernetes now plays a crucial role in supporting these advancements by helping manage resources efficiently, especially considering the high cost of training large language models (LLMs) and using GPUs for AI workloads. Mohan also discussed the evolving landscape of AI frameworks and the importance of aligning business use cases with AI strategies. Learn more from The New Stack about data development and DevOps: AI Will Drive Streaming Data Use — But Not Yet, Report Says https://thenewstack.io/ai-will-drive-streaming-data-adoption-says-redpanda-survey/ The Paradigm Shift from Model-Centric to Data-Centric AI https://thenewstack.io/the-paradigm-shift-from-model-centric-to-data-centric-ai/ AI Development Needs to Focus More on Data, Less on Models https://thenewstack.io/ai-development-needs-to-focus-more-on-data-less-on-models/ Learn more from The New Stack about data development and DevOps: AI Will Drive Streaming Data Use - But Not Yet, Report SaysThe Paradigm Shift from Model-Centric to Data-Centric AIAI Development Needs to Focus More on Data, Less on Models Join our community of newsletter subscribers to stay on top of the news and at the top of your game. https://thenewstack.io/newsletter/

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app