Neon: A Serverless And Developer Friendly Postgres
Jul 8, 2024
Nikita Shamgunov shares the journey of creating a serverless Postgres solution, Neon. Topics include maintaining Postgres compatibility, using database branches for isolated environments, managing latency and reliability in deployments, the pgvector extension for AI applications, open source vs. business models, and future plans for Neon.
Starburst Data Lake Platform offers comprehensive support for various table formats, showcasing scalability and reliability.
Nikita Shamgunov's journey in database architecture led to the creation of Neon, a cloud-friendly Postgres alternative with developer-centric features.
Neon's branching feature enables 'what-if' scenarios, simplifying staging environment creation for faster iteration and streamlined debugging.
Deep dives
Overview of Starburst Data Lake Platform on Trino Engine
Starburst Data Lake Platform is an end-to-end solution built on Trino, offering comprehensive support for various table formats like Apache Iceberg, Hive, and Delta Lake. Notable users include Comcast and DoorDash, showcasing its scalability and reliability.
Nikita Shamgunov's Journey in Postgres Development
Nikita Shamgunov shares his experience working on serverless Postgres, highlighting the company's growth from three members to a hundred. His fascination with databases started with PHP and MySQL during college and continued with SQL Server at Microsoft, shaping his deep expertise in database architecture.
Evolution of Neon Project and Vision for Cloud Platform Extension
Reflecting on the Neon project's evolution, Nikita details the conceptualization of an alternative to Aurora inspired by his mentor's work at AWS Aurora. Neon aims to be a cloud-only product, emphasizing compatibility with Postgres while focusing on developer-friendly features to accelerate app development.
Impacts of Neon's Branching Capabilities on Development Cycles
The branching feature in Neon enables 'what-if' scenario exploration for customers, allowing simple creation of staging environments by branching off production data. This functionality enhances developer workflows by facilitating independent development cycles and performance testing, leading to faster iteration and streamlined debugging.
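The idea of branching off production data can be illustrated with a toy copy-on-write model. This is only a sketch of the general technique, not Neon's storage engine: the `Branch` class, its methods, and the sample keys are hypothetical names chosen for illustration. A branch shares the parent's frozen data and records only its own writes, so creating one is cheap and changes on either side stay isolated.

```python
class Branch:
    """Toy copy-on-write branch (illustrative only, not Neon's implementation)."""

    def __init__(self, layers=()):
        self._layers = layers  # frozen dicts shared between branches, never mutated
        self._local = {}       # writes made on this branch only

    def read(self, key):
        # Newest data wins: check local writes first, then frozen layers.
        if key in self._local:
            return self._local[key]
        for layer in reversed(self._layers):
            if key in layer:
                return layer[key]
        raise KeyError(key)

    def write(self, key, value):
        self._local[key] = value

    def branch(self):
        # Freeze current writes into a shared layer; no data is copied.
        if self._local:
            self._layers = self._layers + (self._local,)
            self._local = {}
        return Branch(self._layers)


production = Branch()
production.write("user:1", "free")

staging = production.branch()        # staging sees production as of this moment
staging.write("user:1", "pro")       # change is visible only on staging
production.write("user:2", "free")   # later production writes do not leak into staging
```

In this sketch a staging branch can be modified or thrown away without touching production, which is the property the 'what-if' workflow depends on.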
Neon's Enhanced Collaboration with Project Partners and Focus on Data Management Innovations
Neon's roadmap includes expanding cloud presence, integrating more developer tools like GitHub apps, advancing AI capabilities, and deepening connections with data warehouses like Amazon Redshift. By building a robust infrastructure, Neon aims to accelerate app delivery and foster innovative data management practices in the industry.
Summary
Postgres is one of the most widely respected and liked database engines ever. To make it even easier for developers to use, Nikita Shamgunov decided to make it serverless, so that it can scale from zero to infinity. In this episode he explains the engineering involved to make that possible, as well as the numerous details that he and his team are packing into the Neon service to make it even more attractive for anyone who wants to build on top of Postgres.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management
Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. It is trusted by teams of all sizes, including Comcast and DoorDash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
Your host is Tobias Macey and today I'm interviewing Nikita Shamgunov about his work on making Postgres a serverless database at Neon.
Interview
Introduction
How did you get involved in the area of data management?
Can you describe what Neon is and the story behind it?
The ecosystem around Postgres is large and varied. What are the pain points that you are trying to address with Neon?
What does it mean for a database to be serverless?
What kinds of products and services are unlocked by making Postgres a serverless database?
How does your vision for Neon compare/contrast with what you know of PlanetScale?
Postgres is known for having a large ecosystem of plugins that add a lot of interesting and useful features, but the storage layer has not been as easily extensible historically. How have architectural changes in recent Postgres releases enabled your work on Neon?
What are the core pieces of engineering that you have had to complete to make Neon possible?
How have the design and goals of the project evolved since you first started working on it?
The separation of storage and compute is one of the most fundamental promises of the cloud. What new capabilities does that enable in Postgres?
How does the branching functionality change the ways that development teams are able to deliver and debug features?
Because the storage is now a networked system, what new performance/latency challenges does that introduce? How have you addressed them in Neon?
Anyone who has ever operated a Postgres instance has had to tackle the upgrade process. How does Neon address that process for end users?
The rampant growth of AI has touched almost every aspect of computing, and Postgres is no exception. How does the introduction of pgvector and semantic/similarity search functionality impact the adoption and usage patterns of Postgres/Neon?
What new challenges does that introduce for you as an operator and business owner?
What are the lessons that you learned from MemSQL/SingleStore that have been most helpful in your work at Neon?
What are the most interesting, innovative, or unexpected ways that you have seen Neon used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Neon?
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.