Paul Dix, the CTO of InfluxData, discusses the inception of InfluxDB, a powerful open-source time series database. He shares the challenges faced in early development and the pivotal shift from Scala to Go. Dix highlights the advantages of transitioning to Rust, including enhanced performance and error handling. He elaborates on the significant upgrades from InfluxDB version 1.0 to 2.0, along with strategies for managing high data volumes. The importance of community contributions and ongoing learning in Rust completes this insightful conversation.
InfluxDB was created to fill the gap in time series databases by addressing shortcomings in existing solutions like Graphite and focusing on specific use cases.
The development of InfluxDB transitioned from Go to Rust to enhance performance, concurrency, and error handling, resulting in significantly improved ingestion speeds and query processing times.
InfluxData emphasizes community engagement and the incorporation of new features like richer data types and scripting capabilities to modernize InfluxDB and foster collaboration within the Rust ecosystem.
Deep dives
Origins and Evolution of InfluxDB
InfluxDB was created to address a gap in time series databases that had become apparent by 2013, when existing solutions like Graphite fell short. Paul Dix and his co-founder initially built a SaaS metrics product on Scala and Cassandra, but they redirected the company towards developing a robust time series database. The pivot was driven by two observations: many companies were struggling to build their own time series databases, and the existing open-source options had significant limitations. The founding team quickly built a prototype of InfluxDB, which drew immediate interest because it was designed specifically for time series data.
The Shift to Go and Performance Concerns
The decision to write InfluxDB in Go was rooted in the language’s simplicity and rapid development capabilities, despite initial hesitations about its garbage collector. The early architecture of InfluxDB struggled with challenges such as file handle limits and inefficient data deletion processes. This led to the realization that traditional databases couldn't effectively manage the high data ingestion volume typical of time series use cases. By leveraging Go's strengths, the team optimized their initial architecture to improve performance, ultimately leading to version 1.0's release in 2016, which included better data modeling and indexing.
Transitioning to Rust for Enhanced Capabilities
As InfluxDB evolved, the need for new features such as infinite cardinality and improved query performance became apparent, prompting a move to bring Rust into the tech stack. Rust's error handling, concurrency features, and lack of a garbage collector appealed to a team looking for better reliability and performance. In late 2019, a small team began prototyping a new database core in Rust, and the effort eventually grew to include support for SQL-like queries. This move has led to faster ingestion and shorter query processing times in the new architecture of InfluxDB 3.0.
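To make the point about Rust's error handling concrete, here is a minimal sketch of how failures are modeled as values with `Result` and propagated with the `?` operator. The `Point` type, the `ParseError` enum, the `parse_point` function, and the whitespace-separated input format are all hypothetical and invented for illustration; this is not InfluxDB's line protocol or any code from the project.

```rust
use std::fmt;
use std::num::{ParseFloatError, ParseIntError};

// Hypothetical, simplified data point: "measurement value timestamp".
// This is NOT InfluxDB's real line protocol, only an illustration.
#[derive(Debug, PartialEq)]
struct Point {
    measurement: String,
    value: f64,
    timestamp: i64,
}

// Every way parsing can fail is spelled out in the type system.
#[derive(Debug, PartialEq)]
enum ParseError {
    MissingField(&'static str),
    BadValue(ParseFloatError),
    BadTimestamp(ParseIntError),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::MissingField(name) => write!(f, "missing field: {}", name),
            ParseError::BadValue(e) => write!(f, "invalid value: {}", e),
            ParseError::BadTimestamp(e) => write!(f, "invalid timestamp: {}", e),
        }
    }
}

// The `?` operator returns early on the first error, so the happy path
// reads top to bottom and no failure can be silently ignored.
fn parse_point(line: &str) -> Result<Point, ParseError> {
    let mut parts = line.split_whitespace();
    let measurement = parts
        .next()
        .ok_or(ParseError::MissingField("measurement"))?;
    let value = parts
        .next()
        .ok_or(ParseError::MissingField("value"))?
        .parse::<f64>()
        .map_err(ParseError::BadValue)?;
    let timestamp = parts
        .next()
        .ok_or(ParseError::MissingField("timestamp"))?
        .parse::<i64>()
        .map_err(ParseError::BadTimestamp)?;
    Ok(Point {
        measurement: measurement.to_string(),
        value,
        timestamp,
    })
}

fn main() {
    for line in ["cpu 0.64 1700000000", "cpu not-a-number 1700000000", "cpu"] {
        match parse_point(line) {
            Ok(point) => println!("ok: {:?}", point),
            Err(err) => println!("error: {}", err),
        }
    }
}
```

Because the caller receives a `Result`, the compiler warns if the error case is left unhandled, which is one plausible reading of the reliability benefit described above.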
Rust’s Impact on Database Design and Community Involvement
While Rust's performance characteristics were a consideration, its ability to eliminate data races and enable fearless concurrency played a crucial role in the decision. As the team tailored their designs to Rust's capabilities, they also emphasized community engagement by opening discussions on how to improve Apache DataFusion, the query engine project associated with InfluxDB. The aim was not just to enhance their product but also to cultivate a collaborative environment that encourages contributions from others in the Rust ecosystem. Looking forward, the team envisions contributors sharing scripts and tools, fostering a supportive community around time series data.
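As a small illustration of what "fearless concurrency" means in practice, the sketch below has several writer threads update a shared counter. The function name `ingest_concurrently` and the idea of counting "ingested points" are assumptions made up for this example; it uses only the Rust standard library and does not reflect InfluxDB's actual ingestion code.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: several writer threads each "ingest" a number of
// points and bump a shared counter. The counter lives behind Arc<Mutex<..>>;
// sharing a plain `&mut u64` across threads would be rejected at compile
// time, which is how Rust rules out data races.
fn ingest_concurrently(writers: usize, points_per_writer: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..writers)
        .map(|_| {
            // Each thread gets its own reference-counted handle.
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..points_per_writer {
                    // The lock guard is released at the end of the statement.
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for every writer to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    let total = ingest_concurrently(4, 1_000);
    println!("ingested {} points", total);
}
```

A real ingestion path would likely batch writes or use channels rather than lock per point; the lock-per-increment loop is kept only because it makes the safety guarantee easy to see.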
Future Developments and Open Source Aspirations
InfluxData aims to continue expanding the functionalities of InfluxDB, particularly by introducing richer data types and embedded scripting capabilities to facilitate data transformations. The development of an open-source version of InfluxDB 3.0 is on the horizon, aligning with the company’s commitment to community involvement and transparency. The integration of features like an embedded VM for Python or JavaScript scripting reflects the desire to modernize their platform and enhance user experience. InfluxData is proactively seeking Rust developers to contribute to these exciting advancements, highlighting a collaborative approach to shaping the future of time-series data management.
About InfluxData
InfluxData is the creator of InfluxDB, the leading open source time series database. They offer a cloud service, InfluxDB Cloud, and a commercial on-premise product, InfluxDB Enterprise (https://www.influxdata.com/products/influxdb-enterprise/).
About Paul Dix
Paul Dix is the founder and CTO of InfluxData (https://www.influxdata.com/). He has helped build software for startups, large companies and organizations like Microsoft, Google, McAfee, Thomson Reuters, and Air Force Space Command. He is the series editor for Addison Wesley's Data & Analytics book and video series (https://www.informit.com/imprint/series_detail.aspx?ser=4255387). In 2010 Paul wrote the book "Service Oriented Design with Ruby and Rails" (https://www.oreilly.com/library/view/service-oriented-design-with/9780321700124/) for Addison Wesley's Professional Ruby Series. In 2009 he started the NYC Machine Learning Meetup (https://www.meetup.com/nyc-machine-learning/), which now has over 13,000 members. Paul holds a degree in computer science from Columbia University. You can find Paul on Twitter (https://twitter.com/pauldix) and GitHub (https://github.com/pauldix).
Links
- InfluxData: https://www.influxdata.com/
- Careers at InfluxData: https://www.influxdata.com/careers/
- Blog post: Meet the Founders Who Rewrote in Rust: https://www.influxdata.com/blog/meet-founders-who-rewrote-in-rust/
- Reddit: Details and discussion on the Rust rewrite: https://www.reddit.com/r/rust/comments/16v13l5/influxdb_officially_made_the_switch_from_go_rust/
- Blog post: The Plan for InfluxDB 3.0 Open Source: https://www.influxdata.com/blog/the-plan-for-influxdb-3-0-open-source/