Chris Riccomini - Building (and Writing About) Data Intensive Applications
Nov 13, 2024
Chris Riccomini, an engineer with a rich background at LinkedIn and WePay, shares insights on building SlateDB and the intricacies of data infrastructure. He recounts his journey from PayPal to LinkedIn, detailing the shifts in technology and engineering challenges. The discussion flows into writing a collaborative handbook for new software engineers during the pandemic. Riccomini also delves into learning Rust, the role of AI in programming, and strategies for staying relevant in tech, emphasizing the future of data applications.
Chris emphasizes that optimizing data infrastructure can significantly enhance model performance, illustrating this with his experience at LinkedIn and Hadoop.
He highlights the importance of open sourcing technology in attracting talent and fostering innovation, showcasing LinkedIn's culture of collaboration and branding.
Deep dives
Career Overview and Infrastructure Expertise
Chris has a solid background in engineering, having worked at two major companies, LinkedIn and WePay, over a span of 15 years. His expertise lies in infrastructure, where he played key roles in the Hadoop and Kafka ecosystems while at LinkedIn. Notably, Chris was involved in developing Apache Samza, a stream processing system, and later managed the data infrastructure team at WePay, contributing to Airflow integration and payments infrastructure. This extensive experience in building and managing complex systems laid the foundation for his current endeavors, including his writing and investment activities.
The Importance of Infrastructure in Data Performance
The podcast highlights Chris's belief that the infrastructure behind data systems often holds more leverage than the models themselves. He provides an example from his time at LinkedIn, where moving a data processing system from Oracle to Hadoop significantly sped up the "People You May Know" algorithm. This shift not only improved performance metrics but also allowed for daily updates rather than weeks-long delays. Chris argues that optimizing infrastructure can lead to substantial gains in model performance, emphasizing the critical role of foundational systems in data science.
The Value of Open Source at LinkedIn
Chris discusses LinkedIn's culture of open sourcing tools and technologies developed within their engineering teams. He suggests that open sourcing was pivotal for branding and recruiting, allowing prospective employees to engage with the innovative technologies LinkedIn was building. This approach fostered a collaborative environment, attracting talent who were eager to contribute to impactful projects. While some competitive technologies were kept proprietary, sharing non-essential tools helped establish LinkedIn as a leader in the tech ecosystem.
Current Trends and Future Interests in Data Technologies
Chris expresses excitement about the evolving landscape of data technologies, particularly durable execution and the shift towards PostgreSQL. He is exploring the implications of moving compute closer to the client, as well as advancements in cloud-based databases. Additionally, he is bullish on the Rust programming language and its application within big data frameworks, noting a growing movement to rewrite big data tools in Rust. These developments indicate a promising future for data infrastructure, with ongoing innovations aimed at both performance and developer needs.