E151: Taking on DBT by Combining Data Transformation with a Query Engine
Sep 30, 2024
Lukas Schulte, Co-Founder and CEO of SDF Labs, shares insights from his journey in data transformation and querying. He discusses the challenges of the Modern Data Stack, including SQL dialect variations, and how SDF Labs enables local SQL validation. Lukas highlights his choice of Rust for robust development and the advantages of a closed-source CLI for speed and flexibility. He emphasizes the importance of strategic partnerships in enhancing functionality and user experience within the data ecosystem. Tune in for valuable lessons from the world of data!
SDF Labs addresses the pain points of modern data stacks by providing a versatile data transformation layer that smooths over inconsistencies between SQL dialects.
Community engagement and customer feedback are emphasized as crucial for refining the product and driving growth in near real time.
Deep dives
Origins of SDF Labs
The concept for SDF Labs emerged from the challenges experienced while building a data-intensive video editing app. As the startup grew, it needed a scalable data transformation layer to handle growing complexity, such as GDPR compliance and the evolving demands of diverse data consumers. Lukas, the CEO, recognized that existing solutions primarily catered to large-scale enterprises like Meta, which prompted him to explore developing a similar yet more adaptable framework. Together with co-founders who had gained data architecture experience at Meta, he set out to create a data transformation engine that works for startups and larger companies alike.
Addressing Data Complexity
SDF Labs tackles the complexities of modern data stacks, which arise primarily from inconsistencies in SQL dialects across data warehouse providers. This fragmentation calls for robust developer tooling, and that tooling in turn needs a compiler that can give accurate feedback for each SQL environment. SDF's solution emulates the SQL compilers of popular platforms like Snowflake and BigQuery, allowing developers to run validations and transformations locally. This reduces reliance on cloud-based feedback, speeding up development while lowering costs and improving data quality.
Integration and Workflow Optimization
SDF Labs aims to fit seamlessly within existing data workflows by working alongside popular tools like dbt, without requiring users to adopt a new SQL dialect or framework. Their approach adds static analysis and compile-time guarantees to the data transformation process. By building on Apache DataFusion, they provide a library-like tool that augments users' current systems while promoting interoperability within their tech stacks. The goal is to open up new use cases that simplify the user experience and streamline data processing.
Community Building and Market Positioning
Lukas highlights the importance of fostering an engaged community for SDF Labs while choosing a closed-source model so the team can develop faster. The strategy includes building a strong customer base through early adopters who experience tangible benefits from the product, which in turn drives word-of-mouth referrals. A notable early success involved a data consultancy that shared positive feedback on the product, contributing significantly to early interest and inquiries. While the company plans to eventually consider open-sourcing parts of its platform, the focus remains on listening to customer needs to refine and expand its offerings.
Lukas Schulte is Co-Founder and CEO of SDF Labs (Semantic Data Fabric), a data transformation layer and query engine platform. They're an open core company powered by the Apache DataFusion query engine.
SDF Labs has raised $9M from investors including RTP Global and Two Sigma Ventures.
In this episode, we dig into the complications and pain points of the Modern Data Stack, shifting left with data (i.e., moving more over to the client), competing with dbt by adding a query engine, why building in Rust was important, why their CLI is closed source, the importance of a strong partner strategy as a data company & more!