215: Data Sharing and the Truth About Data Clean Rooms with Patrik Devlin of Wilde AI
Nov 13, 2024
Patrik Devlin, founder of Wilde AI, dives into the world of data sharing and clean rooms, drawing on his rich background in software and data products. He discusses the evolution of QR codes, highlighting their marketing potential and rapid adoption during the pandemic. The conversation also explores the complexities of data clean rooms, the technical challenges they present, and innovative solutions like Snowflake. Patrik shares insights on data orchestration tools and emphasizes the importance of modern data infrastructures for efficient processing.
Patrik Devlin’s transition from software engineering to data engineering highlights the value of diverse experiences in creating effective data solutions.
The challenges of implementing data clean rooms reveal that practical, cost-effective data-sharing strategies are crucial for modern businesses.
Deep dives
Patrik Devlin's Background in Data Engineering
Patrik Devlin shares his journey into data engineering, highlighting his decade-long experience in software and his recent transition into data-related roles. He co-founded Wilde AI, which focuses on a predictive lifetime value (LTV) product that involves substantial data handling. His background lies primarily in software engineering, which didn't immediately translate into data engineering but inspired innovation within his new role. Devlin's excitement about exploring the technical aspects of data work illustrates the importance of diverse experiences in developing effective data solutions.
The Role of DuckDB and MotherDuck in Data Strategy
DuckDB and MotherDuck have become integral components of Wilde AI's architecture, allowing for innovative data sharing and analytics. As an embedded OLAP database, DuckDB runs inside the application process, and MotherDuck provides a hosted service built on it; together they streamline data interactions and data-sharing processes. Devlin emphasizes that the combination of these technologies allows for greater flexibility and capability in managing data, which is crucial for their predictive LTV model. The insights gained from using these tools also pave the way for future improvements in data handling.
Challenges of Data Clean Rooms and Real-World Insights
The complexities of implementing data clean rooms are explored, revealing that many firms lack a clear understanding of their functionality. Devlin notes that while clean rooms promise seamless data sharing, they often involve cumbersome processes that aren't practical for all businesses. His team explored various clean room solutions, only to find that some required excessive costs and complicated architecture, leading them to favor in-place data sharing strategies. This exploration underscores the necessity of evaluating real-world applications and costs when designing data-sharing frameworks.
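One way to picture the in-place sharing idea mentioned above is a consumer that queries the owner's data where it already lives instead of receiving a copy. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for an embedded database, not the DuckDB or MotherDuck setup discussed in the episode, and the `orders` table, file name, and function names are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_provider_db(path):
    """Create the data owner's database file with a small, made-up orders table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("a", 10.0), ("b", 5.0), ("a", 2.5)],
    )
    conn.commit()
    conn.close()


def total_by_customer(provider_path):
    """Query the provider's file in place from a separate consumer connection."""
    consumer = sqlite3.connect(":memory:")
    # ATTACH makes the provider's tables queryable without copying any rows
    # into the consumer's own database.
    consumer.execute("ATTACH DATABASE ? AS provider", (str(provider_path),))
    rows = consumer.execute(
        "SELECT customer, SUM(amount) FROM provider.orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    consumer.close()
    return rows


def demo():
    """Build a throwaway provider database and aggregate it from the consumer side."""
    with tempfile.TemporaryDirectory() as tmp:
        provider_path = Path(tmp) / "provider.db"
        build_provider_db(provider_path)
        return total_by_customer(provider_path)
```

The consumer only ever sees aggregated results from a query it runs against the owner's file. A real deployment would also need access controls, which this sketch leaves out.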
Innovating Data Infrastructure for Scalability
Devlin discusses the evolution of Wilde AI's data stack, which differs significantly from traditional software engineering approaches. By storing data partitioned by client from the outset, Wilde AI improves the scalability and performance of its analytics processes. Using Dagster for orchestration allows for efficient processing and transformation of client data, further optimizing their workflow. This shift in data management showcases the benefits of innovative architecture in meeting the demands of modern data-driven businesses.
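Storing data by client from the outset can be sketched as a partitioned file layout, where each client's rows live under their own directory and a per-client job reads only that directory. This is a minimal sketch using only the Python standard library. The `client_id=<id>` directory naming, the CSV format, the column names, and the sample rows are all assumptions made for illustration, not Wilde AI's actual layout.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

FIELDS = ["client_id", "order_id", "amount"]  # hypothetical columns


def write_partitioned(rows, root):
    """Write each client's rows to root/client_id=<id>/events.csv."""
    by_client = defaultdict(list)
    for row in rows:
        by_client[row["client_id"]].append(row)

    for client_id, client_rows in by_client.items():
        part_dir = Path(root) / f"client_id={client_id}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "events.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(client_rows)


def read_client(root, client_id):
    """Read one client's partition; other clients' files are never opened."""
    path = Path(root) / f"client_id={client_id}" / "events.csv"
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def demo():
    """Partition a few made-up rows, then load a single client's data."""
    rows = [
        {"client_id": "acme", "order_id": "1", "amount": "20.00"},
        {"client_id": "globex", "order_id": "2", "amount": "35.50"},
        {"client_id": "acme", "order_id": "3", "amount": "12.25"},
    ]
    with tempfile.TemporaryDirectory() as root:
        write_partitioned(rows, root)
        return read_client(root, "acme")
```

Because the partition key is part of the path, a per-client processing step can be pointed at one directory, and adding a client adds a directory without touching existing ones.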
Transition from Software Engineering to Data Stack (46:36)
Data Contracts and Type Safety (49:10)
Database Schema Perspectives (50:27)
Final Thoughts and Takeaways (51:35)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.