Colin Zima, an expert in data modeling and data engineering, chats about data modeling in 2023 and beyond, covering topics such as real ETL, semantic layers, and the challenges in business intelligence.
Balancing self-serve data models with governed data models is essential for productivity and scalability in data modeling.
Modern data modeling requires thoughtful practices to avoid complexity, maintenance issues, and performance problems.
Generative AI and large language models can enhance productivity, but transparency and interpretation challenges remain for non-technical users.
Deep dives
The Motivation behind Starting the Company
Colin, the CEO and founder of Omni, explains that he started the company out of a desire to get back to building and working directly with customers. His previous experience at Looker highlighted the tension between the appeal of a self-serve data model and the limitations of a governed one. Omni aims to combine the best of both: a tool that lets end users be productive and self-sufficient while preserving scalability, governance, and trust in the data.
Challenges with Modern Data Modeling
Colin discusses the challenges of modern data modeling, particularly how easy it has become to get started and how much flexibility modern data warehouses provide. These advances make it possible to extract and query data without much upfront planning, but the resulting data models can become complex, difficult to maintain, and prone to performance issues as data volumes grow. Colin emphasizes the need for thoughtful data modeling practices that account for scalability, performance, and security so that data analysis remains successful over the long term.
The Impact of Language Models on Data Analytics
Colin shares his perspective on the impact of generative AI and large language models, acknowledging their ability to write complex SQL queries and generate code. However, he draws a key distinction between users who can understand the generated SQL and those who cannot. These models can make users who are familiar with SQL more productive, but the lack of transparency and the inability to explain the generated code make it hard for non-technical users to interpret and trust the outputs. Colin suggests that the most effective use of these language models could be generating post-processing logic within well-defined user interfaces, rather than relying solely on black-box SQL blocks.
Importance of Thinking Ahead for Querying Data
Thinking ahead about how data will be queried is crucial for efficient and effective data analysis. By considering factors like user roles and specific information needed, it becomes possible to optimize query performance and ensure convenient access to the data. Building a data model that can be universally applied to various content and dashboards is a powerful tool that empowers users to query and explore data without unnecessary barriers.
Data Contracts and the Balance of Enforcement
Data contracts, which establish rules for data accuracy and consistency, can play a role in ensuring data quality. However, enforcement has a cost, and that cost has to be weighed against the benefit. Contracts are valuable for critical data produced and consumed within an application, but implementing them across the entire data pipeline may prove impractical. Organizations should weigh the impact of data contracts carefully, balancing data quality against the resources enforcement requires.
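To make the idea of a data contract concrete, here is a minimal Python sketch of one way a contract might be written down and checked against a single record. It is an illustration only and was not discussed in the episode: the names `FieldRule`, `ORDER_CONTRACT`, and `validate`, as well as the order fields, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """One rule in a (hypothetical) data contract."""
    name: str
    dtype: type
    required: bool = True
    check: Optional[Callable[[Any], bool]] = None


# Hypothetical contract for a critical "orders" dataset.
ORDER_CONTRACT: List[FieldRule] = [
    FieldRule("order_id", int),
    FieldRule("amount_usd", float, check=lambda v: v >= 0),
    FieldRule("coupon_code", str, required=False),
]


def validate(record: Dict[str, Any], contract: List[FieldRule]) -> List[str]:
    """Return a list of contract violations; an empty list means the record passes."""
    errors: List[str] = []
    for rule in contract:
        value = record.get(rule.name)
        if value is None:
            # Missing (or null) values only matter for required fields.
            if rule.required:
                errors.append(f"{rule.name}: missing required field")
            continue
        if not isinstance(value, rule.dtype):
            errors.append(
                f"{rule.name}: expected {rule.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if rule.check is not None and not rule.check(value):
            errors.append(f"{rule.name}: failed value check")
    return errors
```

In keeping with the trade-off described above, a check like this would probably be applied only where the producer and consumer both depend on the data being right, rather than at every step of the pipeline.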
Colin Zima joins the show to chat about what data modeling should look like in 2023 and beyond. We'll chat about real ETL, semantic layers, the troubles with BI, and much more.
#datamodeling #data #dataengineering #analytics