#145 - Data Engineering AMA w/ Matt Housley and Joe Reis
Sep 26, 2023
Matt Housley and Joe Reis discuss the current state of generative AI and its potential disruptiveness, challenges of data quality and answer consistency, exploration of directed acyclic graphs (DAGs) in data engineering, limitations and challenges of cron jobs, the potential of students driving innovation, strategies for exam preparation, and upcoming events and podcasts.
Data engineers should focus on integrating with the business to propose projects and solutions that drive innovation.
Finding a middle ground between different data modeling methodologies can leverage their strengths for specific use cases.
Data engineers should collaborate with the business, align their skills with organizational needs, and make a significant impact on business outcomes.
Deep dives
The importance of integrating with the business
One key insight from the episode is that data engineers should focus on integrating with the business. By understanding the needs and goals of different stakeholders, they can propose projects and solutions that provide value and drive innovation.
The evolution and challenges of data modeling
The episode delves into the ongoing debate about data modeling methodologies. It argues for moving beyond the binary choice between one big table and a strictly Kimball-style dimensional model, and for finding a middle ground that draws on the strengths of both. The discussion also emphasizes understanding the principles behind different data modeling techniques so that choices are made deliberately for each use case.
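To make the two ends of that debate concrete, here is a minimal sketch in plain Python. The table names, columns, and values (`dim_customer`, `fact_orders`, `build_wide_table`) are hypothetical and chosen only for illustration; they are not from the episode. It shows a small Kimball-style fact and dimension pair, and how the same data can be flattened into one big table.

```python
# Hypothetical Kimball-style model: one dimension table and one fact table.
dim_customer = {
    1: {"customer_name": "Acme", "region": "West"},
    2: {"customer_name": "Globex", "region": "East"},
}
fact_orders = [
    {"order_id": 100, "customer_id": 1, "amount": 250.0},
    {"order_id": 101, "customer_id": 2, "amount": 75.5},
    {"order_id": 102, "customer_id": 1, "amount": 40.0},
]


def build_wide_table(facts, customers):
    """Denormalize: copy the dimension attributes onto every fact row."""
    wide = []
    for row in facts:
        attrs = customers[row["customer_id"]]
        # New dict per row, so the original fact rows stay unchanged.
        wide.append({**row, **attrs})
    return wide


# "One big table": every row carries its own customer attributes.
orders_wide = build_wide_table(fact_orders, dim_customer)
```

One possible middle ground is to keep the dimensional model as the maintained source and derive wide tables from it for specific consumers, so that a change to a customer attribute is made in one place.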
The maturity of data engineering tools and career directions
As data engineering tools mature, the focus for data engineers should shift towards collaborating with the business and becoming an integral part of the organization. This involves understanding stakeholder needs, proposing projects, and providing insights that contribute to business outcomes. By aligning their skills and expertise with the evolving needs of the organization, data engineers can further advance their careers and make a significant impact.
The Definition of Data Warehousing
A data warehouse is defined as a subject-oriented, nonvolatile, integrated, time-variant collection of data in support of management decision making. Under this definition, data warehousing is an architecture and a way of thinking about data rather than a specific physical implementation.
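Two of those terms, nonvolatile and time-variant, are easier to see in a small example. The sketch below is one possible reading of them, not a reference design: records are only ever appended (nonvolatile), and each carries an effective date so the state at any past point can be reconstructed (time-variant). The names `history`, `record`, and `region_as_of` are hypothetical.

```python
from datetime import date

# Append-only history of (effective_date, customer_id, region) records.
history = []


def record(effective_date, customer_id, region):
    """Nonvolatile: add a new record instead of overwriting an old one."""
    history.append((effective_date, customer_id, region))


def region_as_of(customer_id, as_of):
    """Time-variant: return the region that was in effect on a given date."""
    matches = [
        (d, r) for d, cid, r in history
        if cid == customer_id and d <= as_of
    ]
    if not matches:
        return None
    # The latest effective date on or before `as_of` wins.
    return max(matches, key=lambda m: m[0])[1]


record(date(2023, 1, 1), 1, "West")
record(date(2023, 6, 1), 1, "East")  # customer moved; the old row is kept
```

Because nothing is overwritten, a question such as "which region was this customer in last March?" remains answerable after the customer moves.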
Relevant Tools and Skills for Data Engineering
In data engineering, it is important not to fixate on just one or two tools. Instead, focus on learning fundamental data aggregation operations and understanding different database technologies, and build a broad skill set grounded in general principles that transfer across areas. It is equally crucial to understand stakeholder needs and to master requirements gathering, since communication and the ability to gather business requirements are often underrated skills.
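As an illustration of what a fundamental aggregation operation looks like independent of any tool, here is a minimal group-by-and-sum sketch in plain Python. The helper name `group_by_sum` and the sample rows are hypothetical. The same idea appears as `GROUP BY` in SQL and as group-by operations in dataframe libraries.

```python
from collections import defaultdict


def group_by_sum(rows, key, value):
    """Sum `value` for each distinct `key` across a list of dict rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


# Hypothetical sample data.
sales = [
    {"region": "West", "amount": 250.0},
    {"region": "East", "amount": 75.5},
    {"region": "West", "amount": 40.0},
]

totals_by_region = group_by_sum(sales, "region", "amount")
```

Understanding the operation at this level makes it easier to recognize in whichever database or processing engine a given team happens to use.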