AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
When building a data engineering team from scratch, it's important to consider the business problems that need to be solved and align the hiring process with those needs. Finding senior individuals who are comfortable with ambiguity and can propose solutions to the business problems is crucial. Start with a small team and iterate based on company goals, fostering data literacy and bringing data into the conversation. Setting clear expectations and milestones is important in the early stages.
In the early days of data engineering, when the term wasn't even well-defined, it was important to define the charter of the team and understand the business problems that needed to be solved. Hiring experts in SQL and Python and defining the required skill sets were crucial. An effective approach is to start small with senior team members who can thrive in an ambiguous environment, then gradually add junior members once the team's mission is defined. It is important to think through the complexity of the data and the skill sets needed to solve the problems.
The threshold for building a data program is reached when the data can no longer be managed as a side project. This often happens as the complexity of the data and the business requirements grow. When data is changing frequently and the interpretation of the data becomes important, a data program and a dedicated data engineering team are needed. It is crucial to have an owner for the data and a clear understanding of the business pain points that data will help solve.
Collaboration between data engineering and software engineering is essential as data becomes the product and directly impacts revenue streams. It is crucial to foster empathy and create a collaborative environment. Setting clear expectations and communication channels is important to avoid friction points, such as changing the schema without informing the data engineering team. Leadership alignment and a clear understanding of the organizational goals between the two teams are vital.
The rise of AI and ML brings challenges to data engineering, as it requires handling massive data volumes and building a specialized AI stack. That creates a need for GPUs and specialized storage, and for rethinking existing data pipelines. The complexity and power consumption of AI pose new considerations for data engineering. AI initiatives should be evaluated for purpose and feasibility against data volumes and infrastructure resources. Collaboration and alignment between data engineering and software engineering in managing data as a product become more essential.
AI's energy consumption and sustainability are emerging topics. AI's power consumption poses challenges, so its energy needs should be considered. Making AI more sustainable is gaining attention and may lead to future discussions and innovations in this area.
Data is no longer a side project, but a core product that affects other products and revenue streams. Ideal collaboration involves making data engineering an essential part of the engineering function and aligning goals between software engineering and data engineering. Creating a data contract and fostering communication and empathy are key to avoiding friction. The kindest person in the room is often the smartest.
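The summary does not say what a data contract contains. As a minimal sketch, assume it is just an agreed list of fields with types that the software engineering team checks before shipping a schema change. The names `Field` and `breaking_changes` are hypothetical and chosen for illustration only; they are not taken from the episode or from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One column in the agreed contract (hypothetical structure)."""
    name: str
    type: str
    required: bool = True


def breaking_changes(contract: list[Field], proposed: list[Field]) -> list[str]:
    """Describe proposed schema changes that would break downstream consumers."""
    proposed_by_name = {f.name: f for f in proposed}
    problems = []
    for field in contract:
        new = proposed_by_name.get(field.name)
        if new is None:
            problems.append(f"removed field: {field.name}")
        elif new.type != field.type:
            problems.append(f"type changed: {field.name} {field.type} -> {new.type}")
        elif field.required and not new.required:
            problems.append(f"became optional: {field.name}")
    # Fields that exist only in the proposal are additive and not flagged.
    return problems


# Illustrative data only: an orders table and a proposed change to it.
contract = [Field("order_id", "string"), Field("amount", "decimal")]
proposed = [
    Field("order_id", "string"),
    Field("amount", "float"),
    Field("coupon", "string", required=False),
]
print(breaking_changes(contract, proposed))
```

A check like this could run in the software team's CI so that a schema change surfaces as a conversation with the data engineering team before it ships, rather than as a broken pipeline afterwards.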
Onboarding and chartering a data engineering team requires clear expectations and milestones. Understanding the business problems to be solved and hiring individuals who can propose solutions are critical. Defining the team's charter, evaluating existing pain points, and aligning the team's mission with company goals are important steps. It is crucial for data engineering leaders to communicate effectively, foster collaboration, and adapt agile methodologies to data engineering.
AI's impact on data pipelines and the need to process massive volumes of data call for rethinking existing data stacks. Transforming existing transactional and BI stacks to support AI workloads adds complexity. It is important to consider where and how AI needs to be applied and to select a technology stack that can handle the requirements. Minimizing data movement and reevaluating existing processes are essential for managing AI-related data workflows effectively.
Overall, effective communication, collaboration, and alignment between software engineering and data engineering, understanding business needs, setting clear expectations, and evolving existing processes are key to building successful data engineering organizations. Collaboration should be fostered at all levels, from organizational leadership to individual team members, with a focus on empathy and shared goals.
The future of data engineering includes addressing emerging challenges in AI, such as power consumption and sustainability. As technology evolves, data engineering must adapt to new requirements and technologies. Constant evaluation, rethinking, and innovation are necessary to meet the evolving demands of data engineering.
As the Field CTO & Head of Strategy @ VAST Data, Colleen Tartow, Ph.D., has a vast resume of building data engineering teams from scratch and beyond. Colleen discusses the necessary components for developing new or reorienting existing data programs, strategies for effective communication & collaboration between data & eng functions, the implications of AI technology on data engineering, and integrating cross-functional partners into the data eng planning process & road map. Plus, Colleen shares how she built the hiring process for data eng functions when the “data engineering” term or role didn’t exist yet, and how you can apply that to other emerging or undefined functions!
Colleen Tartow, Ph.D. is Field CTO and Head of Strategy at VAST Data and has 20+ years of experience in data, analytics, engineering, and consulting. Adept at assisting organizations in deriving value from a data-driven culture, she has successfully led diverse data, engineering, and analytics teams through the development of complex global data management solutions and architecting enterprise data systems. Her demonstrated excellence in data, engineering, analytics, and diversity leadership makes her a trusted senior advisor among executives. An experienced speaker, author, valued mentor and startup advisor, Colleen holds degrees in astrophysics and lives in Massachusetts.
"Everyone wants to be data driven, right? Like no one's going to say, 'No, we don't want data. We just want to function with opinions.' Like nobody's actually going to say that. But that said, getting started on that can be really challenging...
With anything, you have to go back to what does the business really need. Going back to the revenue drivers and the business pain points that you're going to help solve, whether it's monetizing your data directly or using data as an enablement function to actually help in other areas and so I think getting the organization to understand that data is a product of the business and then sort of working back from there into what does that specifically mean."
- Colleen Tartow
ELC's Peer Groups provide a virtual, curated, and ongoing peer learning opportunity to help you navigate the unknown, uncover solutions, and accelerate your learning with a small group of trusted peers.
Apply to join a peer group HERE: sfelc.com/peerGroups
Patrick Gallagher - Producer & Co-Host
Jerry Li - Co-Host
Noah Olberding - Associate Producer, Audio & Video Editor https://www.linkedin.com/in/noah-olberding/
Dan Overheim - Audio Engineer, Dan’s also an avid 3D printer - https://www.bnd3d.com/
Ellie Coggins Angus - Copywriter, Check out her other work at https://elliecoggins.com/about/