Guest Shuang Li, Group Product Manager at Box, discusses the challenges of building a data platform, including ingestion pipelines, data quality, and democratization. She and the host also explore career transitions, data observability, developer experience, and future trends in data engineering.
Podcast summary created with Snipd AI
Quick takeaways
Building a data platform at Box involved overcoming challenges in adapting to the cloud and maintaining cost-effectiveness while scaling.
To showcase the impact of engineering work at Box, a laddering metric system aligned individual contributions with broader organizational goals.
Deep dives
Challenges of Building a Data Platform and Cloud Migration
Building a data platform at Box involved overcoming several challenges. The first major hurdle was adapting to the cloud: most team members were new to cloud technologies and had to learn best practices as they went. The project's scale was another significant challenge, so the work was broken down into manageable milestones to keep the team from being overwhelmed. Maintaining team morale while balancing foundational work with innovation was a continuous challenge, which called for a storytelling approach that emphasized the importance of foundational work.
Trade-Offs for Cost-Effectiveness and Scaling
Maintaining cost-effectiveness while scaling the data platform at Box required strategic decision-making. The team implemented quarterly and monthly cost forecasts that accounted for organic growth, traffic, and new use cases in budget planning. Trade-offs between paying for vendor services and building in-house solutions were carefully evaluated, with the aim of optimizing overall costs. Balancing cost efficiency with growth and technology choices played a crucial role in sustaining the platform's profitability.
Laddering Metrics for Impactful Engineering Work
To showcase the impact of engineering work at Box, a laddering metric system was employed. Individual team metrics, such as introducing streaming capabilities in the data platform, ladder up to product- and engineering-level metrics like enabling new use cases. These in turn ladder up to company-level metrics, which focus on profitable growth. Aligning individual contributions with broader organizational goals makes the engineering team's efforts visible as significant drivers of company success.
Future Trends: Developer Experience, Log Pipeline Uplift, and AI Integration
Looking ahead, Box's data platform will emphasize developer experience enhancements, streamlining interaction with the platform for insights and innovations. Initiatives include building frameworks for data aggregation and uplifting log pipelines to cater to various use cases efficiently and cost-effectively. AI integration will play a key role in improving data observability, aiding in anomaly detection and metadata management. Leveraging AI capabilities for data discovery and analytics will advance Box's data platform functionality.
Whether big or small, organizations that want to work with data effectively often find that one of their biggest challenges is a lack of access to it. This is where building a data platform comes in. But building a data platform is no easy feat. It's not just about centralizing data in the data warehouse; it's also about making sure that data is actionable, trustworthy, and usable. So, how do you make sure your data platform is up to par?
Shuang Li is Group Product Manager at Box. With experience building data, analytics, ML, and observability platform products for both external and internal customers, Shuang is passionate about the insights, optimizations, and predictions that big data and AI/ML make possible. Throughout her career, she has transitioned from academia to engineering, from engineering to product management, and then from an individual contributor to an emerging product executive.
In the episode, Adel and Shuang explore her career journey, including her transition from academia to engineering and her work on Google Fiber. They also discuss how to build a data platform, ingestion and processing pipelines, the challenges and milestones involved in building a data platform, data observability and quality, developer experience, data democratization, future trends, and much more.