Spark Modeling - What Are Some of the Things That Can Go Wrong in a Data Pipeline?
Pipelines tend to be far more brittle for a handful of reasons. No nodes are fully isolated: you're running other Spark jobs on those nodes at the same time. The other thing you'll oftentimes see is bad data, like when all of a sudden a corrupted record comes through.
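
To make the corrupted-record case concrete, here is a minimal sketch in plain Python (not Spark itself) of one common way to keep a single bad record from failing a whole batch: parse each line, keep the good records, and set the bad ones aside for inspection. The function name `split_records`, the sample data, and the choice of JSON lines as the input format are all assumptions made for illustration; the episode does not describe a specific implementation.

```python
import json


def split_records(lines):
    """Parse JSON lines, separating usable records from corrupted ones.

    Hypothetical helper for illustration: returns (good, bad), where
    `good` holds parsed dict records and `bad` holds details about
    each line that could not be used.
    """
    good, bad = [], []
    for line_no, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            # Quarantine the raw line instead of failing the whole batch.
            bad.append({"line_no": line_no, "raw": line, "error": str(err)})
            continue
        if not isinstance(record, dict):
            # Valid JSON, but not the object shape this sketch expects.
            bad.append({"line_no": line_no, "raw": line, "error": "not a JSON object"})
            continue
        good.append(record)
    return good, bad


# Made-up sample input: the second line is truncated mid-record.
sample = [
    '{"user_id": 1, "event": "click"}',
    '{"user_id": 2, "event": ',
    '{"user_id": 3, "event": "view"}',
]

good, bad = split_records(sample)
print(len(good), "good records,", len(bad), "quarantined")
```

The design choice here is to keep the bad lines rather than silently drop them, so someone can look at what went wrong afterwards. Spark's own file readers offer a related option (a permissive parse mode that routes malformed rows into a corrupt-record column), but whether that fits depends on the pipeline.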