How to Run Parallel Operations on a Large Task Graph

The way I would first approach it: if the work is inherently parallelizable and you don't have any cross-dependencies across rows, you can get away with the really simple approach of just creating a grouping key on your data. This is what a lot of our genomics and health and life sciences customers do when they're like, "Hey, we've got a couple of petabytes of data we need to process." The most clever way to deal with that is to put my data into a data frame that has my points, and then set my cluster to be adaptively scalable.
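The grouping-key idea can be shown at small scale with a minimal Python sketch. It is not the speaker's actual setup: the field names `sample_id` and `value`, the helper names, and the per-group sum are all hypothetical, and a local thread pool stands in for an autoscaling cluster. The point is only that once rows are partitioned by a key and no group depends on another, each group can be handed to a separate worker.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_rows(rows, key):
    """Partition rows into independent groups by a grouping key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def process_group(group_key, group):
    """Hypothetical per-group work: sum the 'value' field of the group."""
    return group_key, sum(row["value"] for row in group)


def run_parallel(rows, key, max_workers=4):
    """Process every group on its own worker; groups share no state."""
    groups = group_rows(rows, key)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_group, k, g) for k, g in groups.items()]
        return dict(f.result() for f in futures)


# Illustrative data only; 'sample_id' is an assumed grouping key.
rows = [
    {"sample_id": "a", "value": 1},
    {"sample_id": "b", "value": 10},
    {"sample_id": "a", "value": 2},
    {"sample_id": "b", "value": 20},
]
results = run_parallel(rows, key="sample_id")
```

On a real cluster the same shape usually appears as a group-by followed by a per-group function in a distributed data frame library, with the worker count left to the cluster's autoscaler rather than fixed as `max_workers` is here.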

