AWS Glue with Johnny Chivers

The Engineering Side of Data

Is There a Better Way to Handle Incremental Jobs?

I like the ability to use Python shell jobs. You can use a fraction of a DPU. So maybe look at a Python shell job as opposed to a PySpark or a Scala job; that might be a cheaper way to win back some cost. I would like to see maybe some better ways to handle incremental jobs, right? Those delta jobs. They work particularly well for S3 scenarios, where files are dropping in: what are the new files in S3? They also have the ability to use that feature with JDBC, so JDBC sources like databases, of course. But I think there are limited features there. It doesn't work off timestamps, so there are some limited features there. The same thing stands
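
To make the "what are the new files in S3?" question concrete, here is a minimal sketch of a timestamp watermark, written in plain Python so it could run inside a small Python shell job. This is not how Glue's built-in incremental feature works, and nothing here comes from the episode: the function name `select_new_files`, the sample keys, and the dates are all hypothetical. It assumes you already have a listing of (key, last-modified) pairs and the timestamp saved by the previous run.

```python
from datetime import datetime, timezone


def select_new_files(files, watermark):
    """Return the keys modified strictly after the watermark, plus the next watermark.

    `files` is an iterable of (key, last_modified) pairs. Hypothetical helper,
    not part of any AWS library.
    """
    new = [(key, ts) for key, ts in files if ts > watermark]
    # If nothing is new, keep the old watermark so the next run starts from the same point.
    next_watermark = max((ts for _, ts in new), default=watermark)
    return sorted(key for key, _ in new), next_watermark


# Hypothetical listing, standing in for the result of listing a bucket prefix.
listing = [
    ("raw/2024-01-01.csv", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    ("raw/2024-01-02.csv", datetime(2024, 1, 2, tzinfo=timezone.utc)),
    ("raw/2024-01-03.csv", datetime(2024, 1, 3, tzinfo=timezone.utc)),
]
last_run = datetime(2024, 1, 1, 12, tzinfo=timezone.utc)

new_keys, next_watermark = select_new_files(listing, last_run)
print(new_keys)
```

The same comparison could in principle be applied to a timestamp column in a database table, but the job then has to store the watermark somewhere between runs itself.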
