The Importance of Reproducible Scientific Data in Kafka

Kafka Streams uses stream time, just as Flink has watermarks; each has its own mechanism. But deterministic processing of records, for something like what you're talking about, is paramount. The one caveat may be that, to deal with vast amounts of data in Kafka, you are probably going to have to partition it. I don't know if there's a use case where you wouldn't have something to partition on but would definitely still care about order.
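To make the partitioning caveat concrete, here is a minimal sketch in plain Python (no Kafka client involved) of why a partition key matters for ordering: Kafka guarantees record order only within a single partition, so records that share a key stay ordered relative to each other, while order across keys is not guaranteed. The names `partition_for`, `route`, `NUM_PARTITIONS`, and the sensor records are hypothetical, and CRC32 is only a stand-in for the hash (Kafka's default partitioner uses murmur2).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index.

    Assumption: CRC32 stands in for the real hash. Kafka's default
    partitioner uses murmur2, so actual partition numbers would differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(records):
    """Append each (key, value) record to the partition its key hashes to."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key)].append((key, value))
    return partitions


# Hypothetical measurements from two sensors, interleaved in arrival order.
records = [
    ("sensor-a", 1),
    ("sensor-b", 1),
    ("sensor-a", 2),
    ("sensor-b", 2),
    ("sensor-a", 3),
]

partitions = route(records)

# Every record for a given key lands in the same partition, in arrival order.
sensor_a = [v for k, v in partitions[partition_for("sensor-a")] if k == "sensor-a"]
print(sensor_a)
```

The same key always hashes to the same partition, so a consumer replaying that partition sees a key's records in the same order every time. A workload with no natural key but a strict need for total order would have to fall back to a single partition, which gives up the scaling that partitioning provides.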
