
🏃‍♀️ Moving Fast and Breaking Data with Shreya Shankar
The MLOps Podcast
The Use of Partition Summaries in Data Engineering
Traditional anomaly detection techniques kind of compare the current partition to some aggregate of historical partitions. But we find that if we just save a summary for each partition (we call these partition summaries in the paper), then instead of feeding in the whole thing and comparing today to the full aggregate, we maintain those individual summaries. It's super cheap, simple, scalable, and works very well. And I think there's a lot of value in this.
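To make the idea concrete, here is a minimal Python sketch of keeping one small summary per partition and scoring a new partition against the stored summaries. The episode excerpt does not say which statistics the paper stores or how it scores partitions, so everything here is an assumption for illustration: the `PartitionSummary` fields, the `summarize` and `anomaly_score` helpers, the z-score comparison, and the `min_spread` floor are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Optional, Sequence


@dataclass(frozen=True)
class PartitionSummary:
    """Cheap statistics for one column of one partition (hypothetical choice)."""
    count: int
    null_fraction: float
    mean: float
    std: float


def summarize(values: Sequence[Optional[float]]) -> PartitionSummary:
    """Reduce a partition's column to a small summary; None marks a missing value."""
    present = [v for v in values if v is not None]
    n = len(values)
    return PartitionSummary(
        count=n,
        null_fraction=(n - len(present)) / n if n else 0.0,
        mean=mean(present) if present else 0.0,
        std=pstdev(present) if present else 0.0,
    )


def anomaly_score(
    today: PartitionSummary,
    history: Sequence[PartitionSummary],
    min_spread: float = 1e-6,
) -> float:
    """Largest absolute z-score of today's statistics against the stored summaries.

    Assumed scoring rule, not the paper's: each statistic is compared with the
    distribution of that statistic across the individual historical summaries.
    """
    if not history:
        raise ValueError("need at least one historical summary")
    score = 0.0
    for name in ("count", "null_fraction", "mean", "std"):
        past = [getattr(s, name) for s in history]
        # Floor the spread so a statistic that never varied cannot divide by zero.
        spread = max(pstdev(past), min_spread)
        score = max(score, abs(getattr(today, name) - mean(past)) / spread)
    return score
```

Only the summaries need to be kept, so scoring a new day never requires rereading the historical data:

```python
history = [summarize(p) for p in ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [1.5, 2.5, 3.5])]
print(anomaly_score(summarize([1.2, 2.2, 3.2]), history))    # small score
print(anomaly_score(summarize([None, None, 300.0]), history))  # much larger score
```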