
Building Auditable Spark Pipelines At Capital One

Data Engineering Podcast


The Volume, Variety and Velocity of Parquet Files

In terms of the volume of data, you mentioned that you're dealing with a lot of Parquet files. Assuming that you're primarily sourcing and depositing information in S3, I'm wondering what you are looking at in terms of the three Vs of volume, variety and velocity.

Ca: It ranges from probably 25 to 50 million daily. That's the volume we process. And then that volume goes through a variety of workflows. So it ranges from a simple use case, where, if you're familiar with the Capital One cards, the simple case is where a customer is using one of our cashback cards, which is Quicksilver.
