Your next ETL pipeline will be serverless

Jul 4, 2025

Poonam Pratik Patel, Director at The Line Tech UK and AWS Community Builder, discusses how to implement serverless ETL. She explains how serverless architectures can streamline data processing while keeping data accurate, covers practical strategies for data validation and partitioning, and describes how AWS services such as Glue and Lambda fit into a pipeline. Poonam also discusses the role of AI and ML in making data pipelines more efficient and scalable.

INSIGHT

Clean Data Crucial for Decisions

Businesses need clean, correct data to make the right decisions and grow effectively.
Manual data correction is impractical and error-prone with millions of data points.

ADVICE

Embrace Serverless for ETL

Use AWS serverless services like Lambda and Step Functions to eliminate infrastructure management in ETL.
Focus on defining data flow and validation, not on backend infrastructure sizing or maintenance.
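As an illustration of what "defining data flow" can look like in practice, here is a minimal sketch of a Step Functions state machine written as a Python dictionary in Amazon States Language. It is not taken from the episode: the state names, the Lambda ARNs, and the assumption that the validation function returns a `valid` flag are all hypothetical.

```python
import json


def build_definition(validate_arn, process_arn):
    """Build an Amazon States Language definition for a validate-then-process flow.

    Assumption: the validation Lambda returns a JSON object with a boolean
    "valid" field. Both ARNs are supplied by the caller.
    """
    return {
        "Comment": "Sketch of a serverless ETL flow: validate, then process",
        "StartAt": "ValidateFile",
        "States": {
            # Task state: invoke the validation Lambda.
            "ValidateFile": {
                "Type": "Task",
                "Resource": validate_arn,
                "Next": "IsValid",
            },
            # Choice state: branch on the flag returned by the validator.
            "IsValid": {
                "Type": "Choice",
                "Choices": [
                    {
                        "Variable": "$.valid",
                        "BooleanEquals": True,
                        "Next": "ProcessFile",
                    }
                ],
                "Default": "Rejected",
            },
            # Task state: invoke the processing Lambda for valid files.
            "ProcessFile": {
                "Type": "Task",
                "Resource": process_arn,
                "End": True,
            },
            # Invalid files stop here; nothing more is done with them.
            "Rejected": {"Type": "Succeed"},
        },
    }


# Hypothetical ARNs, for illustration only.
definition = build_definition(
    "arn:aws:lambda:eu-west-2:123456789012:function:validate-file",
    "arn:aws:lambda:eu-west-2:123456789012:function:process-file",
)
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON string is what you would supply as the state machine definition. Nothing in it sizes or provisions servers, which is the point of the advice.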

ANECDOTE

Pipeline Orchestrated by Step Functions

Data from branches is collected into a single S3 bucket, which triggers a Step Functions workflow.
The workflow invokes Lambda functions that validate and process data, moving invalid files to a separate bucket.
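The validation step described above might look roughly like the following sketch. It is an assumption-laden illustration, not the pipeline from the episode: the CSV format, the column names, the `example-invalid-files` bucket, and the event shape (`bucket` and `key` fields passed in by the workflow) are all hypothetical. The S3 client is passed in as a parameter so the logic can be exercised without AWS access.

```python
import csv
import io

REQUIRED_COLUMNS = ("branch_id", "date", "amount")  # hypothetical schema
INVALID_BUCKET = "example-invalid-files"  # hypothetical bucket name


def validate_csv(text):
    """Return a list of problems found in the CSV text; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            float(row["amount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: amount is not a number")
    return problems


def handle_file(s3, bucket, key):
    """Validate one object and move it to INVALID_BUCKET if it fails."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    problems = validate_csv(body)
    if problems:
        # S3 has no move operation, so copy the object and then delete the original.
        s3.copy_object(
            Bucket=INVALID_BUCKET,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
        )
        s3.delete_object(Bucket=bucket, Key=key)
    return {"bucket": bucket, "key": key, "valid": not problems, "problems": problems}


def lambda_handler(event, context):
    """Lambda entry point; assumes the workflow passes 'bucket' and 'key'."""
    import boto3  # imported here so the functions above run without boto3 installed

    return handle_file(boto3.client("s3"), event["bucket"], event["key"])
```

Returning a `valid` flag lets the orchestrating workflow decide whether to continue to the processing step or stop.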
