The AWS Developers Podcast

Your next ETL pipeline will be serverless

Jul 4, 2025
Poonam Pratik Patel, Director at The Line Tech UK and AWS Community Builder, discusses how to implement serverless ETL pipelines. She explains how serverless architectures can streamline data processing while keeping the data accurate. The discussion covers practical strategies for data validation and partitioning, along with the integration of AWS tools such as Glue and Lambda. Poonam also describes how AI and ML are changing data pipelines, making them more efficient and scalable.
INSIGHT

Clean Data Crucial for Decisions

  • Businesses need clean, correct data to make the right decisions and grow effectively.
  • Manual data correction is impractical and error-prone with millions of data points.
ADVICE

Embrace Serverless for ETL

  • Use AWS serverless services like Lambda and Step Functions to eliminate infrastructure management in ETL.
  • Focus on defining data flow and validation, not on backend infrastructure sizing or maintenance.
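As a rough illustration of what "defining data flow" can look like, the sketch below builds a small Step Functions state machine definition (Amazon States Language) as a Python dictionary. The state names, the Lambda function names, and the `valid` field in the validation output are assumptions made for this example; the episode does not specify them.

```python
import json


def build_definition(validate_fn, transform_fn):
    """Build a minimal Amazon States Language definition: validate, then branch.

    validate_fn and transform_fn are hypothetical Lambda function names.
    """

    def invoke(function_name, next_state=None):
        # Task state that calls a Lambda function through the Lambda service integration.
        state = {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": function_name, "Payload.$": "$"},
            "OutputPath": "$.Payload",  # keep only the function's return value
        }
        if next_state:
            state["Next"] = next_state
        else:
            state["End"] = True
        return state

    return {
        "Comment": "Sketch of a validate-then-transform ETL flow",
        "StartAt": "Validate",
        "States": {
            "Validate": invoke(validate_fn, "IsValid"),
            "IsValid": {
                "Type": "Choice",
                # Assumes the validation function returns {"valid": true/false, ...}.
                "Choices": [
                    {"Variable": "$.valid", "BooleanEquals": True, "Next": "Transform"}
                ],
                "Default": "Rejected",
            },
            "Transform": invoke(transform_fn),
            "Rejected": {"Type": "Succeed"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_definition("validate-file", "transform-file"), indent=2))
```

The definition only describes the order of steps and the branching rule; there is no server or cluster size anywhere in it, which is the point the advice makes.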
ANECDOTE

Pipeline Orchestrated by Step Functions

  • Data from branches is collected into a single S3 bucket, which triggers a Step Functions workflow.
  • The workflow invokes Lambda functions that validate and process data, moving invalid files to a separate bucket.
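The validation step described above might look roughly like the following Lambda sketch. It is not the pipeline from the episode: the CSV format, the required column names, the numeric check on `amount`, and the event fields (`bucket`, `key`, `invalid_bucket`) are all assumptions chosen for illustration. The S3 calls used (`get_object`, `copy_object`, `delete_object`) are standard boto3 client methods.

```python
import csv
import io

# Hypothetical schema: the episode does not name the columns.
REQUIRED_COLUMNS = ("branch_id", "date", "amount")


def validate_csv(text):
    """Return a list of problems found in the CSV text; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            float(row["amount"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: amount is not numeric")
    return problems


def process_file(s3, bucket, key, invalid_bucket):
    """Validate one uploaded object; move it to invalid_bucket if it fails."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    problems = validate_csv(body)
    if problems:
        # S3 has no move operation, so copy and then delete the original.
        s3.copy_object(
            Bucket=invalid_bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
        )
        s3.delete_object(Bucket=bucket, Key=key)
    return {"bucket": bucket, "key": key, "valid": not problems, "problems": problems}


def handler(event, context):
    """Lambda entry point; assumes the workflow passes the bucket names and key as input."""
    import boto3  # imported here so the helpers above can be exercised without AWS

    return process_file(
        boto3.client("s3"), event["bucket"], event["key"], event["invalid_bucket"]
    )
```

Passing the S3 client into `process_file` keeps the validation and routing logic testable with a stub client, while `handler` stays a thin wrapper that the workflow invokes.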