Shouldn't Data Connections Be Easier? (with Ashley Jeffs)
Jan 24, 2024
Data Engineering expert Ashley Jeffs discusses Benthos, a tool for quick data pipeline setup. They cover simplifying data connections, advantages of Benthos, and when to use it. Topics include stream processing, transitioning to Go, error handling, custom plugins, and getting started with Benthos.
Benthos simplifies data pipeline setup for quick configurations and testing.
Benthos offers operational simplicity with straightforward processing pattern handling.
Bloblang enhances Benthos with native mapping for complex data transformations and enrichments.
Deep dives
The Importance of Streamlining System Connections in Software Development
Efficiently setting up connections between systems is a crucial part of software development, especially given today's data volumes and the prevalence of microservices and cloud services. Building reusable tools for connecting different systems is a key advancement in the industry, moving away from the ad hoc solutions of the past. Tools like Kafka, Redpanda, and Debezium showcase effective ways to connect systems.
Ashley Jeffs and the Creation of Benthos: A Lightweight Connectivity Solution
Ashley Jeffs, the creator of Benthos, developed the project initially at his day job to manage data pipelines, which later became popular enough to transition into his full-time occupation. Benthos aims to serve as a lightweight, user-friendly tool for establishing quick and reliable connections in data processing. It provides a formal alternative to shell scripts, offering an easy-to-use solution for various data integration tasks. Benthos strives to offer a simple and efficient design while avoiding unnecessary complexity.
Operational Simplicity and Monitoring Capabilities in Benthos
Benthos emphasizes operational simplicity by offering a straightforward setup for handling various processing patterns, such as fan-out, round-robin, and swimlaning. The tool prioritizes ease of use, allowing users to configure complex patterns with minimal effort. Monitoring features include logging, metrics for tracking throughput and latency, and support for distributed tracing through OpenTelemetry. Additionally, Benthos offers customizable alerting options, enabling users to integrate external alert systems for timely notifications.
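To make two of the patterns named above concrete, here is a minimal sketch in plain Go of the difference between fan-out and round-robin delivery. It is only an illustration of the general idea: the `Message` type and the `fanOut` and `roundRobin` functions are hypothetical names invented for this example, and it does not use or reflect the Benthos API or configuration format.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for one unit of data in a pipeline.
type Message string

// fanOut delivers a copy of every message to every output.
func fanOut(msgs []Message, outputs int) [][]Message {
	if outputs <= 0 {
		return nil
	}
	result := make([][]Message, outputs)
	for i := range result {
		// Copy the slice so outputs do not share backing storage.
		result[i] = append([]Message(nil), msgs...)
	}
	return result
}

// roundRobin delivers each message to exactly one output,
// cycling through the outputs in order.
func roundRobin(msgs []Message, outputs int) [][]Message {
	if outputs <= 0 {
		return nil
	}
	result := make([][]Message, outputs)
	for i, m := range msgs {
		idx := i % outputs
		result[idx] = append(result[idx], m)
	}
	return result
}

func main() {
	msgs := []Message{"a", "b", "c"}
	fmt.Println("fan-out:    ", fanOut(msgs, 2))
	fmt.Println("round-robin:", roundRobin(msgs, 2))
}
```

With fan-out, every output sees the full stream; with round-robin, the stream is split so each message lands on exactly one output.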
Use of Custom Inputs and Outputs in Benthos Processes
The conversation covers users' desire to embed Benthos in their own ecosystem with custom inputs and outputs, rather than only adding plugins. It revolves around using a custom main function to run multiple Benthos processes and what that implies for clustering. The flexibility of relying on sources like Kafka for behavior such as partition distribution is highlighted, emphasizing the importance of using input sources effectively.
Bloblang Language Development and Its Integration with Benthos
The conversation delves into the development of Bloblang as a native mapping language for Benthos, which improved support for complex use cases and enrichments. Bloblang's role in achieving Clever Data processing within Benthos is outlined, focusing on scenarios where IDML integration and a native language design became essential. The advantages and practicality of using Bloblang for intricate data transformations and dependency graphs in enrichments are emphasized, showing the language's significance in keeping Benthos operations efficient.
Benthos wants to be part of your Data Engineering toolkit - it’s there as a quick and easy way to set up data pipelines and start streaming data out of A and into B. In contrast to a lot of the tools we’ve talked about on Developer Voices, Benthos seems focussed on cutting development time down to a minimum, so you can quickly configure a new pipeline and test it out, without making a whole sprint of the task. As quick as a quick-and-dirty shell script, without the dirt. 😉
So this week we’re talking to the creator of Benthos, Ashley Jeffs, to hear why he created Benthos, what it can do for you, and what its strengths and weaknesses are. And Ashley’s refreshingly candid about when you should and shouldn’t use it. If you ever need to get data from an HTTP connection into S3, or S3 into Kafka, or Kafka into a flat file, Benthos might just save you a few hours of development.
–
Benthos: https://www.benthos.dev/
A list of supported inputs, processors & outputs: https://www.benthos.dev/docs/about#components
All their cute blobfish logos: https://www.benthos.dev/blobfish/
IDML: https://idml.io/
Kris on Twitter: https://twitter.com/krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
–
#software #podcast #dataengineering #datascience