Developer Voices

Kris Jenkins
May 29, 2024 • 1h 23min

Reimplementing Apache Kafka with Golang and S3

This week on Developer Voices we’re talking to Ryan Worl, whose career in big data engineering has taken him from DataDog to Co-Founding WarpStream, an Apache Kafka-compatible streaming system that uses Golang for the brains and S3 for the storage.

Ryan tells us about his time at DataDog, along with the things he learnt from doing large-scale systems migration bit-by-bit, before we discuss how and why he started WarpStream. Why re-implement Kafka? What are the practical challenges and cost benefits of moving all your storage to S3? And would he choose Go a second time around?

--
WarpStream: https://www.warpstream.com/
DataDog: https://www.datadoghq.com/
Ryan on Twitter: https://x.com/ryanworl
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins
May 22, 2024 • 1h 8min

Extending Postgres for High-Performance Analytics (with Philippe Noël)

PostgreSQL is an incredible general-purpose database, but it can’t do everything. Every design decision is a tradeoff, and inevitably some of those tradeoffs get fundamentally baked into the way it’s built. Take storage for instance - Postgres tables are row-oriented; great for row-by-row access, but when it comes to analytics, it can’t compete with a dedicated OLAP database that uses column-oriented storage. Or can it?

Joining me this week is Philippe Noël of ParadeDB, who’s going to take us on a tour of Postgres’ extension mechanism, from creating custom functions and indexes to Rust code that changes the way Postgres stores data on disk. In his journey to bring Elasticsearch’s strengths to Postgres, he’s gone all the way down to raw datafiles and back through the optimiser to teach a venerable old dog some new data-access tricks.

–
ParadeDB: https://paradedb.com
ParadeDB on Twitter: https://twitter.com/paradedb
ParadeDB on Github: https://github.com/paradedb/paradedb
pgrx (Postgres with Rust): https://github.com/pgcentralfoundation/pgrx
Tantivy (Rust FTS library): https://github.com/quickwit-oss/tantivy
PgMQ (Queues in Postgres): https://tembo.io/blog/introducing-pgmq
Apache Datafusion: https://datafusion.apache.org/
Lucene: https://lucene.apache.org/
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins
May 15, 2024 • 1h 12min

Designing Actor-Based Software (with Hugh McKee)

Hugh McKee, Developer Advocate for Lightbend, discusses the actor model in software design, focusing on patterns, event-driven approaches, and architectural comparisons. He explores the evolution of event-driven systems, highlights scalability benefits, and emphasizes the importance of picking the right tools for robust system design.
May 8, 2024 • 1h 2min

ByteWax: Rust's Research Meets Python's Practicalities (with Dan Herrera)

Dan Herrera talks about Bytewax, a stream-processing tool that combines Python and Rust. The conversation covers how the two languages work together in practice, the challenges of data engineering, integrating Rust into the Python ecosystem, design challenges in the Timely Dataflow library, dataflow management with Bytewax and Timely, and cluster recovery and rescaling in Bytewax.
May 1, 2024 • 1h 25min

Mojo Lang - Tomorrow's High Performance Python? (with Chris Lattner)

Chris Lattner, the mastermind behind Swift and LLVM, discusses Mojo, a new programming language that merges Python's syntax with high-performance capabilities. They delve into Mojo's innovative type system and memory management, which aim to enhance programming for AI and high-performance computing. Lattner explains how Mojo addresses language divides in the AI landscape and streamlines code optimization with compile-time techniques. Discover how this language could be a game-changer for developers seeking Python-like familiarity with the power of lower-level programming.
Apr 24, 2024 • 52min

Batch Data & Streaming Data in one Atom (with Jove Zhong)

In this discussion, Jove Zhong, a contributor to the open-source database Proton, shares insights on the challenges of managing both batch and streaming data. He discusses the Lambda Architecture and how Proton aims to simplify data integration. Jove dives into stream processing, addressing issues like out-of-order events and data consistency. He also explores architectural strategies for massive datasets, highlighting the use of ClickHouse for efficient querying and data handling.
Apr 17, 2024 • 1h 10min

Advanced Memory Management in Vale (with Evan Ovadia)

Rust changed the discussion around memory management - this week's guest hopes to push that discussion even further.

This week we're joined by Evan Ovadia, creator of the Vale programming language and collector of memory management techniques from far and wide. He takes us through his most important ones, including linear types, generational references and regions, to see what Evan hopes the future of memory management will look like.

If you've been interested in Rust's borrow checker and want more (or want different!) then Evan has some big ideas for you to sink your teeth into.

–
Vale: https://vale.dev/
The Vale Discord: https://discord.com/invite/SNB8yGH
Evan’s Blog: https://verdagon.dev/home
Evan’s 7DRL Entry: https://verdagon.dev/blog/higher-raii-7drl
7DRL: https://7drl.com/
https://verdagon.dev/grimoire/grimoire
What Colour Is Your Function?: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/
42, the language: https://forty2.is/
Verona Language: https://www.microsoft.com/en-us/research/project/project-verona/
Austral language: https://austral-lang.org/
Surely You’re Joking, Mr Feynman! (book): https://www.goodreads.com/book/show/35167685-surely-you-re-joking-mr-feynman
Evan on Twitter: https://twitter.com/verdagon
Find Evan in the Vale Discord: https://discord.com/invite/SNB8yGH
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins

–
#software #programming #podcast #valelang
Apr 3, 2024 • 1h 7min

Bringing Pure Python to Apache Kafka (with Tomáš Neubauer)

The “big data infrastructure” world is dominated by Java, but the data-analysis world is dominated by Python. So if you need to analyse and process huge amounts of data, chances are you’re in for a less-than-ideal time. The impedance mismatch will probably make your life hard somehow.

So there are a lot of projects and companies trying to solve that problem and bridge those two worlds seamlessly, and many of the popular solutions see SQL as the glue. But this week we’re going to look at another solution - ignore Java, treat Kafka as a protocol, and build up all the infrastructure tools you need with a pure Python library. It’s a lot of work, but in theory it would make Python the one language for data storage, analysis and processing, at scale. Tempting, but is it feasible?

Joining me to discuss the pros, cons, and massive scope of that approach is Tomáš Neubauer. He started off doing real-time data analysis for the McLaren F1 team, and is now deep in the Python mines effectively rewriting Kafka Streams in Python. But how? How much work is actually involved in porting those ideas to Python-land, and how do you even get started? And perhaps most fundamental of all - even if you succeed, will that be enough to make the job easy, or will you still have to scale the mountain of teaching people how to use the new tools you’ve built? Let's find out.

–
Quix Streams on Github: https://github.com/quixio/quix-streams
Quix Streams getting started guide: https://quix.io/get-started-with-quix-streams
Quix: https://quix.io/
Tomáš on LinkedIn: https://www.linkedin.com/in/tom%C3%A1%C5%A1-neubauer-a10bb144
Tomáš on Twitter: https://twitter.com/TomasNeubauer0
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins

--
#podcast #softwaredevelopment #datascience #apachekafka #streamprocessing
Mar 27, 2024 • 1h 4min

Taking Erlang to OCaml 5 (with Leandro Ostera)

Erlang wears three hats - it’s a language, it’s a platform, and it’s an approach to making software run reliably once it’s in production. Those last two are so interesting I sometimes wonder why those ideas haven’t been ported to every language going. How much work would it be?

This week we’re going to dig right down into that question with Leandro Ostera. He’s been working on Riot - a project to bring the best of Erlang’s runtime system and philosophy to OCaml. But why OCaml? Is it possible to marry together OCaml’s type system with Erlang’s dynamic dispatch systems? And what is it about the recent release of OCaml 5 that makes the whole project easier?

–
Leandro’s Blog: https://www.abstractmachines.dev/
Why Typing Erlang is Hard: https://www.abstractmachines.dev/posts/am012-why-typing-erlang-is-hard/
Riot: https://riot.ml/
Riot source: https://github.com/riot-ml/riot
ReasonML: https://reasonml.github.io/
ReScript: https://rescript-lang.org/
Leandro on Twitter: https://twitter.com/leostera
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
Kris on Twitter: https://twitter.com/krisajenkins

--
#podcast #softwaredevelopment #erlang #ocaml #softwaredesign
Mar 20, 2024 • 1h 14min

How Apache Pinot Achieves 200,000 Queries per Second (with Tim Berglund)

Discover how Apache Pinot achieves an impressive 200,000 queries per second and the architectural decisions behind it. Tim Berglund explains the roles, optimization, and performance of Pinot, covering queries, data movement, and indexing strategies. Learn about the evolution of databases, query optimization, and the technical details of real-time data applications with Apache Pinot.
