In the past couple of episodes, we went over what Apache Kafka is, and along the way we mentioned some of the pains of managing and running Kafka clusters on your own. In this episode, we discuss some of the ways you can offload those responsibilities and focus on writing streaming applications. Also, Joe does a mighty fine fill-in for proper noun pronunciation and Allen does a southern auctioneer-style speed talk.
View the full show notes here:
https://www.codingblocks.net/episode237
Reviews
As always, thank you for leaving us a review – we really do appreciate them!
From iTunes: Abucr7
Upcoming Events
Atlanta Dev Con
September 7th, 2024
https://www.atldevcon.com/
DevFest Central Florida on September 28th, 2024
Interested? Submit your talk proposal here:
https://sessionize.com/devfest-florida-orlando-2024/
Kafka Compatible and Kafka Functional Alternatives
Why? Because running any type of infrastructure requires time, knowledge, and blood, sweat and tears
Confluent
WarpStream
- https://www.warpstream.com/
- “WarpStream is an Apache Kafka® compatible data streaming platform built directly on top of object storage: no inter-AZ bandwidth costs, no disks to manage, and infinitely scalable, all within your VPC”
- ZERO disks to manage
- 10x cheaper than running Kafka
- Agents stream data directly to and from object storage with no buffering on local disks and no data tiering.
- Create new serverless “Virtual Clusters” in our control plane instantly
- Support different environments, teams, or projects without managing any dedicated infrastructure
- Things you won’t have to do with WarpStream
- Upscale a cluster that is about to run out of space
- Figure out how to restore quorum in a ZooKeeper cluster or Raft consensus group
- Rebalance partitions in a cluster
- “WarpStream is protocol compatible with Apache Kafka®, so you can keep using all your favorite tools and software. No need to rewrite your application or use a proprietary SDK. Just change the URL in your favorite Kafka client library and start streaming!”
- Never again have to choose between reliability and your budget. WarpStream costs the same regardless of whether you run your workloads in a single availability zone, or distributed across multiple
- WarpStream’s unique cloud native architecture was designed from the ground up around the cheapest and most durable storage available in the cloud: commodity object storage
- WarpStream agents use object storage as the storage layer and the network layer, side-stepping interzone bandwidth costs entirely
- Can be run in BYOC (bring your own cloud) or in Serverless
- BYOC – you provide all the compute and storage – the only thing that WarpStream provides is the control plane
- Data never leaves your environment
- Serverless – fully managed by WarpStream in AWS – will automatically scale for you even down to nothing!
- Can run in AWS, GCP and Azure
- Agents are also S3 compatible, so they can run with S3-compatible storage such as MinIO and others
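The "just change the URL" claim above can be sketched roughly like this. It is a minimal illustration, not WarpStream's documented setup: both hostnames, the client id, and the build_client_config helper are made up for the example. The keys follow the librdkafka naming used by clients such as confluent-kafka, which is an assumption about the client you use.

```python
# Hypothetical endpoints: neither hostname comes from the show notes.
SELF_MANAGED_BROKERS = "kafka-1.internal.example.com:9092"
WARPSTREAM_AGENTS = "warpstream-agent.internal.example.com:9092"


def build_client_config(bootstrap_servers: str) -> dict:
    """Return librdkafka-style client settings (hypothetical helper)."""
    return {
        "bootstrap.servers": bootstrap_servers,  # the only value that differs
        "client.id": "orders-service",           # made-up client name
        "acks": "all",
    }


before = build_client_config(SELF_MANAGED_BROKERS)
after = build_client_config(WARPSTREAM_AGENTS)

# List the settings that changed between the two environments.
changed_keys = sorted(key for key in before if before[key] != after[key])
print(changed_keys)

# With confluent-kafka installed, either dict could be handed to
# confluent_kafka.Producer(config); that step is left out so the sketch
# runs without a broker.
```

The point of the sketch is that the application code and the rest of the settings stay put; only the bootstrap address moves.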
Redpanda
- Redpanda is a slimmed-down, native, Kafka-protocol-compliant drop-in replacement for Kafka
- There’s even a Redpanda Connect!
- Its main differentiator is performance: it’s cheaper and faster
Apache Pulsar
- Similar to Kafka, but changes the abstraction on storage to allow more flexibility on IO
- Has a Kafka-compliant wrapper for interchangeability
- Simple data offload functionality to S3 or GCS
- Multi tenancy
- Geo replication
Cloud alternatives
- Google Cloud – Pub/Sub
- Azure – Event Hubs
- AWS – Kinesis
Tip of the Week
- Chord AI is an Android/iOS app that uses AI to figure out the chords for a song. This is really useful if you just want to get the quick gist of a song to play along with. The base version is free, and has a few different integration options (YouTube, Spotify, Apple Music Local Files for me) and it uses your phone’s microphone and a little AI magic to figure it out. It even shows you how to play the chords on guitar or piano. The free version gets you basic chords, but you can pay $8.99 a month to get more advanced/frequent chords.
https://www.chordai.net/
- Pandas is nearly as good as, if not better than, SQL for exploring data
https://pandas.pydata.org/
- Another tip for displaying data in Jupyter notebooks: use HTML() on your DataFrames to show the full column data
https://www.geeksforgeeks.org/how-to-render-pandas-dataframe-as-html-table/
- Take photos or video and convert them into 3D models
https://lumalabs.ai/luma-api
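For the two pandas tips above, here is a small sketch of what they might look like in practice. The sample data is made up, and lifting display.max_colwidth is our assumption about how to keep long cell values from being cut off; the show notes don't spell out the exact calls.

```python
import pandas as pd

# Made-up sample data, for illustration only.
long_note = " ".join(["Managed alternatives to running your own Kafka cluster."] * 3)
episodes = pd.DataFrame(
    {
        "topic": ["kafka", "kafka", "pandas"],
        "minutes": [95, 110, 80],
        "notes": [long_note, "short", "short"],
    }
)

# Rough pandas equivalent of:
#   SELECT topic, SUM(minutes) AS minutes FROM episodes GROUP BY topic;
minutes_by_topic = episodes.groupby("topic", as_index=False)["minutes"].sum()

# Lift the column-width limit so long cell values are not shortened,
# then render the frame as an HTML table string.
with pd.option_context("display.max_colwidth", None):
    html_table = episodes.to_html(index=False)

# In a notebook, display it with:
#   from IPython.display import HTML
#   HTML(html_table)
```

Outside a notebook, html_table is just a string, so it can also be written to a file and opened in a browser.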