The GeekNarrator

Kaivalya Apte

The GeekNarrator podcast is hosted by Kaivalya Apte, a software engineer who loves to talk about technology, technical interviews, self-improvement, best practices, and hustle.

Connect with Kaivalya Apte https://www.linkedin.com/in/kaivalya-apte-2217221a

Tech blogs: https://kaivalya-apte.medium.com/

Wanna talk? Book a slot here: https://calendly.com/speakwithkv/hey

Enjoy the show and please follow to get more updates. Also please don’t forget to rate and review the show.

Cheers

Episodes

Nov 16, 2025 • 1h 12min

Databases and Engineering with @PlanetScale CEO - Sam Lambert

Exploring Cloud Databases, Scalability, and Simple Engineering with Sam Lambert, CEO of PlanetScale

In this episode of The Geek Narrator podcast, we welcome Sam Lambert, CEO and Co-Founder of PlanetScale, known for creating the world's fastest and most scalable cloud database. Sam shares his insights on databases, operational excellence, and simple engineering. We discuss topics such as scalability, Postgres versus MySQL, and replication. Sam also talks about handling complexity in engineering, the unique features of Vitess, and how PlanetScale achieves high availability. Don't miss this deep dive into the future of cloud databases. Like, share, and subscribe to support the channel!

Chapters:
00:00 Introduction and Episode Overview
01:13 Meet Sam Lambert: Background and Career
02:42 Balancing Work and Social Media
05:48 The Philosophy of Simple Engineering
14:21 The Slotted Counter Pattern at GitHub
18:27 Postgres vs MySQL: Design Flaws and Philosophical Differences
28:58 Sharding and Scaling with Vitess
37:01 Database Branching and Schema Changes
38:50 Common Practices in Startups
39:07 Challenges with Data Branching
40:45 Legal and Ethical Considerations
42:31 Staging Environments vs. Dev Branches
45:26 Trade-offs in Cloud Databases
52:41 Replication and Durability
01:00:02 Ensuring High Availability
01:08:04 Backup Strategies and Testing
01:10:41 Conclusion and Final Thoughts

Learn about PlanetScale: https://planetscale.com/

For memberships: join this channel as a member here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join

Don't forget to like, share, and subscribe for more insights!

Like building stuff? Try out CodeCrafters and build amazing real-world systems like Redis, Kafka, SQLite. Use the link below to sign up and get 40% off a paid subscription.
https://app.codecrafters.io/join?via=geeknarrator

Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay Curious! Keep Learning!

Nov 16, 2025 • 1h 32min

What is TigerStyle? Principles behind TigerBeetle ft. Joran

Summary:
In this captivating episode, we sit down with Joran Dirk Greef, the mastermind behind TigerBeetle, a groundbreaking financial transactions database. Joran shares his journey of innovation, highlighting the challenges and triumphs of creating a system that is not only faster but also safer. Dive into the philosophy of TigerStyle, a unique methodology that emphasizes quality and performance, ensuring that software development is both efficient and effective. Joran's insights into trust, discipline, and the relentless pursuit of excellence offer valuable lessons for anyone in the tech industry. Whether you're a developer, entrepreneur, or tech enthusiast, this episode is packed with inspiration and practical wisdom. Don't miss out on this opportunity to learn from one of the leading minds in software engineering.

Chapters:
00:01:37 Introduction to TigerBeetle
00:02:27 Philosophy of TigerStyle
00:03:38 Challenges in Software Development
00:04:43 Importance of Trust and Quality
00:09:43 Static Allocation in Software
00:16:53 AI in Software Development
00:23:53 Business Philosophy and Innovation
00:31:53 The Future of Software Development

Nov 16, 2025 • 59min

What makes Apache Pinot so Fast?

Summary:
In this episode, host Kaivalya Apte interviews Ankit Sultana, a staff engineer at Uber with extensive experience in Apache Pinot, a real-time analytics platform. They discuss the high-level architecture, ingestion processes, and query mechanisms of Apache Pinot. Ankit provides historical context, detailing the evolution of Apache Pinot from its origins at LinkedIn to its widespread adoption. They cover the key components of Pinot, explaining the roles of Pinot servers, brokers, and controllers, and the dependency on ZooKeeper. Ankit also explains how data flows into Apache Pinot and the technicalities of its real-time ingestion and querying capabilities.

Chapters:
00:00 Introduction and Episode Overview
03:30 Understanding Apache Pinot
03:49 Apache Pinot's Historical Background
05:20 Real-Time Analytics with Apache Pinot
11:06 Apache Pinot's Architecture and Components
17:05 Tenancy and Data Ingestion in Apache Pinot
30:22 Understanding Real-Time Replication and Consumer Groups
30:52 Pinot's Offset Tracking and Segment Creation
31:59 Handling Server Restarts and Segment Transitions
32:50 Dealing with Kafka Duplicates and Deduplication Features
35:13 Ingestion Process and Mutable vs Immutable Segments
39:18 Memory Management and Segment Flushing
40:10 Advantages of Keeping Mutable Segments Longer
42:21 Introduction to Pinot's Query Engines
42:50 Single Stage Engine: Architecture and Optimizations
54:49 Multi-Stage Engine: Flexibility and Challenges
58:13 Conclusion and Next Steps

Important Links:
* Good high-level overview on Pinot: https://www.youtube.com/watch?v=F8Q_pGIH9yY
* Apache Pinot 101 by Tim: https://www.youtube.com/playlist?list=PLihIrF0tCXdfN6y-twj9KtWaXM1GH4RSe
* Multistage Physical Optimizer, the new optimizer that we built at Uber and open-sourced: https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/physical-optimizer
* Multistage Lite Mode: https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/multistage-lite-mode
* Time Series Engine Talk at RTA Summit: https://www.youtube.com/watch?v=kgseiambges

Oct 25, 2025 • 1h 18min

You don't need Linux, Docker, k8s? Future with Unikernels ft. NanoVMs

Ian Eyberg, founder of NanoVMs and a security expert, dives deep into the world of unikernels and their potential for cloud computing. He outlines how unikernels streamline applications by replacing the traditional operating system, improving performance while significantly reducing the attack surface. Ian contrasts containers with unikernels, explaining the latter's distinct advantages. He also shares insights on the future of NanoVMs, including ongoing developments and their commitment to improved integrations, making cloud deployment simpler and more secure.

Oct 25, 2025 • 1h 23min

Modern, ultra fast PostgreSQL engineered from scratch? ft: CedarDB

Summary:
In this conversation, Philipp discusses the innovations behind CedarDB, a database system designed from scratch to optimize performance for modern hardware. He explains the foundational principles of compiling SQL to machine code, the importance of parallel processing, and the challenges of maintaining Postgres compatibility. The discussion also covers the system's approach to handling transactional and analytical workloads, data ingestion processes, query optimization strategies, and future developments including schema evolution and disaggregated storage.

Takeaways:
- CedarDB is built from the ground up to utilize modern hardware effectively.
- The system compiles SQL directly to machine code for performance.
- Parallel processing is a key feature, allowing efficient use of multiple cores.
- CedarDB aims to be Postgres compatible while innovating on performance.
- Transactional workloads are handled efficiently without sacrificing analytical capabilities.
- Data ingestion is optimized for both row-oriented and columnar formats.
- The system uses optimistic concurrency control to manage write conflicts.
- Query optimization leverages statistics to improve join performance.
- Future developments include schema evolution and disaggregated storage.
- CedarDB is designed to be flexible and adaptable for various workloads.

Chapters:
00:00 Introduction to CedarDB and Background of Philipp
05:36 Compiling SQL to Machine Code for Performance
11:25 General Purpose vs. Analytical Databases
16:51 Transactional Workloads and Hybrid Storage Engine
54:29 Understanding B-Tree and Columnar Storage
01:02:18 Data Duplication and Memory Efficiency
01:08:43 Indexing Strategies and B-Tree Optimization
01:15:57 Handling Write Conflicts and Transaction Management
01:24:10 Query Optimization and Join Strategies
01:33:28 Future Developments in Schema Evolution and Storage

Important Links:
CedarDB: https://cedardb.com/
The Umbra research project: https://umbra-db.com/
SQL Query Compilation: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
Optimistic B-Trees: https://cedardb.com/blog/optimistic_btrees/
Our B-Tree storage engine: https://cedardb.com/blog/colibri/

Jul 29, 2025 • 1h 24min

Building a new Database Query Optimiser - @cmu

Read more about Kafka Diskless-topics, KIP by Aiven:
KIP-1150: https://fnf.dev/3EuL7mv

Summary:
In this conversation, Kaivalya Apte and Alexis Schlomer discuss the internals of query optimization with the new project optd. They explore the challenges faced by existing query optimizers, the importance of cost models, and the advantages of using Rust for performance and safety. The discussion also covers the innovative streaming model of query execution, feedback mechanisms for refining optimizations, and the future developments planned for optd, including support for various databases and enhanced cost models.

Chapters:
00:00 Introduction to optd and Its Purpose
03:57 Understanding Query Optimization and Its Importance
10:26 Defining Query Optimization and Its Challenges
17:32 Exploring the Limitations of Existing Optimizers
21:39 The Role of Calcite in Query Optimization
26:54 The Need for a Domain-Specific Language
40:10 Advantages of Using Rust for optd
44:37 High-Level Overview of optd's Functionality
48:36 Optimizing Query Execution with Coroutines
50:03 Streaming Model for Query Optimization
51:36 Client Interaction and Feedback Mechanism
54:18 Adaptive Decision Making in Query Execution
54:56 Persistent Memoization for Enhanced Performance
57:12 Guided Scheduling in Query Optimization
59:55 Balancing Execution Time and Optimization
01:01:43 Understanding Cost Models in Query Optimization
01:04:22 Exploring Storage Solutions for Query Optimization
01:07:13 Enhancing Observability and Caching Mechanisms
01:07:44 Future Optimizations and System Improvements
01:18:02 Challenges in Query Optimization Development
01:20:33 Upcoming Features and Roadmap for optd

References:
- NeuroCard: learned Cardinality Estimation: https://vldb.org/pvldb/vol14/p61-yang.pdf
- RL-based QO: https://arxiv.org/pdf/1808.03196
- Microsoft book about QO: https://www.microsoft.com/en-us/research/publication/extensible-query-optimizers-in-practice/
- Cascades paper: https://15721.courses.cs.cmu.edu/spring2016/papers/graefe-ieee1995.pdf
- optd source code: https://github.com/cmu-db/optd
- optd website (for now): https://db.cs.cmu.edu/projects/optd/

#database #queryoptimization #sql #postgres

Jul 29, 2025 • 1h 6min

Fast Observability on S3 with Parseable

Summary:
In this conversation, Nitish Tiwari discusses Parseable, an observability platform designed to address the challenges of managing and analyzing large volumes of data. The discussion covers the evolution of observability systems, the design principles behind Parseable, and the importance of efficient data ingestion and storage in S3. Nitish explains how Parseable allows for flexible deployment, handles data organization, and supports querying through SQL. The conversation also touches on the correlation of logs and traces, failure modes, scaling strategies, and the optional nature of indexing for performance optimization.

References:
Parseable: https://www.parseable.com/
GitHub Repository: https://github.com/parseablehq/parseable
Architecture: https://parseable.com/docs/architecture

Chapters:
00:00 Introduction to Parseable and Observability Challenges
05:17 Key Features of Parseable
12:03 Deployment and Configuration of Parseable
18:59 Ingestion Process and Data Handling
32:52 S3 Integration and Data Organisation
35:26 Organising Data in Parseable
38:50 Metadata Management and Retention
39:52 Querying Data: User Experience and SQL
44:28 Caching and Performance Optimisation
46:55 User-Friendly Querying: SQL vs. UI
48:53 Correlating Logs and Traces
50:27 Handling Failures in Ingestion
53:31 Managing Spiky Workloads
54:58 Data Partitioning and Organisation
58:06 Creating Indexes for Faster Reads
01:00:08 Parseable's Architecture and Optimisation
01:03:09 AI for Enhanced Observability
01:05:41 Getting Involved with Parseable

#database #s3 #objectstorage #opentelemetry #logs #metrics

Jul 29, 2025 • 1h 17min

How does AWS Lambda work?

Summary:
In this conversation, Kaivalya Apte and Rajesh Pandey talk about the engineering behind AWS Lambda, exploring its architecture, use cases, and best practices. They discuss the challenges of event handling, concurrency, and load balancing, as well as the importance of observability and testing in serverless environments. The conversation highlights the innovative solutions AWS Lambda provides for developers, emphasizing the balance between simplicity and complexity in cloud computing.

Chapters:
00:00 Introduction to AWS Lambda
04:36 Use Cases and Best Practices for AWS Lambda
09:34 Event Handling and Queue Management
19:41 Idempotency and Event Duplication Challenges
29:39 Cold Starts and Performance Optimization
34:37 Statelessness and Resource Management in Lambda
42:18 Understanding Micro-VMs and Cold Starts
45:14 Resource Management and Recommendations for Developers
47:04 Scaling and Back Pressure in Serverless Systems
51:33 Cellular Architecture and Fairness in Resource Allocation
55:23 Handling Problematic Events and Poison Pills
01:01:03 Testing and Operational Readiness in Lambda
01:14:11 Preparing for High Traffic Events

References:
Handling Billions of invocations: https://aws.amazon.com/blogs/compute/handling-billions-of-invocations-best-practices-from-aws-lambda/
Firecracker: https://firecracker-microvm.github.io/
AWS Lambda: https://aws.amazon.com/lambda/

Connect with Rajesh:
https://x.com/RPandeyViews
https://www.linkedin.com/in/rajeshpandeyiiit/

#aws #awslambda #serverless #distributedsystems #scalability #reliability

Jul 29, 2025 • 1h 5min

Breaking Distributed Systems with Kyle Kingsbury from Jepsen

Summary:
In this episode of The Geek Narrator podcast, host Kaivalya Apte interviews Kyle Kingsbury, a renowned expert in database and distributed systems safety analysis. They discuss the world of testing distributed systems, the challenges faced, and common bugs and patterns. Kyle shares insights on the importance of understanding system documentation, the role of formal verification, and the balance between performance and safety in testing. He also provides valuable advice for aspiring engineers in the field of distributed systems.

Chapters:
00:00 Introduction to Kyle Kingsbury and His Work
06:59 Common Bugs in Distributed Systems
12:37 Functional Bugs vs Safety Bugs
17:54 Changes in Testing Over the Years
26:03 False Positives and Negatives in Testing
32:33 The Importance of Experimentation in Testing
39:28 Tools and Technologies for Testing
48:58 The Role of Formal Verification
57:04 Reusability of Tests

Important links:
Distributed systems class: https://github.com/aphyr/distsys-class
Write your own distributed system: https://github.com/jepsen-io/maelstrom
Jepsen Analyses: https://jepsen.io/analyses

Key takeaways:
- Reading documentation is a crucial first step in testing systems.
- Testing distributed systems involves understanding their semantics and guarantees.
- Common bugs often arise from mismanagement of definite versus indefinite failures.
- Testing strategies for cloud-based systems require cooperation with providers.
- Performance testing can reveal unexpected behaviours in systems under stress.
- Formal verification remains a challenging but valuable tool in ensuring system safety.
- The testing process is iterative and requires collaboration with engineering teams.
- Aspiring engineers should immerse themselves in practical experiences to build intuition.

#databasearchitecture #distributedsystems #cloudcomputing #testing #jepsen

Apr 7, 2025 • 1h 9min

How do vector (search) databases work? ft: turbopuffer

Simon Eskildsen, co-founder of turbopuffer and former infrastructure engineer at Shopify, dives into the world of vector databases. He discusses the role of vector search in improving recommendation systems, alongside challenges like cost and scaling. Simon also shares insights on managing podcast episode archives using embeddings and indexing strategies. The conversation highlights the importance of observability in database performance and looks ahead to future trends in vector search technology.
