The GeekNarrator

Kaivalya Apte
Jun 5, 2024 • 47min

SuperCharging PostgreSQL for Search and Analytics - ParadeDB (Philippe Noël)

In this video I speak with Philippe Noël about ParadeDB, an Elasticsearch alternative built on Postgres that modernizes the features of Elasticsearch's product suite, starting with real-time search and analytics. I hope you enjoy the conversation and learn about the product.
Chapters:
00:00 Introduction
01:12 Challenges with Elasticsearch and the Need for ParadeDB
02:29 Why Postgres?
06:30 Technical Details of ParadeDB's Search Functionality
18:25 Analytics Capabilities of ParadeDB
24:00 Understanding ParadeDB Queries and Transactions
24:22 Application Logic and Data Workflows
25:14 Using PG Cron for Data Migration
30:05 Scaling Reads and Writes in Postgres
31:53 High Availability and Distributed Systems
34:31 Isolation of Workloads
39:38 Database Upgrades and Migrations
41:21 Using ParadeDB Extensions and Distributions
43:02 Observability and Monitoring
44:42 Upcoming Features and Roadmap
46:34 Final Thoughts
Important links:
GitHub: https://github.com/paradedb/paradedb
Website: https://paradedb.com
Docs: https://docs.paradedb.com/
Blog: https://blog.paradedb.com
Follow me on LinkedIn and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network. Also, please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#postgresql #datafusion #parquet #sql #OLAP #apachearrow #database #systemdesign #elasticsearch
15 snips
Jun 5, 2024 • 57min

Modern OLAP Database System Design with FDAP (Andrew Lamb)

Andrew Lamb, Staff Software Engineer at InfluxDB and chair of the Apache DataFusion project, shares his expertise on modern OLAP database design. He explains the power of the FDAP stack, highlighting how Apache Parquet and Arrow enhance data storage and retrieval efficiency. The conversation delves into the challenges of data immutability and management, while also discussing Flight's role in simplifying data transfer. Looking ahead, Andrew envisions evolving trends in database technologies, paving the way for innovative solutions in analytics.
Jun 5, 2024 • 46min

The ultimate multi-model Database, SurrealDB with Pratim Bhosale

In this video, Pratim Bhosale, Developer Advocate at SurrealDB, and I talk about SurrealDB, a multi-model database that aims to make developers' lives easier by letting them focus on business logic rather than on the choice of database. The following chapters will help you understand what a multi-model database is and how SurrealDB shines.
Chapters:
00:00 Introduction
01:48 The Genesis of SurrealDB
03:59 SurrealDB's Mission and Use Cases
07:34 Understanding Multi-Model Databases
10:30 Deep Dive into SurrealDB's Architecture
33:09 Deployment and Getting Started with SurrealDB
34:31 Future Developments and Use Case Considerations
43:51 Final Thoughts and How to Get Started
Important links:
Install SurrealDB: https://sdb.li/4bqwn38
SurrealDB Docs: https://sdb.li/3wxjoxx
SurrealDB Website: https://sdb.li/3JMK7JI
Surrealist: https://sdb.li/4b7wcdh
SurrealDB GitHub: https://sdb.li/3JRPNlE
#surrealdb #elasticsearch #search #vectorsearch #acid #databases #sql #joins #indexes #graphdatabase
May 17, 2024 • 1h 15min

Demystifying Real-time Analytics, Search and Hybrid Search with Dhruba, CTO @Rockset

In this video, I talk to Dhruba, CTO @Rockset, about search and realtime analytics. We discuss the deep internals of Rockset, its architecture, and why it is a great fit for search and realtime analytics use cases.
Chapters:
00:00 Introduction
02:45 The Evolution of Data Systems: From Hadoop to Rockset
07:30 Understanding Rockset: Real-Time Analytics and Search Defined
12:01 The Technical Edge: Rockset vs. Elasticsearch
18:16 Deep Dive into Rockset's Architecture and Internals
28:21 Partitioning, Hashing, and Data Distribution in Rockset
36:56 Exploring Hot Storage and Cache Layers
37:40 Why Hot Storage is Essential for Low Latency
39:05 Optimizing Data Storage with Compression and Delta Encoding
39:49 Balancing Cost and Performance in Data Storage
41:50 The Power of Converged Indexing in Rockset
45:50 Efficient Query Execution and Index Management
54:51 Leveraging Mutability for Real-Time Analytics
59:24 Deep Dive into Query Processing and Optimization
01:04:21 Understanding Joins and Reporting Queries in Rockset
01:12:23 Future Directions and Vector Search Innovations
Index Conference: https://rockset.com/index-conf/
#rockset #elasticsearch #search #vectorsearch #realtime #databases #sql #joins #indexes
May 17, 2024 • 47min

Rapidly Simulate Production Traffic ft. Michael Drogalis

In this episode we explore how to rapidly simulate production traffic with Michael Drogalis, using his creation ShadowTraffic. I am sure you will relate to the problems discussed in this episode and like how ShadowTraffic aims to solve them. I hope you like this conversation.
Chapters:
00:00 Welcome to The Geek Narrator Podcast: Exploring Deep Tech
00:18 The Challenge of Simulating Production Traffic
00:59 Introducing Shadow Traffic: A Solution to Data Simulation
02:34 Understanding the Problem Space of Data Simulation
06:03 How Shadow Traffic Works: A Deep Dive
08:17 The Power of Declarative Data Generation with Shadow Traffic
10:40 Shadow Traffic's Architecture and Deployment
13:02 Configuring Load Testing and Throttling with Shadow Traffic
15:47 Testing and Validation in Shadow Traffic
20:42 Mimicking Production Data Distribution with Shadow Traffic
26:48 Innovative Features for Stream Processing Testing
28:47 Shadow Traffic: Adding Faults to Data for Robust Testing
29:04 Antithesis and Shadow Traffic: A Synergistic Approach
32:46 The Challenge of Generating Realistic Test Data
40:04 Enhancing Observability in Data Generation
41:50 Customer-Driven Roadmap and Future Vision
45:27 Closing Thoughts
ShadowTraffic: https://shadowtraffic.io/
Contact Michael: https://shadowtraffic.io/contact.html
#kafka #s3 #postgres #testing #streamprocessing #loadtesting #chaostesting #demo
May 17, 2024 • 52min

High Performance with GraalVM - Alina Yurenko

If you're involved in the Java space, chances are you've come across #GraalVM. And if you are active in the tech community, you might have heard about the recent 1BRC challenge initiated by Gunnar Morling. GraalVM truly showcased its capabilities in this challenge, sparking my curiosity. That's why I reached out to Alina to delve deeper into GraalVM, exploring its features and uncovering how it excels in such endeavors. And here we are, talking about GraalVM.
Chapters:
00:00 Introduction
01:47 GraalVM's Impact on the 1BRC Challenge and Its Features
04:34 Exploring GraalVM's Core Features and Benefits
08:34 Real-World Success Stories: GraalVM in Action
16:18 Understanding Native Image Compilation with GraalVM
20:34 Framework Compatibility and GraalVM Integration
25:04 Testing and Integration with GraalVM
25:26 Exploring Testing and Development with GraalVM
25:58 Best Practices for Developing with GraalVM
28:11 Migrating to GraalVM: Strategies and Considerations
31:25 Performance Optimization in GraalVM
35:15 Building and Resource Considerations for GraalVM
38:45 Expanding Horizons: Polyglot Programming with GraalVM
43:15 Future Directions and Limitations of GraalVM
47:40 Engaging the Java Community: GraalVM's Impact
50:21 Getting Started with GraalVM: Resources and Recommendations
References and Links:
- The GraalVM website with docs, downloads, guides: https://www.graalvm.org/
- Nicolai Parlog's "Modern Java in Action" demo: https://github.com/nipafx/modern-java-demo
- My native version of Nicolai's demo: https://github.com/alina-yur/native-modern-java-demo
- For news, follow GraalVM: https://twitter.com/graalvm
#Java #jvm #graalvm #highperformance #JITcompiler #AOT #nativeimage #security #rust #c++
Apr 18, 2024 • 59min

Taming TimeSeries Data with QuestDB - Javier Ramirez

In this episode I talk to Javier Ramirez from QuestDB about everything QuestDB. This episode is a great resource for understanding how QuestDB works, its architecture, what it is optimised for, and what's upcoming on the roadmap. If you have timeseries data and need a simple yet highly scalable solution, #QuestDB is a great option.
Chapters:
00:00 Introduction
03:04 Understanding QuestDB: Origins and Use Cases
09:21 Deep Dive into QuestDB's Architecture and Data Ingestion
19:07 Optimizing Data Reads and Writes in QuestDB
28:40 Exploring Data Granularity and Partitioning in QuestDB
29:29 Optimizing Query Performance with Partition Strategies
30:26 Handling Data Ingestion and Query Efficiency
32:58 In-depth Look at Data Duplication and Ingestion Performance
34:55 Understanding Compression and Its Impact on Performance
38:51 Replication and Data Distribution Strategies
47:10 Observability and Metrics in QuestDB
50:57 Future Developments and Enhancements in QuestDB
58:45 Closing Remarks
Links:
QuestDB: https://questdb.io/
Github: https://github.com/questdb/questdb
For a discount on the courses below:
Appsync: https://appsyncmasterclass.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Testing serverless: https://testserverlessapps.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Production-Ready Serverless: https://productionreadyserverless.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Use the "Add Discount" button and enter the discount code "geeknarrator" to get a 20% discount.
#questdb #sql #timeseries #timeseriesanalysis #databases #highscale #scaleup #performance #parquet #S3 #replication #writeaheadlog #wal #durability #columnstore
Apr 9, 2024 • 1h 17min

Beat the CAP Theorem : Make Distributed consistency simple

In this episode I talk to Andras Gerlits, who founded omniledger.io. Andras has a very interesting view on how distributed consistency should work, one that can remove several bottlenecks in maintaining it. He argues that getting rid of a global wall clock and using causality to approach distributed consistency helps you build resilient, simple and performant systems. We go deeper into how that can be achieved and how the product works.
Chapters:
00:00 Introduction
00:52 Andras's Journey into Distributed Consistency
03:04 The Evolution of Data Consistency in Banking and Beyond
08:04 Introducing Client-Centric Consistency
10:36 Exploring the Standard Model of Distributed Consistency
16:01 Redefining Strong Consistency with a Relativistic Approach
34:25 Practical Implications of Client-Centric Consistency in Banking
36:20 Mitigating Latencies and Partitions in Distributed Systems
41:08 Exploring System Reliability and Availability
41:52 Tuning System Properties for Specific Use Cases
43:07 Comparing Standard and New Models for Data Management
45:08 Understanding Local Progress and Mutex-Free Updates
47:23 Deep Dive into Token-Based Ordering and Global Calibration
58:30 Introducing OmniLedger: A New Approach to Distributed Consistency
01:02:41 Performance Optimizations and Tunable Consistency
01:08:20 Ideal Use Cases and Potential Limitations of OmniLedger
01:14:30 Future Directions and Closing Thoughts
Links:
Our website: https://omniledger.io
A long-form essay on the thinking behind our model: https://medium.com/p/5e397cb12e63
A demo of transactionality: https://www.youtube.com/watch?v=XJSSjY4szZE
I think my blog in general might be interesting to some: https://medium.com/@andrasgerlits
The science paper with all its mathematical rigour: https://www.researchgate.net/publication/359578461_Continuous_Integration_of_Data_Histories_into_Consistent_Namespaces
#databases #sql #consistency #distributedsystems
19 snips
Mar 27, 2024 • 1h 2min

A Graph Database That You Can Embed - KuzuDB

In a compelling discussion, Semih Salihoglu, an Associate Professor at the University of Waterloo and CEO of KuzuDB, dives into the world of graph databases. He unveils the journey of KuzuDB from academic roots to an innovative startup. The conversation reveals when to choose a graph database, KuzuDB's unique features compared to traditional systems, and advanced query optimization techniques. Salihoglu also shares insights on handling data ingestion and write operations, highlighting KuzuDB's efficiency and future aspirations in the data landscape.
Mar 22, 2024 • 1h 6min

Restate - making distributed systems simple with Stephan Ewen

In this video, I talk to Stephan Ewen from Restate, who is well known from the world of Apache Flink. We talk about the problems in the world of distributed systems and the complex solutions developers have to deal with. This complexity eventually creates reliability, observability and delivery velocity problems. Restate aims to solve this by making resilience and durability for your services, functions and RPC a lot simpler.
Chapters:
00:00 Introduction
00:45 Introducing Restate: A Solution for Distributed System Challenges
01:22 Deep Dive into Restate with Stephan: From Apache Flink to Building Resilient Systems
06:04 The Complexities of Distributed Systems and How Restate Addresses Them
15:49 The Vision of Restate: Simplifying Developer Experience in Distributed Systems
24:42 Integrating Restate into Your Architecture: A User's Perspective
33:16 Exploring Restate: The Durable Service Mesh
33:32 The Power of Restate in Handling Transactions
34:26 Restate's Role in Service Communication and Durability
35:40 Deep Dive into Restate's Mechanisms and Benefits
38:04 Practical Example: Email Pipeline with Restate
39:40 Understanding Restate's Log and Event Handling
58:43 Restate's Unique Features and Programming Model
01:04:22 Final Thoughts on Restate's Impact and Deployment
Restate: https://restate.dev/
#distributedsystems #faulttolerance #reliability #resilience
