Data Archives - Software Engineering Daily

Latest episodes

Apr 27, 2022 • 42min

Airbyte Engineering with Michel Tricot

Data integration infrastructure is not easy to build. Moving large amounts of data from one place to another has historically required developers to build ad hoc integration points to move data between SaaS services, data lakes, and data warehouses. Today, there are dedicated systems and services for moving these large batches of data. Airbyte builds open source data integration systems, and Michel Tricot from Airbyte joins the show to talk about the design and development of Airbyte. Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Apr 25, 2022 • 43min

Select Star with Shinji Kim

Modern organizations eventually face data governance challenges. Keeping track of where data came from, which systems update it, and in what ways updates can be made are just some of the issues to be tackled. Large organizations face additional challenges like training, onboarding, and capturing the institutional knowledge that leaves with the departure of key team members. As teams grow, these challenges often grow faster for unprepared organizations. Select Star helps companies unlock the full context of their data. Their solution automatically catalogs and documents your database tables and BI dashboards. In this episode, I interview Shinji Kim about the functionality of Select Star and how companies have achieved successful adoption.

Apr 14, 2022 • 49min

Time Series IoT on InfluxDB with Brian Gilmore

When most people hear the phrase Internet of Things, it typically evokes an image of connected devices we install in our homes. While this is a common use case, the true winner to date in IoT is probably industrial automation. Because small improvements can yield big returns, and small errors can result in huge losses, it’s critical to capture and elegantly handle telemetry data from industrial systems. The solution many turn to for capturing their streaming data is InfluxDB. In this episode, I interview Brian Gilmore, Director of Product Management at InfluxData, about how real-time applications built on top of InfluxDB achieve success.

Apr 5, 2022 • 44min

Data Engineering Trends with Lior Gavish and James Densmore

Data infrastructure is a fast-moving sector of the software market. As the volume of data has increased, so too has the quality of tooling to support data management and data engineering. In today’s show, we have a guest from a data-intensive company as well as a guest from a company that builds a popular data engineering product. James Densmore works at HubSpot, which produces tons of data, and Lior Gavish works at Monte Carlo Data, which sells a data quality product.

Mar 31, 2022 • 49min

PlanetScale Management with Sam Lambert

Running a database company requires both technical and managerial expertise. There are deeply technical engineering questions around query paths, scalability, and distributed systems. And there are complex managerial questions around developer productivity and task allocation. Sam Lambert is the CEO of PlanetScale, which is building modern relational database infrastructure. Before PlanetScale he spent several years on infrastructure at GitHub. He joins the show to talk about his work at PlanetScale and the vision for the company.

Mar 29, 2022 • 43min

SingleStore with Jordan Tigani

SingleStore is a multi-use, multi-model database designed for transactional and analytic workloads, as well as search and other domain-specific applications. SingleStore is the evolution of the database company MemSQL, which sought to bring fast, in-memory SQL database technology to market. Jordan Tigani is Chief Product Officer of SingleStore and joins the show to talk through the architecture and engineering of the SingleStore platform.

Mar 19, 2022 • 49min

DuckDB with Hannes Mühleisen

DuckDB is a relational database management system with no external dependencies and a simple system for deployment and integration into build processes. It enables complex queries in SQL with a large function library, and provides transactional guarantees through multi-version concurrency control. Hannes Mühleisen works on DuckDB and joins the show to talk about query engines and OLAP systems.

Mar 16, 2022 • 47min

RudderStack Engineering with Soumyadeb Mitra

Customer data pipelines power the backend of many successful web platforms. In a customer data pipeline, data is collected from sources such as mobile apps and cloud SaaS tools, transformed and munged using data engineering, stored in data warehouses, and piped to analytics, advertising platforms, and data infrastructure. RudderStack is an open source customer data pipeline system that pulls together this disparate functionality. In a previous episode, we covered the basics of RudderStack. In today’s show, we dive deeper into the engineering of RudderStack with returning guest CEO Soumyadeb Mitra.

Mar 9, 2022 • 43min

Apache Hudi with Vinoth Chandar

The data lake architecture has become broadly adopted in a relatively short period of time. In a nutshell, that means data in its raw format stored in cloud object storage. Modern software and data engineers have no shortage of options for accessing their data lake, but that list shrinks quickly if you care about features like transactions. Apache Hudi is a platform for building streaming data lakes that is optimized for lake engines and batch processing. In this episode, I interview Vinoth Chandar, creator of the Hudi Project and Founder and CEO at Onehouse.

Feb 25, 2022 • 52min

Data Catalog in Practice with Mark Grover

A data catalog provides an index into the data sets and schemas of a company. Data teams are growing in size, and more companies than ever have a data team, so the market for data catalogs is larger than ever. Mark Grover is the CEO of Stemma and the co-creator of Amundsen, a data catalog that came out of Lyft. We have previously explored the basics of Amundsen. In today’s episode, Mark Grover returns to the show to talk about the art and science of data catalogs.