Data Archives - Software Engineering Daily

Latest episodes

Jun 12, 2023 • 56min

Data Reliability with Barr Moses and Lior Gavish

As companies depend more on data to improve digital products and make informed decisions, it’s crucial that the data they use be accurate and reliable. Monte Carlo, the data reliability company, is the creator of the industry’s first end-to-end data observability platform. Barr Moses and Lior Gavish are the founders of Monte Carlo and they join us today. This episode is hosted by Sean Falconer. Sean has been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer. Sponsorship inquiries: sponsor@softwareengineeringdaily.com The post Data Reliability with Barr Moses and Lior Gavish appeared first on Software Engineering Daily.
May 26, 2023 • 54min

Low-Code SQL on dbt Core with Raj Bains from Prophecy

In this podcast episode, we take a look at the intricacies of low-code data pipelines with Raj Bains, the founder of Prophecy.io. Raj shares valuable insights into how performant low-code data pipelines are revolutionizing industries and transforming everyday operations. Raj discusses the founding story of Prophecy.io, the company’s mission, and its approach to democratizing the creation of efficient data pipeline solutions through visual design and code generation. We also discuss technical concepts and conundrums such as data lineage, schema evolution, and metadata management, which are critical in addressing the challenges faced by data pipeline developers and businesses. The episode concludes with Raj’s thoughts on the future of low-code data pipelines, the Prophecy.io roadmap, and its potential impact on various industries, from healthcare to finance. Starting her career as a software developer, Jocelyn Houle is now a Senior Director of Product Management at Securiti.ai, a unified data protection and governance platform. Before that, she was an Operating Partner at Capital One Ventures investing in data and AI startups. Jocelyn has been a founder of two startups and a full-life-cycle technical product manager at large companies like Fannie Mae, Microsoft and Capital One. Follow Jocelyn on LinkedIn or Twitter @jocelynbyrne.
Apr 20, 2023 • 33min

Open-Source Embedding Database with Anton Troynikov

Chroma is an open-source embedding database that is designed to make it easy to build large language model applications by making knowledge, facts and skills pluggable. Anton Troynikov is the co-founder of Chroma and he is our guest today. This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His best-selling book, Architecting for Scale (O’Reilly Media), is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments. Lee is the host of Modern Digital Business, a podcast produced for people looking to build and grow their digital business with the help of modern applications and processes developed for today’s fast-moving business environment. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com.
Apr 13, 2023 • 41min

Data Activation with Tejas Manohar

Data Activation is the method of unlocking the knowledge stored within your data warehouse, and making it actionable by your business users in the end tools that they use every day. In doing so, Data Activation helps bring data people toward the center of the business, directly tying their work to business outcomes. Hightouch is the simplest and fastest way to get started with Data Activation. As a Data Activation Platform, Hightouch uses Reverse ETL to sync data from the warehouse to 100+ different integrations. With Hightouch, companies can leverage their existing data models and easily view and monitor all of their data syncs in a single platform. Better yet, Hightouch offers a visual audience builder that makes it easy for non-technical users to create custom audiences at a moment’s notice. Tejas Manohar is the CEO at Hightouch and he joins us today. Full disclosure: Hightouch is a sponsor of Software Engineering Daily. This episode is hosted by Alex DeBrie. Alex is an AWS Data Hero, an independent consultant, and the author of The DynamoDB Book, the comprehensive guide to data modeling with DynamoDB. He was an early employee at Serverless, Inc., creators of the Serverless Framework, and was an early community member in the serverless space. His consulting and training work focuses on serverless architectures and database optimization. You can find him on Twitter as @alexbdebrie or on his site, alexdebrie.com.
Apr 7, 2023 • 46min

Self-Service Data Culture with Stemma’s Mark Grover

The podcast discusses the importance of data catalogs in modern data culture, highlighting the challenges of managing data in the cloud era. The creation of Stemma from Lyft's data management issues is explored, along with insights on architecting information schema metadata systems. Additionally, the episode touches on the entrepreneurial journey and the significance of data velocity.
Apr 6, 2023 • 47min

Streaming Analytics with Hojjat Jafarpour

Streaming analytics refers to the process of analyzing real-time data that is generated continuously and rapidly from various sources, such as sensors, applications, social media, and other internet-connected devices. Streaming analytics platforms enable organizations to extract business value from data in motion, similar to how traditional analytics tools derive insights from data at rest. DeltaStream is a unified serverless stream processing platform, based on Apache Flink, to manage, secure and process all your event streams. Hojjat Jafarpour is the Founder and CEO at DeltaStream and he joins us today. Before founding DeltaStream, Hojjat was at Confluent, the company behind Apache Kafka, where he built a product called ksqlDB, a database built to do stream processing on top of Apache Kafka. Starting her career as a software developer, Jocelyn Houle is now a Senior Director of Product Management at Securiti.ai, a unified data protection and governance platform. Before that, she was an Operating Partner at Capital One Ventures investing in data and AI startups. Jocelyn has been a founder of two startups and a full-life-cycle technical product manager at large companies like Fannie Mae, Microsoft and Capital One. Follow Jocelyn on LinkedIn or Twitter @jocelynbyrne.
Apr 3, 2023 • 51min

Turso: Globally Replicated SQLite with Glauber Costa

Distributed databases are necessary for storing and managing data across multiple nodes in a network. They provide scalability, fault tolerance, improved performance, and cost savings. By distributing data across nodes, they allow for efficient processing of large amounts of data and redundancy against failures. They can also be used to store data across multiple locations for faster access and better performance. Turso is an edge-hosted, distributed database based on libSQL, an open-source and open-contribution fork of SQLite. It was designed to minimize query latency for applications where queries come from anywhere in the world. In particular, it works well with edge functions provided by cloud platforms such as Cloudflare, Netlify, and Vercel, by putting your data geographically close to the code that accesses it. Glauber Costa is the Founder and CEO of ChiselStrike, the company behind Turso, and he joins us today. Full disclosure: ChiselStrike is a sponsor of Software Engineering Daily. This episode is hosted by Alex DeBrie. Alex is an AWS Data Hero, an independent consultant, and the author of The DynamoDB Book, the comprehensive guide to data modeling with DynamoDB. He was an early employee at Serverless, Inc., creators of the Serverless Framework, and was an early community member in the serverless space. His consulting and training work focuses on serverless architectures and database optimization. You can find him on Twitter as @alexbdebrie or on his site, alexdebrie.com.
Mar 20, 2023 • 27min

Observability Trends with John Hart

DataSet is a log analytics platform provided by SentinelOne that helps DevOps, IT engineering, and security teams get answers from their data across all time periods, both live streaming and historical. It’s powered by a unique architecture that uses a massively parallel query engine to provide actionable insights from the data available. John Hart is a distinguished engineer leading the Event DB team, where he’s responsible for the time series database that powers the DataSet product. John is our guest here today. Full disclosure: SentinelOne is a sponsor of Software Engineering Daily. This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His best-selling book, Architecting for Scale (O’Reilly Media), is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments. Lee is the host of Modern Digital Business, a podcast produced for people looking to build and grow their digital business with the help of modern applications and processes developed for today’s fast-moving business environment. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com.
Mar 10, 2023 • 51min

Data Investing and the MAD with Matt Turck

There are many types of early-stage funding available, from friends and family to seed to Series A. Some firms invest across a wide set of technologies and seek only to provide capital. Others are in it for the long haul: they focus on specific areas of technology and develop both long-term relationships and deep expertise over time. Today, we are interviewing Matt Turck of FirstMark Capital, who is in it for the long haul and whose portfolio companies include Dataiku, Crossbeam, Ada, Cockroach Labs, ClickHouse and more. We talk about Matt’s career, his investment point of view, founding the Data-driven NYC community, and the recent release of the 2023 MAD, an industry resource for understanding the Machine Learning, AI and Data landscape. Be sure to check out the show notes for links to the MAD. This episode is hosted by Jocelyn Houle. Follow Jocelyn on LinkedIn or on Twitter @jocelynbyrne. Show notes: in today’s show we referenced a few things you may want to check out: Matt’s blog and MAD Landscape; the interactive MAD Landscape; the picture in Matt’s office, The Son of Man by René Magritte; Matt’s full bio; and the FirstMark Capital site.
Nov 11, 2022 • 46min

Accessing Data at Scale with Justin Borgman

The Presto/Trino project makes distributed querying easier across a variety of data sources. As the need for machine learning and other high-volume data applications has increased, the need for support, tooling, and cloud infrastructure for Presto/Trino has increased with it. Starburst helps your teams run fast queries on any data source. With Starburst you get a single point of access to your data, no matter where it’s stored, and it supports high concurrency. Whether it’s fast SQL queries on your data lake or faster queries across multiple datasets, Starburst helps your teams run analytics anywhere. Justin Borgman is the CEO of Starburst, and he joins us today.
