
Data Archives - Software Engineering Daily
Databases and data engineering episodes of Software Engineering Daily
Latest episodes

Nov 7, 2022 • 40min
Building on the Data Cloud with Torsten Grabs
Building and managing data-intensive applications has traditionally been costly and complex, and has placed an operational burden on developers as their organizations scale. Today’s developers, data scientists, and data engineers need a streamlined, single cloud data platform for building applications, pipelines, and machine learning models without having to move or copy their data. Platforms like the Snowflake Data Cloud provide a unified tool for developers to build data applications with Python, using Streamlit’s open source framework and Snowflake’s Native Application Framework. The platform also offers a streamlined architecture that natively supports users’ programming languages of choice, including Java, Scala, SQL, and now Python with Snowpark, and lets them store and use transactional and analytical data together with Unistore.
Torsten Grabs is the Director of Product Management at Snowflake focused on Data Lake, Data Pipelines, and Data Science. He joins the show to dive into how Snowflake is disrupting application development, and how developers today can eliminate complexity with the Data Cloud.
Sponsorship inquiries: sponsor@softwareengineeringdaily.com
The post Building on the Data Cloud with Torsten Grabs appeared first on Software Engineering Daily.

Sep 12, 2022 • 35min
Serverless Clickhouse for Developers with Jorge Sancha
Data analytics technology and tools have seen significant improvements in the past decade. But it can still take weeks to prototype, build, and deploy new transformations, usually requiring considerable engineering resources. Plus, most data isn’t real-time; most of it is still batch-processed.
Tinybird Analytics provides an easy way to ingest and query large amounts of data in real-time, as well as to automatically create an API to consume those queries. This makes it easy to build fast and scalable applications that query your data; no backend needed!
In this episode, we interview Jorge Sancha, Founder and CEO of Tinybird.
Full disclosure: Tinybird is a sponsor of Software Engineering Daily.

Aug 18, 2022 • 54min
Data Infrastructure for Finance
Data is becoming a bank’s biggest asset. These complex enterprises have a huge opportunity ahead: to transform themselves into a trusted hub of a much broader data ecosystem that goes beyond the financial industry and helps form a new class of cross-industry experience architectures that are scalable and transparent. The data physics needed for such emerging systems runs on consent and privacy preservation rather than black-boxed data lakes. A foundation for making this happen lies in the ability to use distributed, heterogeneous data effectively and transform it into experiences that are relevant to the customer. These new experience architectures follow design patterns that are more participatory and consent-based than blindly personalized.
In this episode, we take a deep dive into the context-aware Flybits platform with founder and CEO Hossein Rahnama, alongside their CTO, Petar Kramaric, and VP of Engineering, Justin Lam.

Aug 5, 2022 • 47min
Faking Data Using Tonic.ai with Ian Coe and Adam Kamor
Ian Coe, CEO
Adam Kamor, Head of Engineering
Companies that gather data about their users have an ethical obligation and legal responsibility to protect the personally identifiable information in their datasets. Ideally, developers working on a software application wouldn’t need access to production data. Yet without high-quality example data, many technology groups stumble on avoidable problems. Organizations need a solution that protects privacy while preserving the aspects of the data that are important. Tonic is automating data synthesis to advance data privacy. Their solution gives you production-like data for development and analytical purposes without compromising on data quality or privacy. In this episode, we interview Tonic’s CEO, Ian Coe, and Head of Engineering, Adam Kamor.

Jul 28, 2022 • 30min
Couchbase with Ravi Mayuram
Couchbase is a distributed NoSQL cloud database. Since its creation, Couchbase has expanded into edge computing, application services, and most recently, a database-as-a-service called Capella.
Couchbase started as an in-memory cache and needed to be rearchitected to be a persistent storage system. In this episode, we interview Ravi Mayuram, SVP of Products and Engineering at Couchbase. To learn more about Couchbase, check out couchbase.com/sedaily.

Jun 1, 2022 • 45min
Decodable Streaming with Eric Sammer
Streaming data platforms like Kafka, Pulsar, and Kinesis are now common in mainstream enterprise architectures, providing low-latency real-time messaging for analytics and applications. However, stream processing – the act of filtering, transforming, or analyzing the data inside the messages – is still an exercise left to the receiving microservice or datastore, a custom programming exercise likely repeated over and over within an application. Stream processing tools such as Apache Flink and ksqlDB have been around for half a decade, but their complexity has hindered adoption. Decodable’s mission is to radically simplify processing on the stream with a SaaS platform that is based on Flink and uses only SQL, freeing developers to focus on what matters most.
Eric Sammer is founder and CEO of Decodable and joins the show to discuss the potential of stream processing, its role in modern data platforms, and how it’s being used today.

May 14, 2022 • 28min
Data Delivery with Naqeeb Memon
Data-as-a-service is a company category that is less common than API-as-a-service, software-as-a-service, or platform-as-a-service. In order to vend data, a data-as-a-service provider needs to define how that data will be priced, stored, and delivered to users: streamed over an API or served via static files. Naqeeb Memon of SafeGraph joins the show to talk through the mechanics of delivering data-as-a-service.

May 11, 2022 • 42min
Data Labeling with Michael Malyuk
Data labeling allows machine learning algorithms to find patterns among the data. There are a variety of data labeling platforms that enable humans to apply labels to this data and ready it for algorithms. Heartex is a data labeling platform with an open source core. Michael Malyuk joins the show to talk through the platform and modern usage of data labeling systems.

May 9, 2022 • 44min
Pinot and StarTree with Chinmay Soman
Real-time analytics are difficult to achieve because large amounts of data must be integrated into a data set as that data streams in. As the world moved from batch analytics powered by Hadoop into a norm of “real-time” analytics, a variety of open source systems emerged. One of these was Apache Pinot. StarTree is a company based on Apache Pinot that provides fast, real-time data analytics. Chinmay Soman joins the show to discuss Apache Pinot in relation to other real-time analytics platforms, and what StarTree has built on top of Pinot.

Apr 29, 2022 • 41min
Data Loss Prevention with Yasir Ali
Data loss can occur when large data sources such as Slack or Google Drive get leaked. In order to detect and avoid leaks, a data asset graph can be built to understand the risks of a company environment. Polymer is a data loss prevention product that helps companies avoid problematic data leaks. Yasir Ali is the founder of Polymer and joins the show to talk about the engineering and product vision of Polymer.
Show notes:
polymerhq.io