
Data Archives - Software Engineering Daily
Databases and data engineering episodes of Software Engineering Daily
Latest episodes

Nov 9, 2023 • 48min
Chronosphere with Martin Mao
Observability software helps teams to actively monitor and debug their systems, and these tools are increasingly vital in DevOps. However, it’s not uncommon for the volume of observability data to exceed the amount of actual business data. This creates two challenges – how to analyze the large stream of observability data, and how to keep down the compute and storage costs for that data.
Chronosphere is a popular observability platform that works by identifying the data that’s actually being used to power dashboards and metrics. It then shows the cost for each segment of data, and allows users to decide if a metric is worth that cost. In this way, technical teams can manage costs by dynamically adjusting which data is analyzed and stored. Martin Mao is the Co-founder and CEO of Chronosphere and he joins the podcast today to talk about the growing challenge of managing observability data, and the design of Chronosphere.
This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His best-selling book, Architecting for Scale (O’Reilly Media), is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments.
Lee is the host of his podcast, Modern Digital Business, an engaging and informative podcast produced for people looking to build and grow their digital business with the help of modern applications and processes developed for today’s fast-moving business environment. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com.
Please click here to see the transcript for this episode.
Sponsorship inquiries: sponsor@softwareengineeringdaily.com

Oct 24, 2023 • 47min
Streamlit with Amanda Kelly
The importance of data teams is undeniable. Most companies today use data to drive decision-making on anything from software feature development to product strategy, hiring and marketing. In some companies data is the product, which can make data teams even more vital. But there’s a common problem – analyzing data is hard and time-consuming. Lots of people have questions they want to answer with data, but data teams often don’t have the resources to move quickly. This can create a pernicious effect where organizations stop asking questions about their own data.
Amanda Kelly thinks a lot about data and the dynamics of data teams inside organizations. She’s worked at Google X, and on self-driving cars and cybersecurity. Her experiences on data teams inspired her to co-found Streamlit, which is an open source Python library that gives primitives to assemble a data app for rapid data visualization and interaction. Her goal was to accelerate the iteration loop to go from a question to a data-driven answer. Amanda is currently the COO of Streamlit and a Product Director at Snowflake, and she joins us today to talk all about data and how she’s building Streamlit.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to view this show’s transcript.

Oct 18, 2023 • 57min
Modern Web Scraping with Erez Naveh
Today it’s estimated there are over 1 billion websites on the internet. Much of this content is optimized to be viewed by human eyes, not consumed by machines. However, creating systems to automatically parse and structure the web greatly extends its utility, and paves the way for innovative solutions and applications. The industry of web scraping has emerged to do just that. However, many websites erect obstacles to hinder web scraping. This has created a new kind of arms race between developers and anti-scraping software.
Bright Data has developed some of the most sophisticated consumer tools available to scrape public web data. Erez Naveh is an entrepreneur and former engineer at Meta. He is currently the VP of Product at Bright Data. Erez joins us in this episode to talk about Bright Data’s mission to structure the open web, and the toolkit they’ve developed to make this possible.
Full Disclosure: Bright Data is a sponsor of Software Engineering Daily
Paweł is the founder of flat.social, the world’s first ‘flatverse’ start-up, and glot.space, an AI-powered language learning app. Paweł’s background is as a full-stack software engineer with a lean and experimental approach towards product development. With a strong grounding in computing science, he spent the last decade getting early-stage products off the ground – both in startup and corporate settings. Follow Paweł on Twitter, LinkedIn and his personal website – pawel.io.
Please click here to view this show’s transcript.

Oct 12, 2023 • 45min
Observability with Eduardo Silva
There are hundreds of observability companies out there, and many ways to think about observability, such as application performance monitoring, server monitoring, and tracing. In a production application, multiple tools are often needed to get proper visibility into the application. This creates some challenges. Applications can produce lots of different observability data, but how should the data be routed to the various downstream tools? In addition, how can data be selectively sent to different storage tiers to minimize costs?
Calyptia is a service that helps manage observability data from source to destination. Eduardo Silva is the founder and CEO of Calyptia and he joins us in this episode.
This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His best-selling book, Architecting for Scale (O’Reilly Media), is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments.
Lee is the host of his podcast, Modern Digital Business, an engaging and informative podcast produced for people looking to build and grow their digital business with the help of modern applications and processes developed for today’s fast-moving business environment. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com.
Please click here to view this show’s transcript.

Oct 5, 2023 • 30min
AI and Business Analytics with John Adams
John Adams, Co-founder and Chief Innovation Officer at Alembic, discusses the adoption of AI and its impact on business analytics. Topics include incorporating AI into the platform, analyzing marketing data, graph technology, privacy challenges, tracking impact, and the current state and limitations of AI.

Sep 7, 2023 • 36min
Highly Scalable NoSQL with Dor Laor
Dor Laor, Co-founder and CEO of ScyllaDB, shares insights on Scylla's impressive scalability and speed as a NoSQL database. He talks about the motivations behind creating Scylla, emphasizing the inefficiencies of current database systems. The discussion includes the advantages of NoSQL for handling massive datasets and how choosing the right database architecture can impact scalability. Laor also explores the convergence of SQL and NoSQL technologies and the influence of AI on future database strategies and customer expectations.

Aug 8, 2023 • 36min
Database Caching with Ben Hagan
Database caching is a fundamental challenge in database management and there are hundreds of techniques to satisfy different caching scenarios.
PolyScale is a fully automated database cache. It offers an innovative approach to database caching, leveraging AI and automated configuration to simplify the process of determining what should and should not be cached. Ben Hagan is the founder and CEO of PolyScale and he is our guest today.
Full disclosure: PolyScale is a sponsor of Software Engineering Daily.
This episode is hosted by Lee Atchison. Lee Atchison is a software architect, author, and thought leader on cloud computing and application modernization. His best-selling book, Architecting for Scale (O’Reilly Media), is an essential resource for technical teams looking to maintain high availability and manage risk in their cloud environments.
Lee is the host of his podcast, Modern Digital Business, an engaging and informative podcast produced for people looking to build and grow their digital business with the help of modern applications and processes developed for today’s fast-moving business environment. Listen at mdb.fm. Follow Lee at softwarearchitectureinsights.com, and see all his content at leeatchison.com.
Please click here to view this show’s transcript.

Jul 20, 2023 • 50min
Data-Centric AI with Alex Ratner
Companies have high hopes for machine learning and AI to support real-time product offerings, prevent fraud and drive innovation. But there is a catch – training models requires labeled data that machines can digest. As data volumes increase, the opportunity to get great ML results rises, but so does the problem of labeling all the data to get those results.
Enter Snorkel AI’s programmatic data labeling and MLOps platforms like Snorkel Flow. Today we are interviewing Alex Ratner, one of the founders of Snorkel AI. Snorkel AI evolved from research Alex led during his Ph.D. at Stanford, focused on programmatic data labeling to enable much faster and more accurate ML training and retraining.
Alex is a born teacher who always has enthusiasm for the topic. Today he will share the newest evolutions of the product at Snorkel, shed light on why doing ML well requires programmatic data labeling, and talk about foundation models, both in enterprise settings and in general.
Starting her career as a software developer, Jocelyn Houle is now a Senior Director of Product Management at Securiti.ai, a unified data protection and governance platform. Before that, she was an Operating Partner at Capital One Ventures investing in data and AI startups. Jocelyn has been a founder of two startups and a full life cycle, technical product manager at large companies like Fannie Mae, Microsoft and Capital One. Follow Jocelyn on LinkedIn or Twitter @jocelynbyrne.
Please click here to view this show’s transcript.

Jul 11, 2023 • 51min
Making Data-Driven Decisions with Soumyadeb Mitra
In this episode, Soumyadeb Mitra, the founder and CEO of RudderStack, discusses the importance of activating all your data, challenges of integrating different data sources, and building a data-driven culture. They also explore the concept of Customer Data Platforms (CDPs) and the benefits of cloud-native warehouse consolidation. Additionally, they delve into the intersection of generative AI and business, a real business case driving LTV for WISE with data, customer data privacy and compliance, and building better products in existing markets.

Jun 30, 2023 • 52min
Customer-facing Analytics with Tyler Wells
The state of data inside most companies is chaotic. It takes significant time and investment to tame this chaos. When you are a platform provider, you are gathering tons of data from the developers using your platform. These developers building products on your platform need insight into that data to better understand how their application is performing or to troubleshoot it. Most platform or SaaS application providers find it both difficult and expensive to build customer-facing analytics and data applications into their platforms. In fact, most companies don’t know what to do with the data they are gathering and continually postpone product roadmap features aimed at unlocking this data. This data can be a crucial part of the developer experience and can empower your customers. It can save you countless hours of handling support tickets, and increase overall stickiness on the platform.
Propel is a GraphQL API platform ideal for powering customer-facing analytics use cases, from customer dashboards and analytics APIs to product usage or in-product metrics.
Tyler Wells is Co-founder and CTO at Propel and he joins us today. We discuss how the customer-centric experiences at Twilio led his team to the journey they are on today.
Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.
Please click here to view this show’s transcript.