Data Archives - Software Engineering Daily

Jul 19, 2021 • 45min

Imply Infra: Big Data Analysis and Real-World Examples with Jad Naous

Big data analytics is the process of collecting data, processing and cleaning it, then analyzing it with techniques like data mining, predictive analytics, and deep learning. This process requires a suite of tools to operate efficiently. Data analytics can save companies money, drive product development, and give insight into the market and customers. The company Imply provides tools for running large-scale analytic workloads safely and efficiently. Their tools reduce operational complexity so that deploying on container clusters, bare metal, or public and hybrid clouds is simple. Infra also monitors the performance, uptime, and scale of those deployments. Imply also offers fully interactive visualizations that update with real-time data, along with plenty of other tools. In this episode, we talk with Jad Naous, VP of Engineering and Product at Imply, about big data analysis and real-world use cases. Sponsorship inquiries: sponsor@softwareengineeringdaily.com
Jul 15, 2021 • 52min

Better Stack: A New DevOps Experience with Juraj Masar

DevOps has shortened the development life cycle for countless applications and is embraced by companies around the world. But managing and monitoring multiple environments is still a major pain point, particularly when companies need to mix cloud and legacy systems. Knowing when services go down and quickly pinpointing the cause is essential for continuous development. The company Better Stack provides services for DevOps teams. Their first service, Logtail, provides SQL-compatible structured log management, so you can query logs the way you query a database. Their second service, Better Uptime, is an infrastructure monitoring platform that monitors uptime and, when issues occur, sends voice calls, SMS, and plenty of other types of alerts with screenshots and error logs of the incident. Juraj Masar, Co-Founder and CEO at Better Stack, joins us to discuss DevOps and Better Stack.
Jul 14, 2021 • 47min

Data Science on AWS: Implementing AI and ML Pipelines on AWS with Chris Fregly

Data science is an interdisciplinary field that combines strong technical skills with industry knowledge to perform a wide range of jobs. Data scientists answer business questions through hands-on work: cleaning and analyzing data, building machine learning models and applying algorithms, and generating dynamic visuals and tools for understanding the world through the data it generates. Amazon Web Services provides tools for storing data, moving it, analyzing it, and executing algorithms and models on it. In this episode we talk to Chris Fregly, author of the book Data Science on AWS. Chris works at AWS full time on AI and machine learning projects, and joins us to discuss his book and data science more broadly. What does it take to become a data scientist, who is his book for, and what are the latest advancements in the field?
Jul 12, 2021 • 59min

Data Lineage: Understanding Data Lineage at Scale with Julien Le Dem

Big Data has exploded over the past decade as cloud computing and more efficient hardware removed most practical limits on scaling. Products like Uber revolve entirely around analyzing data to provide rides. According to an EMC/IDC study, there was approximately 5.2TB of data for every person in 2020. That estimate was made before the transition to remote work, which likely makes the real figure much higher. The term “data lineage” refers to the collection, origin, storage, transfer, and use of data over time. Given the size of the Big Data industry and related industries, maintaining a thorough data lineage can be very difficult even within small companies, and it becomes especially challenging at scale. What innovative tools make understanding all this information possible? Can data really continue growing at this rate? In this episode we talk with Julien Le Dem, CTO and Co-Founder at Datakin. We discuss the challenges, available tools, and future of big data and data lineage.
Jul 3, 2021 • 46min

Text Blaze: Text Shortcuts with Scott Fortmann-Roe

There are over 4 billion people using email. Many people using email for business send quick questions to colleagues, send repetitive, template-based information to potential customers and freshly hired employees, and repeat a lot of the same phrases. We actually repeat phrases in a lot of written formats. How often do you copy and paste the same thing to multiple people? The company Text Blaze is making the workday a little faster, more productive, and more convenient with their shortcut-to-snippet software product. With Text Blaze you can save any snippet of text or template, including templates that need fill-in-the-blank sections, to a keyboard shortcut. Then type that shortcut in Gmail, Google Docs, LinkedIn, Salesforce, or wherever else you need to use your saved snippet. In this episode we talk to Scott Fortmann-Roe, CTO at Text Blaze.
Jul 2, 2021 • 50min

LayerCI with Colin Chartier

Continuous integration is a coding practice where engineers deliver incremental and frequent code changes to create higher quality software and collaborate more. Teams attempting to continuously integrate new code need a consistent and automated pipeline for reviewing, testing, and deploying the changes. Otherwise change requests pile up in the queue and nothing gets integrated efficiently. LayerCI is a platform built to deliver a better remote infrastructure experience. It enables engineers to preview full stack staging environments for every commit and have a centralized CI/CD stack with full end-to-end testing. LayerCI can duplicate a fully provisioned environment so that end-to-end workflows can run in parallel with each other and alongside unit tests. The result is faster code review, testing, and deployment.
Jul 1, 2021 • 52min

Meltano: ELT for DataOps with Douwe Maan

ELT is a process for copying data from a source system into a target system. It stands for “Extract, Load, Transform” and starts with extracting a copy of data from the source location. The copy is loaded into a target system such as a data warehouse, and there it is transformed into a usable format for things like modern cloud applications. The company Meltano provides code that manages ELT pipelines through an open-source, self-hosted, CLI-first, debuggable, and extensible process. Meltano projects manage your Singer tap and target configurations so you can easily select which entities and attributes to extract. These pipelines track their own incremental replication state so they can pick up where the previous run left off. Once your raw data is in its target system, Meltano helps you transform it into a usable format. These pipelines can run on a schedule and be fed to supported orchestrators like Apache Airflow. In this episode we talk to Douwe Maan, founder and CEO of Meltano, about their product-market fit and delivery plans.
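The extract, load, transform order and the idea of incremental replication state described above can be illustrated with a minimal sketch. This is not Meltano's or Singer's API: the names `SOURCE_ROWS`, `make_warehouse`, `run_pipeline`, and the `bookmark` key are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source records; a real pipeline would read from an API or database.
SOURCE_ROWS = [
    {"id": 1, "email": " Ada@Example.com ", "updated_at": 1},
    {"id": 2, "email": "bob@example.com", "updated_at": 2},
    {"id": 3, "email": "CAROL@example.com", "updated_at": 3},
]


def make_warehouse():
    """Create an in-memory SQLite database standing in for the warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_users (id INTEGER, email TEXT, updated_at INTEGER)"
    )
    return conn


def extract(rows, bookmark):
    """Extract: keep only rows changed since the saved bookmark."""
    return [r for r in rows if r["updated_at"] > bookmark]


def load(conn, rows):
    """Load: copy the rows into the warehouse unchanged."""
    conn.executemany(
        "INSERT INTO raw_users (id, email, updated_at) "
        "VALUES (:id, :email, :updated_at)",
        rows,
    )


def transform(conn):
    """Transform: rebuild a cleaned table inside the warehouse using SQL."""
    conn.execute("DROP TABLE IF EXISTS users_clean")
    conn.execute(
        "CREATE TABLE users_clean AS "
        "SELECT id, lower(trim(email)) AS email FROM raw_users"
    )


def run_pipeline(conn, rows, state):
    """Run one ELT pass and advance the bookmark kept in `state`."""
    new_rows = extract(rows, state.get("bookmark", 0))
    load(conn, new_rows)
    if new_rows:
        state["bookmark"] = max(r["updated_at"] for r in new_rows)
    transform(conn)
    return len(new_rows)
```

Calling `run_pipeline` twice with the same `state` dictionary loads only the rows that are newer than the bookmark on the second call, which is the "pick up where the previous run left off" behavior. The transform step runs after loading, inside the target, which is what distinguishes ELT from ETL.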
Jun 24, 2021 • 50min

Uber Data Science with Kevin Novak

Uber is one of many companies we’ve discussed on this show that have changed the world with big data analysis. With over 8 million users, 1 billion Uber trips, and people driving for Uber in over 400 cities and 66 countries, Uber has redefined an entire industry in a very short time frame. It’s difficult to find precise details about Uber’s big data infrastructure online, but we know they collect every possible data point about their drivers and riders. Matching riders and drivers, setting ride fares, and predicting demand for cars are some examples of what Uber does with its data. In this episode we talk with Kevin Novak about Uber’s data science. What are some key details about their data infrastructure? What can people expect in the future from their data methodologies? How did a tech conference in Paris turn into one of the fastest growing, highest valued startups in the world?
Jun 23, 2021 • 38min

Axiom Browser Automation with Yaseer Sheriff

The quantity and quality of a company’s data can mean the difference between a major success and a major failure. Companies like Google have used big data from their earliest days to steer their product suites in the direction consumers need. Other companies, like Apple, didn’t always use big data analytics to drive product design, but they do now. The company Axiom has created a large suite of advanced browser robots that perform difficult tasks like consolidating data across many web applications, extracting data from public sites or from behind logins, data entry, user interface automation, file management, and spreadsheet automation. These powerful tools enable people and businesses to collect valuable data to inform their decisions. In this episode we talk to Yaseer Sheriff, Co-Founder and CEO at Axiom. We discuss the value of big data, the opportunities their products enable, and how people can use their tools to improve their data collection practices.
Jun 17, 2021 • 56min

StreamSets: DataOps and Smart Pipelines with Arvind Prabhakar

The company StreamSets is enabling DataOps practices in today’s enterprises. StreamSets is a data engineering platform designed to help engineers design, deploy, and operate smart data pipelines. StreamSets Data Collector is a codeless solution for designing pipelines, triggering CDC operations, and monitoring data in flight. StreamSets Transformer uses Apache Spark to generate insights about your data across multiple platforms. Their Control Hub is the single hub for managing all of your data pipelines, data processing jobs, and execution engines. In this episode we talk to Arvind Prabhakar, CTO at StreamSets. Arvind is also an Official Member of the Forbes Technology Council, and a Member, PMC Chair/Member, Committer, Mentor, and Contributor to multiple projects with the Apache Software Foundation. He was previously a Director of Engineering at Cloudera, and a Software Architect at Informatica before that.
