Data Science at Home

Francesco Gadaleta
Apr 13, 2021 • 32min

Learning and training in AI times (Ep. 148)

Is there a gap between life sciences and data science? What's the situation when it comes to interdisciplinary research? In this episode I am with Laura Harris, Director of Training for the Institute for Cyber-Enabled Research (ICER) at Michigan State University (MSU), and we try to answer some of those questions.

You can contact Laura at training@msu.edu or on LinkedIn.
Apr 11, 2021 • 45min

You are the product [RB] (Ep. 147)

In this episode I am with George Hosu from Cerebralab, and we speak about how dangerous it is not to pay for the services you use and, as a consequence, how dangerous it is to let an algorithm decide what you like or not.

Our Sponsors
This episode is supported by Chapman’s Schmid College of Science and Technology, where master’s and PhD students join in cutting-edge research as they prepare to take the next big leap in their professional journey. To learn more about the innovative tools and collaborative approach that distinguish the Chapman program in Computational and Data Sciences, visit chapman.edu/datascience

If building software is your passion, you’ll love ThoughtWorks Technology Podcast. It’s a podcast for techies by techies. Their team of experienced technologists takes a deep dive into a tech topic that’s piqued their interest; it could be how machine learning is being used in astrophysics or maybe how to succeed at continuous delivery.

Links
https://cerebralab.com
https://www.eugenewei.com/blog/2019/2/19/status-as-a-service
Apr 8, 2021 • 33min

Polars: the fastest dataframe crate in Rust - with Ritchie Vink (Ep. 146)

Ritchie Vink, author of Polars, the fastest dataframe library in Rust, discusses his background in data science and how he started working on Polars. They also explore the challenges of designing a new data manipulation library, the significance of thread safety and parallelism in Rust, the use of Apache Arrow as a backend for communication, and the importance of SIMD instructions in optimizing operations.
Mar 26, 2021 • 30min

Apache Arrow, Ballista and Big Data in Rust with Andy Grove (Ep. 145)

Do you want to know the latest in big data analytics frameworks? Have you ever heard of Apache Arrow? Rust? Ballista? In this episode I speak with Andy Grove, one of the main authors of Apache Arrow and the Ballista compute engine. Andy explains some of the challenges he faced while designing the Arrow and Ballista memory models, and he describes some amazing solutions.

References
https://arrow.apache.org/
https://ballistacompute.org/
https://github.com/ballista-compute/ballista
Mar 19, 2021 • 32min

Pandas vs Rust (Ep. 144)

Pandas is the de facto standard for data loading and manipulation. Python is the de facto programming language for such operations. Rust is the underdog. Or is it? In this episode I show you why that is no longer the case.

Our Sponsors
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

Useful Links
https://github.com/haixuanTao/Data-Manipulation-Rust-Pandas
https://github.com/ritchie46/polars
https://github.com/rust-ndarray/ndarray
Mar 13, 2021 • 15min

Concurrent is not parallel - Part 2 (Ep. 143)

In plain English, concurrent and parallel are synonyms. Not for a CPU. And definitely not for programmers. In this episode I summarize the ways to parallelize on different architectures and operating systems. IMHO, rock-star data scientists must know how concurrency works and when to use it.

Useful Links
http://web.mit.edu/6.005/www/fa14/classes/17-concurrency/
https://doc.rust-lang.org/book/ch16-00-concurrency.html
https://urban-institute.medium.com/using-multiprocessing-to-make-python-code-faster-23ea5ef996ba
Mar 10, 2021 • 32min

Concurrent is not parallel - Part 1 (Ep. 142)

In plain English, concurrent and parallel are synonyms. Not for a CPU. And definitely not for programmers. In this episode I summarize the ways to parallelize on different architectures and operating systems. IMHO, rock-star data scientists must know how concurrency works and when to use it.
Mar 2, 2021 • 25min

Backend technologies for machine learning in production (Ep. 141)

This is one of the most dynamic and fascinating topics: API technologies for machine learning. It's always fun to build ML models. But how about serving them in the real world? In this episode I speak about three must-know technologies to place your model behind an API.
Feb 22, 2021 • 45min

You are the product (Ep. 140)

In this episode I am with George Hosu from Cerebralab, and we speak about how dangerous it is not to pay for the services you use and, as a consequence, how dangerous it is to let an algorithm decide what you like or not.

Links
https://cerebralab.com
https://www.eugenewei.com/blog/2019/2/19/status-as-a-service
Feb 15, 2021 • 37min

How to reinvent banking and finance with data and technology (Ep. 139)

The financial system is changing. It is becoming more efficient and integrated with many more services, making our life more... digital. Is the old banking system doomed to fail? Or will it just be disrupted by the smaller players of the fintech industry? In this episode we answer some of these fundamental questions with Alessandro E. Hatami from Pacemakers.

Subscribe to the newsletter and come chat with us on the official Discord channel.
