

Data Science at Home
Francesco Gadaleta
Technology, AI, machine learning and algorithms. Come join the discussion on Discord!
https://discord.gg/4UNKGf3

Jul 1, 2021 • 32min
Pandas vs Rust [RB] (Ep. 158)
Sponsors
Get one of the best VPNs at a massive discount with coupon code DATASCIENCE. It gives you an 83% discount, which unlocks the best price in the market, plus 3 extra months for free.
Here is the link: https://surfshark.deals/DATASCIENCE

Jun 22, 2021 • 22min
A simple trick for very unbalanced data (Ep. 157)
Data from the real world are never perfectly balanced. In this episode I explain a simple yet effective trick to train models with very unbalanced data. Enjoy the show!
Sponsors
Get one of the best VPNs at a massive discount with coupon code DATASCIENCE. It gives you an 83% discount, which unlocks the best price in the market, plus 3 extra months for free. Here is the link: https://surfshark.deals/DATASCIENCE
References
Leo Breiman, Random Forests, 2001
C. Chen, A. Liaw, L. Breiman, Using Random Forest to Learn Imbalanced Data (2004)
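The notes above do not say which trick the episode covers. One technique described in the Chen, Liaw and Breiman reference is the balanced random forest, where each tree is trained on a bootstrap sample that draws the same number of cases from every class. A minimal sketch of that sampling step only, in plain Python, could look like the following (the function name `balanced_bootstrap` and the toy labels are illustrative assumptions, not taken from the episode):

```python
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one balanced bootstrap sample.

    Draws, with replacement, as many items from every class as the
    smallest class contains, so a tree trained on the sample sees
    all classes equally often.
    """
    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # The size of the rarest class sets the per-class draw count.
    n_minority = min(len(idxs) for idxs in by_class.values())

    sample = []
    for idxs in by_class.values():
        sample.extend(rng.choices(idxs, k=n_minority))
    return sample


# Toy data (hypothetical): 95 negatives and 5 positives.
labels = [0] * 95 + [1] * 5
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
counts = Counter(labels[i] for i in sample)
```

In a full forest this draw would be repeated once per tree, so the majority class is still covered across the ensemble even though each tree sees only a small slice of it.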

Jun 15, 2021 • 41min
Time to take your data back with Tapmydata (Ep. 156)
In this episode I am joined by Gilbert Hill, head of strategy at Tapmydata (https://tapmydata.com/).
We speak about personal data, blockchain, and the ability to control and monetize that data with another simple yet effective app in the ecosystem.
References
https://tapmydata.com/
https://medium.com/@tholder/we-dont-want-your-data-pushing-boundaries-in-data-collection-and-end-to-end-encryption-for-apps-ebd1d5f79df5

Jun 4, 2021 • 34min
True Machine Intelligence just like the human brain (Ep. 155)
In this episode I have a really interesting conversation with Karan Grewal, member of the research staff at Numenta where he investigates how biological principles of intelligence can be translated into silicon.
We speak about the thousand brains theory and why neural networks forget.
References
Main paper on the Thousand Brains Theory: https://www.frontiersin.org/articles/10.3389/fncir.2018.00121/full
Blog post on Thousand Brains Theory: https://numenta.com/blog/2019/01/16/the-thousand-brains-theory-of-intelligence/
GLOM paper by Geoff Hinton: https://arxiv.org/pdf/2102.12627.pdf
Why neural networks forget: https://numenta.com/blog/2021/02/04/why-neural-networks-forget-and-lessons-from-the-brain

May 26, 2021 • 43min
Delivering unstoppable data with Streamr (Ep. 154)
Delivering unstoppable data to unstoppable apps is now possible with the Streamr Network.
Streamr is a layer zero protocol for real-time data which powers the decentralized Streamr pub/sub network. The technology works in tandem with companion blockchains - currently Ethereum and xDai chain - which are used for identity, security and payments. On top is the application layer, including the Data Union framework, Marketplace and Core, and all third party applications.
In this episode I have a very interesting conversation with Streamr founder and CEO Henri Pihkala.
References
Streamr project website: https://streamr.network/
More about the Streamr Network: https://streamr.network/discover/network
More about Data Unions: https://streamr.network/discover/data-unions
More about the Data Marketplace: https://streamr.network/discover/marketplace
Developer docs: https://streamr.network/docs
Streamr Github: https://github.com/streamr-dev
Streamr Discord: https://discord.gg/gZAm8P7hK8
Streamr Twitter: https://twitter.com/streamr
Streamr YouTube: https://www.youtube.com/channel/UCGWEA61RueG-9DV53s-ZyJQ
Streamr Reddit: https://reddit.com/r/streamr
Scalability & latency research blog: https://blog.streamr.network/streamr-network-performance-and-scalability-whitepaper/
Swash, a Data Union built on Streamr: https://swashapp.io/

May 24, 2021 • 25min
MLOps: the good, the bad and the ugly (Ep. 153)
Our Sponsor
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

May 19, 2021 • 31min
MLOps: what is and why it is important Part 2 (Ep. 152)
Our Sponsor
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

May 11, 2021 • 33min
MLOps: what is and why it is important (Ep. 151)
If you think that knowing Tensorflow and Scikit-learn is enough, think again.
MLOps is one of those trendy terms today.
What is MLOps and why is it important?
In this episode I speak about the undeniable evolution of the data scientist in the last 5-10 years.
Sponsors
If building software is your passion, you’ll love ThoughtWorks Technology Podcast. It’s a podcast for techies by techies. Their team of experienced technologists take a deep dive into a tech topic that’s piqued their interest — it could be how machine learning is being used in astrophysics or maybe how to succeed at continuous delivery.
Amethix uses advanced Artificial Intelligence and Machine Learning to build data platforms and predictive engines in domains like finance, healthcare, pharmaceuticals, logistics, and energy. Amethix provides solutions to collect and secure data with higher transparency and disintermediation, and builds the statistical models that will support your business.

Apr 28, 2021 • 39min
Can I get paid for my data? With Mike Audi from MyTiki (Ep. 150)
Your data is worth thousands a year. Why aren’t you getting your fair share?
There is a company that has a mission: they want you to take back control and get paid for your data.
In this episode I speak about knowledge graphs, data confidentiality and privacy with Mike Audi, CEO of MyTiki.
You can reach them on their website https://mytiki.com/
Discord official channel
https://discord.com/invite/evjYQq48Be
Telegram
https://t.me/mytikiapp
Signal
https://signal.group/#CjQKIA66Eq2VHecpcCd-cu-dziozMRSH3EuQdcZJNyMOYNi5EhC0coWtjWzKQ1dDKEjMqhkP

Apr 19, 2021 • 26min
Building high-growth data businesses with Lillian Pierson (Ep. 149)
In this episode I have an amazing conversation with Lillian Pierson from data-mania.com.
This is an action-packed episode on how data professionals can quickly convert their data expertise into high-growth data businesses, all by selecting optimal business models, revenue models, and pricing structures.
If you want to know more or get in touch with Lillian, follow the links below:
Weekly Free Trainings: We currently publish 1 free training per week on YouTube! https://www.youtube.com/channel/UCK4MGP0A6lBjnQWAmcWBcKQ
Becoming World-Class Data Leaders and Data Entrepreneurs Facebook Group: https://www.facebook.com/groups/data.leaders.and.entrepreneurs
LinkedIn: https://www.linkedin.com/in/lillianpierson/
The Data Entrepreneur’s Toolkit: A recommendation set for 32 free (or low-cost) tools & processes that'll actually grow your data business (even if you still haven’t put up that website yet!). https://www.data-mania.com/data-entrepreneur-toolkit/