

Data Science at Home
Francesco Gadaleta
Cutting through AI bullsh*t. Come join the discussion on Discord! https://discord.gg/4UNKGf3
Episodes

Feb 15, 2023 • 29min
[RB] Online learning is better than batch, right? Wrong! (Ep. 216)
In this episode I speak about online learning systems and why blindly choosing such a paradigm can lead to very unpredictable and expensive outcomes.
Also in this episode, I have to deal with an intruder :)
Links
Birman, K.; Joseph, T. (1987). "Exploiting virtual synchrony in distributed systems". Proceedings of the Eleventh ACM Symposium on Operating Systems Principles - SOSP '87. pp. 123–138. doi:10.1145/41457.37515. ISBN 089791242X. S2CID 7739589.

Jan 26, 2023 • 31min
Chatting with ChatGPT: Pros and Cons of Advanced Language AI (Ep. 215)
In this episode, I'll be discussing the capabilities and limitations of ChatGPT, an advanced language AI model. I'll go over its power to understand and respond to natural language, and its applications in tasks such as language translation and text summarization.
However, I'll also touch on the challenges that still need to be overcome such as bias and data privacy concerns.
Tune in for a comprehensive look at the current state of advanced language AI.
References
https://datascienceathome.com/have-you-met-shannon-conversation-with-jimmy-soni-and-rob-goodman-about-one-of-the-greatest-minds-in-history/

Jan 14, 2023 • 42min
Accelerating Perception Development with Synthetic Data (Ep. 214)
In this episode I am with Kevin McNamara, founder and CEO of Parallel Domain. We speak about a very effective method to generate synthetic data that is currently in production at Parallel Domain.
Enjoy the show!
References
Parallel Domain Synthetic Data Improves Cyclist Detection (blog post):
https://paralleldomain.com/parallel-domain-synthetic-data-improves-cyclist-detection/
Beating the State of the Art in Object Tracking with Synthetic Data:
https://paralleldomain.com/beating-the-state-of-the-art-in-object-tracking-with-synthetic-data/
Parallel Domain Open Synthetic Dataset:
https://paralleldomain.com/open-datasets/bicycle-detection
How Toyota Research Institute Trains Better Computer Vision Models with PD Synthetic Data (interview):
https://www.youtube.com/watch?v=QIYttoVxf2w
Career Opportunities:
https://paralleldomain.com/careers

Dec 13, 2022 • 21min
Edge AI applications for military and space [RB] (Ep. 213)
Our Sponsors
NordPass Business has developed a password manager that will save you a lot of time and energy whenever you need access to business accounts, work across devices, even with the other members of your team, or whenever you need to share sensitive data with your colleagues, or make payments efficiently. All this with the highest standard of cyber secure technology.
See NordPass Business in action now with a 3-month free trial here
https://nordpass.com/DATASCIENCE with code DATASCIENCE
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

Dec 8, 2022 • 23min
From image to 3D model (Ep. 212)
Is it possible to reconstruct a 3D model from a simple image?
Under certain constraints, it is!
In this episode I tell you how.
Our Sponsors
Explore the Complex World of Regulations. Compliance can be overwhelming. Multiple frameworks. Overlapping requirements. Let Arctic Wolf be your guide.
Check it out at https://arcticwolf.com/datascience
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.
References
https://github.com/isl-org/Open3D
https://huggingface.co/docs/transformers/model_doc/glpn
https://arxiv.org/abs/2201.07436

Dec 2, 2022 • 24min
Machine learning is physics (Ep. 211)
What if we borrowed some theories from physics to interpret deep learning, and machine learning in general?
Here is a list of plausible ways to interpret our beloved ML models and understand why they work, or why they don't.
Enjoy the show!
Our Sponsors
NordPass Business has developed a password manager that will save you a lot of time and energy whenever you need access to business accounts, work across devices, even with the other members of your team, or whenever you need to share sensitive data with your colleagues, or make payments efficiently. All this with the highest standard of cyber secure technology.
See NordPass Business in action now with a 3-month free trial here
https://nordpass.com/DATASCIENCE with code DATASCIENCE
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

Nov 21, 2022 • 35min
Autonomous cars cannot drive. Here is why. (Ep. 210)
If you think that the problem of self-driving cars has been solved, think twice.
As a matter of fact, the problem of self-driving cars cannot be solved with the technical solutions that companies are currently considering.
Don't get fooled by marketing and PR on social media. Whoever tells you they have solved the problem of driving a vehicle fully autonomously is lying.
Here is why.
Our Sponsors
Explore the Complex World of Regulations. Compliance can be overwhelming. Multiple frameworks. Overlapping requirements. Let Arctic Wolf be your guide.
Check it out at https://arcticwolf.com/datascience
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

Nov 8, 2022 • 18min
Evolution of data platforms (Ep. 209)
Let's look at the history of data platforms. How did they evolve? Why?
Shall I switch to the latest architecture?
Enjoy the show!
Our Sponsors
Explore the Complex World of Regulations. Compliance can be overwhelming. Multiple frameworks. Overlapping requirements. Let Arctic Wolf be your guide.
Check it out at https://arcticwolf.com/datascience
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

Nov 2, 2022 • 20min
[RB] Is studying AI in academia a waste of time? (Ep. 208)
Companies and other business entities are actively involved in defining data products and applied research every year. Academia has always played a role in creating new methods and solutions/algorithms in the fields of machine learning and artificial intelligence.
However, there is doubt about how powerful and effective such research efforts are.
Is studying AI in academia a waste of time?
Our Sponsors
Ready to advance your career in data science? University of Cincinnati Online offers nationally recognized educational programs in business analytics and information systems. Predictive Analytics Today named UC the No. 1 MS Data Science school in the country, and UC is nationally recognized, with a proven track record of placing students at high-profile companies such as Google, Amazon and P&G.
Discover more about the University of Cincinnati’s 100% online master’s degree programs at online.uc.edu/obais
Amethix works to create and maximize the impact of the world’s leading corporations and startups, so they can create a better future for everyone they serve. We provide solutions in AI/ML, Fintech, Healthcare/RWE, and Predictive maintenance.

Oct 25, 2022 • 27min
Private machine learning done right (Ep. 207)
There are many solutions to private machine learning. I am pretty confident when I say that the one we are speaking about in this episode is probably one of the most feasible and reliable.
I am with Daniel Huynh, CEO of Mithril Security, a graduate of Ecole Polytechnique with a specialisation in AI and data science. He worked at Microsoft on Privacy Enhancing Technologies under the office of the CTO of Microsoft France. He has written articles on homomorphic encryption with the CKKS explained series (https://blog.openmined.org/ckks-explained-part-1-simple-encoding-and-decoding/). He is now focusing on Confidential Computing at Mithril Security and has written extensive articles on the topic: https://blog.mithrilsecurity.io/.
In this show we speak about confidential computing, SGX and private machine learning.
References
Mithril Security: https://www.mithrilsecurity.io/
BlindAI GitHub: https://github.com/mithril-security/blindai
Use cases for BlindAI:
Deploy Transformers models with confidentiality: https://blog.mithrilsecurity.io/transformers-with-confidentiality/
Confidential medical image analysis with COVID-Net and BlindAI: https://blog.mithrilsecurity.io/confidential-covidnet-with-blindai/
Build a privacy-by-design voice assistant with BlindAI: https://blog.mithrilsecurity.io/privacy-voice-ai-with-blindai/
Confidential Computing Explained: https://blog.mithrilsecurity.io/confidential-computing-explained-part-1-introduction/
Confidential Computing Consortium: https://confidentialcomputing.io/
Confidential Computing White Papers: https://confidentialcomputing.io/white-papers-reports/
List of Intel processors with Intel SGX:
https://www.intel.com/content/www/us/en/support/articles/000028173/processors.html
https://github.com/ayeks/SGX-hardware
Azure Confidential Computing VMs with SGX:
Azure Docs: https://docs.microsoft.com/en-us/azure/confidential-computing/confidential-computing-enclaves
How to deploy BlindAI on Azure: https://docs.mithrilsecurity.io/getting-started/cloud-deployment/azure-dcsv3
Confidential Computing 101: https://www.youtube.com/watch?v=77U12Ss38Zc
Rust: https://www.rust-lang.org/
ONNX: https://github.com/onnx/onnx
Tract, a Rust inference engine for ONNX models: https://github.com/sonos/tract


