
DataTalks.Club
DataTalks.Club - the place to talk about data!
Latest episodes

Dec 2, 2022 • 54min
Teaching and Mentoring in Data Analytics - Irina Brudaru
We talked about:
Irina’s background
Irina as a mentor
Designing curriculum and program management at AI Guild
Other things Irina taught at AI Guild
Why Irina likes teaching
Students’ reluctance to learn cloud
Irina as a manager
Cohort analysis in a nutshell
How Irina started teaching formally
Irina’s diversity project in the works
How DataTalks.Club can attract more female students to the Zoomcamps
How to get technical feedback at work
Antipatterns and overrated/overhyped topics in data analytics
Advice for young women who want to get into data science/engineering
Finding Irina online
Fundamentals for data analysts
Suggestions for DataTalks.Club collaborations
Conclusions
Links:
LinkedIn Account: https://www.linkedin.com/in/irinabrudaru/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Nov 25, 2022 • 51min
Technical Writing and Data Journalism - Angelica Lo Duca
We talked about:
Angelica’s background
Angelica’s books
Data journalism
How Angelica got into data journalism
The field of digital humanities and Angelica’s data journalism course
Technical articles vs data journalism articles
Transforming reports into data storytelling
Are reports to stakeholders considered technical writing?
Data visualization in articles
Article length
The process of writing an article
Finding writing topics
How Angelica got into writing a book (communication with publishers)
The process for writing a book
Brainstorming
Reviews and revisions
Conclusion
Links:
Data Journalism examples (FENCED OUT): https://www.washingtonpost.com/graphics/world/border-barriers/europe-refugee-crisis-border-control/??noredirect=on
Data Journalism examples (La tierra esclava): https://latierraesclava.eldiario.es/
Small Medium publication aiming to be the Stack Overflow of Medium: https://medium.com/syntaxerrorpub
Example of a self-published book on Data Visualization: https://www.amazon.com/Introduction-Data-Visualization-Storytelling-Scientist-ebook/dp/B07VYCR3Z6/ref=sr_1_4?crid=4JRJ48O7K8TK&keywords=joses+berengueres&qid=1668270728&sprefix=joses+beremguere%2Caps%2C273&sr=8-4
My novels (in Italian) La bambina e il Clown: https://www.amazon.it/Bambina-Clown-Angelica-Lo-Duca/dp/1500984515/ref=sr_1_9?__mk_it_IT=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=2KGK9GMN0FAHI&keywords=la+bambina+e+il+clown&qid=1668270769&sprefix=la+bambina+e+il+clown%2Caps%2C88&sr=8-9
My novels (in Italian) Il Violinista: https://www.amazon.it/Violinista-1-Angelica-Lo-Duca/dp/1501009672/ref=sr_1_1?__mk_it_IT=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=12KTF9EF5UKIG&keywords=il+violinista+lo+duca&qid=1668270791&sprefix=il+violinista+lo+duca%2Caps%2C81&sr=8-1
Course on Data Journalism: https://www.coursera.org/learn/visualization-for-data-journalism
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Nov 18, 2022 • 47min
From Digital Marketing to Analytics Engineering - Nikola Maksimovic
We talked about:
Nikola’s background
Making the first steps towards a transition to BI and Analytics Engineering
Learning the skills necessary to transition to Analytics Engineering
The in-between period – from Marketing to Analytics Engineering
Nikola’s current responsibilities
Understanding what a Data Model is
Tools needed to work as an Analytics Engineer
The Analytics Engineering role over time
The importance of dbt for Analytics Engineers
Where can one learn about data modeling theory?
Going from Ancient Greek and Latin to understanding Data (Just-In-Time Learning)
The importance of domain knowledge in analytics engineering
Suggestions for those wishing to transition into analytics engineering
The importance of having a mentor when transitioning
Finding a mentor
Helpful newsletters and blogs
Finding Nikola online
Links:
Nikola's LinkedIn account: https://www.linkedin.com/in/nikola-maksimovic-40188183/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Nov 11, 2022 • 54min
Product Owners in Data Science - Anna Hannemann
We talked about:
About Anna and METRO
Anna’s background
The importance of a technical background for data product owners
What are product owners?
Product owners vs product managers
Anna’s work on recommender systems at METRO
Expanding the data team
Types of algorithms used for recommender systems
What kind of knowledge and skills data product owners need to have
Problems and ideas should come from the business
How Anna handles all her responsibilities
The process for starting work on new domains
Product portfolio management
ProductTank and Anna’s role in it
Anna’s resource recommendations
Links:
Data Science for Business Book: https://www.amazon.de/-/en/Foster-Provost/dp/1449361323/ref=sr_1_1?keywords=data+science+for+business&qid=1666404807&qu=eyJxc2MiOiIxLjg3IiwicXNhIjoiMS41MiIsInFzcCI6IjEuNDYifQ%3D%3D&sr=8-1
Article on Data Science Products: https://www.linkedin.com/pulse/way-create-data-science-products-lessons-learnt-anna-hannemann-phd/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Nov 4, 2022 • 50min
Building Data Science Practice - Andrey Shtylenko
We talked about:
Audience Poll
Andrey’s background
What data science practice is
Best DS practice in a traditional company vs IT-centric companies
Getting started with building data science practice (finding out who you report to)
Who the initiative comes from
Finding out what kind of problems you will be solving (Centralized approach)
Moving to a semi-decentralized approach
Resources to learn about data science practice
Pivoting from the role of a software engineer to data scientist
The most impactful realization from data science practice
Advice for individual growth
Finding Andrey online
Links:
Data Teams book: https://www.amazon.com/Data-Teams-Management-Successful-Data-Focused/dp/1484262271/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Oct 28, 2022 • 53min
Large-Scale Entity Resolution - Sonal Goyal
We talked about:
Sonal’s background
How the idea for Zingg came about
What Zingg is
The difference between entity resolution and identity resolution
How duplicate detection relates to entity resolution
How Sonal decided to start working on Zingg
How Zingg works
What Zingg runs on
Switching from consultancy to working on a new open source solution
Why Zingg is open source
Open source licensing
Working on Zingg initially vs now
Zingg’s current and future team
Sonal’s biggest current challenge
Avoiding problems with entity/identity resolution through database design
Identity resolution vs basic joins, data fusions, and fuzzy joins
Deterministic matching vs probabilistic machine learning
Identity and entity resolution applications for fraud detection
Graph algorithms vs classic ML in entity resolution
Identity resolution success stories
What Sonal would do differently given the chance to start over with Zingg
Advice for those seeking to realize their own solution to a data problem
Reading suggestion from Sonal
Conclusion
Links:
Open-Source Spotlight demo "Zingg": https://www.youtube.com/watch?v=zOabyZxN9b0
Creative Selection: Inside Apple's Design Process During the Golden Age of Steve Jobs book: https://www.amazon.com/Creative-Selection-Inside-Apples-Process/dp/1250194466
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Oct 21, 2022 • 51min
From Data Science to DataOps - Tomasz Hinc
We talked about:
Tomasz’s background
What Tomasz did before DataOps (Data Science)
Why Tomasz made the transition from Data science to DataOps
What is DataOps?
How is DataOps related to infrastructure?
How Tomasz learned the skills necessary to work in DataOps
Becoming comfortable with the terminal
The overlap between DataOps and Data Engineering
Suitable/useful skills for DataOps
Minimal operational skills for DataOps
Similarities between DataOps and Data Science Managers
Tomasz’s interesting projects
Confidence in results and avoiding going too deep with edge cases
Conclusion
Links:
Terminal setup video, 19 minutes long: https://www.youtube.com/watch?v=D2PSsnqgBiw
Command line videos, one and a half hours to become somewhat comfy with the terminal: https://www.youtube.com/playlist?list=PLIhvC56v63IKioClkSNDjW7iz-6TFvLwS
Course from MIT talking about just that (command line, git, storing secrets): https://missing.csail.mit.edu/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Oct 14, 2022 • 54min
Data Science Career Development - Katie Bauer
We talked about:
Katie’s background
What is a data scientist?
What is a data science manager?
Quality of the craft
How data leaders promote career growth
Supporting senior data professionals
Choosing the IC route vs the management route
Managing junior data professionals
Talking to senior stakeholders and PMs as a junior
The importance of hiring juniors
What skills do data science managers need to get hired?
How juniors that are just starting out can set themselves apart from the competition
Asking senior colleagues for help and the rubber duck channel
The challenges of the head of data
Conclusion
Links:
Jobs at GlossGenius: https://boards.greenhouse.io/glossgenius
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Oct 7, 2022 • 49min
From Testing Phones to Managing NLP Projects - Alvaro Navas Peire
We talked about:
Alvaro’s background
Working as a QA (Quality Assurance) engineer
Transitioning from QA to Machine Learning
Gathering knowledge about the ML field
Searching for an ML job (improving soft skills and CV)
Data science interview skills
Zoomcamp projects
Zoomcamp project deployment
How to not undersell yourself during interviews
Alvaro’s experience with interviews during his transition
Alvaro’s Zoomcamp notes
Alvaro’s coach
The importance of mathematical knowledge to a transition into ML
Preparing for technical interviews
Alvaro’s typical workday
Alvaro’s team’s tech stack
The importance of a technical background to transitioning into ML
Links:
Alvaro's CV: https://www.dropbox.com/s/89hkt3ug0toqa2n/CV%20nou%20-%20angl%C3%A8s.pdf?dl=0
Github profile: https://github.com/ziritrion
LinkedIn profile: https://www.linkedin.com/in/alvaronavas/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Sep 30, 2022 • 53min
Responsible and Explainable AI - Supreet Kaur
We talked about:
Supreet’s background
Responsible AI
Example of explainable AI
Responsible AI vs explainable AI
Explainable AI tools and frameworks (glass box approach)
Checking for bias in data and handling personal data
Understanding whether your company needs a certain type of data
Data quality checks and automation
Responsibility vs profitability
The human touch in AI
The trade-off between model complexity and explainability
Is completely automated AI out of the question?
Detecting model drift and overfitting
How Supreet became interested in explainable AI
Trustworthy AI
Reliability vs fairness
Bias indicators
The future of explainable AI
About DataBuzz
The diversity of data science roles
Ethics in data science
Conclusion
Links:
LinkedIn: https://www.linkedin.com/in/supreet-kaur1995/
Databuzz page: https://www.linkedin.com/company/databuzz-club/
Medium Blog Page: https://medium.com/@supreetkaur_66831
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html