

DataTalks.Club
DataTalks.Club - the place to talk about data!

Feb 11, 2023 • 52min
The Journey of a Data Generalist: From Bioinformatics to Freelancing - Jekaterina Kokatjuhha
We talked about:
Jekaterina’s background
How Jekaterina started freelancing
Jekaterina’s initial ways of getting freelancing clients
How being a generalist helped Jekaterina’s career
Connecting business and data
How Jekaterina’s LinkedIn posts helped her get clients
Jekaterina’s work in fundraising
Cohorts and KPIs
Improving communication between the data and business teams
Motivating every link in the company’s chain
The cons of freelancing
Balancing projects and networking
The importance of enjoying what you do
Growing the client base
Working in the office vs working remotely
Jekaterina’s advice for people who feel stuck
Jekaterina’s resource recommendations
Links:
Jekaterina's LinkedIn: https://www.linkedin.com/in/jekaterina-kokatjuhha/
Join DataTalks.Club: https://datatalks.club/slack.html

Feb 3, 2023 • 56min
Navigating Career Changes in Machine Learning - Chris Szafranek
We talked about:
Chris’s background
Switching careers multiple times
Freedom at companies
Chris’s role as an internal consultant
Chris’s sabbatical
ChatGPT
How being a generalist helped Chris in his career
The cons of being a generalist and the importance of T-shaped expertise
The importance of learning things you’re interested in
Tips to enjoy learning new things
Recruiting generalists
The job market for generalists vs for specialists
Narrowing down your interests
Chris’s book recommendations
Links:
Lex Fridman: science, philosophy, media, AI (especially earlier episodes): https://www.youtube.com/lexfridman
Andrej Karpathy, former Senior Director of AI at Tesla, who's now focused on teaching and sharing his knowledge: https://www.youtube.com/@AndrejKarpathy
Beautifully done videos on engineering of things in the real world: https://www.youtube.com/@RealEngineering
Chris' website: https://szafranek.net/
Zalando Tech Radar: https://opensource.zalando.com/tech-radar/
Modal Labs, new way of deploying code to the cloud, also useful for testing ML code on GPUs: https://modal.com
Excellent Twitter account to follow to learn more about prompt engineering for ChatGPT: https://twitter.com/goodside
Image prompts for Midjourney: https://twitter.com/GuyP
Machine Learning Workflows in Production - Krzysztof Szafranek: https://www.youtube.com/watch?v=CO4Gqd95j6k
From Data Science to DataOps: https://datatalks.club/podcast/s11e03-from-data-science-to-dataops.html
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Jan 27, 2023 • 54min
Preparing for a Data Science Interview - Luke Whipps
We talked about:
Luke’s background
Luke’s podcast - AI Game Changers
How Luke helps people get jobs
What’s changed in the recruitment market over the last 6 months
Getting ready for the interview process
Stage “zero” – the filter between the candidate and the company
Preparing for the introduction stage – research and communication
Reviewing the fundamentals during preparation
Preparing for the technical part of the interview
Establishing the hiring company’s expectations
Depth vs breadth
Overly theoretical and mathematical questions in interviews
Bombing (failing) in the middle of an interview
Applying to different roles within the same company
Luke’s resource recommendations
Links:
Luke's LinkedIn: https://www.linkedin.com/in/lukewhipps/
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Jan 20, 2023 • 51min
Indie Hacking - Pauline Clavelloux
We talked about:
Pauline’s background
Pauline’s work as a manager at IBM
What is indie hacking?
Pauline’s initial indie hacking projects
Getting ready for launch
Responsibilities and challenges in indie hacking
Pauline’s latest indie hacking project
Going live and marketing
Challenges with Unreal Me
Staying motivated with indie hacking projects
Skills Pauline picked up while doing indie hacking projects
Balancing a day job and indie hacking
Micro SaaS and AboutStartup.io
How Pauline comes up with ideas for projects
Going from an idea on paper to building a project
Pauline’s Twitter success
Connecting with Pauline online
Pauline’s indie hacking inspiration
Pauline’s resource recommendation
Links:
Website: https://wintopy.io/
Pauline's Twitter: https://twitter.com/Pauline_Cx
Pauline's LinkedIn: https://www.linkedin.com/in/paulineclavelloux/
Blog about indie hacking: https://aboutstartup.io
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Jan 13, 2023 • 50min
Doing Software Engineering in Academia - Johanna Bayer
We talked about:
Johanna’s background
Open science course and reproducible papers
Research software engineering
Convincing a professor to work on software instead of papers
The importance of reproducible analysis
Why academia is behind on software engineering
The problems with open science publishing in academia
The importance of standard coding practices
How Johanna got into research software engineering
Effective ways of learning software engineering skills
Providing data and analysis for your project
Johanna’s initial experience with software engineering in a project
Working with sensitive data and the nuances of publishing it
How often Johanna does hackathons, open source, and freelancing
Social media as a source of repos and Johanna’s favorite communities
Contributing to Git repos
Publishing in the open in academia vs industry
Johanna’s book and resource recommendations
Conclusion
Links:
The Society of Research Software Engineering, plus regional chapters: https://society-rse.org/
The RSE Association of Australia and New Zealand: https://rse-aunz.github.io/
Research Software Engineers (RSEs), the people behind research software: https://de-rse.org/en/index.html
The Software Sustainability Institute: https://www.software.ac.uk/
The Carpentries (beginner git and programming courses): https://carpentries.org/
The Turing Way Book of Reproducible Research: https://the-turing-way.netlify.app/welcome
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Jan 6, 2023 • 53min
Data-Centric AI - Marysia Winkels
We talked about:
Marysia’s background
What data-centric AI is
Data-centric Kaggle competitions
The mindset shift to data-centric AI
Data-centric does not mean you should not iterate on models
How to implement the data-centric approach
Focusing on the data vs focusing on the model
Resources to help implement the data-centric approach
Data-centric AI vs standard data cleaning
Making sure your data is representative
Knowing when your data is good enough
The importance of user feedback
“Shadow Mode” deployment
What to do if you have a lot of bad data or incomplete data
Marysia’s role at PyData
How Marysia joined PyData
The difference between PyData and PyCon
Finding Marysia online
Links:
Embetter & Bulk Demo: https://www.youtube.com/watch?v=L---nvDw9KU
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Dec 16, 2022 • 54min
Business Skills for Data Professionals - Loris Marini
We talked about:
Loris’ background
Transitioning from physics to data
Aligning people on concepts
Lead indicators and stickiness
Context, semantics, and meaning
Communication and being memorable
Making data digestible for business and building trust
The importance of understanding the language of business
Stakeholder mapping
Attending business meetings as a data professional
Organizing your stakeholder map
Prioritizing
How to support the business strategy
Learning to speak online
Resource recommendations from Loris
Links:
Discovering Data Discord server: https://bit.ly/discovering-data-discord
Loris' LinkedIn: https://www.linkedin.com/in/lorismarini/
Loris' Twitter: https://twitter.com/LorisMarini

Dec 9, 2022 • 53min
From Software Engineer to Data Science Manager - Sadat Anwar
We talked about:
Sadat’s background
Sadat’s backend engineering experience
Sadat’s pivot point as a backend engineer
Sadat’s exposure to ML and Data Science
Sadat’s Act Before You Think approach (with safety nets)
Sadat’s street cred and transition into management
The hiring process as an internal candidate
The importance of people management skills
The Brag List
The most difficult part of transitioning to management
Focusing on projects and setting milestones
Sadat’s transition from EM to data science management
How much domain knowledge is needed for management?
The main difference between engineering and management
How being an EM helped Sadat transition to DS management
Transitioning to DS management from other roles
How to feel accomplished as a manager
Sadat’s book recommendations
Sadat’s meetups
Links:
Sadat's Meetup page: https://www.meetup.com/berlin-search-technology-meetup/
Meetup event "Bias in AI: how to measure it and how to fix it": https://www.meetup.com/data-driven-ai-berlin-meetup/events/289927565/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Dec 2, 2022 • 54min
Teaching and Mentoring in Data Analytics - Irina Brudaru
We talked about:
Irina’s background
Irina as a mentor
Designing curriculum and program management at AI Guild
Other things Irina taught at AI Guild
Why Irina likes teaching
Students’ reluctance to learn cloud
Irina as a manager
Cohort analysis in a nutshell
How Irina started teaching formally
Irina’s diversity project in the works
How DataTalks.Club can attract more female students to the Zoomcamps
How to get technical feedback at work
Antipatterns and overrated/overhyped topics in data analytics
Advice for young women who want to get into data science/engineering
Finding Irina online
Fundamentals for data analysts
Suggestions for DataTalks.Club collaborations
Conclusions
Links:
LinkedIn Account: https://www.linkedin.com/in/irinabrudaru/
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Nov 25, 2022 • 51min
Technical Writing and Data Journalism - Angelica Lo Duca
We talked about:
Angelica’s background
Angelica’s books
Data journalism
How Angelica got into data journalism
The field of digital humanities and Angelica’s data journalism course
Technical articles vs data journalism articles
Transforming reports into data storytelling
Are reports to stakeholders considered technical writing?
Data visualization in articles
Article length
The process of writing an article
Finding writing topics
How Angelica got into writing a book (communication with publishers)
The process for writing a book
Brainstorming
Reviews and revisions
Conclusion
Links:
Data Journalism examples (FENCED OUT): https://www.washingtonpost.com/graphics/world/border-barriers/europe-refugee-crisis-border-control/??noredirect=on
Data Journalism examples (La tierra esclava): https://latierraesclava.eldiario.es/
Small Medium publication aiming to be the Stack Overflow of Medium: https://medium.com/syntaxerrorpub
Example of a self-published book on Data Visualization: https://www.amazon.com/Introduction-Data-Visualization-Storytelling-Scientist-ebook/dp/B07VYCR3Z6/ref=sr_1_4?crid=4JRJ48O7K8TK&keywords=joses+berengueres&qid=1668270728&sprefix=joses+beremguere%2Caps%2C273&sr=8-4
My novels (in Italian) La bambina e il Clown: https://www.amazon.it/Bambina-Clown-Angelica-Lo-Duca/dp/1500984515/ref=sr_1_9?__mk_it_IT=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=2KGK9GMN0FAHI&keywords=la+bambina+e+il+clown&qid=1668270769&sprefix=la+bambina+e+il+clown%2Caps%2C88&sr=8-9
My novels (in Italian) Il Violinista: https://www.amazon.it/Violinista-1-Angelica-Lo-Duca/dp/1501009672/ref=sr_1_1?__mk_it_IT=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=12KTF9EF5UKIG&keywords=il+violinista+lo+duca&qid=1668270791&sprefix=il+violinista+lo+duca%2Caps%2C81&sr=8-1
Course on Data Journalism: https://www.coursera.org/learn/visualization-for-data-journalism
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html


