
DataTalks.Club
DataTalks.Club - the place to talk about data!
Latest episodes

Nov 22, 2024 • 54min
Large Hadron Collider and Mentorship – Anastasia Karavdina
We talked about:
00:00 DataTalks.Club intro
00:00 Large Hadron Collider and Mentorship
02:35 Career overview and transition from physics to data science
07:02 Working at the Large Hadron Collider
09:19 How particles collide and the role of detectors
11:03 Data analysis challenges in particle physics and data science similarities
13:32 Team structure at the Large Hadron Collider
20:05 Explaining the connection between particle physics and data science
23:21 Software engineering practices in particle physics
26:11 Challenges during interviews for data science roles
29:30 Mentoring and offering advice to job seekers
40:03 The STAR method and its value in interviews
50:32 Paid vs unpaid mentorship and finding the right fit
About the speaker:
Anastasia is a particle physicist turned data scientist, with experience in large-scale experiments like those at the Large Hadron Collider. She also worked at Blue Yonder, scaling AI-driven solutions for global supply chain giants, and at Kaufland e-commerce, focusing on NLP and search. Anastasia is a mentor for ML/AI, dedicated to helping her mentees achieve their goals. She is passionate about growing the next generation of data science elite in Germany: from Data Analysts up to ML Engineers.
Join our Slack: https://datatalks.club/slack.html

Nov 8, 2024 • 56min
MLOps as a Team - Raphaël Hoogvliets
Raphaël Hoogvliets, a leader in MLOps with a background in data science, discusses his transition from sustainable agriculture to tech. He highlights Dutch agricultural challenges and the importance of addressing technical debt in MLOps. The conversation emphasizes the delicate balance between speed and quality in team dynamics. Key roles like tech translators are explored, alongside best practices such as data versioning. Raphaël also shares success and failure stories, underscoring the complexity of deploying machine learning in real-world environments.

Nov 1, 2024 • 46min
Using Data to Create Liveable Cities - Rachel Lim
We talked about:
00:00 DataTalks.Club intro
01:56 Using data to create livable cities
02:52 Rachel's career journey: from geography to urban data science
04:20 What does a transport scientist do?
05:34 Short-term and long-term transportation planning
06:14 Data sources for transportation planning in Singapore
08:38 Rachel's motivation for combining geography and data science
10:19 Urban design and its connection to geography
13:12 Defining a livable city
15:30 Livability of Singapore and urban planning
18:24 Role of data science in urban and transportation planning
20:31 Predicting travel patterns for future transportation needs
22:02 Data collection and processing in transportation systems
24:02 Use of real-time data for traffic management
27:06 Incorporating generative AI into data engineering
30:09 Data analysis for transportation policies
33:19 Technologies used in text-to-SQL projects
36:12 Handling large datasets and transportation data in Singapore
42:17 Generative AI applications beyond text-to-SQL
45:26 Publishing public data and maintaining privacy
45:52 Recommended datasets and projects for data engineering beginners
49:16 Recommended resources for learning urban data science
About the speaker:
Rachel is an urban data scientist dedicated to creating liveable cities through the innovative use of data. With a background in geography and a master's in urban data science, she blends qualitative and quantitative analysis to tackle urban challenges. Her aim is to integrate data-driven techniques with urban design to foster sustainable and equitable urban environments.
Links: - https://datamall.lta.gov.sg/content/datamall/en/dynamic-data.html
Join our Slack: https://datatalks.club/slack.html

Oct 26, 2024 • 54min
DataTalks.Club 4th Anniversary AMA Podcast – Alexey Grigorev and Johanna Bayer
We talked about:
00:00 DataTalks.Club intro
00:00 DataTalks.Club anniversary "Ask Me Anything" event with Alexey Grigorev
02:29 The founding of DataTalks.Club
03:52 Alexey's transition from Java work to DataTalks.Club
04:58 Growth and success of DataTalks.Club courses
12:04 Motivation behind creating a free-to-learn community
24:03 Staying updated in data science through pet projects
26:37 Hosting a second podcast and maintaining programming skills
28:56 Skepticism about LLMs and their relevance
31:53 Transitioning to DataTalks.Club and personal reflections
33:32 Memorable moments and the first event's success
36:19 Community building during the pandemic
38:31 AI's impact on data analysts and future roles
42:24 Discussion on AI in healthcare
44:37 Age and reflections on personal milestones
47:54 Building communities and personal connections
49:34 Future goals for the community and courses
51:18 Community involvement and engagement strategies
53:46 Ideas for competitions and hackathons
54:20 Inviting guests to the podcast
55:29 Course updates and future workshops
56:27 Podcast preparation and research process
58:30 Career opportunities in data science and transitioning fields
1:01:10 Book recommendations and personal reading experiences
About the speaker:
Alexey Grigorev is the founder of DataTalks.Club.
Join our slack: https://datatalks.club/slack.html

Oct 10, 2024 • 48min
Human-Centered AI for Disordered Speech Recognition - Katarzyna Foremniak
Katarzyna Foremniak, a seasoned computational linguist with over a decade of experience in NLP and speech recognition, shares her insights. She discusses the complexities of automatic speech recognition, particularly for disordered speech, and the challenges of articulation variability. With anecdotes about consonant clusters and amusing voice recognition mishaps in automotive systems, Kasia emphasizes the need for human-centered AI and personalized ASR models to enhance communication for diverse speech patterns.

Aug 15, 2024 • 54min
DataOps, Observability, and The Cure for Data Team Blues - Christopher Bergh
Christopher Bergh, a DataOps and observability expert, dives into the evolution and challenges of DataOps, drawing insights from his experiences at NASA and MIT. He underlines the need for a cultural shift towards automation and effective methodologies to boost data team satisfaction. The discussion also delves into optimizing data systems, navigating Kubernetes, and the importance of focusing on code versioning over data versioning. Throughout, Bergh emphasizes how culture plays a pivotal role in fostering collaboration and retaining talent within data teams.

Jul 26, 2024 • 53min
Working as a Core Developer in the Scikit-Learn Universe - Guillaume Lemaître
In this podcast episode, we talked with Guillaume Lemaître about navigating scikit-learn and imbalanced-learn.
🔗 CONNECT WITH Guillaume Lemaître
LinkedIn - https://www.linkedin.com/in/guillaume-lemaitre-b9404939/
Twitter - https://x.com/glemaitre58
Github - https://github.com/glemaitre
Website - https://glemaitre.github.io/
🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks-club.slack.com/join/shared_invite/zt-2hu0sjeic-ESN7uHt~aVWc8tD3PefSlA#/shared-invite/email
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - https://www.linkedin.com/company/datatalks-club/
Twitter - https://twitter.com/DataTalksClub
Website - https://datatalks.club/
🔗 CONNECT WITH ALEXEY
Twitter - https://twitter.com/Al_Grigor
Linkedin - https://www.linkedin.com/in/agrigorev/
🎙 ABOUT THE PODCAST
At DataTalksClub, we organize live podcasts that feature a diverse range of guests from the data field. Each podcast is a free-form conversation guided by a prepared set of questions, designed to learn about the guests’ career trajectories, life experiences, and practical advice. These insightful discussions draw on the expertise of data practitioners from various backgrounds.
We stream the podcasts on YouTube, where each session is also recorded and published on our channel, complete with timestamps, a transcript, and important links.
You can access all the podcast episodes here - https://datatalks.club/podcast.html
📚Check our free online courses
ML Engineering course - http://mlzoomcamp.com
Data Engineering course - https://github.com/DataTalksClub/data-engineering-zoomcamp
MLOps course - https://github.com/DataTalksClub/mlops-zoomcamp
Analytics in Stock Markets - https://github.com/DataTalksClub/stock-markets-analytics-zoomcamp
LLM course - https://github.com/DataTalksClub/llm-zoomcamp
Read about all our courses in one place - https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html
👋🏼 GET IN TOUCH
If you want to support our community, use this link - https://github.com/sponsors/alexeygrigorev
If you're a company and want to support us, contact us at alexey@datatalks.club

Jul 13, 2024 • 50min
Building a Domestic Risk Assessment Tool - Sabina Firtala
Links:
LinkedIn: https://www.linkedin.com/company/frontline100/
Ba Linh Le's LinkedIn: https://www.linkedin.com/in/ba-linh-le-/
Sabina's LinkedIn: https://www.linkedin.com/in/sabina-firtala/
Twitter: https://x.com/frontline_100?mx=2
Website: https://www.frontline100.com/
Free LLM course: https://github.com/DataTalksClub/llm-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Jul 6, 2024 • 38min
Berlin Buzzwords 2024

May 10, 2024 • 50min
Community Building and Teaching in AI & Tech - Erum Afzal
We talked about:
Erum's Background
Omdena Academy and Erum’s Role There
Omdena’s Community and Projects
Course Development and Structure at Omdena Academy
Student and Instructor Engagement
Engagement and Motivation
The Role of Teaching in Community Building
The Importance of Communities for Career Building
Advice for Aspiring Instructors and Freelancers
DS and ML Talent Market Saturation
Resources for Learning AI and Community Building
Erum’s Resource Recommendations
Links:
LinkedIn: https://www.linkedin.com/in/erum-afzal-64827b24/
Twitter: https://twitter.com/Erum55449739
Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html