DataTalks.Club

Latest episodes

Feb 14, 2025 • 53min

Competitive Machine Learning And Teaching – Alexander Guschin

Join Alexander Guschin, a Kaggle Grandmaster and seasoned Machine Learning Engineer, as he shares his journey from Moscow to teaching over 100K students. Discover how participating in Kaggle competitions can jumpstart a career in machine learning and the significance of teamwork versus solo efforts. Guschin emphasizes the value of community support, the evolution of teaching methods, and practical applications in education. He also explores the impact of generative AI and AutoML on competitive data science, offering insights on how to persuade management about Kaggle's benefits.
Jan 31, 2025 • 57min

Redefining AI Infrastructure: Open-Source, Chips, and the Future Beyond Kubernetes – Andrey Cheptsov

In this podcast episode, we talked with Andrey Cheptsov about the future of AI infrastructure.

About the Speaker:
Andrey Cheptsov is the founder and CEO of dstack, an open-source alternative to Kubernetes and Slurm, built to simplify the orchestration of AI infrastructure. Before dstack, Andrey worked at JetBrains for over a decade, helping different teams make the best developer tools.

During the event, Andrey discussed the complexities of AI infrastructure. We explored topics like the challenges of using Kubernetes for AI workloads, the need to rethink container orchestration, and the future of hybrid and cloud-only infrastructures. Andrey also shared insights into the role of on-premise and bare-metal solutions, edge computing, and federated learning.

00:00 Andrey's Career Journey: From JetBrains to DStack
5:00 The Motivation Behind DStack
7:00 Challenges in Machine Learning Infrastructure
10:00 Transitioning from Cloud to On-Prem Solutions
14:30 Reflections on OpenAI's Evolution
17:30 Open Source vs Proprietary Models: A Balanced Perspective
21:01 Monolithic vs. Decentralized AI businesses
22:05 The role of privacy and control in AI for industries like banking and healthcare
30:00 Challenges in training large AI models: GPUs and distributed systems
37:03 DeepSpeed's efficient training approach vs. brute force methods
39:00 Challenges for small and medium businesses: hosting and fine-tuning models
47:01 Managing Kubernetes challenges for AI teams
52:00 Hybrid vs. cloud-only infrastructure
56:03 On-premise vs. bare-metal solutions
58:05 Exploring edge computing and its challenges

🔗 CONNECT WITH ANDREY CHEPTSOV
Twitter - / andrey_cheptsov
LinkedIn - / andrey-cheptsov
GitHub - https://github.com/dstackai/dstack/
Website - https://dstack.ai/

🔗 CONNECT WITH DataTalksClub
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Datalike Substack - https://datalike.substack.com/
LinkedIn: / datatalks-club
Jan 17, 2025 • 53min

Linguistics and Fairness - Tamara Atanasoska

In this podcast episode, we talked with Tamara Atanasoska about building fair AI systems.

About the Speaker:
Tamara works on ML explainability, interpretability, and fairness as an Open Source Software Engineer at Probabl. She is a maintainer of fairlearn and a contributor to scikit-learn and skops. Tamara has a background in both computer science/software engineering and computational linguistics (NLP).

During the event, Tamara discussed her career journey from software engineering to open-source contributions, focusing on explainability in AI through scikit-learn and Fairlearn. She explored fairness in AI, including challenges in credit loans, hiring, and decision-making, and emphasized the importance of tools, human judgment, and collaboration. She also shared her involvement with PyLadies and encouraged contributions to Fairlearn.

00:00 Introduction to the event and the community
01:51 Topic introduction: Linguistic fairness and socio-technical perspectives in AI
02:37 Guest introduction: Tamara's background and career
03:18 Tamara's career journey: Software engineering, music tech, and computational linguistics
09:53 Tamara's background in language and computer science
14:52 Exploring fairness in AI and its impact on society
21:20 Fairness in AI models
26:21 Automating fairness analysis in models
32:32 Balancing technical and domain expertise in decision-making
37:13 The role of humans in the loop for fairness
40:02 Joining Probabl and working on open-source projects
46:20 The skops library and its integration with Hugging Face
50:48 PyLadies and community involvement
55:41 The ethos of scikit-learn and Fairlearn

🔗 CONNECT WITH TAMARA ATANASOSKA
LinkedIn - https://www.linkedin.com/in/tamaraatanasoska
GitHub - https://github.com/TamaraAtanasoska

🔗 CONNECT WITH DataTalksClub
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Datalike Substack - https://datalike.substack.com/
LinkedIn: / datatalks-club
Jan 10, 2025 • 55min

Career choices, transitions and promotions in and out of tech - Agita Jaunzeme

Agita Jaunzeme, a versatile professional with a background in DevOps/DataOps engineering, education, and community building, shares her unique career trajectory. She discusses her transition from art school to tech, tackling burnout, and the value of mentorship. Agita highlights her NGO aimed at promoting inclusivity and gender equality while exploring the differences between volunteer and employee management. The conversation delves into the significance of community in tech and the challenges faced by expatriates, combined with insights on fostering collaboration and innovative problem-solving in the industry.
Dec 13, 2024 • 55min

Career advice, learning, and featuring women in ML and AI - Isabella Bicalho

In this engaging conversation, Isabella Bicalho, a Machine Learning Engineer and passionate advocate for women in data science, shares her journey from academia to freelancing in AI. She discusses the vibrant AI scene in France, her experiences with open-source contributions, and the significance of mentorship. Isabella reveals her insights on balancing technical and soft skills, the pros and cons of freelancing vs. full-time work, and highlights her mission to showcase women's achievements in the tech field through her newsletter.
Dec 6, 2024 • 53min

AI in Industry: Trust, Return on Investment and Future - Maria Sukhareva

Maria Sukhareva, a Principal key expert in AI at Siemens, shares her 15 years of experience in generative AI. She discusses the journey of AI in industry, highlighting vulnerabilities and the importance of human oversight in chatbot deployments. Sukhareva emphasizes AI as a tool to enhance human capabilities rather than replace them. She also reflects on the evolution of the English language and the challenges of decoding ancient languages, underscoring the need for caution in using AI for linguistic research.
Nov 22, 2024 • 54min

Large Hadron Collider and Mentorship – Anastasia Karavdina

We talked about:

00:00 DataTalks.Club intro
00:00 Large Hadron Collider and Mentorship
02:35 Career overview and transition from physics to data science
07:02 Working at the Large Hadron Collider
09:19 How particles collide and the role of detectors
11:03 Data analysis challenges in particle physics and data science similarities
13:32 Team structure at the Large Hadron Collider
20:05 Explaining the connection between particle physics and data science
23:21 Software engineering practices in particle physics
26:11 Challenges during interviews for data science roles
29:30 Mentoring and offering advice to job seekers
40:03 The STAR method and its value in interviews
50:32 Paid vs unpaid mentorship and finding the right fit

About the speaker:
Anastasia is a particle physicist turned data scientist, with experience in large-scale experiments like those at the Large Hadron Collider. She also worked at Blue Yonder, scaling AI-driven solutions for global supply chain giants, and at Kaufland e-commerce, focusing on NLP and search. Anastasia is a mentor for ML/AI, dedicated to helping her mentees achieve their goals. She is passionate about growing the next generation of data science elite in Germany: from Data Analysts up to ML Engineers.

Join our Slack: https://datatalks.club/slack.html
Nov 8, 2024 • 56min

MLOps as a Team - Raphaël Hoogvliets

Raphaël Hoogvliets, a leader in MLOps with a background in data science, discusses his transition from sustainable agriculture to tech. He highlights Dutch agricultural challenges and the importance of addressing technical debt in MLOps. The conversation emphasizes the delicate balance between speed and quality in team dynamics. Key roles like tech translators are explored, alongside best practices such as data versioning. Raphaël also shares success and failure stories, underscoring the complexity of deploying machine learning in real-world environments.
Nov 1, 2024 • 46min

Using Data to Create Liveable Cities - Rachel Lim

We talked about:

00:00 DataTalks.Club intro
01:56 Using data to create livable cities
02:52 Rachel's career journey: from geography to urban data science
04:20 What does a transport scientist do?
05:34 Short-term and long-term transportation planning
06:14 Data sources for transportation planning in Singapore
08:38 Rachel's motivation for combining geography and data science
10:19 Urban design and its connection to geography
13:12 Defining a livable city
15:30 Livability of Singapore and urban planning
18:24 Role of data science in urban and transportation planning
20:31 Predicting travel patterns for future transportation needs
22:02 Data collection and processing in transportation systems
24:02 Use of real-time data for traffic management
27:06 Incorporating generative AI into data engineering
30:09 Data analysis for transportation policies
33:19 Technologies used in text-to-SQL projects
36:12 Handling large datasets and transportation data in Singapore
42:17 Generative AI applications beyond text-to-SQL
45:26 Publishing public data and maintaining privacy
45:52 Recommended datasets and projects for data engineering beginners
49:16 Recommended resources for learning urban data science

About the speaker:
Rachel is an urban data scientist dedicated to creating liveable cities through the innovative use of data. With a background in geography and a master's in urban data science, she blends qualitative and quantitative analysis to tackle urban challenges. Her aim is to integrate data-driven techniques with urban design to foster sustainable and equitable urban environments.

Links:
- https://datamall.lta.gov.sg/content/datamall/en/dynamic-data.html

Join our Slack: https://datatalks.club/slack.html
Oct 26, 2024 • 54min

DataTalks.Club 4th Anniversary AMA Podcast – Alexey Grigorev and Johanna Bayer

We talked about:

00:00 DataTalks.Club intro
00:00 DataTalks.Club anniversary "Ask Me Anything" event with Alexey Grigorev
02:29 The founding of DataTalks.Club
03:52 Alexey's transition from Java work to DataTalks.Club
04:58 Growth and success of DataTalks.Club courses
12:04 Motivation behind creating a free-to-learn community
24:03 Staying updated in data science through pet projects
26:37 Hosting a second podcast and maintaining programming skills
28:56 Skepticism about LLMs and their relevance
31:53 Transitioning to DataTalks.Club and personal reflections
33:32 Memorable moments and the first event's success
36:19 Community building during the pandemic
38:31 AI's impact on data analysts and future roles
42:24 Discussion on AI in healthcare
44:37 Age and reflections on personal milestones
47:54 Building communities and personal connections
49:34 Future goals for the community and courses
51:18 Community involvement and engagement strategies
53:46 Ideas for competitions and hackathons
54:20 Inviting guests to the podcast
55:29 Course updates and future workshops
56:27 Podcast preparation and research process
58:30 Career opportunities in data science and transitioning fields
1:01:10 Book recommendations and personal reading experiences

About the speaker:
Alexey Grigorev is the founder of DataTalks.Club.

Join our Slack: https://datatalks.club/slack.html
