DataTalks.Club

Latest episodes

Jul 30, 2021 • 58min

Humans in the Loop - Lina Weichbrodt

We talked about:
Lina’s background
What we need to remember when starting a project (checklists)
Make sure the problem is formalized and close to the core business
Get buy-in from stakeholders
Building trust with stakeholders
Don’t just focus on upsides – ask about concerns
Turning a concern into a metric
What happens when something goes wrong?
Post mortem reporting
Apply the 5 whys
If a lot of users say it’s a bug – it’s worth investigating
Post mortem format
Action points
Debugging vs explaining the model
Are there online versions of checklists?
Make sure to log your inputs
Talking to end-users and using your own service
Your ideas vs stakeholder ideas
Should data practitioners educate the team about data?
People skills and ‘dirty’ hacks
Where to find Lina
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Jul 23, 2021 • 1h 12min

Running from Complexity - Ben Wilson

We talked about:
Ben’s background
Building solutions for customers
Why projects don’t make it to production
Why do people choose overcomplicated solutions?
The dangers of isolating data science from the business unit
The importance of being able to explain things
Maximizing chances of making it into production
The IKEA effect
Risks of implementing novel algorithms
If it can be done simply – do that first
Don’t become the guinea pig for someone’s white paper
The importance of stat skills and coding skills
Structuring an agile team for ML work
Timeboxing research
Mentoring
Ben’s book
‘Uncool techniques’ at AI-First companies
Should managers learn data science?
Do data scientists need to specialize to be successful?
Links:
Ben's book: https://www.manning.com/books/machine-learning-engineering-in-action (get 35% off with code "ctwsummer21")
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Jul 16, 2021 • 58min

I Want to Build a Machine Learning Startup! - Elena Samuylova

We talked about:
Elena’s background
Why do a startup instead of being an employee?
Where to get ideas for your startup
Finding a co-founder
What should you consider before starting a startup?
Vertical startup vs infrastructure startup
‘AI First’ startups
Building tools for engineers
What skills do you need to start a startup?
Startup risks
How to be prepared to fail
Work-life balance
The part-time startup approach
Startup investment models
No resources and no technical expertise – what to do?
Productionizing your services
When to hire an expert
Talking to people with a problem before solving the problem
Starting Elena’s startup, Evidently
Elena’s role at Evidently
Why is Evidently open source?
“People will just copy my open source code. Should I be concerned?”
Bottom-up adoption
Creating value so that clients engage with your product
Is there a difference between countries when creating a startup?
Does open source mean the data is safer?
When should you hire engineers?
Following the market
Startups out of genuine interest vs Just for money and for fun
Links:
EvidentlyAI: https://evidentlyai.com/
Elena's LinkedIn: https://www.linkedin.com/in/elenasamuylova/
Elena's Twitter: https://twitter.com/elenasamuylova/
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Jul 9, 2021 • 1h 2min

Big Data Engineer vs Data Scientist - Roksolana Diachuk

Links:
Twitter: https://twitter.com/dead_flowers22
LinkedIn: https://www.linkedin.com/in/roksolanadiachuk/
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Jul 2, 2021 • 1h 2min

Build Your Own Data Pipeline - Andreas Kretz

We talked about:
Andreas’s background
Why data engineering is becoming more popular
Who to hire first – a data engineer or a data scientist?
How can I, as a data scientist, learn to build pipelines?
Don’t use too many tools
What is a data pipeline and why do we need it?
What is ingestion?
Can just one person build a data pipeline?
Approaches to building data pipelines for data scientists
Processing frameworks
Common setup for data pipelines – car price prediction
Productionizing the model with the help of a data pipeline
Scheduling
Orchestration
Start simple
Learning DevOps to implement data pipelines
How to choose the right tool
Are Hadoop, Docker, Cloud necessary for a first job/internship?
Is Hadoop still relevant or necessary?
Data engineering academy
How to pick up Cloud skills
Avoid huge datasets when learning
Convincing your employer to do data science
How to find Andreas
Links:
LinkedIn: https://www.linkedin.com/in/andreas-kretz
Data engineering cookbook: https://cookbook.learndataengineering.com/
Course: https://learndataengineering.com/
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Jun 25, 2021 • 60min

From Software Engineering to Machine Learning - Santiago Valdarrama

We talked about:
Santiago’s background
“Transitioning to ML” vs “Adding ML as a skill”
Getting over the fear of math for software developers
Learning by explaining
Seven lessons I learned about starting a career in machine learning
Lesson 1 – Take the first step
Lesson 2 – Learning is a marathon, not a sprint
Lesson 3 – If you want to go quickly, go alone. If you want to go far, go together.
Lesson 4 – Do something with the knowledge you gain
Lesson 5 – ML is not just math. Math is not scary.
Lesson 6 – Your ability to analyze a problem is the most important skill. Coding is secondary.
Lesson 7 – You don’t need to know every detail
Tools and frameworks needed to transition to machine learning
Problem-based learning vs Top-down learning
Learning resources
Santiago’s favorite books
Santiago’s course on transitioning to machine learning
Improving coding skills
Building solutions without machine learning
Becoming a better engineer
What is the difference between machine learning and data science?
Getting into machine learning - Reiteration
Getting past the math
Links:
Santiago's Twitter: https://twitter.com/svpino
Santiago's course: https://gumroad.com/svpino#kBjbC
Pinned tweet with a roadmap: https://twitter.com/svpino/status/1400798154732212230
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Jun 18, 2021 • 60min

Analytics Engineer: New Role in a Data Team - Victoria Perez Mola

Links:
https://www.notion.so/Analytics-Engineer-New-Role-in-a-Data-Team-9decbf33825c4580967cf3173eb77177
https://www.linkedin.com/in/victoriaperezmola/
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Conference: https://datatalks.club/conferences/2021-summer-marathon.html
Jun 11, 2021 • 58min

Data Governance - Jessi Ashdown, Uri Gilad

We talked about:
Jessi’s background
Uri’s background
Data governance
Implementing data governance: policies and processes
Reasons not to have data governance
Start with “why”
Cataloging and classifying our data
Let data work for you
The human component
Data quality
Defining policies
Implementing policies
Shopping-cart experience for requesting data
Proving the value of a data catalog
Using a data catalog
Data governance = data catalog?
Links:
Book: https://www.oreilly.com/library/view/data-governance-the/9781492063483/
Jessi’s LinkedIn: https://www.linkedin.com/in/jashdown/
Uri’s LinkedIn: https://linkedin.com/in/ugilad
Uri’s Twitter: https://twitter.com/ugilad
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Conference: https://datatalks.club/conferences/2021-summer-marathon.html
Jun 4, 2021 • 60min

What Data Scientists Don’t Mention in Their LinkedIn Profiles - Yury Kashnitsky

We talked about:
Yury’s background
Failing fast: Grammarly for science
Not failing fast: Keyword recommender
Four steps to epiphany
Lesson learned when bringing XGBoost into production
When data scientists try to be engineers
Joining a fintech startup: Doing NLP with thousands of GPUs
Working at a Telco company
Having too much freedom
The importance of digital presence
Work-life balance
Quantifying impact of failing projects on our CVs
Business trips to Perm: don’t work on the weekend
What doesn’t kill you makes you stronger
Links:
Yury's course: https://mlcourse.ai/
Yury's Twitter: https://twitter.com/ykashnitsky
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
May 28, 2021 • 1h

Becoming a Data-led Professional - Arpit Choudhury

We talked about:
Data-led academy
Arpit’s background
Growth marketing
Being data-led
Data-led vs data-driven
Documenting your data: creating a tracking plan
Understanding your data
Tools for creating a tracking plan
Data flow stages
Tracking events – examples
Collecting the data
Storing and analyzing the data
Data activation
Tools for data collection
Data warehouses
Reverse ETL tools
Customer data platforms
Modern data stack for growth
Buy vs build
People we need in the data flow
Data democratization
Motivating people to document data
Product-led vs data-led
Links:
https://dataled.academy/
Join our Slack: https://datatalks.club/slack.html
