
DataTalks.Club
DataTalks.Club - the place to talk about data!
Latest episodes

Oct 8, 2021 • 59min
Building and Leading Data Teams - Tammy Liang
We talked about:
Tammy’s background
Being the chief of data
First projects as the first data person in a company
Initial resistance
Expanding the team
Role of business analyst
Platanomelon’s stack
Order for growing the data team
Demand forecasting
Should analysts know machine learning
Qualifications for the first data person in a company
Providing accurate results
Receiving insights in a timely manner
Providing useful insights
Giving ownership to the team
Starting as the first data person in a company
Data For Future podcast
Supporting team members that are stuck
Finding Tammy online
Links:
Tammy's podcast: https://dataforfuture.org/
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

Oct 1, 2021 • 1h 2min
What Researchers and Engineers Can Learn from Each Other - Mihail Eric
We talked about:
Mihail’s background
NLP and self-driving vehicles
Transitioning from academia to the industry
Machine learning researchers
Finding open-ended problems
Machine learning engineers
Is data science more engineering or research?
What can engineers and researchers learn from one another?
Bridging the disconnect between researchers and engineers
Breaking down silos
Fluid roles
Full-stack data scientists
Advice to machine learning researchers
Advice to machine learning engineers
Reading papers
Choosing between engineering and research if you’re just starting
Confetti.ai
Links:
Mihail's Twitter: https://twitter.com/mihail_eric
Confetti.ai: http://confetti.ai/

Sep 24, 2021 • 59min
Introducing Data Science in Startups - Marianna Diachuk
We talked about:
Marianna’s background
Being the only data scientist
What should already be in the company
How much experience do you need
Identifying problems
Prioritization
What should the company already know?
First week
First month
First quarter
Managing expectations
Solving problems without ML
Project timelines
Finding the best solution
Evaluating performance
Getting stuck
Communicating with analysts
Transitioning from engineering to data science
Growing the team
Stopping projects
Questions for the company
From research to production
Wrapping up
Links:
Marianna's LinkedIn: https://www.linkedin.com/in/marianna-diachuk-53ba60116/

Sep 17, 2021 • 1h 3min
Defining Success: Metrics and KPIs - Adam Sroka
We talked about:
Adam’s background
Adam’s laser and data experience
Metrics and why we care about them
Examples of metrics
KPIs
KPI examples
Derived KPIs
Creating metrics: grocery store example
Metric efficiency
North Star metrics
Threshold metrics
Health metrics
Data team metrics
Experiments: treatment and control groups
Accelerate metrics and timeboxing
Links:
Domino's article about measuring value: http://blog.dominodatalab.com/measuring-data-science-business-value
Adam's article about skills useful for data scientists: https://towardsdatascience.com/how-to-apply-your-hard-earned-data-science-skillset-812585e3cc06
Adam's article about standing out: https://towardsdatascience.com/how-to-stand-out-as-a-great-data-scientist-in-2021-3b7a732114a9

Sep 11, 2021 • 1h
Making Sense of Data Engineering Acronyms and Buzzwords - Natalie Kwong
We talked about:
Natalie’s background
Airbyte
What is ETL?
Why ELT instead of ETL?
Transformations
How does ELT help analysts be more independent?
Data marts and Data warehouses
Ingestion DB
ETL vs ELT
Data lakes
Data swamps
Data governance
Ingestion layer vs Data lake
Do you need both a Data warehouse and a Data lake?
Airbyte and ELT
Modern data stack
Reverse ETL
Is drag-and-drop killing data engineering jobs?
Who is responsible for managing unused data?
CDC – Change Data Capture
Slowly changing dimension
Are there cases where ETL is preferable over ELT?
Why is Airbyte open source?
The case of Elasticsearch and AWS
Links:
Natalie's LinkedIn: https://www.linkedin.com/in/nataliekwong/
https://airbyte.io/blog/why-the-future-of-etl-is-not-elt-but-el

Sep 3, 2021 • 1h 2min
Mastering Algorithms and Data Structures - Marcello La Rocca
We talked about:
Learning algorithms and data structures
Resources for learning algorithms and data structures
Most important data structures
Learning the abstractions
Learning algorithms if they aren’t needed at work
Common mistakes when using the wrong data structures
Importance of data structures for data scientists
Marcello’s book - Advanced Algorithms and Data Structures
Bloom filters
Where Bloom filters are useful
Approximate nearest neighbours
Searching for most similar vectors
Knowing frameworks vs knowing internals of data structures
Serializing Bloom filters
Algorithmic problems in job interviews
Important data structures for data scientists and data engineers
Learning by doing
Importance of compiled languages for data scientists
Links:
Marcello's book: Advanced Algorithms and Data Structures http://mng.bz/eP79 (promo code for 35% discount: poddatatalks21)
MIT, Introduction to Algorithms: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
Algorithms specialization by Tim Roughgarden: https://www.coursera.org/specializations/algorithms

Aug 27, 2021 • 1h 2min
Chief Data Officer - Marco De Sa
We talked about:
Marco’s background
Role of CDO
Keeping track of many things
Becoming a CDO
Strategy vs tactics
VP of Data vs CDO
How many VPs of Data can there be?
Splitting the work between VP and CDO
Difference between CTO, CPO, and CDO
Breaking down the goals and working backwards from them
Assessing if we’re moving in the right direction
Dealing with many meetings
Being more effective
Building the data-driven culture
Challenges of working remotely
Does CDO need deep technical skills?
Importance of MBA
The key skills for becoming a CDO
Biggest challenges within OLX so far
Demonstrating CDO skills in a job interview
Overcoming resistance

Aug 20, 2021 • 1h 2min
Freelancing in Machine Learning - Mikio Braun
We talked about:
Mikio’s background
What Mikio helps with
Moving from a full-time job to freelancing
Finding clients and importance of a strong network
Building a network
Initial meetings with clients
Understanding what clients need
Template for the offer (Million Dollar Consulting)
Deciding on rate type: hourly, daily, per project
Taking vacations (and paying twice for them)
Avoiding overworking
Specializing: consulting as a product
Working full-time as a principal vs being a consultant
Is the overhead worth it?
Getting a new client when you already have a project
After freelancing: what’s next?
Output of Mikio’s work
Learning new things
Lessons learned after finding clients
Registering as a freelancer in Germany
Personal liability of a freelancer
Effect of globalization and remote work on consulting
Advice for people who want to start freelancing
Working full-time and freelancing at the same time
Books:
Million Dollar Consulting by Alan Weiss
Built to Sell by John Warrillow
Links:
Mikio's Twitter: https://twitter.com/mikiobraun
Mikio's LinkedIn: https://www.linkedin.com/in/mikiobraun/

Aug 13, 2021 • 1h 7min
Launching a Startup: From Idea to First Hire - Carmine Paolino
We talked about:
Carmine’s background
Carmine’s startup FreshFlow
Doing user research
Design thinking
Entrepreneur First
Finding co-founders: the “expertise edges” framework
The structure of the EF program
Coming up with the idea
How important is going through a startup accelerator?
Finding your first client
Finding investors
Consequences of having a bad investor
Splitting responsibilities between co-founders
Hiring
The importance of delegating
Making work attractive to hires
Plans for the future
Just-in-time supply chain
What would you have done differently?
Advice for people starting a startup
Don’t focus on skills only
Getting motivation
Am I ready for a startup?
Importance of a business school
Advice on finding a co-founder
Do I need EF if I already have an idea?
Having a prototype before the pitch
Books:
The Mom Test by Rob Fitzpatrick
Design Thinking by Robert Curedale
Links:
FreshFlow: https://freshflow.ai/
Carmine's LinkedIn: https://www.linkedin.com/in/carminepaolino
Carmine's Twitter: https://twitter.com/paolino

Aug 6, 2021 • 14min
Approach Learning as ML Project - Vladimir Finkelshtein [mini]
We don't have an episode lined up for this week, but we recorded a small chat with Vladimir some time ago. Enjoy it!
We talked about:
Vladimir's background
Learning by answering questions
Don't be afraid of being wrong
Winning books
Learning random things
Approach learning as a machine learning project
Links:
Vladimir on LinkedIn: https://www.linkedin.com/in/vladimir-finkelshtein/