DataTalks.Club

Oct 20, 2023 • 54min

Bridging Data Science and Healthcare - Eleni Stamatelou

Free ML Engineering course: http://mlzoomcamp.com
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Oct 12, 2023 • 58min

DataTalks.Club Anniversary Interview - Alexey Grigorev, Johanna Bayer

Free ML Engineering course: http://mlzoomcamp.com
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Oct 6, 2023 • 54min

Data Engineering for Fraud Prevention - Angela Ramirez

Angela Ramirez, a data engineer with experience in fraud prevention, talks about her career journey, the usefulness of knowing ML as a data engineer, best practices for system design and data engineering, working with different types of databases including document and network-based databases, and selecting the appropriate database type to work with. She also discusses the importance of software engineering knowledge in data engineering, data quality check tooling, debugging failed jobs, and working with external data sources.
Sep 29, 2023 • 57min

From Data Manager to Data Architect - Loïc Magnien

We talked about:
- Loïc's background
- Data management
- Loïc's transition to data engineer
- Challenges in the transition to data engineering
- What is a data architect?
- The output of a data architect's work
- Establishing metrics and dimensions
- The importance of communication
- Setting up best practices for the team
- Staying relevant and tech-watching
- Setting up specifications for a pipeline
- Be agile, create a POC, iterate ASAP, and build reusable templates
- Reaching out to Loïc for questions

Links:
- Loïc's LinkedIn: https://www.linkedin.com/in/loicmagnien/
- Free ML Engineering course: http://mlzoomcamp.com
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Sep 8, 2023 • 54min

Pragmatic and Standardized MLOps - Maria Vechtomova

We talked about:
- Maria's background
- Marvelous MLOps
- Maria's definition of MLOps
- Alternate team setups without a central MLOps team
- Pragmatic vs non-pragmatic MLOps
- Must-have ML tools (categories)
- Maturity assessment
- What to start with in MLOps
- Standardized MLOps
- Convincing DevOps to implement
- Understanding what the tools are used for instead of knowing all the tools
- Maria's next project plans
- Is LLM Ops a thing?
- What Ahold Delhaize does
- Resource recommendations to learn more about MLOps
- The importance of data engineering knowledge for ML engineers

Links:
- LinkedIn: https://www.linkedin.com/company/marvelous-mlops/
- Website: https://marvelousmlops.substack.com/
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Aug 25, 2023 • 56min

Democratizing Causality - Aleksander Molak

We talked about:
- Aleksander's background
- Aleksander as a Causal Ambassador
- Using causality to make decisions
- Counterfactuals and Judea Pearl
- Meta-learners vs classical ML models
- Average treatment effect
- Reducing causal bias, the super efficient estimator, and model uplifting
- Metrics for evaluating a causal model vs a traditional ML model
- Is the added complexity of a causal model worth implementing?
- Utilizing LLMs in causal models (text as outcome)
- Text as treatment and style extraction
- The viability of A/B tests in causal models
- Graphical structures and nonparametric identification
- Aleksander's resource recommendations

Links:
- The Book of Why: https://amzn.to/3OZpvBk
- Causal Inference and Discovery in Python: https://amzn.to/46Pperr
- Book's GitHub repo: https://github.com/PacktPublishing/Causal-Inference-and-Discovery-in-Python
- The Battle of Giants: Causality vs NLP (PyData Berlin 2023): https://www.youtube.com/watch?v=Bd1XtGZhnmw
- New Frontiers in Causal NLP (papers repo): https://bit.ly/3N0TFTL
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Aug 18, 2023 • 47min

Mastering Data Engineering as a Remote Worker - José María Sánchez Salas

Topics include moving from Spain to Norway, organizing the day as a remote worker, the company's expertise and data collection process, the challenges of finding a remote job in Norway, finding inspiration and writing on interesting topics, and the benefits and challenges of remote work as a data engineer.
Aug 4, 2023 • 51min

The Good, the Bad and the Ugly of GPT - Sandra Kublik

We talked about:
- Sandra's background
- Making a YouTube channel to break into the LLM space
- The business cases for LLMs
- LLMs as amplifiers
- The benefits of keeping a human in the loop when using LLMs (AI limitations)
- Using LLMs as assistants
- Building an app that uses an LLM
- Prompt whisperers and how to improve your prompts
- Sandra's 7-day LLM experiment
- Sandra's LLM content recommendations
- Finding Sandra online

Links:
- LinkedIn: https://www.linkedin.com/in/sandrakublik/
- Twitter: https://twitter.com/sandra_kublik
- YouTube: https://www.youtube.com/@sandra_kublik
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Jul 28, 2023 • 55min

LLMs for Everyone - Meryem Arik

We talked about:
- Meryem's background
- The constant evolution of startups
- How Meryem became interested in LLMs
- What is an LLM (generative vs non-generative models)?
- Why LLMs are important
- Open source models vs API models
- What TitanML does
- How fine-tuning a model helps in LLM use cases
- Fine-tuning generative models
- How generative models change the landscape of human work
- How to adjust models over time
- Vector databases and LLMs
- How to choose an open source LLM or an API
- Measuring input data quality
- Meryem's resource recommendations

Links:
- Website: https://www.titanml.co/
- Beta docs: https://titanml.gitbook.io/iris-documentation/overview/guide-to-titanml...
- Using llama2.0 in TitanML Blog: https://medium.com/@TitanML/the-easiest-way-to-fine-tune-and-inference-llama-2-0-8d8900a57d57
- Discord: https://discord.gg/83RmHTjZgf
- Meryem LinkedIn: https://www.linkedin.com/in/meryemarik/
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Jul 21, 2023 • 55min

Investing in Open-Source Data Tools - Bela Wiertz

Bela Wiertz, investor in open-source data tools, talks about the viability of open source as a go-to-market strategy, the differences between angel investors, VC funds, and family offices, and the use of GitHub stars as a metric for investment. They also discuss the future of open source, recent successes of open source companies, and Bela's resource recommendations.
