DataTalks.Club

Sep 8, 2023 • 54min

Pragmatic and Standardized MLOps - Maria Vechtomova

We talked about:
- Maria's background
- Marvelous MLOps
- Maria's definition of MLOps
- Alternate team setups without a central MLOps team
- Pragmatic vs non-pragmatic MLOps
- Must-have ML tools (categories)
- Maturity assessment
- What to start with in MLOps
- Standardized MLOps
- Convincing DevOps to implement
- Understanding what the tools are used for instead of knowing all the tools
- Maria's next project plans
- Is LLM Ops a thing?
- What Ahold Delhaize does
- Resource recommendations to learn more about MLOps
- The importance of data engineering knowledge for ML engineers

Links:
- LinkedIn: https://www.linkedin.com/company/marvelous-mlops/
- Website: https://marvelousmlops.substack.com/
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Aug 25, 2023 • 56min

Democratizing Causality - Aleksander Molak

We talked about:
- Aleksander's background
- Aleksander as a Causal Ambassador
- Using causality to make decisions
- Counterfactuals and Judea Pearl
- Meta-learners vs classical ML models
- Average treatment effect
- Reducing causal bias, the super efficient estimator, and model uplifting
- Metrics for evaluating a causal model vs a traditional ML model
- Is the added complexity of a causal model worth implementing?
- Utilizing LLMs in causal models (text as outcome)
- Text as treatment and style extraction
- The viability of A/B tests in causal models
- Graphical structures and nonparametric identification
- Aleksander's resource recommendations

Links:
- The Book of Why: https://amzn.to/3OZpvBk
- Causal Inference and Discovery in Python: https://amzn.to/46Pperr
- Book's GitHub repo: https://github.com/PacktPublishing/Causal-Inference-and-Discovery-in-Python
- The Battle of Giants: Causality vs NLP (PyData Berlin 2023): https://www.youtube.com/watch?v=Bd1XtGZhnmw
- New Frontiers in Causal NLP (papers repo): https://bit.ly/3N0TFTL
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Aug 18, 2023 • 47min

Mastering Data Engineering as a Remote Worker - José María Sánchez Salas

Topics include moving from Spain to Norway, organizing the day as a remote worker, company's expertise and data collection process, challenges of finding a remote job in Norway, finding inspiration and writing interesting topics, benefits and challenges of remote work as a data engineer.
Aug 4, 2023 • 51min

The Good, the Bad and the Ugly of GPT - Sandra Kublik

We talked about:
- Sandra's background
- Making a YouTube channel to break into the LLM space
- The business cases for LLMs
- LLMs as amplifiers
- The benefits of keeping a human in the loop when using LLMs (AI limitations)
- Using LLMs as assistants
- Building an app that uses an LLM
- Prompt whisperers and how to improve your prompts
- Sandra's 7-day LLM experiment
- Sandra's LLM content recommendations
- Finding Sandra online

Links:
- LinkedIn: https://www.linkedin.com/in/sandrakublik/
- Twitter: https://twitter.com/sandra_kublik
- YouTube: https://www.youtube.com/@sandra_kublik
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Jul 28, 2023 • 55min

LLMs for Everyone - Meryem Arik

We talked about:
- Meryem's background
- The constant evolution of startups
- How Meryem became interested in LLMs
- What is an LLM (generative vs non-generative models)?
- Why LLMs are important
- Open source models vs API models
- What TitanML does
- How fine-tuning a model helps in LLM use cases
- Fine-tuning generative models
- How generative models change the landscape of human work
- How to adjust models over time
- Vector databases and LLMs
- How to choose an open source LLM or an API
- Measuring input data quality
- Meryem's resource recommendations

Links:
- Website: https://www.titanml.co/
- Beta docs: https://titanml.gitbook.io/iris-documentation/overview/guide-to-titanml...
- Using llama2.0 in TitanML Blog: https://medium.com/@TitanML/the-easiest-way-to-fine-tune-and-inference-llama-2-0-8d8900a57d57
- Discord: https://discord.gg/83RmHTjZgf
- Meryem LinkedIn: https://www.linkedin.com/in/meryemarik/
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Jul 21, 2023 • 55min

Investing in Open-Source Data Tools - Bela Wiertz

Bela Wiertz, investor in open-source data tools, talks about the viability of open source as a go-to-market strategy, the differences between angel investors, VC funds, and family offices, and the use of GitHub stars as a metric for investment. They also discuss the future of open source, recent successes of open source companies, and Bela's resource recommendations.
Jul 14, 2023 • 51min

Why Machine Learning Design is Broken - Valerii Babushkin

Links:
- Book: https://www.manning.com/books/machine-learning-system-design?utm_source=AGMLBookcamp&utm_medium=affiliate&utm_campaign=book_babushkin_machine_4_25_23&utm_content=twitter
- Discount: poddatatalks21 (35% off)
- Evidently: https://www.evidentlyai.com/
- Article: https://medium.com/people-ai-engineering/design-documents-for-ml-models-bbcd30402ff7
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Jul 7, 2023 • 53min

Interpretable AI and ML - Polina Mosolova

We talked about:
- Polina's background
- How common it is for PhD students to build ML pipelines end-to-end
- Simultaneous PhD and industry experience
- Support from both the academic and industry sides
- How common the industrial PhD setup is and how to get into one
- Organizational trust theory
- How price relates to trust
- How trust relates to explainability
- The importance of actionability
- Explainability vs interpretability vs actionability
- Complex glass box models
- Does the explainability of a model follow explainability?
- What explainable AI brings to customers and end users
- Can all trust be turned into a KPI?

Links:
- LinkedIn: https://www.linkedin.com/in/polina-mosolova/
- Neural Additive Models paper: https://proceedings.neurips.cc/paper/2021/file/251bd0442dfcc53b5a761e050f8022b8-Paper.pdf
- Neural Basis Model paper: https://arxiv.org/pdf/2205.14120.pdf
- Interpretable Feature Spaces paper: https://kdd.org/exploration_files/vol24issue1_1._Interpretable_Feature_Spaces_revised.pdf
Jun 30, 2023 • 54min

From Scratch to Success: Building an MLOps Team and ML Platform - Simon Stiebellehner

We talked about:
- Simon's background
- What MLOps is and what it isn't
- Skills needed to build an ML platform that serves 100s of models
- Ranking the importance of skills
- The point where you should think about building an ML platform
- The importance of processes in ML platforms
- Weighing your options with SaaS platforms
- The exploratory setup, experiment tracking, and model registry
- What comes after deployment?
- Stitching tools together to create an ML platform
- Keeping data governance in mind when building a platform
- What comes first – the model or the platform?
- Do MLOps engineers need to have deep knowledge of how models work?
- Is API design important for MLOps?
- Simon's recommendations for furthering MLOps knowledge

Links:
- LinkedIn: https://www.linkedin.com/in/simonstiebellehner/
- Github: https://github.com/stiebels
- Medium: https://medium.com/@sistel
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
Jun 23, 2023 • 53min

From MLOps to DataOps - Santona Tuli

We talked about:
- Santona's background
- Focusing on data workflows
- Upsolver vs DBT
- ML pipelines vs Data pipelines
- MLOps vs DataOps
- Tools used for data pipelines and ML pipelines
- The “modern data stack” and today's data ecosystem
- Staging the data and the concept of a “lakehouse”
- Transforming the data after staging
- What happens after the modeling phase
- Human-centric vs Machine-centric pipeline
- Applying skills learned in academia to ML engineering
- Crafting user personas based on real stories
- A framework of curiosity
- Santona's book and resource recommendations

Links:
- LinkedIn: https://www.linkedin.com/in/santona-tuli/
- Upsolver website: upsolver.com
- Why we built a SQL-based solution to unify batch and stream workflows: https://www.upsolver.com/blog/why-we-built-a-sql-based-solution-to-unify-batch-and-stream-workflows
- Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
- Join DataTalks.Club: https://datatalks.club/slack.html
- Our events: https://datatalks.club/events.html
