DataTalks.Club cover image

DataTalks.Club

Latest episodes

undefined
Aug 18, 2023 • 47min

Mastering Data Engineering as a Remote Worker - José María Sánchez Salas

Topics include moving from Spain to Norway, organizing the day as a remote worker, company's expertise and data collection process, challenges of finding a remote job in Norway, finding inspiration and writing interesting topics, benefits and challenges of remote work as a data engineer.
undefined
Aug 4, 2023 • 51min

The Good, the Bad and the Ugly of GPT - Sandra Kublik

We talked about: Sandra's background Making a YouTube channel to break into the LLM space The business cases for LLMs LLMs as amplifiers The befits of keeping a human in the loop when using LLMs (AI limitations) Using LLMs as assistants Building an app that uses an LLM Prompt whisperers and how to improve your prompts Sandra's 7-day LLM experiment Sandra's LLM content recommendations Finding Sandra online Links: LinkedIn: https://www.linkedin.com/in/sandrakublik/ Twitter: https://twitter.com/sandra_kublik Youtube: https://www.youtube.com/@sandra_kublik Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
undefined
Jul 28, 2023 • 55min

LLMs for Everyone - Meryem Arik

We talked about: Meryam's background The constant evolution of startups How Meryam became interested in LLMs What is an LLM (generative vs non-generative models)? Why LLMs are important Open source models vs API models What TitanML does How fine-tuning a model helps in LLM use cases Fine-tuning generative models How generative models change the landscape of human work How to adjust models over time Vector databases and LLMs How to choose an open source LLM or an API Measuring input data quality Meryam's resource recommendations Links: Website: https://www.titanml.co/ Beta docs: https://titanml.gitbook.io/iris-documentation/overview/guide-to-titanml... Using llama2.0 in TitanML Blog: https://medium.com/@TitanML/the-easiest-way-to-fine-tune-and-inference-llama-2-0-8d8900a57d57 Discord: https://discord.gg/83RmHTjZgf Meryem LinkedIn: https://www.linkedin.com/in/meryemarik/ Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
undefined
Jul 21, 2023 • 55min

Investing in Open-Source Data Tools - Bela Wiertz

Bela Wiertz, investor in open-source data tools, talks about the viability of open source as a go-to-market strategy, the differences between angel investors, VC funds, and family offices, and the use of GitHub stars as a metric for investment. They also discuss the future of open source, recent successes of open source companies, and Bela's resource recommendations.
undefined
Jul 14, 2023 • 51min

Why Machine Learning Design is Broken - Valerii Babushkin

Links: Book: https://www.manning.com/books/machine-learning-system-design?utm_source=AGMLBookcamp&utm_medium=affiliate&utm_campaign=book_babushkin_machine_4_25_23&utm_content=twitter Discount: poddatatalks21 (35% off) Evidently: https://www.evidentlyai.com/ Article: https://medium.com/people-ai-engineering/design-documents-for-ml-models-bbcd30402ff7 Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
undefined
Jul 7, 2023 • 53min

Interpretable AI and ML - Polina Mosolova

We talked about: Polina's background How common it is for PhD students to build ML pipelines end-to-end Simultaneous PhD and industry experience Support from both the academic and industry sides How common the industrial PhD setup is and how to get into one Organizational trust theory How price relates to trust How trust relates to explainability The importance of actionability Explainability vs interpretability vs actionability Complex glass box models Does the explainability of a model follow explainability? What explainable AI bring to customers and end users Can all trust be turned into KPI? Links: LinkedIn: https://www.linkedin.com/in/polina-mosolova/ Neural Additive Models paper: https://proceedings.neurips.cc/paper/2021/file/251bd0442dfcc53b5a761e050f8022b8-Paper.pdf Neural Basis Model paper: https://arxiv.org/pdf/2205.14120.pdf Interpretable Feature Spaces paper: https://kdd.org/exploration_files/vol24issue1_1._Interpretable_Feature_Spaces_revised.pdf
undefined
Jun 30, 2023 • 54min

From Scratch to Success: Building an MLOps Team and ML Platform - Simon Stiebellehner

We talked about: Simon's background What MLOps is and what it isn't Skills needed to build an ML platform that serves 100s of models Ranking the importance of skills The point where you should think about building an ML platform The importance of processes in ML platforms Weighing your options with SaaS platforms The exploratory setup, experiment tracking, and model registry What comes after deployment? Stitching tools together to create an ML platform Keeping data governance in mind when building a platform What comes first – the model or the platform? Do MLOps engineers need to have deep knowledge of how models work? Is API design important for MLOps? Simon's recommendations for furthering MLOps knowledge Links: LinkedIn: https://www.linkedin.com/in/simonstiebellehner/ Github: https://github.com/stiebels Medium: https://medium.com/@sistel Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
undefined
Jun 23, 2023 • 53min

From MLOps to DataOps - Santona Tuli

We talked about: Santona's background Focusing on data workflows Upsolver vs DBT ML pipelines vs Data pipelines MLOps vs DataOps Tools used for data pipelines and ML pipelines The “modern data stack” and today's data ecosystem Staging the data and the concept of a “lakehouse” Transforming the data after staging What happens after the modeling phase Human-centric vs Machine-centric pipeline Applying skills learned in academia to ML engineering Crafting user personas based on real stories A framework of curiosity Santona's book and resource recommendations Links: LinkedIn: https://www.linkedin.com/in/santona-tuli/ Upsolver website: upsolver.com Why we built a SQL-based solution to unify batch and stream workflows: https://www.upsolver.com/blog/why-we-built-a-sql-based-solution-to-unify-batch-and-stream-workflows Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
undefined
Jun 16, 2023 • 51min

Data Developer Relations - Hugo Bowne-Anderson

We talked about: Hugo's background Why do tools and the companies that run them have wildly different names Hugo's other projects beside Metaflow Transitioning from educator to DevRel What is DevRel? DevRel vs Marketing How DevRel coordinates with developers How DevRel coordinates with marketers What skills a DevRel needs The challenges that come with being an educator Becoming a good writer: nature vs nurture Hugo's approach to writing and suggestions Establishing a goal for your content Choosing a form of media for your content Is DevRel intercompany or intracompany? The Vanishing Gradients podcast Finding Hugo online Links: Hugo Browne's github: http://hugobowne.github.io/ Vanishing Gradients: https://vanishinggradients.fireside.fm/ MLOps and DevOps: Why Data Makes It Differenthttps://www.oreilly.com/radar/mlops-and-devops-why-data-makes-it-different/ Evaluate Metaflow for free, right from your Browser: https://outerbounds.com/sandbox/ Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
undefined
Jun 9, 2023 • 51min

Lessons Learned from Freelancing and Working in a Start-up - Antonis Stellas

We talked about; Antonis' background The pros and cons of working for a startup Useful skills for working at a startup and the Lean way to work How Antonis joined the DataTalks.Club community Suggestions for students joining the MLOps course Antonis contributing to Evidently AI How Antonis started freelancing Getting your first clients on Upwork Pricing your work as a freelancer The process after getting approved by a client Wearing many hats as a freelancer and while working at a startup Other suggestions for getting clients as a freelancer Antonis' thoughts on the Data Engineering course Antonis' resource recommendations Links: Lean Startup by Eric Ries: https://theleanstartup.com/ Lean Analytics: https://leananalyticsbook.com/ Designing Machine Learning Systems by Chip Huyen: https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/ Kafka Streaming with python by Khris Jenkins tutorial video: https://youtu.be/jItIQ-UvFI4 Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Get the Snipd
podcast app

Unlock the knowledge in podcasts with the podcast player of the future.
App store bannerPlay store banner

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Save any
moment

Hear something you like? Tap your headphones to save it with AI-generated key takeaways

Share
& Export

Send highlights to Twitter, WhatsApp or export them to Notion, Readwise & more

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode