The Real Python Podcast

Real Python
Jul 29, 2022 • 59min

Natural Language Processing and How ML Models Understand Text

How do you process and classify text documents in Python? What are the fundamental techniques and building blocks for Natural Language Processing (NLP)? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, talks about how machine learning (ML) models understand text. Jodie explains how ML models require data in a structured format, which involves transforming text documents into columns and rows. She covers the most straightforward approach, called binary vectorization. We discuss the bag-of-words method and the tools of stemming, lemmatization, and count vectorization. We jump into word embedding models next. Jodie talks about WordNet, Natural Language Toolkit (NLTK), word2vec, and Gensim. Our conversation lays a foundation for starting with text classification, implementing sentiment analysis, and building projects using these tools. Jodie also shares multiple resources to help you continue exploring NLP and modeling. Course Spotlight: Learn Text Classification With Python and Keras In this course, you’ll learn about Python text classification with Keras, working your way from a bag-of-words model with logistic regression to more advanced methods, such as convolutional neural networks. You’ll see how you can use pretrained word embeddings, and you’ll squeeze more performance out of your model through hyperparameter optimization. Topics: 00:00:00 – Introduction 00:02:47 – Exploring the topic 00:06:00 – Perceived sentience of LaMDA 00:10:24 – How do we get started? 00:11:16 – What are classification and sentiment analysis? 
00:13:03 – Transforming text into rows and columns 00:14:47 – Sponsor: Snyk 00:15:27 – Bag-of-words approach 00:19:12 – Stemming and lemmatization 00:22:05 – Capturing N-grams 00:25:34 – Count vectorization 00:27:14 – Stop words 00:28:46 – Term Frequency / Inverse Document Frequency (TFIDF) vectorization 00:32:28 – Potential projects for bag-of-words techniques 00:34:07 – Video Course Spotlight 00:35:20 – WordNet and NLTK package 00:37:27 – Word embeddings and word2vec 00:45:30 – Previous training and too many dimensions 00:50:07 – How to use word2vec and Gensim? 00:51:26 – What types of projects for word2vec and Gensim? 00:54:41 – Getting into GPT and BERT in another episode 00:56:11 – How to follow Jodie’s work? 00:57:36 – Thanks and goodbye Show Links: Why Google’s “sentient” AI LaMDA is nothing like a person. On NYT Magazine on AI: Resist the Urge to be Impressed | Emily M. Bender | Medium ELIZA - Wikipedia eliza.py - Python 2 version by Daniel Connelly dabraude/Pyliza: Python3 Implementation of Eliza magneticpoetry.com Natural Language Processing With Python’s NLTK Package – Real Python Practical Text Classification With Python and Keras – Real Python Sentiment Analysis: First Steps With Python’s NLTK Library – Real Python NLTK: Natural Language Toolkit spaCy · Industrial-strength Natural Language Processing in Python Natural Language Processing With spaCy in Python - Real Python Stemming - Wikipedia Lemmatization - Wikipedia Binary/Count Vectorization: sklearn.feature_extraction.text.CountVectorizer — scikit-learn TFIDF: sklearn.feature_extraction.text.TfidfVectorizer — scikit-learn Porter Stemmer: nltk.stem.porter module — NLTK Snowball Stemmer: nltk.stem.snowball module — NLTK WordNet Lemmatizer: nltk.stem.wordnet module — NLTK Lemmatizer · spaCy API Documentation Applying Bag of Words and Word2Vec models on Reuters-21578 Dataset Elvin Ouyang’s Blog UCI Machine Learning Repository: Reuters-21578 Text Categorization Collection Data Set The Illustrated Word2vec 
– Jay Alammar A Complete Guide to Using WordNET in NLP Applications Gensim: Topic modeling for humans Core Tutorials — gensim Find Open Datasets and Machine Learning Projects | Kaggle Engineering All Hands: Vectorise all the things! - YouTube PyCon Portugal 2022 NDC Oslo 2022 | Conference for Software Developers Jodie Burchell’s Blog - Standard error Jodie Burchell 🇦🇺🇩🇪 (@t_redactyl) / Twitter JetBrains: Essential tools for software developers and teams Level up your Python skills with our expert-led courses: Data Cleaning With pandas and NumPy Reading and Writing Files With pandas Learn Text Classification With Python and Keras Support the podcast & join our community of Pythonistas
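The binary and count vectorization steps mentioned in this episode's description can be sketched in plain Python. This is only an illustration of the idea of turning documents into rows and vocabulary words into columns, not code from the episode; the `tokenize` and `vectorize` helpers and the sample sentences are hypothetical, and a real project would more likely reach for scikit-learn's CountVectorizer, which is linked in the show notes.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation (illustrative only)."""
    words = (word.strip(".,!?").lower() for word in text.split())
    return [word for word in words if word]


def vectorize(documents, binary=False):
    """Return (vocabulary, rows): one row per document, one column per word."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        if binary:
            # Binary vectorization: 1 if the word appears at all, else 0.
            rows.append([1 if counts[word] else 0 for word in vocabulary])
        else:
            # Count vectorization: how many times the word appears.
            rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


docs = ["The cat sat.", "The cat saw the other cat!"]
vocab, binary_rows = vectorize(docs, binary=True)
_, count_rows = vectorize(docs)

print(vocab)        # ['cat', 'other', 'sat', 'saw', 'the']
print(binary_rows)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 1]]
print(count_rows)   # [[1, 0, 1, 0, 1], [2, 1, 0, 1, 2]]
```

Both outputs share the same columns; the only difference is whether a cell records presence or frequency. Word order is discarded either way, which is why this family of techniques is called bag-of-words.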
Jul 22, 2022 • 55min

Creating Documentation With MkDocs & When to Use a Python dict

How do you start building your project documentation? What if you had a tool that could do the heavy lifting and automatically write large portions directly from your code? This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder’s Weekly articles and projects. We talk about a Real Python step-by-step project from Martin Breuss about MkDocs. The project walks you through generating nice-looking and modern documentation from Markdown files and your existing code’s docstrings. The final step is to deploy your freshly generated documentation to a GitHub repository. Christopher talks about a pair of articles arguing for and against using Python dictionaries. The first article, “Just Use Dictionaries,” pushes to keep things simple, while the second article, “Don’t Let Dicts Spoil Your Code,” contends that complex projects require something more specific. We cover several other articles and projects from the Python community, including discussing the recent beta release of Python 3.11, 2FA for PyPI, procedural music composition with arvo, building a tic-tac-toe game with Python and Tkinter, common issues encountered while coding in Python, a type-safe library to generate SVG files, and a lightweight static analysis tool for your projects. Course Spotlight: Dictionaries and Arrays: Selecting the Ideal Data Structure In this course, you’ll learn about two of Python’s data structures: dictionaries and arrays. You’ll look at multiple types and classes for both of these and learn which implementations are best for your specific use cases. 
Topics: 00:00:00 – Introduction 00:02:39 – Python 3.11 Release May Be Delayed 00:03:39 – The cursed release of Python 3.11.0b4 is now available 00:05:01 – PyPI 2FA Security Key Giveaway 00:08:01 – Build Your Python Project Documentation With MkDocs 00:14:12 – Don’t Let Dicts Spoil Your Code 00:16:22 – Just Use Dictionaries 00:20:12 – Sponsor: Snyk.io 00:20:51 – Procedural Music Composition With arvo 00:29:10 – Build a Tic-Tac-Toe Game With Python and Tkinter 00:33:59 – Video Course Spotlight 00:35:35 – Most Common Issue You Have Coding With Python? 00:45:00 – svg.py: Type-Safe Library to Generate SVG Files 00:48:27 – semgrep: Lightweight Static Analysis for Many Languages 00:53:46 – Thanks and goodbye News: Python 3.11 Release May Be Delayed The cursed release of Python 3.11.0b4 is now available - Python.org. “We’ve begun rolling out a 2FA requirement”: PyPI on Twitter PyPI 2FA Security Key Giveaway Topic Links: Build Your Python Project Documentation With MkDocs – In this tutorial, you’ll learn how to build professional documentation for a Python package using MkDocs and mkdocstrings. These tools allow you to generate nice-looking and modern documentation from Markdown files and, more importantly, from your code’s docstrings. Don’t Let Dicts Spoil Your Code – The dict is the go-to data structure for Python programmers, but its loose relationship to the data can be problematic in large data streams. Learn more about why and when you might choose a different data structure. Just Use Dictionaries – Using simple data structures is an important part of keeping it simple, and Python is all about simplicity. Less code means fewer problems. Just use dictionaries. You probably don’t need classes. Procedural Music Composition With arvo – By using the music21 and arvo libraries, you can create musical scores programmatically. This article runs you through which libraries you need and how you can compose your own music. 
Build a Tic-Tac-Toe Game With Python and Tkinter – In this step-by-step project, you’ll learn how to create a tic-tac-toe game using Python and the Tkinter GUI framework. Tkinter is cross-platform and is available in the Python standard library. Creating a game in Python is a great and fun way to learn something new and exciting! Discussion: Most Common Issue You Have Coding With Python? Projects: svg.py: Type-Safe Library to Generate SVG Files semgrep: Lightweight Static Analysis for Many Languages Additional Links: Getting 2FA Right in 2019 | Trail of Bits Blog autoDocstring - Python Docstring Generator - Visual Studio Marketplace styleguide | Style guides for Google-originated open-source projects Writing docstrings — Sphinx-RTD-Tutorial documentation Django Ninja arvo: Python library for procedural music composition Getting Started with arvo: a Procedural Music Composition Library - YouTube music21: a Toolkit for Computer-Aided Musicology music21: GitHub Free music composition and notation software | MuseScore Code Generation | MusicXML 4.0 Python GUI Programming With Tkinter – Real Python Tcl Developer Site tkinter — Python interface to Tcl/Tk — Python 3.10.5 documentation svg-xsd-schema: XSD schema for SVG SVG: Scalable Vector Graphics | MDN Level up your Python skills with our expert-led courses: Records and Sets: Selecting the Ideal Data Structure Documenting Code in Python Dictionaries and Arrays: Selecting the Ideal Data Structure Support the podcast & join our community of Pythonistas
Jul 15, 2022 • 1h 7min

Measuring Python Code Quality, Simplicity, and Maintainability

How maintainable is your Python code? Is it possible to hold the code for your functions in your head? When is it appropriate to use measurements in a code review? This week on the show, Reka Horvath and Ben Martineau from Sourcery are here to discuss their recent PyCon talk, “Actionable insights vs ranking: How to use and how NOT to use code quality metrics.” Reka and Ben share their thoughts on how metrics can provide insights into your Python code. We discuss four measurements of code complexity and what factors into each. We also talk about deciding whether to refactor or rewrite code. Ben and Reka share their experience in code review situations and the importance of shifting the conversation from subjective opinions toward objective measurements. Course Spotlight: Writing Idiomatic Python What are the programming idioms unique to Python? This course is a short overview for people coming from other languages and an introduction for beginners to the idiomatic practices within Python. You’ll cover truth values, looping, DRY principles, and the Zen of Python. Topics: 00:00:00 – Introduction 00:01:54 – Reka’s tutorials on Real Python 00:02:45 – PyCon US 2022 Talk 00:06:01 – Code reviews and metrics 00:09:42 – Trying to make things more objective 00:13:24 – Sponsor: CData Software 00:14:07 – Measuring WTFs/min and magical code 00:16:39 – Pythonic, idiomatic, and clean code 00:21:17 – Sourcery.ai and refactoring 00:24:04 – Four metrics to measure 00:24:56 – Function length 00:30:05 – Cyclomatic complexity 00:36:49 – Video Course Spotlight 00:38:21 – Cognitive complexity 00:44:38 – Working memory 00:51:49 – Suggestions on how to use the metrics 00:58:34 – Generating measurements 01:01:36 – What are you excited about in the world of Python? 01:03:30 – What do you want to learn next? 01:05:31 – Thanks and goodbye Show Links: Sourcery | Automatically Improve Python Code Quality Can you fit all of this code in your head? 
Talk - Reka/Ben: Actionable insights vs ranking How to use and how NOT to use code quality metrics - YouTube Using Pandas and Python to Explore Your Dataset – Real Python Refactoring - Martin Fowler WTFs/m – OSnews Who’s Your Coding Buddy? Principle of least astonishment - Wikipedia Anthony Shaw - Wily Python: Writing simpler and more maintainable Python - PyCon 2019 - YouTube Refactoring Python Applications for Simplicity – Real Python Cyclomatic complexity - Wikipedia Cognitive Complexity: A new way of measuring understandability - SonarSource The Magical Number Seven, Plus or Minus Two - Wikipedia wily · PyPI sourcery-analytics: A command line tool and library for statically analyzing Python code quality. Clean Code: A Handbook of Agile Software Craftsmanship - Robert C. Martin Errors and Exceptions — Python 3.10.5 documentation Rust Programming Language Ben Martineau – Medium Sourcery | Blog The Prodigy - ‘Breathe’ - YouTube Level up your Python skills with our expert-led courses: Python vs JavaScript for Python Developers Testing Your Code With pytest Writing Idiomatic Python Support the podcast & join our community of Pythonistas
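To make the cyclomatic complexity topic from this episode a little more concrete, here is a rough sketch that counts decision points with the standard library's `ast` module. It is an approximation written for illustration, not the algorithm used by Sourcery, wily, or any other tool mentioned above; the `approximate_complexity` function and the `SAMPLE` snippet are hypothetical.

```python
import ast

# Node types treated as one extra path through the code (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra conditions, not one.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for digit in str(n):
        if digit == "7":
            return "lucky"
    return "plain"
'''

score = approximate_complexity(SAMPLE)
print(score)  # 5: base path + if + and + for + inner if
```

A number like this is more useful as a prompt for a closer look at a function than as a ranking, which matches the "actionable insights vs ranking" framing of the talk.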
Jul 1, 2022 • 1h 14min

Exploring Functional Programming in Python With Bruce Eckel

Would you like to explore the functional programming side of Python? What are the advantages of this approach, and what tools are built into the language? This week on the show, author Bruce Eckel talks about functional programming in Python. Bruce is the author of several programming books, including Thinking in Java, Thinking in C++, Thinking in Python, Atomic Scala, and most recently, Atomic Kotlin. He’s been an explorer of programming languages over his career. Functional programming—with its lack of side effects, its transparency, and its potential for parallelization—has recently caught Bruce’s attention. Bruce’s talk “Making Data Classes Work for You” at PyCon US 2022 explored the idea of the invariance of objects. We also discuss his next book project, the Python community, and his affection for “un-conferences”. Bruce is hosting the upcoming Summer Tech Forum in Crested Butte, Colorado, this August. Course Spotlight: Using Data Classes in Python Data classes were introduced in Python 3.7. When using data classes, you don’t have to write boilerplate code to get proper initialization, representation, and comparisons for your objects. Topics: 00:00:00 – Introduction 00:02:15 – Musicians as programmers 00:03:40 – Happy Path Programming Podcast 00:04:19 – The essence of functional programming 00:10:55 – Making Data Classes Work for You - PyCon US 2022 talk 00:15:14 – What are things that have drawn you to Python? 00:17:39 – More than just a scripting language 00:21:35 – Consulting with companies 00:23:40 – Video Course Spotlight 00:25:04 – The frozen flag in a data class and __post_init__ 00:29:02 – Invariance and functional programming 00:31:02 – How do you see it changing your programming? 00:34:49 – Not forcing an “approach” to Python 00:38:46 – Writing style in programming and books 00:42:31 – Who has been the intended audience for your books? 
00:45:26 – Educational debt 00:46:50 – Python’s community 00:51:03 – New book about concurrency 01:00:51 – Winter and Summer Tech Forums 01:07:23 – Conference precautions and safety 01:08:51 – What are you excited about in the world of Python? 01:11:40 – What do you want to learn next? 01:12:23 – How can people follow the things you do? 01:13:13 – Thanks and goodbye Show Links: Computing Thoughts - Bruce Eckel’s Programming Blog MindView Exceptional Learning Experiences Happy Path Programming • A podcast on Anchor Bruce Eckel: Making Data Classes Work for You - PyCon US 2022 - YouTube Data Classes in Python 3.7+ (Guide) – Real Python Functional Programming in Python: When and How to Use It – Real Python Function composition (computer science) - Wikipedia Monad (functional programming) - Wikipedia Smalltalk - Wikipedia C++ - Wikipedia Java (programming language) - Wikipedia Object-oriented programming - Wikipedia Learning Resources for the Kotlin Programming Language Episode #88: Discussing Type Hints, Protocols, and Ducks in Python – The Real Python Podcast Fluent Python, 2nd Edition Summer Tech Forum Unconference: Self-Organizing Open Spaces in Crested Butte, Colorado August 15-19, 2022 Bruce Eckel (@BruceEckel) / Twitter Level up your Python skills with our expert-led courses: Python Decorators 101 Using Data Classes in Python Using Python Lambda Functions Support the podcast & join our community of Pythonistas
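The topics "The frozen flag in a data class and __post_init__" and "Invariance and functional programming" can be illustrated with a short sketch. This is not taken from Bruce Eckel's talk; the `Percentage` class is a hypothetical example that assumes the goal is an object that is validated once at construction and can never change afterward.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Percentage:
    """A value that is checked once at creation and then immutable."""

    value: float

    def __post_init__(self):
        # Runs right after the generated __init__, so an invalid object never exists.
        if not 0 <= self.value <= 100:
            raise ValueError(f"{self.value} is not between 0 and 100")


half = Percentage(50)

try:
    half.value = 75  # frozen=True turns attribute assignment into an error
except FrozenInstanceError:
    print("Percentage objects cannot be modified")

try:
    Percentage(150)
except ValueError as error:
    print(error)
```

Because the check happens in one place and the object cannot be mutated later, any code that receives a `Percentage` can rely on it being valid without checking again.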
Jun 24, 2022 • 56min

Digging Into PyScript & Preventing or Handling Python Errors

Have you heard about PyScript? The brand-new framework has the community excited about building interactive Python applications that run entirely within the user’s browser. Would you like to dig into the details beyond the “Hello World” examples? This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder’s Weekly articles and projects. We talk about a Real Python tutorial from Bartosz Zaczyński about PyScript. The article provides an initial look at the framework and then takes you deep into the intricacies. We also share additional resources to help familiarize you with the project. Christopher talks about another Real Python article, this one on how to approach managing errors in Python. The tutorial “LBYL vs EAFP: Preventing or Handling Errors in Python” is from frequent contributor Leodanis Pozo Ramos. We cover several other articles and projects from the Python community, including discussing the PSF’s Python Developers Survey 2021 Results, Django static files and templates, how to profile Python code, a launcher for penetration testing, and a project for confirming Python versions through syntax errors. Course Spotlight: Building a Django User Management System In this video course, you’ll learn how to extend your Django application with a user management system, complete with email sending and third-party authentication. 
Topics: 00:00:00 – Introduction 00:02:32 – LBYL vs EAFP: Preventing or Handling Errors in Python 00:08:38 – How To Profile Python Code 00:18:52 – Sponsor: CData Software 00:19:35 – Django Static Files and Templates 00:26:35 – A First Look at PyScript: Python in the Web Browser 00:38:20 – Video Course Spotlight 00:39:39 – Python Developers Survey 2021 Results 00:48:19 – python-syntax-errors: Version-Specific No-Ops 00:50:03 – arsenal: Inventory Launcher for Penetration Testing 00:55:03 – Thanks and goodbye Topics Links: LBYL vs EAFP: Preventing or Handling Errors in Python – In this tutorial, you’ll learn about two popular coding styles in Python: look before you leap (LBYL) and easier to ask forgiveness than permission (EAFP). You can use these styles to deal with errors and exceptional situations in your code. You’ll dive into the LBYL vs EAFP discussion in Python. How To Profile Python Code – No matter how good you are, sometimes your code just runs slowly. Learning how to properly profile your software to identify and fix bottlenecks is a useful skill. This article talks about what you need to know to measure your code’s performance and how to use the cProfile, profile, and timeit libraries, along with others. Django Static Files and Templates – “Static files like CSS, JavaScript, and fonts are a core piece of any modern web application. They are also typically confusing for Django newcomers since Django provides tremendous flexibility around how these files are used. This tutorial will demonstrate current best practices for configuring static files in both local development and production.” A First Look at PyScript: Python in the Web Browser – In this tutorial, you’ll learn about PyScript, a new framework that allows for running Python in the web browser with few or no code modifications and excellent performance. You’ll leverage browser APIs and JavaScript libraries to build rich, highly interactive web applications with Python. 
Discussion: Python Developers Survey 2021 Results | PSF Python Developers Survey 2021 Results | JetBrains Raw Data SurveyExport.csv | Google Drive Projects: python-syntax-errors: Version-Specific No-Ops arsenal: Inventory Launcher for Penetration Testing navi: An interactive cheatsheet tool for the command-line eg: Useful examples at the command line. Additional Links: “Zero cost” exception handling · Issue #84403 Python 3.11 Preview: Task and Exception Groups | Real Python PyCon US 2022 Keynote - Peter Wang - PyScript | YouTube PyScript | Run Python in your HTML pyscript: Getting started pyscript: Pyscript examples Python Timer Functions: Three Ways to Monitor Your Code | Real Python Amdahl’s law | Wikipedia Django Books | William Vincent Django Chat Episode #114: Getting Started in Python Cybersecurity and Forensics | The Real Python Podcast Level up your Python skills with our expert-led courses: Building a Django User Management System How to Set Up a Django Project Raising and Handling Python Exceptions Support the podcast & join our community of Pythonistas
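The two coding styles from the LBYL vs EAFP tutorial discussed in this episode can be compared side by side. The sketch below is an illustration rather than code from the tutorial; the function names, the `config` dictionary, and the default port of 8080 are all hypothetical.

```python
DEFAULT_PORT = 8080  # hypothetical fallback value


def get_port_lbyl(config):
    """Look before you leap: check every precondition first."""
    if "port" in config:
        port = config["port"]
        if isinstance(port, str) and port.isdigit():
            return int(port)
    return DEFAULT_PORT


def get_port_eafp(config):
    """Easier to ask forgiveness than permission: try it, then handle failure."""
    try:
        return int(config["port"])
    except (KeyError, ValueError, TypeError):
        return DEFAULT_PORT


for config in ({"port": "5000"}, {}, {"port": "abc"}):
    print(get_port_lbyl(config), get_port_eafp(config))
```

Both functions return the same results for these inputs. The EAFP version is shorter and avoids looking up the key twice, while the LBYL version makes each accepted condition explicit.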
Jun 17, 2022 • 1h 2min

Getting Started in Python Cybersecurity and Forensics

Are you interested in a career in security using Python? Would you like to stay ahead of potential vulnerabilities in your Python applications? This week on the show, James Pleger talks about Python information security, incident response, and forensics. James has been doing information security for over fifteen years, working at some of the biggest companies, government agencies, and startups. He shares numerous Python resources to dive into detecting threats and improving your projects. We discuss how to learn about security topics and get involved in the community. Make sure you check out the massive collection of links in the show notes this week. Course Spotlight: Exploring HTTPS and Cryptography in Python In this course, you’ll gain a working knowledge of the various factors that combine to keep communications over the Internet safe. You’ll see concrete examples of how to keep information secure and use cryptography to build your own Python HTTPS application. Topics: 00:00:00 – Introduction 00:01:28 – How did you find the show? 00:02:00 – Evolution of roles in security 00:04:09 – Why is Python leveraged in security? 00:07:51 – Red team vs blue team 00:10:16 – Application security and bug bounties 00:13:31 – What’s your background? 00:14:07 – Company focus between regulations vs engineering 00:18:09 – Ways to get involved and keep learning 00:21:56 – Different perspective from computer science 00:23:35 – Red vs blue reprise 00:25:07 – Shifting landscape of vulnerabilities 00:30:06 – How do you approach tests? 00:32:30 – Incident response 00:35:54 – Video Course Spotlight 00:37:19 – Where does Python come in during an incident? 00:43:08 – Crossing into forensic research 00:48:43 – Where to practice security research and learn more? 00:51:41 – What’s the security community like? 00:56:05 – What are you excited about in the world of Python? 00:57:53 – What do you want to learn next? 01:00:17 – Where can people learn more about what you do? 
01:00:39 – Thanks and goodbye Security Specific Tools Written in Python: binwalk: Firmware Analysis Tool | ReFirmLabs binaryalert: BinaryAlert: Serverless, Real-time & Retroactive Malware Detection | airbnb Cuckoo Sandbox - Automated Malware Analysis YARA - The pattern matching Swiss knife for malware researchers Scapy: Python-based interactive packet manipulation program & library radare2-bindings: Bindings of the r2 api for Valabind and friends python-iocextract: Defanged Indicator of Compromise (IOC) Extractor | InQuest yeti: Your Everyday Threat Intelligence capa: The FLARE team’s open-source tool to identify capabilities in executable files PDF Tools | Didier Stevens Incident Response and Memory Forensics: volatility: An advanced memory forensics framework | Volatility Foundation FIR: Fast Incident Response | CERT Societe Generale (Computer Emergency Response Team) GRR Rapid Response: Remote live forensics for incident response | Google Honeypot Resources: What is a Honeypot? How It Can Trap Cyberattackers | CrowdStrike awesome-honeypots: An awesome list of honeypot resources Bug Bounty Programs: Bug Bounty Program List - All Active Programs in 2022 | Bugcrowd Bug Bounty Program - Complete List | HackerOne TOP Bug Bounty Programs & Websites (Jun 2022 Updated List) Security and Hacking Conferences: Black Hat USA 2022 DEF CON® Hacking Conference Home Chaos Communication Congress - Wikipedia CactusCon Additional Links: Blue team (computer security) - Wikipedia Open Source Projects for Software Security | OWASP Foundation HackerOne | #1 Trusted Security Platform and Hacker Program Bugcrowd | Platform Overview pyinstaller · PyPI Wireshark · Go Deep. 
Python security best practices cheat sheet | Snyk PyCharm Python Security Scanner · Actions · GitHub Marketplace Security scanners for Python and Docker: from code to dependencies Bandit — Designed to find common security issues in Python code black · PyPI Build a Site Connectivity Checker in Python – Real Python Kali Linux | Penetration Testing and Ethical Hacking Linux Distribution Level up your Python skills with our expert-led courses: Exploring HTTPS and Cryptography in Python Django View Authorization: Restricting Access Testing Your Code With pytest Support the podcast & join our community of Pythonistas
Jun 10, 2022 • 50min

Build Streamlit Data Science Dashboards & Verbose Regex f-Strings

Would you like a fast way to share your data science project results as an interactive dashboard instead of a Jupyter notebook? Streamlit is a library for creating simple web apps and dashboards using just Python. This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder’s Weekly articles and projects. We talk about the article “Forget About Jupyter Notebooks - Showcase Your Research Using Dashboards.” It covers the basics of turning a data science script into an interactive dashboard using Streamlit. We also share some additional resources to get you started with the library. Christopher discusses an article covering ways to make life easier when working with Python regular expressions. He talks about composing verbose regexes using f-strings and potentially reusing these patterns. We cover several other articles and projects from the Python community, including a news roundup, a step-by-step project to build a URL shortener with FastAPI, the fact that Python’s functions are sometimes classes, an automatic water pistol pigeon deterrent project, a discussion about music playlists for coding, a project for Python metadata extraction without execution, and a powerful audio-to-MIDI converter library. Course Spotlight: Using Python Class Constructors In this video course, you’ll learn how class constructors work in Python. You’ll also explore Python’s instantiation process, which has two main steps: instance creation and instance initialization. 
Topics: 00:00:00 – Introduction 00:02:22 – ctx Library Hijacked to Steal AWS Keys 00:04:33 – Typosquatting Attack on ‘requests’ 00:06:55 – Build a URL Shortener With FastAPI and Python 00:10:51 – Sponsor: Rookout 00:11:31 – Python’s Functions Are Sometimes Classes 00:14:05 – Forget Jupyter, Showcase Your Data with Dashboards 00:22:08 – The Unreasonable Effectiveness of f-strings and re.VERBOSE 00:25:43 – Robotic Water Pistol as Pigeon Deterrent 00:28:13 – Video Course Spotlight 00:29:34 – Do You Have a Favorite Playlist for Coding? 00:40:05 – dowsing: Metadata Extraction Without Execution 00:42:01 – spotify/basic-pitch: A lightweight yet powerful audio-to-MIDI converter 00:49:12 – Thanks and goodbye News: ctx Library Hijacked to Steal AWS Keys Typosquatting Attack on ‘requests’ - One of the Most Popular Python packages Topic Links: Build a URL Shortener With FastAPI and Python – In this step-by-step project, you’ll build an app to create and manage shortened URLs. Your Python URL shortener can receive a full target URL and return a shortened URL. You’ll also use the automatically created documentation of FastAPI to try out your API endpoints. Python’s Functions Are Sometimes Classes – Ever use list() or enumerate()? Think of them as functions? They’re not—they’re classes. Sometimes we call classes functions in Python. Why? And what’s a “callable”? Forget Jupyter, Showcase Your Data with Dashboards – Streamlit can be used as an alternative to Jupyter notebooks for sharing research data. Streamlit is a relatively new library for creating simple web apps and dashboards using just Python. Learn why it might be the right choice for your next data project. The Unreasonable Effectiveness of f-strings and re.VERBOSE – A look at one or two ways to make life easier when working with Python regular expressions. Robotic Water Pistol as Pigeon Deterrent – Max built a wifi-equipped water gun to shoot the pigeons on his balcony. 
It is controlled over the Internet by a Python script running openCV reading the camera image from an old iPhone. See all the details. Discussion: Do You Have a Favorite Playlist for Coding? Projects: dowsing: Metadata Extraction Without Execution spotify/basic-pitch: A lightweight yet powerful audio-to-MIDI converter with pitch bend detection Additional Links: The Twelve-Factor App Streamlit vs Dash vs Voilà vs Panel — Battle of The Python Dashboarding Giants Getting machine learning to production | Vicki Boykis Detecting deforestation from satellite images | André Ferreira Create an app - Streamlit Docs spotify/pedalboard: 🎛 🔊 A Python library for manipulating audio. Episode #96: Manipulating and Analyzing Audio in Python – The Real Python Podcast Episode #15: Python Regular Expressions, Views vs Copies in Pandas, and More – The Real Python Podcast Episode #64: Detecting Deforestation With Python & Using GraphQL With Django and Vue – The Real Python Podcast Level up your Python skills with our expert-led courses: Data Visualization Interfaces in Python With Dash Regular Expressions and Building Regexes in Python Using Python Class Constructors Support the podcast & join our community of Pythonistas
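As a rough illustration of the idea discussed in this episode, composing verbose regexes from reusable pieces with f-strings, here is a minimal sketch. It is not taken from the linked article; the `YEAR`, `MONTH`, `DAY`, and `DATE` names and the date format are assumptions chosen for the example.

```python
import re

# Reusable fragments. Quantifiers such as {4} stay in plain raw strings so
# they do not clash with the f-string's own use of curly braces.
YEAR = r"(?P<year>\d{4})"
MONTH = r"(?P<month>\d{2})"
DAY = r"(?P<day>\d{2})"

# re.VERBOSE ignores whitespace in the pattern and allows trailing comments.
DATE = re.compile(
    rf"""
    {YEAR}  -   # four-digit year
    {MONTH} -   # two-digit month
    {DAY}       # two-digit day
    """,
    re.VERBOSE,
)

match = DATE.fullmatch("2022-06-10")
print(match["year"], match["month"], match["day"])  # 2022 06 10
```

The fragments can be reused in other patterns, for example a timestamp regex that embeds the same date pieces. A quantifier written directly inside the f-string would need doubled braces, as in `{{4}}`.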
Jun 3, 2022 • 46min

Managing Large Python Data Science Projects With Dask

What do you do when your data science project doesn’t fit within your computer’s memory? One solution is to distribute it across multiple worker machines. This week on the show, Guido Imperiale from Coiled talks about Dask and managing large data science projects through distributed computing. We talk about projects where an orchestration system like Dask will help. Dask is designed to take advantage of parallel computing, spreading the work and data across multiple machines. Many familiar techniques for working with pandas and NumPy data are supported with Dask equivalents. We also discuss the differences between managed and unmanaged memory. Guido shares advice on how to tackle memory issues while working with Dask. This week we also talk briefly with Jodie Burchell, who will be a guest host on upcoming episodes. As a data scientist, Jodie will be bringing new topics, projects, and discussions to the show. Course Spotlight: Exploring Scopes and Closures in Python In this Code Conversation video course, you’ll take a deep dive into how scopes and closures work in Python. To do this, you’ll use a debugger to walk through some sample code, and then you’ll take a peek under the hood to see how Python holds variables internally. Topics: 00:00:00 – Introduction 00:01:56 – Guido at PyCon DE 2022 00:02:14 – Working on Dask for Coiled 00:03:27 – Dask project history 00:04:00 – How would someone start to use Dask? 00:10:28 – Managing distributed data 00:11:18 – Data files CSV vs Parquet 00:15:02 – Managed vs unmanaged memory 00:22:42 – Video Course Spotlight 00:24:01 – Dask active memory manager 00:28:36 – Learning best practices and Dask tutorials 00:33:06 – Where is Dask being used? 00:35:45 – What are you excited about in the world of Python? 00:37:55 – What do you want to learn next? 
00:40:31 – Thanks, Guido 00:40:40 – Introduction to Jodie Burchell 00:45:28 – Goodbye Show Links: Coiled | Python for Data Science on the Cloud with Dask Guido Imperiale: Introducing the Dask Active Memory Manager - PyCon DE 2022 - YouTube Active Memory Management on Dask.Distributed - Guido Imperiale | Dask Summit 2021 - YouTube Tackling unmanaged memory with Dask | Coiled The Beginner’s Guide to Distributed Computing | Richard Pelgrim Common Mistakes to Avoid when Using Dask | Coiled File Format | Apache Parquet Dask: Scalable analytics in Python PEP 554 – Multiple Interpreters in the Stdlib | peps.python.org CUDA Python | NVIDIA Developer Rust Programming Language Product : Coiled Coiled (@CoiledHQ) / Twitter Jodie Burchell (@t_redactyl) | Twitter Learn Python through Nursery Rhymes and Fairy Tales: Shari Eskenas - Amazon Level up your Python skills with our expert-led courses: Data Cleaning With pandas and NumPy Navigating Namespaces and Scope in Python Exploring Scopes and Closures in Python Support the podcast & join our community of Pythonistas
May 27, 2022 • 52min

Questions for New Dependencies & Comparing Python Game Libraries

What are the differences between the various Python game frameworks? Would it help to see a couple of game examples across several libraries to understand the distinctions? This week on the show, Christopher Trudeau is here, bringing another batch of PyCoder’s Weekly articles and projects. We discuss a Real Python article by previous guest Jon Fincher titled “Top Python Game Engines”. Jon compares five different game frameworks and provides example projects and thorough commentary for each. We talk about a blog post by recent guest Adam Johnson about determining if a project is well maintained. He suggests twelve questions to decide whether to add a new dependency to your project. We cover several other articles and projects from the Python community, including a news roundup, Python decorator patterns, finding the smallest and largest values with min() and max(), a discussion about the most-used Python packages, the pony object-relational mapper, and a project to read PEPs in your console. Course Spotlight: Using Pygame to Build an Asteroids Game in Python In this course, you’ll build a clone of the Asteroids game in Python using Pygame. Step by step, you’ll add images, input handling, game logic, sounds, and text to your program. Topics: 00:00:00 – Introduction 00:02:04 – News: Python Release Python 3.11.0b1 00:02:56 – Faster CPython project 00:04:16 – nogil conversation at the 2022 summit 00:06:08 – PEP 690: Lazy Imports 00:08:14 – DjangoCon US & Europe 2022 Call for Proposals 00:09:25 – Top Python Game Engines 00:20:38 – Sponsor: CData Software 00:21:20 – Python Decorator Patterns 00:24:00 – The Well-Maintained Test: 12 Questions for New Dependencies 00:29:27 – Python’s min() and max(): Find Smallest and Largest Values 00:33:33 – Video Course Spotlight 00:34:44 – Which Python Packages Do You Use the Most? 
00:41:42 – pony: Pony Object Relational Mapper 00:47:41 – pepdocs: Read PEPs in Your Console 00:50:20 – Thanks and goodbye News: Python Release Python 3.11.0b1 Faster CPython faster-cpython/cpython: The Python programming language nogil conversation at the 2022 summit PEP 690: Lazy Imports DjangoCon Europe 2022 Call for Proposals DjangoCon US 2022 Call for Proposals Topic Links: Top Python Game Engines – In this tutorial, you’ll explore several Python game engines available to you. For each, you’ll code simple examples and a more advanced game to learn the game engine’s strengths and weaknesses. Python Decorator Patterns – Decorators are a way of wrapping functions around functions, and they’re a common technique for providing pre- and post-conditions on your code. Learn about the different ways decorators get invoked and how to write each pattern. The Well-Maintained Test: 12 Questions for New Dependencies – There is a lot of openly available code out there, but how do you know if you should build a dependency on some random coder’s package? Here are 12 questions you should ask yourself before using a library. Python’s min() and max(): Find Smallest and Largest Values – In this tutorial, you’ll learn how to use Python’s built-in min() and max() functions to find the smallest and largest values. You’ll also learn how to modify their standard behavior by providing a suitable key function. Finally, you’ll code a few practical examples of using min() and max(). Discussion: Which Python Packages Do You Use the Most? Projects: pony: Pony Object Relational Mapper pepdocs: Read PEPs in Your Console Additional Links: Eric Snow on Twitter: “We need your help making CPython faster. 
Publish benchmarks for your app or library.” Episode #59: Organizing and Restructuring DjangoCon Europe 2021 – The Real Python Podcast PyGame: A Primer on Game Programming in Python – Real Python Make a 2D Side-Scroller Game With PyGame – Real Python Primer on Python Decorators – Real Python The Joel Test: 12 Steps to Better Code – Joel on Software Episode #97: Improving Your Django and Python Developer Experience – The Real Python Podcast pyflakes · PyPI coverage · PyPI pudb · PyPI Django | The web framework for perfectionists with deadlines django-awl · PyPI waelstow · PyPI six · PyPI Pygments — Welcome! chardet · PyPI certifi · PyPI pandas - Python Data Analysis Library Project Jupyter | Home Bokeh Level up your Python skills with our expert-led courses: Using Pygame to Build an Asteroids Game in Python Python's map() Function: Transforming Iterables Python Decorators 101 Support the podcast & join our community of Pythonistas
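The min() and max() topic above mentions changing the functions' standard behavior with a key function. A minimal sketch of that idea follows; the package names and counts are made-up sample data, not figures from the episode or the tutorial.

```python
# Hypothetical sample data: (package name, usage count) pairs.
usage = [("pandas", 42), ("six", 7), ("certifi", 19)]


def by_count(item):
    """Key function: compare entries by their count rather than their name."""
    return item[1]


# Without a key, tuples would be compared by name first.
least_used = min(usage, key=by_count)
most_used = max(usage, key=by_count)

# Any callable works as a key, including a lambda.
longest_name = max(usage, key=lambda item: len(item[0]))

# default= avoids a ValueError when the iterable is empty.
fallback = min([], default=None)

print(least_used, most_used, longest_name, fallback)
```

The key function is called once per item, and the original item (not the key's return value) is what min() and max() hand back.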
May 20, 2022 • 58min

Advantages of Protobuf for Serialization in Python

Would you like a way to send structured serialized data between different platforms and languages? What if the data was self-documenting, could automatically generate Python code, and would validate itself? This week on the show, Liran Haimovitch talks about protocol buffers and communicating with microservices through Remote Procedure Calls (RPC). Protocol buffers, aka protobuf, are a language-neutral, platform-neutral system for serializing structured data. Liran talks about how they go beyond text-based protocols like JSON, providing the benefits above, along with faster transmissions and a smaller footprint. Liran shares how his company uses protobuf to communicate between their tools. We also discuss using gRPC to communicate between microservices and scaling infrastructure in either direction. Course Spotlight: Testing Your Code With pytest In this video course, you’ll learn how to take your testing to the next level with pytest. You’ll cover intermediate and advanced pytest features such as fixtures, marks, parameters, and plugins. With pytest, you can make your test suites fast, effective, and less painful to maintain. Topics: 00:00:00 – Introduction 00:01:59 – PyCon US 2022 Talk on protobuf 00:04:46 – PyCon 2019 Talk on Understanding Python’s Debugging Internals 00:05:34 – The Production-First Mindset Podcast 00:07:03 – Protobuf and serialization 00:11:17 – Static vs dynamic serializers 00:13:58 – Text vs binary serializers and metadata 00:21:08 – How long have you been using protobuf? 00:21:40 – What does it look like to set up? 00:24:45 – Video Course Spotlight 00:26:11 – Performance challenges and trade-offs 00:34:29 – Remote procedure calls 00:41:13 – Using RPC for microservices 00:47:21 – Scaling your infrastructure up or down 00:50:35 – Working across different languages 00:54:02 – What is Rookout? 00:55:11 – What are you excited about in the world of Python? 00:55:59 – What do you want to learn next? 
00:56:57 – How can people learn more about what you do? 00:57:31 – Thanks and goodbye Show Links: Rookout | Painless Cloud-Native Debugging Liran Haimovitch - Understanding Python’s Debugging Internals - PyCon 2019 - YouTube The Production-First Mindset Podcast Liran Haimovitch: Effective Protobuf: Everything You Wanted To Know, But Never Dared To Ask - PyCon 2022 - YouTube Protocol Buffers  |  Google Developers Frequently Asked Questions - Protocol Buffers  |  Google Developers Thrift protocol stack — Thrift Tutorial 1.0 documentation Python Microservices With gRPC – Real Python gRPC - Modern Open Source High Performance Remote Procedure Call Framework gRPC vs REST: Understanding gRPC, OpenAPI and REST and when to use them in API design | Google Cloud Blog Welcome to PyCon US 2022 Rust Programming Language Level up your Python skills with our expert-led courses: Working With JSON in Python Testing Your Code With pytest Deploy Your Python Script on the Web With Flask Support the podcast & join our community of Pythonistas
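The text vs binary serializer distinction discussed in this episode can be illustrated with a rough sketch. This is not protobuf: it uses Python's standard-library json and struct modules as stand-ins, and the record, its field names, and the fixed layout are assumptions invented for the example. Real protocol buffers define the schema in a .proto file and use their own wire encoding.

```python
import json
import struct

# Hypothetical record; the field names are made up for illustration.
record = {"user_id": 1234, "score": 98.5, "active": True}

# Text-based: field names and punctuation travel with every message.
as_json = json.dumps(record).encode("utf-8")

# Binary with a schema agreed on in advance (stand-in for a .proto definition):
# "<" little-endian, no padding; "I" unsigned 32-bit int; "d" 64-bit float; "?" bool.
SCHEMA = struct.Struct("<Id?")
as_binary = SCHEMA.pack(record["user_id"], record["score"], record["active"])

# The receiver needs the same schema to make sense of the bytes.
user_id, score, active = SCHEMA.unpack(as_binary)

print(len(as_json), len(as_binary))
```

The binary form is smaller because the field names live in the shared schema instead of in each message, which is the general trade-off the episode describes: a smaller footprint in exchange for both sides needing the schema.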
