Discover how a recent antitrust ruling ensures Mozilla's financial future while shaking up its historic ties with Google. Dive into the innovative troml tool that optimizes trove classifiers for Python projects, making package management easier. Learn about the advantages of Parquet files over traditional formats for handling complex datasets. Plus, catch a glimpse of exciting developments coming with Python 3.14 and get ready for the upcoming PyBay conference, where tech meets fun!
INSIGHT
Mozilla’s Fragile Funding From Google
Mozilla depends on Google's search payments for roughly 85% of its revenue, which makes its independence fragile.
The court ruling allows Google to keep paying Mozilla only if deals are non-exclusive, so Mozilla's future funding remains uncertain.
ADVICE
Diversify Revenue Before It's Too Late
Do diversify Mozilla's revenue sources instead of relying on Google to avoid existential risk.
Start building non-Google products and services now to survive potential funding loss.
ADVICE
Automate Trove Classifiers With troml
Run troml to auto-suggest or fix Trove classifiers in your pyproject.toml.
Use troml check in CI or pre-commit to ensure classifiers stay accurate and exact.
Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.
Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list; we'll never share it.
A judge lets Google keep paying Mozilla to make Google the default search engine, but only if those deals aren’t exclusive.
More than 85% of Mozilla’s revenue comes from Google search payments.
The ruling forbids Google from making exclusive contracts for Search, Chrome, Google Assistant, or Gemini, and forces data sharing and search syndication so rivals get a fighting chance.
Note that just saying you require 3.9+ doesn’t tell the user that you’ve actually tested things on 3.14. I like to keep Trove classifiers around for this reason.
Also, the License classifier is deprecated, and if you include it, the license shows up in two places: the Meta section and the Classifiers section. It’s probably better to have it in only one place, so I’m going to be removing it from the classifiers for my projects.
One problem, classifier text has to be an exact match to something in the classifier list, so we usually recommend copy/pasting from that list.
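As a sketch, a classifiers list in pyproject.toml might look like this (the project name and version list are illustrative; each string must exactly match an entry in the official list at pypi.org/classifiers):

```toml
[project]
name = "my-package"  # illustrative
classifiers = [
    "Development Status :: 5 - Production/Stable",
    "Intended Audience :: Developers",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
]
```

A single typo (say, `Python::3.9` without spaces) makes the classifier invalid, which is exactly the tedium troml is meant to remove.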
But no longer! Just use troml!
It just fills them in for you (if you run troml suggest --fix). How totally awesome is that!
I tried it on pytest-check, and it was mostly right. It suggested adding 3.15, which I haven’t tested yet, so I’m not ready to add that just yet. :)
pqrs is a command line tool for inspecting Parquet files
This is a replacement for the parquet-tools utility, written in Rust
Built using the Rust implementation of Parquet and Arrow
pqrs roughly means "parquet-tools in rust"
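A few typical invocations, as a sketch (assumes pqrs is installed and a data.parquet file exists; check `pqrs --help` for the current subcommands in your version):

```shell
# Print the file's schema
pqrs schema data.parquet

# Show the first few records
pqrs head data.parquet

# Count the rows
pqrs rowcount data.parquet

# Dump the contents
pqrs cat data.parquet
```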
Why Parquet?
Size
A 200 MB CSV will usually shrink to somewhere between about 20–100 MB as Parquet, depending on the data and compression. Loading a Parquet file is typically several times faster than parsing CSV, often 2x–10x faster for a full-file load and much faster when you only read some columns.
Speed
Full-file load into pandas: Parquet with pyarrow/fastparquet is usually 2x–10x faster than reading CSV with pandas because CSV parsing is CPU intensive (text tokenizing, dtype inference).
Example: if read_csv is 10 seconds, read_parquet might be ~1–5 seconds depending on CPU and codec.
Column subset: Parquet is much faster if you only need some columns — often 5x–50x faster because it reads only those column chunks.
Predicate pushdown & row groups: When using dataset APIs (pyarrow.dataset) you can push filters to skip row groups, reducing I/O dramatically for selective queries.
Memory usage: Parquet avoids temporary string buffers and repeated parsing, so peak memory and temporary allocations are often lower.
Brian #4: Testing for Python 3.14
Python 3.14 is just around the corner, with a final release scheduled for October.