Eric Snow and Anthony Shaw discuss recent developments in Python, including sub-interpreters, the Faster CPython project, async work, and the adoption of typing. They explore how sub-interpreters run inside a single Python process and how they provide isolation and enable concurrency. The discussion also covers consolidating global state, exposing sub-interpreters through the concurrent.futures API, managing Python processes with sub-interpreters, and implementing multi-phase init in extension modules so that they support sub-interpreters.
Python has introduced sub-interpreters, allowing for isolation and parallelization of code, promising more efficient and scalable applications.
The Faster CPython initiative, sub-interpreters, and optimization methods are contributing to Python's performance improvements.
Sub-interpreters offer benefits such as isolation and parallelism, but developers need to consider shared data and potential impacts on existing code.
Deep dives
Python's new isolation and parallelization capability through sub-interpreters
Sub-interpreters allow Python code to be isolated and run in parallel. Each sub-interpreter is a separate instance of the Python runtime inside the same process, with its own global state, and can run concurrently with and independently of the others. Sub-interpreters have long existed in CPython's C API, but Python 3.12 gave each one its own global interpreter lock (PEP 684), which is what makes true parallelism possible. This long-anticipated change opens the door to more efficient and scalable Python applications.
Faster CPython and the adoption of sub-interpreters
The episode covers the progress being made on Python's performance through the Faster CPython initiative, the introduction of sub-interpreters, and advances in optimization techniques. The Faster CPython team, which includes Eric Snow, has been applying concepts and optimization methods from other dynamic languages to Python. The performance work is still in progress, but there is optimism about significant gains in Python 3.13. Collaboration with other teams, such as those at Meta, has helped share ideas and effort toward making Python more efficient.
Benefits and challenges of sub-interpreters
Sub-interpreters bring isolation and parallelism to Python, but they also introduce challenges. Developers choose explicitly what is shared between interpreters, which makes thread-safety problems easier to reason about than with threads, where any object may be shared implicitly and data races can stay hidden. These benefits come at a cost: shared data must be designed carefully, and existing code or extension modules may need changes. The upcoming Python 3.13 is set to introduce a Python-level API for creating and interacting with sub-interpreters, making them more accessible for developers.
Using pickle for data transfer in multiprocessing
When using multiprocessing in Python, the typical way to send data to a process is through pickling. Data can be passed as parameters or through a queue or pipe, and the pickle module is used to convert Python objects into byte strings for transmission. However, pickling has limitations, as not all objects can be pickled and some complex objects may not be rehydrated properly. An alternative to pickle is the dill package, which can handle more exotic objects.
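The round trip described above can be sketched with the standard pickle module alone. This is a minimal illustration, not code from the episode: the `task` dictionary and its contents are made up, and no process is actually started. It shows the bytes that would travel over a queue or pipe, and one common kind of object (a lambda) that the standard pickler refuses.

```python
import pickle

# Hypothetical payload: built-in containers of built-in types pickle cleanly.
task = {"job_id": 7, "values": [1.5, 2.5, 3.0]}

payload = pickle.dumps(task)      # bytes that could be sent over a pipe or queue
restored = pickle.loads(payload)  # what the receiving process would rebuild

# A lambda has no importable name, so the standard pickler rejects it.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(restored == task, restored is task, lambda_pickled)
```

The restored object is an equal copy rather than the same object, which is why changes made in a worker process are not visible to the parent unless they are sent back.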
Sub-interpreters as a more efficient alternative to multiprocessing
Unlike the separate processes used by multiprocessing, sub-interpreters live in the same process, yet they can still run tasks fully in parallel. The APIs for sub-interpreters are similar to those of multiprocessing, but the overhead and startup time are much lower: starting a sub-interpreter is approximately 20-30 times faster than starting a separate process, and the memory and CPU requirements are smaller. However, an application built on sub-interpreters must be designed carefully around data sharing and locking, and it must avoid modules that rely on global state and may not work correctly in more than one interpreter.
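As a rough sketch of how similar the pool-style API can look, the example below uses concurrent.futures. It assumes that `InterpreterPoolExecutor` is available only on newer Python versions (3.14 and later, to the best of my knowledge) and falls back to `ThreadPoolExecutor` elsewhere so that it still runs. The helper name `run_factorials` is hypothetical, and the fallback does not provide the isolation or parallelism discussed above.

```python
import concurrent.futures
import math

# Assumption: InterpreterPoolExecutor exists only on newer Pythons (3.14+).
# On older versions, fall back to threads so the sketch still runs.
Executor = getattr(concurrent.futures, "InterpreterPoolExecutor", None)
if Executor is None:
    Executor = concurrent.futures.ThreadPoolExecutor


def run_factorials(numbers):
    """Compute factorials in worker interpreters (or threads on older Pythons)."""
    with Executor(max_workers=2) as pool:
        # math.factorial is importable by name, so it can be handed to a worker.
        return list(pool.map(math.factorial, numbers))


results = run_factorials([5, 10])
print(results)
```

A function that can be looked up by module and name is used on purpose: arguments and callables crossing an interpreter boundary face restrictions similar to those of pickling, so closures and lambdas are a less safe choice here.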