The chapter explores the development of Apache Arrow as a versatile library for efficient data processing across languages and systems. It discusses how Arrow optimizes its data structures for analytic operations, highlighting the benefits of columnar data organization for CPU and GPU processing efficiency. It also covers the evolution of data frame libraries such as Ibis, Modin, Polars, and Dask, each taking a different approach to building on or going beyond pandas for diverse use cases.
This episode dives into some of the most important data science libraries in the Python space with one of its pioneers: Wes McKinney. He's the creator or co-creator of the pandas, Apache Arrow, and Ibis projects, and an entrepreneur in this space.
Episode sponsors
Neo4j
Mailtrap
Talk Python Courses
Links from the show