

cuDF, cuML & RAPIDS: GPU Accelerated Data Science with Paul Mahler - TWiML Talk #254
Apr 19, 2019
Paul Mahler, a senior data scientist at NVIDIA, discusses RAPIDS, an open-source project for GPU-accelerating data science workflows. He shares his journey from philosophy to machine learning and highlights tools such as cuDF and cuML that speed up data processing. The conversation covers how GPUs speed up algorithms like ridge regression, and touches on work in natural language processing and graph algorithms. Paul also emphasizes the importance of community involvement in shaping the future of these technologies.
AI Snips
From Economics to Data Science
- Paul Mahler's interest in economics stemmed from reading The Economist.
- His shift to data science was inspired by an article about an algorithm providing screenplay feedback.
RAPIDS: GPU Acceleration for Data Science
- RAPIDS, including cuDF and cuML, aims to GPU-accelerate data science workflows.
- This acceleration significantly reduces processing time, akin to moving from driving to flying.
RAPIDS Ecosystem
- RAPIDS comprises sub-libraries like cuDF (GPU DataFrame) and cuML (machine learning toolkit).
- cuDF mirrors Pandas, while cuML aims to provide Scikit-learn functionality on GPUs.
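To illustrate what "cuDF mirrors Pandas" can look like in practice, here is a minimal sketch in which the same group-by code runs against either library. The helper names `get_dataframe_lib` and `mean_by_key` are hypothetical and were not mentioned in the episode. The sketch assumes cuDF is installed on a machine with an NVIDIA GPU and falls back to pandas otherwise.

```python
def get_dataframe_lib():
    """Return cudf when it can be imported, otherwise pandas.

    Hypothetical helper for illustration only. cudf needs an NVIDIA GPU
    and a RAPIDS install, so any failure here falls back to pandas.
    """
    try:
        import cudf
        return cudf
    except Exception:
        import pandas
        return pandas


def mean_by_key(xdf, keys, values):
    """Group values by key and return the per-key mean as a plain dict.

    `xdf` is either the cudf or the pandas module; the DataFrame and
    groupby calls below are spelled the same way in both.
    """
    df = xdf.DataFrame({"key": keys, "value": values})
    means = df.groupby("key")["value"].mean()
    # A cudf Series lives on the GPU; copy it to the host before converting.
    if hasattr(means, "to_pandas"):
        means = means.to_pandas()
    return means.sort_index().to_dict()


if __name__ == "__main__":
    lib = get_dataframe_lib()
    print(lib.__name__, mean_by_key(lib, ["a", "a", "b", "b"], [1.0, 3.0, 4.0, 6.0]))
```

cuML follows a similar idea for models: for example, `cuml.linear_model.Ridge` is used with the `fit` and `predict` pattern familiar from Scikit-learn, though coverage of Scikit-learn's full feature set should not be assumed.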