779: The Tidyverse of Essential R Libraries and their Python Analogues, with Dr. Hadley Wickham
Apr 30, 2024
Dr. Hadley Wickham discusses Tidyverse essentials for data scientists, Posit's goals, bringing programming languages together, principles for tech longevity, ggplot2 development, and contributing to open-source projects.
The Tidyverse is essential for data scientists, Posit's rebrand prioritizes community impact, and Quarto transforms Jupyter Notebooks into polished presentations.
Deep dives
S7 Object-Oriented Programming Aimed at Finding the Sweet Spot between S3 and S4
S7, formerly known as R7, seeks to merge the lightweight conventions of S3 with the formalism of S4 in a backward-compatible way. By combining the best features of both, S7 aims to provide a balanced approach to object-oriented programming in R, with easier coding, better documentation, and improved tooling.
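For readers coming from Python, the trade-off described above can be loosely illustrated with standard-library tools. This is only an analogy and not S7 itself: the `Dog` and `Cat` classes and the `speak` generic are hypothetical names chosen for illustration. A dataclass with validation stands in for a formally declared class, and `functools.singledispatch` stands in for a generic function whose methods are registered per class.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Dog:
    """A formally declared class: fields are listed up front and checked on creation."""
    name: str
    age: float

    def __post_init__(self):
        # Validate at construction time (hypothetical rules, for illustration only).
        if not isinstance(self.name, str):
            raise TypeError("name must be a string")
        if self.age < 0:
            raise ValueError("age must be non-negative")


@dataclass
class Cat:
    """A second class, so the generic below has two methods to choose from."""
    name: str


@singledispatch
def speak(x):
    """Generic function: this fallback runs when no method is registered for type(x)."""
    raise TypeError(f"no speak() method for {type(x).__name__}")


@speak.register(Dog)
def _(x):
    # Method for Dog, registered on the generic rather than defined inside the class.
    return f"{x.name} says woof"


@speak.register(Cat)
def _(x):
    # Method for Cat.
    return f"{x.name} says meow"


print(speak(Dog("Rex", 3)))
print(speak(Cat("Tom")))
```

The point of the sketch is the shape of the design: class definitions carry a declared structure and validation, while behavior lives in generics that dispatch on the class of their argument.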
Future of Data Communication and Publication with Quarto
Quarto functions as a scientific and technical communication tool, enabling users to seamlessly integrate code, results, and textual analysis. It ensures document reproducibility, allowing for quick reexecution of the code to update the document with the latest data. Quarto revolutionizes data publication by providing a high-fidelity, interactive, and fully reproducible document format.
Object-Oriented Programming Principles in S7 and its Impact on Data Scientists
The S7 approach to object-oriented programming, focusing on finding a balance between lightweight conventions and formal structures, may not drastically impact a data scientist's daily work. However, the implementation in R packages can enhance developer productivity, code efficiency, and maintainability, benefiting data science projects in the long run.
Legacy Building Philosophies of Posit for Long-term Technological Impact
Posit's focus on being a public benefit corporation demonstrates a commitment beyond short-term profit. By prioritizing stakeholders beyond shareholders, such as the community and employees, Posit aims to create impactful tools that improve data scientists' lives. The long-term vision, not driven by VC pressures, allows Posit to prioritize the right actions over expedient decisions, fostering a culture of innovation and social responsibility.
The Power of Quarto for Data Scientists
Quarto is a powerful tool that can transform Jupyter Notebooks into polished presentations or documents suitable for sharing with a broader audience. Because Quarto is built on top of Pandoc, users can generate various output formats, such as PowerPoint presentations, websites, and physical books, from the same source. This flexibility allows for easy dissemination of complex data insights in a more digestible format, bridging the gap between technical and non-technical stakeholders.
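As a rough sketch of the "one source, many formats" idea, the snippet below builds `quarto render` commands for a single notebook. The file name `analysis.ipynb` and the helper functions are hypothetical and not from the episode; the sketch assumes the Quarto command-line tool is installed when `dry_run=False`, and by default it only prints the commands without running them.

```python
import shutil
import subprocess


def quarto_render_command(source, output_format):
    """Build the argument list for `quarto render <source> --to <format>`."""
    return ["quarto", "render", source, "--to", output_format]


def render_all(source, formats, dry_run=True):
    """Build one render command per format; run them only when dry_run is False."""
    commands = [quarto_render_command(source, fmt) for fmt in formats]
    if not dry_run:
        # Fail early with a clear message if the Quarto CLI is not available.
        if shutil.which("quarto") is None:
            raise RuntimeError("quarto CLI not found on PATH")
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


# Hypothetical notebook name; html, pptx, and revealjs are Quarto output formats.
commands = render_all("analysis.ipynb", ["html", "pptx", "revealjs"])
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the dry run as the default makes the sketch safe to try on a machine without Quarto installed; passing `dry_run=False` would hand each command to the real CLI.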
Expanding Beyond the R Community
The podcast delves into RStudio's rebranding to Posit and its efforts to engage with a wider audience, particularly in the Python community. The challenge of reaching out to a larger and more diverse user base prompts discussions on evolving marketing strategies and community engagement tactics. As the data science landscape shifts, Posit aims to adapt its approach by listening to community feedback, addressing pain points, and exploring innovative ways to connect with a broader demographic.
Tidyverse, ggplot2, and the secret to a tech company’s longevity: Hadley Wickham talks to Jon Krohn about Posit’s rebrand, Tidyverse and why it needs to be in every data scientist’s toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career.
In this episode you will learn:
• All about the Tidyverse [04:46]
• Hadley’s favorite R libraries [17:10]
• The goal of Posit [30:29]
• On bringing multiple programming languages together [36:02]
• The principles for a long-lasting tech company [52:10]
• How Hadley developed ggplot2 [55:24]
• How to contribute to the open-source community [1:05:43]