Research on open source software has a rich history, with CHAOSS bringing together diverse research metrics under a common umbrella. By standardizing how metrics are counted, CHAOSS fosters consistency and comparability among researchers, ensuring uniformity in data interpretation.
Concerns arise regarding the use of open source data, leading to ethical considerations in research and industry applications. Collaborative efforts between academia and industry aim to bridge gaps in research methodologies, data privacy, and ethical data usage, emphasizing the need for clear standards and careful data presentation.
Open source research delves into language variations, social network influences, and cultural etiquette, enriching the understanding of open source dynamics beyond computational algorithms. Multimodal research methods, including language analysis, surveys, and social engagements, broaden the perspective and insights drawn from open source ecosystems.
The convergence of academic research and industry applications in open source presents opportunities for synergistic collaborations. Initiatives promoting common taxonomies, aligning research incentives, and supporting cross-disciplinary projects foster a more holistic and impactful approach to open source research.
Panelists discuss their recent milestones and collaborations in inclusive open source research, cultural dialogues, book publications, and keynote presentations. The emphasis lies on sharing insights, leveraging diverse research methodologies, and fostering global partnerships to advance open source understanding and practices.
Thank you to the folks at Sustain for providing the hosting account for CHAOSScast!
CHAOSScast – Episode 79
In this episode, host Georg Link is joined by Daniel, Anita, Sophia, and Sean to discuss their research experiences with CHAOSS metrics and software for open source community health analysis. They dive into various topics, such as collecting and interpreting data from different perspectives, considerations regarding privacy and ethics, and the importance of collaboration between academics and industry professionals. They also highlight some significant projects and studies where CHAOSS metrics and software were employed, and share their hopes and concerns for the future direction of research in the field. Furthermore, they discuss the necessity of bridging the gap between academia and industry and touch on the importance of linguistics and cultural context when examining data. Download this episode now!
[00:02:48] Anita discusses the history of open source software research and how CHAOSS provides a common framework for various metrics used by researchers, and Sean emphasizes the standardization of metrics by CHAOSS, which aids in consistency across research.
[00:04:52] Sophia highlights the discrepancies in metric calculations and definitions, seeking standard methodologies, especially for non-academic publications, and Daniel reflects on the differences in research approaches between academia and industry, emphasizing the importance of methodological rigor.
[00:08:25] Sean critiques academic papers for often lacking complete method descriptions, calling for more rigorous methodological transparency, and Daniel shares about transitioning from academia to industry and the different expectations for communication and results.
[00:10:44] Georg inquires about the impact of CHAOSS research capabilities, and Daniel explains that CHAOSS is shaping research by reflecting the interests and observations of its contributors.
[00:12:16] Sean talks about the increased capacity for research offered by CHAOSS, particularly through tools like GrimoireLab and Augur. Anita shares her experience using GrimoireLab to create interventions and dashboards that let open source communities monitor their projects, and Daniel adds historical context and mentions the importance of tools that allow analyses to be replicated in research.
[00:17:10] Georg introduces a study using CHAOSS metrics and software that hasn’t been officially published yet, and Sophia shares some details and explains the study’s premise.
[00:21:00] Anita raises a philosophical point about the potential limitations of metrics, suggesting that they may only reflect what is observable and could lead to gamification if people optimize their behavior based on the metrics.
[00:22:14] Sean speaks about the importance of deep field engagement and the combination of social science with data mining to fully understand the data’s underlying human behavior. Sophia shares her perspective from market research, discussing the design of surveys, the selection bias inherent in data collection, and the importance of understanding the population that is excluded by the research filters used.
[00:25:56] Anita discusses the challenges of academic surveys, and Daniel discusses the bias that may arise from the data available.
[00:28:10] Sophia contemplates the behavioral nuances dictated by different platforms’ processes, and Sean suggests a focus on common software engineering processes across different tools and advocates for social scientific research in open source to better understand the human aspects.
[00:30:32] Georg transitions to discussing survey methodologies and their relation to CHAOSS metrics, and Anita shares her experiences with survey design for the international Apache Software Foundation community and implementation.
[00:33:10] Daniel reflects on the collaborative effort with the ASF community to ensure the survey’s terms and questions were appropriately adapted for an international audience. Sophia suggests the need for a consistent taxonomy in research to ensure cultural sensitivity and understanding.
[00:36:15] Sean touches on the use of large language models in research to identify common language patterns, discussing the ethical considerations of using machine learning to evaluate inclusivity in projects. Anita shares thoughts on presenting survey data responsibly and the need for careful consideration of what information is shared.
[00:38:53] Georg questions the future direction for research in open source using metrics and software. Sean advocates for deeper social scientific engagement. Anita points out the silos between industry and academia, highlighting the need for more interaction and collaboration to synergize efforts and ask more relevant questions. Sophia stresses the need to focus on gaps in data and to consider work not visible in trace data.
[00:42:59] Daniel brings a pessimistic view, cautioning that the different goals of industry and academia might lead to problems unless they find ways to work together more effectively.
[00:44:11] Georg asks Daniel to clarify the problems he foresees with the current research trajectories. Daniel elaborates on the potential ethical and legal issues that may arise when data is used beyond the limits of fair use, such as in mental health analysis from developer messages, and Sean and Anita add some thoughts as well.
Value Adds (Picks) of the week:
Panelists:
Georg Link
Sean Goggins
Daniel Izquierdo
Anita Sarma
Sophia Vargas
Links:
Mining Software Repositories (MSR) conference 2024
CHAOSSCon EU 2024-Brussels Livestream (YouTube)
Language Variation and Change in Social Networks by Robin Dodsworth and
From the Soil: The Foundations of Chinese Society by Fei Xiaotong
“Counting Potatoes: the Size of Debian 2.2” (UPGRADE-Open Source/Free Software: Towards Maturity)
Special Guest: Anita Sarma.