S3E20: The Rise of Machine Learning in the Social Sciences with Doug Steinley
Feb 15, 2022
Doug Steinley, Professor in the Department of Psychology at the University of Missouri at Columbia and current editor of the APA journal Psychological Methods, discusses the rise of machine learning in the social sciences. The conversation explores the tension between prediction and explanation, the challenges of incorporating new techniques into the curriculum, and the foundational concepts behind machine learning and statistical techniques.
Machine learning methods can be used for prediction and classification in social science research.
Machine learning techniques aid in theory development by revealing hidden structures and relationships in data.
Variable selection is a challenge in machine learning models: researchers must balance including relevant variables against the risk of overfitting.
Deep dives
Machine learning methods in social science
The episode explores the use of machine learning methods in the social science landscape. The discussion focuses on the application of these methods for prediction and classification. It highlights the importance of prediction in statistical modeling and challenges the traditional emphasis on explanation. The episode also emphasizes the need to balance theory with data exploration, using machine learning techniques to inform and refine theoretical models. It suggests that machine learning methods can be used to reveal underlying structures and subgroups in data, leading to more personalized and effective approaches in social science research.
Finding insights with machine learning
The episode discusses the value of machine learning in uncovering patterns and insights in data. It advocates for the use of machine learning techniques in exploratory data analysis, highlighting the benefits of tools such as principal component analysis and cluster analysis. It suggests that these techniques can aid in theory development by revealing hidden structures and relationships in the data. The episode also emphasizes the importance of understanding the limitations and assumptions of machine learning models, such as the need for cross-validation and external validation with new data sets.
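To make the cross-validation idea above concrete, here is a minimal sketch in plain Python. It is not taken from the episode: the simulated data, the one-predictor regression, and the function names `fit_line` and `k_fold_mse` are all assumptions chosen for illustration. The point is only that the model is always scored on observations it did not see during fitting.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error across k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    fold_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in fold]
        fold_errors.append(statistics.fmean(squared_errors))
    return statistics.fmean(fold_errors)


# Simulated data (hypothetical): y = 2 + 0.5 * x + noise with unit variance.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 + 0.5 * x + rng.gauss(0, 1) for x in xs]

cv_mse = k_fold_mse(xs, ys, k=5)
print(f"5-fold cross-validated MSE: {cv_mse:.3f}")
```

Because the simulated noise has variance 1, the held-out error should land near 1. External validation, as the summary notes, would go one step further and score the model on a separately collected data set.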
Variable selection and optimization
The episode explores the challenges of variable selection in machine learning models. It acknowledges the risk of overfitting and the need for efficient and effective optimization routines, and it stresses selecting relevant variables while avoiding unnecessary model complexity. It also suggests that there is room for improvement in variable selection methods, such as the development of new penalty functions and regularization techniques.
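As one illustration of how a penalty function performs variable selection, the sketch below fits a lasso (L1-penalized) regression by coordinate descent in plain Python. The lasso is chosen here as a familiar example and is not named in the summary above; the simulated data, the penalty value `lam = 0.3`, and the function names are assumptions. Predictors are standardized and the outcome is centered, so the objective being minimized is (1/2n) times the residual sum of squares plus `lam` times the sum of absolute coefficients.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def standardize(column):
    """Center a column and scale it to unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / sd for v in column]


def lasso(columns, y, lam, sweeps=200):
    """Coordinate descent for the lasso on standardized predictors."""
    n = len(y)
    cols = [standardize(c) for c in columns]
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]  # residuals with all coefficients at zero
    beta = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            # Correlation of predictor j with the residual that excludes its own effect.
            rho = sum(x * (r + x * beta[j]) for x, r in zip(col, resid)) / n
            new_beta = soft_threshold(rho, lam)
            delta = new_beta - beta[j]
            if delta != 0.0:
                resid = [r - x * delta for x, r in zip(col, resid)]
                beta[j] = new_beta
    return beta


# Simulated data (hypothetical): only the first predictor affects the outcome.
rng = random.Random(1)
n_obs = 200
x1 = [rng.gauss(0, 1) for _ in range(n_obs)]
x2 = [rng.gauss(0, 1) for _ in range(n_obs)]
x3 = [rng.gauss(0, 1) for _ in range(n_obs)]
y = [3 * a + rng.gauss(0, 1) for a in x1]

beta = lasso([x1, x2, x3], y, lam=0.3)
print("standardized coefficients:", [round(b, 3) for b in beta])
```

The soft-threshold step is what sets weak coefficients to exactly zero, which is the sense in which the penalty selects variables. A larger `lam` removes more predictors at the cost of shrinking the ones that remain, which is the trade-off between relevance and overfitting that the summary describes.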
Linking theory and statistical models
The episode considers the relationship between theoretical models and statistical models in social science research. It encourages researchers to connect their theoretical understanding with the distributional characteristics of their data. It suggests that machine learning methods can help bridge this gap by providing insights into the data and informing the development of theoretical models. The episode underscores the importance of visualization and exploratory data analysis in understanding data structures and informing model building. It encourages researchers to be open to revisiting and refining their theories based on insights gained from machine learning techniques.
Expanding the curriculum in quantitative methods
The episode discusses the need to update and expand the curriculum in quantitative methods to incorporate machine learning techniques. It suggests that foundational topics such as matrix algebra and eigenvalue decomposition are crucial for understanding and applying these methods. The episode highlights the availability of resources such as textbooks and workshops that can serve as entry points for learning about machine learning. It emphasizes the importance of balancing traditional statistical methods with contemporary machine learning approaches in order to stay current and better address research questions in the social sciences.
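To show why eigenvalue decomposition is treated as foundational, the following sketch finds the leading eigenvalue and eigenvector of a small covariance matrix by power iteration, which is the core computation behind the first principal component. The 2 x 2 matrix and the function names are hypothetical and chosen so the answer can be checked by hand; in practice a linear algebra library routine would be used instead.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def power_iteration(matrix, steps=100):
    """Approximate the leading eigenvalue and eigenvector of a symmetric matrix."""
    # The start vector must not be orthogonal to the leading eigenvector.
    vector = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(steps):
        product = mat_vec(matrix, vector)
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    # Rayleigh quotient: v' M v for a unit-length v.
    eigenvalue = sum(a * b for a, b in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


# Hypothetical covariance matrix of two variables, each with variance 2 and covariance 1.
cov = [[2.0, 1.0],
       [1.0, 2.0]]

eigenvalue, eigenvector = power_iteration(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # trace of the matrix
share = eigenvalue / total_variance

print("leading eigenvalue:", round(eigenvalue, 6))
print("first component weights:", [round(w, 6) for w in eigenvector])
print("share of variance explained:", round(share, 6))
```

The eigenvector gives the weights that define the first principal component, and the eigenvalue divided by the trace gives the proportion of total variance that component accounts for.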
Patrick and Greg discuss the rise of machine learning in the social sciences with guest Doug Steinley, Professor in the Department of Psychology at the University of Missouri at Columbia and current editor of the APA journal Psychological Methods. Along the way they also mention funeral expenses, Swedish massage, Amy the Chatbot, irony versus coincidence, lavender bath bombs, varmint removal, Planet of the Apes, Voltron, the Cookie Monster, theory smoothies, Bugs Bunny and Yosemite Sam, ironing your Christmas paper, and meat grinders.