All the key news since our episode on 6th November - including new research on AI in education, and a big tech news week!
It's okay to write research papers with Generative AI - but not to review them!
The publishing arm of the American Association for the Advancement of Science (they publish 6 science journals, including the "Science" journal) says authors can use “AI-assisted technologies as components of their research study or as aids in the writing or presentation of the manuscript” as long as their use is noted. But they've banned AI-generated images and other multimedia “without explicit permission from the editors”.
And they won't allow the use of AI by reviewers because this “could breach the confidentiality of the manuscript”.
A number of other publishers have made announcements recently, including the International Committee of Medical Journal Editors, the World Association of Medical Editors and the Council of Science Editors.
https://www.science.org/content/blog-post/change-policy-use-generative-ai-and-large-language-models
Learning From Mistakes Makes LLM Better Reasoner
https://arxiv.org/abs/2310.20689
News Article: https://venturebeat.com/ai/microsoft-unveils-lema-a-revolutionary-ai-learning-method-mirroring-human-problem-solving
Researchers from Microsoft Research Asia, Peking University, and Xi’an Jiaotong University have developed a new technique to improve large language models’ (LLMs) ability to solve math problems by having them learn from their mistakes, akin to how humans learn.
The researchers have revealed a pioneering strategy, Learning from Mistakes (LeMa), which trains AI to correct its own mistakes, leading to enhanced reasoning abilities, according to a research paper published this week.
The researchers first had models like LLaMA-2 generate flawed reasoning paths for math word problems. GPT-4 then identified errors in the reasoning, explained them and provided corrected reasoning paths. The researchers used the corrected data to further train the original models.
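The data-collection loop described above can be sketched roughly as follows. This is only an illustration, not the researchers' code: `generate_solution` and `correct_solution` are hypothetical stand-ins that return canned text where the real pipeline would call LLaMA-2 and GPT-4, and the check against a known answer is an assumption added to keep the example self-contained. The final fine-tuning step is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CorrectionExample:
    question: str
    flawed_solution: str
    error_explanation: str
    corrected_solution: str


def generate_solution(question: str) -> str:
    """Hypothetical stand-in for the model being trained (e.g. LLaMA-2)."""
    canned = {"What is 12 * 3 + 4?": "12 * 3 = 36. 36 + 4 = 41. Answer: 41"}
    return canned[question]


def correct_solution(question: str, flawed: str) -> Tuple[str, str]:
    """Hypothetical stand-in for the corrector model (GPT-4 in the paper)."""
    explanation = "The second step adds incorrectly: 36 + 4 is 40, not 41."
    corrected = "12 * 3 = 36. 36 + 4 = 40. Answer: 40"
    return explanation, corrected


def final_answer(solution: str) -> str:
    """Pull out whatever follows the last 'Answer:' marker."""
    return solution.rsplit("Answer:", 1)[-1].strip()


def build_correction_data(problems: Dict[str, str]) -> List[CorrectionExample]:
    """Collect (flawed attempt, explanation, correction) records."""
    data = []
    for question, known_answer in problems.items():
        attempt = generate_solution(question)
        if final_answer(attempt) == known_answer:
            continue  # only flawed attempts are useful here
        explanation, corrected = correct_solution(question, attempt)
        if final_answer(corrected) != known_answer:
            continue  # assumption: skip corrections that are still wrong
        data.append(CorrectionExample(question, attempt, explanation, corrected))
    return data


examples = build_correction_data({"What is 12 * 3 + 4?": "40"})
print(examples[0].error_explanation)
```

The resulting records are the kind of "corrected data" that would then be used to further train the original model.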
Role of AI chatbots in education: systematic literature review
International Journal of Educational Technology in Higher Education
https://educationaltechnologyjournal.springeropen.com/articles/10.1186/s41239-023-00426-1#Sec8
Looks at chatbots from the perspective of students and educators, and the benefits and concerns raised in the 67 research papers they studied
"We found that students primarily gain from AI-powered chatbots in three key areas: homework and study assistance, a personalized learning experience, and the development of various skills. For educators, the main advantages are the time-saving assistance and improved pedagogy. However, our research also emphasizes significant challenges and critical factors that educators need to handle diligently. These include concerns related to AI applications such as reliability, accuracy, and ethical considerations."
Also, a fantastic list of references for papers discussing chatbots in education, many from this year
More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems
https://arxiv.org/abs/2311.04926
https://arxiv.org/pdf/2311.04926.pdf
Parsons problems are a type of programming puzzle where learners are given jumbled code snippets and must arrange them in the correct logical sequence rather than producing the code from scratch
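For anyone who hasn't met one, here is a tiny text-only Parsons problem as a sketch. The function being assembled and the helper names (`make_puzzle`, `is_correct`) are made up for illustration, and the checker assumes there is exactly one valid ordering, which real Parsons problems don't always guarantee. The paper itself deals with images of these puzzles, which this sketch doesn't attempt.

```python
import random

# The intended solution: lines of a small function, in the correct order.
SOLUTION = [
    "def total(numbers):",
    "    result = 0",
    "    for n in numbers:",
    "        result += n",
    "    return result",
]


def make_puzzle(solution, seed=0):
    """Return the solution lines in a jumbled order (never the original order)."""
    rng = random.Random(seed)
    shuffled = list(solution)
    while shuffled == solution:
        rng.shuffle(shuffled)
    return shuffled


def is_correct(arrangement, solution=SOLUTION):
    """Simplest possible check: the learner's order must match exactly."""
    return list(arrangement) == list(solution)


puzzle = make_puzzle(SOLUTION)
for line in puzzle:
    print(line)
```

The learner's job is to rearrange the printed lines until `is_correct` returns `True`.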
"While some scholars have advocated for the integration of visual problems as a safeguard against the capabilities of language models, new multimodal language models now have vision and language capabilities that may allow them to analyze and solve visual problems. … Our results show that GPT-4V solved 96.7% of these visual problems"
The research's findings have significant implications for computing education. The high success rate of GPT-4V in solving visually diverse Parsons Problems suggests that relying solely on visual complexity in coding assignments might not effectively challenge students or assess their true understanding in the era of advanced AI tools. This raises questions about the effectiveness of traditional assessment methods in programming education and the need for innovative approaches that can more accurately evaluate a student's coding skills and understanding.
Interesting to note some research earlier in the year found that LLMs could only solve half the problems - so things have moved very fast!
The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4
https://arxiv.org/pdf/2311.07361.pdf
By Microsoft Research and Microsoft Azure Quantum researchers
"Our preliminary exploration indicates that GPT-4 exhibits promising potential for a variety of scientific applications, demonstrating its aptitude for handling complex problem-solving and knowledge integration tasks"
The study explores the impact of GPT-4 in advancing scientific discovery across various domains. It investigates its use in drug discovery, biology, computational chemistry, materials design, and solving Partial Differential Equations (PDEs). The study primarily uses qualitative assessments and some quantitative measures to evaluate GPT-4's understanding of complex scientific concepts and problem-solving abilities. While GPT-4 shows remarkable potential and understanding in these areas, particularly in drug discovery and biology, it faces limitations in precise calculations and processing complex data formats. The research underscores GPT-4's strengths in integrating knowledge, predicting properties, and aiding interdisciplinary research.
An Interdisciplinary Outlook on Large Language Models for Scientific Research
https://arxiv.org/abs/2311.04929
Overall, the paper presents LLMs as powerful tools that can significantly enhance scientific research. They offer the promise of faster, more efficient research processes, but this comes with the responsibility to use them well and critically, ensuring the integrity and ethical standards of scientific inquiry. It discusses how they are being used effectively in eight areas of science, and deals with issues like hallucinations - but, as it points out, even in Engineering where there's low tolerance for mistakes, GPT-4 can pass critical exams. The paper is a good starting point for researchers thinking about how LLMs may help or change their research areas, and how they can support scientific communication and collaboration.
With ChatGPT, do we have to rewrite our learning objectives -- CASE study in Cybersecurity
https://arxiv.org/abs/2311.06261
This paper examines how AI tools like ChatGPT can change the way cybersecurity is taught in universities. It uses a method called "Understanding by Design" to look at learning objectives in cybersecurity courses. The study suggests that ChatGPT can help students achieve these objectives more quickly and understand complex concepts better. However, it also raises questions about how much students should rely on AI tools. The paper argues that while AI can assist in learning, it's crucial for students to understand fundamental concepts from the ground up. The study provides examples of how ChatGPT could be integrated into a cybersecurity curriculum, proposing a balance between traditional learning and AI-assisted education.
"We hypothesize that ChatGPT will allow us to accelerate some of our existing LOs, given the tool’s capabilities… From this exercise, we have learned two things in particular that we believe we will need to be further examined by all educators. First, our experiences with ChatGPT suggest that the tool can provide a powerful means to allow learners to generate pieces of their work quickly…. Second, we will need to consider how to teach concepts that need to be experienced from “first-principle” learning approaches and learn how to motivate students to perform some rudimentary exercises that “the tool” can easily do for me."
A Step Closer to Comprehensive Answers: Constrained Multi-Stage Question Decomposition with Large Language Models
https://arxiv.org/abs/2311.07491
What this means is that AI is continuing to get better, and people are finding ways to make it even better, at passing exams and multi-choice questions
Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study
https://arxiv.org/abs/2311.07387
Good news for me though - I still have a skill that can't be replaced by a robot. It seems that AI might be great at playing Go, and Chess, and seemingly everything else. BUT it turns out it can't play Minesweeper as well as a person. So my leisure time is safe!
DEMASQ: Unmasking the ChatGPT Wordsmith
https://arxiv.org/abs/2311.05019
Finally, I'll mention this research, where the researchers have proposed a new method of ChatGPT detection that assesses the 'energy' of the writing. It might be a step forward, but tbh it took me a while to find the thing I'm always looking for with detectors, which is the False Positive rate - i.e. how many students in a class of 100 will it accuse of writing something with ChatGPT when they actually wrote it themselves. And the answer is it has a 4% false positive rate on research abstracts published on arXiv - but apparently it's 100% accurate on Reddit. Not sure that's really good enough for education use, where students are more likely to be using academic style than Reddit style!
I'll leave you to read the research if you want to know more, and learn about the battle between AI writers and AI detectors
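To put the 4% false positive rate in classroom terms, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that the rate measured on arXiv abstracts carries over to student writing, that every student wrote their own work, and that each student is flagged independently; none of that comes from the paper.

```python
def expected_false_accusations(class_size, false_positive_rate):
    """Average number of honest students wrongly flagged."""
    return class_size * false_positive_rate


def chance_of_at_least_one(class_size, false_positive_rate):
    """Probability that at least one honest student is wrongly flagged,
    assuming each student is flagged independently."""
    return 1 - (1 - false_positive_rate) ** class_size


print(expected_false_accusations(100, 0.04))
print(round(chance_of_at_least_one(100, 0.04), 3))
```

Under those assumptions, a class of 100 would see about four students wrongly accused on average, and it is very likely that at least one would be.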
Harvard's AI Pedagogy Project
And outside of research, it's worth taking a look at work from the metaLAB at Harvard called
"Creative and critical engagement with AI in education"
It's a collection of assignments and materials inspired by the humanities, for educators curious about how AI affects their students and their syllabi. It includes an AI starter, an LLM tutorial, lots of resources, and a set of assignments
Microsoft Ignite Book of News
There's way too much to fit into the shownotes, so just head straight to the Book of News for all the huge AI announcements from Microsoft's big developer conference
Link: Microsoft Ignite 2023 Book of News