Reoffending rates, Welsh taxes and the menopause

More or Less: Behind the Stats

Corpus Linguistics - What Do You Find?

Corpus linguistics is a methodology that uses large electronic collections of what we call naturally occurring language data. The really big modern corpora tend to run into the billions, and the really big ones that we can use tend to be from the web. Across the different corpora that I looked at, it's around 80% "data is" versus 20% "data are", and that seems to be replicated across different verbs as well. And you can figure out how often people say "data is" versus how often they say "data are" by looking at their comments on social media sites such as Twitter or Facebook. But "data" does definitely take a plural verb much more often in genres like academia
