The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Mining the Vatican Secret Archives with TensorFlow w/ Elena Nieddu - TWiML Talk #243

Mar 27, 2019

Elena Nieddu, a PhD student at Roma Tre University, dives into her fascinating project, "In Codice Ratio," which aims to transcribe and annotate documents from the Vatican Secret Archives using machine learning. She discusses the challenges of traditional OCR and shares innovative strategies for improving accuracy in transcribing medieval manuscripts. Elena also highlights the unique crowdsourcing initiative involving high school students in Italy, empowering them while enriching the project with quality annotations. This intersection of history and technology promises to unlock hidden treasures of knowledge.

ANECDOTE

OCR Limitations

Initially, the In Codice Ratio project aimed to use OCR for transcription.
However, the team realized that OCR was insufficient for handwritten documents and that a smarter solution was needed.

INSIGHT

Expert Skepticism and Scalability

Paleographers initially doubted that computers could transcribe ancient handwriting, citing the years of training the task requires.
Existing transcription systems demanded extensive manual annotation, which made them not cost-effective for experts.

ANECDOTE

High School Collaboration

In Codice Ratio partnered with high school students for data annotation, providing them with training and flexible work arrangements.
This collaboration benefited students with practical experience and allowed them to contribute to a real-world project.
