The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Mining the Vatican Secret Archives with TensorFlow w/ Elena Nieddu - TWiML Talk #243

Mar 27, 2019
Elena Nieddu, a PhD student at Roma Tre University, dives into her fascinating project, "In Codice Ratio," which aims to transcribe and annotate documents from the Vatican Secret Archives using machine learning. She discusses the challenges of traditional OCR and shares innovative strategies for improving accuracy in transcribing medieval manuscripts. Elena also highlights the unique crowdsourcing initiative involving high school students in Italy, empowering them while enriching the project with quality annotations. This intersection of history and technology promises to unlock hidden treasures of knowledge.
ANECDOTE

OCR Limitations

  • Initially, the In Codice Ratio project aimed to use OCR for transcription.
  • However, the team realized OCR was insufficient for handwritten documents and needed a smarter solution.
INSIGHT

Expert Skepticism and Scalability

  • Paleographers initially doubted that computers could transcribe ancient handwriting, citing the years of training the task requires.
  • Existing transcription systems demanded extensive manual annotation, which made them not cost-effective given the expert time involved.
ANECDOTE

High School Collaboration

  • In Codice Ratio partnered with high school students for data annotation, providing them with training and flexible work arrangements.
  • This collaboration benefited students with practical experience and allowed them to contribute to a real-world project.