Tools for Text Transcription (Nashi Predky@Home)

Transcribing letters, diaries, or other long texts is a common but tedious task faced by family historians. This presentation will demonstrate some software tools that can help.

Family historians are often faced with long text documents that they need to transcribe in order to create a print edition or a searchable digital resource. Such transcription projects can be daunting, tedious, and error-prone.

The UHEC's staff has also faced such projects. In this talk, archivist Michael Andrec will present some of the software tools that he and his co-workers have used to create large-scale transcriptions. In particular, he will talk about Transkribus, an EU-supported project of the "Read Co-op" that is formally based in Austria. Transkribus is a tool that allows you to upload page images, automatically detect the locations of text areas and lines, and then manually transcribe the text while the page image "follows along" with the position at which you are transcribing. It also offers a large number of trained AI models for printed and typewritten text (including ones that work for Ukrainian), as well as much more limited ones for handwritten text.

Along with an overview of Transkribus, Michael will talk about its strengths and weaknesses, especially as they relate to different types of documents and to Cyrillic text. He will also talk about other tools that could be of use in text transcription, including tools for extracting tabular or structured information from documents such as metrical books.

This is a free event. You must, however, register to attend.

June 18th, 2024 from 7:00 PM to 8:00 PM
Event Fee(s)
Voluntary donation $1.00