AI Helps Transcribe the Records of Persia's Achaemenid Empire

Monday, March 16, 2020 - 11:38

Researchers from the Oriental Institute (OI) and UChicago's Department of Computer Science have succeeded in using AI to help transcribe the "paperwork" of Persia's Achaemenid Empire, which was recorded on clay tablets that reveal rich information about Achaemenid history, society and language.

According to the University of Chicago official website, twenty-five centuries ago, the "paperwork" of Persia's Achaemenid Empire was recorded on clay tablets—tens of thousands of which were discovered in 1933 in modern-day Iran by archaeologists from the University of Chicago's Oriental Institute.

With a training set of more than 6,000 annotated images from the Persepolis Fortification Archive, the Center for Data and Computing-funded project will build a model that can "read" as-yet-unanalyzed tablets in the collection, and potentially a tool that archaeologists can adapt to other studies of ancient writing.

“If we could come up with a tool that is flexible and extensible, that can spread to different scripts and time periods, that would really be field-changing,” said Susanne Paulus, associate professor of Assyriology.

The collaboration began when Paulus, Sandra Schloen and Miller Prosser of the OI met Asst. Prof. Sanjay Krishnan of the Department of Computer Science at a Neubauer Collegium event on digital humanities. Schloen and Prosser oversee OCHRE, a database management platform supported by the OI to capture and organize data from archaeological excavations and other forms of research. Krishnan applies deep learning and AI techniques to data analysis, including video and other complex data types. The overlap was immediately apparent to both sides.

With resources from the UChicago Research Computing Center, Krishnan used the annotated images from the Persepolis Fortification Archive to train a machine learning model, similar to those used in other computer vision projects. When tested on tablets not included in the training set, the model correctly identified cuneiform signs about 80% of the time. Ongoing research will try to nudge that number higher while examining what accounts for the remaining 20%.

But even 80% accuracy can immediately provide help for transcription efforts. Many of the tablets describe basic commercial transactions, similar to “a box of Walmart receipts,” Paulus said. And a system that can’t quite make up its mind may still be useful.

“If the computer could just translate or identify the highly repetitive parts and leave it to an expert to fill in the difficult place names or verbs or things that need some interpretation, that gets a lot of the work done,” said Paulus, the Tablet Collection Curator at the OI. “And if the computer can't make a definitive decision, if it could give us back probabilities or the top four ranks, then an expert has a place to start. That would be amazing.”
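The "probabilities or the top four ranks" that Paulus describes can be illustrated with a minimal sketch. It assumes a classifier that emits one raw score per candidate sign; the function name, the sign labels, and the scores below are hypothetical placeholders, not taken from the project's actual model.

```python
import math


def top_k_candidates(scores, k=4):
    """Convert raw per-sign scores to probabilities and return the k best.

    `scores` maps a candidate sign label to a raw model score (an assumption
    about the model's output; the real system may differ).
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    exps = {label: math.exp(s - max_score) for label, s in scores.items()}
    total = sum(exps.values())
    # Softmax: normalize so the probabilities sum to 1.
    probs = {label: e / total for label, e in exps.items()}
    # Rank candidates from most to least probable and keep the top k.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for a single sign image; labels are placeholders.
example_scores = {
    "sign_A": 2.1,
    "sign_B": 1.9,
    "sign_C": 0.3,
    "sign_D": -0.5,
    "sign_E": -1.2,
}

for label, prob in top_k_candidates(example_scores):
    print(f"{label}: {prob:.2f}")
```

When the top two candidates have nearly equal probabilities, as with the two leading placeholder signs here, an expert can see at a glance that the reading is uncertain and knows which alternatives to check first.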

