Predicting the Past? Unexpected applications of AI

Much of history has been lost to natural or human-induced degradation of text—AI, however, may overcome this. Photo credit: Yusuf Dündar via Unsplash

What links a shipwreck, astronomy, and modern Artificial Intelligence (AI)? More than you might think.

In early 1900, a group of sponge divers sheltering from a storm happened upon a shipwreck between Crete and the Peloponnese. A few months later, a calcified lump of bronze was dredged from those same waters near the Greek island of Antikythera. Greek authorities, however, were understandably dazzled by the more recognisable treasures the dive had uncovered, including glassware, jewellery, and sculptures. So when the lump was taken to the National Archaeological Museum in Athens (where it remains to this day), it was not recognised as anything important.

Not everyone was so dismissive of the lump. Albert Rehm (1871-1943) was a German philologist by training (someone who studies the history of languages) who, ironically, was not highly communicative and largely kept to himself. He studied the lump (which came to be known as the Antikythera Mechanism) in detail. By then it had taken on a different form, having split into three pieces and revealed complex gearwheels. He was the first person to understand that the mechanism was a type of calculating machine, designed to predict the movements of planets and stars decades in advance. At first, this might not seem like a startling revelation. That is, until you consider that this technology, likely built around 60–200 B.C.E. and lost with the Roman Empire, was not reinvented until the 14th century. It came to be dubbed the world’s ‘first analogue computer.’

Despite being an expert in philology, Rehm did not have the advantage of knowing that the Antikythera Mechanism also contained many Greek inscriptions. In fact, for more than 120 years after it was discovered, its inscriptions were hidden and remained largely indecipherable. Only in 2005-6 did researchers at Cardiff University attempt to ‘decode’ the mechanism using surface imaging and high-resolution X-ray tomography of the surviving fragments. Researchers at UCL joined the effort in 2021 to help decrypt the mysteries of the device. However, two-thirds of the mechanism itself is still missing.

What if there were a faster way to identify ancient inscriptions? What if, instead of taking years of deciphering, a machine could provide answers within mere seconds? This is the promise of Google DeepMind’s Ithaca, released in 2022 (if only it had been released a year sooner!). Ithaca was designed to restore and attribute ancient texts using deep neural networks, a subset of machine learning (ML) methods based on artificial neural networks, which are loosely modelled on the structure and function of the human brain. Dubbed ‘the AI historian’, it was trained on thousands of existing inscriptions to fill in missing words and characters, as well as to ascribe a likely date to the text. Ithaca isn’t the only recent AI breakthrough to address the challenges of deciphering historical text. In 2023, 21-year-old computer science student Luke Farritor won a global contest to read text from an ancient Herculaneum scroll for the first time, a scroll which had been indecipherable since the eruption of Mount Vesuvius in AD 79. Farritor developed and deployed an ML algorithm to identify Greek words from CT scans of the rolled-up papyrus.

These technological advances hold promise for the future. AI could be used, for example, to aid in deciphering some of the 6,000 fragmentary inscriptions housed at the Museum of Inscription in Ephesus. These shards of history, which have baffled experts for years, could finally start to provide clues about the history of prominent kings, buildings, laws, practices, cult festivals, and rituals. ML algorithms could also be used to help restore the names, hometowns, or achievements of ancient athletes inscribed on stone, papyrus, and statues, offering a richer history of the Olympic Games. In short, AI could potentially enable the restoration of damaged texts, as well as support conservation efforts, academic research, museums, archaeological sites, and digital repositories (e.g., the Packard Humanities Institute’s searchable inscriptions).

So, should historians rush to make use of these promising new AI tools? Or better yet, can AI be a useful aid for laypersons with an interest in history but no formal training? The answer is yes, but with some significant caveats. Foremost among these is accuracy: how often an AI system gives the correct answer when measured against correctly labelled test data. Ithaca illustrates the point, achieving only 62% accuracy in restoring damaged texts when used alone (and 71% accuracy in attributing inscriptions to their original ‘findspot’). The good news is that accuracy improves when Ithaca is used alongside human historians, something the creators of the AI argue confirms the synergistic nature of this research. However, it also clearly demonstrates that AI tools are far from 100% accurate.
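
To make the idea of accuracy concrete, here is a minimal Python sketch of how such a score could be computed. It is an illustration only, not Ithaca's actual evaluation code: the function name, the five Greek words, and the model's guesses are all invented for this example.

```python
def accuracy(predictions, references):
    """Return the fraction of predictions that exactly match the correct labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one labelled example")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Hypothetical test data: the words that were really on the stone...
references = ["δημος", "βουλη", "θεος", "πολις", "αρχων"]
# ...and what an imaginary model guessed for each gap.
predictions = ["δημος", "βουλη", "θεοι", "πολις", "ναος"]

score = accuracy(predictions, references)
print(f"{score:.0%}")  # 3 of 5 guesses match, so this prints 60%
```

Note that this strict version gives no credit for a near miss such as ‘θεοι’ for ‘θεος’, which is one reason a single accuracy figure can understate how useful a suggestion is to a human expert.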

AI technologies clearly have their shortcomings, but they still present remarkable benefits. For example, using Ithaca to restore damaged inscriptions on artefacts at museums and archaeological sites would allow for a richer experience for visitors. For pieces with missing fragments, the predicted text could be displayed alongside the artefact with a note about its predictive nature. The predicted text could also be sense-checked by an expert historian or archaeologist. Adopting this approach would address other limitations of AI tools that are not unique to historical applications, such as over-reliance and generalisation. That is, if people rely solely on Ithaca’s predictions, they may miss out on the depth and cultural context of the original inscriptions. Or, if the AI model generalises too much from the data it has been trained on, it might miss nuances such as rare names or unique achievements.

Some challenges associated with this emerging technology are less easy to address. One such issue is interpretability. If experts don’t understand how Ithaca is making its predictions (because, like many deep-learning tools, it is largely a black box), they might be hesitant to trust or use its results without further verification. Building new AI models also requires a huge amount of data (to ‘train the model’), so at present the benefits of AI can mostly be enjoyed only by trained historians or laypersons interested specifically in ancient Greece. Indeed, if you are interested in a particularly niche ancient language for which there is not enough data, then AI may, unfortunately, not be a tool you can turn to as a reliable aid.

To conclude, human attempts to decipher ancient Greek text from fragmented, decayed, and often scattered or hidden objects have made significant strides. They just have not been especially fast. Nor have humans gotten around to transcribing the thousands of untouched (albeit sometimes archived) artefacts scattered across the globe. AI promises to help speed things up, and cheaply: using AI tools can be free or affordable for consumers (although it is important to note that training them is not). Who knows, if AI had existed 120 years ago, perhaps we would already have a better picture of the inner workings of the Antikythera Mechanism. Before unleashing AI on all unsolved ancient Greek texts, however, it is important to remember that it is no magic bullet. Care should be taken to involve (human) experts, to sense-check and fact-check findings, and to consider the various limitations that come with AI. If ‘predicting the past’ sounds elusive, that is because (at least for now) it is.
