Thursday, February 18, 2021

Urdu translations incorporated into HRC, OCRE, and CRRO

Thanks to translations provided by Dr. Asma Ibrahim, curator at the State Bank of Pakistan Museum, Urdu user interface translations have been incorporated into the Numishare framework. Urdu has been activated in Online Coins of the Roman Empire, Coinage of the Roman Republic Online, and all of the NEH-funded Hellenistic Royal Coinages sub-projects (Hellenistic typologies in the Inventory of Greek Coin Hoards database). These Numishare collections have been reindexed into Apache Solr, so that concepts with Urdu labels are integrated into the user interface. There are not many Urdu labels for Greek and Roman numismatics so far. These have primarily been harvested from the Urdu-language version of Wikipedia, and therefore reflect the coverage of its articles. That is to say, many notable people, such as Alexander the Great and Augustus, and mints, such as Athens and Rome, have relevant articles in Wikipedia, but denominations and less notable people or corporate bodies do not.


Seleucid Coins (part 1), no. 278, a hemidrachm from Bactria.

This is the first of numerous deliverables for the Oxford-ANS OXUS-INDUS project to publish Bactrian and Indo-Greek typologies through the Hellenistic Royal Coinages umbrella. One of the chief aims of the project is to enhance the discoverability and accessibility of Central and South Asian cultural materials for the residents of those regions. We hope to provide translations in other relevant languages in advancement of this goal, and this includes filling in gaps in translations for URIs. Our analytics suggest that translations of Numishare interfaces into Arabic, Turkish, Bulgarian, and other non-Western European languages have directly contributed to increased usage of our open digital resources in Turkey, North Africa, the Middle East, and Eastern Europe. Urdu is the first modern language introduced into the interface from an area that covers the easternmost extent of Hellenistic cultural contact.

Tuesday, February 16, 2021

26,000 Roman Imperial coins from the Portable Antiquities Scheme added to OCRE

The coverage of Roman coin finds in Britain has been expanded dramatically, from just over 1,000 in the first batch ever ingested from the Portable Antiquities Scheme (PAS), about five years ago, to more than 26,000. About 3,000 coins in the PAS database link directly to Online Coins of the Roman Empire (OCRE) URIs through its internal lookup interface, but I established another 23,000 links in a process that took several days' worth of cleaning and reconciliation in OpenRefine.


RIC VII Treveri 475

I began with a query to the PAS's JSON-based search API to look for any Roman coin with "RIC" in the metadata. I loaded these data, as well as Nomisma IDs for ruler/person depicted, mint, and denomination (when available in the PAS data). With a lot of careful parsing, regex, and a number of other data munging techniques, I was able to isolate RIC numbers. Combining these with the RIC volume number and/or emperor name or Nomisma mint preferred label (particularly for RIC volumes VI to IX, which are organized by mint rather than emperor), I made numerous passes through the OpenRefine reconciliation API that is built into OCRE (and Numishare projects more broadly). Eventually, I ended up with over 26,000 matches. There may be some false positives here or there, but I'm pretty confident in the accuracy of the matching, and I did a substantial amount of manual checking when the API yielded more than one possible match.
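To give a rough idea of what the parsing step involves, here is a minimal Python sketch. It is not the actual OpenRefine workflow: the regular expression, the function names, and the format of the query string are all assumptions made for illustration, and real PAS metadata is considerably messier than this pattern allows for. The payload follows the general shape of an OpenRefine reconciliation query batch, with a `query` string and a `limit` per key.

```python
import json
import re

# Hypothetical pattern (an assumption, not the expressions actually used):
# "RIC", an optional Roman-numeral volume, an optional authority name
# (mint or emperor), then the type number with an optional letter suffix.
RIC_PATTERN = re.compile(
    r"\bRIC\s+"
    r"(?:(?P<volume>[IVX]+)\s+)?"
    r"(?:(?P<authority>[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)\s+)?"
    r"(?:no\.?\s*)?(?P<number>\d+[a-z]?)\b"
)


def parse_ric(text):
    """Return the RIC parts found in a free-text reference, or None."""
    match = RIC_PATTERN.search(text)
    if match is None:
        return None
    # Keep only the groups that actually matched.
    return {k: v for k, v in match.groupdict().items() if v is not None}


def reconciliation_query(parsed, key="q0"):
    """Build one entry of a reconciliation query batch from parsed parts.

    The query string format is an assumption; a service may expect
    something different.
    """
    parts = [parsed[k] for k in ("volume", "authority", "number") if k in parsed]
    return {key: {"query": "RIC " + " ".join(parts), "limit": 5}}


if __name__ == "__main__":
    parsed = parse_ric("Nummus of Constantine I, RIC VII Treveri 475")
    # The JSON below would be sent as the "queries" parameter of a request.
    print(json.dumps(reconciliation_query(parsed)))
```

Setting a `limit` above one matters here, because the cases that need manual checking are exactly those where the service returns several candidates.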

I should note that, with only a few exceptions, Hadrianic coins were ignored. We need to develop a different process to link these to URIs for Richard Abdy's new RIC volume (II [second edition], Part 3), by means of the concordance between the original RIC numbers and the new ones that were published to OCRE in June 2020.

About half of the coins link to a parish-level findspot, and so their coordinates will appear on maps in OCRE and in the dynamically generated maps for relevant Nomisma concepts.

Finds distribution of Vespasian.

The other half of the coins are published to the PAS from the Iron Age and Roman Coins from Wales (IARCW) dataset. Their findspots link to the district level, but do not display on the map in the current user interface. However, many districts can be rendered as GeoJSON polygons, which have been extracted from a GitHub repository set up by Dan Pett when he worked on the PAS database. Many of these coins are from hoards, and eventually we will be able to link to hoard URIs that will be rendered differently on the map, to distinguish them from individual finds. I will provide more details about this functionality when the Oxford Institute of Archaeology formally launches the Iron Age Coins in Britain project in the next few weeks.
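For readers unfamiliar with GeoJSON, a district polygon with a count of finds attached might look something like the sketch below. The function name, the property names, and the coordinates are invented for illustration and do not reflect the structure of the PAS repository or of Numishare's map layers.

```python
import json


def district_feature(name, ring, find_count):
    """Wrap a district boundary in a GeoJSON Feature.

    `ring` is a sequence of (longitude, latitude) pairs; GeoJSON puts
    longitude first. Property names here are hypothetical.
    """
    coords = [list(point) for point in ring]
    # GeoJSON linear rings must be closed: first and last positions equal.
    if coords[0] != coords[-1]:
        coords.append(list(coords[0]))
    if len(coords) < 4:
        raise ValueError("a polygon ring needs at least four positions")
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        "properties": {"name": name, "finds": find_count},
    }


if __name__ == "__main__":
    # Made-up triangle standing in for a real district boundary.
    feature = district_feature(
        "Example District", [(-3.2, 51.5), (-3.1, 51.5), (-3.1, 51.6)], 12
    )
    print(json.dumps(feature, indent=2))
```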

The Ordnance Survey URIs from the PAS database have been resolved to their matching entities in Wikidata, and the Wikidata SPARQL endpoint was used to extract coordinates for parishes, as well as the full administrative hierarchy from parish to country (UK). This makes it possible to query all finds within a district or county, according to modern divisions. I'll provide a more detailed look at this data structure in a later post. Eventually, all gazetteers used for findspots (whether the Getty Thesaurus of Geographic Names or another gazetteer) will resolve to Wikidata as a centralized authority service, which will make it possible to aggregate finds databases across countries and query them through a shared gazetteer vocabulary.
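As a hedged illustration of the kind of query involved, the sketch below builds a SPARQL query for one Wikidata item and parses the coordinate literal that comes back. It assumes the Ordnance Survey URI has already been resolved to a Wikidata QID, and it relies on two standard Wikidata properties: P625 (coordinate location) and P131 (located in the administrative territorial entity). The function names are hypothetical, and this is not necessarily the query used for OCRE.

```python
import re


def hierarchy_query(qid):
    """Build SPARQL returning an item's coordinates and its P131 ancestors."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata QID: {qid!r}")
    # P131+ follows the "located in" chain upward: parish, district, county...
    return f"""
SELECT ?coord ?parent ?parentLabel WHERE {{
  wd:{qid} wdt:P131+ ?parent .
  OPTIONAL {{ wd:{qid} wdt:P625 ?coord . }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


# Wikidata serializes coordinates as WKT, longitude first: "Point(lon lat)".
WKT_POINT = re.compile(
    r"Point\((?P<lon>-?\d+(?:\.\d+)?) (?P<lat>-?\d+(?:\.\d+)?)\)"
)


def parse_wkt_point(literal):
    """Convert a WKT point literal into a (latitude, longitude) tuple."""
    match = WKT_POINT.fullmatch(literal.strip())
    if match is None:
        raise ValueError(f"unexpected coordinate literal: {literal!r}")
    return float(match["lat"]), float(match["lon"])


if __name__ == "__main__":
    print(hierarchy_query("Q84"))  # Q84 is only a placeholder item
    print(parse_wkt_point("Point(-1.2577 51.752)"))
```

The swap from longitude-first to latitude-first in `parse_wkt_point` is deliberate, since mixing up the two orders is an easy way to put British findspots in the Indian Ocean.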

Nearly 50,000 total records were extracted from the RIC query of the Portable Antiquities Scheme. About 4,000 are uncertain (and can't be matched to a single RIC number), but that leaves a further 20,000 or so that might be linked with further rounds of cleanup or crowdsourcing. The other major remaining task is to link coins to hoard records, whether these records are published in the PAS database or in Coin Hoards of the Roman Empire. This will enable the query and display of a large swath of coins ingested into the system that otherwise have no public lat-long coordinates.