Wednesday, April 7, 2021

Dutch National Numismatic Collection added to CRRO

The Dutch Nationale Numismatische Collectie, published by De Nederlandsche Bank, is the latest collection to join the Nomisma.org numismatic Linked Open Data cloud. Nearly 2,200 Roman Republican coins from the collection have been linked to Coinage of the Roman Republic Online (CRRO) URIs through OpenRefine, and exported directly into RDF with a template. There are now more than 52,000 Republican coins in CRRO, making it, by far, the most comprehensive research tool for this corpus of material.

A small number of coins (about a dozen) have known findspots that were reconciled to Wikidata.org URIs. Using Wikidata's SPARQL endpoint, I extracted coordinates for these places, as well as the entire geographic hierarchy up to the country level, making it possible to begin querying coin finds of the Netherlands in a systematic way. Hopefully the Portable Antiquities of the Netherlands (PAN) will eventually integrate with CRRO and Online Coins of the Roman Empire (OCRE) URIs, painting a more complete picture of the circulation of Roman coins into the Netherlands.



Distribution of DNB coins for RRC 285/2, with two finds.

In the near future, a substantial portion of the NNC-DNB's Hellenistic collection will be linked to Hellenistic Royal Coinages URIs, and then the Roman imperial collection will be integrated into OCRE following its complete digitization and cataloging.

Friday, April 2, 2021

Getty Roman Republican coinage in CRRO

The J. Paul Getty Museum is the latest to join the Nomisma.org Linked Open Data cloud. With special access to their development Linked Art JSON-LD API combined with their experimental SPARQL endpoint, I have been able to extract 66 coins with RRC references, with a query built around the Linked Art CIDOC-CRM profile.

I took the resulting CSV data from the endpoint and loaded it into OpenRefine for some further cleanup, to link to the Coinage of the Roman Republic URIs for coin types, and to pull the Linked Art JSON-LD into OpenRefine in order to extract the IIIF Manifest and image service API URIs. For the first time, I experimented with OpenRefine's built-in template export scheme, and put together a generally reusable template to export Nomisma-compliant RDF directly from the app (rather than authoring a one-off PHP script to transform cleaned CSV data into RDF). This saved considerable time. I posted this template as a Gist, so I can now generate Nomisma RDF from any OpenRefine project. Hopefully this will open the door to other contributors cleaning their own data and providing us the RDF directly without further intervention.
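
As a rough illustration of what a template-style export produces, the following Python sketch renders cleaned rows as an RDF/XML document. It is not the OpenRefine template mentioned above: the row keys (`uri`, `title`, `type_uri`) and the example object URI are hypothetical, and the two properties shown are only a small subset of what a Nomisma-compliant record might carry.

```python
from xml.sax.saxutils import escape, quoteattr

# Namespace declarations for the wrapper element.
RDF_HEADER = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n'
    '         xmlns:dcterms="http://purl.org/dc/terms/"\n'
    '         xmlns:nmo="http://nomisma.org/ontology#">'
)


def row_to_rdf(row):
    """Render one cleaned row as an nmo:NumismaticObject fragment.

    The keys 'uri', 'title', and 'type_uri' are hypothetical column names;
    a real OpenRefine project would use its own.
    """
    return (
        f"<nmo:NumismaticObject rdf:about={quoteattr(row['uri'])}>\n"
        f"  <dcterms:title>{escape(row['title'])}</dcterms:title>\n"
        f"  <nmo:hasTypeSeriesItem rdf:resource={quoteattr(row['type_uri'])}/>\n"
        f"</nmo:NumismaticObject>"
    )


def rows_to_rdf(rows):
    """Wrap all row fragments in a single rdf:RDF document."""
    body = "\n".join(row_to_rdf(row) for row in rows)
    return f"{RDF_HEADER}\n{body}\n</rdf:RDF>"


if __name__ == "__main__":
    example = {
        "uri": "http://example.org/coin/1",  # hypothetical object URI
        "title": "Denarius, RRC 422/1a",
        "type_uri": "http://numismatics.org/crro/id/rrc-422.1a",
    }
    print(rows_to_rdf([example]))
```

The point of the template approach is the same as in this sketch: once the columns are clean, the RDF is a mechanical projection of each row, so no per-collection script is needed.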

A sample of the representation of RRC 422/1a
 

There are some 700 Roman Imperial coins with RIC references that I will eventually link to Online Coins of the Roman Empire. This task is a bit more complex, but it can be knocked out in an afternoon. The Hellenistic coins in the Getty aren't cataloged with type references, and so there's no way to integrate these until a curator identifies and links them.

Wednesday, March 24, 2021

What's in Iron Age Coins in Britain and what's next?

By now, you have probably heard of the University of Oxford Institute of Archaeology's official launch of Iron Age Coins in Britain (IACB), a typology based on Ancient British Coins and published in Numishare, much like the American Numismatic Society's digital coin type projects.

 

ABC 2433, a well-represented stater.

The digital corpus comprises 999 types which are linked to over 35,000 specimens, most of which have been harvested from the Portable Antiquities Scheme. There is some overlap here, and much work remains to eliminate duplicates. Here is a synopsis of what's currently accessible through IACB:

 

Exemplars 

Nine hundred sixty-four "Exemplar" specimens in a temporarily stand-alone database. These were photographs selected for Ancient British Coins as the best extant representation for the type. These coins may come from public or private collections and exist to provide 100% photographic coverage of the types in IACB. These will eventually be filtered out as we begin to expand the coverage from other collections.

Portable Antiquities Scheme

The largest contribution consists of 29,627 coins from the Portable Antiquities Scheme that include an ABC number. Note that this does not include all Iron Age coinage from the PAS database, as a large portion are not cataloged with ABC numbers. About half of the PAS coins link to Ordnance Survey URIs, mainly at the parish level, enabling the mapping of latitudes and longitudes for findspots. Higher level geographic entities (districts and counties) incorporate GeoJSON polygons for boundaries that I parsed from Dan Pett's PAS GitHub repository.

Data for over 500 "Iron Age" hoards have been exported from the PAS and mapped into Linked Open Data, although not all of these are hoards of coins. The vast majority of these link to district-or-above geographic entities and are only represented as polygonal areas, rather than points. However, there are almost no direct links between individual coins in the PAS database and hoard records. Only about 2,000 coins include a hoard name in the "knownas" field, and so subsequent reconciliation has linked these coins to hoard URIs for a separate sort of visualization from individual finds (represented as orange points).

The PAS database also includes tens of thousands of records from the Celtic Coins Index, but only the objects catalogued through 2004.

 

The British Museum

Over 6,400 coins from the British Museum have been extracted from their Collections Online, and ABC numbers from the reference fields were linked to IACB. Not all of these coins have been photographed, but many of those that have been are represented by high-quality color photos. The BM records include hoard names as well. These were linked, as best as possible, to the PAS hoard URIs. About 3,500 of the coins from the BM were linked to more than 50 hoards.

The caveat is that there is no link from the PAS record to the BM record, or vice versa. This means that a coin found and reported to the PAS, or any of the thousands of CCI coins in the Scheme, may be duplicated in the British Museum database. This is a task that will require some sorting out, especially after the entire Celtic Coins Index is published online by the end of 2021, and we hope the general public can aid in spotting and reporting duplicates in IACB.

 

Berlin

Fourteen coins so far from the Berlin Münzkabinett have been linked to IACB, the first collection to be integrated since the official launch yesterday. In the near future, we also expect the American Numismatic Society, Bibliothèque nationale de France, and the Swiss National Museum to make their collections available. In time, others will begin to use IACB as their cataloging tool for Iron Age coinage.


Duplication Illustrated

Because a significantly larger proportion of the British Museum coins link to hoards than the PAS coins do, and because hoards tend to link to districts and individual finds to parishes, there are some obvious signs in the maps in IACB (and the maps on the pages of related concepts in Nomisma.org) that a distribution of finds actually represents a hoard. This is illustrated most simply in ABC 120.


The sizes of the circles for finds vary based on the density of coins found within a particular parish. The red polygon represents the district of Folkestone and Hythe, for the Folkestone II Hoard, the source of numerous British Museum coins. Additionally, 75 objects are linked to the parish of Folkestone, predominantly CCI coins in the PAS database that are almost certainly from the Folkestone Hoard(s). A further 74 coins are from Kingston, probably from the Kingston Upon Thames Hoard. This is a hoard that has been harvested from the PAS database and ingested as Linked Data into Nomisma.org's SPARQL endpoint, but no coins have actually been linked to it yet. Over time, we hope to be able to link more PAS and CCI coins to Iron Age hoard records, which will create a more accurate picture of the distribution of these coins.

Eventually, the priority for de-duplication is as follows:

British Museum (and other museum collections) > CCI > PAS.

That is to say, the museum (or permanent caretaker) is primarily responsible for the permanent and stable URI for an object. The eventual online CCI database will include all of the objects recorded in the index, which will include high resolution scans of one or more cards containing metadata and photographs (there is one card per provenance event, so the same coin that passes through multiple auctions over its lifetime will have multiple cards). When CCI goes online, we will remove the CCI coins from the PAS export. However, we want to ensure that the findspot and/or hoard metadata from the PAS are incorporated into the new CCI digital records. Similarly, we want to establish a concordance between British Museum and CCI records and CCI/PAS records with any other museum collection that comes online. The British Museum doesn't include geographic coordinates for individual finds. We need to make sure that we are merging data from disparate information systems into a cohesive Linked Data record that includes more and better information than any of the individual databases currently contributing to IACB. This de-duplication process will likely take years. But the end result is a scholarly tool that completely recalibrates the research paradigm for British Iron Age coinage.
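
The priority order above can be sketched as a simple merge rule. The Python below is only an illustration under stated assumptions, not the project's actual process: the record shape (`source`, `uri`, `findspot`) and the example URIs are hypothetical, and the duplicates are assumed to have been identified already.

```python
# Lower number = higher priority, following "museum > CCI > PAS".
PRIORITY = {"museum": 0, "cci": 1, "pas": 2}


def merge_duplicates(records):
    """Merge records that describe the same coin into one record.

    The highest-priority source supplies the canonical URI; any field it
    lacks (for example a findspot) is filled in from lower-priority sources.
    Each record is a dict with a hypothetical 'source' key naming its origin.
    """
    ordered = sorted(records, key=lambda record: PRIORITY[record["source"]])
    merged = {}
    for record in ordered:
        for field, value in record.items():
            # Keep the first non-empty value seen, i.e. the one from the
            # highest-priority source that actually has it.
            if value is not None and field not in merged:
                merged[field] = value
    return merged


if __name__ == "__main__":
    bm = {"source": "museum", "uri": "https://example.org/bm/1", "findspot": None}
    pas = {"source": "pas", "uri": "https://example.org/pas/9", "findspot": "Folkestone"}
    print(merge_duplicates([pas, bm]))
```

In this sketch the museum record keeps its URI while inheriting the findspot from the PAS record, which is the kind of merged result the concordance work is meant to produce.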

Thursday, February 18, 2021

Urdu translations incorporated into HRC, OCRE, and CRRO

Thanks to translations provided by Dr. Asma Ibrahim, curator at the State Bank of Pakistan Museum, Urdu user interface translations have been incorporated into the Numishare framework. Urdu has been activated in Online Coins of the Roman Empire, Coinage of the Roman Republic Online, and all of the NEH-funded Hellenistic Royal Coinages sub-projects (Hellenistic typologies in the Inventory of Greek Coin Hoards database). These Numishare collections have been reindexed into Apache Solr, so that Nomisma.org concepts with Urdu labels are integrated into the user interface. There are not many Urdu labels for Greek and Roman numismatics so far; these have primarily been harvested from Wikidata.org and therefore reflect the coverage of articles from the Urdu language version of Wikipedia. That is to say, many notable people, such as Alexander the Great and Augustus, and mints, such as Athens and Rome, have relevant articles in Wikipedia, but denominations and less notable people or corporate bodies do not.

 

Seleucid Coins (part 1), no. 278, a hemidrachm from Bactria.

This is the first of numerous deliverables for the Oxford-ANS OXUS-INDUS project to publish Bactrian and Indo-Greek typologies through the Hellenistic Royal Coinages umbrella. One of the chief aims of the project is to enhance discoverability and accessibility of Central and South Asian cultural materials to the residents of those regions. We hope to provide translations in other relevant languages in advancement of this goal, and this includes filling in gaps in translations for Nomisma.org URIs. Our analytics suggest that translations of Numishare interfaces into Arabic, Turkish, Bulgarian, and other non-Western European languages have directly contributed to increased usage of our open digital resources in Turkey, North Africa, the Middle East, and Eastern Europe. Urdu is the first modern language introduced into the interface from an area that covers the easternmost extent of Hellenistic cultural contact.

Tuesday, February 16, 2021

26,000 Roman Imperial coins from the Portable Antiquities Scheme added to OCRE

The coverage of Roman coin finds in Britain has been expanded dramatically from just over 1,000 in the first batch from the Portable Antiquities Scheme ever ingested (about five years ago) to more than 26,000. About 3,000 coins in the PAS database link directly to Online Coins of the Roman Empire (OCRE) URIs through its internal lookup interface, but I established another 23,000 links in a process that took several days' worth of cleaning and reconciliation in OpenRefine.

 

RIC VII Treveri 475

I began with a query to the PAS's JSON-based search API to look for any Roman coin with "RIC" in the metadata. I loaded these data, as well as Nomisma IDs for ruler/person depicted, mint, and denomination (when available in the PAS data). With a lot of careful parsing, regex, and a number of other data munging techniques, I was able to isolate RIC numbers, and in combination with the RIC volume number and/or emperor name or Nomisma mint preferred label (particularly for RIC volumes VI to IX, which are organized by mint, rather than emperor), I did numerous passes through the OpenRefine reconciliation API that is inherent to OCRE (and Numishare projects, more broadly). Eventually, I ended up with over 26,000 matches. There may be some false positives here or there, but I'm pretty confident in the accuracy of the matching, and I did a substantial amount of manual checking when the API yielded more than one possible match.
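
To give a sense of the parsing involved, here is a minimal Python sketch that pulls a volume and number out of a free-text reference. It is not the set of expressions used in OpenRefine for this work: the pattern, the `parse_ric` helper, and the assumption that references look like "RIC VII Treveri 475" are illustrative only, and real PAS metadata is far messier (edition notes such as "2nd ed." would trip this pattern, for instance).

```python
import re

# RIC volumes run from I to X, so a small lookup table is enough.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5,
         "VI": 6, "VII": 7, "VIII": 8, "IX": 9, "X": 10}

# "RIC", an optional Roman-numeral volume, any non-digit text (an emperor
# or mint name), then the number with an optional lowercase letter suffix.
RIC_PATTERN = re.compile(
    r"\bRIC\s+(?:(?P<volume>[IVX]+)\b)?\D*?(?P<number>\d+[a-z]?)"
)


def parse_ric(text):
    """Return (volume, number) for the first RIC reference in text.

    volume is an int, or None when no recognisable volume numeral is
    present; number is kept as a string so suffixes like '123a' survive.
    Returns None when the text contains no RIC reference at all.
    """
    match = RIC_PATTERN.search(text)
    if match is None:
        return None
    volume = ROMAN.get(match.group("volume"))
    return volume, match.group("number")


if __name__ == "__main__":
    for sample in ("RIC VII Treveri 475", "RIC II 123a", "RIC Vespasian 12"):
        print(sample, "->", parse_ric(sample))
```

A parsed volume and number are still not a match on their own; as described above, they have to be combined with the emperor or mint before being passed to the reconciliation service.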

I should note that, with only a few exceptions, Hadrianic coins were ignored, as we need to develop a different process to link to URIs for Richard Abdy's new RIC volume (II [second edition], Part 3) by means of the concordance between the original RIC numbers and the new ones that were published to OCRE in June, 2020.

Many (about half) of the coins link to a parish-level findspot, and so coordinates will appear on maps in OCRE and in the dynamically generated maps in relevant Nomisma concepts.

Finds distribution of Vespasian.

The other half of the coins are published to the PAS from the Iron Age and Roman Coins from Wales (IARCW) dataset. The findspots link to the district level, but do not display on the map in the current user interface. However, many districts can be rendered as GeoJSON polygons, which were extracted from a GitHub repository set up by Dan Pett when he worked on the PAS database. Many of these coins are from hoards, and eventually we will be able to link them to hoard URIs that will be rendered differently on the map, to distinguish them from individual finds. I will provide more details about this functionality when Oxford Institute of Archaeology formally launches the Iron Age Coins in Britain project in the next few weeks.

The Ordnance Survey URIs from the PAS database have been resolved to their matching entity in Wikidata, and the Wikidata SPARQL endpoint was used to extract coordinates for parishes, as well as the full administrative hierarchy from parish to country (UK). This makes it possible to query all finds within a district or county, according to modern divisions. I'll provide a more detailed look at this data structure at a later date. Eventually all gazetteers used for findspots (whether the Getty Thesaurus of Geographic Names or Geonames.org) will resolve to Wikidata as a centralized authority service, which will make it possible to aggregate finds databases across countries, and query them through a shared gazetteer vocabulary.
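
The Wikidata lookup described above might look something like the following Python sketch. It is an assumption about the general shape of such a query, not the one actually used: it relies on the Wikidata properties P625 (coordinate location) and P131 (located in the administrative territorial entity), and the helper names are hypothetical. Nothing here contacts the endpoint; the sketch only builds the query text and parses the coordinate literal that Wikidata returns.

```python
import re


def build_hierarchy_query(qids):
    """Build a Wikidata SPARQL query for coordinates and parent entities.

    P131+ follows the "located in" chain transitively, so each place is
    returned with every administrative ancestor that Wikidata records.
    """
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?place ?coord ?parent WHERE {\n"
        f"  VALUES ?place {{ {values} }}\n"
        "  ?place wdt:P625 ?coord .\n"
        "  OPTIONAL { ?place wdt:P131+ ?parent }\n"
        "}"
    )


# Wikidata serialises coordinates as WKT, longitude first: "Point(lon lat)".
WKT_POINT = re.compile(r"Point\((-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)\)")


def parse_wkt_point(literal):
    """Convert a WKT point literal into a (latitude, longitude) tuple."""
    match = WKT_POINT.fullmatch(literal)
    if match is None:
        raise ValueError(f"not a WKT point: {literal!r}")
    lon, lat = float(match.group(1)), float(match.group(2))
    return lat, lon


if __name__ == "__main__":
    print(build_hierarchy_query(["Q55"]))  # Q55 is the Netherlands
    print(parse_wkt_point("Point(4.9 52.37)"))
```

The longitude-first ordering of the WKT literal is an easy detail to get wrong when feeding these values to a map.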

Nearly 50,000 total records were extracted from the RIC query from the Portable Antiquities Scheme. About 4,000 are uncertain (and can't be matched to one single RIC number), but that leaves a further 20,000 or so that might be linked with further rounds of cleanup or crowdsourcing. The other major remaining task is to link coins to hoard records, whether these records are published in the PAS database or Coin Hoards of the Roman Empire, and this will enable the query and display of a large swath of coins ingested into the system that otherwise have no public lat-long coordinates.

Thursday, January 14, 2021

DOIs minted for contributions to Nomisma.org

At long last, I have put into production a feature in the Nomisma.org back-end to post dataset metadata to Crossref in order to mint a DOI for each Nomisma editor. The groundwork for this was laid almost exactly two years ago, in the blog post, Formalizing editors and a step toward minting DOIs for Nomisma. However, we have opted to publish datasets directly through Crossref's APIs, instead of DataCite, on which the Kerameikos.org project relies for the same functionality. The ANS is already a subscriber to Crossref for minting DOIs for publications in its Digital Library (and eventually its journals), and the subscription charges are more reasonable for such a small institution as the American Numismatic Society.

As I stated in the above blog post:

Nomisma.org data constitute an enormous body of collective intellectual effort, and it's increasingly important that scholarly digital works receive equal weight as traditional publications. Therefore, the creation of DOIs for Nomisma contributions would appear in the scholarly profile in ORCID. Just recently, the AIA issued updated guidelines for the considerations of tenure and promotion, with specific guidelines for the recognition of digital projects, and so our goal of formalizing this recognition within Nomisma keeps us on the cutting edge with respect to modern modes of scholarly communication.

By integrating Nomisma.org contributions into the broader Linked Open Data cloud for scholarly communication, the intellectual labor behind creating, defining, and organizing Nomisma concepts is formalized through an ORCID profile, where datasets are joined with more traditional forms of publication, such as journal articles and monographs. It has long been a challenge within Digital Humanities projects to recognize participation in a context that non-digital peers can appreciate with regard to employment or promotion within the academic sphere. There are few examples of Digital Humanities projects being published as a DOI with ORCID integration; we hope that our work in Nomisma will set an example in other DH disciplines. Beyond Nomisma, we aim to mint DOIs for our other typological projects, like OCRE and Hellenistic Royal Coinages. This is particularly important, as both of these projects employed undergraduate students who have gone on to enroll in PhD programs or are early career scholars.


Under the Hood

This new process is relatively straightforward and almost entirely automated by interacting directly with Crossref's APIs.

The Crossref XML file is generated from a Nomisma SPARQL query that gets a list of editors and the earliest and latest dates of their edits, as well as an optional ORCID URI:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX prov: <http://www.w3.org/ns/prov#>

SELECT ?editor ?name ?orcid (min(?date) as ?creation) (max(?date) as ?update) WHERE {
  ?editor a foaf:Person ;
            skos:inScheme <http://nomisma.org/editor/> ;
            skos:prefLabel ?name .  
  OPTIONAL {?editor skos:exactMatch ?orcid FILTER contains(str(?orcid), "orcid.org")}
  OPTIONAL {
      ?activity a prov:Activity ;
                  prov:atTime ?date.
      {?activity prov:wasAssociatedWith ?editor
          FILTER NOT EXISTS {?activity prov:used ?spreadsheet}}
      UNION {
        ?activity prov:used ?spreadsheet.
        {?spreadsheet dcterms:creator ?editor }
        UNION {?spreadsheet dcterms:contributor ?editor}
      } }
} GROUP BY ?editor ?name ?orcid ORDER BY ?name

 

The SPARQL response is then transformed through an XSLT stylesheet into the Crossref XML file. In the Nomisma XForms back-end, this file is sent to Crossref's API, and if the submission succeeds, each editor's RDF is updated to insert the DOI as a dcterms:identifier.
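
For readers who would rather inspect the query results than transform them, the following Python sketch flattens a response in the standard SPARQL 1.1 JSON results format into one dictionary per editor. This is a swapped-in technique for illustration only: the production pipeline described above uses XSLT, and the `flatten_bindings` helper and the sample values below are hypothetical.

```python
def flatten_bindings(sparql_json):
    """Flatten SPARQL JSON results into a list of plain dictionaries.

    Variables left unbound by an OPTIONAL clause (for example ?orcid) are
    simply absent from a binding, so they are filled in here as None.
    """
    variables = sparql_json["head"]["vars"]
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        row = {}
        for var in variables:
            cell = binding.get(var)
            row[var] = cell["value"] if cell is not None else None
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical response with a single editor and no ORCID.
    sample = {
        "head": {"vars": ["editor", "name", "orcid", "creation", "update"]},
        "results": {"bindings": [{
            "editor": {"type": "uri", "value": "http://nomisma.org/editor/example"},
            "name": {"type": "literal", "value": "Example Editor"},
            "creation": {"type": "literal", "value": "2015-01-01T00:00:00Z"},
            "update": {"type": "literal", "value": "2021-01-14T00:00:00Z"},
        }]},
    }
    print(flatten_bindings(sample))
```

Each resulting row carries what the Crossref record needs for one editor: a name, an optional ORCID, and the earliest and latest edit dates.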