Thursday, July 14, 2022

Geographic export in Nomisma.org finally migrated from Pelagios RDF to Linked Places GeoJSON-LD

At long last, the Nomisma.org geographic export has been migrated from the nearly ten-year-old Pelagios RDF/XML model to Linked Places GeoJSON-LD. The query of places from Nomisma includes all mints and regions that have a spatial extent. These places are predominantly mints, as only a few regions have explicit polygon boundaries. As such, the Linked Places export does not (yet) include the full geographic hierarchy between mints and regions, although it may be possible to generate a bounding box for a region from the extent of its child mint locations. The new download URL is http://nomisma.org/linked-places.json.
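The bounding-box idea mentioned above could be sketched roughly as follows. This is only an illustration, not part of the Nomisma export: the function name and the sample features are hypothetical, the coordinates are made up, and it assumes each child mint is a GeoJSON Feature with a Point geometry in [longitude, latitude] order.

```python
from typing import Iterable, Optional, Tuple

# (min_lon, min_lat, max_lon, max_lat), the usual GeoJSON bbox order
BBox = Tuple[float, float, float, float]


def bounding_box(features: Iterable[dict]) -> Optional[BBox]:
    """Return the bounding box of all Point features, or None if there are none."""
    lons, lats = [], []
    for feature in features:
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # skip polygons and features without coordinates
        lon, lat = geometry["coordinates"][:2]
        lons.append(lon)
        lats.append(lat)
    if not lons:
        return None
    return (min(lons), min(lats), max(lons), max(lats))


# Hypothetical child mints of one region (made-up coordinates).
child_mints = [
    {"type": "Feature", "geometry": {"type": "Point", "coordinates": [12.5, 41.9]}},
    {"type": "Feature", "geometry": {"type": "Point", "coordinates": [14.25, 40.75]}},
    {"type": "Feature", "geometry": {"type": "Point", "coordinates": [11.0, 43.5]}},
    {"type": "Feature", "geometry": None},  # a place without coordinates
]

region_bbox = bounding_box(child_mints)
print(region_bbox)
```

A simple min/max box like this would not handle regions that straddle the antimeridian, and a box drawn around mints may be considerably smaller than the historical region itself.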

Another requirement of a Linked Places place is a "when" property that includes a date range and optional period URIs. These are not explicit within Nomisma, although they can be derived from the link between the mint/region and a "field of numismatics" concept URI. There are about two dozen fields of numismatics in Nomisma (e.g., Roman, Byzantine, Islamic, and Greek), and I have updated these concepts to insert URIs for corresponding periods in the DAI's ChronOntology, Perio.do, and style facets defined in the Getty Art & Architecture Thesaurus. Fields of numismatics aren't precisely periods, and so they are linked with the skos:related property rather than skos:exactMatch, the property that Nomisma concepts usually use to link to external thesauri.

The date range of the "when" property is required in Linked Places. It is not currently available in the JSON export, but we are going to add a broad start and end date to each field of numismatics concept, which we can then incorporate into the export after minor modifications to the underlying SPARQL query. Apart from this, the period "name" is not populated, since Nomisma stores only the URIs for these concepts and not any additional metadata we might extract from the target information systems.
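To make the shape of the "when" property more concrete, here is a minimal sketch of how such an object might be assembled. It reflects my reading of the Linked Places format ("timespans" with "start" and "end" objects keyed by "in", and "periods" entries with a "uri") rather than the actual Nomisma export code. The function names, the placeholder period URI, and the years are all hypothetical.

```python
import json
from typing import List, Optional


def iso_year(year: int) -> str:
    """Format a year as a zero-padded, ISO 8601-style string (e.g. -300 -> '-0300')."""
    sign = "-" if year < 0 else ""
    return f"{sign}{abs(year):04d}"


def build_when(period_uris: List[str],
               start_year: Optional[int] = None,
               end_year: Optional[int] = None) -> dict:
    """Build a Linked Places-style 'when' object from period URIs and an optional date range."""
    # Only the URI is stored for each period; the period name is left out,
    # mirroring the situation described above.
    when: dict = {"periods": [{"uri": uri} for uri in period_uris]}
    if start_year is not None and end_year is not None:
        when["timespans"] = [{
            "start": {"in": iso_year(start_year)},
            "end": {"in": iso_year(end_year)},
        }]
    return when


# Placeholder URI and illustrative years only; not real Nomisma data.
example_when = build_when(["http://example.org/period/placeholder"], -300, -27)
print(json.dumps(example_when, indent=2))
```

Until the broad start and end dates exist in the field of numismatics concepts, calling build_when without years yields only the "periods" list, which matches the current state of the export described above.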

Other than these temporary deficiencies, the model is fairly fleshed out. There are about 2,000 places with geographic coordinates in Nomisma.

Linked Places GeoJSON-LD viewed in the GeoJSON sandbox.
 

Hopefully this export will pave the way for Nomisma.org geographic concepts to be integrated into the World Historical Gazetteer, now that the old Pelagios geographic aggregation system has been deprecated.

Friday, July 8, 2022

Esty's die calculations and frequency visualizations added to CRRO

At long last, I've had a gap between a couple of projects to spend some time developing a new feature in Coinage of the Roman Republic Online (CRRO) that Liv Yarrow and Lucia Carbone requested last winter. There are now enough die-linked coins within the Roman Republican Die Project (RRDP) that this network of relationships can be used to dynamically calculate the estimated number of dies for a Roman Republican coin type in CRRO. These calculations are based on formulas in Warren G. Esty's 2006 "How to estimate the original number of dies and the coverage of a sample" and the 2011 follow-up, "The Geometric Model for Estimating the Number of Dies," with p = 1 instead of p = 2, per a later addendum.

A new API in Nomisma was created to perform these underlying calculations and respond with JSON data that are rendered into HTML with a d3.js chart in corresponding pages in CRRO. The http://nomisma.org/apis/dieCounts API executes a SPARQL query to generate a list of named graphs that correspond to die studies (in this case, Richard Schaefer's RRDP, http://nomisma.org/editor/rschaefer) and related type corpora. There is also a list of available formulas for calculating die estimates, although the only one at present is Esty's. The formulas are extensible.

After selecting a formula (http://nomisma.org/apis/dieCounts/esty), the next step in the API call is to provide two request parameters: 'type' for the coin type URI and 'dieStudy' for the named graph URI. For example, http://nomisma.org/apis/dieCounts/esty?dieStudy=http%3A%2F%2Fnomisma.org%2Feditor%2Frschaefer&type=http%3A%2F%2Fnumismatics.org%2Fcrro%2Fid%2Frrc-336.1c
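For anyone scripting against the API, the request URL above can be assembled with percent-encoding handled by the standard library. This is a small sketch; the helper name die_count_url is my own, while the endpoint and the two parameter names come from the example above.

```python
from urllib.parse import urlencode

ESTY_ENDPOINT = "http://nomisma.org/apis/dieCounts/esty"


def die_count_url(die_study: str, coin_type: str, endpoint: str = ESTY_ENDPOINT) -> str:
    """Build a dieCounts request URL with percent-encoded URI parameters."""
    # urlencode escapes ':' and '/' inside the parameter values.
    query = urlencode({"dieStudy": die_study, "type": coin_type})
    return f"{endpoint}?{query}"


url = die_count_url(
    "http://nomisma.org/editor/rschaefer",
    "http://numismatics.org/crro/id/rrc-336.1c",
)
print(url)
```

The resulting string can then be passed to any HTTP client to retrieve the JSON response.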

This JSON response includes the number of specimens (n), number of unique dies (d), singletons (dies that appear on only one coin, d1), estimated coverage (based on these numbers, c_est), estimated number of dies for the type (d_est, given the coverage value), and the minimum and maximum range of dies given 95% confidence.
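As a rough illustration of how these inputs relate, the sketch below derives n, d, and d1 from a list with one die identifier per specimen and computes the coverage estimate as 1 - d1/n. The final value, d / c_est, is a deliberately simplified coverage-based estimate that I am using as a stand-in: it is not necessarily the formula in Esty's articles or in the Nomisma implementation, and the 95% confidence range is omitted entirely. The function name and sample die labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def die_sample_stats(die_per_specimen: Iterable[str]) -> Dict[str, float]:
    """Summarize a die sample given one die identifier per specimen."""
    counts = Counter(die_per_specimen)
    n = sum(counts.values())                          # number of specimens
    if n == 0:
        raise ValueError("at least one specimen is required")
    d = len(counts)                                   # number of unique dies
    d1 = sum(1 for c in counts.values() if c == 1)    # singletons
    c_est = 1 - d1 / n                                # estimated coverage
    # Simplified stand-in, NOT necessarily Esty's published estimator.
    d_simple = d / c_est if c_est > 0 else float("inf")
    return {"n": n, "d": d, "d1": d1, "c_est": c_est, "d_simple": d_simple}


# Hypothetical sample: ten specimens struck from five obverse dies.
sample = ["A", "A", "A", "B", "B", "C", "D", "D", "E", "A"]
stats = die_sample_stats(sample)
print(stats)
```

When every observed die is a singleton, the coverage estimate is zero and no finite estimate is possible, which is why the sketch returns infinity in that case.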

The response includes these calculations for both the obverse and the reverse, if applicable (some types may be linked to only obverse or reverse dies, but not both). Four SPARQL queries are submitted for each side of the coin and the JSON response that results from these queries is used by CRRO to display the results in HTML and a line graph.

The SPARQL queries for the obverse dies associated with RRC 336/1c are as follows:

The first three queries are executed before implementing Esty's die estimate calculations, which are reflected in his 2006 article (with some later revision, as noted above):



The template for applying Esty's calculations is written in XSLT, with some additional math functions that aren't built into the XSLT 2.0 spec. These templates can be extended to include formulas proposed by other scholars.

The end result for RRC 336/1c appears below:

Die estimates and frequency visualization

The frequency statistics are also downloadable as CSV.
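For readers who want to reproduce a similar table from their own die study data, here is a small sketch that counts how many dies are represented by one coin, two coins, and so on, and writes the result as CSV. The column names and sample data are assumptions of mine and may not match the CSV that CRRO actually serves.

```python
import csv
import io
from collections import Counter
from typing import Iterable


def frequency_csv(die_per_specimen: Iterable[str]) -> str:
    """Return CSV text: for each coin count k, the number of dies seen on exactly k coins."""
    coins_per_die = Counter(die_per_specimen)
    dies_per_frequency = Counter(coins_per_die.values())

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["frequency", "dies"])  # hypothetical column names
    for frequency in sorted(dies_per_frequency):
        writer.writerow([frequency, dies_per_frequency[frequency]])
    return buffer.getvalue()


# Hypothetical sample: ten specimens struck from five dies.
csv_text = frequency_csv(["A", "A", "A", "B", "B", "C", "D", "D", "E", "A"])
print(csv_text)
```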

These calculations are performed dynamically when the page is loaded, so they reflect the current state of data entry. The numbers can and probably will change over time as more data are added into the Roman Republican knowledge graph stored in Nomisma.org. In the past, it may have taken weeks or months to aggregate the data necessary to perform these calculations, which can now be generated nearly instantaneously based on tens of thousands of coins linked to both types and dies.