Friday, September 27, 2019

First pass at processing Linked Art JSON-LD to Nomisma RDF

Over the last few weeks, I have been developing a harvester for Linked Art-compliant JSON-LD simultaneously in both Nomisma.org and Kerameikos.org. The two platforms share similar frameworks built around Orbeon XForms for manually editing data or transforming large quantities of it (usually CSV) into RDF, and for connecting these workflows directly to Apache Solr and a SPARQL endpoint. In both platforms, the new feature loads JSON-LD from a URL, transforms it into the XForms 2.0 spec's JSON-to-XML model, and then validates and parses it into RDF/XML on the way into the SPARQL endpoint.

I will write something more comprehensive about how this functions specifically on the Greek pottery side of things. For now, I have successfully tested transforming the Linked Art JSON-LD for a test coin (http://numismatics.org/collection/1944.100.76933.jsonld?profile=linkedart) into the Nomisma.org hybrid data model, which is composed of properties and classes from our own numismatic ontology and properties from other ontologies, like Dublin Core Terms and the Europeana Data Model.

This transformation process strips much of the developer-oriented cruft from the JSON, distilling the model to the essential literals and URIs necessary for connecting a coin, its measurements, images, and coin type URIs to the numismatic knowledge graph in the Nomisma.org SPARQL endpoint.

Basically, it performs the following functions:

  • Maps the preferred term for an object to dcterms:title and the accession number to dcterms:identifier
  • Maps measurements (weight, axis, diameter) to the correct Nomisma property and validates that they conform to the correct units. Inches and centimeters are converted to millimeters for diameter, height, width, and thickness.
  • Places images for each "part" (obverse, reverse) into the appropriate nmo:hasObverse or nmo:hasReverse data object as foaf:depiction. IIIF service URIs are expanded into the edm:WebResource and svcs:Service model that we have appropriated from the Europeana Data Model specification.
  • Presumes that any top-level "type" (classified_as) that is not a Getty or Nomisma URI is a coin type. We would like to discuss this further with the Linked Art community to formalize a method by which we can flag coin type URIs in a more stable and consistent manner.
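
The measurement conversion and coin type rules in the list above could be sketched in Python roughly as follows. This is an illustration only, not the harvester's actual code, which is built on the XForms framework. The function names, the short unit labels, and the host list are assumptions made for the example, and the sample record is hypothetical.

```python
from urllib.parse import urlparse

# Conversion factors to millimeters. The short unit labels are an assumption
# for this sketch; real Linked Art records identify units by URI.
MM_PER_UNIT = {"mm": 1.0, "cm": 10.0, "in": 25.4}

# Hosts whose URIs are treated as general classifications, not coin types.
# The exact host list is an assumption for this sketch.
NON_COIN_TYPE_HOSTS = {"vocab.getty.edu", "nomisma.org"}


def to_millimeters(value, unit):
    """Convert a linear measurement to millimeters, rejecting unknown units."""
    if unit not in MM_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit!r}")
    return round(value * MM_PER_UNIT[unit], 2)


def coin_type_uris(record):
    """Return top-level classified_as ids that are neither Getty nor Nomisma URIs."""
    uris = []
    for entry in record.get("classified_as", []):
        uri = entry.get("id")
        if uri and urlparse(uri).hostname not in NON_COIN_TYPE_HOSTS:
            uris.append(uri)
    return uris


# Hypothetical record, trimmed to the one key this sketch reads.
sample = {
    "classified_as": [
        {"id": "http://vocab.getty.edu/aat/300037222"},
        {"id": "http://nomisma.org/id/coin"},
        {"id": "http://example.org/type/1"},
    ]
}

print(to_millimeters(2, "cm"))
print(coin_type_uris(sample))
```

Filtering on the URI host keeps the rule easy to replace once there is an agreed way to flag coin type URIs explicitly.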

It should be noted that Linked Art hasn't delved deeply into provenance, which would be necessary for encoding coin hoard URIs and findspot metadata.

You can see the resulting RDF/XML (that would get sent into the Nomisma SPARQL endpoint) here: https://gist.github.com/ewg118/049046755a670c3645689c68c14e794b.

This harvester will be adapted as changes are made to the Linked Art model. We hope that this feature in Nomisma will open the door to more streamlined and consistent aggregation of numismatic materials from the broader museum community, especially as we begin to work on new projects that are relevant to the American Art Collaborative.
