numishare: lod

Wednesday, July 24, 2013

Coin Hoards of the Roman Republic – a new tool for Roman numismatics

Today the American Numismatic Society and the Institute of Archaeology of University College London, UK, launch an important new tool for the analysis of Roman Republican coin hoards.

Coin Hoards of the Roman Republic Online (CHRR Online) is a collaboration between Rick Witschonke and Ethan Gruber at the ANS and Dr. Kris Lockyear of UCL.

The new web-based tool makes available in searchable form Dr. Lockyear’s database of 694 Roman Republican coin hoards and the 115,000 coins that they contain. The new tool, which is based on the Numishare technology, makes it possible to browse, search, map and analyze the evidence of Roman coin finds in new and exciting ways.

“The database was initially created on a PC for my PhD, but it has continued to be expanded since then and forms the basis of my book Patterns and Process in the Late Roman Republican Coin Hoards, where it is used, amongst other things, to investigate the size of late Republican coin issues, the date of the import of Republican denarii to Dacia and the patterning created by the events of the civil wars. I am very grateful to Professor Michael Crawford for allowing me access to his unpublished archive held in the British Museum”, noted Dr Lockyear. “It was obvious that the database, the result of over twenty years work, was a valuable resource that could help others in their research if I could make it widely available. As the database continues to grow, updates will be posted to the online version, which I hope will encourage others to share information.”

The potential for the ANS to help in the process of online publication was spotted by curatorial associate Rick Witschonke. “It was clear that Kris’ database dovetailed very neatly with work being carried out at the ANS to create stable identities for numismatic concepts on the web”, explains Witschonke. “We were very fortunate also to be in touch with the curators at the British Museum, Ian Leins and Eleanor Ghey, who generously made available to us the work they had recently undertaken to catalogue the BM collection. By bringing together their data and Kris’ hoard database with the work that ANS has been undertaking at Nomisma.org, we were able to create a new tool based on Linked Open Data principles.”

The creation of the new web tool was the work of ANS database developer Ethan Gruber. The integration of Roman Republican Coinage coin types defined by Nomisma.org into CHRR Online enables maps and timelines showing the geographic and temporal extent of hoards. Furthermore, users of the quantitative analysis interface may compare the distribution of selected typological attributes across numerous hoards, visualizing results in the form of graphs or downloading data in CSV for more sophisticated analyses. For example, one may compare the distribution of mints or issuers across dozens of hoards: a common numismatic query, delivered nearly instantaneously.

“The CHRR project is a wonderful example of the way that ANS is working with multiple partners to create new resources for our members and the whole community of collectors and scholars”, notes ANS Executive Director Ute Wartenberg Kagan. “By sharing our data in standard, open formats, we increase its power hugely. The ANS is currently at the forefront of the development of digital tools for numismatics at an international level. It is tremendously exciting to see another tool launched today.”

Thursday, July 18, 2013

Nomisma: Using XForms to Manage and Publish Linked Open Data

One of the main improvements in the newly redesigned Nomisma web architecture is in the administrative backend, which is not visible to the public. The previous iteration of Nomisma was built on top of open source wiki software. Each id was an XHTML+RDFa fragment in the filesystem, created and edited through the wiki. There was no validation, and hand-coding the XHTML fragments sometimes introduced human error: invalid XML documents that broke page loads or RDF distillation. We needed to move to a more stable and scalable infrastructure.

The XHTML+RDFa fragments remain a part of the new architecture of Nomisma, now maintained in a GitHub repository. The fragments are now edited in an XForms interface with the Orbeon processor, which enables not only editing of XML, but a variety of REST interactions to get and post data into the Apache Fuseki RDF triplestore and SPARQL endpoint, and post data into the Solr search index, which powers the Atom feed.

While the XForms web forms handle the simplest of XHTML templates, such as those for authorities, mints, regions, etc., they do not yet handle editing of the more complex data models, such as those for IGCH hoards (like http://nomisma.org/id/igch0200) or coin types (for example, http://nomisma.org/id/rrc-174.1). However, hoards and coin types are least likely to be manually edited, so the editing interface is most useful for those numismatic concepts which are most likely to be enhanced with additional labels and references to other linked open data identifiers (like VIAF or Pleiades ids).

Validation

One of the main features of XForms is advanced validation. The @typeof attribute in the XHTML root div is tied to a drop down menu. The values in this drop down menu are generated dynamically before the form has finished loading (xforms-model-construct-done) directly from a SPARQL query to acquire all of the nm:numismatic_term ids in Nomisma:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX nm: <http://nomisma.org/id/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?uri ?label WHERE {
  ?uri rdf:type nm:numismatic_term .
  ?uri skos:prefLabel ?label .
  FILTER (lang(?label) = "en")
}
ORDER BY ASC(?label)

A similar query is passed from Orbeon to the endpoint to generate an XForms instance for nm:field_of_numismatics (e.g., Greek Numismatics, Roman Numismatics, etc.). Languages (xml:lang in the div) are also tied to an instance which contains every ISO language code and label.
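As a rough illustration of how the results of such a query become menu items, here is a minimal Python sketch. It assumes the endpoint is asked for the standard SPARQL JSON results format (the Orbeon form may well consume XML results instead), and the sample response with its two rows is hypothetical.

```python
import json

# Hypothetical sample of a SPARQL JSON results document, shaped like what an
# endpoint would return for the query above. The two rows are illustrative only.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["uri", "label"]},
  "results": {"bindings": [
    {"uri": {"type": "uri", "value": "http://nomisma.org/id/mint"},
     "label": {"type": "literal", "xml:lang": "en", "value": "Mint"}},
    {"uri": {"type": "uri", "value": "http://nomisma.org/id/authority"},
     "label": {"type": "literal", "xml:lang": "en", "value": "Authority"}}
  ]}
}
"""


def dropdown_options(response_text):
    """Turn SPARQL JSON bindings into (label, uri) pairs sorted by label."""
    bindings = json.loads(response_text)["results"]["bindings"]
    options = [(b["label"]["value"], b["uri"]["value"]) for b in bindings]
    return sorted(options)


if __name__ == "__main__":
    for label, uri in dropdown_options(SAMPLE_RESPONSE):
        print(f"{label}: {uri}")
```

Because the menu is built from whatever the endpoint returns, a newly added term shows up in the editor without any change to the form itself.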

XForms bindings and XPath also ensure that other requirements of the XHTML document are met: there must be an English preferred label, labels cannot be blank, no two preferred labels may share a language, latitude and longitude must be decimal values between -180 and 180, and related links must be valid URIs.
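The same rules can be restated outside XForms. The following Python sketch is only an illustration of the checks listed above, not the actual bindings: the function name, argument shapes, and error messages are all hypothetical, and the URI check is a simple scheme-and-host test rather than full URI validation.

```python
from urllib.parse import urlparse


def validate_id(pref_labels, lat=None, lon=None, links=()):
    """Return a list of error messages; an empty list means the record passes.

    pref_labels is a list of (language_code, text) pairs.
    """
    errors = []
    langs = [lang for lang, _ in pref_labels]
    if "en" not in langs:
        errors.append("missing English preferred label")
    if any(not text.strip() for _, text in pref_labels):
        errors.append("blank label")
    if len(langs) != len(set(langs)):
        errors.append("duplicate preferred-label language")
    for name, value in (("latitude", lat), ("longitude", lon)):
        if value is None:
            continue
        try:
            number = float(value)
        except (TypeError, ValueError):
            errors.append(f"{name} is not a decimal")
            continue
        # Range as stated in the post; a stricter latitude bound would be -90..90.
        if not -180 <= number <= 180:
            errors.append(f"{name} out of range")
    for link in links:
        parts = urlparse(link)
        # Assumption: "valid URI" is approximated as "has a scheme and a host".
        if not (parts.scheme and parts.netloc):
            errors.append(f"invalid URI: {link}")
    return errors


if __name__ == "__main__":
    print(validate_id([("de", "Rom")], lat="200", links=["not a uri"]))
```

Collecting every failure, instead of stopping at the first, mirrors how a form can flag several invalid fields at once while keeping the save button disabled.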

One of the new features of this interface is the interaction between XForms and dbpedia. It is possible to import labels in languages not already in the Nomisma id from dbpedia RDF. The XForms submission is fairly straightforward:

<xforms:submission id="get-dbpedia-rdf" action="http://dbpedia.org/data/{instance('control-instance')/dbpedia}.rdf" ref="instance('dbpedia')"
    replace="instance" method="get">
    <xforms:message ev:event="xforms-submit-error" level="modal">Failed to get Dbpedia RDF.</xforms:message>
    <xforms:action ev:event="xforms-submit-done" xxforms:iterate="instance('dbpedia')//rdfs:label">
        <xxforms:variable name="lang" select="@xml:lang"/>
        <xforms:action if="not(instance('doc')/xhtml:div[@property='skos:prefLabel'][@xml:lang=$lang])">
            <xforms:insert context="instance('doc')" nodeset="./xhtml:div[@property='skos:prefLabel'][last()]" origin="instance('prefLabel-template')"/>
            <xforms:setvalue ref="instance('doc')/xhtml:div[@property='skos:prefLabel'][last()]" value="context()"/>
            <xforms:setvalue ref="instance('doc')/xhtml:div[@property='skos:prefLabel'][last()]/@xml:lang" value="$lang"/>
        </xforms:action>
    </xforms:action>
</xforms:submission>

Thus it is easy to rapidly incorporate new labels into Nomisma to facilitate multilingual interfaces in other projects which depend on it for data (like OCRE and the UVA collection).
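For readers less familiar with XForms, the logic of the submission (iterate over every rdfs:label and add only those whose language is not yet present) can be sketched in Python. This is an analogy rather than the project's code: the sample RDF/XML is a hypothetical, trimmed stand-in for a dbpedia response, and the existing labels are modeled as a plain dictionary.

```python
import xml.etree.ElementTree as ET

RDFS_LABEL = "{http://www.w3.org/2000/01/rdf-schema#}label"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, trimmed RDF/XML standing in for a dbpedia response.
SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Rome">
    <rdfs:label xml:lang="en">Rome</rdfs:label>
    <rdfs:label xml:lang="fr">Rome</rdfs:label>
    <rdfs:label xml:lang="de">Rom</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def import_labels(existing, rdf_xml):
    """Return a copy of `existing` ({lang: label}) plus labels for new languages."""
    merged = dict(existing)
    for label in ET.fromstring(rdf_xml).iter(RDFS_LABEL):
        lang = label.get(XML_LANG)
        # Mirror the `if` guard in the submission: skip languages already present.
        if lang and lang not in merged:
            merged[lang] = label.text
    return merged


if __name__ == "__main__":
    print(import_labels({"en": "Rome, Italy"}, SAMPLE_RDF))
```

An existing label is never overwritten, so hand-curated preferred labels survive an import.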

Workflow

Since the ids need to be maintained in GitHub in the long term, the editing workflow requires the loading and saving of XHTML+RDFa fragments in the filesystem rather than through the REST interface of an XML database such as eXist.

The workflow is as follows:

1. Load an existing id from the filesystem or create a new one.
2. Edit the id.
3. Save the id. When the document is valid, the save button becomes enabled, and clicking it initiates several processes:
   a. Serialize the id to XML and save it back to the filesystem.
   b. Serialize the XHTML+RDFa into RDF.
   c. Using SPARQL/Update, POST the RDF back into the endpoint. Since POST only adds triples to the subject (e.g., http://nomisma.org/id/rome), leaving old and new statements side by side, the subject must first be flushed from the endpoint before the RDF is sent to Fuseki. Therefore the following SPARQL update must be sent to the endpoint before the newly edited RDF is inserted (wonky and unintuitive, but necessary with SPARQL/Update):

      DELETE {?s ?p ?o} WHERE { <http://nomisma.org/id/rome> ?p ?o . ?s ?p ?o . FILTER (?s = <http://nomisma.org/id/rome>) }

   d. After the RDF is updated in the endpoint, the XHTML+RDFa is serialized into a Solr XML document and posted into the search index (for the Atom feed, although we may implement faceted search/browse eventually). After the doc is sent, a commit is sent to Solr.
4. Finally, a nightly cron job adds new files into the GitHub repo, and then changes are committed and pushed into GitHub. Another job then runs to generate RDF dumps of the Nomisma data, which are available on the nomisma.org home page.
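The order of the requests issued on save can be sketched as follows. This Python outline only builds the request bodies and sends nothing. Several details are assumptions, not taken from the Nomisma code: the target names "fuseki-update" and "solr-update" are placeholders, the Solr field names are hypothetical, the new triples are wrapped in INSERT DATA, and the flush uses the shorter DELETE WHERE form, which should have the same effect as the DELETE statement quoted above.

```python
import xml.etree.ElementTree as ET


def flush_subject(uri):
    """SPARQL Update that removes every triple whose subject is `uri`."""
    return f"DELETE WHERE {{ <{uri}> ?p ?o }}"


def insert_triples(ntriples):
    """SPARQL Update that adds the freshly serialized triples (N-Triples text)."""
    return f"INSERT DATA {{\n{ntriples}\n}}"


def solr_add_doc(fields):
    """Serialize a {field: value} mapping as a Solr XML <add><doc> message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        ET.SubElement(doc, "field", name=name).text = value
    return ET.tostring(add, encoding="unicode")


def save_requests(uri, ntriples, solr_fields):
    """Ordered (target, body) pairs for one save; nothing is sent anywhere."""
    return [
        ("fuseki-update", flush_subject(uri)),      # 1. flush the old subject
        ("fuseki-update", insert_triples(ntriples)),  # 2. add the edited RDF
        ("solr-update", solr_add_doc(solr_fields)),   # 3. reindex the document
        ("solr-update", "<commit/>"),                 # 4. make it searchable
    ]


if __name__ == "__main__":
    triple = ('<http://nomisma.org/id/rome> '
              '<http://www.w3.org/2004/02/skos/core#prefLabel> "Rome"@en .')
    for target, body in save_requests("http://nomisma.org/id/rome", triple,
                                      {"id": "rome", "prefLabel": "Rome"}):
        print(target, body, sep="\n", end="\n\n")
```

Keeping the flush and the insert as two separate, ordered updates makes the dependency explicit: if the flush were skipped, statements removed in the editor would remain in the triplestore.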

This is the gist of the editing workflow in the new version of Nomisma. I plan to improve the XHTML+RDFa editing templates to support a greater degree of complexity in the data model. Additionally, I aim to create an administrative interface to better manage datasets provided by other institutions. The endpoint includes not only Nomisma ids, but RDF provided by OCRE, UVA, CHRR, the ANS, and a portion of the Berlin coinage for Augustus. I want to be able to get VoID RDF files from new data contributors and do consistency checks on RDF dumps before ingesting them into Fuseki. I also want to be able to delete or update all triples from a single institution. This functionality will come eventually. It will become a higher priority once there are more contributors of numismatic data to Nomisma.

All the code discussed above is, of course, open source: https://github.com/ewg118/nomisma/tree/master/xforms
