numishare: October 2015

Friday, October 23, 2015

More than 700 Greco-Roman mints updated in Nomisma

Building on Ryan Baumann's concordance between geographic identifiers in the Pleiades Gazetteer of Ancient Places and the Getty Thesaurus of Geographic Names, Dan Pett of the British Museum was able to incorporate these concordances into the Portable Antiquities Scheme database. Dan's Nomisma-Pleiades-TGN concordance R script is on GitHub.

Dan then emailed the Nomisma listserv with a large CSV document of all mints in the PAS database, with associated Nomisma IDs, Getty, BM, GeoNames, DBpedia, Pleiades, etc. I stripped away all of the mints that don't already have Nomisma IDs so that I could upload the CSV into Google Sheets, which then makes it possible to import data from the Atom representation of this spreadsheet into the Nomisma RDF. I expanded all of the concordance ID columns into full URIs for the Nomisma spreadsheet validation process, and then successfully updated 721 Greco-Roman mints to add Getty, BM, GeoNames, and DBpedia URIs as skos:closeMatch objects. Further, the spreadsheet import process parsed the DBpedia URIs to perform a Wikidata lookup, enabling us to add further concordances extracted from Wikidata, including the Wikidata URI itself, plus GND, BnF, and Freebase identifiers. The Wikidata lookup also adds translations as skos:prefLabels, taken from article titles in other languages.
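The expansion of bare concordance IDs into full URIs can be sketched roughly as follows. This is only an illustration, not the actual import code: the column names (nomisma_id, pleiades, tgn, geonames, dbpedia) and the URI prefixes are assumptions about how such a spreadsheet might be laid out.

```python
# Hypothetical column names mapped to assumed URI prefixes.
URI_PREFIXES = {
    "pleiades": "https://pleiades.stoa.org/places/",
    "tgn": "http://vocab.getty.edu/tgn/",
    "geonames": "http://www.geonames.org/",
    "dbpedia": "http://dbpedia.org/resource/",
}

NOMISMA_BASE = "http://nomisma.org/id/"
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def expand_row(row):
    """Return (nomisma_uri, [concordance_uris]) for one spreadsheet row (a dict)."""
    subject = NOMISMA_BASE + row["nomisma_id"].strip()
    matches = []
    for column, prefix in URI_PREFIXES.items():
        value = (row.get(column) or "").strip()
        if not value:
            continue  # no concordance recorded in this column
        # Values that are already full URIs are kept untouched.
        matches.append(value if value.startswith("http") else prefix + value)
    return subject, matches


def to_ntriples(subject, matches):
    """Serialize the concordances as skos:closeMatch statements in N-Triples."""
    return "".join(f"<{subject}> <{SKOS_CLOSE_MATCH}> <{m}> .\n" for m in matches)
```

Each row of the CSV (read, for example, with csv.DictReader) would pass through expand_row, and the resulting statements could then be checked before anything is written to the Nomisma RDF.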

As a result, we have added more than a dozen new translations for Zeugma and a few additional URIs.

Wednesday, October 14, 2015

ANS Launches Online Catalogue with Dar al-Kutub, the Egyptian National Library

The American Numismatic Society (ANS) is pleased to announce, in collaboration with Dr. Jere Bacharach, Department of History at the University of Washington, and Dr. Sherif Anwar, College of Archaeology, Cairo University, the digital publication of the non-hoard numismatic collection of the Egyptian National Library (http://enl.numismatics.org).

The catalog consists of more than 6,500 objects, ranging from late Roman glassware and pre-Islamic Sasanian coinage to the modern Egyptian coinage of Anwar Sadat. The collection is particularly strong in Medieval Islamic coinage across all major dynasties. The catalog differs from its predecessors in a number of ways. The collection has been photographed in color, with inscriptions read and transcribed from these images. The database includes references to the 1982 catalog of the collection undertaken by Dr. Norman D. Nicol.

The interface is available in both English and Arabic, owing to translations provided by Dr. Sherif Anwar. The multilingual interface is driven by numismatic concepts defined by Nomisma.org. Over the course of this project, more than 700 Islamic entities—people, dynasties, corporate entities, mints, etc.—were created in Nomisma, with labels in English, Arabic, and other languages, forming the technical foundation for the aggregation of other Islamic numismatic collections. Geographic coordinates have been included for the majority of Islamic mints, permitting the mapping of the Egyptian National Library collection.

According to Ethan Gruber, the ANS Director of Data Science, "the effort undertaken in defining Islamic entities in a Linked Open Data environment will make it possible to improve the Islamic department in the ANS database, and may make Islamic type corpora similar to Online Coins of the Roman Empire (http://numismatics.org/ocre/) possible in the future." Like other ANS digital projects, the data are freely available with an Open Database License, and are published in the Numishare framework.

The ANS acknowledges the contributions of the individuals who are named at http://enl.numismatics.org/pages/acknowledgments.

(Image information: Glass – Mamluk, Sultanate of Egypt, CE 1250-1517.6057, Egyptian National Library)

For more information contact Joanne Isaac at 212-571-4470 ext. 112 or isaac@numismatics.org.

The American Numismatic Society, organized in 1858 and incorporated in 1865 in New York State, operates as a research museum under Section 501(c)(3) of the Internal Revenue Code and is recognized as a publicly supported organization under section 170(b)(1)(A)(vi) as confirmed on November 1, 1970.

Friday, October 2, 2015

On Open Data and Numismatic Typologies

edit (2 October 2015, 4PM): I want to make it clear that we have been collaborating with numerous members of the Coins and Medals departments for several years now on a few digital projects, including building a close relationship with the Portable Antiquities Scheme. Data usage concerns have been expressed by a small handful of individuals at the British Museum and are not, as far as I can tell, driven by the Trustees of the British Museum.

Can the British Museum make their data available with a Creative Commons license, but then restrict how the data are used?

The short answer is yes.

But the long answer in this case is a bit more complicated. The British Museum has authorized the reuse of their data and images under a CC 4.0 BY-NC-SA license, meaning that anyone has the right to use these data for non-commercial purposes as long as the BM is attributed and the creative works derived from these data and images are likewise freely and openly shared. ANS collaborative projects have always adhered to these requirements. For OCRE, CRRO, and PELLA, we have extracted data from the British Museum SPARQL endpoint and transformed these data into the Nomisma ontology. The full list of datasets is available at http://nomisma.org/datasets, and so one may download the entire BM RDF data dump at once or extract any associated data via the Nomisma SPARQL endpoint. Individual coins are also attributed to their collection throughout the various interfaces in our digital type corpus projects.

As the British Museum license currently stands, we (or anyone) have the right to use these images and data in this manner, without the need to ask the BM permission to do so.

Only if the BM changed their license to the more restrictive ND (No Derivatives) would they be able to exert absolute control over the reuse of their data. This would mean that the public could only download a dump of their CIDOC-CRM RDF in N-Quads. It would not even be permissible to transform these data into RDF/XML for XSLT processing. One could not match the places in their thesaurus to Pleiades URIs and transform the CIDOC CRM into the Open Annotation model used for the Pelagios project. Nor could one generate CSV out of the data to load into Open Refine or Google Fusion Tables for visualization, or to analyze with R. Of course, a CC ND license would obliterate any potential for reuse of British Museum data, and this is certainly why they have not sought to place this draconian license on their data.

What does this have to do with typologies?

All of the numismatic data in the British Museum SPARQL endpoint are open, and nearly every individual specimen contains at least one reference to a coin type number. By poking around the BM data, I was able to figure out that the reference URI containing 'GC30' as a short title refers to Price's The Coinage in the Name of Alexander the Great and Philip Arrhidaeus. I developed a simple SPARQL query that allowed me to extract a list of nearly 3,000 coins from the British Museum that contained Price references. One could extend this query to gather a list of unique Price references rather than objects, and therefore anyone would be able to generate a significant portion of the typologies from the Price catalog. Now, this catalog would not be complete because Price derived some of his typologies from other collections, such as the American Numismatic Society. The BM endpoint also does not contain a full account of all Alexander coins in the BM collection.
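The step from a list of objects to a list of unique type references can be sketched as below. This is a minimal illustration that assumes the query results have already been reduced to (coin URI, type number) pairs. The function name and the sample identifiers are hypothetical and do not reflect the actual structure of the BM response.

```python
from collections import defaultdict


def group_by_type(rows):
    """Group (coin_uri, type_number) pairs into {type_number: [coin_uri, ...]}."""
    types = defaultdict(list)
    for coin_uri, type_number in rows:
        types[type_number.strip()].append(coin_uri)
    return dict(types)


# Hypothetical sample rows: two specimens of one type, one of another.
sample = [
    ("http://example.org/coin/a", "4"),
    ("http://example.org/coin/b", "4"),
    ("http://example.org/coin/c", "79"),
]
types = group_by_type(sample)
unique_types = sorted(types)  # the distinct type numbers found in the rows
```

The keys of the resulting dictionary are the distinct type numbers, and the length of each list shows how many specimens attest each type.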

However, these typologies from Price can be derived from descriptions of individual specimens, and the BM CC 4.0 BY-NC-SA license still applies. This raises the question: can the British Museum exert copyright control over typologies published in print when these same typologies can be freely and openly derived from its own collection database?

In fact, it would be possible to derive other typologies not under British Museum copyright by the same mechanisms. The same goes for the ANS database, which is freely and openly available with an Open Database License. Can the British Museum and ANS even include type numbers within their public databases if it is possible to derive typological data that might be under copyright of another publisher? In the United States, data aren't even copyrightable. And the use of reference numbers in databases, specifically, falls within the realm of Fair Use. If we begin to debate whether or not type numbers may even be referenced on the Web, the only real loser in this debate is the general public.

Proof of Concept: Seleucid Coinage, an American Numismatic Society publication

The URI for Houghton and Lorber's Seleucid Coins: A Comprehensive Catalogue Part I Volume I is http://collection.britishmuseum.org/id/bibliography/6336.

Poking around at the CIDOC CRM structure of the coins associated with SC references, I constructed a SPARQL query that would extract most of the typological data from the endpoint. The SPARQL XML response is a bit messy, as responses that express individual triples tend to be, so I wrote some basic XSLT to convert it into CSV that better reflects individual typologies.

There are only 11 Seleucid coins in the BM system with Houghton and Lorber 2002 references, but I was able to generate a CSV file for all of the typological data for the SC types. The metadata are strings, but one could easily drop this CSV into Google Spreadsheets to clean up. If we were dealing with a typological dataset that consisted of thousands of types, it could be cleaned up in Open Refine in, probably, less than an hour to fully link all concepts to Nomisma URIs.
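For readers who prefer a scripting language to XSLT, the flattening of a SPARQL XML response into CSV might look something like this in Python. It is a sketch of the general technique using the standard SPARQL Query Results XML namespace, not a port of the XSLT described above. It handles only the flattening step and does not merge several result rows into one row per type.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def sparql_xml_to_rows(xml_text):
    """Parse a SPARQL XML response into (variable_names, list_of_row_dicts)."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = dict.fromkeys(variables, "")  # unbound variables stay empty
        for binding in result.findall("sr:binding", NS):
            # Each binding wraps a single <uri>, <literal>, or <bnode> element.
            value = binding[0]
            row[binding.get("name")] = (value.text or "").strip()
        rows.append(row)
    return variables, rows


def rows_to_csv(variables, rows):
    """Write the rows as CSV text, one column per query variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=variables, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The CSV text returned by rows_to_csv can be saved to a file and loaded into a spreadsheet or Open Refine for the cleanup described above.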

I have written a number of PHP scripts (e.g., this one) to transform CSV into NUDS, and so one of these scripts could be adapted to transform the Nomisma-linked CSV into NUDS for direct publication in Numishare. It is possible to go from BM SPARQL queries to a fully functional digital type corpus like OCRE in about a day's worth of work.

So basically, what I have done here is use the BM SPARQL endpoint to extract open data that comprise typologies that have been published by the ANS and are under ANS copyright. I mean, who cares, right?

Thursday, October 1, 2015

On Concordances Between Type Corpora

By now some of you may have heard that the British Museum issued a take-down notice for the PELLA project, a digital type corpus of Argead coins. PELLA will ultimately generate URIs for all Argead types, and will hopefully be the canonical corpus for cataloging museum, archaeological, and private collections alike. The first and most important step in building a large scale digital corpus/data aggregation system (like PELLA, OCRE, CRRO, etc.) is to build a concordance between existing type numbers and future ones. This is especially the case for PELLA, as OCRE and CRRO are merely reflections of the current editions of Roman Imperial Coinage and Roman Republican Coinage. Therefore, Martin Price's 1991 The Coinage in the Name of Alexander the Great and Philip Arrhidaeus, out of print but still in copyright, is just one part of the existing print corpora of Argead coinage.

The British Museum "cannot consent to the Price catalogue typology being released online until we have been able to assess whether the Trustees of the British Museum would approve the inclusion of the Price catalogue material within this online resource" because they erroneously believe that the data published through PELLA are verbatim descriptions from the corpus. We can demonstrate significant enhancements of the typology, so this calls into question whether URIs for Price types can even be minted.

The BM seemingly does not want us to issue typological data under the Price numbering system, despite the enhancements that clearly deviate from the original publication. The BM would find it impossible to enforce this demand, as Price numbers are a standard reference for the coins of Alexander the Great, and nearly every collection uses these numbers--whether commercial auction houses, large museum collections like the ANS, Berlin, or Bibliothèque nationale de France, or tiny little museums like the University of Virginia Art Museum, which owns two Price-numbered coins of Alexander. The entire purpose of numbered type corpora like RIC, RRC, Price, etc. is to create an up-to-date system by which scholars may cite coin types in a standardized way. Price's own corpus wasn't created in a vacuum. His typologies were influenced by those of his predecessors, going back for centuries.

This brings us to the important point of concordances. While PELLA will eventually issue a new set of unique identifiers for Argead coin types (based on a numbering system combining the ruler and mint), currently, the four collections that contribute data on physical specimens to the PELLA project use Price numbers in their databases. It is impossible to aggregate these collections without first establishing a baseline numbering system. It takes little effort to map these collections to Price URIs in PELLA. Moving into the future, there will be a systematic and transparent mapping from Price to future-URIs following Linked Open Data methodologies.
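To give a sense of how little effort this mapping takes, here is a minimal sketch that turns a Price reference from a collection database into a PELLA URI. The URI pattern follows the price.4 example used in this post. The accepted input forms (a bare number or "Price" followed by a number, with an optional single-letter prefix or suffix) are assumptions about how collections might record these references.

```python
import re

PELLA_BASE = "http://numismatics.org/pella/id/price."

# Assumed input forms: "4", "Price 4", with an optional one-letter prefix/suffix.
PRICE_PATTERN = re.compile(r"\s*(?:Price\s+)?([A-Za-z]?\d+[A-Za-z]?)\s*")


def price_uri(reference):
    """Return the PELLA URI for a Price reference string, or None if unparseable."""
    match = PRICE_PATTERN.fullmatch(reference)
    if match is None:
        return None  # leave unrecognized references for manual review
    return PELLA_BASE + match.group(1)
```

References that do not parse are returned as None rather than guessed at, so that a curator can check them by hand.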

The fact of the matter is http://numismatics.org/pella/id/price.4 will NEVER, EVER go away.

Even after we have established a new numbering system, the price.4 URI will be a semantic HTTP 303 See Other redirect to the new URI if you go to price.4 in your browser. If you request RDF for price.4 via content negotiation, it will say that this type has been dcterms:isReplacedBy ale3.amphipolis.4 (or whatever). The RDF for ale3.amphipolis.4 will say that this is a skos:exactMatch for its preceding Price number.
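The behaviour described here can be sketched as a small function. This is an illustration only and not how Numishare is implemented: the mapping table is hypothetical (using the "or whatever" identifier above), the Accept header check is deliberately simplistic, and the RDF is written as N-Triples for brevity.

```python
BASE = "http://numismatics.org/pella/id/"
DCTERMS_IS_REPLACED_BY = "http://purl.org/dc/terms/isReplacedBy"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Hypothetical mapping from a deprecated type id to its successor.
REPLACEMENTS = {"price.4": "ale3.amphipolis.4"}


def respond(type_id, accept):
    """Return (status, headers, body) for a request to a type URI."""
    new_id = REPLACEMENTS.get(type_id)
    if new_id is None:
        # Not deprecated: a real server would render the type itself here.
        return 200, {}, ""
    old_uri, new_uri = BASE + type_id, BASE + new_id
    if "text/html" in accept:
        # Browsers are sent on to the current type with 303 See Other.
        return 303, {"Location": new_uri}, ""
    # RDF clients are told which type replaces the old one.
    body = f"<{old_uri}> <{DCTERMS_IS_REPLACED_BY}> <{new_uri}> .\n"
    return 200, {"Content-Type": "application/n-triples"}, body


def successor_triple(new_id, old_id):
    """The statement in the new type's RDF pointing back to the old number."""
    return f"<{BASE + new_id}> <{SKOS_EXACT_MATCH}> <{BASE + old_id}> .\n"
```

Either way, the old URI keeps resolving: a person following a link lands on the current type, and a harvester can follow the dcterms:isReplacedBy statement to update its own records.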

The Berlin Münzkabinett has already begun to incorporate Price URIs from PELLA into its database. But moving forward, because of the inherent functionality in Numishare for building sustainable concordances between coin types, Berlin will be able to update their database to the most current version of PELLA type URIs through automated mechanisms. The same would be true for the British Museum, if they want in due course to benefit from the work of others freely offered, for the incorporation of PELLA (or OCRE or CRRO...or Getty or Pleiades or Wikidata, for that matter) URIs into the "5 Star" Linked "Open" Data system. This is a topic for another time, but I am feeling increasingly inclined to discuss it.

According to the British Museum, "as far as we can see this catalogue data has not been updated to incorporate any new attributions and contains no new scholarly research." This, of course, ignores the tremendous scholarly effort in mapping typologies into the graph of data in the Nomisma ontology and concept namespaces, as well as the reorganization of obverse and reverse symbols in a way that makes query possible on the Web.

This brings me to my final point:

"We are also concerned that the Open Database License (ODbL v1.0) under which the PELLA project is distributed is not compatible with the Creative Commons License (CC-BY-NC-SA 4.0) license that governs all non-commercial uses of images and data available both through the British Museum Collection Online and the British Museum Semantic Web Collection Online, as well as all non-commercial uses of any documents produced by the British Museum in any format or through any medium."

The PELLA data were created by Andy Meadows and Peter van Alfen, with translations into German and French provided by Karsten Dahmen and Frédérique Duyrat, and the ANS has every right to place an Open Database License on these data, since these typologies are enhancements over what was originally provided in Price. And of course, our migration of their data from CIDOC-CRM into the Nomisma Ontology and integration of it into the Nomisma SPARQL endpoint does not violate any terms of their CC license whatsoever. PELLA is a non-commercial project. We are sharing the data alike, and we are attributing the data to the British Museum, when applicable. And we have linked to the images that they are providing through their own system, which is perfectly acceptable for their CC license. No one complained when the British Museum became partners in OCRE or CRRO, so one has to wonder why they are causing such a stink with this particular project. The British Museum is stifling scholarship and undermining the potential for numismatic research and the integration of numismatic materials into the wider cultural heritage domain. We at the ANS support Open Data and Open Access, and we hope that future scholars may make use of these data for answering more complex research questions.
