What is a PURL?
A Persistent Uniform Resource Locator (PURL) of an object is a URL that does not resolve directly to a web resource but to an HTTP link resolver, which in turn returns the actual web resource of the object. This decouples the identifier of an object from its actual location (which may change) and thus ensures the continuity of object references. A prominent example of PURLs is the Digital Object Identifier (DOI) system, which is (among other applications) commonly used for scholarly web content. The DOI of a journal article, e.g. http://dx.doi.org/10.1016/S1535-6108(02)00133-2, acts both as a permanent digital identifier and as a link that resolves to the actual location of the article on the web.
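The decoupling of identifier and location can be illustrated with a toy sketch. It is not how any real resolver is implemented: the class name, the in-memory table and all example.org URLs are hypothetical and serve only to show that the identifier stays the same while the location behind it changes.

```python
class ToyLinkResolver:
    """In-memory stand-in for an HTTP link resolver (illustrative only)."""

    def __init__(self) -> None:
        # Maps a persistent identifier to the current location of the resource.
        self._locations = {}

    def register(self, purl: str, location: str) -> None:
        # Create or update the location behind a persistent identifier.
        self._locations[purl] = location

    def resolve(self, purl: str) -> str:
        # A real resolver would answer with an HTTP redirect to this location.
        return self._locations[purl]


resolver = ToyLinkResolver()
purl = "http://purl.example.org/article/42"  # hypothetical identifier

resolver.register(purl, "http://old-host.example.org/articles/42.html")
first_location = resolver.resolve(purl)

# The resource moves; only the resolver entry changes, not the identifier.
resolver.register(purl, "http://new-host.example.org/a/42")
second_location = resolver.resolve(purl)
```

A publication that cites `purl` keeps working after the move, because only the resolver's table had to be updated.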
Naturalis Specimen PURLs
With the publication of specimen PURLs, Naturalis takes a first step in its planned contribution to meeting the need for persistent identifiers in the life sciences. We decided to use PURLs as a persistent identifier mechanism because of their ease of use, relative ease of implementation and strong technical community support. Every Naturalis specimen PURL refers to a physical object in our botanical, geological and zoological collections. If a researcher refers to such an object in a scientific publication via its PURL, it is guaranteed that this reference will persist in the future, even if the location of the physical and data resource has changed. Specimen PURLs have the general form:
where institution is the data owner institution, e.g. Naturalis.
In data served by the NBA, each specimen record stores its PURL in the field unitGUID. By default, the PURL returns the corresponding page of the specimen in the BioPortal in the format text/html. For instance, the PURL for an Anarosaurus specimen:
Specimens can have associated content such as videos or images. PURL resources that are available in different formats are served through content negotiation. This means that one and the same PURL can redirect to different locations depending on the requested content type. The content type is passed in the Accept HTTP header, e.g. with cURL:
curl -XGET -H "Accept:image/jpeg" http://data.biodiversitydata.nl/naturalis/specimen/RGM.443858
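The same request can be sketched in Python with the standard library. This is a minimal sketch, not part of the PURL service: the helper name build_purl_request is hypothetical, and the snippet only constructs the request so that it can be inspected without network access.

```python
import urllib.request


def build_purl_request(purl: str, content_type: str) -> urllib.request.Request:
    """Build a GET request that asks the PURL service for one content type.

    The desired content type is passed in the Accept header, mirroring
    the cURL call above.
    """
    return urllib.request.Request(
        purl,
        headers={"Accept": content_type},
        method="GET",
    )


# Example PURL taken from the cURL call above.
request = build_purl_request(
    "http://data.biodiversitydata.nl/naturalis/specimen/RGM.443858",
    "image/jpeg",
)
```

Passing `request` to `urllib.request.urlopen` would send it; `urlopen` follows HTTP redirects by default, so the response would come from whatever location the PURL resolves to for that content type, if the resource is available.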
The following content types are accessible from a PURL (note that below, the content type in the Accept header is passed via the query parameter __accept):
text/html is the default content type, e.g. http://data.biodiversitydata.nl/naturalis/specimen/RGM.443858/?__accept=text/html
image/jpeg redirects to an image resource, if available, e.g. http://data.biodiversitydata.nl/naturalis/specimen/RGM.443858/?__accept=image/jpeg
audio/mp3 redirects to an audio resource, if available, e.g. http://data.biodiversitydata.nl/xeno-canto/observation/XC144/?__accept=audio/mp3
video/mp4 redirects to a video resource, if available, e.g. http://data.biodiversitydata.nl/naturalis/specimen/RMNH.AVES.110091?__accept=video/mp4
text/json gives the JSON representation of the specimen, as served by the NBA, e.g. http://data.biodiversitydata.nl/naturalis/specimen/RGM.443858/?__accept=application/json
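URLs of the kind listed above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name purl_with_accept is hypothetical, and the sketch assumes that __accept is the only query parameter needed, so any existing query string on the PURL is replaced.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def purl_with_accept(purl: str, content_type: str) -> str:
    """Return the PURL with the content type passed as the __accept parameter.

    Assumption: any query string already present on the PURL is replaced.
    """
    parts = urlsplit(purl)
    # safe="/" keeps the slash in e.g. "image/jpeg" unescaped, matching
    # the form of the example URLs in this document.
    query = urlencode({"__accept": content_type}, safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


image_url = purl_with_accept(
    "http://data.biodiversitydata.nl/naturalis/specimen/RGM.443858/",
    "image/jpeg",
)
```

Such a URL is convenient where setting HTTP headers is impractical, for example when pasting a link into a browser.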
Persistent Identifier Compliancy
With rapidly growing biodiversity data volumes, it becomes very important that collection objects or species occurrences can be unambiguously referenced. The Global Biodiversity Information Facility (GBIF) strongly encourages the use of persistent identifiers as stated in their guide on persistent identifiers.
Below we list the most important general characteristics named in that guide and how PURLs issued by Naturalis comply with them:
A PID is globally unique: Naturalis' PURL structure aims to guarantee global uniqueness for specimen records by combining the data owner institution or organisation and the specimen unitID.
A PID exists indefinitely: Naturalis aims to assure the permanent character of its PURLs.
A PID is unambiguously applied: The specimen PURL service serves digital representations of physical specimens in our collection catalogues. Multiple content types per physical specimen can be requested. Specimen representations are served based on their availability.
A PID is opaque: Opacity means that identifiers should not contain any readable information. The reason is to prevent users from making assumptions about data content based on the identifier. The Naturalis PURL service does not entirely comply with this rule, since the source or owner institute is part of the PURL and specimen unitIDs carry at least some information about the specimen object.
A PID is permanently assigned to an object: Naturalis aims to assure the permanent assignment to an object.
A PID is actionable: Naturalis specimen PURLs are actionable, as they resolve to data entries in the BioPortal for different content types.
A PID allows for universal cross linking of information: Cross linking through PIDs is not yet implemented in the specimen PURL service.