Tuesday, 10 May 2011

URIs for AIM25 access metadata

We've been giving some thought to a suitable URI scheme for this project, one that could mesh with the current AIM25 metadata requirements.

The current AIM25 system contains four types of access records: personal names (Person), corporate names (Organisation), subjects (Subject) and geographic names (Place). To allow semantic linkages to be formed, we shall need a coherent set of URIs that can handle all of these.

For Subjects, one option may be the UKAT thesaurus, which is available as SKOS RDF. Each concept here has already been assigned a URI: for instance, for 'Poetry' this is http://www.ukat.org.uk/thesaurus/concept/525. Note, however, that UKAT is no longer being edited and so may become out of date.
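As a rough illustration of how a subject term could be matched to its concept URI, here is a minimal Python sketch using only the standard library. The SKOS snippet is hand-written for the example (it is not an extract from the real UKAT file), and the function name label_index is our own invention; only the 'Poetry' URI comes from UKAT as quoted above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hand-written sample in SKOS RDF/XML; NOT an extract from the real UKAT file.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.ukat.org.uk/thesaurus/concept/525">
    <skos:prefLabel xml:lang="en">Poetry</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""


def label_index(rdf_xml):
    """Map each lower-cased skos:prefLabel to the URI of its concept."""
    index = {}
    root = ET.fromstring(rdf_xml)
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        for label in concept.findall(f"{{{SKOS}}}prefLabel"):
            if uri and label.text:
                index[label.text.strip().lower()] = uri
    return index


index = label_index(SAMPLE)
print(index.get("poetry"))
```

A lookup like this would only cover terms that match a preferred label exactly; alternative labels and near matches would need extra handling.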

For place names, there are several gazetteers available: the Archives Hub recommends the Getty Thesaurus of Geographic Names (http://www.getty.edu/research/conducting_research/vocabularies/tgn/index.html), but there is not yet a set of URIs authorised by the Getty (although they are working on this).

The LOCAH (Linked Open Copac Archives Hub - http://blogs.ukoln.ac.uk/locah/) project produced a set of guidelines for URIs which look very useful. For each of these categories they would take this form:-

Personal and corporate names

LOCAH recommends the following format for personal names:-


so, once we have decided on a suitable root - let's say for the moment http://data.aim25.ac.uk/, for Burns | John | 1774-1868 | surgeon, we'd have


Similarly for corporate names, we'd have:-


For geographic names, we'd have:-


and finally for subjects:-


These seem to be viable options, although a final decision has yet to be reached.
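To make the idea of minting URIs from access points a little more concrete, here is a small Python sketch. Please treat it as illustration only: the path segments (person, organisation, place, subject) and the slug rule are our own assumptions and are not the LOCAH patterns, and the root is just the provisional one mentioned above.

```python
import re
import unicodedata

BASE = "http://data.aim25.ac.uk/"  # provisional root mentioned in the post

# HYPOTHETICAL path segments, one per AIM25 access record type.
# These are illustrative assumptions, not the LOCAH recommendations.
SEGMENTS = {
    "Person": "person",
    "Organisation": "organisation",
    "Place": "place",
    "Subject": "subject",
}


def slugify(text):
    """Lower-case, drop accents, and collapse other characters to hyphens."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def mint_uri(record_type, *parts):
    """Build a URI from a record type and the components of an access point."""
    segment = SEGMENTS[record_type]  # raises KeyError for unknown types
    slug = slugify(" ".join(p for p in parts if p))
    return f"{BASE}{segment}/{slug}"


print(mint_uri("Person", "Burns", "John", "1774-1868", "surgeon"))
```

Whatever pattern is finally chosen, keeping the rule in one function like this would make it easy to regenerate every URI consistently if the scheme changes.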

Friday, 6 May 2011

A machine readable layer for AIM25

I've been busy on the AIM25 test server adding a machine readable layer.

Of course, AIM25 has for a long time offered the metadata held for each collection as EAD. I've taken this as my jumping-off point and had a go at adding some more formats to the AIM25 arsenal that will hopefully be of use to any silicon-based users of the AIM25 service.

There are still a few screws to tighten, but hopefully this will prove a useful tool for the OMP work to enrich the AIM25 metadata.


First off, I wanted to mimic the browsing structure that we humans take for granted as we make our way from the homepage to a collection page on the website. For this we used Encoded Archival Context (EAC) to list and describe the Institutions and their Collections.

Next we wanted to extend the work started by Gareth and Richard using semantic web services at the collection level. At this level we can access EAD (as we always could). To this we added the dynamically generated output from OpenCalais (OCS) and a modified version of the EAD with the OCS output embedded in the content (EAD aft. OCS). The latter is also dynamically generated.

Lastly we added some browser-side scripting to the original HTML pages to highlight terms identified by OpenCalais. All of the above calls OpenCalais dynamically, so be patient. The eventual goal would be to use a triple store populated from OpenCalais output at the point of creation (and change) of records.
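The highlighting itself is done in the browser, but the idea can be sketched in a few lines of Python. This is not the script running on the test server: the function name highlight, the oc-term CSS class and the sample terms are all assumptions made for the example, and the term list stands in for whatever OpenCalais actually returns.

```python
import html
import re


def highlight(text, terms, css_class="oc-term"):
    """Return HTML with each term wrapped in a <span>; other text is escaped."""
    # Longest terms first, so "John Burns" wins over a shorter overlapping term.
    cleaned = sorted({t for t in terms if t}, key=len, reverse=True)
    if not cleaned:
        return html.escape(text)
    pattern = re.compile("|".join(re.escape(t) for t in cleaned), re.IGNORECASE)
    out, pos = [], 0
    for match in pattern.finditer(text):
        out.append(html.escape(text[pos:match.start()]))
        out.append(
            f'<span class="{css_class}">{html.escape(match.group(0))}</span>'
        )
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


# Sample terms are made up for the example, not real OpenCalais output.
print(highlight("Papers of John Burns, surgeon & lecturer", ["John Burns", "surgeon"]))
```

Matching on plain text and escaping as we go avoids accidentally rewriting markup, which is one of the things a naive search-and-replace over an HTML page can get wrong.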

This work so far is really a demonstration of some possible ways of expressing and enriching AIM25 content. It is by no means an exhaustive (or even authoritative) list of possible formats, but we hope it will serve to make tangible some of the ideas we've been discussing over the past month or so.

Thanks to our firewall-wallahs you can now browse AIM25-OMP here (thankfully in HTML too).

If nothing else, this has been a good exercise in getting to know AIM25 a bit better, whipping my XSLT into a useful shape and, of course, dipping my toe in the semantic ocean.