This post is a quick start on using Schematron to validate your XCRI-CAP documents and get useful feedback on them.
The instructions here were originally tested on OS X but should also work on Windows or your Linux distro of choice, provided you have Java. If there are any differences I am not aware of, please leave a comment.
Setting up your environment
You will need:
• ISO Schematron
• A Schematron Schema
• An XSLT processor
• An XCRI-CAP document for validation.
ISO Schematron
We’ll be using ISO Schematron, which is available for download from the Schematron website. The file we want is iso-schematron-xslt2.zip; I recommend unzipping it to your desktop and renaming the resulting directory schematron. Place the XCRI-CAP document you wish to validate into this directory.
Processor
The Schematron schema is transformed into an XSLT stylesheet using an XSLT processor; in this example I am going to use Saxon-B for Java, which is an open source XSLT processor.
Once downloaded, you will need to add saxon.jar to your classpath. On OS X 10.5+ you can place it in your /Library/Java/Extensions directory.
Schematron Schema
A Schematron schema is needed to validate your XCRI-CAP XML document. The schema takes the form of XPath expressions; I have written one based on XCRI-CAP 1.1 which you can download here. Feel free to use and modify it. Once downloaded, place it in your schematron directory. You will also need this list of postcodes.
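To give a feel for what such a rule does, here is a minimal sketch in Python rather than in Schematron itself. It mimics the shape of a Schematron rule (pick a context node, run a test inside it, emit a message when the test fails) using the limited XPath support in Python's standard library. The sample document, its element names and the check helper are all made up for illustration and are far simpler than real XCRI-CAP, which uses namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified document. Real XCRI-CAP uses namespaces
# and many more elements.
SAMPLE = """
<catalog>
  <course><title>Biology</title><description>Intro course</description></course>
  <course><title>Chemistry</title></course>
</catalog>
"""


def check(root, context, test, message):
    """Return `message` once for every `context` node where `test` matches nothing."""
    return [message for node in root.findall(context) if node.find(test) is None]


root = ET.fromstring(SAMPLE)
warnings = check(root, ".//course", "description",
                 "A course should have a description.")
print(warnings)
```

The second course has no description, so the list holds one message. A real Schematron schema expresses the same idea declaratively and lets the XSLT processor do the work.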
You should now have a directory named schematron containing multiple files; it is important that you have the following five:
• iso_svrl_for_xslt2.xsl
• iso_schematron_skeleton_for_saxon.xsl
• xcri.iso.sch
• postcodes.xml
• your own XCRI-CAP document!
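If you would like to double-check the directory before going further, a small script can do it. This is only a sketch: the missing_files helper is a hypothetical name, the file names are spelled as they appear in the commands later in this post, and myfile.xml stands in for your own document.

```python
from pathlib import Path

# File names as used by the commands in this post.
REQUIRED = [
    "iso_svrl_for_xslt2.xsl",
    "iso_schematron_skeleton_for_saxon.xsl",
    "xcri.iso.sch",
    "postcodes.xml",
]


def missing_files(directory, document="myfile.xml"):
    """Return the required file names that are not present in `directory`."""
    folder = Path(directory)
    return [name for name in REQUIRED + [document]
            if not (folder / name).is_file()]
```

Calling missing_files("schematron", "myfile.xml") returns an empty list when everything is in place, and otherwise the names that still need to be added.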
Validating your XCRI-CAP documents
First we need to create an XSLT stylesheet which will be used as the validation engine against our XCRI-CAP document.
Open a terminal/prompt, navigate to your schematron directory and issue the following command:
java net.sf.saxon.Transform -o xcri.iso.sch.tmp.xsl -s xcri.iso.sch iso_svrl_for_xslt2.xsl
This uses the Saxon processor we installed earlier to compile our schema into an XSLT stylesheet (xcri.iso.sch.tmp.xsl) by running it through the Schematron XSL. The next step is to run the stylesheet we have just created against our XCRI-CAP document to produce an error report. To do this, run the following, changing ‘myfile.xml’ to the name of your document:
java net.sf.saxon.Transform -o errors.report.xml -s myfile.xml xcri.iso.sch.tmp.xsl
You should now have an error report called errors.report.xml. The report is in SVRL (Schematron Validation Report Language), a simple XML language defined in ISO Schematron. SVRL can be used as the basis for further transformations; to demonstrate this I have written a small XSL stylesheet which transforms the report into an HTML document. If you wish to use it, download it to your schematron directory, make any modifications you require and run the following:
java net.sf.saxon.Transform -o errors.html errors.report.xml svrl_transformer.xsl
You should now have an HTML document of errors, errors.html, in your schematron directory.
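XSLT is not the only way to consume the SVRL report. As a rough sketch, the failed assertions can also be pulled out with Python's standard library. The failed_assertions helper and the sample report below are made up for illustration; the namespace and the failed-assert, text, location and test names are those of SVRL.

```python
import xml.etree.ElementTree as ET

SVRL_NS = "http://purl.oclc.org/dsdl/svrl"


def failed_assertions(root):
    """Return (location, test, message) for each svrl:failed-assert under `root`."""
    results = []
    for failed in root.iter(f"{{{SVRL_NS}}}failed-assert"):
        text = failed.find(f"{{{SVRL_NS}}}text")
        # Collapse the whitespace that the report often contains around messages.
        message = " ".join("".join(text.itertext()).split()) if text is not None else ""
        results.append((failed.get("location", ""), failed.get("test", ""), message))
    return results


# Made-up sample report, much shorter than a real one.
SAMPLE = """<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:failed-assert test="description" location="/catalog/course[2]">
    <svrl:text>
      A course should have a description.
    </svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>"""

for location, test, message in failed_assertions(ET.fromstring(SAMPLE)):
    print(f"{location}: {message}")
```

For a real report, pass ET.parse("errors.report.xml").getroot() instead of the sample.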
Line Numbers
You should now be getting feedback on your XCRI-CAP documents and producing a simple HTML document of errors. If you are using a version of Saxon that allows access to its extensions, you can use the saxon:line-number() extension function to report line numbers.
To do so replace these files in your schematron directory with these modified versions:
iso_schematron_skeleton_for_saxon.xsl
Then run through the validation steps again with the -l flag set, e.g.:
java net.sf.saxon.Transform -l -o xcri.iso.sch.tmp.xsl -s xcri.iso.sch iso_svrl_for_xslt2.xsl
java net.sf.saxon.Transform -l -o errors.report.xml -s myfile.xml xcri.iso.sch.tmp.xsl
java net.sf.saxon.Transform -l -o errors.html errors.report.xml svrl_transformer.xsl
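If you run these steps often, they could be wrapped in a small script. The Python sketch below simply rebuilds the three command lines from this post, optionally with -l, and runs them in order. It assumes java is on your PATH and saxon.jar is on your classpath, as set up earlier; pipeline_commands and run_pipeline are hypothetical names chosen for this example.

```python
import subprocess


def pipeline_commands(document, line_numbers=False):
    """Build the three Saxon command lines from this post, in order."""
    # Arguments are copied verbatim from the commands shown above.
    steps = [
        ["-o", "xcri.iso.sch.tmp.xsl", "-s", "xcri.iso.sch", "iso_svrl_for_xslt2.xsl"],
        ["-o", "errors.report.xml", "-s", document, "xcri.iso.sch.tmp.xsl"],
        ["-o", "errors.html", "errors.report.xml", "svrl_transformer.xsl"],
    ]
    prefix = ["java", "net.sf.saxon.Transform"]
    if line_numbers:
        prefix.append("-l")
    return [prefix + step for step in steps]


def run_pipeline(document, directory="schematron", line_numbers=False):
    """Run each step inside `directory`, stopping at the first failure."""
    for command in pipeline_commands(document, line_numbers):
        subprocess.run(command, cwd=directory, check=True)
```

For example, run_pipeline("myfile.xml", line_numbers=True) would issue the same three commands as above, one after another.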
Don’t have time to give it a go? You can use the ‘beta’ web-based validator, which will validate your XCRI-CAP document using the same steps shown in this post. You can find the validator at:
http://galadriel.cetis.ac.uk/XCRIValidator/
Comments very welcome!
8 Comments
Owen Stephens · March 11, 2010 at 10:29 pm
I’ve given the online version a go, and it seemed to work fine. I do get prompted to login each time I hit the page – although cancelling the login box and carrying on seems to work.
I don’t know if this is helpful, but I was wondering about the validation of any ‘identifier’ elements. The wiki recommends the use of URLs that resolve to human readable content – is there any scope for checking at least the use of an http URI as an identifier, and giving a warning where this isn’t done? Not sure whether this is (a) a good idea or (b) do-able
Owen Stephens · March 16, 2010 at 2:11 pm
I’ve been trying the online validator. Seems to work OK, but I get a popup box telling me ‘authentication is required’ when I access the page – this seems to be untrue – or at least, cancelling the pop up allows you to continue and validate a file.
A couple of comments:
1) It would be good to be able to point the validator at a URL rather than upload a file
2) How does validation work with identifiers? I don’t think (at the moment) the validator checks if the default identifier is a URI (as specified in the wiki http://www.xcri.org/wiki/index.php/Identifier)
ayache · March 23, 2012 at 6:05 pm
Hi Dave,
I got a copy of your iso_svrl_for_xslt2.xsl, which handles the line number. It doesn’t work for me (see a snippet of the report below). As you will see, the line number is always set to -1. I have Saxon-B, which you suggested, in my classpath. Any idea?
Line Number: -1.ISSN should have attribute pub-type=”ppub” for print or pub-type=”epub” for electronic publication.
Line Number: -1.ISSN does not conform to the expected syntax of two groups of four digits separated by a hyphen (-). The final character can be an ‘X’ rather than a number.
ayache · March 23, 2012 at 6:27 pm
Hi Dave
Sorry, it’s me again. I didn’t see that I have to set the -l flag to get line numbers working. You showed above how to set it when running from the command line; how do I set it when running in a Java application (I am currently running the sample in Eclipse)?
Thanks
dms2ect · March 26, 2012 at 10:54 am
Hi
I haven’t tried it in Eclipse so I don’t know. If I get the chance I’ll have a look over the next few days. This tutorial is a bit out of date now, and I would like to hear how you get on with it in Eclipse.
XCRI Blog » Blog Archive » Update on XCRI validator · January 11, 2010 at 2:23 pm
[…] For more info please visit,David’s blog […]
UCAS Course code lookup: Take two » Overdue Ideas · March 11, 2010 at 10:03 pm
[…] a slight confusion as to what makes valid XCRI-CAP – I’ve run the results through the validator blogged by David Sherlock, and get a small number of warnings regarding the lack of ‘descriptions’ for each […]
JISC Assemblies, Conferences and Publications « SSBR Newsletter · March 15, 2010 at 1:22 am
[…] Validating XCRI-CAP using Schematron […]