RDF Proposal (Was: Re: SKOS concepts in VOTable)

Brian Thomas thomas at noao.edu
Wed Jun 6 06:22:12 PDT 2012


On 06/06/2012 07:08 AM, Norman Gray wrote:
> Greetings, all.
>
> On 2012 Jun 5, at 23:30, Norman Gray wrote:
>
>> For those who aren't familiar with it, the idea of RDFa is that it's a smallish extension to the HTML DTD which allows one to embed a broad range of RDF statements into an HTML document.  There's a good example in the Wikipedia article <http://en.wikipedia.org/wiki/Rdfa>, but
>>
>>     <p xmlns:dc="http://purl.org/dc/elements/1.1/">This page was
>>     written by <span property='dc:creator'>Norman</span>.</p>
>>
>> ...illustrates how it can intersperse normal HTML and RDF triples like "<> dc:creator 'Norman'."
>>
>> Now, RDFa is defined with respect to HTML, but there's no reason why one couldn't define an RDFa-like thing for VOTable, and the registry, and any other XML used in the IVOA.  It would mean defining a couple of extra attributes in each of the relevant schemas, and mandating that they're ignored by existing applications.  I think the result of that thought-experiment would look very similar to what you're proposing.
> Actually, I've just tried this, and it works rather well.
>
> Below is a cropped version of one of the sample VOTables in the VOTable spec:
>
> <?xml version="1.0"?>
> <VOTABLE version="1.2"
>           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>           xmlns="http://www.ivoa.net/xml/VOTable/v1.2"
>           xmlns:stc="http://www.ivoa.net/xml/STC/v1.30"
>           xmlns:dc="http://purl.org/dc/elements/1.1/"
>           xmlns:skos="http://www.w3.org/2004/02/skos/core#"
>           xmlns:phys="http://purl.org/astronomy/vocab/PhysicalQuantities/"
>           xmlns:ivoa='http://purl.org/astronomy/vocab/XXXivoa-relations/'>
>    <RESOURCE name="myFavouriteGalaxies">
>      <TABLE name="results">
>        <DESCRIPTION property='dc:title'>Velocities and Distance estimations</DESCRIPTION>
>        <PARAM name="Telescope" datatype="float" ucd="phys.size;instr.tel"
>               unit="m" value="3.6"/>
>        <FIELD name="R" ID="col6" ucd="pos.distance;pos.heliocentric"
>               datatype="float" width="4" precision="1" unit="Mpc"
>               typeof="ivoa:DatabaseColumn" about='#col6'>
>          <DESCRIPTION property='dc:title'
>                       rel='ivoa:hasSkosConcept' href='phys:Distance'
>                       >Distance of Galaxy, assuming H=75km/s/Mpc</DESCRIPTION>
>        </FIELD>
>        <DATA>
>          <TABLEDATA>
>          <TR>
>            <TD>010.68</TD><TD>+41.27</TD><TD>N  224</TD><TD>-297</TD><TD>5</TD><TD>0.7</TD>
>          </TR>
>          </TABLEDATA>
>        </DATA>
>      </TABLE>
>    </RESOURCE>
> </VOTABLE>
>
> Notice the addition of the extra namespaces, and the addition of various property, typeof, about, rel and href attributes.  When I send this through a program which extracts RDF from RDFa (namely rapper from librdf.org, but there should be plenty others), I get:
>
> % rapper -irdfa -oturtle sample-votable-cropped.xml file:///Users/norman/Desktop/vo-rdfa/sample-votable-cropped.xml
> @base <file:///Users/norman/Desktop/vo-rdfa/sample-votable-cropped.xml> .
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix xsi: <http://www.w3.org/2001/XMLSchema-instance> .
> @prefix : <http://www.ivoa.net/xml/VOTable/v1.2> .
> @prefix stc: <http://www.ivoa.net/xml/STC/v1.30> .
> @prefix dc: <http://purl.org/dc/elements/1.1/> .
> @prefix phys: <http://purl.org/astronomy/vocab/PhysicalQuantities/> .
> @prefix ivoa: <http://purl.org/astronomy/vocab/XXXivoa-relations/> .
>
> <>
>      dc:title "Velocities and Distance estimations" .
>
> <#col6>
>      ivoa:hasSkosConcept <phys:Distance> ;
>      dc:title "Distance of Galaxy, assuming H=75km/s/Mpc" ;
>      a ivoa:DatabaseColumn .
>
>
> The "ivoa:" namespace is a fake one, but the others are real.
>
> Immediate problems are that RDFa [1,2] tends to be good about adding these annotations to existing information that's in the form of elements, but adding properties to existing attributes looks slightly more intricate.  A theoretical problem is that RDFa is defined for XHTML, but I can see no reason why it wouldn't work in general for any other XML.
>
> The point of RDFa is that there are no _changes_ to the content of (in this case) the VOTable, but the extra attributes allow precise semantics to be overlaid on the information already present in the file.  It's "principled screenscraping".
>
> Obviously, this is designed to be processed by being parsed into RDF, but there's no _requirement_ that RDF be the end-point, and an application could process these annotations in any way it wanted that was consistent with the RDFa semantics.
>
> So: do we have concrete examples of what it'd be good to do here?  Can we find things we'd want to express, which can't be done this way?  Does this seem to match what you were suggesting, Brian?

Yes, more or less. I honestly hadn't thought of RDFa per se; I was
simply aiming to embed some portion of RDF in an XML document. But what
you outline appears to be very compatible in spirit, and I think your
ideas are an improvement on my proposal. My only worry is about
implementing *all* of the RDFa standard. Based on the use cases we can
gather, I would suggest we endorse only a portion of the standard to
start with, so as not to scare off our potential suitors in other areas
of the IVOA. Perhaps that fear of mine is overblown; I certainly have no
technical reasons for avoiding adoption of all of RDFa, and the bits you
have above are pretty simple and cover quite a lot of possible cases.
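
As a rough illustration of how small such a subset could be, here is a
sketch in Python that handles only the five attributes used in the
example above (about, typeof, rel, href, property). It is not a
conforming RDFa processor: it does no CURIE expansion, no datatypes and
no chaining, and the name extract_triples is invented for this sketch.

```python
import xml.etree.ElementTree as ET

def extract_triples(elem, subject=""):
    """Collect (subject, predicate, object) tuples from a tiny RDFa-like subset.

    Hypothetical helper: only about/typeof/rel/href/property are honoured,
    and prefixed names such as 'dc:title' are left unexpanded.
    """
    triples = []
    # An 'about' attribute sets the subject for this element and its children;
    # otherwise the subject is inherited ("" stands for the document itself).
    subject = elem.get("about", subject)
    if elem.get("typeof"):
        triples.append((subject, "rdf:type", elem.get("typeof")))
    if elem.get("rel") and elem.get("href"):
        triples.append((subject, elem.get("rel"), elem.get("href")))
    if elem.get("property"):
        literal = "".join(elem.itertext()).strip()
        triples.append((subject, elem.get("property"), literal))
    for child in elem:
        triples.extend(extract_triples(child, subject))
    return triples

# Cut-down version of the annotated VOTable quoted above.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
  <TABLE name="results">
    <DESCRIPTION property="dc:title">Velocities and Distance estimations</DESCRIPTION>
    <FIELD name="R" ID="col6" typeof="ivoa:DatabaseColumn" about="#col6">
      <DESCRIPTION property="dc:title" rel="ivoa:hasSkosConcept"
                   href="phys:Distance">Distance of Galaxy, assuming H=75km/s/Mpc</DESCRIPTION>
    </FIELD>
  </TABLE>
</VOTABLE>"""

triples = extract_triples(ET.fromstring(SAMPLE))
for t in triples:
    print(t)
```

The four tuples this collects correspond to the four statements in the
rapper output quoted above, which suggests that the attributes used so
far need very little machinery on the consuming side.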

I'll end by noting that your suggestion above also appears to require a
schema change ;) One difference from your proposal that I might try is
to namespace the new attributes, so that they are easily dropped by an
XSLT script to revert the document to its canonical form. For example,
the changes to your document above would be:

       xmlns:s="http://www.ivoa.net/xml/semanticMarkup/v1.0"


       <FIELD name="R" ID="col6" ucd="pos.distance;pos.heliocentric"
              datatype="float" width="4" precision="1" unit="Mpc"
              s:typeof="ivoa:DatabaseColumn" s:about='#col6'>
         <DESCRIPTION s:property='dc:title'
                      s:rel='ivoa:hasSkosConcept' s:href='phys:Distance'
                      >Distance of Galaxy, assuming H=75km/s/Mpc</DESCRIPTION>
       </FIELD>

Filtering with a very simple XSLT script would then return the document
to its 'non-semantic' state (where it can be validated against the
canonical schema).
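
To make the filtering step concrete, here is a minimal sketch of the
same idea written in Python (standard library only) rather than XSLT.
It assumes the semantic attributes all live in the namespace declared
above; the function name strip_semantic_attributes is made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns:s declaration in the example above.
SEM_NS = "http://www.ivoa.net/xml/semanticMarkup/v1.0"

def strip_semantic_attributes(xml_text, ns=SEM_NS):
    """Return xml_text with every attribute in namespace `ns` removed.

    Hypothetical helper. ElementTree may rewrite namespace prefixes
    (e.g. emit ns0:VOTABLE), but the element and attribute names stay
    the same in the XML-namespace sense.
    """
    root = ET.fromstring(xml_text)
    qualified = "{" + ns + "}"  # ElementTree spells s:about as {uri}about
    for elem in root.iter():
        for name in [a for a in elem.attrib if a.startswith(qualified)]:
            del elem.attrib[name]
    return ET.tostring(root, encoding="unicode")

# Cut-down version of the namespaced example above.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.2"
         xmlns:s="http://www.ivoa.net/xml/semanticMarkup/v1.0">
  <FIELD name="R" ID="col6" s:typeof="ivoa:DatabaseColumn" s:about="#col6">
    <DESCRIPTION s:property="dc:title" s:rel="ivoa:hasSkosConcept"
                 s:href="phys:Distance">Distance of Galaxy</DESCRIPTION>
  </FIELD>
</VOTABLE>"""

cleaned = strip_semantic_attributes(SAMPLE)
print(cleaned)
```

Because the test is on the attribute's namespace and not on its name, a
filter like this keeps working if more RDFa-style attributes are adopted
later.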


Cheers,

-brian

>
> All the best,
>
> Norman
>
> [1] RDFa Primer: http://www.w3.org/TR/xhtml-rdfa-primer/
> [2] RDFa Syntax: http://www.w3.org/TR/rdfa-syntax/
>
>


