SKOS concepts in VOTable

Frederic V. Hessman Hessman at Astro.physik.Uni-Goettingen.DE
Fri May 25 02:03:09 PDT 2012


On 25 May 2012, at 05:43, Sebastien Derriere wrote:

>  The need to express SKOS concepts emerged during this interop,
> for example to indicate what SKOS concept corresponds to a column
> contents for the Theory WG.
>  There is no dedicated VOTable element or attribute to do this,
> and so several ideas have been suggested. I try to list these below,
> in order to see what people are thinking would be the best way to go.
> The question of the VOTable representation is only one aspect of the
> problem, because there are issues like being able to query on SKOS
> concepts in TAP for example...

Amazing: people are finally acknowledging that it would be nice to be able to ask for something as totally unusual and unexpected as a "galaxy" or a "star cluster" at a certain position....  ;-)

>  Several concepts coming from different SKOS vocabularies could in
> principle be associated to a single VOTable column.
> 
> Solution 1 :
> Use the "ucd" attribute. The UCD gives a semantic information on what
> a quantity is. With SKOS vocabularies, we have more flexibility, so the
> UCD attribute could be broadened to other vocabularies.
> PROs : use an existing attribute
> CONs : the regexp for ucd attribute in VOTable does not allow the '/'
> character, so it cannot contain URIs. It could break some apps expecting
> to find words from the UCD vocabulary if some alternate vocabularies are
> allowed in the same attribute. It is not possible to describe several
> SKOS concepts, because there is only one attribute.

The CON kills this idea.

> Solution 2 :
> Create another attribute (or element) dedicated to SKOS vocabularies.
> PROs : clean way to do things. Well defined scope, no ambiguities.
> CONs : makes metadata generation more complicated or confusing for data
> providers (generate ucds, utypes, skos, ...). If it is an attribute,
> not possible to describe several SKOS concepts. Requires schema change in
> VOTable 1.3 to allow this solution.

This is the only way to go, since it keeps things backward compatible.  In the long term, we'll have to treat UCDs and utypes as SKOS vocabularies anyway, since that is all they are (and UCDs have already been expressed that way); a common usage model would make life much easier.  Thus a transition towards a better and more flexible semantic model would push things in the right direction.

> Solution 3 :
> Use <LINK> subelements for <FIELD>.
> PROs : already allowed in VOTable 1.2. URIs can be expressed in href="".
> Possible to use content-type and content-role attributes to give additional
> context. Possible to have multiple <LINK> for one <FIELD>.
> CONs : not sure how parsers would handle this or how this can be used
> in TAP...

The idea of a "link" is not really correct.  Also, we need a mechanism which will work just as well in other contexts (e.g. VOEvent, ....), since this is not just a problem for VOTable.
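
For comparison, here is a rough sketch of what Solution 3 would mean for a parser. Everything in it is an assumption for illustration: the example.org URIs are placeholders, the content-role value "type" is a hypothetical convention for "this link names a concept", and the snippet omits the VOTable namespace to stay short.

```python
import xml.etree.ElementTree as ET

# Hypothetical FIELD with two LINK children; only the first one is meant
# to carry a concept URI (marked by an assumed content-role of "type").
SNIPPET = """
<FIELD name="z" datatype="double" ucd="src.redshift">
  <LINK content-role="type" href="http://example.org/vocab#Quasars"/>
  <LINK content-role="doc" href="http://example.org/readme.html"/>
</FIELD>
"""

def concept_links(field, role="type"):
    """Return the href of every LINK child whose content-role equals `role`."""
    return [
        link.get("href")
        for link in field.findall("LINK")
        if link.get("content-role") == role
    ]

field = ET.fromstring(SNIPPET)
concepts = concept_links(field)
print(concepts)
```

The parser has to filter on content-role to tell concept links apart from ordinary documentation links, which is part of why the "link" notion feels like a poor fit.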


A practical option would be comma-separated lists in string attributes (similar to the way UCDs are concatenated now).  The redshift of a quasar would then be labeled

	<someVOTag semantic="ucd:src.redshift,iau93:Quasars" />

Yes, I know that this is a schema definition problem: one can't specify the structure of the string content of an attribute, which is a real pain. On the other hand, the concatenation of UCD terms hasn't been such a headache, so maybe we shouldn't worry too much about it.
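
To show how little work the comma-separated form is for software, here is a minimal sketch in Python. The tag name and the "semantic" attribute come from the example above and exist in no schema; the helper parse_semantic is hypothetical, and it assumes every entry has the form prefix:term.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag and attribute, copied from the example in the text.
SNIPPET = '<someVOTag semantic="ucd:src.redshift,iau93:Quasars" />'

def parse_semantic(value):
    """Split a comma-separated list of prefix:term labels into (prefix, term) pairs."""
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        # Split on the first colon only; an entry without a colon gets an empty prefix.
        prefix, sep, term = item.partition(":")
        if not sep:
            prefix, term = "", item
        pairs.append((prefix, term))
    return pairs

element = ET.fromstring(SNIPPET)
labels = parse_semantic(element.get("semantic", ""))
print(labels)
```

Splitting on the first colon only works for short prefixes like these; full URIs would need a different convention, which is exactly the kind of thing the schema cannot check for us.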

A cheap, effective, but admittedly very ugly alternative would be to use numbered attributes

	<someVOTag sem1="ucd:src.redshift" sem2="iau93:Quasars" />

If the schema foresaw 16 of them (or better 64, a historical reference to the old 640K Windows limit), we'd be able to live with it for quite a while, I bet.  By the time this becomes a problem we'll be able to define content more flexibly: we should push through the adoption of an enumeration string structure in a new version of XML schema, say "xs:stringList".  Until then, the software would simply have to cycle through a search for "sem*" attributes - stupid, but straightforward.
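
The "cycle through sem* attributes" step could look roughly like the sketch below. Again the tag and attribute names are the made-up ones from the example, and collect_sem is a hypothetical helper, not part of any VOTable library.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tag and numbered attributes, copied from the example in the text.
SNIPPET = '<someVOTag sem1="ucd:src.redshift" sem2="iau93:Quasars" />'

# Matches sem1, sem2, ... and captures the number.
SEM_ATTR = re.compile(r"^sem(\d+)$")

def collect_sem(element):
    """Return the values of all semN attributes, ordered by N."""
    found = []
    for name, value in element.attrib.items():
        match = SEM_ATTR.match(name)
        if match:
            found.append((int(match.group(1)), value))
    # Sort numerically so that sem10 comes after sem2.
    return [value for _, value in sorted(found)]

labels = collect_sem(ET.fromstring(SNIPPET))
print(labels)
```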

The attribute solution is better than creating new tags: this could be adopted for all IVOA data structures (e.g. VOEvent, ....) without changing things too much.

Rick

