SKOS concepts in VOTable

Mon May 28 02:53:53 PDT 2012

On 25 May 2012, at 17:46, Norman Gray wrote:

>>> Solution 2 :
>>> Create another attribute (or element) dedicated to SKOS vocabularies.
>>> PROs : clean way to do things. Well defined scope, no ambiguities.
>>> CONs : makes metadata generation more complicated or confusing for data
>>> providers (generate ucds, utypes, skos, ...). If it is an attribute,
>>> not possible to describe several SKOS concepts. Requires schema change in
>>> VOTable 1.3 to allow this solution.
>> 
>> This is the only way to go, since it makes things backward compatible.  In the long term, we'll have to use UCDs and utypes as SKOS vocabularies anyway, since that is all they are (and UCDs have already been so expressed); a common use model would make life much easier.  Thus a transition towards a better and more flexible semantic model would push things in the right direction.
> 
> I don't think that the element would have to be dedicated to SKOS.
> 
> Since each of the SKOS identifiers is a URI, these could be identified from their form, including the vocabulary they're part of.  Thus <http://purl.org/astronomy/vocab/Algorithms/HartreeFock> is from the algorithms vocabulary, and <http://purl.org/astronomy/vocab/DataObjectTypes/Image> is from the data object type vocabulary.
> 
> This doesn't have to be dedicated to SKOS for the following reason.
> 
> If you dereference <http://purl.org/astronomy/vocab/DataObjectTypes/Image>, and follow the redirection, you get more information about the concept, including the statement that it's a skos:Concept.  That is, any other future non-SKOS things could go in the same slot, and if it was necessary to distinguish SKOS from non-SKOS things, for some reason, that information is to hand.

No, it doesn't have to be SKOS, but it has to be officially semantic.
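
To make the "identified from their form" point concrete, here is a minimal sketch of reading the vocabulary off a concept URI. It assumes (and this is only an assumption drawn from the two purl.org examples above) that the last path segment names the concept and everything before it names the vocabulary; vocabularies that use '#' fragments would need different handling. The function name split_concept_uri is made up for illustration.

```python
def split_concept_uri(uri):
    """Split a concept URI into (vocabulary URI, concept name).

    Assumption: the final path segment is the concept and the rest
    is the vocabulary, as in the purl.org examples in this thread.
    """
    vocabulary, _, concept = uri.rstrip("/").rpartition("/")
    return vocabulary, concept


for uri in (
    "http://purl.org/astronomy/vocab/Algorithms/HartreeFock",
    "http://purl.org/astronomy/vocab/DataObjectTypes/Image",
):
    print(split_concept_uri(uri))
```

Note that this only tells you which vocabulary a term claims to belong to; whether the term is a skos:Concept still requires dereferencing the URI, as Norman describes.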

>>> Solution 3 :
>>> Use <LINK> subelements for <FIELD>.
>>> PROs : already allowed in VOTable 1.2. URIs can be expressed in href="".
>>> Possible to use content-type and content-role attributes to give additional
>>> context. Possible to have multiple <LINK> for one <FIELD>.
>>> CONs : not sure how parsers would handle this or how this can be used
>>> in TAP...
>> 
>> The idea of a "link" is not really correct.  Also, we need a mechanism which will work just as well in other contexts (e.g. VOEvent, ....), since this is not just a problem for VOTable.
> 
> Myself, I think this is the best solution.
> 
> I don't really understand the CON.  Since <LINK> is already part of VOTable, parsers (including TAP clients) can deal with it already (perhaps by simply ignoring it, of course).
> 
> The use of <link> as a generic link is already established in HTML's <head> element, so there's no innovation here.  It simply expresses that there is a link of some sort between the current document and some other resource, with the nature of the link described in the @content-role attribute, orthogonally to @content-type.
> 
> The @content-role attribute is declared in the schema as NMTOKEN, so there are no restrictions on what values it can take.  Some future VOTable standard would simply have to note that content-role='type' (say) has a particular meaning, and that's all the standardisation required.  Indeed, since the value of this attribute is unconstrained, and there are no reserved words (appendix A.1 talks of "allowed values", but isn't normative), people could start using content-role='type' _now_, with no standardisation required at all.
> 
> Since each utype will have an associated documentation URL (that's still the case, isn't it? or has that changed again?), this would incidentally double as a way of associating utypes with parts of the VOTable, without further standardisation.  That is, people could start doing this tomorrow.
> 
> A slight variant on this would be arbitrarily extensible, expressive, and lightweight.

So you're suggesting a standardized @content-role?  Fine with me, but you still don't have content control.
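
For what it's worth, a client picking up such links might look roughly like the sketch below. The FIELD fragment is invented for illustration, the VOTable namespace declaration is left out to keep it short, and content-role='type' is only the value Norman floated above, not anything standardised. The helper name links_with_role is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical FIELD fragment; a real VOTable would carry a namespace.
FIELD_XML = """
<FIELD name="product" datatype="char" arraysize="*">
  <LINK content-role="type"
        href="http://purl.org/astronomy/vocab/DataObjectTypes/Image"/>
  <LINK content-role="doc" href="http://example.org/help"/>
</FIELD>
"""


def links_with_role(field, role):
    """Return the href of every LINK child whose content-role equals role."""
    return [
        link.get("href")
        for link in field.findall("LINK")
        if link.get("content-role") == role
    ]


field = ET.fromstring(FIELD_XML)
print(links_with_role(field, "type"))
```

Nothing here checks what the href points at, which is the lack of content control I mean.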

>> A cheap, effective, but admittedly very ugly alternative would be to use numbered attributes
>> 
>> 	<someVOTag sem1="ucd:src.redshift" sem2="iau93:Quasars" />
>> 
>> If the schema foresaw 16 of them (or better 64, a historical reference to the old 640K Windows limit), we'd be able to live with it for quite a while, I bet.
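
For concreteness, reading such numbered attributes back out is not hard. The following is only a sketch: the sem1, sem2, ... naming comes from the example above, while the function name and the choice to return values in numeric order are assumptions made here for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Matches the hypothetical numbered attribute names sem1, sem2, ...
SEM_ATTR = re.compile(r"sem(\d+)$")


def numbered_semantics(element):
    """Collect the values of semN attributes, ordered by N."""
    found = []
    for name, value in element.attrib.items():
        match = SEM_ATTR.match(name)
        if match:
            found.append((int(match.group(1)), value))
    return [value for _, value in sorted(found)]


tag = ET.fromstring('<someVOTag sem1="ucd:src.redshift" sem2="iau93:Quasars" />')
print(numbered_semantics(tag))
```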

> Rick.  Rick!  Rick: you are joking, aren't you?

Yes and no.  Yes, this would be terrible for human readers, and we would have to blush as we watched it easily perform the function we need; but no, I don't see any other formal (i.e. content-controlled) means of doing it with attributes, which means being forced to use whole tags:

	<someVOTag ....>
		...
		<link content-role="semantic">iau93:Quasar</link>
		<link content-role="semantic">ucd:src.redshift</link>
	</someVOTag>

Fine with me, but it is a definite decision which needs to be carried through all VO protocols in order to make any sense.  Adding attributes is cheaper because it puts semantic content on the same level as most other similar metadata: VOTable doesn't currently have a tag <UCDdescription>src.redshift</UCDdescription> for this reason, but one could change this as well.
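
A consumer of the whole-tag form might look something like this sketch. It assumes the prefix:term convention used in the snippet above and the content-role="semantic" value from that snippet; neither is defined by any current standard, and the function name is made up.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<someVOTag>
  <link content-role="semantic">iau93:Quasar</link>
  <link content-role="semantic">ucd:src.redshift</link>
</someVOTag>
"""


def semantic_labels(element):
    """Return (prefix, term) pairs from child <link content-role="semantic"> tags."""
    labels = []
    for link in element.findall("link"):
        if link.get("content-role") == "semantic" and link.text:
            # Split "iau93:Quasar" into the vocabulary prefix and the term.
            prefix, _, term = link.text.strip().partition(":")
            labels.append((prefix, term))
    return labels


print(semantic_labels(ET.fromstring(SNIPPET)))
```

Unlike the numbered attributes, this places no limit on how many labels a tag can carry, at the price of an extra level of elements.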

My point was that the present schema formalism makes it difficult to do what we would need to do, so the solution is going to be a kludge or compromise of some sort, no matter what.

Rick