Vocabularies: next steps

Mon Nov 26 13:00:46 PST 2007

Rick, hello.

On 2007 Nov 26, at 16:02, Frederic V. Hessman wrote:

>> The format of the concept labels (case and character set)
> Careful!
>
> Namespace labels (e.g. http://www.ivoa.net/thesauri/IVOAT#21CmLine):
>
> 	Case:  
> MixedCapitalizationByOnlyRemovingNonAlphabeticAndNumericCharactersSoThatMercuryWillWorkTwiceButOtherwiseVerySimpleToImplement
>
>
> 	Set: a-Z, 0-9
>
> prefLabels (e.g. "21-cm line"):
>
> 	Whatever is used by whoever does it.  E.g. IAU93 and AOIM have  
> spaces and things.

You're quite right.  I meant the concept URI: the concept fragment
should, I believe/agree, be drawn from [a-z0-9], though I wouldn't
push very hard against [a-zA-Z0-9].  The prefLabel and altLabel
fields should be Unicode.
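
As a rough illustration of the two fragment conventions discussed above, here is a minimal Python sketch.  The function names are hypothetical, and the exact rules (split on anything outside ASCII letters and digits, capitalise each run) are my reading of the "mixed capitalization" description, not an agreed specification.  Note that non-ASCII letters in a label are simply dropped here, which may not be what is wanted.

```python
import re


def camel_fragment(pref_label: str) -> str:
    """Mixed-case fragment: capitalise each alphanumeric run, drop the rest."""
    words = re.findall(r"[A-Za-z0-9]+", pref_label)
    # Upper-case only the first character so that internal capitals survive.
    return "".join(w[0].upper() + w[1:] for w in words)


def lower_fragment(pref_label: str) -> str:
    """Lower-case fragment: keep only characters in [a-z0-9]."""
    return re.sub(r"[^a-z0-9]", "", pref_label.lower())


if __name__ == "__main__":
    print(camel_fragment("21-cm line"))
    print(lower_fragment("21-cm line"))
```

Either way, two labels that differ only in case or punctuation (the planet "Mercury" and the element "mercury", say) collapse to the same fragment, so collisions would need to be resolved by hand.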

>> The number of top concepts in the IAU thesaurus
>
> Huh?  The IAU thesaurus is the IAU thesaurus.  If "top concepts"  
> are defined either as 1) not having a BT or 2) having a NT, then  
> the number is already fixed.  Basta.

I agree.  I posted something to this effect in the middle of a longer  
message last week -- it's not something we have control over.
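
To make the point concrete: once the BT relations are written down, the top concepts fall out mechanically.  The sketch below uses the first definition quoted above (a concept with no BT); the function name, the plain-dictionary representation and the concept names in the example are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Set


def top_concepts(broader: Dict[str, Set[str]]) -> Set[str]:
    """Return every concept that has no broader term (BT).

    `broader` maps a concept to the set of its BT concepts.  Concepts that
    appear only as someone else's BT are included automatically.
    """
    concepts = set(broader)
    for parents in broader.values():
        concepts |= parents
    # A concept is a top concept if it has no entry, or an empty one.
    return {c for c in concepts if not broader.get(c)}


if __name__ == "__main__":
    # Hypothetical toy hierarchy, not taken from any real thesaurus.
    toy = {
        "spiralGalaxy": {"galaxy"},
        "galaxy": set(),
        "whiteDwarf": {"star"},
    }
    print(sorted(top_concepts(toy)))
```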

>> The grammatical number of the concept names (singular or plural)
> Singular, please! - it's a real pain to use the formal system of  
> singular concepts and plural countables and I agree that singular  
> should make the vocabulary simple to use

I think this is also a non-issue.  If a term is plural in the  
vocabulary we're adapting (IAU93 and A&A use this convention) then it  
should remain plural in the SKOS version, otherwise we're making  
gratuitous changes; if it's singular in the original vocabulary  
(AOIM) then it should remain singular in SKOS, for the same reason.

>> The number of vocabularies we intend to produce (in particular  
>> whether we produce a pair of `IAU' thesauri, including a corrected  
>> and updated one, and which UCD vocabulary we use), and which  
>> interrelationships we plan to publish
>
> Vocabularies:
>
> 	IAU93, AOIM, UCD1, A&A certainly (since they're easy)
>
> 	IVOAT hopefully (since this is hopefully the most useful)

Yes on all those vocabularies.  I admit to a certain nervousness  
about the UCD vocabulary, since I'm not convinced that's actually  
describing the same type of thing as the others.  If that's a  
problem, however, it'll appear during the production of the document  
and can be resolved then.

I wouldn't want to bet which of the vocabularies will end up the most  
useful in the end...

> Interrelationships:
>
> 	Tricky question:  we don't want to refer too much to IAU93,  
> because the suggestion will be that it's useful (which it really  
> isn't) and UCD1 really doesn't cover very many concepts contained  
> in the above vocabularies.  Stationary targets like the first list  
> are admittedly much easier to do, but I've already started to  
> connect IVOAT and UCD1, which is a good exercise since they are  
> only partially matchable.  IAU93 and IVOAT are so closely related -  
> even with the syntactic and content cleanups - that one could  
> automate that connection without too much trouble.

I'm with you on the potential for trickiness.  However, it might be  
simpler than this.  Perhaps we should just declare as many  
correspondences as we can, and see if a reasoner agrees the result is  
consistent.
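
A real reasoner would do far more than this, but as a sketch of the sort of check I have in mind, the Python below flags any pair of concepts that has been declared both equivalent and hierarchically related.  Treating those two declarations as mutually exclusive is an assumption on my part, and the relation names ("exact", "broader", "narrower"), the function name and the concept identifiers are hypothetical placeholders, not terms from any of the vocabularies.

```python
from typing import Iterable, List, Set, Tuple

# (subject, relation, object), with relation one of the placeholder names.
Mapping = Tuple[str, str, str]


def conflicting_pairs(mappings: Iterable[Mapping]) -> List[Tuple[str, ...]]:
    """Return concept pairs declared both 'exact' and 'broader'/'narrower'."""
    exact: Set[frozenset] = set()
    hierarchical: Set[frozenset] = set()
    for subject, relation, obj in mappings:
        pair = frozenset((subject, obj))  # direction is ignored on purpose
        if relation == "exact":
            exact.add(pair)
        elif relation in ("broader", "narrower"):
            hierarchical.add(pair)
    # Sort inside each pair and across pairs for a stable result.
    return sorted(tuple(sorted(pair)) for pair in exact & hierarchical)


if __name__ == "__main__":
    declared = [
        ("vocabA:x", "exact", "vocabB:y"),
        ("vocabB:y", "broader", "vocabA:x"),   # clashes with the line above
        ("vocabA:z", "narrower", "vocabB:w"),
    ]
    print(conflicting_pairs(declared))
```

An empty result only says that this one rule is satisfied; it is no substitute for running the full set of mappings through a proper reasoner.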

All the best,

Norman

-- 
------------------------------------------------------------
Norman Gray  :  http://nxg.me.uk
eurovotech.org  :  University of Leicester, UK