Vocabularies: next steps

Wed Nov 28 01:46:13 PST 2007

Rob Seaman wrote:
> On Nov 26, 2007, at 2:00 PM, Norman Gray wrote:
>
>> On 2007 Nov 26, at 16:02, Frederic V. Hessman wrote:
>>
>>> MixedCapitalizationByOnlyRemovingNonAlphabeticAndNumericCharactersSoThatMercuryWillWorkTwiceButOtherwiseVerySimpleToImplement 
>>>
>>>            
>>>
>>>     Set: a-Z, 0-9
>>>
>>> prefLabels (e.g. "21-cm line"):
>>>
>>>     Whatever is used by whoever does it.  E.g. IAU93 and AOIM have 
>>> spaces and things.
>>
>> You're quite right.  I meant the concept URI: the concept fragment 
>> should, I believe/agree, be drawn from [a-z0-9], though I wouldn't 
>> push very hard against [a-zA-Z0-9].  The prefLabel and altLabel 
>> fields should be Unicode.
>
> I think mixed case for readability.  With mono-case you always have to 
> force the token up or down before comparisons.  With mixed case you 
> ignore the issue and let mistakes pop back to the user (as failed 
> queries or whatever).
>
> That said, astronomy uses a lot of single letter identifiers in which 
> case matters, e.g., UBVRI versus ubvri.  This emphasizes the point 
> that you want to preserve the case, but also makes one wonder if 
> simply squeezing out the spaces might not generate identifier 
> collisions.  I guess folks will just have to be careful.
In which case, should the identifier be constructed from the full 
expanded term and the abbreviation turned into an alternate label? This 
should prevent identifier clashes within a single vocabulary.
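
To make the collision worry concrete, here is a rough Python sketch, not anything agreed on the list. It assumes the fragment is built by simply deleting every character outside [A-Za-z0-9] and preserving case, which is my reading of Frederic's proposal. The function names, the sample labels other than "21-cm line" and UBVRI/ubvri, and the "Johnson B band" concept are all hypothetical.

```python
import re
from collections import defaultdict


def make_fragment(label):
    """Strip everything outside [A-Za-z0-9]; case is preserved (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9]", "", label)


def find_collisions(labels, ignore_case=False):
    """Group labels by fragment and return only the groups with more than one label."""
    groups = defaultdict(list)
    for label in labels:
        fragment = make_fragment(label)
        key = fragment.lower() if ignore_case else fragment
        groups[key].append(label)
    return {key: group for key, group in groups.items() if len(group) > 1}


def make_concept(full_term, abbreviation):
    """Identifier from the full expanded term; the abbreviation becomes an altLabel."""
    return {"fragment": make_fragment(full_term), "altLabels": [abbreviation]}


# Hypothetical labels; only "21-cm line" and UBVRI/ubvri come from this thread.
labels = ["21-cm line", "UBVRI", "ubvri", "B V", "BV"]

exact = find_collisions(labels)                     # clashes caused by squeezing out spaces
folded = find_collisions(labels, ignore_case=True)  # extra clashes if case were forced

concept = make_concept("Johnson B band", "B")       # hypothetical concept
```

With these labels, "B V" and "BV" both become "BV", so squeezing out spaces alone produces a clash, while UBVRI and ubvri only clash if the case is folded. Building the fragment from the expanded term, as in make_concept, keeps the short abbreviation out of the identifier altogether.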

Alasdair
>
>>> Singular, please! - it's a real pain to use the formal system of 
>>> singular concepts and plural countables and I agree that singular 
>>> should make the vocabulary simple to use
>>
>> I think this is also a non-issue.  If a term is plural in the 
>> vocabulary we're adapting (IAU93 and A&A use this convention) then it 
>> should remain plural in the SKOS version, otherwise we're making 
>> gratuitous changes; if it's singular in the original vocabulary 
>> (AOIM) then it should remain singular in SKOS, for the same reason.
>
> Sounds reasonable.
>
>> I wouldn't want to bet which of the vocabularies will end up the most 
>> useful in the end...
>
> Just as long as at least one of them proves useful to somebody.
>
>>> Tricky question:  we don't want to refer too much to IAU93, because 
>>> the suggestion will be that it's useful (which it really isn't) and 
>>> UCD1 really doesn't cover very many concepts contained in the above 
>>> vocabularies.  Stationary targets like the first list are admittedly 
>>> much easier to do, but I've already started to connect IVOAT and 
>>> UCD1, which is a good exercise since they are only partially 
>>> matchable.  IAU93 and IVOAT are so closely related - even with the 
>>> syntactic and content cleanups - that one could automate that 
>>> connection without too much trouble.
>>
>> I'm with you on the potential for trickiness.  However, it might be 
>> simpler than this.  Perhaps we should just declare as many 
>> correspondences as we can, and see if a reasoner agrees the result is 
>> consistent.
>
> Sounds like a plan.
>
> - Rob
>
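
As a very small stand-in for "declare correspondences and see whether the result is consistent", the Python sketch below checks a single rule: the same pair of concepts should not be declared both an exact match and a broader, narrower or related match. This is far less than a real reasoner would do. The relation names follow the SKOS mapping properties, and the concept identifiers are hypothetical, not taken from IVOAT or IAU93.

```python
def find_conflicts(mappings):
    """Return concept pairs declared as exactMatch and also as another mapping relation.

    mappings is an iterable of (source, relation, target) triples.
    """
    exact = set()
    other = set()
    for source, relation, target in mappings:
        pair = frozenset((source, target))  # direction is ignored for this check
        if relation == "exactMatch":
            exact.add(pair)
        elif relation in ("broadMatch", "narrowMatch", "relatedMatch"):
            other.add(pair)
    return sorted(tuple(sorted(pair)) for pair in exact & other)


# Hypothetical correspondences, for illustration only.
mappings = [
    ("ivoat:Star", "exactMatch", "iau93:Stars"),
    ("ivoat:Star", "broadMatch", "iau93:Stars"),
    ("ivoat:Galaxy", "exactMatch", "iau93:Galaxies"),
]

conflicts = find_conflicts(mappings)
```

Here only the Star/Stars pair is reported, because it was declared both exact and broader; the Galaxy mapping passes.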