Format of tokens

Frederic V. Hessman Hessman at Astro.physik.Uni-Goettingen.DE
Tue Nov 13 09:08:37 PST 2007


> My concern is that there is a discrepancy between Rick’s SKOS model  
> generated by his script and the original files. My feeling is that  
> the SKOS model representing the IAU Thesaurus that is to be  
> published by the IVOA should be an accurate model. If we cannot  
> produce an accurate SKOS model but claim that it is, then people  
> will not trust the IVOAT or any of the semantics works involving  
> vocabularies and ontologies.

The problem was simply that I had forgotten to delete the entries  
that had been turned into aliases.  The real raw statistics are:

	Number of initial entries:  2950
	Number of explicit narrower entries (with BTs):  1226
	Number of explicit broader  entries (with NTs):  512
	Number of entries with references   (with RTs):  2134
	Final number of SKOS Concepts:  2551
	Number of TopConcepts:  1325

Thus, you can't assume that the BTs and NTs are all present in the  
original (trex.txt).  Alasdair's figure of 512 top concepts assumed  
that the IAU thesaurus was reasonably complete and self-consistent.
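A minimal sketch of why the two counts differ. The numbers above are consistent with counting every concept that lacks an explicit BT as a top concept (2551 - 1226 = 1325). The entry layout, the toy terms, and both function names below are assumptions for illustration only; they are not the trex.txt format or the actual conversion script, and the second rule is just one possible reading of a count based on NT entries.

```python
# Toy thesaurus: term -> explicit BT (broader) and NT (narrower) lists.
# The layout and the terms are hypothetical, chosen only to show the counting.
entries = {
    "STARS": {"BT": [], "NT": ["VARIABLE STARS"]},
    "VARIABLE STARS": {"BT": ["STARS"], "NT": []},
    "ORPHAN TERM": {"BT": [], "NT": []},  # no hierarchy recorded at all
}

def top_concepts_by_missing_bt(entries):
    """Every concept without an explicit broader term counts as top-level."""
    return sorted(t for t, e in entries.items() if not e["BT"])

def top_concepts_by_nt_only(entries):
    """Only terms with narrower terms and no broader term count as top-level."""
    return sorted(t for t, e in entries.items() if e["NT"] and not e["BT"])

print(top_concepts_by_missing_bt(entries))  # ['ORPHAN TERM', 'STARS']
print(top_concepts_by_nt_only(entries))     # ['STARS']
```

An entry with neither BTs nor NTs is a top concept under the first rule and invisible under the second, which is how an incomplete hierarchy inflates one count relative to the other.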
>        What should be the base URI for the thesauri? Can we  
> formalise this work within the semantics group and give the  
> thesauri a home within the IVOA domain?
I assume this will be needed only after it's possible to have access  
to the IVOA domain by working members of the semantics group.   I'll  
be happier than most when my own address disappears, but I figured  
that a real "incorrect" address is better than an imaginary or  
cumbersome "correct" address.

Why don't we simply apply for an obvious root URI like

	http://www.ivoa.net/thesauri/ ?
> ·         Looking at the namespace imports, rdfs, owl and iau93 are  
> not used within the document.
True, not yet.  They are easy to remove for now.
> ·         The declared top level concepts should accurately match  
> those of the original IAU Thesaurus. (At the moment Rick’s script  
> does not generate anything close to the proper model here.)
Well, better than you thought and better now that I've found the  
(latest) bug.
> ·         The relationships within each concept need to point to  
> other concepts. (Although Rick has sorted this out, the version on  
> the web is still wrong.)
>
> ·         The 398 terms which declare Use relationships should only  
> appear as skos:altLabel. For example “ab variable stars” should not  
> appear as a concept but as an alternative label for “Bailey Types”  
> and “RR Lyrae Stars”.
This problem is solved (it was the bug).
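For illustration, the intended handling of Use entries might look like the sketch below. The dictionary layout and the name fold_use_entries are hypothetical; the real script and the raw file format may differ.

```python
# Hypothetical raw entries: a "USE" list marks a non-preferred term.
raw = {
    "ab variable stars": {"USE": ["Bailey Types", "RR Lyrae Stars"]},
    "Bailey Types": {},
    "RR Lyrae Stars": {},
}

def fold_use_entries(raw):
    """Return {preferred term: sorted alternative labels}, dropping USE-only entries."""
    # Entries without a USE list become concepts.
    concepts = {term: set() for term, e in raw.items() if not e.get("USE")}
    for term, e in raw.items():
        for target in e.get("USE", []):
            # Attach the alias only if the target really is a concept.
            if target in concepts:
                concepts[target].add(term)
    return {term: sorted(alts) for term, alts in concepts.items()}

concepts = fold_use_entries(raw)
print(concepts)
```

Here "ab variable stars" no longer appears as a concept; it survives only as an alternative label on both of its targets.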
> ·         Agreement on the format of labels. At the moment Rick has  
> left them as they appear in thesaurus files but I feel that it  
> would be more user friendly to use lower case with the first word  
> capitalised.
Frankly, the original document uses (practically) all capitals and we  
want to convert the original thesaurus with as few changes as  
possible (the only point of doing it), so why not keep the original  
labels?  If people hate to be shouted at and think that the IAU93  
isn't very user-friendly, all the better.  Any other format will have  
problems: e.g. you don't really want to turn "BAADE WESSELINK METHOD"  
into "Baade wesselink method" - you want people to use the IVOAT and  
see "Baade-Wesselink method".
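A quick check with Python's built-in string methods shows the problem with any mechanical recasing. This is only an illustration, not part of the conversion script.

```python
label = "BAADE WESSELINK METHOD"

# "Lower case with the first word capitalised" loses the proper name.
sentence_case = label.capitalize()
print(sentence_case)  # Baade wesselink method

# Title case keeps both names but wrongly capitalises "Method".
title_case = label.title()
print(title_case)  # Baade Wesselink Method
```

Neither form can restore the hyphen or know which words are proper names, so the desired label has to be supplied by hand.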
> ·         Agreement on the format of identifiers. The options that  
> have been considered are:
>
> 1.       Generating a new unique identifier, e.g. some number
>
> 2.       Using camel back notation based on the preferred label, so  
> “Bailey Types” would have the identifier “BaileyTypes”
>
> 3.       Using a lower case only version of the preferred label, so  
> “Bailey Types” would have the identifier “baileytypes”
>
> Please see the appropriate thread in the semantics list for a full  
> discussion of this issue.
... from which you'll see that there are few people who really care.   
I still haven't seen any recent complaints about compromise notation  
#2, only earlier and stronger complaints about #1 and #3.  Barring  
complaints, can we simply adopt #2?  There is no perfect solution  
(e.g. "Ba II stars" -> "BaIiStars", which looks like something else).
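As a minimal sketch, options 2 and 3 can be written as follows. The function names camel_id and lower_id are made up for this example, and the actual script may build identifiers differently.

```python
def camel_id(label):
    """Option 2: capitalise each word, then join without spaces."""
    return "".join(word.capitalize() for word in label.split())

def lower_id(label):
    """Option 3: lower-case everything and drop the spaces."""
    return "".join(label.lower().split())

print(camel_id("Bailey Types"))  # BaileyTypes
print(lower_id("Bailey Types"))  # baileytypes
print(camel_id("Ba II stars"))   # BaIiStars
```

The last line reproduces the awkward case: capitalising word by word turns the Roman numeral "II" into "Ii".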
> Once we have agreement on these issues, then the results can be  
> applied to the IVOAT.
... and the rest of the thesauri we're going to generate in this  
exercise.
> Cheers (I think I’m going to go for a long drink to recover from  
> this),
Now you all know how many beers you all owe me.

Rick
