Vocabularies: next steps

Frederic V. Hessman hessman at astro.physik.uni-goettingen.de
Tue Nov 27 03:04:50 PST 2007


>> 	Set: a-Z, 0-9
> You're quite right.  I meant the concept URI: the concept fragment
> should I believe/agree, be drawn from [a-z0-9], though I wouldn't
> push very hard against [a-zA-Z0-9].  The prefLabel and altLabel
> fields should be Unicode.
>
> [AG] I would probably argue for [a-zA-Z0-9]

"a-Z,0-9" was meant to mean exactly this.  By now, I think we can all  
agree on this.
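
A minimal sketch of checking a token against this set, in Python.  The
helper name is_valid_token is hypothetical.  Note that the shorthand
"a-Z" has to be spelled out as a-zA-Z in a regular expression, since
"a-Z" is not a valid character range there.

```python
import re

# Assumption: a token is one or more characters drawn from a-z, A-Z, 0-9.
TOKEN_RE = re.compile(r"[a-zA-Z0-9]+")


def is_valid_token(token: str) -> bool:
    """Return True if the token uses only the agreed character set."""
    return TOKEN_RE.fullmatch(token) is not None
```

With this, "Apple" passes, while "idea-label" and the empty string are
rejected.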

>>> The number of top concepts in the IAU thesaurus
>> Huh?  The IAU thesaurus is the IAU thesaurus.  If "top concepts"
>> are defined either as 1) not having a BT or 2) having a NT, then
>> the number is already fixed.  Basta.
> [AG] I still feel that for the IAU93 Thesaurus we should adopt the list
> of tokens given in the web version. However, I agree with Norman that
> the top concepts are there to aid the navigation and for no other
> reason. When it comes to the IVOAT, I would think that the top concepts
> are those that do not have a BT.

For simplicity and consistency, I would argue that we define "top  
concepts" as those not having a BT.
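
Under this definition, finding the top concepts is mechanical.  A
minimal sketch in Python, assuming the BT links have already been read
out of the SKOS file into a plain dictionary (the toy vocabulary and the
function name top_concepts are invented for illustration):

```python
# Hypothetical toy vocabulary: token -> set of broader-term (BT) tokens.
broader = {
    "Food": set(),
    "Fruit": {"Food"},
    "Apple": {"Fruit"},
    "Drink": set(),
}


def top_concepts(broader):
    """Return, in sorted order, the tokens that have no BT."""
    return sorted(token for token, bts in broader.items() if not bts)
```

Here top_concepts(broader) gives ["Drink", "Food"]: the two entries
without a BT.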

This should be part of the IVOA vocabulary guidelines, e.g. (here's my
first cut):

1. A single SKOS document defines the vocabulary and must be publicly
	available at some URI, preferably at the central IVOA vocabulary
	repository http://www.ivoa.net/????? (at least as a copy).

2. A concept URI has the form

			{URI-root}{vocabulary-name}#{token}

	where the token should consist only of the letters a-z, A-Z, and the
	digits 0-9.  The URI root and vocabulary name should be set centrally
	and not in the definition of each token.  For example, if a nominal
	concept is

			http://www.ivoa.net/Thesauri/Food#Apple

	(root="http://www.ivoa.net/Thesauri/", name="Food", token="Apple"),
	then the SKOS definition begins with

			<skos:Concept rdf:about="#Apple">

3. One is encouraged to use human-readable forms for the tokens with some
	obvious connection to the preferred labels, e.g. conversion from the
	label by dropping characters not included in the above list and
	separating sub-tokens by capitalization (e.g. "My favorite
	idea-label #42" -> "MyFavoriteIdeaLabel42").

4. Vocabulary entries should be singular unless based on previously
	determined sources where the conversion to singular forms would impair
	the usefulness of the vocabulary.

5. Thesaurus entries (BT/NT/RT) are encouraged but not required.

6. If thesaurus entries are included, they should be complete (all BT
	links are reflected in corresponding NT links in the referenced
	entries).

7. "TopConcept" entries should normally be those not having a BT reference,
	but the maintainers of a vocabulary can decide to restrict the choice
	of TopConcepts if appropriate.

8. Use of standard SKOS documentation is encouraged but not required, e.g.

	scopeNote		to clarify usage
	historyNote		to identify when the vocabulary entry was created
	changeNote		to identify changes in already created entries

9. The maintainers of a vocabulary should provide on-line documentation
	permitting the easy perusal of labels and any thesaurus and usage
	information.  The IVOA will try to maintain a list of links to known
	vocabularies and may choose to provide its own consistent on-line
	documentation based on the SKOS files alone.

10. The maintainers of a vocabulary should attempt to cross-reference their
	vocabulary with one or more IVOA-supported vocabularies, e.g. UCD1
	and/or IVOAT.
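
The label-to-token conversion in guideline 3 and the URI form in
guideline 2 can be sketched in a few lines of Python.  This is only one
possible reading of the rule; the function names label_to_token and
concept_uri are hypothetical, and non-ASCII letters are simply dropped
along with punctuation.

```python
import re


def label_to_token(label: str) -> str:
    """Derive a token from a preferred label (one reading of guideline 3)."""
    # Split on runs of anything outside a-z, A-Z, 0-9; this drops them too.
    parts = re.split(r"[^a-zA-Z0-9]+", label)
    # Capitalize the first character of each part, leave the rest untouched.
    return "".join(p[0].upper() + p[1:] for p in parts if p)


def concept_uri(root: str, name: str, token: str) -> str:
    """Compose {URI-root}{vocabulary-name}#{token} as in guideline 2."""
    return f"{root}{name}#{token}"
```

For example, label_to_token("My favorite idea-label #42") returns
"MyFavoriteIdeaLabel42", and concept_uri("http://www.ivoa.net/Thesauri/",
"Food", "Apple") returns "http://www.ivoa.net/Thesauri/Food#Apple".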

Anything else?  Having just Ten Commandments would be nice.
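
The completeness requirement in guideline 6 lends itself to an automatic
check.  A minimal sketch in Python, assuming the BT and NT links are
available as two dictionaries mapping a token to a set of tokens (the
function name missing_nt_links and the toy data are invented for
illustration):

```python
def missing_nt_links(broader, narrower):
    """Return sorted (child, parent) pairs where the child lists the parent
    as a BT but the parent does not list the child as an NT."""
    missing = []
    for child, parents in broader.items():
        for parent in parents:
            if child not in narrower.get(parent, set()):
                missing.append((child, parent))
    return sorted(missing)


# Hypothetical toy data: "Fruit" forgot to list "Apple" as a narrower term.
broader = {"Apple": {"Fruit"}, "Fruit": {"Food"}}
narrower = {"Food": {"Fruit"}, "Fruit": set()}
```

Here missing_nt_links(broader, narrower) reports [("Apple", "Fruit")];
an empty list means every BT link has its matching NT link.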



>>> The grammatical number of the concept names (singular or plural)
>> Singular, please! - it's a real pain to use the formal system of
>> singular concepts and plural countables and I agree that singular
>> should make the vocabulary simple to use
> I think this is also a non-issue.  If a term is plural in the
> vocabulary we're adapting (IAU93 and A&A use this convention) then it
> should remain plural in the SKOS version, otherwise we're making
> gratuitous changes; if it's singular in the original vocabulary
> (AOIM) then it should remain singular in SKOS, for the same reason.
>
> [AG] The issue raises its head when it comes to the IVOAT. However,
> since this is based on the IAU93 thesaurus we could, as I believe is the
> case, just adopt the IAU93 practice.

No, in fact I want to remove the plural terms from IVOAT as soon as
possible (I finally got to this point in my list of things to do).
Any complaints?

External vocabularies like IAU93, AOIM and A&A are pre-defined and so  
are what they are.  With IVOAT, we can choose to have what we want.

>>> I wouldn't want to bet which of the vocabularies will end up the most
> useful in the end...

Well, the whole purpose of IVOAT is to create something useful.  If  
we're already failing, please tell me so I can stop now...... :-(

>> Interrelationships:
>>
>> 	Tricky question:  we don't want to refer too much to IAU93,
>> because the suggestion will be that it's useful (which it really
>> isn't) and UCD1 really doesn't cover very many concepts contained
>> in the above vocabularies.  Stationary targets like the first list
>> are admittedly much easier to do, but I've already started to
>> connect IVOAT and UCD1, which is a good exercise since they are
>> only partially matchable.  IAU93 and IVOAT are so closely related -
>> even with the syntactic and content cleanups - that one could
>> automate that connection without too much trouble.
>
> I'm with you on the potential for trickiness.  However, it might be
> simpler than this.  Perhaps we should just declare as many
> correspondences as we can, and see if a reasoner agrees the result is
> consistent.

Sounds like a good idea to me: we stick in whatever we can manage and
see if anybody notices/benefits.  This is why I would like to test
the UCD1<-->IVOAT connection so that one can ask questions like "I've
got a UCD1 label in my VOTable - is there an IVOAT entry which would
enable me to put it into a more general context?" or "I've got
something easily described by an IVOAT token - can I trivially put
this in a VOTable using some UCD1 label?".  Andeas is interested in
getting the A&A vocabulary convertible to some other vocabulary to
show that, e.g., the MNRAS or ApJ vocabularies can be shown to be
equivalent at some stage - the question is only what intermediary
vocabularies are usable (we've been praying that IVOAT as the
replacement of the SV would be this medium, since the others are not
good/extensive enough).
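
The two questions about the UCD1<-->IVOAT connection amount to a lookup
in each direction over a list of declared correspondences.  A minimal
sketch in Python; the pairs below are placeholders invented for
illustration and are not actual declared mappings, and all function
names are hypothetical.  Since the two vocabularies are only partially
matchable, a lookup may come back empty.

```python
from collections import defaultdict

# Placeholder correspondences (UCD1 label, IVOAT token); NOT real mappings.
PAIRS = [
    ("phot.mag", "Magnitude"),
    ("src.redshift", "Redshift"),
]


def build_index(pairs):
    """Build lookup tables in both directions from (ucd, token) pairs."""
    ucd_to_ivoat = defaultdict(set)
    ivoat_to_ucd = defaultdict(set)
    for ucd, token in pairs:
        ucd_to_ivoat[ucd].add(token)
        ivoat_to_ucd[token].add(ucd)
    return ucd_to_ivoat, ivoat_to_ucd


ucd_to_ivoat, ivoat_to_ucd = build_index(PAIRS)


def ivoat_for_ucd(ucd):
    """IVOAT tokens declared for a UCD1 label (empty list if none)."""
    return sorted(ucd_to_ivoat.get(ucd, ()))


def ucd_for_ivoat(token):
    """UCD1 labels declared for an IVOAT token (empty list if none)."""
    return sorted(ivoat_to_ucd.get(token, ()))
```

Using sets on both sides allows one label to correspond to several
tokens and vice versa, which seems likely given the partial overlap.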

Rick


