Draft draft 0.04

Rob Seaman seaman at noao.edu
Thu Feb 7 13:38:59 PST 2008


> I've put a vocabularies-0.04 at
> <http://www.astro.gla.ac.uk/users/norman/ivoa/vocabularies>, with the
> corresponding issues list at
> <http://www.astro.gla.ac.uk/users/norman/ivoa/vocabularies/issues>.
> The latter has been extended by a couple of issues that seemed
> to arise in the last few days, and has the first two issues  
> [masterformat-1] and [distformat-2] marked as provisionally resolved.

Let's see if my comments here can resolve an issue  
[dowereallywanttodefineeverything-1] before it can be added to the  
list :-)

> Question: Do you agree that I've included adequate mention of the  
> issues of the last couple of days?  If there's an issue you thought  
> was live but which isn't acknowledged here, please do say (I've an  
> uncomfortable feeling I've forgotten something important).

Looks good!

Ok, first a general statement.  Humans layer language with multiple  
meanings.  Computers are completely literal.  We aren't the only  
community to face the dilemma of resolving this.  We are, however, the  
only community with our particular mix of use cases.  Our resolution  
of the issue of multiple definitions may differ from other communities.

The nature of astronomical issues will often simplify the problem.   
Our users are unlikely to make a fuss about distinguishing between  
focal plane images and FITS images, or between the coma of a telescope  
and the coma of a comet - to reference Ed's examples.  Most of these  
straightforward synonyms will sort out in the hierarchies of UCDs or  
the separation of the problem into different vocabularies for solar  
system objects, instrumentation and data formats, for example.

On the other hand, facilities such as VOEvent (and technologies such  
as ontologies) exist precisely to help people discern and convey  
subtle distinctions.  There is a hard version of the issue of synonyms  
that I don't think we can wish away.  Ed's dissection of the Wikipedia  
definition of GRB does a good job of indicating where those boojums  
come in.

The v0.04 text includes this statement, "The purpose of a thesaurus is  
to guide both the indexer and the searcher to select the same  
preferred term[s]".  I think this need not be spelled out in the  
document.  The roles of indexer and searcher don't map onto all the  
roles of interest to the VO community - and the notion that this is  
the sole purpose of a thesaurus is trivial to discount not just by  
appealing to Roget's authority (i.e., his "say so" or "countenance"),  
but also simply since different users may well be drawing from  
different vocabularies.

Rather, a key interest in thesauri is to provide mappings from source  
documents back to whatever controlled vocabulary is of interest for a  
particular use case.  That is - precisely to separate the terms  
selected by the indexer from those chosen by the searcher.  For  
example, I don't want to have to understand a VOEvent publisher's  
preferred term - and with the right thesaurus I may not ever become  
aware of what term the publisher actually used.
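
A minimal sketch of that separation, in Python.  Everything here is
made up for illustration (the concept IDs, the "my-vocab" vocabulary
name and the translate() helper); none of it comes from the draft or
from any real VOEvent stream.  The publisher's term is resolved to a
shared concept, and the searcher only ever sees the label from their
own vocabulary:

```python
# Hypothetical thesaurus: each publisher term maps to a shared concept ID.
PUBLISHER_TO_CONCEPT = {
    "GRB": "concept:gammaRayBurst",
    "gamma-ray burst": "concept:gammaRayBurst",
    "SN Ia": "concept:type1aSupernova",
}

# Hypothetical preferred labels, one table per searcher vocabulary.
PREFERRED_LABEL = {
    "my-vocab": {
        "concept:gammaRayBurst": "gamma-ray burst",
        "concept:type1aSupernova": "type 1a supernova",
    },
}


def translate(publisher_term, vocabulary):
    """Return the searcher's preferred label for a publisher's term, or None."""
    concept = PUBLISHER_TO_CONCEPT.get(publisher_term)
    if concept is None:
        return None  # the thesaurus has no entry for this term
    return PREFERRED_LABEL.get(vocabulary, {}).get(concept)
```

With these tables, translate("GRB", "my-vocab") hands back my own
preferred label, and I never need to know the publisher wrote "GRB".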

Today's brouhaha was about definitions.  The Turtle #spiralGalaxy  
example from v0.04 has:

	skos:definition """A galaxy having a spiral structure."""@en;

This is a good example itself of my point about multiple definitions.   
The brouhaha was about one meaning of the word "definition" - an  
extended description, or as Marcus Aurelius says:

	"Ask yourself, what is this thing in itself, by its own special
	constitution? What is it in substance, and in form, and in matter?
	What is its function in the world? For how long does it subsist?"

(Also Hannibal Lecter to Agent Starling in The Silence of the Lambs.)

The meaning in SKOS, however, is more like "a brief description to  
distinguish one token from another".

To be blunt, I don't give a kangaroo rat's ass about the latter  
issue.  (NB - a "kangaroo rat" is neither a "kangaroo", nor a "rat".)   
I find a definition that merely restates the obvious (a #spiralGalaxy  
is a spiral galaxy) to be without utility - but perhaps others  
disagree.  (Also, isn't distinguishing terms the point of mappings  
like narrower, broader and related?)

Regarding the substantive issue of providing detailed, scientifically  
precise, consensus definitions for these thousands of terms, we simply  
do not have standing.  For some limited set of terms, e.g., "planet",  
some commission or other of the IAU has spoken.  For others, any  
random VO user may have more authority than this WG to speak to a  
term's definition.  (see below under section 2.2)

Specific comments:

Title: ok

version: bump it up to something like 0.94 or nobody will take it  
seriously

authors: good job guys!

editors: we need to identify these ASAP, the document doesn't need  
much work IMHO

Abstract: good and to the point

Status: let's promote this to a public working draft ASAP

TOC: didn't verify

Intro:  excellent analysis of the problem

1.1: ok

1.2:

Replace "<what/> element" with "<Why/> and <What/> elements".

Replace "Gamma Ray Burst" with "gamma-ray burst".  Need to identify  
and follow a general policy for terminology, especially in a document  
about vocabularies :-)

Delete the word "modishly".  No need to be snarky about folksonomies.

Delete ", with its systematising instincts, and aware of the benefits  
of standardisation,"

Replace "supernova1a" with "type 1a supernova".

BTW, how are plurals handled in SKOS?  Meaning - I understand we're  
supposed to pick a coherent naming scheme, either plural or singular,  
but is there some way to recognize SNe as referring to multiple SN?   
Or is my question ill-posed?

SIMBAD is sometimes SIMBAD and sometimes Simbad.  Which is it?

1.3: ok

2: good

2.1:

Delete "NOTE: The purpose..." or perhaps expand the following  
description.

I don't think "a vocabulary (SKOS or otherwise)..." needs to be bolded.

2.2:

Shouldn't be shy about announcing a preferred format (XML versus  
Turtle).

Further discussion about definitions:

A SKOS entry may contain a "definition for the concept, where one  
exists in the original vocabulary".  This notion permits the
many-to-many mapping we need.  A particular vocabulary may include the limited  
"I am a spiral galaxy" type of definition.  A separate vocabulary may  
include a concept with a more fleshed out definition, perhaps  
including (or simply plagiarizing) links covering all of what  
astronomical science has to say about the birth, life and death of  
spiral galaxies.  Cross-linking the concepts provides enough slack to  
say whatever one wants.
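
As a rough illustration of that slack (Python; the "other:" vocabulary,
the "related" key and the definitions_for() helper are all
hypothetical, and only the short spiral galaxy definition is taken
from the v0.04 example), a client could gather whatever definitions
exist for a concept and for the concepts it is cross-linked to:

```python
# Hypothetical concepts; "related" holds cross-links into other vocabularies.
CONCEPTS = {
    "ivoat:spiralGalaxy": {
        "definition": "A galaxy having a spiral structure.",
        "related": ["other:spiralGalaxy"],
    },
    "other:spiralGalaxy": {
        # Placeholder standing in for a longer, fleshed-out definition.
        "definition": "A longer description supplied by some other vocabulary.",
        "related": [],
    },
}


def definitions_for(concept_id):
    """Return (source, definition) pairs for a concept and its direct cross-links."""
    found = []
    for cid in [concept_id] + CONCEPTS[concept_id]["related"]:
        definition = CONCEPTS.get(cid, {}).get("definition")
        if definition:  # skip linked concepts that carry no definition
            found.append((cid, definition))
    return found
```

Each vocabulary keeps only the definition it is willing to stand
behind; the links do the rest.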

2.3:

I remain a bit unclear about where the equivalences are going to  
live.  What document will contain the example "iau93:#SPIRALGALAXY  
map:exactMatch ivoat:#spiralGalaxy"?  I suggest the need for some  
document external to both iau93 and ivoat to make this equivalence.   
How are multiway equivalences conveyed?
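
To make the question concrete, here is one way a third-party mapping
document could be consumed (a Python sketch under my own assumptions:
that exactMatch can be read as symmetric and transitive, and that the
mappings arrive as simple pairs; the "third:" vocabulary and the
equivalence_classes() helper are invented for the example).  Pairwise
statements are merged into multiway groups with a small union-find:

```python
# Mappings held in a hypothetical document external to both vocabularies.
EXACT_MATCHES = [
    ("iau93:#SPIRALGALAXY", "ivoat:#spiralGalaxy"),
    ("ivoat:#spiralGalaxy", "third:#spiral"),  # "third" is a made-up vocabulary
]


def equivalence_classes(pairs):
    """Group terms into sets, treating each pair as symmetric and transitive."""
    parent = {}

    def find(x):
        # Locate the representative of x's group, shortening paths as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return list(groups.values())
```

Under those assumptions the two pairwise statements above collapse
into a single three-way equivalence, without either iau93 or ivoat
having to know about the other.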

3: ok

3.1: ok

3.2:

#5 - typo "defintions" (saw a similar typo somewhere else in the  
document)

#8 - Again, does the publisher of iau93 map to ivoat, or does ivoat  
map to iau93, or both, or one or more third parties?

4: ok (meaning looks like you guys did a vast amount of work)

Appendices: ok

Bibliography:

If the UCD authors are listed, so should the VOEvent authors.

Issues doc:  Good job!  Seem to be well in hand.

Rob


