Thes^H^H^H^HVocabularies (was: A murder of crows)

Norman Gray norman at astro.gla.ac.uk
Wed Nov 21 13:18:02 PST 2007


Brian and Rob, hello.

On 2007 Nov 21, at 18:35, Rob Seaman wrote:

>> Those arguments are to do with audience (expert vs. non-expert)  
>> and previous investments (three important journals already have  
>> actual resources tagged with actual vocabulary items).
>
> So there are a number of curated vocabularies, one of which (the  
> "IAU Thesaurus") is called a thesaurus, but is actually a  
> vocabulary like all the rest?

Until today, I'd had the nagging feeling that a thesaurus was  
something more exotic than it is.  But no, it's just a vocabulary  
plus light structure.  So...

> As far as audience, we should refer to our work products by names  
> designed to reach the non-experts.

Absolutely.  And I think that 'vocabulary' would do perfectly well  
for everyone.

I think we should carry on SKOSifying the structure already contained  
within things like A&A, AOIM and IAU, because it will be valuable and  
is free, but I suggest that outside this group we talk exclusively  
about 'vocabularies', on the grounds that no-one but us knows the  
distinction, and even we're not that bothered.  Yes?



Brian:

> Human-to-machine interaction is not particularly relevant in my mind
> at this time, as it is, and is likely to be, an interface which is
> crafted by the individual archive/repository/tool builder. IF we were
> going ahead to specify a natural language query (NLQ) in which the
> terms of the thesaurus were to be used, then I can see a need for it.
> But a NLQ (particularly one which may be executed across the entire
> IVOA!!) is just far, far away and not as pressing as the issues of
> dataset labeling, machine to machine interchange and development of
> machine understanding of data (e.g. ontologies).

That's interesting -- I see it as very much the other way around!

I'm not thinking of full-scale NLQ, but simply getting the machine to  
do something a bit brighter when a user types 'cataclysmic binary'  
into a VOExplorer search box.  "Ah: 'cataclysmic binary' is part of  
an altLabel of the iau#cataclysmicvariablestars concept, so I'll make  
that concept-query of Registry++; mmm, not many hits, so I'll  
speculatively query iau#binarystars and iau#variablestars as well and  
offer those to the user.  In any case, by this time we're in logic- 
land, so I'll find what CDS-AstroOnt ontology classes have  
iau#cataclysmicvariablestars as a relatedConcept (say), because I  
know that the CDS-AstroOnt classes have links to SIMBAD terms, so I  
can hit the SIMBAD database, too.  Plus, via inter-vocabulary links,  
I now know what A&A concepts these relate to, and from their  
prefLabels know which strings to look up in ADS."  And so on.
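To make the first couple of steps of that story concrete, here is a
minimal sketch in Python.  Everything in it is an assumption made for
illustration: the toy vocabulary, the labels, the idea that
iau#binarystars and iau#variablestars are recorded as 'broader'
concepts, the hit_count callback standing in for a registry query, and
the few_hits threshold.  It is not how VOExplorer or any registry
actually works.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    uri: str
    pref_label: str
    alt_labels: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # URIs of broader concepts


# Toy vocabulary: labels and links are illustrative assumptions,
# not taken from the real IAU thesaurus.
VOCAB = {
    c.uri: c
    for c in [
        Concept("iau#binarystars", "Binary stars"),
        Concept("iau#variablestars", "Variable stars"),
        Concept(
            "iau#cataclysmicvariablestars",
            "Cataclysmic variable stars",
            alt_labels=["Cataclysmic binary stars"],
            broader=["iau#binarystars", "iau#variablestars"],
        ),
    ]
}


def match_concepts(text, vocab):
    """Return concepts whose prefLabel or any altLabel contains the text."""
    needle = text.casefold()
    return [
        c
        for c in vocab.values()
        if any(needle in label.casefold()
               for label in [c.pref_label, *c.alt_labels])
    ]


def plan_queries(text, vocab, hit_count, few_hits=5):
    """List concept URIs to query: each match, plus its broader concepts
    when the match alone returns fewer than few_hits results.

    hit_count is a hypothetical callback (URI -> number of hits) that
    stands in for a real registry query.
    """
    plan = []
    for concept in match_concepts(text, vocab):
        if concept.uri not in plan:
            plan.append(concept.uri)
        if hit_count(concept.uri) < few_hits:
            plan.extend(u for u in concept.broader if u not in plan)
    return plan
```

Calling plan_queries("cataclysmic binary", VOCAB, hit_count) with a
hit_count that reports only a handful of hits gives the matched concept
followed by the two broader ones; with plenty of hits it gives the
matched concept alone.  The hops into an ontology and across to other
vocabularies would follow the same pattern, using different links.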

Now, Brian and Ed could tell (and have frequently told) a very  
similar story using only ontologies; indeed _I've_ told a similar  
story using ontologies.  But ontologies can do more than this, all  
the way up to machine understanding of data, and we will need this,  
just as you say.

What I see the vocabulary stuff as doing is a couple of relatively  
simple things:

   * gathering the low-hanging fruit represented by the minimally
     structured keyword lists already in existence; and
   * helping users get from strings to a controlled vocabulary, and
     thence into ontology-land
   * ...which should help with searching.

So that's why I see the vocabularies stuff as being easier, bringing  
short-term gains, and providing a route into the fuller ontologies  
work to come.
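
As a rough illustration of how little machinery the first of those
needs, here is a sketch that turns an indentation-structured keyword
list into SKOS-style triples.  The input layout, the base prefix and
the URI-minting rule (lowercase the label and drop the spaces) are all
assumptions for the example; none of the real keyword lists is
necessarily laid out this way.  Only the SKOS namespace and the
prefLabel and broader properties are real.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def skosify(lines, base):
    """Turn an indented keyword list into (subject, predicate, object)
    triples, reading deeper indentation as 'narrower than the line above'."""
    triples = []
    stack = []  # (indent, uri) pairs for the ancestors of the current line
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip())
        label = line.strip()
        # Hypothetical URI-minting rule: lowercase the label, drop spaces.
        uri = base + "".join(label.lower().split())
        # Discard entries at the same or deeper indentation: not ancestors.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        triples.append((uri, SKOS + "prefLabel", label))
        if stack:
            triples.append((uri, SKOS + "broader", stack[-1][1]))
        stack.append((indent, uri))
    return triples


keywords = [
    "Stars",
    "  Variable stars",
    "    Cataclysmic variable stars",
    "  Binary stars",
]
for triple in skosify(keywords, "ex#"):
    print(triple)
```

A real conversion would also want altLabels, scope notes and stable
URIs, but the structure that is already in the lists comes across more
or less for free.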

> If "Thesaurus" automatically implies machine-to-human interaction,
> then I apologize, and then move that we change to a compatible term
> which implies "machine-to-machine" instead (vocabulary? dictionary?)

I think 'ontologies' is good....

All the best,

Norman


-- 
------------------------------------------------------------
Norman Gray  :  http://nxg.me.uk
eurovotech.org  :  University of Leicester, UK



