Beyond the draft proposal

Alasdair Gray agray at dcs.gla.ac.uk
Wed Feb 6 03:11:31 PST 2008


Brian Thomas wrote:
> Hi Alasdair, all,
>
> On Tuesday 05 February 2008, Alasdair Gray wrote:
>   
>> Brian Thomas wrote:
>>     
>>> 	I'm aware they are different, however, I don't like messy things,
>>> 	and it's not clear to me that a vocabulary is purely for human-machine
>>> 	interaction, as seems to be implied above. Perhaps it is, but I still
>>> 	don't understand why that makes the vocabulary necessarily messy. 
>>> 	Having a controlled, clean set of unique tokens, seems to me a very good
>>> 	thing. 
>>>       
>> If the compound terms are going to be in common usage by the astronomers 
>> who will ultimately be using the IVOA software that makes use of the 
>> vocabularies then they need to be first class citizens in the vocabulary 
>> and not derived from some grammar for combining terms.
>>     
>
> 	I have no argument with this, but are you sure that all compound terms
> 	are needed/useful? I suppose this is an eye of the beholder sort of thing,
> 	some of these compound terms look nasty to me.
>   
The question then becomes one of where do you start drawing the lines, 
which I guess is what this discussion is about.
>   
>>> Do we really have to canvas every possible meaning, and way of
>>> 	expressing that meaning, into the vocabulary? Some terms seem to be
>>> 	of very limited utility. I point to the earlier example  of having "volcano" 
>>> 	included as a token/term.
>>>   
>>>       
>> It depends on how wide you want the coverage of your vocabulary to be. 
>> If the idea of the IVOAT is to cover all terms then yes, they all need 
>> to be in there. This does not preclude the setting up of smaller, more 
>> focused vocabularies with clearly defined mappings to the IVOAT.
>>     
>
> 	Well, perhaps it is time to ask (and I suppose this is the sort of thing
> 	Frederick was getting at earlier), what is the purpose of the IVOAT?
>
> 	From my own point of view:
>
> 	As I have written earlier, we are in bad need of a list of standard tokens
> 	which identify astronomical objects, as well as instrumentation, and
> 	all the other concepts which are involved in doing Astronomy and online
> 	research. I don't know how to define the exact scope of the vocabulary better
> 	than that. Probably what I just listed could result in 60,000 terms if one
> 	is fairly pedantic, but I would hope it would be smaller than that..if only
> 	because it would take years to get a 60,000 word vocabulary assembled
> 	and agreed on...
> 	
>   
I'm afraid this is one where I cannot be of assistance due to my lack of 
domain knowledge. I am happy to help verify/validate the resulting 
vocabulary in terms of SKOS compliance, etc.
>> [snip]
>>     
>>>  
>>>   
>>>       
>>>> And I think the result _should_ look much like the IAU original.  My  
>>>> impression of what was being aimed at in the IVOAT was a tidied up and  
>>>> updated IAU93.  Let's keep it simple and quick.
>>>>     
>>>>         
>>> 	Yes, well, we are beyond simple and quick now. To my mind that would
>>> 	have encompassed no more than technical editing (just enough to get 
>>> 	the IVOAT into SKOS). But we have added terms and have (at last count)
>>> 	4 vocabularies in total (are all of those going into the draft??). So it's a 
>>> 	matter of opinion that the process has been sufficiently limited.
>>>   
>>>       
>> The SKOS version of the IAUT should not alter its content at all. 
>> However, the IVOAT should contain the concepts that are in use now.
>>     
>
> 	Agreed.
>
>   
>>>   
>>>       
>>>> [snip]
>>>>         
>>> Soo.. you are in favor of including something beyond repeating the token
>>> name under skos:description? 
>>>   
>>>       
>> I would say that the IAUT, A&A keywords and AOIM vocabularies will 
>> unfortunately not contain very good definitions as the original source 
>> vocabularies are lacking in this area. However, the IVOAT *should*, 
>> actually *must*, contain definitions of all of the concepts, otherwise 
>> the whole exercise is wasted as no-one will know the true meaning of the 
>> concepts. Whether taking these from on-line dictionaries is the best 
>> approach is open for debate.
>>     
>
> 	Well, machine assignment, as a starter, is a good thing.  I didn't 
> 	say we just let the machine assign stuff and forget about it. But 
> 	in my experience, definitions from WordNet usually give you the right 
> 	definition with no trouble at all. Absolutely, a human needs to validate all
> 	the entries, but it's faster to have the machine generate most of the
> 	text and then have a human check (and edit as needed) rather than 
> 	the human going it alone and typing it all in.
>   
That sounds like a reasonable approach to me, particularly since the 
current version just repeats the preferred label which I realise is easy 
but is not helpful for those who do not have any domain knowledge.
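
To make the idea concrete, here is a rough sketch of how the "machine drafts, 
human checks" workflow might look. It is only an illustration under stated 
assumptions: the concept records, the field names (prefLabel, definition, 
status) and the small lookup table are all made up for the example, and the 
lookup table merely stands in for a dictionary source such as WordNet. It 
flags concepts whose definition is missing or just repeats the preferred 
label, fills in a candidate where one is available, and marks every such 
entry as a draft so that it still has to be validated by a person.

```python
# Sketch only: the records and the lookup table below are invented examples.
# A real run would read the SKOS file and query a dictionary such as WordNet.

def normalise(text):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def needs_definition(concept):
    """True if the definition is missing or merely repeats the preferred label."""
    definition = concept.get("definition") or ""
    return normalise(definition) in ("", normalise(concept["prefLabel"]))


def draft_definitions(concepts, lookup):
    """Return copies of the concepts, each with a review status attached.

    'ok'      : already has a definition that differs from the label
    'draft'   : definition was machine-assigned and awaits human checking
    'missing' : no candidate found, a human has to write one
    """
    result = []
    for concept in concepts:
        updated = dict(concept)  # leave the input records untouched
        if not needs_definition(concept):
            updated["status"] = "ok"
        else:
            candidate = lookup.get(normalise(concept["prefLabel"]))
            if candidate:
                updated["definition"] = candidate
                updated["status"] = "draft"
            else:
                updated["status"] = "missing"
        result.append(updated)
    return result


# Hypothetical input: one label-only entry, one good entry, one empty entry.
concepts = [
    {"prefLabel": "Volcano", "definition": "volcano"},
    {"prefLabel": "Spiral galaxy",
     "definition": "A galaxy with spiral arms winding out from a central bulge."},
    {"prefLabel": "Blazar", "definition": None},
]

# Hypothetical stand-in for a dictionary lookup.
lookup = {
    "volcano": "A vent in a planetary crust through which molten rock and gas erupt.",
}

for concept in draft_definitions(concepts, lookup):
    print(concept["prefLabel"], "->", concept["status"])
```

The point of the explicit status is that nothing machine-assigned can be 
mistaken for a validated definition; only entries a human has checked would 
be promoted from "draft".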

Best regards,

Alasdair
> 	Regards,
>
> 	=brian
>
>   
>> Cheers,
>>
>> Alasdair
>>
>>     
