Beyond the draft proposal

Brian Thomas brian_thomas at earthlink.net
Mon Feb 4 15:00:23 PST 2008


Hi Norman, all,

On Monday 04 February 2008 4:48:58 pm Norman Gray wrote:
> 
> Brian and all, hello.
> 
> On 2008 Feb 4, at 18:01, Brian Thomas wrote:
> 
> > On Monday 04 February 2008 10:55:04 am Frederic Hessman wrote:
> >> Starting to think beyond the IVOA draft proposal:
> >>
> >> [snip]
> >
> > Yes. there are too many 'compound' terms. These may generally be  
> > identified
> > by the multiple words which comprise the token.
> 
> I smell mission creep!

	Sure, but the man asked, AND he indicated that he was thinking
	beyond the draft proposal, so all of my suggestions were
	indicative of post-draft work.

> 
> Remember that we're defining _vocabularies_ here.  One of the main  
> distinctions between vocabularies and ontologies is that the former  
> service a different goal from the latter.  That goal is searching, or  
> something very like it; vocabularies are much closer to humans -- to  
> UIs -- than ontologies are, and in consequence they are inevitably  
> messier.

	I'm aware they are different; however, I don't like messy things,
	and it's not clear to me that a vocabulary is purely for human-machine
	interaction, as seems to be implied above. Perhaps it is, but I still
	don't understand why that makes the vocabulary necessarily messy.
	Having a controlled, clean set of unique tokens seems to me a very good
	thing. Do we really have to canvas every possible meaning, and way of
	expressing that meaning, into the vocabulary? Some terms seem to be
	of very limited utility. I point to the earlier example of having "volcano"
	included as a token/term.

	As for the messiness, well, it's a natural impulse for me to want to see
	the plural terms (floating in a sea of singular tokens), the repetition of
	meaning between more than one token, and the degeneracy in meaning of
	a token removed or cleared up.

	As for compound terms, I didn't say we should have none of those, but perhaps 
	the IVOAT is just a bit overboard in this area. Witness the already large 
	size of the thing. Cutting some of these out will surely remove "bloat".
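
	As a rough illustration of how candidates for such a cleanup might be
	flagged automatically, here is a minimal sketch. It assumes the vocabulary
	labels are available as plain strings; the sample labels and the function
	name flag_terms are hypothetical, not taken from the IVOAT. The plural test
	is a naive heuristic and will miss irregular plurals.

```python
import re


def flag_terms(labels):
    """Return (compound, plural) lists of labels that may need review.

    compound: labels made of more than one whitespace-separated word.
    plural:   labels that look plural by a naive English heuristic.
    """
    # A label counts as compound if it has more than one word.
    compound = [t for t in labels if len(t.split()) > 1]
    # Naive plural test: ends in "s", but not in "ss", "us" or "is"
    # (so "mass", "nucleus" and "axis" are not flagged).
    plural = [t for t in labels if re.search(r"[^siu]s$", t.strip().lower())]
    return compound, plural


# Hypothetical sample labels, for illustration only.
labels = ["volcano", "galaxies", "spiral galaxy", "star",
          "radio sources", "nucleus"]

compound, plural = flag_terms(labels)
print("compound:", compound)
print("plural:  ", plural)
```

	A list like this only identifies candidates; whether a given compound or
	plural term stays in the vocabulary would still be an editorial decision.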

	I realize that there is impetus to 'cut a draft'. I agree with this.
	But if you ask what might be done to 'clean up' the draft, *after* creating
	an initial release, then I answer.

> [snip]
> 
> Vocabularies have to comprise the terms that users actually use,  
> minimally tidied up.  The result may be messy and hard to reason with,  
> but that's OK, because the world is messy, and we don't want to reason  
> with vocabularies.

	I think we are back at the issue of "who" is the "user". I think the user
	is drawn from a more technical group: data archivists, application writers
	and the like. It is not scientists or students, who I think are
	your audience. My audience, I believe, is much more comfortable with
	atomic terms, unique meanings for tokens, and simple grammars
	with which to express compound terms. Why would I label data in my
	archive with messy terms? Why would I search for data when the search
	will return a combination of objects which are dissimilar/unrelated?
	Why would I ship out a table to an application where the meaning of the columns is
	not unique? Having messiness like this, well, it's nuts, and makes the vocabulary
	fairly useless for practical applications.
 
> 
> And I think the result _should_ look much like the IAU original.  My  
> impression of what was being aimed at in the IVOAT was a tidied up and  
> updated IAU93.  Let's keep it simple and quick.

	Yes, well, we are beyond simple and quick now. To my mind that would
	have encompassed no more than technical editing (just enough to get
	the IVOAT into SKOS). But we have added terms and have (at last count)
	4 vocabularies in total (are all of those going into the draft??). So it's a
	matter of opinion that the process has been sufficiently limited.

>[snip]
> 
> As a separate thing:
> 
> > And I would also like to add that I'd like to see a *dictionary* of  
> > the vocabulary
> > terms. This then would settle the semantic meaning of these tokens,  
> > which is
> > the crucial missing link between vocabulary and ontology usage. I  
> > have been
> > rebuffed/ignored about adding the definitions to the SKOS vocabulary
> 
> Have you?  Gracious, no: I think it's important to have scope-notes in  
> the vocabulary where possible.  The only problem here is that the  
> definitions in the IAU93 (coming back to that) are rather terse, and  
> in many cases are just the IAU93 term translated to lowercase.  In  
> that case, however, an elaborate scope note may not be vital, because  
> most of those terms are immediately intelligible to the intended  
> community (ie, astronomers) to the degree of precision appropriate to  
> a vocabulary (as opposed to the degree necessary for an ontology).

So... you are in favor of including something beyond repeating the token
name under skos:description?

> 
> > which
> > identify things which should be 'cleaned up' (probably by splitting  
> > the offending token
> > into 2 or more other tokens each with separate meaning).
> 
> I'm all for cleaning things up, but I think we need to be vigilant  
> against this tidyup turning into a full-blown ontological exercise  
> which, as the DM group can tell us, could end up taking five years  
> before anyone notices it's late. How about an effective definition of   
> 'cleaned up enough for the IVOAT initial release' being 'whatever can  
> get done in the next month'?

Sure. My bar for what is 'good enough' for an initial draft is actually pretty low.
The present draft is probably sufficient for community comment. Why do
much more work?

Regards,

=brian
 



