Plural terms and their multiple definitions

Norman Gray norman at astro.gla.ac.uk
Mon Feb 11 09:57:59 PST 2008


Rob, hello.

On 2008 Feb 11, at 14:19, Rob Seaman wrote:

> Well, if the only recourse provided is to include separate terms for  
> singular and plural, e.g., for SN and for SNe, then it is certainly  
> preferred to attach them as altLabels to the same concept.

Imagine you go into a library and ask for books about 'Widgets'.   
You're directed to the right part of the stacks, but although all the  
books there are indeed about 'Widgets', none are the book you want.   
You go back to the desk and complain.  Ah, says the librarian, perhaps  
you want the books about 'Widget', they're over in _that_ direction.   
You'd probably hit them.

Whether it's labelled 'supernova', 'supernovas', 'supernovae' or  
'supernova'@fr, it's all the same underlying concept, <#Supernovas>.

> However, I must still believe that the librarians have some way not  
> only of seeing that SN and SNe refer to the same objects (well,  
> subjects), but to go further and perceive the difference in number  
> expressed.  Dictionaries will list the plural under the singular's  
> heading, but do provide information to tell the two apart.

Librarians are perfectly able to distinguish one from several, and  
they carefully avoid the distinction when filing books.  If 'SN' or  
'SNe' appeared in a librarian's taxonomy, it would be next to an  
instruction saying 'file under Supernovas'.  That way, all the books  
about supernovae end up filed together, rather than in five different  
places, under 'SN', 'SNe', 'Supernova', 'Supernovas' and 'Supernovae',  
which doesn't help anyone.

In idiomatic English one says `I want a book about supernovae' or `I
want a book about aberration', rather than `...about supernova' or
`...about aberrations'.  This, as far as I understand it, is the
origin of the convention we've mentioned about which English-language
thesaurus terms are pluralised and which are singular.

So much for books.  What we're doing here is precisely analogous.  We  
want to identify which URI should be attached to a resource to help  
people find it (that is, where we want the resource to be uniquely  
filed).

In this context, the distinction between one supernova and several  
doesn't matter, and if a user turns up with either of those strings,  
they should be told to `go and look under <#Supernovas>'.
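To make the 'file under' idea concrete, here is a minimal sketch in
Python.  It assumes a hand-written label table rather than a real SKOS
file; the choice of 'Supernovas' as the prefLabel, the case-insensitive
matching, and the names LABELS, build_index and concepts_for are all
illustrative assumptions, not part of any vocabulary or library.

```python
# Hypothetical label table: every label, preferred or alternative,
# points at the single concept it should be filed under.
LABELS = {
    "#Supernovas": {
        "pref": "Supernovas",  # assumed prefLabel
        "alt": ["SN", "SNe", "Supernova", "Supernovae"],
    },
}


def build_index(labels):
    """Map each case-folded label to the set of concept URIs using it."""
    index = {}
    for uri, entry in labels.items():
        for label in [entry["pref"], *entry["alt"]]:
            index.setdefault(label.casefold(), set()).add(uri)
    return index


def concepts_for(token, index):
    """Return the concept URIs a user-supplied string should lead to."""
    return sorted(index.get(token.strip().casefold(), set()))


INDEX = build_index(LABELS)

# Singular or plural, abbreviation or not: same concept.
print(concepts_for("SNe", INDEX))
print(concepts_for("supernova", INDEX))
```

The index maps a label to a set of URIs rather than to one URI, since
the same label may be attached to more than one concept.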

The way we do this will (bonus!) also allow limited, but useful,
reasoning.  For example, an application searching for supernovae can
work out that type 1 supernovae should be looked at as well, because
that's a narrower term.  That is, what this putative VOEvent filter is
doing is a type of searching/retrieval/filing, broadly considered.
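As a rough illustration of that kind of reasoning, here is a sketch of
how a filter might expand a search to narrower terms.  The URIs
<#Type1Supernovas> and <#Type1aSupernovas> are made up for the example,
as are the function names, and following the narrower links
transitively is a choice made in this sketch rather than something the
vocabulary dictates.

```python
from collections import deque

# Hypothetical broader-than links: child concept -> broader concepts.
BROADER = {
    "#Type1Supernovas": ["#Supernovas"],
    "#Type1aSupernovas": ["#Type1Supernovas"],
}


def narrower_closure(concept, broader):
    """Return the concept together with everything narrower than it."""
    # Invert the broader links so we can walk downwards.
    narrower = {}
    for child, parents in broader.items():
        for parent in parents:
            narrower.setdefault(parent, []).append(child)

    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in narrower.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def matches(resource_concept, wanted, broader):
    """True if a resource filed under resource_concept satisfies a
    search for wanted, i.e. it is wanted itself or something narrower."""
    return resource_concept in narrower_closure(wanted, broader)


print(matches("#Type1Supernovas", "#Supernovas", BROADER))
print(matches("#Supernovas", "#Type1Supernovas", BROADER))
```

A search for the broad concept picks up resources filed under the
narrower ones, but not the other way round.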

>> Absolutely. There are no limits on the number of alternative  
>> labels. Of course, the applications that make use of the  
>> vocabularies will have to be wary that the same label can be used  
>> for different concepts and get the user to clarify which of the  
>> meanings they intended. This is why it is so important to have  
>> definitions for the terms as the application would only be able to  
>> display the labels back to the user if the definitions did not exist.
>
> Isn't displaying the labels precisely the point, though?  A user (or  
> other source) provides a token.  That token is not part of the  
> controlled vocabulary, but rather is a label.  The label is as  
> likely to be an altLabel as a prefLabel.

Indeed.  The function of the _single_ prefLabel is that it is the term  
which is displayed to a user, all other things being equal.  altLabels  
are other strings which might occur to a user, which allow them to be  
led to the single underlying concept.

> Thus there is a dialogue with the user, parroting not the altLabels  
> (and not the definitions), but rather the prefLabels for all  
> matching terms back for the user to select.  Perhaps the labels  
> won't be enough, in that case other information about the terms  
> would be available - but not just the definitions (short or long),  
> but the narrower and broader thans, etc.  I actually think the  
> latter will be of more utility than the former in further limiting  
> the search.  "Do you mean the "Milky Way" that is narrower than  
> "candy bar", or rather the "Milky Way" that is narrower than "spiral  
> galaxy" that is narrower than "galaxy"?

Precisely.  Even the particularly problematic term 'coma' can be  
handled this way.  The IAU-93 SKOS looks like:

<#Coma> a skos:Concept;
     skos:prefLabel "COMA"@en;
     skos:broader <#Aberrations>;
     skos:related <#Astigmatism>, <#ChromaticAberration>, [...].

<#Comas> a skos:Concept;
     skos:prefLabel "COMAS"@en;
     skos:related <#Comets>, <#Galaxies>.

(omitting the multilingual labels).  Those labels aren't very helpful,  
but a useful interface, given the search string 'coma', would show  
both these terms, plus the broader and related terms, which would make  
it instantly clear to the user which Concept they were actually  
searching for.
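A sketch of what such an interface might do with the two Concepts
above, again in Python.  The data is copied from the snippet (leaving
out whatever the '[...]' elides); the prefix matching used to make
'coma' find both 'COMA' and 'COMAS' is my assumption, as are the names
CONCEPTS, candidates and describe.

```python
# The two Concepts from the snippet above, as plain dictionaries.
# Related terms elided in the original are not filled in here.
CONCEPTS = {
    "#Coma": {
        "prefLabel": "COMA",
        "broader": ["#Aberrations"],
        "related": ["#Astigmatism", "#ChromaticAberration"],
    },
    "#Comas": {
        "prefLabel": "COMAS",
        "broader": [],
        "related": ["#Comets", "#Galaxies"],
    },
}


def candidates(token, concepts):
    """Concepts whose prefLabel starts with the token (assumed rule)."""
    needle = token.strip().casefold()
    return sorted(
        uri
        for uri, concept in concepts.items()
        if concept["prefLabel"].casefold().startswith(needle)
    )


def describe(uri, concepts):
    """One line per candidate: label plus its broader/related context."""
    concept = concepts[uri]
    parts = [f"{concept['prefLabel']} <{uri}>"]
    if concept["broader"]:
        parts.append("narrower than " + ", ".join(concept["broader"]))
    if concept["related"]:
        parts.append("related to " + ", ".join(concept["related"]))
    return "; ".join(parts)


# Show the user every candidate with enough context to choose.
for uri in candidates("coma", CONCEPTS):
    print(describe(uri, CONCEPTS))
```

The labels alone ('COMA' and 'COMAS') tell the user very little; it is
the broader and related terms printed alongside them that separate the
optical aberration from the thing comets and galaxies have.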

All the best,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
eurovotech.org  :  University of Leicester


