Taxonomy issues

Carole Goble carole at cs.man.ac.uk
Sat Sep 28 07:42:16 PDT 2002



Anita

> >
> > A great advantage of an emergent behaviour system is that the computers do
> > the work, and there is no need to hire a librarian who is also an expert in
> > astronomy.
>

In fact, in my opinion that isn't true at all. Emergent behaviours need just as much
effort. It also depends on what kind of decision making you are doing on the
results. If it's for some search tool used by a person, you may not care if the result
is skewed or odd; if it's the basis of an automated decision procedure for composing
some web services or steering a simulation, you probably do care.

>
> True up to a point, but the emerging system will need an overhaul every so
> often.  The CDS allocate UCDs automatically to new catalogues, which
> works very well but occasionally something inappropriate slips in and then
> propagates, so a periodic review by someone who is indeed both a librarian
> and has a wide knowledge of astronomy is going to be required.
>

Emergent behaviour needs reasoning, perhaps more than a priori schemes do, in order to
identify inconsistencies so that you can do something about them. Inconsistencies may
well end up being represented -- which leads to diverging ontologies, with some parts in
common and some alternatives, and that in turn means handling different ontologies with
varying degrees of ontological commitment. The point here is that being ignorant
or oblivious of inconsistencies is undesirable.

The idea that there are two different worlds -- emergent and a priori -- is naive
and false. The semantic web effort is based on merging the two approaches,
hence all the effort on ontology learning.

Of course in biology there are different classifications and definitions -- there
is no standard definition of what a gene is, for example. DAML+OIL allows
multiple definitions, and will report definitions that are inconsistent. What
you do then is an interesting story, and it depends on what you want to use the
ontology/taxonomy for.

My colleague Sean Bechhofer offers these insights below.   A key point he makes
is: are you talking about individuals or concepts? Some of the examples seem to
confuse the two. Ontologies are about concepts. Knowledge bases are about
individuals. They are different spaces.

Cheers


Carole


Sean writes:

Quick answer:

It depends.

Slightly longer answer:

It depends on what you mean by "classifying objects".

Even longer answer:

[Apologies for introducing some technical stuff here, but I think you
have to understand exactly what classification in OWL means in order
to explain this -- I will use "OWL" rather than DAML+OIL as it's
likely to become the standard and it's easier to type :-). Also
apologies if at any point I'm entering into Grandma/egg sucking
territory.]

OWL allows us to define classes of objects and possibly apply
reasoning techniques to discover relationships between those classes
of objects. An interpretation of an OWL ontology maps the classes in
the ontology to collections of objects in the domain (the instances of
the class). We can then do reasoning over the ontology because we have
a precise and unambiguous interpretation of what it *means* when we
form a composite class description using the constructors in the
language.

So, for example, if I talk about the class (Cat or Dog) I know
*exactly* what the interpretation of this is -- it is all the
instances in the domain that are either instances of the class Cat or
the class Dog. The reasoning that we perform about class subsumptions
is then based on those interpretations. Essentially, if we can show
that in *any* interpretation of the ontology, it must be the case that
any instance of B must be an instance of A, then we can deduce that B
is a subclass of A. This, in general, is what we (i.e. myself and my
colleagues) tend to mean when we talk about classification -- it's the
organisation of *classes* into a hierarchy using class subsumption.
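
To make that picture concrete, here is a rough Python sketch -- not
OWL syntax and not any reasoner's API, just a toy model in which an
interpretation is a mapping from class names to sets of individuals,
and all the names are made up for illustration:

  # A toy model of the set-theoretic semantics: an interpretation
  # maps each class name to the set of individuals in the domain
  # that are instances of that class.
  interpretation = {
      "Cat": {"felix", "tom"},
      "Dog": {"rex"},
      "Pet": {"felix", "tom", "rex", "polly"},
  }

  def interp_union(interp, *class_names):
      """The interpretation of (A or B): the union of the instance sets."""
      result = set()
      for name in class_names:
          result |= interp[name]
      return result

  def subsumed_in(interp, sub_instances, super_instances):
      """In this single interpretation, B sits below A if every
      instance of B is also an instance of A (set inclusion)."""
      return sub_instances <= super_instances

  cat_or_dog = interp_union(interpretation, "Cat", "Dog")
  print(subsumed_in(interpretation, cat_or_dog, interpretation["Pet"]))  # True

A real reasoner does something stronger, of course: it shows that the
inclusion holds in *every* interpretation the ontology allows, not
just in one handy data set.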

Once we have defined an OWL ontology, the relationships between the
classes in the ontology are then not "up for discussion" -- the
semantics tells us precisely when classes will subsume, and when they
will not. (This all assumes, of course, that you buy into the
underlying semantics of OWL. You can choose not to, and instead
interpret the classes in whatever way you like, but if you do that
then this discussion is not for you anyway.) The ontology represents
some kind of indefeasible state of the world. If I want to use your
ontology, then I have to make a commitment to believing the way that
you have structured the world -- that's what using ontologies does for
us. It provides the shared conceptualisation. With a language like
OWL, not only do we get shared *vocabulary*, but we also get explicit
descriptions of what we mean. If I say in my ontology that (I use a
non-astronomical example here for two reasons: a) it helps to avoid
technical nit-picking; and more importantly b) I know nothing about
astronomy :-)

  CatLover == Person with more than 3 Cats

then I'm being explicit here about what I mean. You may disagree and
say "oh no, a cat lover doesn't have to *have* lots of cats, they just
have to love them". However, at this point, we can now discuss the
definition and possibly refine it. The key point is that rather than
encoding the characteristics of the class in the name or some
documentation, as happens in traditional controlled vocabularies, we
are providing a "machine interpretable" description, which will allow
us to make inferences about the classes.
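
To see why a machine-interpretable definition is different from a bare
term, here is a small Python sketch (hypothetical names, nothing to do
with OWL syntax): the definition becomes something you can actually
evaluate, rather than a label whose meaning lives only in documentation.

  from dataclasses import dataclass, field

  @dataclass
  class Person:
      name: str
      cats: list = field(default_factory=list)

  # CatLover == Person with more than 3 Cats, written as a predicate
  # a machine can evaluate.
  def is_cat_lover(person: Person) -> bool:
      return len(person.cats) > 3

  alice = Person("Alice", cats=["Felix", "Tom", "Mog", "Bagpuss"])
  bob = Person("Bob", cats=[])

  print(is_cat_lover(alice))  # True: more than 3 cats
  print(is_cat_lover(bob))    # False, whatever he may feel about cats

And because the definition is explicit, anyone who disagrees can point
at the "> 3" and argue about it, which is exactly the refinement step
described above.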

So, onto the subject of contradictions and inconsistency. I say that
all Widgets are blue, while you think that Widgets are red. So in our
ontology, I say:

  Widget -> colour blue

and you say

  Widget -> colour red

If we also have some constraints that say that blue and red are
mutually exclusive, then we have arrived at a contradiction. It is
simply *not possible* for anything to meet the criteria that we have
said must hold for something to be an instance of the class Widget, so
the class is inconsistent. The reasoner will be able to tell us
this. If, after "full and frank discussions", we still maintain these
views, then it's simply the case that we do *not* share a common
conceptualisation of our domain. What I mean by Widget is different to
what you mean by Widget. At this point we either go our separate ways,
or alternatively say nothing in the ontology about the colour of
Widgets, which, although less detailed, now reflects our *shared*
understanding.
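
A toy way to picture what the reasoner is doing here -- this is only
an illustrative Python sketch, not how any description logic reasoner
is actually implemented -- is to collect the constraints asserted on
the class and ask whether any individual could satisfy them all at once:

  # Constraints asserted on the class Widget by the two parties.
  widget_constraints = [("colour", "blue"), ("colour", "red")]

  # Background knowledge: blue and red are mutually exclusive.
  disjoint_values = {("blue", "red"), ("red", "blue")}

  def is_satisfiable(constraints, disjoint):
      """Unsatisfiable if the class demands two mutually exclusive
      values for the same property."""
      for prop_a, val_a in constraints:
          for prop_b, val_b in constraints:
              if prop_a == prop_b and (val_a, val_b) in disjoint:
                  return False
      return True

  print(is_satisfiable(widget_constraints, disjoint_values))  # False: Widget is inconsistent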

Alternatively, we might have different views on the same class. A
classic example here is that of the Triangle. I might define a
Triangle as being a polygon with three sides. You say it's a polygon
with three angles. These are, in fact, perfectly acceptable (and
mutually consistent) alternative definitions for the class of
Triangles. OWL allows us to supply such alternative definitions, and
the reasoner will tell us when these definitions are consistent and
when they are not.
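
In the same toy style (illustrative only), the two Triangle definitions
are just different descriptions that happen to pick out the same things:

  # Two alternative definitions of Triangle, as predicates over a toy
  # polygon record with 'sides' and 'angles' counts.
  polygons = [
      {"name": "t1", "sides": 3, "angles": 3},
      {"name": "sq", "sides": 4, "angles": 4},
  ]

  def triangle_by_sides(p):
      return p["sides"] == 3

  def triangle_by_angles(p):
      return p["angles"] == 3

  print([p["name"] for p in polygons if triangle_by_sides(p)])   # ['t1']
  print([p["name"] for p in polygons if triangle_by_angles(p)])  # ['t1']

The reasoner, again, does something much stronger than agreeing on one
data set: it establishes whether the definitions coincide (or clash) in
every possible interpretation.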

One solution to this is to use the classification as a "conceptual
coat rack" and hang "extralogical" information off it. This is the
kind of thing that the RuleML initiative is aimed at. This then allows
us to state things about OWL classes that are not necessarily part of
the ontological or intensional definition of the class, but which we
want to make available to applications.

If when you talk about "classifying objects", you actually mean
describing individuals, then the situation is slightly different. The
usual picture (from the Description Logic perspective, which has
strongly influenced the web ontology language work) is that we have an
ontology (or "T-Box" or "schema") that describes the classes of the
domain (as above). We then have a collection of instances (or "A-box")
along with information describing the properties that particular
individuals have. There is a slight blurring of these worlds due to
some of the constructors in OWL, but it's worth thinking about it this
way.

Now once we have our schema or ontology fixed, we can make assertions
about objects in the world. You might think that star Beta-X-17 is a
Red Dwarf. I might say it's a Green Fairy (is my lack of astronomical
knowledge showing??? :-). Let us assume that these two assertions are
contradictory. However, this is now a disagreement about the properties
of a particular *individual*, rather than a disagreement about the
properties of the classes in the ontology.
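
One last toy sketch (hypothetical names throughout, and assuming the
schema declares RedDwarf and GreenFairy to be disjoint): the clash
shows up among the assertions about an individual, while the schema we
both signed up to is untouched.

  # T-Box / schema: class-level knowledge we agree on.
  disjoint_classes = {frozenset({"RedDwarf", "GreenFairy"})}

  # A-Box: assertions about individuals, contributed by different people.
  assertions = {
      "Beta-X-17": {"RedDwarf", "GreenFairy"},  # your claim and mine
  }

  def inconsistent_individuals(assertions, disjoint):
      """Individuals asserted to belong to two disjoint classes."""
      bad = []
      for individual, classes in assertions.items():
          for pair in disjoint:
              if pair <= classes:
                  bad.append(individual)
      return bad

  print(inconsistent_individuals(assertions, disjoint_classes))  # ['Beta-X-17']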

An analogy is that of database and schema. We may agree on the schema
of our database (people have names, addresses and so on), but disagree
on who lives at No.14 Acacia Avenue. This is a discrepancy at the
data level (one that we may want to be informed about), but it does not
change the fact that we agree about the general way the world fits
together.

So in this case, resolving the discrepancy concerning our different
descriptions of Beta-X-17 is not an *ontology* issue, rather an issue
about the *use* of that ontology.

This problem is not exclusive to OWL -- we could easily have
contradictory descriptions of Beta-X-17 using some controlled
vocabulary or terminology. I guess the interesting thing here though
is that the richer language allows us to spot that there really *is* a
contradiction between our opinions. Indeed, allowing contradictory
descriptions of individuals is key to supporting the Semantic Web --
it certainly will not be the case that all the information out there
will agree!

Cheers,

        Sean
--
Sean Bechhofer
seanb at cs.man.ac.uk
http://www.cs.man.ac.uk/~seanb




