Taxonomy issues

Sat Sep 28 08:05:40 PDT 2002

Anita, Carole and all

The medical world draws a sharp distinction between "classifications" and "conceptual
models".  There are many reasons for creating classifications, e.g. epidemiological
studies, payment for procedures, bibliographic retrieval, etc.  The classification
criteria may be logical or pragmatic, e.g. many epidemiological classifications have
more to do with the history of what diseases were important in 1900 than they
do with any logical structure of our understanding today -- but that is still how
they must be reported to the World Health Organisation for international statistics.

"Logic based ontologies" (just "ontologies" for short) on the other hand are, or
should be, purely logical.  The hierarchy is based on pure subsumption, i.e.
necessary implication.   If well built, they will be likely to exhibit certain sorts
of emergent behaviour - i.e. you can't predict all the logical consequences from any
rich set of relationships. However, they will be lawful (indeed logical), and changes
will have predictable consequences - this is very important in the issue of revisions
which you allude to.

On the other hand, the source of the information in the logical structure may require
domain experts - e.g. astronomers - or may be induced (we hope) from various corpora.

There is a trade-off between logical rigour and match to pragmatic behaviour.
How you manage that trade off depends on the applications.  For clinical
decision making a high degree of rigour is required.  For many information
retrieval tasks, pragmatic behaviour is more important.

As systems scale up, the trade-offs become more problematic because
humans' memory capacity is exceeded so we can't remember the
idiosyncrasies.  On the one hand, non-logical behaviour becomes awkward
because it gives rise to increasing numbers of exceptions to be dealt with by
any software operation; on the other, our ability to construct fully
logical structures without relying on emergent behaviour decreases.

Our belief is that at least one important task is to provide tools that make
it easier to build logical ontologies or, if you prefer, logical views of
conceptual representations - which may have many and diverse origins,
emergent or a priori.

Regards

Alan

--
Alan L Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL: +44-161-275-6188/6239/7183
FAX: +44-161-275-6204
email: rector at cs.man.ac.uk
web: www.cs.man.ac.uk/mig
        www.opengalen.org

Carole Goble wrote:

> Anita
>
> > >
> > > A great advantage of an emergent behaviour system is that the computers do
> > > the work, and there is no need to hire a librarian who is also an expert in
> > > astronomy.
> >
>
> In fact it isn't true at all in my opinion. Emergent behaviours need just as much
> effort. It also depends what kind of decision making you are doing on the
> results. If it's for some search tool used by a person you may not care if it is
> screwed or odd; if it's the basis of an automated decision procedure for composing
> some web services or steering a simulation you probably do care.
>
> >
> > True up to a point, but the emerging system will need an overhaul every so
> > often.  The CDS allocate UCDs automatically to new catalogues, which
> > works very well but occasionally something inappropriate slips in and then
> > propagates, so a periodic review by someone who is indeed both a librarian
> > and has a wide knowledge of astronomy is going to be required.
> >
>
> Emergent behaviour needs reasoning perhaps more than a priori schemes in order to
> identify inconsistencies so you can do something about them. Inconsistencies may
> well be represented -- which leads to diverging ontologies with some common and
> some alternatives, which means effectively handling different ontologies with
> varying degrees of ontological commitment. The point here is that to be ignorant
> or oblivious of inconsistencies is undesirable.
>
> The idea that there are two different worlds - emergent and a priori -- is naive
> and false.  The semantic web effort is based on the merging of both approaches,
> hence all the effort in ontology learning.
>
> Of course in biology there are different classifications and definitions -- there
> is no standard definition for what a gene is for example. DAML+OIL allows
> multiple definitions, and will report on definitions that are inconsistent. What
> you do then is an interesting story. And it depends on what you want to use the
> ontology / taxonomy for.
>
> My colleague Sean Bechhofer offers these insights below.   A key point he makes
> is: are you talking about individuals or concepts? Some of the examples seem to
> confuse the two. Ontologies are about concepts. Knowledge bases are about
> individuals. They are different spaces.
>
> Cheers
>
> Carole
>
> Sean writes:
>
> Quick answer:
>
> It depends.
>
> Slightly longer answer:
>
> It depends on what you mean by "classifying objects".
>
> Even longer answer:
>
> [Apologies for introducing some technical stuff here, but I think you
> have to understand exactly what classification in OWL means in order
> to explain this -- I will use "OWL" rather than DAML+OIL as it's
> likely to become the standard and it's easier to type :-). Also
> apologies if at any point I'm entering into Grandma/egg sucking
> territory.]
>
> OWL allows us to define classes of objects and possibly apply
> reasoning techniques to discover relationships between those classes
> of objects. An interpretation of an OWL ontology maps the classes in
> the ontology to collections of objects in the domain (the instances of
> the class). We can then do reasoning over the ontology because we have
> a precise and unambiguous interpretation of what it *means* when we
> form a composite class description using the constructors in the
> language.
>
> So, for example, if I talk about the class (Cat or Dog) I know
> *exactly* what the interpretation of this is -- it is all the
> instances in the domain that are either instances of the class Cat or
> the class Dog. The reasoning that we perform about class subsumptions
> is then based on those interpretations. Essentially, if we can show
> that in *any* interpretation of the ontology, it must be the case that
> any instance of B must be an instance of A, then we can deduce that B
> is a subclass of A. This, in general, is what we (i.e. myself and my
> colleagues) tend to mean when we talk about classification -- it's the
> organisation of *classes* into a hierarchy using class subsumption.
>
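
The subsumption test described above can be illustrated with a small Python
sketch. It is only an approximation under stated assumptions: it enumerates
every interpretation over a tiny fixed domain, whereas a real OWL reasoner
establishes the result for all interpretations. The helper names (`powerset`,
`subsumes`) and the three-element domain are hypothetical and are not part of
any OWL tool.

```python
from itertools import combinations, product


def powerset(domain):
    """Return every subset of `domain` as a frozenset."""
    items = list(domain)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def subsumes(sup, sub, atomic_names, domain):
    """True if, in every interpretation over `domain`, each instance
    of `sub` is also an instance of `sup`."""
    subsets = powerset(domain)
    # An interpretation assigns each atomic class name a set of objects.
    for extensions in product(subsets, repeat=len(atomic_names)):
        interp = dict(zip(atomic_names, extensions))
        if not sub(interp) <= sup(interp):
            return False  # found a counter-example interpretation
    return True


# Class descriptions are functions from an interpretation to a set.
cat = lambda i: i["Cat"]
cat_or_dog = lambda i: i["Cat"] | i["Dog"]  # the class (Cat or Dog)

DOMAIN = {"a", "b", "c"}   # hypothetical toy domain
NAMES = ["Cat", "Dog"]

cat_below_union = subsumes(cat_or_dog, cat, NAMES, DOMAIN)
union_below_cat = subsumes(cat, cat_or_dog, NAMES, DOMAIN)
print(cat_below_union, union_below_cat)  # True False
```

Cat is a subclass of (Cat or Dog) because no interpretation can break the
containment; the reverse fails as soon as some interpretation contains a Dog
that is not a Cat.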
> Once we have defined an OWL ontology, the relationships between the
> classes in the ontology are then not "up for discussion" -- the
> semantics tells us precisely when classes will subsume, and when they
> will not. (This all assumes, of course, that you buy into the
> underlying semantics of OWL. You can choose not to, of course, and then
> interpret the classes in whatever way you choose, but if you do that
> then this discussion is not for you anyway.) The ontology represents
> some kind of indefeasible state of the world. If I want to use your
> ontology, then I have to make a commitment to believing the way that
> you have structured the world -- that's what using ontologies does for
> us. It provides the shared conceptualisation. With a language like
> OWL, not only do we get shared *vocabulary*, but we also get explicit
> descriptions of what we mean. If I say in my ontology that (I use a
> non-astronomical example here for two reasons: a) it helps to avoid
> technical nit-picking; and more importantly b) I know nothing about
> astronomy :-)
>
>   CatLover == Person with more than 3 Cats
>
> then I'm being explicit here about what I mean. You may disagree and
> say "oh no, a cat lover doesn't have to *have* lots of cats, they just
> have to love them". However, at this point, we can now discuss the
> definition and possibly refine it. The key point is that rather than
> encoding the characteristics of the class in the name or some
> documentation as happens in traditional controlled vocabularies, we
> are providing a "machine interpretable" description, which will allow
> us to make inferences about the classes.
>
> So, onto the subject of contradictions and inconsistency. I say that
> all Widgets are blue, while you think that Widgets are red. So in our
> ontology, I say:
>
>   Widget -> colour blue
>
> and you say
>
>   Widget -> colour red
>
> If we also have some constraints that say that blue and red are
> mutually exclusive, then we have arrived at a contradiction. It is
> simply *not possible* for anything to meet the criteria that we have
> said must hold for something to be an instance of the class Widget, so
> the class is inconsistent. The reasoner will be able to tell us
> this. If, after "full and frank discussions", we still maintain these
> views, then it's simply the case that we do *not* share a common
> conceptualisation of our domain. What I mean by Widget is different to
> what you mean by Widget. At this point we either go our separate ways,
> or alternatively say nothing in the ontology about the colour of
> Widgets, which although now less detailed, now reflects our *shared*
> understanding.
>
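
The Widget contradiction can be sketched in a few lines of Python. This is a
toy model, not how an OWL reasoner works: it assumes every object has exactly
one colour (so all colours are mutually exclusive), and the names `COLOURS`
and `satisfiable` are hypothetical.

```python
# Assumption: each object has exactly one colour, so colours are disjoint.
COLOURS = {"blue", "red", "green"}


def satisfiable(required_colours):
    """A class is satisfiable if at least one colour meets every
    colour requirement placed on its instances."""
    return any(all(colour == required for required in required_colours)
               for colour in COLOURS)


mine = {"blue"}          # Widget -> colour blue
yours = {"red"}          # Widget -> colour red
merged = mine | yours    # both constraints in one ontology

print(satisfiable(mine))    # True
print(satisfiable(merged))  # False: nothing can be both blue and red
print(satisfiable(set()))   # True: say nothing about colour
```

The last line corresponds to the compromise of dropping the colour constraint
altogether: the class is less detailed but satisfiable again.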
> Alternatively, we might have different views on the same class. A
> classic example here is that of the Triangle. I might define a
> Triangle as being a polygon with three sides. You say it's a polygon
> with three angles. These are, in fact perfectly acceptable (and
> mutually consistent) alternative definitions for the class of
> Triangles. OWL allows us to supply such alternative definitions, and
> the reasoner will tell us when these definitions are consistent and
> when they are not.
>
> One solution to this is to use the classification as a "conceptual
> coat rack" and hang "extralogical" information off it. This is the
> kind of thing that the RuleML initiative is aimed at. This then allows
> us to state things about OWL classes that are not necessarily part of
> the ontological or intensional definition of the class, but which we
> want to make available to applications.
>
> If when you talk about "classifying objects", you actually mean
> describing individuals, then the situation is slightly different. The
> usual picture (from the Description Logic perspective, which has
> strongly influenced the web ontology language work) is that we have an
> ontology (or "T-Box" or "schema") that describes the classes of the
> domain (as above). We then have a collection of instances (or "A-box")
> along with information describing the properties that particular
> individuals have. There is a slight blurring of these worlds due to
> some of the constructors in OWL, but it's worth thinking about it this
> way.
>
> Now once we have our schema or ontology fixed, we can make assertions
> about objects in the world. You might think that star Beta-X-17 is a
> Red Dwarf. I might say it's a Green Fairy (is my lack of astronomical
> knowledge showing??? :-). Let us assume that these two assertions are
> contradictory. However this is now a disagreement about the properties
> of a particular *individual*, rather than a disagreement about the
> properties of the classes in the ontology.
>
> An analogy is that of database and schema. We may agree on the schema
> of our database (people have names, addresses and so on), but disagree
> on who lives at No.14 Acacia Avenue. This is a discrepancy at the
> data level (that we may want to be informed about), but does not
> change the fact that we agree about the general way the world fits
> together.
>
> So in this case, resolving the discrepancy concerning our different
> descriptions of Beta-X-17 is not an *ontology* issue, rather an issue
> about the *use* of that ontology.
>
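
The split between an agreed schema and disputed data can be sketched as
follows. The structures are illustrative assumptions: `DISJOINT` stands in for
the shared T-box, `ASSERTIONS` for the A-box, and the second star
("Alpha-Y-2") and the function `conflicts` are hypothetical additions.

```python
# T-box (agreed schema): these two classes can share no instances.
DISJOINT = {frozenset({"RedDwarf", "GreenFairy"})}

# A-box (data): (individual, asserted class, who asserted it).
ASSERTIONS = [
    ("Beta-X-17", "RedDwarf", "you"),
    ("Beta-X-17", "GreenFairy", "me"),
    ("Alpha-Y-2", "RedDwarf", "you"),   # hypothetical, uncontested
]


def conflicts(assertions, disjoint):
    """Return individuals asserted to belong to two disjoint classes."""
    classes_of = {}
    for individual, cls, _who in assertions:
        classes_of.setdefault(individual, set()).add(cls)
    found = []
    for individual, classes in classes_of.items():
        for pair in disjoint:
            if pair <= classes:  # both disjoint classes were asserted
                found.append((individual, tuple(sorted(pair))))
    return found


print(conflicts(ASSERTIONS, DISJOINT))
# [('Beta-X-17', ('GreenFairy', 'RedDwarf'))]
```

The schema is untouched by the disagreement; only the data about one
individual is flagged, which is the discrepancy we would want to be told about.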
> This problem is not exclusive to OWL -- we could easily have
> contradictory descriptions of Beta-X-17 using some controlled
> vocabulary or terminology. I guess the interesting thing here though
> is that the richer language allows us to spot that there really *is* a
> contradiction between our opinions. Indeed, allowing contradictory
> descriptions of individuals is key to supporting the Semantic Web --
> it certainly will not be the case that all the information out there
> will agree!
>
> Cheers,
>
>         Sean
> --
> Sean Bechhofer
> seanb at cs.man.ac.uk
> http://www.cs.man.ac.uk/~seanb
>
> >
> > best wishes
> >
> > Anita
> >
> > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> > Dr. Anita M. S. Richards, AVO Astronomer
> > MERLIN/VLBI National Facility, University of Manchester,
> > Jodrell Bank Observatory, Macclesfield, Cheshire SK11 9DL, U.K.
> > tel +44 (0)1477 572683 (direct); 571321 (switchboard); 571618 (fax).