Taxonomy issues

Alan Rector rector at cs.man.ac.uk
Sat Sep 28 08:26:27 PDT 2002


Addendum

Some groups in the medical area draw the following distinctions which may be
helpful

    "Terming" - finding the linguistic unit corresponding to a concept
    "Coding" - finding the concept in a logical sense
    "Classification" - classifying it according to some specific purpose, usually
            an externally imposed set of classes from WHO, a funding body, etc.
    "Grouping" - an even more arbitrary grouping together of classes and codes,
            usually tied to management, payment, or planning, which are typically
            linked by their relative cost/expenditure/resource implications rather than
            any affinity within the domain of discourse - e.g. uncomplicated
            appendicectomy and brief hospitalisation for pneumonia might find themselves
            in the same 'group' because they both involve roughly 3 days hospitalisation.
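
A minimal sketch of the "grouping" step, in Python. The concept codes, class
names and the third record are invented for illustration and do not come from
any real coding scheme. It only shows that a group can cut across the
classification when the grouping key is a resource measure such as length of
stay.

```python
from collections import defaultdict

# Hypothetical records: (term, concept code, classification class, days in hospital).
# Every code and class name here is an assumption made for the example.
RECORDS = [
    ("uncomplicated appendicectomy", "C-APPX-01", "Digestive system procedures", 3),
    ("brief hospitalisation for pneumonia", "C-PNEU-01", "Respiratory diseases", 3),
    ("hip replacement", "C-HIP-01", "Musculoskeletal procedures", 7),
]


def group_by_stay(records):
    """Group concept codes by length of stay, ignoring their classification."""
    groups = defaultdict(list)
    for _term, code, _cls, days in records:
        groups[days].append(code)
    return dict(groups)


groups = group_by_stay(RECORDS)
```

Here the two 3-day records fall into one group although they sit in
different classification classes.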

Alan


Alan Rector wrote:

> Anita, Carole and all
>
> The medical world draws a sharp distinction between "classifications" and "conceptual
> models".  There are many reasons for creating classifications, e.g. epidemiological
> studies, payment for procedures, bibliographic retrieval, etc.  The classification
> criteria may be logical or pragmatic, e.g. many epidemiological classifications have
> more to do with the history of what diseases were important in 1900 than they
> do with any logical structure of our understanding today -- but that is still how
> they must be reported to the World Health Organisation for international statistics.
>
> "Logic based ontologies" (just "ontologies" for short) on the other hand are, or
> should be, purely logical.  The hierarchy is based on pure subsumption, i.e.
> necessary implication.   If well built, they will be likely to exhibit certain sorts
> of emergent behaviour - i.e. you can't predict all the logical consequences from any
> rich set of relationships. However, they will be lawful (indeed logical), and changes
> will have predictable consequences - this is very important in the issue of revisions
> which you allude to.
>
> On the other hand, the source of the information in the logical structure may require
> domain experts - e.g. astronomers - or may be induced (we hope) from various corpora.
>
> There is a trade-off between logical rigour and match to pragmatic behaviour.
> How you manage that trade off depends on the applications.  For clinical
> decision making a high degree of rigour is required.  For many information
> retrieval tasks, pragmatic behaviour is more important.
>
> As systems scale up, the trade-offs become more problematic because
> humans' memory capacity is exceeded so we can't remember the
> idiosyncrasies.  On the one hand, non-logical behaviour becomes awkward
> because it gives rise to increasing numbers of exceptions to be dealt with by
> any software operation; on the other, our ability to construct fully
> logical structures without relying on emergent behaviour decreases.
>
> Our belief is that at least one important task is to provide tools that make
> easier the building of logical ontologies, or if you prefer of logical views
> of conceptual representations - which may have many and diverse origins,
> emergent or a priori.
>
> Regards
>
> Alan
>
> --
> Alan L Rector
> Professor of Medical Informatics
> Department of Computer Science
> University of Manchester
> Manchester M13 9PL, UK
> TEL: +44-161-275-6188/6239/7183
> FAX: +44-161-275-6204
> email: rector at cs.man.ac.uk
> web: www.cs.man.ac.uk/mig
>         www.opengalen.org
>
> Carole Goble wrote:
>
> > Anita
> >
> > > >
> > > > A great advantage of an emergent behaviour system is that the computers do
> > > > the work, and there is no need to hire a librarian who is also an expert in
> > > > astronomy.
> > >
> >
> > In fact it isn't true at all in my opinion. Emergent behaviours need just as much
> > effort. It also depends what kind of decision making you are doing on the
> > results. If it's for some search tool used by a person you may not care if it is
> > screwed or odd; if it's the basis of an automated decision procedure for composing
> > some web services or steering a simulation you probably do care.
> >
> > >
> > > True up to a point, but the emerging system will need an overhaul every so
> > > often.  The CDS allocate UCDs automatically to new catalogues, which
> > > works very well but occasionally something inappropriate slips in and then
> > > propagates, so a periodic review by someone who is indeed both a librarian
> > > and has a wide knowledge of astronomy is going to be required.
> > >
> >
> > Emergent behaviour needs reasoning perhaps more than a priori schemes in order to
> > identify inconsistencies so you can do something about them. Inconsistencies may
> > well be represented -- which leads to diverging ontologies with some common and
> > some alternatives, which means effectively handling different ontologies with
> > varying degrees of ontological commitment. The point here is that to be ignorant
> > or oblivious of inconsistencies is undesirable.
> >
> > The idea that there are two different worlds -- emergent and a priori -- is naive
> > and false.  The semantic web effort is based on the merging of both approaches,
> > hence all the effort in ontology learning.
> >
> > Of course in biology there are different classifications and definitions -- there
> > is no standard definition for what a gene is for example. DAML+OIL allows
> > multiple definitions, and will report on definitions that are inconsistent. What
> > you do then is an interesting story. And it depends on what you want to use the
> > ontology / taxonomy for.
> >
> > My colleague Sean Bechhofer offers these insights below.   A key point he makes
> > is: are you talking about individuals or concepts? Some of the examples seem to
> > confuse the two. Ontologies are about concepts. Knowledge bases are about
> > individuals. They are different spaces.
> >
> > Cheers
> >
> > Carole
> >
> > Sean writes:
> >
> > Quick answer:
> >
> > It depends.
> >
> > Slightly longer answer:
> >
> > It depends on what you mean by "classifying objects".
> >
> > Even longer answer:
> >
> > [Apologies for introducing some technical stuff here, but I think you
> > have to understand exactly what classification in OWL means in order
> > to explain this -- I will use "OWL" rather than DAML+OIL as it's
> > likely to become the standard and it's easier to type :-). Also
> > apologies if at any point I'm entering into Grandma/egg sucking
> > territory.]
> >
> > OWL allows us to define classes of objects and possibly apply
> > reasoning techniques to discover relationships between those classes
> > of objects. An interpretation of an OWL ontology maps the classes in
> > the ontology to collections of objects in the domain (the instances of
> > the class). We can then do reasoning over the ontology because we have
> > a precise and unambiguous interpretation of what it *means* when we
> > form a composite class description using the constructors in the
> > language.
> >
> > So, for example, if I talk about the class (Cat or Dog) I know
> > *exactly* what the interpretation of this is -- it is all the
> > instances in the domain that are either instances of the class Cat or
> > the class Dog. The reasoning that we perform about class subsumptions
> > is then based on those interpretations. Essentially, if we can show
> > that in *any* interpretation of the ontology, it must be the case that
> > any instance of B must be an instance of A, then we can deduce that B
> > is a subclass of A. This, in general, is what we (i.e. myself and my
> > colleagues) tend to mean when we talk about classification -- it's the
> > organisation of *classes* into a hierarchy using class subsumption.
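
A toy sketch of the subsumption test described above, in Python. It assumes a
two-object domain and simply enumerates every interpretation of Cat and Dog
over it. Real OWL reasoners do not work by enumeration, and the names
`DOMAIN`, `interpretations` and `subsumes` are made up for the example.

```python
from itertools import product

DOMAIN = ("x", "y")        # tiny hypothetical domain of objects
ATOMIC = ("Cat", "Dog")    # atomic class names from the example


def interpretations():
    """Yield every mapping from atomic class names to subsets of DOMAIN."""
    subsets = [
        frozenset(obj for obj, keep in zip(DOMAIN, mask) if keep)
        for mask in product((False, True), repeat=len(DOMAIN))
    ]
    for combo in product(subsets, repeat=len(ATOMIC)):
        yield dict(zip(ATOMIC, combo))


def cat(interp):
    return interp["Cat"]


def dog(interp):
    return interp["Dog"]


def cat_or_dog(interp):
    # (Cat or Dog): everything that is an instance of Cat or of Dog.
    return interp["Cat"] | interp["Dog"]


def subsumes(a, b):
    """True if in every interpretation each instance of b is an instance of a."""
    return all(b(i) <= a(i) for i in interpretations())
```

With these definitions `subsumes(cat_or_dog, cat)` holds, while
`subsumes(cat, cat_or_dog)` does not, because some interpretation contains a
Dog that is not a Cat.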
> >
> > Once we have defined an OWL ontology, the relationships between the
> > classes in the ontology are then not "up for discussion" -- the
> > semantics tells us precisely when classes will subsume, and when they
> > will not. (This all assumes, of course, that you buy into the
> > underlying semantics of OWL. You can choose not to, of course, and then
> > interpret the classes in whatever way you choose, but if you do that
> > then this discussion is not for you anyway.) The ontology represents
> > some kind of indefeasible state of the world. If I want to use your
> > ontology, then I have to make a commitment to believing the way that
> > you have structured the world -- that's what using ontologies does for
> > us. It provides the shared conceptualisation. With a language like
> > OWL, not only do we get shared *vocabulary*, but we also get explicit
> > descriptions of what we mean. If I say in my ontology that (I use a
> > non-astronomical example here for two reasons: a) it helps to avoid
> > technical nit-picking; and more importantly b) I know nothing about
> > astronomy :-)
> >
> >   CatLover == Person with more than 3 Cats
> >
> > then I'm being explicit here about what I mean. You may disagree and
> > say "oh no, a cat lover doesn't have to *have* lots of cats, they just
> > have to love them". However, at this point, we can now discuss the
> > definition and possibly refine it. The key point is that rather than
> > encoding the characteristics of the class in the name or some
> > documentation as happens in traditional controlled vocabularies, we
> > are providing a "machine interpretable" description, which will allow
> > us to make inferences about the classes.
> >
> > So, onto the subject of contradictions and inconsistency. I say that
> > all Widgets are blue, while you think that Widgets are red. So in our
> > ontology, I say:
> >
> >   Widget -> colour blue
> >
> > and you say
> >
> >   Widget -> colour red
> >
> > If we also have some constraints that say that blue and red are
> > mutually exclusive, then we have arrived at a contradiction. It is
> > simply *not possible* for anything to meet the criteria that we have
> > said must hold for something to be an instance of the class Widget, so
> > the class is inconsistent. The reasoner will be able to tell us
> > this. If, after "full and frank discussions", we still maintain these
> > views, then it's simply the case that we do *not* share a common
> > conceptualisation of our domain. What I mean by Widget is different to
> > what you mean by Widget. At this point we either go our separate ways,
> > or alternatively say nothing in the ontology about the colour of
> > Widgets, which although now less detailed, now reflects our *shared*
> > understanding.
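
The Widget case can be sketched in Python as a satisfiability check. This is
an illustration only: it assumes each Widget has exactly one colour (standing
in for "blue and red are mutually exclusive"), and the colour list and
function names are invented for the example.

```python
COLOURS = ("blue", "red", "green")   # hypothetical colour vocabulary


def my_view(colour):
    # Widget -> colour blue
    return colour == "blue"


def your_view(colour):
    # Widget -> colour red
    return colour == "red"


def satisfiable(constraints):
    """True if some single colour meets every constraint placed on Widget."""
    return any(all(c(colour) for c in constraints) for colour in COLOURS)


alone = satisfiable([my_view])               # only my definition
merged = satisfiable([my_view, your_view])   # both definitions together
silent = satisfiable([])                     # say nothing about colour
```

Each view on its own is satisfiable, the merged class is not, and dropping
the colour constraints restores a (less detailed) satisfiable class.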
> >
> > Alternatively, we might have different views on the same class. A
> > classic example here is that of the Triangle. I might define a
> > Triangle as being a polygon with three sides. You say it's a polygon
> > with three angles. These are, in fact, perfectly acceptable (and
> > mutually consistent) alternative definitions for the class of
> > Triangles. OWL allows us to supply such alternative definitions, and
> > the reasoner will tell us when these definitions are consistent and
> > when they are not.
> >
> > One solution to this is to use the classification as a "conceptual
> > coat rack" and hang "extralogical" information off it. This is the
> > kind of thing that the RuleML initiative is aimed at. This then allows
> > us to state things about OWL classes that are not necessarily part of
> > the ontological or intensional definition of the class, but which we
> > want to make available to applications.
> >
> > If when you talk about "classifying objects", you actually mean
> > describing individuals, then the situation is slightly different. The
> > usual picture (from the Description Logic perspective, which has
> > strongly influenced the web ontology language work) is that we have an
> > ontology (or "T-Box" or "schema") that describes the classes of the
> > domain (as above). We then have a collection of instances (or "A-box")
> > along with information describing the properties that particular
> > individuals have. There is a slight blurring of these worlds due to
> > some of the constructors in OWL, but it's worth thinking about it this
> > way.
> >
> > Now once we have our schema or ontology fixed, we can make assertions
> > about objects in the world. You might think that star Beta-X-17 is a
> > Red Dwarf. I might say it's a Green Fairy (is my lack of astronomical
> > knowledge showing??? :-). Let us assume that these two assertions are
> > contradictory. However this is now a disagreement about the properties
> > of a particular *individual*, rather than a disagreement about the
> > properties of the classes in the ontology.
> >
> > An analogy is that of database and schema. We may agree on the schema
> > of our database (people have names, addresses and so on), but disagree
> > on who lives at No.14 Acacia Avenue. This is a discrepancy at the
> > data level (that we may want to be informed about), but does not
> > change the fact that we agree about the general way the world fits
> > together.
> >
> > So in this case, resolving the discrepancy concerning our different
> > descriptions of Beta-X-17 is not an *ontology* issue, rather an issue
> > about the *use* of that ontology.
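
A small Python sketch of that split, under stated assumptions: the schema
("T-Box") is reduced to one disjointness statement between the two classes in
the example, and the assertions ("A-Box") are plain tuples. The second star,
the source labels and the function name are hypothetical.

```python
# T-Box: the agreed schema. The disjointness is the assumption that the two
# assertions about Beta-X-17 are contradictory.
DISJOINT = {frozenset({"RedDwarf", "GreenFairy"})}

# A-Box: (source, individual, class). "Beta-X-18" is invented for contrast.
ASSERTIONS = [
    ("you", "Beta-X-17", "RedDwarf"),
    ("me", "Beta-X-17", "GreenFairy"),
    ("me", "Beta-X-18", "RedDwarf"),
]


def conflicts(assertions, disjoint):
    """List individuals asserted to belong to two disjoint classes."""
    by_individual = {}
    for _source, individual, cls in assertions:
        by_individual.setdefault(individual, set()).add(cls)
    found = []
    for individual, classes in sorted(by_individual.items()):
        for pair in disjoint:
            if pair <= classes:
                found.append((individual, tuple(sorted(pair))))
    return found


report = conflicts(ASSERTIONS, DISJOINT)
```

The schema is untouched by the check. Only the data about one individual is
flagged, which is the discrepancy we may want to be informed about.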
> >
> > This problem is not exclusive to OWL -- we could easily have
> > contradictory descriptions of Beta-X-17 using some controlled
> > vocabulary or terminology. I guess the interesting thing here though
> > is that the richer language allows us to spot that there really *is* a
> > contradiction between our opinions. Indeed, allowing contradictory
> > descriptions of individuals is key to supporting the Semantic Web --
> > it certainly will not be the case that all the information out there
> > will agree!
> >
> > Cheers,
> >
> >         Sean
> > --
> > Sean Bechhofer
> > seanb at cs.man.ac.uk
> > http://www.cs.man.ac.uk/~seanb
> >
> > >
> > > best wishes
> > >
> > > Anita
> > >
> > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> > > Dr. Anita M. S. Richards, AVO Astronomer
> > > MERLIN/VLBI National Facility, University of Manchester,
> > > Jodrell Bank Observatory, Macclesfield, Cheshire SK11 9DL, U.K.
> > > tel +44 (0)1477 572683 (direct); 571321 (switchboard); 571618 (fax).




