Vocabulary: Ontology

Tue Sep 11 06:28:20 PDT 2007

Hello all

I've been lurking on this thread for a while, and would like to add my 
two cents.

Indeed the "O" word does not help, although it conveys very simple and 
quite general ideas which, IMO, can be expressed in a language anyone can 
understand (hopefully what I am about to write is more understandable 
than the excerpt picked up by Rob :-) ).

Data can most of the time (always?) be considered as descriptions of 
*things*, in a very broad sense: individual objects such as galaxies, 
planets, people, instruments, books, observatories, but also general 
concepts such as "infrared" or "gravitational collapse". Of course 
there are such things as "broad data", particularly in astronomy, such 
as those coming out of an instrument, but let's address here data 
interpreted as being attached to (attributes of) things. The ontological 
status of such things is of course a major point in science, and 
particularly in astronomy, but this is yet another story we can forget 
for a moment.

Now the things data are about, and their types/classes/categories, are 
not always explicit in the data. They might be made explicit in the data 
schema (more or less so).
Most of the time, in a given database, the same data schema is used for 
things of the same category/type/class, again whether or not this is 
made explicit. You have one schema for stars, one for galaxies, one for 
books, and so on.

What does the RDF stack bring to that? What the heck is it with 
ontologies? Roughly the following principles, and the tools supporting 
them.

Make explicit the types of things your data are about (called "classes" 
in OWL - big deal).
Make explicit the structure of data (elements of description) used for 
each type/class of thing (called "properties" in RDF).
Make explicit which properties express relations between things (this 
star belongs to that cluster/that type; this photograph is of that star, 
and was taken by that instrument ...), as opposed to properties whose 
values are simple data (mass, distance, luminosity, date of discovery, 
catalog number ...).
Use standard identifiers such as URIs for things, their types and 
properties, so that data coming from different data sources can be 
compared, merged, or generally queried or browsed as if they were a 
single database.
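
To make these four principles concrete, here is a minimal sketch in 
Python using plain tuples rather than any RDF library. The example.org 
namespace, the property names and the sample values are all made up for 
illustration; only the rdf:type URI is a real one.

```python
# Hypothetical namespace; every name under it is an illustrative assumption.
EX = "http://example.org/astro/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Literal(str):
    """Marks a value as simple data rather than the URI of a thing."""


# Source A: a star catalogue. Each triple is (subject, property, value).
source_a = {
    (EX + "star/S1", RDF_TYPE, EX + "Star"),                  # type of thing
    (EX + "star/S1", EX + "belongsTo", EX + "cluster/C1"),    # relation between things
    (EX + "star/S1", EX + "distanceParsec", Literal("7.7")),  # simple data
}

# Source B: an image archive that uses the same URI for the same star.
source_b = {
    (EX + "photo/P1", RDF_TYPE, EX + "Photograph"),
    (EX + "photo/P1", EX + "depicts", EX + "star/S1"),
    (EX + "photo/P1", EX + "takenBy", EX + "instrument/I1"),
}

# Because identifiers are shared, merging is a plain set union.
merged = source_a | source_b


def relations(triples):
    """Properties whose values are other things (URIs), excluding rdf:type."""
    return {p for (_, p, o) in triples
            if not isinstance(o, Literal) and p != RDF_TYPE}


def attributes(triples):
    """Properties whose values are simple data."""
    return {p for (_, p, o) in triples if isinstance(o, Literal)}


def about(triples, uri):
    """Every triple that mentions the thing, as subject or as value."""
    return {t for t in triples if t[0] == uri or t[2] == uri}
```

Calling about(merged, EX + "star/S1") returns the catalogue entries and 
the photograph that depicts the star, even though they came from two 
different sources. That only works because both sources agreed on the 
identifier.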

This can be achieved with just about any kind of internal data model 
for each data source, provided the data are transformed on demand (dump 
export, web services, whatever) into a common RDF syntax (which boils 
down, at the end of the day, to a triple store).
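
As a rough sketch of such an on-demand transformation, the Python below 
turns one row of a hypothetical internal table into N-Triples lines. The 
base URI, the column names and the row itself are assumptions made for 
the example, and every non-link value is written as a plain string 
literal; a real export would want proper datatypes.

```python
BASE = "http://example.org/astro/"  # hypothetical base URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Columns whose values identify other things, mapped to the URI prefix
# to use for them (an assumed schema, not a real one).
LINK_COLUMNS = {"cluster": BASE + "cluster/"}


def escape(text):
    """Escape backslash, double quote, newline and carriage return."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def row_to_ntriples(row, class_name, key="id"):
    """Yield one type triple, then one triple per non-key column."""
    subject = f"<{BASE}{class_name.lower()}/{row[key]}>"
    yield f"{subject} <{RDF_TYPE}> <{BASE}{class_name}> ."
    for column, value in row.items():
        if column == key:
            continue
        prop = f"<{BASE}{column}>"
        if column in LINK_COLUMNS:
            # Relation between things: the value becomes a URI.
            yield f"{subject} {prop} <{LINK_COLUMNS[column]}{value}> ."
        else:
            # Simple data: the value becomes a string literal.
            yield f'{subject} {prop} "{escape(str(value))}" .'


row = {"id": "S1", "cluster": "C1", "distance_pc": 7.7}
lines = list(row_to_ntriples(row, "Star"))
```

The internal model stays whatever it was (here a plain dict); only the 
export step needs to know which columns are links and which are values.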

As an illustration, I don't know if folks here are aware of the Linking 
Open Data project [1], and particularly of the DBpedia [2] database, 
which happens to contain quite a lot of descriptions extracted from 
Wikipedia pages - whose quality might vary, OK, but just try the following.
Go to http://dbpedia.openlinksw.com:8890/sparql
and copy the following SPARQL query into the query box:

PREFIX category: <http://dbpedia.org/resource/Category:>
PREFIX skos:     <http://www.w3.org/2004/02/skos/core#>
PREFIX prop:     <http://dbpedia.org/property/>

SELECT ?x ?p
WHERE {
  ?x skos:subject category:Delta_Scuti_variables .
  ?x skos:subject category:Bayer_objects .
  ?x prop:parallax ?p .
}

I am sure it does not need translation into natural language :-) .
The query results as of today are as follows:

x 	p
http://dbpedia.org/resource/Vega 	129.01
http://dbpedia.org/resource/Beta_Cassiopeiae 	69.5
http://dbpedia.org/resource/Delta_Capricorni 	84.58
http://dbpedia.org/resource/Denebola 	90.16
http://dbpedia.org/resource/Sigma_Octantis 	12.07

Put any of those URIs in your browser and see what you get.
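
For anyone who would still like the translation, here is a minimal 
Python sketch of what the three triple patterns ask for: every ?x that 
carries both categories and has a parallax value. The in-memory triples 
are a hand-made stand-in for DBpedia, not a dump of it. The Vega row 
reuses the value from the results above, and the second resource is 
invented to show a non-match.

```python
CATEGORY = "http://dbpedia.org/resource/Category:"
SKOS_SUBJECT = "http://www.w3.org/2004/02/skos/core#subject"
PARALLAX = "http://dbpedia.org/property/parallax"
RES = "http://dbpedia.org/resource/"

# Hand-made sample data (an assumption, not actual DBpedia content).
triples = [
    (RES + "Vega", SKOS_SUBJECT, CATEGORY + "Delta_Scuti_variables"),
    (RES + "Vega", SKOS_SUBJECT, CATEGORY + "Bayer_objects"),
    (RES + "Vega", PARALLAX, "129.01"),
    # Hypothetical resource lacking the second category: must not match.
    (RES + "Example_Star", SKOS_SUBJECT, CATEGORY + "Delta_Scuti_variables"),
    (RES + "Example_Star", PARALLAX, "1.0"),
]


def subjects_with(triples, prop, value):
    """All ?x such that the triple (?x, prop, value) is present."""
    return {s for (s, p, o) in triples if p == prop and o == value}


def run_query(triples):
    """Join the three patterns on ?x and return sorted (?x, ?p) rows."""
    in_both = (
        subjects_with(triples, SKOS_SUBJECT, CATEGORY + "Delta_Scuti_variables")
        & subjects_with(triples, SKOS_SUBJECT, CATEGORY + "Bayer_objects")
    )
    return sorted((s, o) for (s, p, o) in triples
                  if p == PARALLAX and s in in_both)


rows = run_query(triples)
```

Sharing the variable ?x across the three patterns is what makes this a 
join; a SPARQL endpoint does the same matching, only against a far 
larger set of triples.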

The whole point now is: do you care for IVOA data to be merged, compared 
or otherwise linked with such public data? Or not?

Best

Bernard

[1] http://linkeddata.org
[2] http://dbpedia.org

Rob Seaman wrote:
> Rick wrote:
>
>> I think the main point is that it doesn't really matter what format 
>> we use, as long as 1) VOcabulary remains primarily a token list, 2) 
>> thus remains "easy" to process with "standard" tools, and 3) we all 
>> adopt it as the main (only?) standard in our daily VO-operations (the 
>> latter is the whole point of this frustrating exercise).  If someone 
>> needs a copy in OWL or Excel or CSV or cuneiform, then there will 
>> always be simple means for translating a token list, with or without 
>> some ontological baggage.
>
> The W3C tells us that "The RDF specifications 
> <http://www.w3.org/RDF/#specs> provide a lightweight ontology system 
> to support the exchange of knowledge on the Web."  Bafflement has been 
> expressed at why RDF - and presumably ontologies in general - have yet 
> to catch on in the VO.  I think it may be baggage like the following 
> that demands one's attention when googling around for ontological info:
>
>>> /*Objective Pretensions and Metaphysical Baggage:*/
>>> /*A Defense of Normative Descriptivism*/
>>> /Can we accommodate normative truth and fact sans ontological 
>>> baggage?  In this paper I explore whether expressivism or 
>>> constructivism can capture the objective pretensions of normative 
>>> reason claims in ethics and epistemology.  I argue that they 
>>> cannot.  Expressivists fails because reason claims are thought to be 
>>> assessable by stance-independent semantic standards.  Depending on 
>>> the version, constructivism fails because it either does not offer a 
>>> stance-independent semantic standard of assessment, or because it 
>>> cannot capture the normative authority of reason claims.  In the 
>>> end, we cannot accommodate objective pretensions without descriptive 
>>> semantics, and that brings with it ontological baggage./
>
> (From http://www.u.arizona.edu/~mbedke/Bedke_Online_Papers.htm.  As Mr. 
> Bedke helpfully explains,  "I am currently interested in normativity 
> that is reason-implicating, so I tend to look at normative reasons in 
> both epistemology and value theory.")
>
> I suspect that this actually means something - but what the heck does 
> it have to do with the VO?  Among other things, the VO is a really 
> interesting discussion between computer scientists and astrophysicists 
> (and those who seek to bridge the divide).  The biggest distinction 
> between the VO and similar efforts in bioinformatics, for instance, is 
> a budget that is a couple orders of magnitude smaller.  That smaller 
> budget translates to increased skepticism.  Why should we invest in A, 
> rather than B?  It isn't that we aren't willing - even eager - to be 
> convinced, rather, it's just that the case has yet to be made in the 
> right way to the right audience.  To date there is a significant 
> impedance mismatch between ontologies and astronomy.
>
> I might also say that astronomers themselves have a lot of experience 
> overcoming conceptual hurdles.  Eddington worked out that gravity 
> can't explain the energy source of stars.  Just when this is resolved 
> via fusion (and the holy grail of the alchemists, the transmutation of 
> the elements), someone wonders what happens when the fuel runs out.  
> The discovery of white dwarfs led to neutron stars led to black holes 
> - gravity finally wins after all - and Alice falls down the hole.  I'm 
> no Carl Sagan - but even without benefit of an ontology, I can string 
> three sentences together and give a capsule biography of the stars 
> from birth to death.
>
> It is ironic that computer scientists searching for a term indicating 
> a "data model <http://en.wikipedia.org/wiki/Data_model> that 
> represents a set of concepts within a domain 
> <http://en.wikipedia.org/wiki/Domain_of_discourse> and the 
> relationships between those concepts" chose a word heaped with 
> metaphysical pretensions.
>
> "Black hole".  People - laymen - don't even know what it is - but they 
> know what it is.  Why is it again that ontologies aren't taking the VO 
> by storm?
>
> - Rob

-- 

Bernard Vatant
Knowledge Engineering
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    http://www.mondeca.com
----------------------------------------------------
Tel:       +33 (0) 871 488 459
Mail:     bernard.vatant at mondeca.com
Blog:    Leçons de Choses <http://mondeca.wordpress.com/>