First-attempt Catalogue DM for discussion

Fri Sep 17 08:30:03 PDT 2004

Mark Taylor wrote:

>
>VOTable addresses the problem of a standard transport/exchange/storage
>format.  It does not address the problem of semantic interpretation
>of the data thus represented, although to the casual eye it might look
>like it does.  The introduction of the 'utype' attribute in VOTable 1.1
>makes this explicit (VOTable 1.1 recommendation sec 4.5).  In order to
>gain semantic information from a VOTable (e.g.: what class of physical
>object does row #i represent? what is its position on the sky?)
>you really need to associate elements of the VOTable with elements
>of a data model.  You can have a go at this kind of semantic
>interpretation by grubbing around with UCDs and column names, but it
>is not a rigorous or reliable way to go about things.
> 
>So VOTable does stand in need of a data model it can hook up to.
>I agree that attempting to answer this need is more a workmanlike task
>than a great voyage of discovery, but I don't think that makes it
>less worthwhile.
>  
>
Mark,
    I agree with you.  We need to discuss semantics as well.  We need to
decide if UCDs are sufficient, or UCDs plus utypes, or something else
altogether.  So let's start.
    What do we need to know semantically about the items within a
field/column?  What they are, to the highest degree of specificity: e.g.,
variable giant stars in binary systems.  What their relationship is to
ID_MAIN: e.g., hosts to the planets in ID_MAIN.  What their relationship
is to other objects in the catalog: e.g., components of Deneb and observed
in image 12.  Is there anything else that is needed?

So a simple table of planets in orbit about components of a binary star
would normally look like this:
ID_MAIN       component         e     <v>  
P1               A             0.1    112
P2               B             0.3     22
P3               A,B          undef     6

How would we put this into Catalog in such a way that it is  machine 
understandable?
With just a UCD-type mechanism (without actually looking it up in our
present UCD system, which?) we would get something like this for the
component column:
"stellar; binary component, giant, variable  planetary system; host 
star" (This is a parsing nightmare).
For the e column, we have "orbit; ellipticity"
For the <v> column, we have "orbit; mean velocity"
And we need some sort of link from component to the binary star name
Deneb (perhaps in an upper-level table) and to image 12.
What is missing here (aside from sanity) is that nowhere does the table
say that the planets in ID_MAIN orbit the stars in component!  Humans
comprehend it pretty fast, but a computer would not.

If I understand utype (I am not on any discussion group that has
discussed utype), utype tries to provide some additional knowledge
here.  One provides a model of gravitational orbit and points from the
table fields to parts in the model.  I could imagine having an OWL
description of orbitingSystem: 2 objects in circular motion about each
other.  A subclass would be a planetaryOrbitingSystem.  It would
require that at least one of its components is a planet.  The ID_MAIN
would have a utype that indicated these are planets and another pointer to
indicate component of planetaryOrbitingSystem, while the component would
point to planetaryOrbitingSystem and binarySystem, where binarySystem is
another subclass of orbitingSystem with two similar components.
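
To make the class relationships concrete, here is a minimal sketch in
Python rather than OWL.  The class names follow the ones I used above;
the Body class, the "kind" labels, and the is_valid() checks are
assumptions made only for illustration and are not part of any utype or
OWL specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Body:
    name: str
    kind: str  # hypothetical label, e.g. "star" or "planet"


@dataclass
class OrbitingSystem:
    """Two objects in motion about each other."""
    components: List[Body] = field(default_factory=list)

    def is_valid(self) -> bool:
        return len(self.components) == 2


@dataclass
class PlanetaryOrbitingSystem(OrbitingSystem):
    """An orbiting system in which at least one component is a planet."""

    def is_valid(self) -> bool:
        has_planet = any(c.kind == "planet" for c in self.components)
        return super().is_valid() and has_planet


@dataclass
class BinarySystem(OrbitingSystem):
    """An orbiting system whose two components are of the same kind."""

    def is_valid(self) -> bool:
        return (super().is_valid()
                and self.components[0].kind == self.components[1].kind)


# Deneb as a binary of two stars, and P1 orbiting component A.
deneb = BinarySystem([Body("A", "star"), Body("B", "star")])
p1_system = PlanetaryOrbitingSystem([Body("P1", "planet"), Body("A", "star")])
# Two stars alone do not satisfy the planetary constraint.
not_planetary = PlanetaryOrbitingSystem([Body("A", "star"), Body("B", "star")])
```

A field such as ID_MAIN would then point at the planet slot of
PlanetaryOrbitingSystem, and component at a slot of both
PlanetaryOrbitingSystem and BinarySystem.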

Ellipticity could also point to something in the "model".  However, it
is ambiguous whether the ellipticity refers to the orbit of components A
and B in Deneb or to the orbits of the planets around their hosts.  Same
with mean velocity.

In a Catalog with a tree approach, information can be inserted directly
into the right locations.  We can know that the velocity applies to the
planet because it is a child of planet.  If it were the velocity of the
component star, then it would be a child of that star.  In the
following, the binary system Deneb and its components are introduced
as ancestors to the planets.
<binaryStar name="Deneb">
    <component id="A">
       <spect_class>K Giant</spect_class>
       <variable/>
       <planet name="P1">
         <orbit>
            <around>
                <star ref="A"/>
            </around>
            <ellipticity>
                <unitless/>
                <value>0.1</value>
            </ellipticity>
            <velocity>
                <unit>km/s</unit>
                <value>112</value>
            </velocity>
        </orbit>
       </planet>
    </component>
    <component id="B">
       <spect_class>K Giant</spect_class>
       <variable/>
       <planet name="P2">
         <orbit>
            <around>
                <star ref="B"/>
            </around>
            <ellipticity>
                <unitless/>
                <value>0.3</value>
            </ellipticity>
            <velocity>
                <unit>km/s</unit>
                <value>22</value>
            </velocity>
         </orbit>
       </planet>
    </component>
    <planet name="P3">
        <orbit>
            <around>
                <star ref="A"/>
                <star ref="B"/>
            </around>
            <ellipticity>
                <unitless/>
                <value special="undefined"/>
            </ellipticity>
            <velocity>
                <unit>km/s</unit>
                <value>6</value>
            </velocity>
        </orbit>
    </planet>
</binaryStar>

The relationships and containments are now clear and can be parsed by
standard XML tools.  I introduced an orbit element which takes around,
ellipticity, and velocity.  I allow around to take one or more objects so
that a planet can go around A and B, although I could have said it goes
around Deneb.  Perhaps done this way it would mean that the planet weaves
between the components, and the other way that it just goes completely
around the system.
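
As a rough illustration of the "standard XML tools" point, the sketch
below reads a cut-down copy of the tree with Python's
xml.etree.ElementTree.  The element names are the ones used above and
the values for P1 and P3 come from the table; the describe() helper and
the dictionary layout are my own assumptions, not a proposed interface.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the Deneb tree: P1 orbits component A, P3 orbits A and B.
DOC = """
<binaryStar name="Deneb">
  <component id="A">
    <planet name="P1">
      <orbit>
        <around><star ref="A"/></around>
        <ellipticity><unitless/><value>0.1</value></ellipticity>
        <velocity><unit>km/s</unit><value>112</value></velocity>
      </orbit>
    </planet>
  </component>
  <component id="B"/>
  <planet name="P3">
    <orbit>
      <around><star ref="A"/><star ref="B"/></around>
      <ellipticity><unitless/><value special="undefined"/></ellipticity>
      <velocity><unit>km/s</unit><value>6</value></velocity>
    </orbit>
  </planet>
</binaryStar>
"""


def describe(planet):
    """Collect what the tree says about one planet (hypothetical helper)."""
    orbit = planet.find("orbit")
    e = orbit.find("ellipticity/value")
    v = orbit.find("velocity/value")
    undefined = e.get("special") == "undefined"
    return {
        "name": planet.get("name"),
        # The hosts are stated explicitly, not inferred from column order.
        "hosts": [s.get("ref") for s in orbit.findall("around/star")],
        "ellipticity": None if undefined else float(e.text),
        # The velocity is a descendant of this planet, so it is the planet's.
        "velocity": float(v.text),
        "unit": orbit.findtext("velocity/unit"),
    }


root = ET.fromstring(DOC)
planets = [describe(p) for p in root.iter("planet")]
for p in planets:
    print(p)
```

Nothing here depends on guessing from UCDs or column names: which star a
planet orbits, and which object a velocity belongs to, both fall out of
the nesting.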