[CATALOGUE] Starting Data Model Subgroup

Pedro Osuna Pedro.Osuna at sciops.esa.int
Mon Jul 26 03:48:16 PDT 2004


Dear all,

At the last IVOA meeting in Cambridge (Boston), I approached Jonathan
about taking on the vacant role of coordinating the efforts of a
"Catalogue" subgroup of the Data Model.

Now that he has agreed, I'm sending this note to ask for volunteers to
join this subgroup, and to start gathering input from all of you in
order to compile information that would eventually become a Catalogue
Data Model recommendation.

To put some flesh on what I understand we are after, I am sending
some initial brainstorming on the whole idea of the Catalogue DM, in
the hope that it serves to start proper discussions on the issue.

Further mails on this will appear with a [CATALOGUE] heading so that
they can be conveniently filtered/trashed.

Thank you.

Cheers,
Pedro Osuna. 



Catalogue Data Model Subgroup starting inputs
---------------------------------------------

In order to build a proper Data Model for Catalogues, I think it would
be important to answer the following questions:

1) What is a Catalogue?
2) What is a Catalogue used for?
3) Why do we want to model Catalogues?
4) Where do Catalogues find a place within the VO?
5) What are the interesting Use Cases for a Catalogue DM?


The most important task in the first stages of this work is to
identify exactly what we mean by a Catalogue, so that we reach a common
agreement on what we will be modeling.
Some of my own views on the definition of a Catalogue follow, intended
as a starting point for discussion.



DEFINITION OF A CATALOGUE
-------------------------

From "Webster's Revised Unabridged Dictionary (1913)":

"[...]A list or enumeration of names, or articles arranged
methodically, often in alphabetical order; as, a catalogue of
the students of a college, or of books, or of the stars.[...]"


In the case of astronomy, thus, a catalogue would be a list or
enumeration of certain astronomical objects (to be clarified later) in a
certain order and including certain information per object.

The definition of an astronomical object in this context would vary.
An astronomical object could be anything from Stars to Galaxies, etc.,
but also something more general like Observations, Sources or
Observatories.

In this sense, the Catalogue data model would not have to describe the
inner details of the objects it is cataloging, which should be described
in other data models, but just the information relevant to the
catalogue itself.  It is also true that some of the internal properties
of the astronomical objects would appear in the catalogue itself through
its columns.

For example, the XMM-Newton "1XMM" catalogue is a list of serendipitous
sources detected by the satellite in its observing campaign. The model
for this catalogue could consist of things like the provenance (ESA),
number of columns (400), number of rows (~32000), etc., or it might give
more relevant information like: column number three in the catalogue is
Source.likelihood, where likelihood is an attribute of the Source Data
Model.
I think this is an interesting point for discussion.
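
To make the two levels of description in the 1XMM example a little more
concrete, here is a minimal Python sketch. It is an illustration only,
not a proposed model: the class names, the field names, the placeholder
column names ("col1", "col2", "col3") and the "Model.attribute" string
notation are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnDescription:
    name: str        # column name as it appears in the catalogue
    datatype: str    # e.g. "float", "string"
    # Hypothetical link to an attribute of another data model,
    # written as "Model.attribute" (e.g. "Source.likelihood").
    model_attribute: Optional[str] = None


@dataclass
class CatalogueDescription:
    name: str
    provenance: str
    n_rows: Optional[int] = None   # may be approximate or unknown
    columns: List[ColumnDescription] = field(default_factory=list)

    @property
    def n_columns(self) -> int:
        return len(self.columns)

    def columns_for_model(self, model: str) -> List[ColumnDescription]:
        """Return the columns linked to attributes of the given data model."""
        prefix = model + "."
        return [c for c in self.columns
                if c.model_attribute and c.model_attribute.startswith(prefix)]


# Only three placeholder columns are described here; the column names
# are invented, and only the third one carries a link to another model.
xmm = CatalogueDescription(
    name="1XMM",
    provenance="ESA",
    n_rows=32000,  # approximate figure quoted in the text
    columns=[
        ColumnDescription("col1", "float"),
        ColumnDescription("col2", "float"),
        ColumnDescription("col3", "float", "Source.likelihood"),
    ],
)
linked = xmm.columns_for_model("Source")
```

The plain fields (provenance, row and column counts) correspond to the
"obvious" description, while model_attribute is one possible way of
saying that a column carries an attribute of another data model.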


A place to find literally thousands of catalogues is the CDS, which has
5587 catalogues available. Their classification of the catalogues
follows the type of data being cataloged, e.g., Astrometric data,
Photometric data, Spectroscopic data, etc. The same question as above
arises here: would we have to create a specific data model for each of
the astronomical object categories we end up cataloging?

It would be nice, in passing, to get someone from CDS directly involved
in this subgroup, given their experience in catalogues.


Another point to clarify is whether a catalogue (in the Data Model
sense) has to be two-dimensional or can have more than two dimensions.
What I mean by this is that, for example, we might have two different
catalogues for the same set of objects, one for filter A and the other
for filter B. In the Data Model, however, we might have a single
object with three axes: one for the objects, another for filter A and
a third for filter B. The final representations of the catalogues
would always be two-dimensional, but a Data Model representation
allowing more axes would be more compact, powerful and flexible. Whether
or not this would open a Pandora's box, I hope to get people's
impressions.
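
One possible reading of the multi-axis idea can be sketched in Python as
follows. This is only an illustration under stated assumptions: values
are indexed by object, measured quantity and filter, and each filter is
projected back onto an ordinary two-dimensional table. The class, the
method names and the sample values are all hypothetical.

```python
from typing import Dict, List, Tuple


class MultiAxisCatalogue:
    """Values indexed by (object, quantity, filter); all names hypothetical."""

    def __init__(self, quantities: List[str]) -> None:
        self.quantities = quantities
        self._values: Dict[Tuple[str, str, str], float] = {}

    def set(self, obj: str, quantity: str, filt: str, value: float) -> None:
        if quantity not in self.quantities:
            raise KeyError(f"unknown quantity: {quantity}")
        self._values[(obj, quantity, filt)] = value

    def table_for_filter(self, filt: str) -> List[List[object]]:
        """Project onto a two-dimensional table, one row per object."""
        objects = sorted({o for (o, _, f) in self._values if f == filt})
        return [
            # Missing measurements are represented as None.
            [obj] + [self._values.get((obj, q, filt)) for q in self.quantities]
            for obj in objects
        ]


# Invented sample values for two objects and two filters, A and B.
mcat = MultiAxisCatalogue(quantities=["flux", "flux_error"])
mcat.set("obj1", "flux", "A", 1.0)
mcat.set("obj1", "flux_error", "A", 0.1)
mcat.set("obj1", "flux", "B", 2.0)
mcat.set("obj2", "flux", "B", 3.0)

table_a = mcat.table_for_filter("A")
table_b = mcat.table_for_filter("B")
```

Here the two per-filter catalogues are just two projections of the same
underlying structure, which is the compactness argument made above.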




In summary, there are obvious things to model from a catalogue, like its
provenance, number of columns, type of columns, names of columns, number
of rows, etc., but there are others which might make the model more
interesting and powerful, like including n-dimensions (in the, let's
say, cartesian sense of orthogonal catalogues, not in a relational one)
or linking the objects cataloged with their own data model....

Hope this serves somehow as a starting point.

I will be on holiday until Aug 23; I'll then process any input you
have sent.



P.S.: in a personal note to me, Jonathan touched on the issue of
whether we should say CATALOG or CATALOGUE and, likewise for other IVOA
standard docs, whether we should use British or American English.
Not being a native speaker, I don't feel I have the right to say
anything, and I apologize in advance for any inconsistency in how I
write this, and other, word(s).

-- 
Pedro Osuna Alcalaya

 
Software Engineer
European Space Astronomy Center
(ESAC/ESA)
e-mail: Pedro.Osuna at esa.int
Tel + 34 91 8131314
                                                                                
European Space Agency
VILLAFRANCA Satellites Tracking Station
P.O. Box 50727
E-28080 Villafranca del Castillo
MADRID - SPAIN


