UTYPEs in VOTABLE for data models
Doug Tody
dtody at nrao.edu
Sun Apr 4 17:09:52 PDT 2004
Hi All -
I am still trying to get caught up on the flurry of email from last week
while I was travelling. My apologies if I haven't read all the email yet
and if I duplicate anything which has already been said.
Anyway, on this issue of UCD and UTYPE being orthogonal, what I meant was
that each programmatic interface and interface element (service interface,
data model, etc.) has a unique UTYPE assigned to identify it, whereas UCDs
relate similar elements of different interfaces.
For example,
UTYPE | UCD
------+--------------------------
a.1   | x
a.2   |     x
b.1   |     x
c.1   |          x
c.2   |     x
d.1   |               x
Here, the interfaces are A, B, C, etc. The interface elements are a.1,
a.2, b.1, etc. Every interface or interface element has a unique UTYPE.
Elements a.2, b.1, and c.2 share the same UCD, meaning that they are similar
quantities from different interfaces.
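
As a rough sketch of the same idea in Python: the UTYPEs act as unique keys,
and grouping by UCD recovers the elements that are similar across interfaces.
The UCD labels here ("ucd-p", "ucd-q", etc.) are made-up placeholders, not
real UCD words, and the assignments for a.1, c.1, and d.1 are assumptions;
only the sharing among a.2, b.1, and c.2 comes from the text above. The helper
names are hypothetical.

```python
from collections import defaultdict

# UTYPE -> UCD. Dictionary keys are unique, mirroring the rule that every
# interface element has its own UTYPE. The UCD labels are placeholders.
UTYPE_TO_UCD = {
    "a.1": "ucd-p",  # assumed distinct
    "a.2": "ucd-q",
    "b.1": "ucd-q",
    "c.1": "ucd-r",  # assumed distinct
    "c.2": "ucd-q",
    "d.1": "ucd-s",  # assumed distinct
}


def elements_by_ucd(utype_to_ucd):
    """Group UTYPEs by the UCD they carry."""
    groups = defaultdict(list)
    for utype, ucd in utype_to_ucd.items():
        groups[ucd].append(utype)
    return dict(groups)


def interface_of(utype):
    """Return the interface part of an 'interface.element' UTYPE."""
    return utype.split(".", 1)[0].upper()


groups = elements_by_ucd(UTYPE_TO_UCD)
shared = groups["ucd-q"]  # similar quantities from different interfaces
```

Here `shared` holds a.2, b.1, and c.2, and `interface_of` maps them back to
interfaces A, B, and C. The UTYPE says where an element lives; the UCD says
what kind of quantity it is.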
An example of this is the original use case for which UCDs were invented:
associating similar fields of catalogs from different sources. Here
each survey defined a specific interface (a data model, in this case) for
its catalog. The interfaces were all different, hence it was difficult
to relate different catalogs. UCDs were invented to make it easier to
relate similar fields of such catalogs. Fields from two catalogs share
the same UCD if they are physically similar astrophysical quantities.
In this old use case, the de facto UTYPE values are the field names or
indices of each specific catalog, e.g., USNO.1, USNO.2, etc.
VO interfaces are similar, except that we attempt to define standard
interfaces for data access, standard data models, and so forth. These are
really no different than the old data provider-specific interfaces,
except that we try to define a standard. Instead of merely trying to
derive some order from chaos after the fact, we ask data providers to
implement standard interfaces which mediate (translate) external data into
some standard data model at access time. This makes it much easier for
data analysis software to use data from multiple sources, and encourages
data providers to produce data in a form which makes rigorous automated
processing feasible. Ultimately it makes things easier for data providers
too, by providing them with carefully designed, standard models to follow.
UCDs define a standard language for astrophysical quantities, providing
a simple tool for semantic inference and allowing data from different
sources to be compared in a semi-automated fashion (and encouraging
new data as well as standard interfaces to conform to this language).
Standard interfaces, including standard data models (like WCS, a uniform
model for physically characterizing data, etc.), attack the problem
directly, relying upon direct mediation to map data to and from the
standard models. The UTYPE tags merely provide a simple, direct means
to identify the elements of pre-defined standard interfaces and data models.
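
A minimal sketch of what such mediation might look like, assuming a toy
standard model with three elements. All names here (the "std.*" UTYPEs, the
provider names, and the native field names) are hypothetical and chosen only
for illustration; they do not come from any actual VO standard.

```python
# Hypothetical standard data model: the set of UTYPEs it defines.
STANDARD_MODEL = {"std.ra", "std.dec", "std.mag"}

# Hypothetical per-provider mappings from native field names to the UTYPEs
# of the standard model. Each provider maintains its own mapping.
PROVIDER_MAPPINGS = {
    "survey_x": {"RAJ2000": "std.ra", "DEJ2000": "std.dec", "Vmag": "std.mag"},
    "survey_y": {"alpha": "std.ra", "delta": "std.dec", "m_v": "std.mag"},
}


def mediate(provider, record):
    """Translate one provider-specific record into the standard model.

    Fields with no mapping into the standard model are dropped.
    """
    mapping = PROVIDER_MAPPINGS[provider]
    out = {}
    for native_name, value in record.items():
        utype = mapping.get(native_name)
        if utype in STANDARD_MODEL:
            out[utype] = value
    return out


row_x = mediate("survey_x", {"RAJ2000": 10.0, "DEJ2000": -5.0, "Vmag": 12.3,
                             "note": "provider-specific extra"})
row_y = mediate("survey_y", {"alpha": 10.1, "delta": -5.1, "m_v": 12.5})
```

After mediation both rows are keyed by the same UTYPEs, so analysis code can
be written once against the standard model rather than once per provider.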
None of this has anything to do with mandating how code is written, and
in any case interfaces are best defined in an implementation-independent
fashion. It is key, however, to being able to write code to reliably process
multiwavelength data from multiple heterogeneous sources.
- Doug