[QUANTITY] Plea for pragmatism
Doug Tody
dtody at nrao.edu
Wed Oct 29 11:07:06 PST 2003
On Wed, 29 Oct 2003, Alberto Micol wrote:
> In all cases, I do not find useful to start from the Quantity level.
>
> Let's concentrate on the top level things we need to solve for the VO,
> eg, how to describe coverage (bandpasses, regions, time intervals, depths)
> of existing data products, how to describe images, spectra, light curves,
> exposure maps, visibilities, etc, how to package products, etc.
This is my view as well, and is more or less what we agreed to do back in
Cambridge in May.
We should focus first on the component data models. This includes
coverage of all sorts including image characterization, for example time
of observation, spectral bandpass, spatial bandpass, spatial resolution,
flux bandpass or sensitivity / limiting flux, observation metadata, WCS,
and so forth.
These components can be modeled largely independently, then used
to characterize actual datasets, to implement queries, and so forth.
They are simple enough that it should be possible to reach agreement.
In the process we will work out how to formally define a data model and
how to use it in various contexts.
We should intentionally keep the initial data models simple, with the
expectation that these will evolve with time as they are used. Having a
simple form of a given data model will be useful even in the long run as
more sophisticated models become available. For example, if one has a
fully general data model for something like spectral bandpass, it will
always be useful to have a "summary data model" which simplifies the
concept of a spectral bandpass to the point where it can be expressed
in a handful of attributes. Most likely it is the simpler "summary"
version which will see the most use. Only a few applications will need
to drill in deeper to look at the details.
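As a rough illustration of the "summary data model" idea, here is a minimal
Python sketch. Everything in it is an assumption made for illustration and
not an agreed definition: the class name BandpassSummary, its attributes,
the summarize() helper, and the half-of-peak threshold are all hypothetical.
It only shows how a detailed transmission curve might be reduced to a
handful of attributes.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BandpassSummary:
    """Hypothetical summary view of a spectral bandpass."""
    name: str    # label for the bandpass
    lo: float    # lower spectral limit
    hi: float    # upper spectral limit
    unit: str    # unit of lo and hi

    @property
    def center(self) -> float:
        # Midpoint of the two limits.
        return 0.5 * (self.lo + self.hi)

    @property
    def width(self) -> float:
        return self.hi - self.lo


def summarize(name: str,
              wavelengths: Sequence[float],
              transmission: Sequence[float],
              unit: str,
              threshold: float = 0.5) -> BandpassSummary:
    """Reduce a detailed curve to limits where transmission is at least
    `threshold` times the peak (an assumed, illustrative rule)."""
    peak = max(transmission)
    passing = [w for w, t in zip(wavelengths, transmission)
               if t >= threshold * peak]
    return BandpassSummary(name, min(passing), max(passing), unit)


# Made-up sample curve, for illustration only.
summary = summarize(
    "example-filter",
    [500.0, 520.0, 540.0, 560.0, 580.0, 600.0],
    [0.0, 0.4, 0.9, 1.0, 0.6, 0.1],
    "nm",
)
```

An application that only needs coarse coverage would read summary.lo,
summary.hi, and summary.center; one that needs the full curve would go back
to the detailed model.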
A global, all-encompassing data model for something like an imaging
observation could be very complex. A better approach might be to simply
provide a mechanism to aggregate and associate the component data models
to model actual instances of complex datasets.
The hierarchy needed is something along these lines:
Container
A mechanism, e.g., based on VOTable (or FITS), for aggregating
components to model actual datasets. For example, SIA does
this now in a crude way by using a row of a table to model the
metadata of a dataset, with component data models mapped to the
fields of the table.
There can be more than one type of container and more than one
way of representing the same data. The same components would be
reused in different types of containers. The components used by
a container could be data models, or storage elements of some sort.
Component data model
This is an abstract model for some finite concept such as time
of observation, spectral bandpass, WCS, and so forth. It is
defined independently of representation and can be understood
independently of how it is used (i.e. it is a component).
Such components would be reused in many different contexts.
Quantity
Component data models may or may not be based on formally defined
quantities. We can define and use component data models without
a quantity data model - this can come later.
It could be useful to have models for fundamental quantities that
are used in multiple contexts, such as different data models,
queries, and so forth. A quantity should be something simple
such as a scalar value with a unit and name. This would help
standardize the use of quantities of various types in the VO.
One could come along later and add tools to do things like
transform one quantity to another. An API for such a tool might
allow either scalars or vectors to be transformed, but the basic
concept would be a simple scalar quantity.
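To make the three levels concrete, the following is a minimal Python sketch
of how they might fit together. All names here (Quantity, TimeOfObservation,
TableRowContainer, transform, and the "time.start" style field names) are
hypothetical and chosen only for illustration; none of them comes from
VOTable, FITS, or SIA. The container flattens components into one row of
fields, loosely in the spirit of the SIA approach described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Quantity:
    """Hypothetical simple quantity: a named scalar value with a unit."""
    name: str
    value: float
    unit: str


@dataclass(frozen=True)
class TimeOfObservation:
    """Hypothetical component data model, independent of representation."""
    start: Quantity
    duration: Quantity

    def to_fields(self) -> Dict[str, float]:
        # Map the component onto flat, table-style fields.
        return {
            "time.start": self.start.value,
            "time.duration": self.duration.value,
        }


class TableRowContainer:
    """Hypothetical container: aggregates components into one table row."""

    def __init__(self) -> None:
        self._components: List[TimeOfObservation] = []

    def add(self, component: TimeOfObservation) -> None:
        self._components.append(component)

    def row(self) -> Dict[str, float]:
        # Merge the fields of every component into a single row.
        merged: Dict[str, float] = {}
        for component in self._components:
            merged.update(component.to_fields())
        return merged


def transform(q: Union[Quantity, List[Quantity]],
              factor: float,
              unit: str) -> Union[Quantity, List[Quantity]]:
    """Hypothetical transform tool: rescale one quantity or a list of them."""
    if isinstance(q, Quantity):
        return Quantity(q.name, q.value * factor, unit)
    return [transform(item, factor, unit) for item in q]


# Made-up values, for illustration only.
obs_time = TimeOfObservation(
    start=Quantity("start", 52941.0, "d"),
    duration=Quantity("duration", 0.5, "h"),
)
container = TableRowContainer()
container.add(obs_time)
row = container.row()
seconds = transform(obs_time.duration, 3600.0, "s")
```

The same TimeOfObservation component could be handed to a different
container type without change, which is the kind of reuse the hierarchy is
meant to allow. The transform function is layered on top and is not needed
to define or use the component.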
Our most urgent need is for the component data models, and simple models
for things like image and spectrum/SED. We also need to define ways to
represent such data models in formats such as XML/VOTable.
- Doug