use cases and Re: Comments on Canadian VO data model

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Wed Apr 23 10:51:53 PDT 2003


Hi Alberto,

Thanks for the comments; responses embedded below.

On April 23, 2003 02:23, Alberto Micol wrote:
> I see Pat's point of differentiating Archives (products) and Catalogs
> (product metadata, defined to be what people will query upon), and I do
> understand the CVO requirement that calls for a data model integrated
> with the queries it supports; this choice simplifies the implementation.
>
>
> Nevertheless, I do not agree with Pat: I think that the most generic
> data model should stand alone. It is of course fine to implement a
> partial data model when one has a specific problem to solve, but the
> IVOA data model has to be able to support many different views (i.e.
> queries), and the users won't necessarily be people.

I completely agree that the data model must be useful on its own. Of the two
statements I made, I think that "query tools must leverage the data model"
is the more important one, so I think the dm is more fundamental. For
example, once you decide that the dm will include data types like Polygon
and Point, your query toolkit has to support those geometric types. That is
why we put so much effort into the basic type system - it drives the design
of both the "actual data model" (where types are an important base component)
and the query toolkit. That was what I really meant: the types one chooses to
use in the data model drive both the dm and query design.
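To illustrate the point, here is a minimal sketch of a shared base type;
the names (Point, Polygon, contains) follow the message's examples but are
purely illustrative, not the actual CVO type system:

```python
# Hypothetical sketch: a geometric type defined once in the data model
# and reused by the query toolkit. Names are illustrative, not CVO's API.
from dataclasses import dataclass
from typing import List

@dataclass
class Point:
    ra: float   # degrees
    dec: float  # degrees

@dataclass
class Polygon:
    vertices: List[Point]

    def contains(self, p: Point) -> bool:
        """Ray-casting point-in-polygon test (flat-sky approximation)."""
        inside = False
        n = len(self.vertices)
        for i in range(n):
            a, b = self.vertices[i], self.vertices[(i + 1) % n]
            # Only edges that straddle p.dec can cross the test ray
            if (a.dec > p.dec) != (b.dec > p.dec):
                x = a.ra + (p.dec - a.dec) * (b.ra - a.ra) / (b.dec - a.dec)
                if p.ra < x:
                    inside = not inside
        return inside

# The same Polygon type serves as an observation's footprint in the dm
# and as a spatial constraint in the query layer:
footprint = Polygon([Point(0, 0), Point(10, 0), Point(10, 10), Point(0, 10)])
print(footprint.contains(Point(5, 5)))
```

The design point is that once Polygon is a dm type, the query toolkit must
be able to evaluate predicates like contains() over it - the type choice
drives both layers.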

> Pat's data model stops at the user query level, and delegates the rest
> to some intelligence placed somewhere else. The more general data model
> might want to go deeper. Suppose (in CVO) that a piece of software wants
> to read the pixel values of an image; the image might be in FITS format,
> but does not need to be; a FITS file might be structured in many
> different ways. A well-defined data model could describe the image
> format, so that a generic image reader based on the data model could be
> conceived.

On the contrary, the CVO data model is designed for software consumption.
We have already implemented several "discovery agents" that look for suitable
entries in the observation catalog and create entries in the process catalog
that define some (future) processing that can be done, using rules about
which known processes can be applied to "data products" described via certain
metadata. Once an entry lands in the observation catalog, relatively
automatic software agents do the rest, up to and including production of
entries and their insertion into the source catalog.
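The agent idea can be sketched roughly as a rule-matching loop; this is an
illustrative toy, assuming dict-shaped catalog entries and process rules,
not the actual CVO implementation:

```python
# Illustrative sketch (not the real CVO code) of a "discovery agent":
# scan the observation catalog and, for every known process whose
# metadata requirements are met, create a process-catalog entry
# describing future processing.
observation_catalog = [
    {"id": "obs-1", "metadata": {"format": "FITS", "calibrated": False}},
    {"id": "obs-2", "metadata": {"format": "FITS", "calibrated": True}},
]

# Each known process declares the metadata a data product must carry.
known_processes = [
    {"name": "calibrate", "requires": {"format": "FITS", "calibrated": False}},
    {"name": "source-extract", "requires": {"format": "FITS", "calibrated": True}},
]

def discover(observations, processes):
    """Return process-catalog entries for each applicable (obs, process) pair."""
    process_catalog = []
    for obs in observations:
        for proc in processes:
            # A process applies when every required metadata key matches
            if all(obs["metadata"].get(k) == v
                   for k, v in proc["requires"].items()):
                process_catalog.append(
                    {"observation": obs["id"], "process": proc["name"]})
    return process_catalog

print(discover(observation_catalog, known_processes))
```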

It is very true that we have not tried to tackle the "data product" issue at
all in the data model, except by omission (i.e. by deciding what doesn't
belong in the catalog-level dm :-) Here there are all manner of thorny issues
to deal with... so far we assume that a data product is "FITS + extra stuff
you need to process it". How the extra stuff is delivered (as part of the
product or separately) is TBD, but I see others are thinking along these
lines too. Doug Tody posted something about including a VOTable with the FITS
files returned by SIAP, for example. I don't know what the best solution here
is, but I am pretty sure it lies within the realm of the Archive (or the
VO-enabled Archive, if you want the fancy name :-)

In general it is the processing software rather than the exploration software 
that needs to understand the data products. From a practical point of view,
however, one needs to be able to tell if a data product is processable with
known/available (to the explorer) tools, so some characterisation is probably
useful....


> We need people looking at things from different perspectives. We should
> probably try to compile a list of use cases and data model requirements,
> basing it on the existing data models (CVO, CfA, CDS, Astrovirtel,
> AstroGrid). What remains to be seen is whether such a list is complete
> or whether more scenarios have to be envisaged. It would be nice if we
> could come up with a proper list before Cambridge. What do you think?
>
>
> Here is the Astrovirtel data model use case:
>
> A data model to describe ESO and HST telescopes, instruments, cameras,
> filters, detectors, controllers, and observing modes, in order to build:
>
> 1) a metadata service to help astronomers find instruments with certain
>    characteristics, e.g. which ESO instruments offer a certain
>    resolution and field of view in the R band? (This is probably a good
>    bridge case between the Registry and the DM.)

We haven't gone into the area of an instrument data model. It seemed
more profitable to concentrate on observations, sources, processes,
and objects and leave the archive-specific stuff exactly that: 
archive-specific. 

What is the real use case behind the above question? If you are then going to
ask "which observations came from that instrument?", you could just have
asked an observation catalog "which observations have spatial_resolution >
$res and spatial_bounds.size() > $field and spectral_bounds.contains(
$R_band )" with the CVO model. If you can ask that question, why do you care
about the instrument? If you do care about it being an ESO instrument, you
can add "and data_product.archiveName() == ESO" or something like that :-)
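In Python terms, the query above might look roughly like the sketch below;
the attribute names follow the message's pseudo-syntax, and the threshold
values are made-up illustrations, not a real CVO API:

```python
# Hedged sketch of the observation-catalog query from the text.
# Field names mirror the pseudo-syntax; the numbers are illustrative.
res = 1.0            # arcsec (assumed units)
field = 0.25         # square degrees (assumed units)
R_band = (0.55, 0.70)  # microns, rough R-band interval (illustrative)

def contains(interval, band):
    """True when `interval` fully covers the `band` interval."""
    lo, hi = interval
    return lo <= band[0] and band[1] <= hi

def matches(obs):
    return (obs["spatial_resolution"] > res
            and obs["spatial_bounds_size"] > field
            and contains(obs["spectral_bounds"], R_band)
            and obs["archive_name"] == "ESO")  # the optional extra constraint

deep_wide_obs = {"spatial_resolution": 1.5, "spatial_bounds_size": 0.5,
                 "spectral_bounds": (0.50, 0.75), "archive_name": "ESO"}
print(matches(deep_wide_obs))
```

The point stands independently of syntax: the observational properties,
not the instrument identity, are what the query actually constrains.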

Just trying to understand what this question is really after...

>
> 2) a Limiting Magnitude (or Flux) Calculator (LMC) for archived images.
>    This requires a data model describing things like the telescope area,
>    the filter transmission curves, the detector sensitivities, the pixel
>    scale, the seeing, etc.

The observation data model has an optional property called flux_SN10, which
is the flux of a fictitious point source with a signal-to-noise ratio of 10.
According to our staff astronomers, this is a pretty good measure of the
depth of an observation. It isn't the limit, but it is "a limit for point
sources" (or "unresolved" sources, if we follow a recent suggestion). If that
is not a sufficient characterisation of depth/limit, then we can improve
it...

Note that the "spectral_bounds" property ostensibly describes the spectral
coverage of the observation - as an interval. This is a first-order
approximation to the filter transmission curve plus detector sensitivity; a
more complex description (data type: Function.Float instead of
Interval.Float) could be used to describe it, since this value is published
by the archive (i.e. the people who know). However, we may not get much value
out of a more detailed description from a query point of view.
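The contrast between the two representations can be sketched as follows;
the class names mirror the message's Interval.Float/Function.Float, but the
implementations (and the sample curve) are my own illustrative assumptions:

```python
# Sketch contrasting an Interval.Float-style coverage with a
# Function.Float-style sampled transmission curve (linear interpolation).
from bisect import bisect_left

class IntervalFloat:
    """First-order spectral coverage: just lower and upper bounds."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def contains(self, x):
        return self.lo <= x <= self.hi

class FunctionFloat:
    """Piecewise-linear curve sampled at increasing (x, y) points."""
    def __init__(self, xs, ys):
        self.xs, self.ys = xs, ys
    def __call__(self, x):
        if x <= self.xs[0] or x >= self.xs[-1]:
            return 0.0  # outside the sampled range: no throughput
        i = bisect_left(self.xs, x)
        x0, x1 = self.xs[i - 1], self.xs[i]
        y0, y1 = self.ys[i - 1], self.ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Interval view: "does the observation cover 0.65 um at all?"
band = IntervalFloat(0.55, 0.80)
print(band.contains(0.65))

# Function view: "what is the throughput at a given wavelength?"
curve = FunctionFloat([0.55, 0.65, 0.80], [0.0, 0.9, 0.0])
print(round(curve(0.70), 3))
```

For query purposes the interval answers the common question cheaply; the
curve only pays off when the consumer needs actual throughput values.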




-- 
Patrick Dowler
Tel/Tél: (250) 363-6914 | Fax: (250) 363-0045
Canadian Astronomy Data Centre    | Centre canadien de donnees astronomiques
National Research Council Canada  | Conseil national de recherches Canada
Government of Canada                   | Gouvernement du Canada
5071 West Saanich Road                | 5071, chemin West Saanich
Victoria, BC                                   | Victoria (C.-B.)



More information about the dm mailing list