ranges

Patrick Dowler patrick.dowler at nrc-cnrc.gc.ca
Thu May 1 09:20:05 PDT 2003


PS for dm list: some points below cross the line between registry and dm
and one would expect this...

On May 1, 2003 01:32, Clive Page wrote:
> On Wed, 30 Apr 2003, Robert Hanisch wrote:
> > Elizabeth's AstroGrid prototype registry, Anita's document on frequency
> > coverage, and Bob Jackson's previous work on telescope and instrument
> > capabilities, all use range specifications for quantities such as bandpass
> > coverage.  The RSM V.6 document generally does not do this.  In fact, I have
> > been a bit reluctant to put in specific metadata elements for upper and lower
> > limits, as they are often poorly defined (what are the exact wavelength
> > limits for various optical filters?),

Surely an interval defines a bandpass better than a name? Of course, something 
even more complicated with the response function et al would be a better 
characterisation but much worse to query against...


Clive responded:
> Since I have argued in favour of the spectral ranges approach in the past,
> maybe I can explain it, and defend it somewhat.

In our CVO data model, we characterise every observation by a 
"spectral_bounds" interval. This allows one to ask something like
"how many observations have spectral_bounds that includes 
$my_favourite_wavelength?" etc. The great thing is you are remaining in the 
analog world rather than attaching names to things; in analog, an interval
[300,330] (nm) and [302,332] are almost the same and both would be returned by 
most exact searches (both would be returned by all suitably fuzzy searches).
But they are not exactly the same, so attaching the same name is wrong and
attaching different names leads to picking one or the other OR horrendous 
complexity as we manage 1000s of bandpass names.

Intervals are great to use because they can overlap and they are easy to
index and search against, even in large DBs...
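
A minimal sketch of the kind of interval query described above, assuming
bounds in nm and closed intervals. The Interval class, the observation names
and the helper functions are hypothetical illustrations, not the CVO schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; units assumed to be nm for this sketch."""
    lo: float
    hi: float

    def contains(self, x: float) -> bool:
        return self.lo <= x <= self.hi

    def overlaps(self, other: "Interval") -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return self.lo <= other.hi and other.lo <= self.hi


# Hypothetical observations keyed by name, each with its spectral_bounds.
observations = {
    "obs_a": Interval(300.0, 330.0),
    "obs_b": Interval(302.0, 332.0),
    "obs_c": Interval(500.0, 560.0),
}


def observations_including(wavelength_nm: float) -> list:
    """Names of observations whose spectral_bounds include the wavelength."""
    return sorted(n for n, b in observations.items() if b.contains(wavelength_nm))


def observations_overlapping(query: Interval) -> list:
    """Names of observations whose spectral_bounds overlap the query interval."""
    return sorted(n for n, b in observations.items() if b.overlaps(query))
```

Here the nearly identical [300,330] and [302,332] intervals both match a
query at 310 nm without anyone having to decide whether they deserve the
same bandpass name.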

> The main point of the Registry is to make every query a wanted query
> (sorry about the political overtones of that phrase).  In your example the
> important thing is to avoid the HST archive being pestered with queries
> from folk who want, say, radio or x-ray data, which it cannot provide.
> If we don't do this, every query will be sent to potentially hundreds of
> sites.  The way to ensure that the HST archive _always_ gets queries it
> _can_ help with, and avoids _most_ of the queries to which it cannot
> respond, is for the HST archive's declared spectral range to be all
> inclusive.  If there is any possibility that the HST archive could help
> someone wanting data at 9005 angstroms, then the limits should be declared
> to encompass this wavelength, and not just go out to 9000 angstroms.
> There must be _some_ wavelength completely outside the HST's band which
> can be used to ensure that radio/millimetre-wave/XUV/Xray/Gamma-ray
> queries do not get passed on, but all others do.

Exactly. You want the registry to tell you which services have a chance of 
returning something if you submit your query. It doesn't matter so much if
the registry gives you too many services and some of those return nothing
(within reason) but it matters a great deal if the registry fails to tell you 
about a service you might want to try.
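
As a rough sketch of that selection step, assuming each service registers one
declared spectral range in angstroms. The service names and limits below are
made up for illustration and are not real archive coverage:

```python
# Hypothetical registry: service name -> declared (lo, hi) range in angstroms.
# The limits are invented; a real archive would declare its own, padded
# outward so that borderline queries are still passed on.
REGISTRY = {
    "optical_archive": (1000.0, 11000.0),
    "xray_archive": (1.0, 120.0),
    "radio_archive": (1.0e7, 1.0e11),
}


def candidate_services(lo, hi, registry=REGISTRY):
    """Return services whose declared range overlaps the query range [lo, hi].

    The test is deliberately inclusive: a service is kept if it has any
    chance of answering, even if it later returns nothing.
    """
    return sorted(
        name
        for name, (s_lo, s_hi) in registry.items()
        if s_lo <= hi and lo <= s_hi
    )
```

With the padded range, a query at 9005 angstroms reaches optical_archive and
skips the other two; had the range been declared as exactly (1000, 9000), the
same query would have missed a service that might have helped.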

> The most obvious alternative would be to have a simple enumerated list of
> wavebands, with each archive declaring which named band or bands it
> covers.  This would be much simpler to set up, but somewhat less selective
> in practice.  Maybe this doesn't matter much, and we should adopt it as an
> interim solution?  The main problem I foresee is that we are likely to get
> into interminable arguments over where the boundaries are between bands,
> and how many named bands there should be.  If we distinguish between soft
> and hard x-rays, what about medium-energy x-rays, don't they deserve their
> own enumeration value?  And what about UV and IR etc.: do we just call
> them soft/hard or near/far, or go for named bands (J, K, L etc)?  If I
> thought that the community could agree on this in a finite time, I'd be in
> favour of it, at least in the interim.

Such a list would be very large. 

> The other extreme is to have some method of registering the spectral
> coverage of each archive or resource in detail.  Here I foresee that
> debates on which units to use (MHz, Angstroms, eV?) and choice of
> resolution might go on for years.  It's also hard to see how all this
> information could be extracted in the first place, nor how it could be
> stored in a Registry of modest size and complexity, unless the relative
> resolution were rather low (in which case it doesn't provide much more
> selectivity than the spectral ranges approach).

It is an issue in the data model world already to specify units and for 
services to be able to do some common conversions. In spectral_bounds,
one might choose energy, wavelength, or frequency (all used in different 
regimes). Internally, a service can pick one and do conversions as long as
the query declares what units its constants are in... by units here I really 
mean one of eV, µm, Hz and not all the mega- kilo- centi- variants. There 
should be one supported unit for each concept and the variants are a UI issue
(almost a user preference issue). 
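
A minimal sketch of such a conversion, assuming the service stores wavelength
in µm internally and accepts eV, µm (spelled "um" here) or Hz in queries. The
function names are hypothetical; the physical constants are the exact SI
values:

```python
# Exact SI values for the constants.
PLANCK_J_S = 6.62607015e-34       # Planck constant, J s
LIGHT_M_PER_S = 299792458.0       # speed of light, m/s
JOULE_PER_EV = 1.602176634e-19    # J per eV

# h*c expressed in eV * um, so that wavelength_um = HC_EV_UM / energy_eV.
HC_EV_UM = PLANCK_J_S * LIGHT_M_PER_S / JOULE_PER_EV * 1.0e6
# c expressed in um/s, so that wavelength_um = C_UM_PER_S / frequency_Hz.
C_UM_PER_S = LIGHT_M_PER_S * 1.0e6


def to_micron(value, unit):
    """Convert one query constant to wavelength in um (the assumed internal unit)."""
    if unit == "um":
        return value
    if unit == "eV":
        return HC_EV_UM / value
    if unit == "Hz":
        return C_UM_PER_S / value
    raise ValueError("unsupported unit: %r" % (unit,))


def bounds_to_micron(lo, hi, unit):
    """Convert an interval; energy and frequency are reciprocal to wavelength,
    so the converted bounds are re-sorted to keep lower <= upper."""
    a, b = to_micron(lo, unit), to_micron(hi, unit)
    return (min(a, b), max(a, b))
```

Note that an interval given in eV or Hz comes out with its bounds swapped, so
the re-sorting step matters before the result is compared against stored
spectral_bounds.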
 
> On balance, it seemed to me that the spectral range approach was a
> reasonable compromise between these two extremes.

Again, it works wonderfully in the data model layer and - necessarily - the
registry should be talking the same language as the data model and query
parts.


-- 
Patrick Dowler
Tel/Tél: (250) 363-6914 | Fax: (250) 363-0045
Canadian Astronomy Data Centre    | Centre canadien de donnees astronomiques
National Research Council Canada  | Conseil national de recherches Canada
Government of Canada                   | Gouvernement du Canada
5071 West Saanich Road                | 5071, chemin West Saanich
Victoria, BC                                   | Victoria (C.-B.)


