SSA working draft

Mon Nov 20 14:39:19 PST 2006

Hi Inga -

On Mon, 20 Nov 2006, Inga Kamp wrote:
>> If the service generates cutouts, filters the data, does spectral
>> extraction, etc., this can *only* be done at access time, because
>> these are intrinsically on-demand operations driven by parameters
>> supplied by the client at access time.
>
> I would filtering on the database level not describe as a dynamic creation
> of the dataset. At least it confuses people the way it's phrased now.

By filtering I mean something which changes the actual data samples,
like interpolating or running a Fourier filter on the data, editing
defects, flux calibration, etc. - as opposed to a cutout which does
not compute new data samples.  For a spectrum, filtering at access
time is probably not going to happen unless the service changes
the dispersion or applies a flux calibration at access time (e.g.,
to help get the data in a more generic form), but it is possible.
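
To make the distinction concrete, here is a rough Python sketch with
made-up numbers.  The function names and sample values are purely
illustrative and are not part of the interface: a cutout only selects
existing samples, while a filter (here a simple running mean, standing in
for any real filter) computes new ones.

```python
def cutout(wave, flux, lo, hi):
    """Select the existing samples with lo <= wavelength <= hi.

    No sample value is changed or newly computed.
    """
    return [(w, f) for w, f in zip(wave, flux) if lo <= w <= hi]


def boxcar_filter(flux, width=3):
    """Smooth with a running mean (window truncated at the edges).

    Every output sample is a newly computed value.
    """
    half = width // 2
    out = []
    for i in range(len(flux)):
        window = flux[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


# Made-up spectrum: wavelengths in meters, flux in arbitrary units.
wave = [1.0e-7, 2.0e-7, 3.0e-7, 4.0e-7, 5.0e-7]
flux = [1.0, 2.0, 6.0, 2.0, 1.0]

sub = cutout(wave, flux, 2.0e-7, 4.0e-7)   # subset of the original samples
smoothed = boxcar_filter(flux)             # all-new sample values
```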

Anyway, you are probably right that some clarification is necessary.

>>> Why do you want to single one type of response format out? Why not have
>>> them all equal?
>>
>> Not sure what you are referring to here; 1.4.1 describes the levels of
>> compliance of a service, not response formats.
>
> This was referring to "if you provide only one format, then VOTable is the
> preferred".

Thanks for the clarification.

Actually this is a good question: should we define a preferred format
in which we would like to get spectra back?  This could be desirable
so that clients don't have to be prepared to deal with all the various
kinds of data formats (the main reason we define multiple formats
is so that the client can get whatever it prefers).  If we suggest
a format to support, my tendency is to go with VOTable for spectra,
since it is much better than FITS for metadata and is ok in terms of
efficiency in most cases for spectra, especially if protocol-level
compression such as gzip is used.
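
As a rough illustration of the compression point, the Python sketch below
gzips a VOTable-like TABLEDATA fragment.  The fragment is synthetic and is
not a complete, valid VOTable, and the row count and values are made up, so
the ratio it yields says nothing definite about real spectra.

```python
import gzip

# Synthetic two-column table (wavelength, flux) in TABLEDATA-style markup.
rows = "".join(
    f"<TR><TD>{4.0e-7 + i * 1.0e-10:.6e}</TD><TD>{1.0 + 0.001 * i:.6e}</TD></TR>\n"
    for i in range(1000)
)
raw = f"<TABLEDATA>\n{rows}</TABLEDATA>\n".encode("utf-8")

# Compress the way an HTTP server might for a gzip-encoded response body.
compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)

print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes, ratio: {ratio:.2f}")
```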

>>> 2.3:
>>>
>>> How can you have metadata on virtual data? How should we anticipate all
>>> possible ways a user may ask for spectral cutouts, extraction etc. to be
>>> prepared to answer?
>>
>> This is what on-demand data generation and virtual data are all about.
>> The service describes the metadata of the virtual data product it would
>> generate.
>>
>> There are an infinite number of possible virtual data products.
>> You don't have to describe them all, rather, given what the client
>> requested, the limitations of your service, and the characteristics
>> of the data, you describe what the service would generate to best
>> match what the client requested.
>>
>> A simple example is if the client requests a certain bandpass range
>> and you have a cutout service, the virtual data product would be a
>> spectrum covering only the given wavelength region (or however close
>> the service can get given a range of other details).
>>
>> If the query is detailed enough, the query response may refer to a
>> single data product.  Hence, the query mechanism may be used not only
>> for data discovery, but to negotiate with the service on the details
>> of the data product to be generated.
>>
>
> How do you know the SNR ratio a priori for all possible spectral cutouts?

You wouldn't know it without computing it from the data for the cutout
region, but the interface assumes that the SNR may not be known for spectra
in general, hence it is optional.  Most metadata can, however, be computed
for virtual data from the overall archival dataset values.
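
A minimal Python sketch of what I have in mind follows.  The dictionary
keys and the function name are hypothetical, not names from the draft: the
spectral coverage of the virtual cutout is derived from the archival
coverage and the request, other values are inherited, and the SNR is left
unset because it is not known without computing it.

```python
def describe_cutout(dataset, req_lo, req_hi):
    """Describe the virtual cutout a service would generate for a request.

    `dataset` holds archival metadata under hypothetical keys.  Returns
    None when the requested band does not overlap the dataset at all.
    """
    lo = max(dataset["wave_lo"], req_lo)
    hi = min(dataset["wave_hi"], req_hi)
    if lo >= hi:
        return None
    return {
        "title": dataset["title"],
        "wave_lo": lo,                          # clipped to what exists
        "wave_hi": hi,
        "resolution": dataset["resolution"],    # inherited from the archive
        "snr": None,                            # optional: unknown a priori
    }


archival = {
    "title": "example spectrum",   # made-up record
    "wave_lo": 3.0e-7,             # meters
    "wave_hi": 9.0e-7,
    "resolution": 2.0e-10,
}
virtual = describe_cutout(archival, 5.0e-7, 1.2e-6)
```

Here the client asked for 5.0e-7 to 1.2e-6 m, and the description comes back
covering only 5.0e-7 to 9.0e-7 m, i.e. as close as the service can get.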

>>> 3.3.2.3:
>>>
>>> I thought that BAND is always a string. How can you have then "If a
>>> bandpass is specified as a string it is..."
>>
>> BAND is either a numerical bandpass (wavelength in vacuum in meters)
>> or a bandpass name (unspecified; prior discovery is needed to determine
>> the possible values).
>
> Yes, but the type is always string and never real number. It's confusing.

Ok, I see what you meant now.  I was referring to the semantic type within
the range-list, but the range-list parameter itself is always a string.
The text should be clarified.
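
To illustrate the two levels, here is a rough Python sketch of how a
service might interpret BAND.  The syntax it accepts (comma-separated
elements, "/" separating the ends of a range) is an assumption made for
the example, and open-ended ranges and qualifiers are left out; the draft
text remains the authority.  The parameter always arrives as a string, and
only the elements inside it are treated as numbers or as names.

```python
def parse_band(value):
    """Parse a BAND parameter string into a list of tagged elements.

    Assumed syntax (illustrative only): elements separated by ",", with
    "lo/hi" for a numerical range in meters.  Anything that does not parse
    as one or two numbers is treated as a bandpass name.
    """
    if not isinstance(value, str):
        raise TypeError("BAND is always passed as a string")
    items = []
    for element in value.split(","):
        element = element.strip()
        if not element:
            continue
        try:
            numbers = [float(part) for part in element.split("/")]
        except ValueError:
            items.append(("name", element))     # e.g. a filter name
            continue
        if len(numbers) == 1:
            items.append(("range", numbers[0], numbers[0]))
        elif len(numbers) == 2:
            items.append(("range", numbers[0], numbers[1]))
        else:
            raise ValueError(f"too many '/' in range element: {element!r}")
    return items


numeric = parse_band("1E-7/3E-6")   # numerical bandpass inside a string
named = parse_band("V")             # bandpass name inside a string
```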

>>> Apertures need not be circular, so you may want to phrase the respective
>>> sentence different.
>>
>> Only circular apertures are currently supported for on-demand spectral
>> extraction and this should be adequate for point-source or compact
>> objects (even for Grism data).  We could generalize this if needed,
>> but it complicates the interface.
>
> How do you avoid confusion when people actually want to search for a
> particular observing aperture? Wouldn't it be better to make this clear in
> the name like EXTRACT_APER?

In an earlier version of the interface we used the APERTURE parameter
for this purpose as well.  But it is confusing to overload the parameter,
and searching by aperture gets complicated in any case as one might have
rectangular apertures and so forth.  Hence APERTURE is now used only for
spectral extraction.  As you suggest, it might be useful to use a more
specific parameter name to ensure that people don't confuse the meaning.

There is no way in the current interface to query directly on
the aperture size and geometry, but one can query on the spatial
resolution, which for most spectra is roughly similar to the aperture
size.  This does not cover the case where the aperture is much larger
than the spatial resolution, but this is probably rare enough not
to be worth supporting directly in the query interface.  One can always
submit a more general query and refine the query on the client side
using the more detailed spatial coverage information which comes back
in the query response.
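
As a sketch of that client-side refinement, assume the query response has
already been parsed into one dictionary per record.  The field name
"aperture_arcsec" and the sample records below are hypothetical; the actual
response fields are whatever the draft defines.

```python
def refine_by_aperture(records, min_arcsec, max_arcsec):
    """Keep records whose aperture size lies within [min_arcsec, max_arcsec].

    Records that do not report an aperture are dropped, since the client
    cannot tell whether they match.
    """
    kept = []
    for record in records:
        aperture = record.get("aperture_arcsec")
        if aperture is not None and min_arcsec <= aperture <= max_arcsec:
            kept.append(record)
    return kept


# Made-up rows standing in for a broader query response.
response = [
    {"id": "a", "aperture_arcsec": 1.5},
    {"id": "b", "aperture_arcsec": 10.0},
    {"id": "c"},                           # aperture not reported
]
matches = refine_by_aperture(response, 1.0, 3.0)
```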

 	- Doug