gzipped images in SIAP 1.0 (fwd)

Wed May 30 16:42:12 PDT 2007

Hi Pat -

This issue has been discussed off and on for a long time now.  There
are attractions to handling the format stuff at the getData stage; some
implementations actually do this already, e.g.,

     http://webtest.aoc.nrao.edu/ivoa-dal/JhuProxySsap?
     REQUEST=getData&FORMAT=votable&
     PubDID=ivo%3A%2F%2Fjhu%2Fsdss%2Fdr5%2380442261170552832

(which is a real acref; you can replace "votable" with other formats
such as "native" or "csv" and get the data today).  Maybe we will go
this way in the future, but it is harder than it seems to standardize
rigorously: a service can serve any kind of data, the available
formats can differ depending upon the image or data collection, etc.
It can be done, but it adds complexity (probably one would want to
replace Access.Format in the query response with a list of some sort,
and replace the acref with a template; this is the old issue of the
templated access reference, which could work but has its own issues
that we won't go into here).
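
As a rough illustration of the getData URL shown earlier, the Python
sketch below rebuilds it from its parts.  The helper name
build_getdata_url is hypothetical; only the base URL, the REQUEST,
FORMAT and PubDID parameters, and the three format names come from the
example, and nothing here assumes the test service is still reachable.

```python
from urllib.parse import urlencode

# Base URL copied from the example acref; the helper below is hypothetical.
BASE_URL = "http://webtest.aoc.nrao.edu/ivoa-dal/JhuProxySsap"


def build_getdata_url(pub_did: str, fmt: str = "votable") -> str:
    """Build a getData access reference for one dataset and one format."""
    # urlencode percent-encodes ':', '/' and '#' in the publisher dataset ID.
    params = {"REQUEST": "getData", "FORMAT": fmt, "PubDID": pub_did}
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    pub_did = "ivo://jhu/sdss/dr5#80442261170552832"
    # Same dataset, three formats: only the FORMAT value changes.
    for fmt in ("votable", "native", "csv"):
        print(build_getdata_url(pub_did, fmt))
```

The point of the sketch is that the format choice lives entirely in
the access reference, so the query response would need some way to say
which FORMAT values are valid for each dataset.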

Just listing the available formats which match the query is a simple
technique which always works.  The query response gets annoyingly
bloated, but it is simple and it works, and handles all the odd cases
(plus now we know how to compress it!).  We can use a MultiFormat
Association to describe the multiple formats available for a dataset
in the QR, and the Association mechanism used is general and can deal
with any other type of logical association.
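
One way to picture the "list every format" approach, as a minimal
sketch: the query response carries one row per (dataset, format) pair,
and a FORMAT constraint is just a filter over those rows.  The Row
type, the dataset IDs and the example.org acrefs are invented for
illustration; this is not the SIAP data model or the Association
mechanism itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Row:
    dataset_id: str  # hypothetical identifier shared by all formats of a dataset
    fmt: str         # MIME type of this representation
    acref: str       # opaque access reference URL


# Illustrative holdings: obs1 exists in two formats, obs2 only as FITS.
ROWS = [
    Row("obs1", "image/fits", "http://example.org/data/obs1.fits"),
    Row("obs1", "image/jpeg", "http://example.org/data/obs1.jpg"),
    Row("obs2", "image/fits", "http://example.org/data/obs2.fits"),
]


def query_response(rows: List[Row], requested_format: Optional[str] = None) -> List[Row]:
    """Return the rows matching the requested format.

    None is taken here to mean "no format constraint" (an assumption of
    this sketch).  An unavailable format yields an empty list, which
    corresponds to returning an empty table.
    """
    if requested_format is None:
        return list(rows)
    return [r for r in rows if r.fmt == requested_format]


if __name__ == "__main__":
    for row in query_response(ROWS, "image/fits"):
        print(row.dataset_id, row.fmt, row.acref)
```

The bloat mentioned above shows up directly: obs1 appears once per
format when no constraint is given.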

I think we may want to consider promoting getData to a real operation
at some point, but a simple opaque access reference URL has its
advantages as well.

 	- Doug

On Wed, 30 May 2007, Patrick Dowler wrote:

> On Wednesday 30 May 2007 14:59, Doug Tody wrote:
>> The first question asked was what SIAP 1.0 intended, and this is what
>> I have addressed above.  SIAP has always worked this way, and I am
>> surprised that anyone is confused.
>
> I don't think the FORMAT thing in SIA 1.0 is confusing. For most people, I
> expect, the collection they are serving is in one format or they decide to
> serve one format, so if a query comes in asking for GIF and they only have
> FITS, they return an empty VOTable (this is what we do).
>
> We have also toyed with on-the-fly conversion, which if deployed would mean we
> could respond "yes" to any format and do the conversion in the getData stage.
> This and the above approach are both kind of "a priori" knowledge of the
> possible types, without doing a DB query.
>
> We also have in some cases pre-computed preview images in graphics format, but
> I always found it kind of ugly for the observation catalog to know about
> these different formats. Essentially, trying to support this in a really
> general way brings in a very large database denormalisation problem,
> especially when you have one set of systems for storing all the files and
> another for enabling the querying of metadata. I prefer to leave the file
> type stuff for the delivery mechanism to handle (e.g., the getData method),
> since that can be largely independent of the querying (except in this case
> of formats).
>
> If one had the on-the-fly conversion in place, then pre-computed previews (for
> example) could be just an optimisation of the retrieval process, which is
> again nice and clean and simple. But in general I think this FORMAT thing is
> an optimisation that introduces some complexity to implementing the service
> in some cases. It is easy enough to avoid that by reducing the scope of the
> service (e.g., we have FITS and JPG but only deliver FITS because we don't
> always have JPG on hand), but that does reduce overall value.
>
> Summary: it's more complicated than it looks, but not confusing :-)
>
>