World coordinates cutouts/versus pixel cutouts Re: Multi-dimensional Data Access minimal requirements
Tom McGlynn
Thomas.A.McGlynn at nasa.gov
Fri Mar 14 12:03:06 PDT 2014
Hi Francois,
There is no doubt that there needs to be a capability that is able to
translate subset parameters expressed in celestial coordinates to a
parametrization in pixel space. And we clearly need to be able to
extract subsets, not just describe them.
What I am suggesting is that these are potentially separable
capabilities and that there could be substantial benefits to
separating them. Since I frequently get confused by the abstract
discussion, I'd like to illustrate this with an example.
Let's start with my example of an SIA request for a distant galaxy
where we want a 5" cutout region.
The user invokes a request something like:
http://host/getSIA.pl?POS=167,57&SIZE=0.0013888&CUTOUT=true
(I'm not worrying too much about the details here).
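As a minimal sketch of how a client might assemble such a request:
the 5" size is converted to degrees (5/3600) before it goes into SIZE.
The helper name build_sia_request is hypothetical, and the parameter
names are simply those of the example URL above, not a statement of
what the SIA standard requires.

```python
from urllib.parse import urlencode

ARCSEC_PER_DEGREE = 3600.0


def build_sia_request(base_url, ra_deg, dec_deg, size_arcsec, cutout=True):
    """Assemble an SIA-style query URL (hypothetical helper).

    Parameter names (POS, SIZE, CUTOUT) mirror the example request;
    they are illustrative, not taken from a specification.
    """
    size_deg = size_arcsec / ARCSEC_PER_DEGREE  # 5" -> ~0.0013889 deg
    params = {
        "POS": f"{ra_deg:g},{dec_deg:g}",
        "SIZE": f"{size_deg:.7f}",
        "CUTOUT": "true" if cutout else "false",
    }
    # safe="," keeps the comma in POS instead of percent-encoding it.
    return f"{base_url}?{urlencode(params, safe=',')}"


url = build_sia_request("http://host/getSIA.pl", 167, 57, 5)
print(url)
```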
Hopefully the user gets back a VOTable with one or more rows. For
each row there is a URL that gets the cutout the user requested. The
returned URL implements the standard way we present cutouts to the user.
The question is: what does this URL look like?
Is it
http://host/getSIACutout?file=baseFile&POS=167,57&SIZE=0.00138888
or
http://host/getFitsSubset?file=baseFile&XR=1000..1020&YR=1400..1420
?
In the first case, all the initial SIA service request does is pass
the subsetting parameters to some other service after it ascertains
that there is coverage for the particular file. The program
getSIACutout knows something about images and so it's going to calculate
the appropriate axes ranges and then return the subset to the user.
One way that it could do this (and this overall approach would be fine
with me) is to calculate the actual image subranges and then do a
redirect to the appropriate getFitsSubset call from the second
choice.
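A rough sketch of that calculate-then-redirect step, under loudly
stated assumptions: the image is small and unrotated, so a linear
approximation around the reference pixel is good enough, and the WCS
values in the example are invented for illustration (0.25"/pixel,
reference pixel at 1010,1410). A real service would use a proper WCS
library. The function names are hypothetical; only the getFitsSubset
URL shape comes from the example above.

```python
import math


def cutout_pixel_ranges(ra, dec, size_deg, wcs):
    """Return ((xmin, xmax), (ymin, ymax)) in 1-based FITS pixels.

    Assumes an unrotated image and a field small enough that a linear
    approximation around the reference pixel holds.
    """
    half = size_deg / 2.0
    cos_dec = math.cos(math.radians(wcs["crval2"]))
    # Pixel position of the requested center.
    x_c = wcs["crpix1"] + (ra - wcs["crval1"]) * cos_dec / wcs["cdelt1"]
    y_c = wcs["crpix2"] + (dec - wcs["crval2"]) / wcs["cdelt2"]
    # Half-width in pixels along each axis (cdelt1 is usually negative).
    half_x = half / abs(wcs["cdelt1"])
    half_y = half / abs(wcs["cdelt2"])
    # Round each edge to the nearest whole pixel.
    xr = (round(x_c - half_x), round(x_c + half_x))
    yr = (round(y_c - half_y), round(y_c + half_y))
    return xr, yr


def subset_url(base_file, xr, yr):
    """Build the pixel-space URL that getSIACutout could redirect to."""
    return (f"http://host/getFitsSubset?file={base_file}"
            f"&XR={xr[0]}..{xr[1]}&YR={yr[0]}..{yr[1]}")


# Hypothetical header values, chosen only to reproduce the example ranges.
example_wcs = {
    "crval1": 167.0, "crval2": 57.0,
    "crpix1": 1010.0, "crpix2": 1410.0,
    "cdelt1": -0.25 / 3600.0, "cdelt2": 0.25 / 3600.0,
}
xr, yr = cutout_pixel_ranges(167.0, 57.0, 5.0 / 3600.0, example_wcs)
redirect = subset_url("baseFile", xr, yr)
print(redirect)
```

The point of the split is visible here: everything WCS-related lives
in cutout_pixel_ranges, and the resulting URL carries only pixel
ranges.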
In the second approach it is the explicit responsibility of the SIA
service to do the image-based calculation right away. This makes
sense to me since then we have the Image service handling the
image-based calculations and returning something that is now usable in
contexts that don't necessarily understand images and WCS's as such.
We've normalized the code so that all the image stuff happens in the
same place. But if we want to have an intermediate subsetting layer
that converts from WCS to pixels I can live with that.
What I think is a bad idea is tightly coupling the calculations of the
actual data subset range with the extraction of that range from the
data files, i.e., having getSIACutout directly returning the cutout
FITS file.
I've said this above, so to reiterate: if we define the actual
extraction step as a separately implementable interface then not only
can we immediately think about supporting subsetting in VO interfaces
other than SIA, we also free our community to use subsetting however
they like.
Even if we just consider images this would be very useful.
E.g., a few years back I created some mosaics using ROSAT PSPC data.
For some I wanted to try to maximize the resolution, which degrades
rapidly off-center for the PSPC. If I were doing something like this with
the PSPC images I could just request the central fraction of the image
with a given pixel boundary. Don't need to worry about WCS, just the
fixed pixel locations. There are lots of cases where the actual pixel
locations are important for a given set of images.
For reasons that I'm not aware of, some GALEX images are provided with
a circular field of view within the square image frame where the FOV
is not centered. You can get the pixel center from the database. If I
wanted to retrieve more centered GALEX data I could have used this
information to get a nice subset where all of my GALEX data would
actually be the same rather than wandering over the image.
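Both of these cases reduce to the same small calculation: a fixed-size
pixel window around a known pixel center, with no WCS involved. A
minimal sketch follows; the function name and all the numbers (image
size, center, window width) are made up for illustration.

```python
def pixel_window(center, width, naxis):
    """1-based inclusive range of `width` pixels centered on `center`.

    The range is clipped to 1..naxis, so it can come out narrower than
    `width` when the center lies close to an image edge.
    """
    lo = int(round(center - (width - 1) / 2.0))
    hi = lo + width - 1
    return max(1, lo), min(naxis, hi)


# Hypothetical image: 3840 pixels per axis, FOV center at (1950, 1890)
# as it might be read from a database.
xr = pixel_window(1950, 2001, 3840)
yr = pixel_window(1890, 2001, 3840)
print(f"XR={xr[0]}..{xr[1]}&YR={yr[0]}..{yr[1]}")
```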
And we've freed users to use the subsetting when they already have
the ability to do the image calculations. There's lots of WCS-aware
software out there. What isn't there is the ability to extract only a
subset of a file over the web. We can make it easy for lots of tools
to take full advantage of the Web by extracting only what they
need.
If we have a simple and generic capability, users can build lots of
non-image tools that extract data from photon lists, time series,
object lists, anything and everything that's described by tables or
arrays.
And last but not least, for the case of FITS and VOTables all the
software to do the extraction already exists and we just need to
define how it is to be invoked -- not a trivial task but one which is
easier if we limit the functionality rather than trying to support
semantic data models.
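To illustrate how thin the "define how it is invoked" layer could be
for FITS images: cfitsio's extended file name syntax already accepts
an image section such as file.fits[x1:x2,y1:y2], so the pixel-space
parameters only need to be translated into that form. In this sketch
the "lo..hi" format of XR/YR is taken from the example URL above; the
function names and the ".fits" file name are hypothetical.

```python
def parse_range(value):
    """Parse 'lo..hi' into a (lo, hi) tuple of ints, as in XR=1000..1020."""
    lo_text, hi_text = value.split("..")
    lo, hi = int(lo_text), int(hi_text)
    if lo < 1 or hi < lo:
        raise ValueError(f"invalid pixel range: {value!r}")
    return lo, hi


def cfitsio_section(filename, xr, yr):
    """Translate XR/YR parameters into a cfitsio image-section file name."""
    (x1, x2), (y1, y2) = parse_range(xr), parse_range(yr)
    return f"{filename}[{x1}:{x2},{y1}:{y2}]"


section = cfitsio_section("baseFile.fits", "1000..1020", "1400..1420")
print(section)
```

The resulting string can be handed to any cfitsio-based tool that
opens files through the extended file name syntax.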
Tom
François Bonnarel wrote:
> Hi Markus, all
> On 14/03/2014 12:56, Markus Demleitner wrote:
>> On Fri, Mar 14, 2014 at 12:13:28PM +0100, François Bonnarel wrote:
>>> Hi Paul, Tom, all,
>>> Of course it would be nice to have this functionality and it has
>>> been discussed in the DAL group for AccessData version .... 1.1.
>>> While it may seem simpler (and maybe it is, as far as syntax
>>> definition is concerned), it actually is not. For a
>>> pixel cutout query to have any scientific value, some a priori
>>> knowledge (even rough) of the mapping between the pixels and the
>>> world coordinates is needed. This knowledge has to be used either
>>> by a client or by the service itself to prepare useful pixel
>>> cutout queries.
>> Knowing full well I'm getting on everyone's nerves, I'd still like to
>> point out that in the structured parameters approach --
>>
>> http://www.ivoa.net/pipermail/dal/2013-December/006602.html, chapter
>> 6.1 "common parameters"
>>
>> --, pixel-wise cutouts aren't in any way special and are cheap to
>> implement for both clients and servers (although, as I said, I'd now
>> use PIX(n) and PIX(n)_WIDTH, although for pixels, MIN and MAX are
>> just as appropriate).
> Sure, it's possible to easily define a pixel cutout syntax. An
> alternative to the parameters you are proposing is one parameter
> (PIXCUTOUT or whatever) with cfitsio syntax, which is very general for
> all n-d arrays of values. But my point here was about the availability
> of the mapping information necessary to build useful pixel limits.
>
> Cheers
> François
>> François is right, though, that for most interesting use cases,
>> operating on pixel coordinates requires knowledge of the mapping. At
>> least for common FITS images, that's again easy for clients and
>> servers with the proposed mechanism, too: clients would use KIND
>> (trivial to implement) and get back the FITS header they can already
>> interpret.
>>
>> Not quite as general as the full DM approach, but very cheap measured
>> in implementation effort and, I would claim, effective in terms of
>> "Wow!" potential in our clients' users.
>>
>> Cheers,
>>
>> Markus