Multi-dimensional Data Access minimal requirements

Tom McGlynn Thomas.A.McGlynn at nasa.gov
Tue Mar 11 12:56:20 PDT 2014


To my mind there is a bit of a confusion here between what should be 
two levels of the interface.  I'm not sure what AccessData is and it 
may be that its development is addressing my concerns.  If not and all 
of this is out of left field I will return to my box....

It seems like the data interfaces, SIA, SSA, whatever, talk to 
users in the terms of the data model on which they are based.  So in 
SIA we specify some set of geometric terms that describe a region in 
the sky that we wish for a cutout.

When we retrieve a cutout we are going to retrieve a subset of a 
larger image, where as I understand it we are limiting the subset to 
an n-dimensional sub-box of the original image.

There is no requirement that the specification of the cutout in the 
SIA request have any relationship to the coordinate system, 
orientation, ... of the actual data (again as I understand it).

So, the request might be a simple RA/Dec box a couple of arcminutes on 
a side.  But if the image being cut out is oriented in Galactic 
coordinates, then the service will not provide data in the requested 
box to the user.  As I understand it the intent is that what will be 
returned is the smallest box (in Galactic coordinates) which fully 
includes the requested region.

In this framework supporting a circular region makes a lot of sense to 
me.  I suspect it's easier to calculate the appropriate bounds for 
true circular regions (i.e., circles on the sphere not some particular 
projection plane) than it will be for an RA/Dec rectangle.
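
To illustrate why the circular case is tractable, here is a minimal
Python sketch of the bounds calculation.  It assumes the circle centre
has already been transformed into the image's own coordinate system
(a circle on the sphere stays a circle under such a rotation) and that
the radius is below 90 degrees.  The function name circle_bounds is
hypothetical and not part of any SIA or AccessData draft.

```python
import math


def circle_bounds(lon_deg, lat_deg, radius_deg):
    """Longitude/latitude bounds of a circle drawn on the sphere.

    Returns (lon_min, lon_max, lat_min, lat_max) in degrees.
    Assumes 0 < radius_deg < 90.  If the circle straddles
    longitude 0, lon_min will be greater than lon_max.
    """
    lat_min = lat_deg - radius_deg
    lat_max = lat_deg + radius_deg

    # A circle that touches or encloses a pole spans all longitudes.
    if lat_max >= 90.0 or lat_min <= -90.0:
        return 0.0, 360.0, max(lat_min, -90.0), min(lat_max, 90.0)

    # Half-width in longitude: sin(dlon) = sin(radius) / cos(lat).
    # This is wider than radius/cos(lat) would suggest only slightly,
    # but it is exact for a small circle on the sphere.
    ratio = math.sin(math.radians(radius_deg)) / math.cos(math.radians(lat_deg))
    half_width = math.degrees(math.asin(ratio))

    lon_min = (lon_deg - half_width) % 360.0
    lon_max = (lon_deg + half_width) % 360.0
    return lon_min, lon_max, lat_min, lat_max


if __name__ == "__main__":
    # A 1-degree circle at latitude 60 needs about 2 degrees in longitude.
    print(circle_bounds(180.0, 60.0, 1.0))
```

An RA/Dec rectangle has no such closed form once it is rotated into
another system: its edges are no longer aligned with the image axes,
so the bounds have to be found by walking the boundary.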

However it suggests to me that it would make sense to have this 
separated into two protocols.  The SIA protocol would take the 
requested region and WCS of the image and calculate the actual image 
subset that meets this requirement at whatever level the standard and 
implementation decided upon.

A second lower level protocol (Maybe this is the DataAccess layer. For 
the nonce I'll assume so) is invoked to actually get the subset.  Any 
service implementing a cutout SIA service would be required to 
implement (or provide access to someone who implements) the DA 
protocol, where one can specify at the data level that one wishes a 
particular extraction of a given file.  The DA level knows nothing 
about WCS's or data models or such.  In FITS terms the only thing it 
cares about are the NAXISn keywords (well it would update the CRPIX's 
too I guess).  Say the DA level only supports simple subsets of 
arrays.  That handles our image subsetting of course, but it can also 
be used to extract regions in a spectrum or rows in a table.  Upgraded 
versions of the DA could support skips between pixels returned, or 
averages or other filters defined purely in terms of the array indices.
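
As a rough sketch of what such a data-level extraction might do for a
2-D array, the Python below takes a sub-box given purely in pixel
indices and adjusts only NAXISn and CRPIXn.  The function name cutout,
the nested-list array and the plain dict standing in for a header are
all assumptions made for illustration; nothing here comes from an
actual DA specification.

```python
def cutout(data, header, first, last):
    """Extract a rectangular sub-box from a 2-D array.

    data   : list of rows; data[y - 1][x - 1] is FITS pixel (x, y)
    header : dict with NAXIS1, NAXIS2 and optionally CRPIX1, CRPIX2
    first  : (x, y) of the first pixel, 1-based and inclusive
    last   : (x, y) of the last pixel, 1-based and inclusive

    Returns (sub_array, new_header).  No WCS or data-model knowledge
    is used; only array indices matter.
    """
    (x0, y0), (x1, y1) = first, last
    nx, ny = header["NAXIS1"], header["NAXIS2"]
    if not (1 <= x0 <= x1 <= nx and 1 <= y0 <= y1 <= ny):
        raise ValueError("requested sub-box lies outside the array")

    # Convert 1-based inclusive bounds to Python slices.
    sub = [row[x0 - 1:x1] for row in data[y0 - 1:y1]]

    new_header = dict(header)
    new_header["NAXIS1"] = x1 - x0 + 1
    new_header["NAXIS2"] = y1 - y0 + 1

    # Shift the reference pixel so it still points at the same sky
    # position relative to the new array origin.
    if "CRPIX1" in header:
        new_header["CRPIX1"] = header["CRPIX1"] - (x0 - 1)
    if "CRPIX2" in header:
        new_header["CRPIX2"] = header["CRPIX2"] - (y0 - 1)
    return sub, new_header


if __name__ == "__main__":
    image = [[10 * y + x for x in range(1, 6)] for y in range(1, 5)]
    hdr = {"NAXIS1": 5, "NAXIS2": 4, "CRPIX1": 3.0, "CRPIX2": 2.0}
    print(cutout(image, hdr, (2, 2), (4, 3)))
```

The same index-only interface would apply to a 1-D spectrum or to a
range of table rows, which is the reason for keeping the WCS
arithmetic in the layer above.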

More importantly, for me, the DA could be accessible directly without 
going through the SIA service.  Now if some scientist wants to get 
subsets and she happens to know where the data is she can just grab 
the subset directly.  Providing a generic capability of downloading 
subsets of data  -- regardless of whether we've attached them to some 
lovely data model -- would be an invaluable contribution to the community.

Note, by the by, that my vision of a data access service isn't limited 
to FITS data.  An implementation could get a row subset of a VOTable 
just as easily.  Manifestly the protocol would need to be able to deal 
with multiHDU FITS data and multi-table VOTables, but that's easy 
enough to do.  And it would be fine for a service to respond with 'I 
don't know how to do that' when invoked inappropriately.

Just my two cents...


	Tom

Ray Plante wrote:
> On Tue, 11 Mar 2014, Robert J. Hanisch wrote:
>> Remember, too, we are talking about the query that gets sent from the
>> interface to a service.  SIAP queries will most likely result from web
>> forms or programmatic interfaces in which user-friendly inputs can be
>> specified.  So we need not make the range specification so dumbed-down as
>> "circle".
>
> Recalling that this question arose from the requirement for supporting
> simple cut-outs, we should clarify where the use of circle/range would
> appear.  I'm gathering from Pat's response that this something that
> goes specifically into AccessData, and would not affect image
> searching, which is handled by SIAv2.  Is this correct?
>
> This reminds me of another related question.  SIAv1 had the feature
> that allowed a service to bill itself specificially as a "cutout"
> service, which meant that the search queries would specifically return
> images that are cut-outs matching (as close as possible) the search
> region.  Is this expected to be allowed/supported by SIAv2?
>
> cheers,
> Ray
>


