Multi-dimensional Data Access minimal requirements

Douglas Tody dtody at nrao.edu
Tue Mar 11 13:14:04 PDT 2014


On Tue, 11 Mar 2014, Tom McGlynn wrote:

> However it suggests to me that it would make sense to have this separated 
> into two protocols.  The SIA protocol would take the requested region and WCS 
> of the image and calculate the actual image subset that meets this 
> requirement at whatever level the standard and implementation decided upon.
> A second lower level protocol (Maybe this is the DataAccess layer. For the 
> nonce I'll assume so) is invoked to actually get the subset.

This is exactly what was specified in the Sept-2013 SIAV2 draft, and
what is currently implemented in our VAO SIAV2 prototype.

     -   The SIAV2 query, with mode=cutout (or mode=match), "take[s] the
         requested region and WCS of the image and calculate[s] the
         actual image subset that meets this requirement".

     -   The access reference URL in the query response is a call back to
         the service (actually to the accessData service request, not a
         different protocol, but it amounts to the same thing).  The
         information required to generate the virtual image is passed
         internally to the accessData capability.

     -   When the client GETs the virtual image, the accessData method
         "is invoked to actually get the subset".

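As a rough illustration of the first step above, the sketch below computes
the smallest pixel-aligned box that encloses a requested region once its
boundary points have been mapped into the image's pixel coordinates.  It is
not taken from the SIAV2 draft or the VAO prototype: the function name
`enclosing_pixel_box`, the FITS-style convention of 1-based pixels centered
on integer coordinates, and the use of a few boundary points to stand in
for the region are all assumptions made for the example.  The WCS
transformation itself is assumed to have been done elsewhere.

```python
import math


def enclosing_pixel_box(points, naxis1, naxis2):
    """Return (x1, x2, y1, y2), the smallest 1-based inclusive pixel box
    that covers every (x, y) point, clipped to an naxis1 x naxis2 image.

    Assumption: pixel i covers the interval [i - 0.5, i + 0.5).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    # First and last pixel touched along each axis, before clipping.
    x1 = math.floor(min(xs) + 0.5)
    x2 = math.ceil(max(xs) - 0.5)
    y1 = math.floor(min(ys) + 0.5)
    y2 = math.ceil(max(ys) - 0.5)

    # Clip to the image; the result may be smaller than the request.
    x1, x2 = max(1, x1), min(naxis1, x2)
    y1, y2 = max(1, y1), min(naxis2, y2)
    if x1 > x2 or y1 > y2:
        raise ValueError("requested region does not overlap the image")
    return x1, x2, y1, y2


# A rotated square, given by its four corners in pixel coordinates.
corners = [(10.2, 5.0), (14.7, 7.3), (12.4, 11.9), (7.9, 9.6)]
box = enclosing_pixel_box(corners, 100, 100)
```

Using only the corners is adequate for small regions; for large regions on
the sphere the edges curve, so more boundary points would be needed.
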
If the client knows enough about the image to be accessed (e.g., via a
prior queryData and/or getMetadata) then it can instead just call the
accessData request on the desired image.  This is how we do things like
interactive image cube visualization and analysis, where the same image
is repeatedly accessed.  It is too low level however for simple
automated virtual image generation.
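
The call back to accessData, whether it arrives via the access reference
URL or is issued directly by a client that already knows the image, might
be sketched as follows.  The parameter names `REQUEST`, `ID` and `PIXBOX`,
the example base URL, and both helper functions are hypothetical and chosen
only for illustration; they are not taken from the SIAV2 draft.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_access_url(base_url, image_id, pixel_box):
    """Encode a precomputed pixel box into a call-back URL.

    All parameter names here are hypothetical placeholders.
    """
    x1, x2, y1, y2 = pixel_box
    params = {
        "REQUEST": "accessData",
        "ID": image_id,
        "PIXBOX": f"{x1},{x2},{y1},{y2}",
    }
    return base_url + "?" + urlencode(params)


def parse_access_url(url):
    """Recover the image identifier and pixel box on the service side."""
    query = parse_qs(urlparse(url).query)
    if query.get("REQUEST") != ["accessData"]:
        raise ValueError("not an accessData request")
    box = tuple(int(v) for v in query["PIXBOX"][0].split(","))
    return query["ID"][0], box


# The query stage would place this URL in its response; a client that
# already knows the image could construct the same URL itself.
url = build_access_url("http://example.org/sia", "cube42", (8, 15, 5, 12))
image_id, box = parse_access_url(url)
```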

         - Doug


On Tue, 11 Mar 2014, Tom McGlynn wrote:

> To my mind there is a bit of a confusion here between what should be two 
> levels of the interface.  I'm not sure what AccessData is and it may be that 
> its development is addressing my concerns.  If not and all of this is out of 
> left field I will return to my box....
>
> It seems like the data interfaces, SIA, SSA, whatever, talk to users in 
> the terms of the data model on which they are based.  So in SIA we specify 
> some set of geometric terms that describe a region in the sky that we wish 
> for a cutout.
>
> When we retrieve a cutout we are going to retrieve a subset of a larger 
> image, where as I understand it we are limiting the subset to n-dimensional 
> sub-box of the original image.
>
> There is no requirement that the specification of the cutout in the SIA 
> request have any relationship to the coordinate system, orientation, ... of 
> the actual data (again as I understand it).
>
> So, the request might be a simple RA/Dec box a couple of arcminutes on a 
> side.  But if the image being cut out is oriented in Galactic coordinates, 
> then the service will not provide data in the requested box to the user.  As 
> I understand it the intent is that what will be returned is the smallest box 
> (in Galactic coordinates) which fully includes the requested region.
>
> In this framework supporting a circular region makes a lot of sense to me.  I 
> suspect it's easier to calculate the appropriate bounds for true circular 
> regions (i.e., circles on the sphere not some particular projection plane) 
> than it will be for an RA/Dec rectangle.
>
> However it suggests to me that it would make sense to have this separated 
> into two protocols.  The SIA protocol would take the requested region and WCS 
> of the image and calculate the actual image subset that meets this 
> requirement at whatever level the standard and implementation decided upon.
>
> A second lower level protocol (Maybe this is the DataAccess layer. For the 
> nonce I'll assume so) is invoked to actually get the subset.  Any service 
> implementing a cutout SIA service would be required to implement (or provide 
> access to someone who implements) the DA protocol where one can specify at the 
> data level that one wishes a particular extraction of a given file.  The DA 
> level knows nothing about WCS's or data models or such.  In FITS terms the 
> only thing it cares about are the NAXISn keywords (well it would update the 
> CRPIX's too I guess).  Say the DA level only supports simple subsets of 
> arrays.  That handles our image subsetting of course, but it can also be used 
> to extract regions in a spectrum or rows in a table.  Upgraded versions of 
> the DA could support skips between pixels returned, or averages or other 
> filters defined purely in terms of the array indices.
>
> More importantly, for me, the DA could be accessible directly without going 
> through the SIA service.  Now if some scientist wants to get subsets and she 
> happens to know where the data is she can just grab the subset directly. 
> Providing a generic capability of downloading subsets of data  -- regardless 
> of whether we've attached them to some lovely data model -- would be an 
> invaluable contribution to the community.
>
> Note, by the by, that my vision of a data access service isn't limited to 
> FITS data.  An implementation could get a row subset of a VOTable just as 
> easily.  Manifestly the protocol would need to be able to deal with multiHDU 
> FITS data and multi-table VOTables, but that's easy enough to do.  And it 
> would be fine for a service to respond with 'I don't know how to do that' 
> when invoked inappropriately.
>
> Just my two cents...
>
>
> 	Tom
>
> Ray Plante wrote:
>> On Tue, 11 Mar 2014, Robert J. Hanisch wrote:
>>> Remember, too, we are talking about the query that gets sent from the
>>> interface to a service.  SIAP queries will most likely result from web
>>> forms or programmatic interfaces in which user-friendly inputs can be
>>> specified.  So we need not make the range specification so dumbed-down as
>>> "circle".
>> 
>> Recalling that this question arose from the requirement for supporting
>> simple cut-outs, we should clarify where the use of circle/range would
>> appear.  I'm gathering from Pat's response that this is something that
>> goes specifically into AccessData, and would not affect image
>> searching, which is handled by SIAv2.  Is this correct?
>> 
>> This reminds me of another related question.  SIAv1 had the feature
>> that allowed a service to bill itself specifically as a "cutout"
>> service, which meant that the search queries would specifically return
>> images that are cut-outs matching (as close as possible) the search
>> region.  Is this expected to be allowed/supported by SIAv2?
>> 
>> cheers,
>> Ray
>> 
>

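The array-level extraction Tom describes, which touches only the NAXISn
keywords and shifts the CRPIXn values, can be sketched as below.  This is a
minimal illustration under stated assumptions, not a proposed DA interface:
the function name `extract_subarray`, the image held as a list of rows, and
the header held as a plain dict are all hypothetical.  Pixel ranges are
1-based and inclusive, following the FITS convention.

```python
def extract_subarray(data, header, x_range, y_range):
    """Cut a rectangular sub-box out of a 2-D image.

    data    : list of rows; data[y - 1][x - 1] is FITS pixel (x, y)
    header  : dict with NAXIS1, NAXIS2 and optionally CRPIX1, CRPIX2
    x_range : (x1, x2) along NAXIS1, 1-based inclusive
    y_range : (y1, y2) along NAXIS2, 1-based inclusive
    """
    x1, x2 = x_range
    y1, y2 = y_range
    if not (1 <= x1 <= x2 <= header["NAXIS1"]):
        raise ValueError("x range lies outside the image")
    if not (1 <= y1 <= y2 <= header["NAXIS2"]):
        raise ValueError("y range lies outside the image")

    # Convert 1-based inclusive ranges to Python slices.
    subset = [row[x1 - 1:x2] for row in data[y1 - 1:y2]]

    # Copy the header so the caller's original is left untouched.
    new_header = dict(header)
    new_header["NAXIS1"] = x2 - x1 + 1
    new_header["NAXIS2"] = y2 - y1 + 1

    # The reference pixel moves by the number of pixels dropped
    # before the start of the sub-box on each axis.
    if "CRPIX1" in header:
        new_header["CRPIX1"] = header["CRPIX1"] - (x1 - 1)
    if "CRPIX2" in header:
        new_header["CRPIX2"] = header["CRPIX2"] - (y1 - 1)
    return subset, new_header


# A 5 x 4 test image whose pixel values encode their row and column.
image = [[10 * r + c for c in range(5)] for r in range(4)]
hdr = {"NAXIS1": 5, "NAXIS2": 4, "CRPIX1": 3.0, "CRPIX2": 2.0}
sub, sub_hdr = extract_subarray(image, hdr, (2, 4), (2, 3))
```

Nothing here depends on the WCS or on a data model, so the same slicing
applies equally to a range of a spectrum or a range of table rows.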
