WD-AccessData-1.0-20140312

Douglas Tody dtody at nrao.edu
Mon Sep 8 00:45:02 CEST 2014


Hi Francois, all -

Having reviewed the recent discussions, after implementing actual SIAV2
and accessData prototypes, my view on the issues raised is as follows:

* Compatibility with SIAV2

The scope of accessData in its planned full glory differs from that of
sia.queryData so they will necessarily depart somewhat, but similar
interface components should be the same as far as practical in the two
interfaces.  In particular, specification of a cutout region in world
coordinates via POS,BAND,TIME,POL should be consistent.

Aside from enhancing user sanity and code reuse in implementations, this
enables more advanced interface extensions.  AccessData can be extended,
for example, to compute the metadata for a virtual image instead of
computing the virtual image itself.  These are closely related
operations and planning virtual data generation is an important
capability for scaling up to very large cubes (we currently do this in
our prototype here via the queryData MODE parameter extension).

When we get into advanced image/cube access where we are computing 
moments, collapsing along an axis, slicing at arbitrary positions and
angles, etc., we will be dealing with access operations that are
quite specific to the type of data we are dealing with (Image).

While I agree that the basic accessData interface for e.g. spectra or
time series should be similar or identical where practical, they will
necessarily differ for the different classes of data, as the operations
one wants to perform on an image cube, spectrum, or time series differ -
the use cases differ.  Hence, we are probably looking at a common
interface pattern, but actual accessData capabilities and interfaces
will differ depending upon the type of data.  This is just the usual
object oriented model of course.

* Parameter Form

My preference continues to be for parameters such as POS,BAND,TIME,POL
etc. - essentially one parameter per axis as Francois describes it.  I
find this to be a simpler and more powerful approach than multiple
atomic parameters, as all the information needed to deal with an axis is
encapsulated in one primary parameter.  Such a parameter is an object
with an abstract type.  One can try to do the same thing with multiple
associated primitive parameters, but the hidden complexities of the
association can easily be much worse than the parameter object
semantics.  A single parameter object can be easily composed, parsed,
and verified.  Generic interface discovery is not an issue so long as
abstract parameter types are supported.
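
A minimal sketch of what "composed, parsed, and verified" can mean for
one such parameter, assuming the four BAND forms Francois lists in the
quoted message below (`5 / 6`, `5 / .`, `. / 6`, `5.5`).  The names
`Interval` and `parse_band` are hypothetical and are not taken from the
draft or from any prototype:

```python
import math
from typing import NamedTuple


class Interval(NamedTuple):
    lo: float  # -inf when the lower limit is open
    hi: float  # +inf when the upper limit is open


def parse_band(value):
    """Parse 'lo/hi', 'lo/.', './hi', or a single value into an Interval.

    Assumed syntax only: '.' marks an open limit, '/' separates the limits.
    """
    parts = [p.strip() for p in value.split("/")]
    if len(parts) == 1:
        # Single extracted value: a degenerate interval.
        x = float(parts[0])
        return Interval(x, x)
    if len(parts) != 2:
        raise ValueError(f"expected 'lo/hi' or a single value, got {value!r}")
    lo = -math.inf if parts[0] == "." else float(parts[0])
    hi = math.inf if parts[1] == "." else float(parts[1])
    if lo > hi:
        raise ValueError(f"lower limit exceeds upper limit in {value!r}")
    return Interval(lo, hi)
```

All four forms go through the same function, and a malformed value fails
in one place rather than in the logic that ties several parameters
together.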

A parameter mechanism that supports abstract types can easily permit
simple types where this is sufficient - the reverse is not true.
Forcing use of only primitive types merely moves the semantics up into
associations of multiple parameters.  Instances of primitive parameters
may be missing or may be multiple - it can get quite complicated in a
hurry.
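
The "may be multiple" case can be illustrated with the union-of-intervals
example from Francois's message quoted below.  This is only a sketch
using the standard library query-string parser; the function names are
hypothetical, WL_MIN/WL_MAX are the atomic names used in that message,
and the sketch assumes both appear at least once:

```python
from urllib.parse import parse_qs


def band_intervals(query):
    """Each repeated BAND value carries its own (lo, hi) pair."""
    values = parse_qs(query).get("BAND", [])
    return [tuple(float(x) for x in v.split("/")) for v in values]


def atomic_readings(query):
    """Return two plausible readings of repeated WL_MIN / WL_MAX values."""
    q = parse_qs(query)
    mins = [float(v) for v in q.get("WL_MIN", [])]
    maxs = [float(v) for v in q.get("WL_MAX", [])]
    # Reading 1: pair the i-th minimum with the i-th maximum.
    pairwise = list(zip(mins, maxs))
    # Reading 2: one interval spanning all the limits given.
    envelope = [(min(mins), max(maxs))]
    return pairwise, envelope


print(band_intervals("BAND=5/6&BAND=7/8"))
print(atomic_readings("WL_MIN=5&WL_MIN=7&WL_MAX=6&WL_MAX=8"))
```

Nothing in the atomic query itself says which reading is intended, so
the association rule has to be specified somewhere outside the
parameters.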

* Parameters

For the Image version of accessData, supporting only simple cutouts
aligned with the image axes initially, we need to define the extent of
the cutout region in world coordinates, e.g., POS,BAND,TIME,POL.  Also
at least PubDID (or ID if you prefer) to identify the dataset to be
accessed.

One thing that has not been addressed sufficiently is pixel space
operations - this is a primary use case for image data.  It is important
whether or not the CSP use cases got that deep - it mainly affects
client applications and how they are implemented.  In theory World space
operations are sufficient, but it doesn't work that way in actual
image access applications where we are directly accessing a single
dataset.  World space is required for discovery, but pixel space is
required for accessing a single pixelated dataset (non-pixel datasets
such as event or visibility datasets can either be sampled to pixels in
the filter or WCS term, or can forbid the pixel space term).

Simply adding a SECTION parameter, specified in pixel coordinates
relative to the image being accessed, suffices to provide this.  The
pixel space term can be combined with the filter term (cutout in World
coords), in which case the pixel space operation is applied to the
result of the filter term.
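
A rough sketch of that ordering, under several assumptions that are not
taken from the draft or the prototype: each axis has a simple linear WCS
(world = crval + (pixel - crpix) * cdelt), SECTION is written as
"x1:x2,y1:y2" with 1-based inclusive pixel ranges, and when combined
with a world cutout the section is counted from the first pixel of the
cutout.  All function names are hypothetical:

```python
import math


def world_to_pixel_range(lo, hi, crval, crpix, cdelt, npix):
    """World interval -> 1-based inclusive pixel range, clipped to the axis.

    Keeps the pixels whose centers fall inside [lo, hi] on a linear axis.
    """
    p_a = crpix + (lo - crval) / cdelt
    p_b = crpix + (hi - crval) / cdelt
    first = max(1, math.ceil(min(p_a, p_b)))
    last = min(npix, math.floor(max(p_a, p_b)))
    if first > last:
        raise ValueError("world interval does not overlap the axis")
    return first, last


def parse_section(section):
    """Parse 'x1:x2,y1:y2' (assumed syntax) into per-axis pixel ranges."""
    ranges = []
    for axis in section.split(","):
        first, last = (int(v) for v in axis.split(":"))
        if first < 1 or last < first:
            raise ValueError(f"bad pixel range {axis!r}")
        ranges.append((first, last))
    return ranges


def apply_section(cutout, section):
    """Apply a section to a cutout; return ranges in original image pixels."""
    if len(cutout) != len(section):
        raise ValueError("section and cutout have different axis counts")
    result = []
    for (c_first, c_last), (s_first, s_last) in zip(cutout, section):
        first = c_first + s_first - 1
        if first > c_last:
            raise ValueError("section lies outside the cutout")
        # Clip the far edge of the section to the cutout boundary.
        result.append((first, min(c_last, c_first + s_last - 1)))
    return result
```

For example, with an x axis of 200 pixels (crval=100.0, crpix=1,
cdelt=0.5) a world interval of 110 to 120 selects pixels 21 through 41,
and a section of "1:10" on that axis then selects pixels 21 through 30
of the original image.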

Again, we have already implemented all this in our VAO prototype, and
can report on relevant experience if desired.  It is also how much of
the image analysis software in the real world outside the VO
currently operates.

 	- Doug


On Mon, 1 Sep 2014, François Bonnarel wrote:

> Hi Jose, Markus , all,
> 
> 
> Well, before trying to play my editor role on that and hire authors I
> would like to give my own personal point of view which is that I still
> advocate for the single PARAMETER per axis syntax (so called "POS-SOUP")
>
> 
>
>      - I like the idea of one single parameter mastering everything for
> an axis.
>         Spectral and time axes  work with simple intervals  or values
> which may  have four  different flavors.
>         What does it look like in practice?
>                Wavelength intervals in the "meter" order of magnitude
> are not really realistic but they are easy to write and read.
>          We can get
>                1 ) finite Interval              BAND = 5 / 6
>                2 ) no upper limit interval      BAND = 5 / .
>                3 ) no lower limit interval      BAND = . / 6
>                4 ) single extracted value       BAND = 5.5
>
>        Parsing this kind of "language" doesn't seem more tricky than 
> managing the various combinations of atomic parameters we would need to
> render all these flavors
>
>               1 ) will be rendered by [WAVELENGTH_MIN = 5&WAVELENGTH_MAX
> = 6]
>               2 ) will be rendered by [WAVELENGTH_MIN = 5] and lack of 
> WAVELENGTH_MAX,  I guess
>               3 ) similarly, will be rendered by [WAVELENGTH_MAX = 6] and
> lack of WAVELENGTH_MIN
>               4 ) will be rendered by [WAVELENGTH_MIN = 5.5 &
> WAVELENGTH_MAX = 5.5]
>
>         By the way union of intervals may be ambiguous with the atomic
> notation. [BAND = 5/ 6 & BAND = 7/8] is unambiguous while [WL_MIN = 5 &
> WL_MIN = 7 & WL_MAX = 6 & WL_MAX = 8] may result in the same union as the
> BAND combination or alternatively simply interval 5/8 !
>
>         Of course the POS and POL cases don't introduce any fundamental
> difference in this argument. The range of possibilities is simply a
> little richer for POS because the "POS Soup" can describe various
> shapes.
>
> 
>
>      - A service based on The "atomic" parameter syntax is actually fully
> implementable as a customized service using the "service" descriptor
> mechanism (see the DataLink proposed recommendation where this is
> described).
>        If the two approaches are not reconcilable and we want to leave
> this whole question open, in the hands of implementers, we could say that all
> this is done by customized parameters à la "service descriptor". The UCD
> and unit attributes could also be used in the "POS-SOUP" case. But of
> course the "service descriptor" mechanism doesn't provide any solution
> for describing the little POS-SOUP syntax. That's one of the reasons why
> I would like the Draft document to define it. (Because there is no other
> standard IVOA way I know to define a parameter syntax as this one I
> guess)
>
>       - I think there are strong benefits (and no real drawback) in the
> long term to have the same syntax for SIA and AccessData.
>                * The SIAV2 query syntax can be directly reused for
> accessData/cutout. The same query you used to discover the data can be
> reused almost unchanged for cutout, just adding the PUBDID as an additional
> input parameter.
>                * By the way, if the SIAV2 service presents the virtual
> capability (a functionality which is part of the use cases of some
> groups and has been proposed for version 2.1), the same URL will realize
> the query and drive the generation of the discovered virtual dataset
> retrievable via the "accessReference" URL.
>         In addition nothing would prevent using this syntax in
> combination with SSA or other simple protocols before future versions of
> these protocols can be harmonized with SIAV2.
>
>        Last but not least I would like to know if there are some
> implementations of the single PARAMETER per axis syntax (so called
> "POS-SOUP" syntax ) and if there is some feedback about it.
> 
> Cheers
> François
> 
> 
> On 06/08/2014 17:20, Jose Enrique Ruiz wrote:
>       Hi all
>       I support the line proposed by Markus that AccessData does
>       not need to be thoroughly consistent with DAL S*AP discovery
>       interfaces, and hence I do not see it either a goal in
>       itself. Discovery and Access are in principle different
>       methods at their basis, and I think we should keep in mind
>       that AccessData should be open enough to address use cases
>       other than those related with SIA or SimDAL. Considering
>       this, and moreover if we see Access methods as very accurate
>       extractions of multidimensional sub-regions for a given
>       dataset, I must say I'd rather prefer the "three-factor
>       semantics", though this is just a matter of preference.
> 
> In relation to:
> "Can we use the SELECT feature to extend to extraction of ranges on
> other parameters such as RadVel/Redshift (observations)?" 
> 
> I think the COORD feature could in principle be used to
> define extraction regions for other not-yet-standard axes/params,
> but a bit more characterization for that axis/param is needed in
> the case of RadVel/Redshift observations (i.e. for velocity at
> least the units and the emission line observed to derive the
> velocity of the gas, and maybe a reference system as
> well). Moreover, for the sake of interoperability I guess it will
> also be useful to provide some info on units and reference points for
> the potential values of the SELECT param.
> 
> My 2c
> 
> 
> --
> Jose Enrique Ruiz
> Instituto Astrofisica Andalucía - CSIC
> Glorieta de la Astronomía s/n
> 18009 Granada, Spain
> Tel: +34 958 230 618
> 
> 
> 
> 
> 2014-07-31 11:54 GMT+02:00 Markus Demleitner
> <msdemlei at ari.uni-heidelberg.de>:
>       Hi DAL,
>
>       On Thu, Jul 31, 2014 at 09:42:59AM +0200, François
>       Bonnarel wrote:
>       >    I encourage people to (re)start to send comments on the current
>       > draft (see below) and have in mind we had on this before and during
>       > the last interop in may.
>       >
>       >    What we have for interface is basically sufficient for the first
>       > priority cutouts and selection requirements from the CSP. It has the
>       > great advantage to be consistent with the SIA query interface. An
>       > update of the draft to take into account last evolutions of SIAV2 is
>       > needed
> 
> I'd *really* like AccessData to not only work with SIAv2; we have our
> SSAP use cases, for instance.  Hence, while I've given up resistance
> against the POS alphabet soup, I *really* think (in particular
> version 1) of AccessData should contain the "three-factor semantics"
> (as discussed in my Madrid talk
> http://wiki.ivoa.net/internal/IVOA/InterOpMay2014DAL/flexdatalink.pdf)
> -- a.k.a. simple, atomic parameters with strong metadata.
> 
> In that vein I'd also argue that consistency with the SIA "query
> interface" is not a goal in itself; as I won't tire to mention, the
> conventional DAL S*AP interfaces have serious issues (mainly in
> interface discoverability), and I'd prefer not to be consistent with
> those.
> 
> The big advantage of adopting three factor semantics for "non-magic"
> parameters is that we don't have to wait for "getMetadata" or
> a similar mechanism to allow clients to discover allowed ranges and
> such and actually provide meaningful user interfaces.
> 
> The POS alphabet soup could still be part of this; it'd be another
> parameter, possibly even a mandatory one if you insist (although I'd
> consider this unwise), identified through a UCD and without further
> metadata until the time ImageDM is ready and we have good ways to
> serialise instances of it.
> 
> TIME, BAND, POL, SELECT from the existing AccessData could fit fairly
> well into the sanitised parameters (ok, TIME should become TIME_MAX
> and TIME_MIN, BAND LAMBDA_MIN, LAMBDA_MAX, but that's details).
> 
> I'd be happy to contribute the prose for this.  For that, however, I
> believe the standard text should go into version control.  I've been
> planning to do an ivoatex package for authoring IVOA standards in
> LaTeX (based on what Mark Taylor uses for VOTable and SAMP) since
> Madrid; can the authors imagine moving over the document to such a
> system?
> 
> Cheers,
>
>          Markus
> 
> 
> 
> 
>

