[data-cube] Comparing protocols

Arnold Rots arots at cfa.harvard.edu
Tue Oct 29 08:51:06 PDT 2013


Yes, it is essential that we recognize these as three distinct aspects.
Returning to the point I made originally in this thread: it is vital that
we consider the requirements in the context of the end-to-end workflow
from the user's perspective. Each of these three components has its
own role in those workflows, and the requirements on them vary depending
on the scenario.
Discovery in particular takes on different functional roles depending on
whether the user is specific or flexible when it comes to, say, file
formats, spectral coverage, FOV, resolution, object or types of objects,
intended analysis, data types, etc.
If the user knows exactly what (s)he wants, it's straightforward, but if
the question is "What spectral information can you find on AGN?" or "I am
looking for Doppler and spatial data on HVCs", then a different approach
is called for.
And, of course, there will be feedback between analysis and further
discovery, as well as situations where discovery, retrieval, and
analysis are integrated in a more or less comprehensive tool.
It is, I think, fairly obvious that SIAP2 will satisfy the straightforward
case, but I must admit that it is still not clear to me what the strengths
and weaknesses of the different protocols are in the more open-ended
workflows.

Cheers,

  - Arnold

--------------------------------------------------------------------------
Arnold H. Rots                          Chandra X-ray Science Center
Smithsonian Astrophysical Observatory   tel:  +1 617 496 7701
60 Garden Street, MS 67                 fax:  +1 617 495 7356
Cambridge, MA 02138                     arots at cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------



On Tue, Oct 29, 2013 at 11:17 AM, Lazio, Joseph (9000) <
Joseph.Lazio at jpl.nasa.gov> wrote:

> On Oct 28, 2013, at 7:52 PM, Douglas Tody <dtody at nrao.edu> wrote:
>
> > There are two main aspects to cube data access and analysis: simple data
> > discovery and retrieval, and interactive analysis of large cubes.  Both
> > are important; however, the discussion below addresses only simple
> > discovery and retrieval.
>
> There was a bit of discussion of this topic in the CSP context at the
> IVOA.  We agreed that there were three aspects: discovery, access, and
> analysis.  As has been mentioned many times, to date, we've focused on the
> discovery and access aspects.  One of the challenges quickly identified for
> analysis was that different data cubes may (likely) demand different kinds
> of analysis, and the question of what analysis means rapidly becomes data
> provider specific.
>
> I don't think that we made it much beyond this stage (and the CSP meeting
> was occurring before or during the discussions about the resolution between
> TAP/ObsTAP and SIAP-v2).
>
> -- Joe
>
>
>
> > The TAP/ObsTAP and SIAV2 approaches have different strengths for simple
> > discovery and retrieval of whole images/cubes.  TAP is data type
> > agnostic and better for general (non image-specific) archive data
> > browsing and discovery, which is adequate for simple discovery and
> > retrieval of whole image datasets.  SIAV2 provides a richer Image data
> > model and more powerful discovery for image datasets.  The parameter
> > interface is simpler and more powerful for simple discovery queries, but
> > ADQL is more powerful for complex ad hoc queries.  As Bob notes, a small
> > data provider that only needs to put up an image data collection or two
> > is more easily served via SIAV2, which does it all with a single service
> > optimized for image data.  Larger providers can more easily deal with
> > the complexities of TAP/ObsTAP - and should probably support both TAP
> > and SIAV2 if they have the resources to do so.
> >
> > However, what is being missed here is the requirement for direct access
> > to large data cubes for interactive analysis.  Very large cubes, i.e.,
> > tens of GB, or up to the Terabyte scale or larger, are impractical to
> > download, or even deal with locally at all by a client unless they have
> > an unusual compute capability locally.  Practical analysis requires
> > remote access to the dataset.  Only SIAV2 is capable of providing this
> > capability.  It also provides an enhanced, image-specific data discovery
> > capability, hence covering the entire range of capabilities required for
> > image data.  Simply stated, SIAV2 (or some comparable image-specific
> > interface) is required to support the potentially very large cubes
> > provided by modern instruments, particularly the very large cubes coming
> > soon from radio instruments.
> >
> > We really need both - TAP for general archive browsing and data queries,
> > possibly augmented by a generic AccessData / cutout capability for
> > simple cutouts.  SIAV2 for enhanced discovery for image-only data
> > collections, but most notably for direct access to remote image datasets
> > for distributed/scalable image analysis.  That said, 2D images of modest
> > size still dominate, and a simple image discovery/access protocol
> > building upon the very successful SIAV1 will enhance community take-up.
> > SIAV2 provides both a simple protocol for basic discovery and retrieval,
> > plus capabilities for advanced distributed data access to arbitrarily
> > large image datasets.
> >
> >       - Doug
> >
> >
> >
> > On Tue, 29 Oct 2013, Robert J. Hanisch wrote:
> >
> >> Thus far it appears to be equally easy to build GUIs for either of the
> >> protocols being discussed for SIA V2.  CADC and JVO have done it using
> >> the ObsTAP/Datalink approach, VAO has done it with the SIAP V2 approach.
> >> Arnold and Jonathan's points are certainly relevant, but in the case of
> >> SIA V2, the bigger impact is on data providers.  Do they have SIA V1
> >> services that can be fairly easily upgraded to V2?  Do they implement
> >> ObsCore and ObsTAP?
> >> For these protocols to be successful they need significant take-up on
> >> the data provider side.  Otherwise there is little motivation to
> >> implement clients, and the ease of use for building clients becomes a
> >> red herring.  In any case, it seems to be a wash, client-side.
> >> Bob
> >> From: "Tedds, Jonathan A. (Dr.)" <jat26 at leicester.ac.uk>
> >> Date: Sunday, 29 September 2013 4:40 AM
> >> To: Arnold Rots <arots at cfa.harvard.edu>
> >> Cc: data-cube <data-cube at usvao.org>, "dm at ivoa.net" <dm at ivoa.net>, DAL
> >> mailing list <dal at ivoa.net>
> >> Subject: Re: [data-cube] Comparing protocols
> >>
> >>      Anyone working as and with end users would have to second
> >>      these excellent points made by Arnold. Rather like the
> >>      initial Research Data Alliance Working Groups, which I have
> >>      more involvement with than IVOA these days, it is being
> >>      pointed out that an emphasis on technical solutions alone and
> >>      in isolation will not have the desired effect. The difficult
> >>      balance is between catering for the diversity of end user
> >>      requirements while at the same time actually getting
> >>      something done. The RDA will tend to emphasise the latter.
> >>      IVOA has been successful at doing likewise, albeit it's never
> >>      a quick process! Bioscientists appear to be presiding over a
> >>      Darwinian evolution of overlapping standard schemes through
> >>      their much higher numbers. RDA certainly presents an
> >>      opportunity for IVOA to look at other disciplines and compare
> >>      approaches so it was good to see it represented at the 2nd
> >>      RDA Plenary a couple of weeks ago. A little more involvement
> >>      in Interest and Working Groups would be of mutual benefit.
> >> Cheers,
> >> Jonathan
> >> On 28 Sep 2013, at 20:41, "Arnold Rots" <arots at cfa.harvard.edu>
> >> wrote:
> >>
> >> With apologies if you receive multiple copies of this message.
> >> It occurred to me that the discussion we had yesterday on the
> >> relative merits of SIAP, ObsTAP, and DataLink only had moderate
> >> relevancy and lost sight of the bigger picture.
> >> The problem is that within the IVOA people and groups have been
> >> designing protocols that make sense within their own context, but
> >> very little attention has been paid to the end-to-end use case
> >> scenarios - with the emphasis on "end-to-end."
> >> The question of how flexible or easy to use a particular interface
> >> protocol is really needs to be assessed in the context of the full
> >> scenario that real-life users follow.
> >> I must admit that it is not clear to me how either ObsTAP/DataLink
> >> or SIAP fits into the various scenarios and what their effect would
> >> be on the total number of steps that users have to go through in
> >> order to get their data.
> >> And the issue is, of course, that there is no single use case
> >> scenario.
> >> There are users who will simply be interested in retrieving their,
> >> let's say, ALMA observations. How easy is it, and how many steps
> >> does it take, to get where they want to get, using the different
> >> protocols?
> >> Then there are users who will get there through the VAO Portal.
> >> And those who enter through Aladin, and so on.
> >> In how many scenarios do we envision users starting to query and
> >> retrieve data through IVOA protocols, and how well or how poorly
> >> does that work depending on which protocol is chosen?
> >> And how does that depend on the users' objectives?
> >> I would like to see flow diagrams for the different cases to get a
> >> better sense of the ramifications of choosing one protocol over
> >> another in the context of the larger picture of the full end-to-end
> >> scenario.
> >> Just quibbling over the relative merits of protocols in the limited
> >> context of their own characteristics does not address the real
> >> issues.
> >> We really need to focus on the users' perspective, minimizing steps
> >> and increasing protocols' ability to support intuitive use.
> >> If we don't do that, we relegate ourselves to irrelevancy.
> >> To complicate the issue further, it is, of course, not the
> >> user-friendliness of the protocol per se that matters. What really
> >> counts is the interface through which the users use the protocols.
> >> Which protocols make it easiest to develop user-friendly GUIs while
> >> at the same time supporting those who swear by the Command Line?
> >> Finally, a comment on one of my favorite subjects: distinguishing
> >> between the spectral and redshift/Doppler velocity axes.
> >> None of the protocols currently supports this and that is a problem.
> >> It means that users in their queries cannot indicate whether they
> >> are interested in multi-band image cubes or in cubes where the
> >> third axis is Doppler velocity; they cannot express whether they
> >> want spectra for, say, SED or line equivalent width analysis, or
> >> Doppler profiles.
> >> It is going to annoy users no end if they get offered large numbers
> >> of datasets that they are not interested in and thought they didn't
> >> ask for.
> >> And note that making this distinction allows one to construct
> >> hypercubes that contain Doppler velocity profiles in multiple
> >> spectral lines.
> >> Cheers,
> >>
> >>   - Arnold
> >>
> --------------------------------------------------------------------------------
>
>
> --
> Joseph Lazio
> Jet Propulsion Laboratory, California Institute of Technology
> M/S 138-308, 4800 Oak Grove Dr.          (o) +1-818-354-4198
> Pasadena, CA  91109                      (m) +1-626-390-5370
> Joseph.Lazio at jpl.nasa.gov
>
>