Cube Data Access Layer implementation note

Thu Mar 3 19:06:09 CET 2016

I understand what is being said about SODA being able to emit a datalink
service descriptor, but let's remember the architecture diagram from the
roadmap after the Hawaii interop, here:

http://wiki.ivoa.net/twiki/bin/view/IVOA/2013BRoadmap#Data_Access_Layer

We have renamed some items in there but the landscape is basically the same
and we are working on the most basic part of SODA (accessData "filter").
The solid lines show the most general case where a provider has all
service capabilities. The dashed lines show possible optimisations with one
being relevant to the current discussion:

*directly go from query to access*: the spec that provides the ability to
do this is the DataLink service descriptor; providers can include SODA
descriptors directly in the output of the data discovery *if* they can
provide ID values in that output that can be used with their SODA
implementation (note: we cannot due to a 1-n relation between publisher_did
and files)

So, you do not need to implement DataLink at all! With a simple collection,
one can have just SIA-query and SODA and satisfy the use cases.

As for a SODA service being able to emit a service descriptor, the
DataLink spec does make a comment about this (service descriptors with
ID="this"). The issue is that SODA services by-and-large will not be
returning VOTable, so such a request would have to be

ID=<publisher_did>
RESPONSEFORMAT=application/x-votable+xml
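
To make the shape of such a request concrete, here is a minimal sketch of how
a client might build it. The base URL and the function name are hypothetical,
and RESPONSEFORMAT on a SODA endpoint is only the idea under discussion here,
not something the current draft defines.

```python
from urllib.parse import urlencode

# Hypothetical SODA endpoint, used only for illustration.
SODA_BASE_URL = "http://example.org/soda/sync"


def descriptor_request_url(publisher_did, base_url=SODA_BASE_URL):
    """Build a request asking a SODA service for VOTable output.

    Assumes the service would accept RESPONSEFORMAT as discussed above;
    this is not part of the current SODA draft.
    """
    params = {
        "ID": publisher_did,
        "RESPONSEFORMAT": "application/x-votable+xml",
    }
    # urlencode percent-encodes ':', '/' and '+' in the values.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(descriptor_request_url("ivo://example/cube1"))
```

Note that the "+" in the media type has to be percent-encoded, otherwise the
server would decode it as a space.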

I don't think the "content=datalink" would be appropriate because that says
here is a table of links. So, is it a good idea to specify a VOTable output
format for SODA services? Will it be confusing when we try to use SODA for
spectra (timeseries, eventlist) and VOTable is a valid data format?

Is adding a separate endpoint (new capability, new standardID) better?

This sounds somewhat like the metadata capability that we postponed until
after the dot-0 versions of the minimal specs. It seems to me like an
increase in scope and there is too much uncertainty here... so IMO we
should be OK with TAP|SIA -> DataLink -> SODA or the optimised TAP|SIA ->
SODA. If a provider can't do the latter, it is only because they already
need a DataLink service.
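
As a rough sketch of how a client could choose between the two paths: look in
the discovery response for a service descriptor, that is a RESOURCE with
type="meta" and utype="adhoc:service" as in the examples quoted below. The
function names are hypothetical, and a real client would also check which
service the descriptor actually describes.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def service_descriptors(votable_xml):
    """Return the RESOURCE elements that look like service descriptors."""
    root = ET.fromstring(votable_xml)
    return [
        el for el in root.iter()
        if local_name(el.tag) == "RESOURCE"
        and el.get("type") == "meta"
        and el.get("utype") == "adhoc:service"
    ]


def access_path(discovery_votable_xml):
    """Pick the direct path if the discovery output carries a descriptor."""
    if service_descriptors(discovery_votable_xml):
        return "TAP|SIA -> SODA"
    return "TAP|SIA -> DataLink -> SODA"


if __name__ == "__main__":
    with_descriptor = (
        '<VOTABLE><RESOURCE type="results"/>'
        '<RESOURCE type="meta" utype="adhoc:service" ID="svc"/></VOTABLE>'
    )
    print(access_path(with_descriptor))
```

The check ignores the VOTable namespace on purpose, so it works on both the
bare examples in this thread and namespaced documents.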

my 2c,

Pat

On 3 March 2016 at 07:38, François Bonnarel <
francois.bonnarel at astro.unistra.fr> wrote:

> Hi again,
>
>
> On 03/03/2016 09:03, Markus Demleitner wrote:
>
>> Dear Laurent,
>>
>> On Wed, Mar 02, 2016 at 05:20:37PM +0100, Laurent Michel wrote:
>>
>>> Hi Markus
>>>
>>> Let's "try again with an explanation":
>>>
>>> Imagine the following fields in your DAL response (SIAV2 e.g.)
>>>
>>>   access_type=application/xml+votable;content=soda
>>>   access_url=http://my.soda?ID=xyz
>>>
>>> These 2 values are enough to run SODA with the common parameters
>>> (POS,TIME..).
>>> The relevant parameter ranges can likely be found within the
>>> current VOTable. No need for any explicit reference to any DM for
>>> this.
>>> If you want to refine the SODA description, just run
>>> http://my.soda?ID=xyz and get a classical service descriptor.
>>>
>> Ah -- but note that the difference between this and a normal datalink
>> endpoint is very small.  This SODA endpoint would return documents of
>> the structure
>>
>> <VOTABLE>
>>    <RESOURCE type="meta" utype="adhoc:service" ID="svc">
>>      [Service descriptor content]
>>    </RESOURCE>
>> </VOTABLE>
>>
>> whereas if we just re-used datalink as-is, the structure would be:
>>
>> <VOTABLE>
>>    <RESOURCE type="results">
>>      [Data links]
>>    </RESOURCE>
>>    <RESOURCE type="meta" utype="adhoc:service" ID="svc">
>>      [Service descriptor content]
>>    </RESOURCE>
>> </VOTABLE>
>>
>> If a service really does not have any data links, the [Data links]
>> thing above is very stereotypical; it's just a link with the
>> dataset's pubdid, a #proc semantics and a pointer to svc.  With a bit
>> of organisation, you could have the entire results RESOURCE in a
>> fixed string.
>>
>>> To me, the gain is twofold: 1) No need to operate a Datalink service to
>>> run a SODA service 2) Applying this scheme to DataLink response could
>>> (or not - up to you) save the duplication of the SODA service descriptor
>>> for each link.
>>> The (low) cost: pushing the parameter description from the Datalink
>>> service to this new and simple SODA capability.
>>>
>> So, ad (1): Saving the stereotypical results RESOURCE on the service
>> side seems a negligible benefit to me, in particular considering the
>> cost that we'd still have to define a new service interface (which
>> furthermore happens to be almost identical to datalink, except
>> perhaps that ID would be fixed to single-value from the start), a new
>> media type, and that the clients have to deal with two different
>> but very similar formats.
>>
> I support Laurent's proposal, which reminds me of what we had in SIAV1.0
> and SSA under the "FORMAT=METADATA" query. This answered by describing the
> input parameters of the service in a way which looks a lot like a modern
> service descriptor (apart from some archaisms).
>
> The main limitation was that FORMAT=METADATA could not be attached to a
> specific dataset and provide metadata valid only for it.
>
>> Ad (2): I cannot see a benefit in taking the service blocks out of
>> the datalink response proper -- you have to generate them in your
>> scheme, too; and why should it be preferable to serve them from an
>> extra endpoint than to just embed them into the main datalink
>> response?
>>
> The benefit will be to highlight the fact that DataLink is not only for
> exposing SODA services or dynamic standard services but also custom
> services and static links. A service description in the DataLink response
> is fine for custom services where we have absolutely no control over the
> way they can answer a self-description query. In the case of standard
> services that we currently define, as we do for SODA, we can still
> fine-tune this in the SODA service itself.
>
> The DataLink table is a little bit of "glue" (with little description of
> the links) between datasets and resources. "cutout" is only ONE among
> SIXTEEN semantic possibilities for the nature of the links. See
> http://www.ivoa.net/rdf/datalink/core/2014-10-30/datalink-core-2014-10-30.html
>
> We should avoid overloading DataLink responses with cutout details.
>
> Cheers
> François
>
>
>> I totally give you it is unfortunate that the datalink interface
>> leads clients to expect they can request multiple IDs, and you'd have
>> my vote to remove that any day.  But as Pat says, even with the
>> current spec you're allowed to discard extra IDs (at the expense of
>> having to give an overflow indicator, which may be a bit annoying but
>> is not dramatic).  Plus I expect clients won't use the multi-ID thing
>> anyway.
>>
>> And the multiple service blocks aren't horrible on the service side
>> anyway.  On the client side -- well, that's another matter, but I've
>> not formed an opinion there yet.
>>
>> So -- I'd still maintain we shouldn't increase the number of
>> endpoints.  And rather write clients.
>>
>> On that very matter of clients (or makeshift clients), I mention in
>> passing that I've extended my XSLT-to-datalink stylesheet to do basic
>> UI generation from service descriptors, including intervals.  You can
>> see the current state in action from the dlmeta links on
>>
>>
>> http://dc.zah.uni-heidelberg.de/califa/q2/cubesearch/form?__nevow_form__=genForm&_DBOPTIONS_ORDER=&MAXREC=100&_FORMAT=HTML&submit=Go
>>
>> (and I'm sure you can break the xtype=interval params in the current
>> implementation, which at this point are at the proof-of-concept
>> level; also, the port from atomic to interval parameters isn't quite
>> done everywhere yet).
>>
>> Implementors are welcome to use (and improve!) the stylesheet at
>>
>> https://github.com/msdemlei/datalink-xslt.git
>>
>> My next plan with this would be to use three-factor-semantics to
>> provide a cutout over a sky image when RA and DEC are present (if
>> someone has some javascript for that already, I'd be highly
>> interested) and to put in sliders where appropriate when the client
>> knows enough javascript to have "two-nosed" (lower and upper limit)
>> sliders with editable input boxes.  Again, contributions are highly
>> welcome -- I'm sure this kind of thing has been written many times
>> before.
>>
>> Cheers,
>>
>>             Markus
>>
>
>
> --
> =====================================================================
> François   Bonnarel           Observatoire Astronomique de Strasbourg
> CDS (Centre de données        UMR 7550 CNRS / Université de Strasbourg
> astronomiques de Strasbourg)  11, rue de l'Université
>                               F--67000 Strasbourg (France)
>     Tel: +33-(0)3 68 85 24 11     WWW:
> http://cdsweb.u-strasbg.fr/people/fb.html
> Fax: +33-(0)3 68 85 24 25     E-mail: francois.bonnarel at astro.unistra.fr
> ---------------------------------------------------------------------
>

-- 
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada