Cube Data Access Layer implementation note

Laurent Michel laurent.michel at astro.unistra.fr
Fri Mar 4 11:48:48 CET 2016


Hello, Markus and DALers,


On 03/03/2016 09:03, Markus Demleitner wrote:
> Dear Laurent,
>
> On Wed, Mar 02, 2016 at 05:20:37PM +0100, Laurent Michel wrote:
>> Hi Markus
>>
>> Let's "try again with an explanation":
>>
>> Imagine the following fields in your DAL response (SIAV2 e.g.)
>>
>>   access_type=application/xml+votable;content=soda
>>   access_url=http://my.soda?ID=xyz
>>
>> These 2 values are enough to run SODA with the common parameters (POS, TIME...).
>> The relevant parameter ranges can likely be found within the
>> current VOTable. No explicit reference to any DM is needed for
>> this.
>> If you want to refine the SODA description, just run
>> http://my.soda?ID=xyz and get a classical service descriptor.
>
> Ah -- but note that the difference between this and a normal datalink
> endpoint is very small.  This SODA endpoint would return documents of
> the structure
Exactly, and that is deliberate: I want SODA to work both independently of Datalink and embedded in it.
>
> <VOTABLE>
>    <RESOURCE type="meta" utype="adhoc:service" ID="svc">
>      [Service descriptor content]
>    </RESOURCE>
> </VOTABLE>
>
> whereas if we just re-used datalink as-is, the structure would be:
>
> <VOTABLE>
>    <RESOURCE type="results">
>      [Data links]
>    </RESOURCE>
>    <RESOURCE type="meta" utype="adhoc:service" ID="svc">
>      [Service descriptor content]
>    </RESOURCE>
> </VOTABLE>
>
> If a service really does not have any data links, the [Data links]
> thing above is very stereotypical; it's just a link with the
> dataset's pubdid, a #proc semantics and a pointer to svc.  With a bit
> of organisation, you could have the entire results RESOURCE in a
> fixed string.
This is technically feasible, but the result would likely be confusing (convoluted?), and standardising it would
certainly take a long time.
My self-description proposal is certainly both more elegant and more robust.
>
>> To me, the gain is twofold: 1) no need to operate a Datalink service to run a
>> SODA service; 2) applying this scheme to Datalink responses could (or not, up
>> to you) save duplicating the SODA service descriptor for each link.
>> The (low) cost: moving the parameter description from the Datalink service to this new and simple SODA capability.
>
> So, ad (1): Saving the stereotypical results RESOURCE on the service
> side seems a negligible benefit to me, in particular considering the
> cost that we'd still have to define a new service interface (which
> furthermore happens to be almost identical to datalink, except
> perhaps that ID would be fixed to single-value from the start), a new
> media type, and that the clients have to deal with two different
> but very similar formats.

Once your client can deal with a {RESOURCE}, it can do so in either a SODA response or a DL response.
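
A minimal Python sketch of that point, assuming the two response layouts quoted earlier in this message. The sample documents, the `accessURL` PARAM content, and the helper names are illustrative assumptions, not the output of any real service. The same extraction routine handles a SODA-only response and a Datalink response, because it only looks for RESOURCE elements carrying utype="adhoc:service".

```python
import xml.etree.ElementTree as ET

# Illustrative SODA self-description response: service descriptor only.
SODA_DOC = """<VOTABLE>
  <RESOURCE type="meta" utype="adhoc:service" ID="svc">
    <PARAM name="accessURL" datatype="char" arraysize="*" value="http://my.soda"/>
  </RESOURCE>
</VOTABLE>"""

# Illustrative Datalink response: a results RESOURCE plus the same descriptor.
DATALINK_DOC = """<VOTABLE>
  <RESOURCE type="results"/>
  <RESOURCE type="meta" utype="adhoc:service" ID="svc">
    <PARAM name="accessURL" datatype="char" arraysize="*" value="http://my.soda"/>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag):
    """Strip an XML namespace prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def service_descriptors(votable_text):
    """Return the adhoc:service RESOURCE elements found in a VOTable document."""
    root = ET.fromstring(votable_text)
    return [
        el for el in root.iter()
        if local_name(el.tag) == "RESOURCE" and el.get("utype") == "adhoc:service"
    ]


def access_urls(votable_text):
    """Collect the accessURL PARAM value of every service descriptor."""
    urls = []
    for resource in service_descriptors(votable_text):
        for child in resource:
            if local_name(child.tag) == "PARAM" and child.get("name") == "accessURL":
                urls.append(child.get("value"))
    return urls


if __name__ == "__main__":
    # Both layouts go through the same code path.
    print(access_urls(SODA_DOC))
    print(access_urls(DATALINK_DOC))
```

The results RESOURCE is simply ignored here; a full Datalink client would of course also read the links table.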
>
> Ad (2): I cannot see a benefit in taking the service blocks out of
> the datalink response proper -- you have to generate them in your
> scheme, too; and why should it be preferable to serve them from an
> extra endpoint than to just embed them into the main datalink
> response?
In some situations there is no working Datalink service available (see the use case below). I think it is important to
leave implementers free to implement both SODA and DL, or just SODA, or just DL.
>
> I totally give you it is unfortunate that the datalink interface
> leads clients to expect they can request multiple IDs, and you'd have
> my vote to remove that any day.  But as Pat says, even with the
> current spec you're allowed to discard extra IDs (at the expense of
> having to give an overflow indicator, which may be a bit annoying but
> is not dramatic).  Plus I expect clients won't use the multi-ID thing
> anyway.
I agree with you: this discussion would never have happened if multiple IDs were prohibited.
Note that if I could rewrite history, self-description would be a capability of ad-hoc services :=)
But saying "The solution proposed by the standard for dealing with multiple IDs is not totally clean, but clients won't use
this pattern anyway" is not satisfactory to me.
>
> And the multiple service blocks aren't horrible on the service side
> anyway.  On the client side -- well, that's another matter, but I've
> not formed an opinion there yet.
On the client side, if you want to build a nice interface you have to do some inference on the {RESOURCES}, which to me is a
sign of a weakness in the standard.
>
> So -- I'd still maintain we shouldn't increase the number of
> endpoints.  And rather write clients.
You are right, I'm adding an endpoint (although strictly speaking it remains the SODA endpoint with another key after the ?), but at the
same time I exempt some service providers from implementing a Datalink service. So the final balance is rather positive (see
USECASE below).
>
> On that very matter of clients (or makeshift clients), I mention in
> passing that I've extended my XSLT-to-datalink stylesheet to do basic
> UI generation from service descriptors, including intervals.  You can
> see the current state in action from the dlmeta links on
>
> http://dc.zah.uni-heidelberg.de/califa/q2/cubesearch/form?__nevow_form__=genForm&_DBOPTIONS_ORDER=&MAXREC=100&_FORMAT=HTML&submit=Go
>
> (and I'm sure you can break the xtype=interval params in the current
> implementation, which at this point are at the proof-of-concept
> level; also, the port from atomic to interval parameters isn't quite
> done everywhere yet).
>
> Implementors are welcome to use (and improve!) the stylesheet at
>
> https://github.com/msdemlei/datalink-xslt.git
>
> My next plan with this would be to use three-factor-semantics to
> provide a cutout over a sky image when RA and DEC are present (if
> someone has some javascript for that already, I'd be highly
> interested) and to put in sliders where appropriate when the client
> knows enough javascript to have "two-nosed" (lower and upper limit)
> sliders with editable input boxes.  Again, contributions are highly
> welcome -- I'm sure this kind of thing has been written many times
> before.
What you are describing here is very similar to what I've implemented in TapHandle, but in pure JS:
http://saada.unistra.fr/taphandle => gavo => ivoa.obscore => access_url
No problem, just some pain, as long as we have just one ID in the DL response.

USECASE:
I'm a HiPS provider.
My HiPS have progenitors.
I want to offer a cutout service on those progenitors.
This SODA service is declared in the registry as one of my service capabilities, so people know it exists before connecting to my
HiPS service: no need to discover it in DAL responses.
* With the scheme I'm proposing, when a user gets a progenitor, they just invoke SODAURL?ID=progenitorid to get both the parameter
descriptions and ranges, and then run the cutout.
* With the current scheme, I have to implement and support a DL service in addition to my cutout service, just to carry the
description of another service (SODA).
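
A short Python sketch of the client side of this use case, under one reading of the proposed scheme: the same SODA endpoint is called with ID alone to fetch the description, and with ID plus cutout parameters to run the cutout. The base URL, the identifier, and the POS value are placeholders, and `soda_url` is a hypothetical helper, not part of any existing library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint, standing in for the registered SODA capability.
SODA_URL = "http://my.soda"


def soda_url(base, dataset_id, **params):
    """Build a request URL for the SODA endpoint.

    With no extra parameters, the URL asks for the self-description of the
    dataset (assumed behaviour of the proposed scheme); with parameters such
    as POS, it asks for the cutout itself.
    """
    query = {"ID": dataset_id, **params}
    return f"{base}?{urlencode(query)}"


if __name__ == "__main__":
    # Step 1: fetch parameter descriptions and ranges for one progenitor.
    describe = soda_url(SODA_URL, "progenitorid")
    # Step 2: run the cutout once suitable values are known (placeholder POS).
    cutout = soda_url(SODA_URL, "progenitorid", POS="CIRCLE 10 20 0.5")
    print(describe)
    print(parse_qs(urlsplit(cutout).query))
```

No Datalink request appears anywhere in this flow; the client only needs the SODA URL from the registry and the progenitor identifier.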


Cheers
LM
-- 
jesuischarlie

Laurent Michel
SSC XMM-Newton
Tel: +33 (0)3 68 85 24 37
Fax: +33 (0)3 68 85 24 32
laurent.michel at astro.unistra.fr <mailto:laurent.michel at astro.unistra.fr>
Université de Strasbourg <http://www.unistra.fr>
Observatoire Astronomique
11 Rue de l'Université
F - 67200 Strasbourg
http://amwdb.u-strasbg.fr/HighEnergy/spip.php?rubrique34

