SODA, half-client and remaining gripe collection

Fri Apr 15 19:24:48 CEST 2016

Hi Markus,
    Thanks for providing this nice implementation.

    I have a few remarks.

    -   In your ObsTAP CALIFA collection, obs_creator_did is empty.
I guess the CALIFA creator is the Calar Alto observatory. Does that mean
the observatory does not intend to define a creator DID for its products? Or
is it just delayed for some reason?

    -   Second point: the obs_publisher_did for the CALIFA cubes in the ObsCore
table follows the template
ivo://org.gavo.dc/getproduct#califa/datadr2/?????.????.??????.fits
    But this publisher DID is not what is used to call the associated {links}
resource from DataLink. Instead, an identifier with the following
template is used:
ivo://org.gavo.dc/~?califa/datadr2/?????.?????.?????.fits

    It is perfectly allowed to call DataLink with an internal ID or
an ID different from the publisher DID.
    But here the ID used for DataLink also looks like a registered
obs_publisher_did, only a different one.
    So my question: do you have two different publisher DIDs for the same
product? And if so, why?

Best regards
François


On 29/03/2016 16:21, Markus Demleitner wrote:
> Dear Colleagues,
>
> Sorry that this is again a rather long mail, but it should be the last
> of these monsters on SODA for a while.  Also, it's split into two
> sections, the first of which I'd ask everyone with some interest in SODA
> to at least have a glance at.  The rest is for the nerds.
>
>
> I. SODA Prototype
> -----------------
>
> So, building on the XSLT hack for datalink I reported on in Sydney (see
> below for details) I've now done a makeshift SODA client so you can try
> out how these things could work (and see where we should work on the
> client side).
>
> To follow these instructions, start TOPCAT and Aladin.
>
> (1) Discovery
>
> We're using obscore; so, in TOPCAT open the VO/TAP dialog, double click
> the "GAVO DC TAP" entry you're seeing.
>
> Use
>
>    select * from ivoa.obscore
>    where
>    dataproduct_type='cube'
>    and obs_collection='CALIFA'
>
> as your query.  Or, really, anything that'll return one of these cubes
> (also see below, at (4) Other data).
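For anyone who would rather script the discovery step than click through TOPCAT, here is a minimal Python sketch that only builds a synchronous TAP request for the query above. The endpoint URL is an assumption on my side (check the Registry for the current one). REQUEST, LANG, FORMAT and QUERY are the standard TAP parameters. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Assumed sync endpoint of the GAVO DC TAP service; verify before use.
TAP_SYNC_URL = "http://dc.g-vo.org/tap/sync"

ADQL = """select * from ivoa.obscore
where dataproduct_type='cube'
and obs_collection='CALIFA'"""


def build_tap_sync_request(query, fmt="votable"):
    """Return the form-encoded parameters of a synchronous TAP query."""
    return urlencode({
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": fmt,
        "QUERY": query,
    })


body = build_tap_sync_request(ADQL)
# The resulting URL can be fetched with any HTTP client.
print(TAP_SYNC_URL + "?" + body)
```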
>
> (2) SODA
>
> There's not yet a cube-enabled SODA client, so I'm using a bit of XSLT
> and javascript that lets you use SODA results in the browser.  To use
> it, in TOPCAT's main window click "Activation action" and check "View
> URL as a Web Page".  Select "access_url" as "Web Page Location column"
> and "system browser" (or anything but "basic browser") as the Browser
> type.  "Ok" the dialog.
>
> Now open a plot or table display for your obscore result.  When you
> click on a row or point, a web browser page opens with a SODA dialog.
> If you have javascript disabled, you can use a conventional form
> interface, where the thing at least tells you what you can enter.  This
> particular service supports BAND and POS from the current SODA draft
> parameters, and in addition RA and DEC (which I claim we can't really do
> without, see the nerd section).
>
> There are also some additional parameters in there that might be up for
> later standardisation -- don't worry about them now.  Up to now, that's
> all datalink.
>
> If you enable Javascript, you'll get some SODA magic:
>
> (a) there's a spatial cutout overlaid on a sky preview courtesy of the
> Aladin image server.  Use a click-and-drag rubberband to determine your
> spatial cutout.
>
> (b) You'll get a custom BAND widget that lets you do the cutouts in your
> chosen units.  For instance, to get the vicinity of H alpha, switch to
> Ångström and enter 6560 and 6564 there (or something else if you got a
> cube that doesn't have H alpha -- you'll see that readily from the
> limits).  Yes, the formatting of the limits in that widget is suboptimal
> at this point.  Visual improvements forthcoming.
>
> Notice how changing these "custom" widgets updates the SODA parameters
> (in the case of the spatial widget, I could change POS, but since I have
> RA and DEC anyway, I use those).
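What the BAND widget has to do underneath can be sketched in a few lines of Python. This assumes, as in the SODA draft as I read it, that BAND is a wavelength interval in metres. The function name is made up for illustration and is not taken from the actual javascript.

```python
ANGSTROM_IN_METRES = 1e-10


def band_from_angstrom(lower, upper):
    """Format a BAND value (metres, assumed) from limits in Ångström."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    lo_m = lower * ANGSTROM_IN_METRES
    hi_m = upper * ANGSTROM_IN_METRES
    # repr() keeps the full double precision of both limits
    return f"{lo_m!r} {hi_m!r}"


# The H alpha example from above: 6560 to 6564 Ångström
band = band_from_angstrom(6560, 6564)
print("BAND=" + band)
```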
>
> (3) Retrieval
>
> Hit "broadcast dataset via SAMP" and inspect your cutout in Aladin (or
> DS9, for that matter).  Or hit "Retrieve data" to download the cutout.
>
>
> That's about it; no, I'm not claiming the XSLT-thing is more than a
> proof-of-concept client.  But still, I think you can get an idea how the
> pieces can fit together.
>
>
> (4) Other data
>
> This happens to be useful for other data, too.  For instance, I have
> this collection of plate scans that are a Gigabyte a pop.  With SIAP, I
> only handed out cutouts, but that's not an option with obscore, so I've
> always been a bit worried.  Now, I'm handing out datalinks for those.
> So see how things work out for these images (in effect, degenerate
> cubes), try a discovery query like
>
>    select * from ivoa.obscore
>    where
>    t_min<gavo_to_mjd('1925-01-01')
>    and target_name='M42'
>    order by t_min desc
>
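In case the t_min constraint looks opaque: gavo_to_mjd is a server-side function that, as far as I can tell from its name and use here, turns a timestamp into a Modified Julian Date. A rough client-side equivalent for plain dates, as a Python sketch (the function name is hypothetical):

```python
from datetime import date

MJD_EPOCH = date(1858, 11, 17)  # MJD 0 is 1858-11-17, 00:00 UTC


def iso_date_to_mjd(iso_date):
    """Return the MJD at midnight of an ISO date like '1925-01-01'."""
    return (date.fromisoformat(iso_date) - MJD_EPOCH).days


print(iso_date_to_mjd("1925-01-01"))  # prints 24151
```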
> If you don't want to configure the activation action, you can simply cut
> the access URL in TOPCAT with a quadruple click and paste it somewhere
> else.
>
> And that concludes the part for the general public.  Nerds, please read
> on.
>
>
> II. Technics
> ------------
>
> (1) The stylesheet
>
> First off, the stylesheet I'm using is public.  I've put it on github
> and I'd like it very much if we could share development there; even if I
> don't hope datalink in the browser is going to be a big thing, it's
> certainly a good thing to have.  So, please go ahead and
>
>    git clone https://github.com/msdemlei/datalink-xslt
>
> It should already work for other datalink services.  In principle,
> prepending something like
> <?xml-stylesheet href='/static/xsl/datalink-to-html.xsl' type='text/xsl'?>
> should be enough; but there's a same-origin policy for XSLT, too, so
> you'll have to have a local checkout.
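A service that wants to try this could prepend the processing instruction roughly as follows. This is a string-level Python sketch under the assumption that the datalink response is available as text. It ignores byte order marks and other encoding subtleties, and the function name is invented for the example.

```python
STYLESHEET_PI = (
    "<?xml-stylesheet href='/static/xsl/datalink-to-html.xsl'"
    " type='text/xsl'?>")


def add_stylesheet(votable_text, pi=STYLESHEET_PI):
    """Insert the stylesheet instruction ahead of the root element."""
    if votable_text.startswith("<?xml "):
        # An XML declaration must stay first, so insert right behind it.
        end = votable_text.index("?>") + 2
        return votable_text[:end] + "\n" + pi + votable_text[end:]
    return pi + "\n" + votable_text


print(add_stylesheet("<?xml version='1.0'?>\n<VOTABLE/>"))
```

The href has to point to a copy of the stylesheet on the same origin as the datalink document, for the reason given above.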
>
> For the javascript-based SODA, some javascript is pulled in that ideally
> should come from your site, too.  How to parameterise all that and how
> to pack up a distribution are things I'd love to work out with you.  As
> is, the javascript retrieval will fail, but that's mainly a matter of
> figuring out where to sensibly put all these resources.
>
> Also, you'll need to arrange that browsers see the text/xml media type
> for them to apply the stylesheet (see
> http://wiki.ivoa.net/internal/IVOA/InteropOct2015DAL/datalink-xslt.pdf)[1].
>
>
> (2) Standards issues
>
> I guess I won't be able to spend as much time on SODA as I have in
> January and February until Cape Town, so contrary to my original plan
> I'll dump the remaining major gripes here all in one go.  So, the big
> one really is:
>
>
> (3) RA and DEC
>
> The stylesheet really needs to know where it is looking in order to
> generate a UI.  The easiest way to achieve that is by adding RA and DEC
> parameters.  True, this is a bit tricky at the stitching line (easily
> solved by allowing RA<0) and at the pole, which is ugly, but unavoidable
> with POS, too, since RA and DEC is essentially just a rationalisation of
> POS' RANGE.
>
> The alternative would be Pat's CIRC and POLY from his Feb 29th mail
> http://mail.ivoa.net/pipermail/dal/2016-March/007370.html, except that
>
>> Pat:
>> For CIRC and POLY the service includes a "maximum sensible extent" with
>> which to perform cutouts.
> I'm not sure I care a lot about the maximum sensible extents.  The
> circle, however, would let you say where there is data, just as for
> RA/DEC.  So, if there's really a good reason against RA and DEC and they
> don't make it to the standard, I could see an escape by giving actual
> per-element limits in CIRCLE, perhaps like this:
>
>      <PARAM name="CIRC" datatype="double" ucd="obs.field"
>        unit="deg" xtype="circle" arraysize="3" value="">
>        <VALUES>
>          <MIN value="230.4 32.1 0.1" />
>          <MAX value="236.4 34.7 3.5" />
>        </VALUES>
>      </PARAM>
>
> -- which would say "this is a dataset covering RA 230.4 .. 236.4, DEC
> 32.1 .. 34.7, and you can choose cutout radii between 0.1 and 3.5".
> Which would be ok with me, except we're probably violating VOTable.
> I think all existing practice is that MIN and MAX are not array-valued but
> give the values for the individual items of an array (not sure if the
> standard itself is precise enough there).
>
> Be that as it may, I'd still much prefer
>
>        <PARAM arraysize="2" datatype="double" name="DEC" ucd="pos.eq.dec"
>          unit="deg" value="" xtype="interval">
>          <DESCRIPTION>The latitude coordinate</DESCRIPTION>
>          <VALUES>
>            <MIN value="32.4"/>
>            <MAX value="34.7"/>
>          </VALUES>
>        </PARAM>
>        <PARAM arraysize="2" datatype="double" name="RA" ucd="pos.eq.ra"
>          unit="deg" value="" xtype="interval">
>          <DESCRIPTION>The longitude coordinate</DESCRIPTION>
>          <VALUES>
>            <MIN value="32.1"/>
>            <MAX value="34.7"/>
>          </VALUES>
>        </PARAM>
>
> -- which uses VOTable as in common practice and otherwise requires zero
> additional definitions.  Also, it's probably what 95% of users and
> implementors would ask for anyway.
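To illustrate how little a client needs in order to use such declarations, here is a Python sketch that pulls the limits out of the two PARAMs quoted above. The enclosing GROUP element and the missing XML namespace are simplifications of mine. A real datalink document declares the VOTable namespace, and the lookups would have to account for that.

```python
import xml.etree.ElementTree as ET

# The RA/DEC declarations from above, wrapped in a bare GROUP (assumption).
PARAMS = """<GROUP>
  <PARAM arraysize="2" datatype="double" name="DEC" ucd="pos.eq.dec"
    unit="deg" value="" xtype="interval">
    <VALUES><MIN value="32.4"/><MAX value="34.7"/></VALUES>
  </PARAM>
  <PARAM arraysize="2" datatype="double" name="RA" ucd="pos.eq.ra"
    unit="deg" value="" xtype="interval">
    <VALUES><MIN value="32.1"/><MAX value="34.7"/></VALUES>
  </PARAM>
</GROUP>"""


def interval_limits(group):
    """Map the name of each interval PARAM to its (min, max) limits."""
    limits = {}
    for param in group.iter("PARAM"):
        if param.get("xtype") != "interval":
            continue
        lo = param.find("VALUES/MIN")
        hi = param.find("VALUES/MAX")
        if lo is None or hi is None:
            continue  # no limits declared, nothing to build a widget from
        limits[param.get("name")] = (
            float(lo.get("value")), float(hi.get("value")))
    return limits


limits = interval_limits(ET.fromstring(PARAMS))
print(limits)
```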
>
> Any elucidation as to why we shouldn't just do it would be appreciated.
>
>
> (4) the async/multiple param thing
>
> I've already said something about having multiple values for a single
> param before, but as Pat mentioned it in his mail
> http://mail.ivoa.net/pipermail/dal/2016-March/007372.html let me just
> briefly comment on one thing I couldn't help thinking:
>
>> Pat wrote:
>>
>> input file I can use with curl to post multiple positional cutouts (note
>> the mix of POS, CIRC, and POLY):
>>
>> ===multicut.txt===
>> POS=circle 140 0 0.1&
>> POS=circle 66 10 20&
> [...]
>
> Hm... If you found yourself writing your parameters to a file anyway --
> why not then do the multi-cutout by just writing one set of parameters
> per VOTable line -- it'd be much simpler and more predictable all around
> -- and really not hard to implement (with some trivial conventions).
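A rough Python sketch of the one-row-per-cutout idea, with a plain list of dictionaries standing in for the rows of an uploaded VOTable. The dataset ID is hypothetical, and how such a table would actually be uploaded and interpreted is exactly the kind of convention that would still have to be agreed on.

```python
from urllib.parse import urlencode

# Hypothetical dataset identifier, for illustration only.
DATASET_ID = "ivo://example.org/~?some/dataset.fits"

# One set of cutout parameters per row; values taken from Pat's example.
rows = [
    {"POS": "circle 140 0 0.1"},
    {"POS": "circle 66 10 20"},
]


def requests_for(dataset_id, rows):
    """Return one form-encoded parameter set per table row."""
    return [urlencode({"ID": dataset_id, **row}) for row in rows]


queries = requests_for(DATASET_ID, rows)
for query in queries:
    print(query)
```

Each row yields exactly one request, so there is no question of which POS combines with which other parameter.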
>
> But anyway: Perhaps we should defer the whole multiple param thing until
> the next version?  I see some urgency for enumerated params, but perhaps
> even that can still wait.
>
>
> (5) @value and @ref semantics documented for PARAMs
>
> I've found the inclusion of "constant" params makes my life a bit easier.
> So, in my service there's
>
>    <PARAM name="ID" [...]
>         value="ivo://org.gavo.dc/~?lswscans/data/part2/Walz/FITS/D774.fits">
>         <DESCRIPTION>The publisher DID of the dataset of interest</DESCRIPTION>
>   </PARAM>
>
> with the understanding that clients should just pass that PARAM on
> (the XSLT does, in an input type="hidden").
>
> One could do this by manipulating the accessURL, but I'd feel a bit
> better if we were explicit about these things being params -- e.g., it
> might be a nice thing to show informationally in UIs.
>
> Also, I believe the PARAM/@ref semantics that's introduced in datalink
> somewhat offhandedly should be defined a bit more precisely.  Perhaps
> this would be an erratum to Datalink.  I guess we would need to say
> something like
>
>    If PARAM/@ref points to a FIELD, the parameter's values must be taken
>    from the corresponding column of the table embedding the FIELD.  An
>    appropriate widget type in a UI could be a select box.
>
>    [<Do we want this?> If PARAM/@ref points to another PARAM, that param's
>    value is simply copied literally]
>
>    A PARAM/@ref pointing to any other VOTable element is an error.
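The three proposed rules are easy to implement on the client side. Below is a Python sketch against a small, made-up, namespace-free document. All element IDs and values are invented, and the PARAM-to-PARAM branch implements the bracketed rule that is still marked as an open question above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; real VOTables carry an XML namespace.
DOC = """<RESOURCE>
  <TABLE>
    <FIELD ID="kinds" name="kind" datatype="char" arraysize="*"/>
    <DATA><TABLEDATA>
      <TR><TD>flux</TD></TR>
      <TR><TD>error</TD></TR>
    </TABLEDATA></DATA>
  </TABLE>
  <GROUP ID="grp" name="inputParams">
    <PARAM ID="dsid" name="ID" datatype="char" arraysize="*" value="some-id"/>
    <PARAM name="KIND" datatype="char" arraysize="*" value="" ref="kinds"/>
    <PARAM name="COPY" datatype="char" arraysize="*" value="" ref="dsid"/>
    <PARAM name="BAD" datatype="char" arraysize="*" value="" ref="grp"/>
  </GROUP>
</RESOURCE>"""


def resolve_ref(root, param):
    """Return the list of values a PARAM with @ref may take."""
    ref = param.get("ref")
    target = next((el for el in root.iter() if el.get("ID") == ref), None)
    if target is None:
        raise ValueError(f"dangling ref {ref!r}")
    if target.tag == "PARAM":
        # Open question in the text: copy the other PARAM's value literally.
        return [target.get("value")]
    if target.tag == "FIELD":
        # Take the values from the column of the table embedding the FIELD.
        for table in root.iter("TABLE"):
            fields = table.findall("FIELD")
            if target in fields:
                col = fields.index(target)
                return [tr.findall("TD")[col].text for tr in table.iter("TR")]
    raise ValueError(f"ref {ref!r} must point to a FIELD or a PARAM")


root = ET.fromstring(DOC)
params = {p.get("name"): p for p in root.iter("PARAM")}
print(resolve_ref(root, params["KIND"]))  # candidate entries for a select box
```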
>
>
> (6) @value="" universally valid
>
> VOTable 1.3 says "If the TD element is empty (<TD/> or <TD></TD>) the
> cell is considered to contain no data, i.e. to be null."  Since we're
> saying "parse PARAM/@value like you parse TABLEDATA cells" (I really
> think VOTable itself should say this in sect. 4.1, last bullet point), I
> *think* we're fine saying "Generate your UI from all the PARAMs that
> have an empty @value" (which, in effect, we're doing right now and we'd
> certainly be doing if we adopt (5)).  I'd much prefer if that were made
> explicit somewhere.  Bug me for a doc patch if everyone interested agrees.
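Taken together with (5), the rule could look like this on the client side. This is a Python sketch, not what the XSLT actually does. The GROUP below is a namespace-free simplification that reuses the ID value quoted in (5): PARAMs with an empty value become editable inputs, everything else is passed on as a hidden field.

```python
from html import escape
import xml.etree.ElementTree as ET

GROUP = """<GROUP name="inputParams">
  <PARAM name="ID" datatype="char" arraysize="*"
    value="ivo://org.gavo.dc/~?lswscans/data/part2/Walz/FITS/D774.fits"/>
  <PARAM name="RA" datatype="double" arraysize="2" xtype="interval" value=""/>
  <PARAM name="DEC" datatype="double" arraysize="2" xtype="interval" value=""/>
</GROUP>"""


def form_inputs(group):
    """Return one HTML input element per PARAM in the group."""
    lines = []
    for param in group.iter("PARAM"):
        name = escape(param.get("name"), quote=True)
        value = param.get("value", "")
        if value == "":
            # Empty value: the user is expected to fill this in.
            lines.append(f'<input type="text" name="{name}">')
        else:
            # Constant param: propagate it unchanged.
            safe = escape(value, quote=True)
            lines.append(
                f'<input type="hidden" name="{name}" value="{safe}">')
    return lines


inputs = form_inputs(ET.fromstring(GROUP))
print("\n".join(inputs))
```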
>
>
> (7) Behaviour for queries with only ID given
>
> I think we should leave open at this point what should happen if a SODA
> service is called with an ID only.  While people have suggested it
> should return some sort of "self-description", I'd say it doesn't make
> sense to replicate what datalink already does (to at least my
> close-to-perfect satisfaction), and I don't think it makes much
> sense to run both SODA and datalink on the same endpoint (in which case
> you'd have that behaviour already).
>
> After these considerations, my services just return the full dataset
> when you just pass in ID, which I think is the behaviour one would
> naively expect ("no constraints").  But we might still find that's
> actually not such a good idea.
>
> Hence, I'd say we explicitly say "This is undefined behaviour at this
> point.  Clients must not have any expectations as to what is returned."
>
> Same thing for requests missing all parameters.  I doubt there's
> something smart we can do with that, but who knows?  Let's just keep it
> undefined for now.
>
> So -- that's it from me.  There's nothing else really relevant up my
> sleeve on SODA.
>
> Ain't that nice?
>
>         -- Markus
>
>
>
> [1] There's another technicality: client-side XSLT plus DOM operations
> plus javascript is a mixture which current browsers tend to be a bit
> buggy with.  It's a bit like in the 90s with Javascript alone.  I
> therefore do server-side XSLT right now if I guess a Datalink request
> comes from a browser.  I think with a bit of experimentation and
> restriction to known-good parts we can push XSLT to the client again,
> but I'll invest that work only if I know others will re-use the result.
>
