datalink questions

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Sep 2 10:00:00 CEST 2015


Hi Tom, hi list,

On Mon, Aug 31, 2015 at 09:29:25AM -0400, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
> I was on vacation last week, so sorry for the delay in
> responding...  I'm still pretty confused but I've tried to clarify
> my issues a little.  The biggest help would be a pointer to active
> services that use both the service_def and which have multiple
> fields used to point to the link.

I'm not aware of a service that uses multiple fields, and I don't
have a use case for anything like it.  Now that we have the feature,
it'd be great if there was such a service, though, so client
implementors failing to do what the spec says have a service they'll
fail on.

However, conceptually it wouldn't be very hard.  Consider, for
example,

http://dc.zah.uni-heidelberg.de/feros/q/ssa/ssap.xml?REQUEST=queryData&MAXREC=1

(you'll want an XML pretty-printer for that).

There are two datalink resources in there.  One is:

  <RESOURCE type="meta" utype="adhoc:service">
    <GROUP name="inputParams">
      <PARAM arraysize="*" datatype="char" name="ID" ref="ssa_pubDID" 
        ucd="meta.id;meta.main" value=""/>
    </GROUP>
    <PARAM arraysize="*" datatype="char" name="standardID" 
      value="ivo://ivoa.net/std/DataLink#links-1.0"/>
    <PARAM arraysize="*" datatype="char" name="accessURL" 
      value="http://dc.zah.uni-heidelberg.de/feros/q/sdl/dlmeta"/>
  </RESOURCE>

(I've removed a LINK that's in the live version for reasons of
backward compatibility).

As the standard id says, that points to the datalink service itself.
It has one parameter (as behooves a datalink service), ID.  While you
could probably add more parameters here, too, possibly even with ref,
your service would still have to work when given ID alone, as that's
what datalink wants.
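
A minimal sketch of how a client might read such a descriptor, using
only Python's standard library.  It assumes the RESOURCE has been cut
out of the response without an XML namespace, as in the snippet above;
in a live VOTable the elements are namespaced and the lookups would
need adjusting.  The function name read_descriptor is made up for this
illustration.

```python
import xml.etree.ElementTree as ET

# The descriptor quoted above, without the LINK that was removed there.
DESCRIPTOR = """
<RESOURCE type="meta" utype="adhoc:service">
  <GROUP name="inputParams">
    <PARAM arraysize="*" datatype="char" name="ID" ref="ssa_pubDID"
      ucd="meta.id;meta.main" value=""/>
  </GROUP>
  <PARAM arraysize="*" datatype="char" name="standardID"
    value="ivo://ivoa.net/std/DataLink#links-1.0"/>
  <PARAM arraysize="*" datatype="char" name="accessURL"
    value="http://dc.zah.uni-heidelberg.de/feros/q/sdl/dlmeta"/>
</RESOURCE>
"""


def read_descriptor(xml_text):
    """Return (access_url, standard_id, input_params) for one descriptor.

    input_params maps each input parameter name to (ref, value); ref is
    None when the PARAM is not bound to a table column.
    """
    resource = ET.fromstring(xml_text)
    # PARAMs that are direct children of RESOURCE describe the service.
    meta = {p.get("name"): p.get("value") for p in resource.findall("PARAM")}
    inputs = {}
    for group in resource.findall("GROUP"):
        if group.get("name") == "inputParams":
            for p in group.findall("PARAM"):
                inputs[p.get("name")] = (p.get("ref"), p.get("value"))
    return meta["accessURL"], meta["standardID"], inputs


access_url, standard_id, inputs = read_descriptor(DESCRIPTOR)
print(access_url)
print(standard_id)
print(inputs)
```

The only input parameter found is ID, bound to ssa_pubDID through its
ref attribute.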

And then there's 

  <RESOURCE ID="apudntihmadn" type="meta" utype="adhoc:service">
    [...]
    <GROUP name="inputParams">
      <PARAM arraysize="*" datatype="char" name="ID" ref="ssa_pubDID" 
          ucd="meta.id;meta.main" value="">
        <DESCRIPTION>The publisher DID of the dataset of interest</DESCRIPTION>
        <LINK content-role="ddl:id-source" value="#ssa_pubDID"/>
      </PARAM>
      <PARAM arraysize="*" datatype="char" name="FLUXCALIB" ucd="phot.calib" 
        utype="ssa:Char.FluxAxis.Calibration" value="">
        <DESCRIPTION>Recalibrate the spectrum.  Right now, the only 
          recalibration supported is max(flux)=1 ('RELATIVE').</DESCRIPTION>
        <VALUES>
          <OPTION name="RELATIVE" value="RELATIVE"/>
          <OPTION name="UNCALIBRATED" value="UNCALIBRATED"/>
        </VALUES>
      </PARAM>
      [...]
    </GROUP>
    <PARAM arraysize="*" datatype="char" name="accessURL" ucd="meta.ref.url" 
      value="http://dc.zah.uni-heidelberg.de/feros/q/sdl/dlget"/>
    <PARAM arraysize="*" datatype="char" name="standardID" 
      value="ivo://ivoa.net/std/SSDP#sync"/>
  </RESOURCE>

(where I've made up the standard id for what's now called AccessData
and what should, I believe, really be called Server Side Data
Processing or something else less generic).

So, that's a service that lets you do cutouts, recalibrations, and
similar operations.  And here, you could bind certain parameters to
table columns.  For instance, you could say

   <PARAM arraysize="*" datatype="char" name="FLUXCALIB" ucd="phot.calib" 
      utype="ssa:Char.FluxAxis.Calibration" value=""
      ref="ssa_fluxcalib">

and a client would, on service invocation, take the value of
FLUXCALIB from the ssa_fluxcalib column.  For why one might want
this, I'd like to defer to the champions of that feature.
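
To make the binding concrete, here is a rough sketch of what a client
could do on invocation.  It assumes result rows are available as
dictionaries keyed by FIELD ID, that parameters carrying a ref are
always filled from the row, and that unbound parameters are only sent
when the user supplies them; none of that is prescribed here, and
build_call as well as the row values are invented for illustration.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

ACCESS_URL = "http://dc.zah.uni-heidelberg.de/feros/q/sdl/dlget"

# (name, ref) pairs as they would come out of the inputParams GROUP.
# The ref on FLUXCALIB is the hypothetical binding discussed above.
INPUT_PARAMS = [("ID", "ssa_pubDID"), ("FLUXCALIB", "ssa_fluxcalib")]


def build_call(access_url, input_params, row, extra=None):
    """Build the invocation URL for one result row.

    Parameters with a ref take their value from the row; parameters
    without one are only sent when present in extra.
    """
    extra = extra or {}
    pairs = []
    for name, ref in input_params:
        if ref is not None:
            pairs.append((name, row[ref]))
        elif name in extra:
            pairs.append((name, extra[name]))
    # urlencode percent-encodes the values, including the ? in the ID.
    return access_url + "?" + urlencode(pairs)


# Made-up row values, for illustration only.
row = {
    "ssa_pubDID": "ivo://example.com/data?spec1",
    "ssa_fluxcalib": "RELATIVE",
}
url = build_call(ACCESS_URL, INPUT_PARAMS, row)
print(url)
print(parse_qs(urlsplit(url).query))
```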

> >>When I invoke   http://xxx?id=1234  I want to get two rows back (see the
> >>first question).  The first points to the URL for the observation,
> >>the second will request subproducts at the next level in the recursion.
> >Ok, this is the datalink call.
> Right.
> >>These be given by the URL
> >>    http://xxx?id=1234&products=sub1,sub2,sub3
> >...and that's now a call to a custom service that just happens to
> >share its endpoint with a datalink service.
> >Did I get this right so far?
> 
> Not really "just happens"...  The idea is that the http://xxx URL
> is a generic data product service.  We give it a row identifier (in
> practice we'd give both a table and row identifier) and it returns
> data products associated with that row.  If no product is specified
> it returns 'root' products.  But each root product may have one or
> more subproducts that I'd like to link to.  Typically the root
> product will be a directory and the subproducts subdirectories or
> specific files, but that's not required.

Being a fan of good-looking URLs I'm not sure this would be a design
I'd choose, but one thing I'd really avoid is the "products=a,b,c"
thing.  In the spirit of DALI and web forms, I'd much rather
recommend products=a&products=b&products=c.  Added benefit: If
a comma turns up in a, b, or c you're fine.
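
A small illustration of the difference, using Python's urllib.parse.
The parameter names id and products come from the URL quoted above;
the product values, including the one with a comma in it, are made up.

```python
from urllib.parse import urlencode, parse_qs

products = ["sub1", "sub2,with comma", "sub3"]

# Repeated parameters: doseq=True emits one products=... pair per item.
query = urlencode({"id": "1234", "products": products}, doseq=True)
print(query)

# The receiving side gets the three values back unchanged.
parsed = parse_qs(query)
print(parsed["products"])

# Comma-joined variant: splitting on the comma yields four pieces, so
# the second product can no longer be recovered.
joined = ",".join(products)
print(joined.split(","))
```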

> >>Two possibilities from the documentation....
> >>
> >>    <VOTABLE>... <FIELD="service_def"><FIELD="access_url">...
> >><TR><TD/><TD>http://heasarc/obs/1234</td>..
> >><TR><TD>id=1234&Products=sub1,sub2,sub3</TD><TD/>
> >>    </VOTABLE>
> >I don't quite get this one, I have to say.  But I guess this is meant
> >to be the response on the *datalink* endpoint, right?  If so, why
> >don't you just give the two URLs and be done with it?
> 
> Then I'm not showing the hierarchical structure of the products. There
> can be hundreds of different product types for a given table.  Returning
> them all to the user without providing any structure is potentially
> confusing.  However if using a service_def I can point to just the next
> level of hierarchy of products then it keeps things comprehensible for
> the user.
> 
> Regardless, the question is what string do I put in the service_def to
> point to some more data products?
> Can you provide a complete example?

Let me try to describe what I think your use case is: You have n
"datasets", each of which consists of 100s of files that allow some
semantic grouping.  For the sake of examples, say it's herbi, carni,
and plant.  So, it'd look like this:

zoo1/
  herbi/
    cows/
      file1.txt
      pie.txt
    geese/
      wing.part
      foot.part
      otherfoot.part
  carni/
    lions/
    tigers/
zoo2/
  herbi/
    rabbit/
  plant/
    rose/
    hyacinth/
    daisy/

If I had that structure, I'd not bother with service_def at all
(beyond the basic descriptor as given above).

Instead, I'd be a bit generous with the term "dataset" and define an id
for each file in the hierarchy.  So, the root (and the actual
dataset) for the first dataset would be

  ivo://example.com/data?zoo1

If your datalink service is http://example.com/dl, you would then
return from http://example.com/dl?ID=ivo://example.com/data?zoo1

something like

  ID                 URL                                                          mime                                        semantics    description
  ivo://(as above)   http://example.com/dl?ID=ivo://example.com/data?zoo1/herbi   application/x-votable+xml;content=datalink  #progenitor  What is being eaten
  ivo://(as above)   http://example.com/dl?ID=ivo://example.com/data?zoo1/carni   application/x-votable+xml;content=datalink  #derivation  What eats the beasts

-- so, you're just pointing back at the datalink service again.
Incidentally, there's nothing that requires this parameter-based
approach.  If I had to do such a thing, I'd be severely tempted to
statically generate the datalink documents, put them into the respective
directories as index.datalink, configure an apache to deliver them with
the appropriate mime type and as index documents, and I'd have beautiful
links to the datalink documents (simply the directory URLs).  I like
that idea so much I almost wish I had such data...
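
For the parameter-based variant, the rows for one level of the
hierarchy could be produced along these lines.  This is only a sketch:
the tree is an in-memory copy of part of the layout above, FILE_BASE
and the generic content type for plain files are assumptions, the
semantics and description columns are left out, and the ID is pasted
into the URL unencoded so the result can be compared with the table
above (a real service should percent-encode it).

```python
DATALINK_MIME = "application/x-votable+xml;content=datalink"
SERVICE = "http://example.com/dl"        # datalink endpoint from the mail
ID_PREFIX = "ivo://example.com/data?"    # id scheme from the mail
FILE_BASE = "http://example.com/"        # assumption: where files live

# Directories are dicts, files are None.
TREE = {
    "zoo1": {
        "herbi": {
            "cows": {"file1.txt": None, "pie.txt": None},
            "geese": {"wing.part": None, "foot.part": None,
                      "otherfoot.part": None},
        },
        "carni": {"lions": {}, "tigers": {}},
    },
}


def links_for(path):
    """Return the datalink rows for one node, one level deep."""
    node = TREE
    for part in path.split("/"):
        node = node[part]
    rows = []
    for name, child in node.items():
        child_path = path + "/" + name
        if child is None:
            # A file: link directly to the data.
            url = FILE_BASE + child_path
            mime = "application/octet-stream"  # placeholder type
        else:
            # A directory: point back at the datalink service.
            url = SERVICE + "?ID=" + ID_PREFIX + child_path
            mime = DATALINK_MIME
        rows.append({"ID": ID_PREFIX + path, "URL": url, "mime": mime})
    return rows


for row in links_for("zoo1"):
    print(row["ID"], row["URL"], row["mime"])
```

Calling links_for("zoo1/herbi") gives the next level in the same way,
and links_for("zoo1/herbi/cows") finally yields direct file links.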

Anyway, if you then retrieve the URL
http://example.com/dl?ID=ivo://example.com/data?zoo1/herbi that's given
up there, you'd get a datalink document that would look like this:

  ID                    URL                                                               mime                                        semantics            description
  ivo://...zoo1/herbi   http://example.com/dl?ID=ivo://example.com/data?zoo1/herbi/cows   application/x-votable+xml;content=datalink  http://e#animal      Moo
  ivo://...zoo1/herbi   http://example.com/dl?ID=ivo://example.com/data?zoo1/herbi/geese  application/x-votable+xml;content=datalink  http://e#animal      Small but noisy

Of course, you could have put in more than one ID here (which is an
advantage over the fs-based approach outlined above).  For instance, one
could finally call the datalink service with both cows and geese, which
would then yield:

  ID                    URL                                           mime                   semantics   description
  ivo://...herbi/cows   http://example.com/herbi/cows/file1.txt       text/plain             #noise      It's not always quiet near cows
  ivo://...herbi/cows   http://example.com/herbi/cows/pie.txt         poo/soft               #weight     Better than cold feet anyway
  ivo://...herbi/geese  http://example.com/herbi/geese/wing.part      application/singlepart #auxiliary  No flying without
  ivo://...herbi/geese  http://example.com/herbi/geese/foot.part      application/singlepart #auxiliary  No walking without
  ivo://...herbi/geese  http://example.com/herbi/geese/otherfoot.part application/singlepart #auxiliary  No swimming without

Does that help?

Cheers,

            Markus


