Prototype SODA and DataLink Service Descriptor at CADC

Patrick Dowler pdowler.cadc at gmail.com
Tue Mar 1 01:05:47 CET 2016


TL;DR - CADC implemented prototype SODA services and uses DataLink service
descriptors to convey data-specific metadata and parameter info. The
implementation demonstrates that with POS you cannot convey metadata, so we
introduced new positional cutout parameters (CIRC and POLY) that conform
to WD-DALI-1.1 xtypes (circle and polygon) and allow us to convey useful
parameter metadata as a result.

Before anyone panics: we also show that POS, CIRC, and POLY can co-exist :-)

image dataset: ID=caom:IRIS/f212h000/IRAS-25um

* ObsCore-1.1 metadata *

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/sia/v2query?ID=caom:IRIS/f212h000/IRAS-25um

So, this is a dataproduct_type=image, calib_level=2, s_xel1 & s_xel2 say it
is 500x500 (pixels)

* image links *

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=caom:IRIS/f212h000/IRAS-25um

Note the generic SODA service descriptors (not linked!):

<RESOURCE type="meta" ID="soda-sync" utype="adhoc:service">
    <PARAM name="resourceIdentifier" datatype="char" arraysize="27"
           value="ivo://cadc.nrc.ca/soda#sync" />
    <PARAM name="standardID" datatype="char" arraysize="*"
           value="ivo://ivoa.net/std/SODA#sync-1.0" />
    <PARAM name="accessURL" datatype="char" arraysize="*"
           value="http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync" />
    <GROUP name="inputParams">
      <PARAM name="ID" datatype="char" ref="fileURIRef" arraysize="*"
             value="" />
      <PARAM name="POS" datatype="char" ucd="obs.field" arraysize="*"
             value="" />
      <PARAM name="CIRC" datatype="double" ucd="obs.field" unit="deg"
             xtype="circle" arraysize="3" value="" />
      <PARAM name="POLY" datatype="double" ucd="obs.field" unit="deg"
             xtype="polygon" arraysize="*" value="" />
      <PARAM name="BAND" datatype="double" ucd="em.wl;stat.interval"
             unit="m" xtype="interval" arraysize="2" value="" />
      <PARAM name="TIME" datatype="double" ucd="time;stat.interval"
             unit="d" xtype="interval" arraysize="2" value="" />
      <PARAM name="POL" datatype="char" ucd="phys.polarization.stokes"
             arraysize="2*" value="" />
    </GROUP>
</RESOURCE>
<RESOURCE type="meta" ID="soda-async" utype="adhoc:service">
    <PARAM name="standardID" datatype="char" arraysize="*"
           value="ivo://ivoa.net/std/SODA#async-1.0" />
    ... same params as above
</RESOURCE>

params: ID, POS, CIRC, POLY, BAND, TIME, POL
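
To make the structure concrete, here is a minimal Python sketch of how a
client might read such a descriptor and list the input params. It is only an
illustration, not part of the prototype: the function name read_descriptor is
made up, the descriptor is a trimmed copy of the one above, and a real
VOTable would normally carry an XML namespace that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the generic descriptor shown above (no XML namespace).
DESCRIPTOR = """
<RESOURCE type="meta" ID="soda-sync" utype="adhoc:service">
  <PARAM name="standardID" datatype="char" arraysize="*"
         value="ivo://ivoa.net/std/SODA#sync-1.0" />
  <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync" />
  <GROUP name="inputParams">
    <PARAM name="ID" datatype="char" ref="fileURIRef" arraysize="*" value="" />
    <PARAM name="POS" datatype="char" ucd="obs.field" arraysize="*" value="" />
    <PARAM name="CIRC" datatype="double" ucd="obs.field" unit="deg"
           xtype="circle" arraysize="3" value="" />
  </GROUP>
</RESOURCE>
"""


def read_descriptor(xml_text):
    """Return (service params, input params) from one service descriptor."""
    resource = ET.fromstring(xml_text)
    # PARAMs directly under RESOURCE describe the service itself.
    service = {p.get("name"): p.get("value") for p in resource.findall("PARAM")}
    # PARAMs inside the inputParams GROUP describe what the caller may send.
    group = resource.find("GROUP[@name='inputParams']")
    inputs = {p.get("name"): dict(p.attrib) for p in group.findall("PARAM")}
    return service, inputs


service, inputs = read_descriptor(DESCRIPTOR)
print(service["accessURL"])
for name, attrs in inputs.items():
    print(name, attrs.get("datatype"), attrs.get("xtype"), attrs.get("unit"))
```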

Above and below you will see a resourceIdentifier param; this is there to
support the use of a runtime registry lookup to generate the accessURL.
Doing it this way allows us to generate URLs to development, test, or
production servers depending on the work environment... our DataLink and
SODA services are not actually registered but my intent is to make these
resolvable by registering the services in the near future.
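
The lookup idea can be sketched roughly as below. This is not our actual
code: the REGISTRY table, the environment names, and the example.org hosts
are all hypothetical stand-ins for a real registry query; only the production
URL is taken from the descriptor above.

```python
# Hypothetical lookup table standing in for a real registry query.
REGISTRY = {
    "production": {
        "ivo://cadc.nrc.ca/soda#sync":
            "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync",
    },
    "test": {
        # Made-up host, for illustration only.
        "ivo://cadc.nrc.ca/soda#sync": "http://test.example.org/caom2ops/sync",
    },
}


def resolve_access_url(resource_identifier, environment):
    """Map a resourceIdentifier to an accessURL for the given environment."""
    try:
        return REGISTRY[environment][resource_identifier]
    except KeyError:
        raise LookupError(
            "no accessURL for %s in %s" % (resource_identifier, environment)
        )


print(resolve_access_url("ivo://cadc.nrc.ca/soda#sync", "test"))
```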

Likewise, the standardID values for SODA are not (yet) resolvable but they
will be...

The link-specific SODA descriptors (the ID attributes have UUIDs in them)
e.g.:

<RESOURCE type="meta" ID="soda-cbb62ed5-c2c9-4dd9-aed6-46d7d5173dca"
utype="adhoc:service">
    <PARAM name="resourceIdentifier" datatype="char" arraysize="27"
           value="ivo://cadc.nrc.ca/soda#sync" />
    <PARAM name="standardID" datatype="char" arraysize="32"
           value="ivo://ivoa.net/std/SODA#sync-1.0" />
    <PARAM name="accessURL" datatype="char" arraysize="*"
           value="http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync" />
    <GROUP name="inputParams">
      <PARAM name="ID" datatype="char" ucd="" arraysize="*"
             value="ad:IRIS/I212B2H0" />
      <PARAM name="POS" datatype="char" ucd="obs.field" arraysize="*"
             value="" />
      <PARAM name="CIRC" datatype="double" ucd="obs.field" unit="deg"
             xtype="circle" arraysize="3" value="">
        <VALUES>
          <MAX value="140.63049941314583 0.2007826788236291 8.778341996040131" />
        </VALUES>
      </PARAM>
      <PARAM name="POLY" datatype="double" ucd="obs.field" unit="deg"
             xtype="polygon" arraysize="*" value="">
        <VALUES>
          <MAX value="146.83673628162285 -6.408995958971017
                      134.3808989000012 -6.370135804011464
                      134.4242625446688 6.007430601323759
                      146.8700273500712 5.96918267453771" />
        </VALUES>
      </PARAM>
    </GROUP>
</RESOURCE>

The value for the ID parameter is specified in the value attribute because
this is a file-specific (at CADC) service descriptor and it needs this
file. This *is not* the same kind of ID that one uses to call the DataLink
service (that is a publisher_did which we will be changing into a
resolvable ivo-id once I'm happy with the registration of our collections;
this is a file identifier from our storage system). I could have used a ref
attribute to the links table (since the file URI is there for our other
services and for the generic soda service descriptors) but once you have
link-specific descriptors anyway this seems tidier (e.g. I'll eventually
be able to remove the custom fileURIref column from our links table).

BAND, TIME, and POL are missing because this is a 2D image so cutouts on
those axes are not possible.

POS is listed there because positional cutout is possible, but I don't see
a sane way to convey sensible values to help someone use POS; the client
has to know the extent (from data discovery) or get it in some other way
(metadata capability).

For CIRC and POLY the service includes a "maximum sensible extent" with
which to perform cutouts. The value attribute of the MAX element is a
string and my interpretation of the intent is that the client should
interpret it as the same "type" of thing as the PARAM in which it is found.
It feels like the MAX extent conveys useful and sensible information, but I
didn't see anything useful to put in MIN. Is this an abuse of MAX? Maybe
(in the sense that MAX usually implies that there is an ordering), but given
the implied type consistency that is already there, people are interpreting
this now, and when I showed this to a few techy astronomers who understand
VOTable they interpreted it as I meant it. (The CIRC MAX is the minimum
spanning circle; the POLY MAX is the polygon boundary -- so using those
values would get all the pixels.)
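
As an illustration of that interpretation, a client could read the MAX string
according to the xtype of the enclosing PARAM. The sketch below is one
possible reading, not a prescribed one: max_extent is a made-up name and the
returned dictionaries are just a convenient shape for the example. The two
PARAMs are copied from the link-specific descriptor above.

```python
import xml.etree.ElementTree as ET

CIRC_PARAM = """
<PARAM name="CIRC" datatype="double" ucd="obs.field" unit="deg"
       xtype="circle" arraysize="3" value="">
  <VALUES>
    <MAX value="140.63049941314583 0.2007826788236291 8.778341996040131" />
  </VALUES>
</PARAM>
"""

POLY_PARAM = """
<PARAM name="POLY" datatype="double" ucd="obs.field" unit="deg"
       xtype="polygon" arraysize="*" value="">
  <VALUES>
    <MAX value="146.83673628162285 -6.408995958971017
                134.3808989000012 -6.370135804011464
                134.4242625446688 6.007430601323759
                146.8700273500712 5.96918267453771" />
  </VALUES>
</PARAM>
"""


def max_extent(param):
    """Interpret VALUES/MAX as the same kind of value as the PARAM itself."""
    max_element = param.find("VALUES/MAX")
    if max_element is None:
        return None  # e.g. POS: no extent is conveyed
    numbers = [float(token) for token in max_element.get("value").split()]
    xtype = param.get("xtype")
    if xtype == "circle":
        lon, lat, radius = numbers  # arraysize="3"
        return {"center": (lon, lat), "radius": radius}
    if xtype == "polygon":
        # Consecutive (lon, lat) pairs form the vertices.
        return {"vertices": list(zip(numbers[0::2], numbers[1::2]))}
    return {"values": numbers}


circle = max_extent(ET.fromstring(CIRC_PARAM))
polygon = max_extent(ET.fromstring(POLY_PARAM))
print(circle)
print(polygon)
```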

Right now, CIRC and POLY are my own custom parameters and they should not
bother a strict client that used this descriptor because of the standardID.
I chose different parameter names so I could be more explicit about the
value metadata (datatype, arraysize, xtype, units, ucd -- WD-DALI-1.1)
*and* so I could provide the "maximum value" of the exact same type. In the
SODA service this is very straightforward to implement (I use the same
Format classes for reading and writing VOTables and for parsing and
validating SODA params).
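
To show why sharing the format code is convenient, here is a small Python
sketch of the idea. It is not our implementation: the class names and the
validation rules (three numbers and a non-negative radius for a circle; an
even count of at least six numbers for a polygon) are my assumptions for the
example.

```python
class CircleFormat:
    """Parse and write a circle as "lon lat radius" (illustrative only)."""

    def parse(self, text):
        parts = text.split()
        if len(parts) != 3:
            raise ValueError("circle needs 3 numbers, got %d" % len(parts))
        lon, lat, radius = (float(p) for p in parts)
        if radius < 0:
            raise ValueError("circle radius must not be negative")
        return (lon, lat, radius)

    def format(self, value):
        return " ".join(repr(v) for v in value)


class PolygonFormat:
    """Parse and write a polygon as "lon1 lat1 lon2 lat2 ..." (illustrative)."""

    def parse(self, text):
        numbers = [float(p) for p in text.split()]
        if len(numbers) < 6 or len(numbers) % 2 != 0:
            raise ValueError("polygon needs at least 3 coordinate pairs")
        return list(zip(numbers[0::2], numbers[1::2]))

    def format(self, vertices):
        return " ".join(repr(c) for vertex in vertices for c in vertex)


# The same object validates a request parameter and writes a VOTable value.
circle_format = CircleFormat()
requested = circle_format.parse("140.5 0.0 0.5")
print(circle_format.format(requested))
```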

For these URLs, I recommend just "curl -v" so you can see the HTTP headers,
but you can download the data if you want to :-)

* image cutout: POS *
http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync?ID=ad:IRIS/I212B2H0\&POS=circle%20140.5%200.0%200.5

decoding the redirect url: cutout=[0][235:275,238:278,*]

* image cutout: CIRC *
http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync?ID=ad:IRIS/I212B2H0\&CIRC=140.5%200.0%200.5

decoding the redirect url: cutout=[0][235:275,238:278,*]

* image cutout: POLY *
http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync?ID=ad:IRIS/I212B2H0\&POLY=140%200.0%20140.5%200.0%20140.5%200.5%20140.0%200.0

decoding the redirect url: cutout=[0][255:275,258:278,*]
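
If you would rather build these URLs in a script than type the %20 escapes by
hand, something like the following Python sketch should do. The helper name
soda_url is made up for this example; the base URL and parameter values are
the ones used above.

```python
from urllib.parse import quote, urlencode

SYNC_URL = "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync"


def soda_url(file_id, name, value):
    """Build a sync cutout URL for one cutout parameter.

    value may be a ready-made string (as for POS) or a sequence of numbers
    (as for CIRC and POLY), which is joined with spaces.
    """
    if not isinstance(value, str):
        value = " ".join(str(v) for v in value)
    query = {"ID": file_id, name: value}
    # quote (not quote_plus) encodes spaces as %20; ':' and '/' stay readable.
    return SYNC_URL + "?" + urlencode(query, quote_via=quote, safe=":/")


circ_url = soda_url("ad:IRIS/I212B2H0", "CIRC", (140.5, 0.0, 0.5))
pos_url = soda_url("ad:IRIS/I212B2H0", "POS", "circle 140.5 0.0 0.5")
print(circ_url)
print(pos_url)
```

Note that the backslash before & in the URLs above is only shell escaping for
use with curl; it is not part of the URL itself.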

These SODA requests to /caom2ops/sync respond with an error message or, if
successful, a redirect to a URL with a pixel cutout using cfitsio syntax.
That is completely an implementation detail of our archive infrastructure
and not part of the prototype per se.
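
For the curious, the cutout strings quoted above can be picked apart with a
few lines of Python. This is just a sketch for reading the examples in this
message: parse_cutout is a made-up name, and it only handles the simple
[extension][lo:hi,lo:hi,*] form seen here, treating the ranges as inclusive
pixel bounds in the usual cfitsio manner.

```python
import re


def parse_cutout(cutout):
    """Split "[ext][lo:hi,lo:hi,*]" into (ext, [(lo, hi) or None, ...])."""
    match = re.fullmatch(r"\[(\d+)\]\[(.*)\]", cutout)
    if match is None:
        raise ValueError("unrecognised cutout: %r" % cutout)
    extension = int(match.group(1))
    axes = []
    for part in match.group(2).split(","):
        if part == "*":
            axes.append(None)  # whole axis
        else:
            low, high = part.split(":")
            axes.append((int(low), int(high)))
    return extension, axes


extension, axes = parse_cutout("[0][235:275,238:278,*]")
# Inclusive bounds, so the size along an axis is hi - lo + 1.
sizes = [None if axis is None else axis[1] - axis[0] + 1 for axis in axes]
print(extension, axes, sizes)  # 0 [(235, 275), (238, 278), None] [41, 41, None]
```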

If they fail (easy to do, just mess with the params) the response (after
redirect) is text/plain with a suitable HTTP status code.





On 29 February 2016 at 16:03, Patrick Dowler <pdowler.cadc at gmail.com> wrote:

>
> I have finally finished and deployed our latest prototype SODA services
> and augmented our DataLink service to provide service descriptors to enable
> use of SODA. This work spans several services so, following Markus'
> "gripes" approach I will try to separate things into separate messages, but
> I'll just make the messages replies to this one so they will be a single
> thread and I promise to put the TL;DR at the top of each :-)
>
> So, coming up:
>
> 1. description of datalink service descriptor output and soda sync cutout
> of a 2D image
>
> 2. less wordy description of datalink service descriptor output and soda
> sync cutout of a 3D cube
>
> 3. description of datalink service descriptor output and soda async
> cutout(s) of a 2D image
>
> It is quite a lot to look at, but I would like to point out here that
> implementing the whole end-to-end usage forced me to reconsider some
> earlier decisions and refine things to make them more clear and more
> useful. Although DataLink and SODA are loosely coupled in a technical
> sense, they do need to get along and work together and each has some effect
> or influence on decisions one takes while implementing the other.
>
> more to follow...
>
> --
> Patrick Dowler
> Canadian Astronomy Data Centre
> Victoria, BC, Canada
>



-- 
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada