SODA, section 4.3

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Nov 8 15:50:17 CET 2016


Dear Colleagues,

I'm sorry to open up this painful chapter again, but while doing a
review of the SODA PR with a view to implement it, I was dismayed by
several aspects of section 4.3 (which is SODA in Datalink and
therefore what I really care about).

(1) I don't think the discussion of the various ways a DAL
response might convey datalink URLs helps the discussion of SODA.  On
the contrary, I'd say it is confusing and should be removed; this is all
of 4.3 up to "normally use a single or small number of ID(s) per
invocation."

Instead, by way of introduction, I think something like this would do:

  The alternative scenario has the discovery service return Datalink
  documents (see \citep{std:Datalink} for ways to do that).  These
  Datalink documents can then contain one or more SODA descriptor(s),
  most typically one per dataset described.  To let SODA clients
  infer parameter ranges and present useful user interfaces, data
  providers SHOULD communicate the admissible ranges of the
  parameters in question using the VOTable \xmlel{VALUES} element.

(2) This should be followed by a halfway formal definition of how
this should be done.  Just giving an example is not good enough and
will lead to endless interoperability problems.  Even worse, the
example for BAND goes against what a normal VOTable user will
expect; it suggests <MAX value="300.0e-9 800.0e-9"/> where, I
claim, almost everyone would expect
  <MIN value="300.0e-9"/>
  <MAX value="800.0e-9"/>

It's bad enough that we have the ugly hacks for CIRCLE and POLYGON,
let's not uglify VOTable where we don't absolutely need to.  Also, in
the example DESCRIPTIONs are missing -- these are important, in
particular further down the road when people have custom parameters.

So, I'd suggest taking out the rest of 4.3, too, and then continuing:

  For float-valued intervals (e.g., the standard BAND and TIME
  parameters), \xmlel{VALUES/MIN} and \xmlel{VALUES/MAX} are used to
  communicate the range of values for which clients can expect to
  receive data.  Example:

  \begin{lstlisting}[language=XML]
    <PARAM name="BAND" unit="m" ucd="em.wl"
      datatype="double" arraysize="2"
      xtype="interval" value="">
      <DESCRIPTION>The wavelength intervals to be extracted</DESCRIPTION>
      <VALUES>
        <MIN value="3e-7"/>
        <MAX value="8e-7"/>
      </VALUES>
    </PARAM>
  \end{lstlisting}

  Enumerated values, both for integral and textual types, use
  \xmlel{VALUES/OPTION} elements unless there are too many possible
  values.  Again, only values for which nonempty responses can be
  expected for the described dataset should be listed.  Example:

  \begin{lstlisting}[language=XML]
    <PARAM name="POL" ucd="meta.code;phys.polarization"
      datatype="char" arraysize="*" value="">
      <DESCRIPTION>Polarization states to be extracted.</DESCRIPTION>
      <VALUES>
        <OPTION value="I"/>
        <OPTION value="V"/>
      </VALUES>
    </PARAM>
  \end{lstlisting}

  In case the option enumeration becomes too large, the description
  of the parameter should carefully describe what values are
  admissible, e.g., by providing a link to an enumeration in the
  \xmlel{DESCRIPTION}.

  Intervals of integers are described analogously to float-valued
  intervals, i.e., using \xmlel{MIN} and \xmlel{MAX} elements.

  Standard VOTable semantics are insufficient for the metadata of
  the SODA POLYGON and CIRCLE parameters.  We therefore define
  special cases.

  For CIRCLE, only a \xmlel{MAX} is given. It contains three
  floating point values, separated by whitespace.  These correspond
  to the RA and Dec of the center of a spherical circle covering the
  dataset, and a radius of such a covering circle.  Data providers
  SHOULD make sure they choose the center and radius such that the
  covering circle is close to the minimal one of the dataset.
  Example:

  \begin{lstlisting}[language=XML]
  <PARAM name="CIRCLE" unit="deg" ucd="phys.angArea;obs"
    datatype="double" arraysize="3"
    xtype="circle" value="">
    <DESCRIPTION>A spherical circle to be contained by the cutout</DESCRIPTION>
    <VALUES> <MAX value="12.0 34.0 0.5"/> </VALUES>
  </PARAM>
  \end{lstlisting}

  For POLYGON, again only a \xmlel{MAX} is given.  It consists of
  a sequence of floating-point values, again separated by blanks,
  describing RA and Dec of the vertices of a spherical polygon
  covering the dataset.  Data providers are encouraged to choose a
  minimal polygon.  Example:

  \begin{lstlisting}[language=XML]
  <PARAM name="POLYGON" unit="deg" ucd="phys.angArea;obs"
    datatype="double" arraysize="*"
    xtype="polygon" value="">
    <DESCRIPTION>A polygon to be contained by the cutout</DESCRIPTION>
    <VALUES>
      <MAX value="11.5 33.5 12.5 33.5 12.5 34.5 11.5 34.5"/> 
    </VALUES>
  </PARAM>
  \end{lstlisting}

  Angles in both CIRCLE and POLYGON are in degrees.  As for the
  input, the ICRS reference system is assumed.

  For POS, useful metadata cannot be given.  Services supporting POS
  should therefore provide POLYGON as well, and clients wishing to
  use POS can infer sensible values for that parameter from
  \xmlel{VALUES} given for POLYGON.
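
As a rough illustration of the CIRCLE and POLYGON special cases (not
meant as part of the proposed text), a client might decode the
whitespace-separated \xmlel{MAX} value roughly as follows.  The function
name covering_geometry and the tuple layout it returns are made up for
this sketch, and the sample PARAM is parsed without the VOTable
namespace that a real Datalink document would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the CIRCLE PARAM from the proposal, without
# the VOTable namespace that a real Datalink document would carry.
CIRCLE_PARAM = """
<PARAM name="CIRCLE" unit="deg" ucd="phys.angArea;obs"
  datatype="double" arraysize="3" xtype="circle" value="">
  <VALUES><MAX value="12.0 34.0 0.5"/></VALUES>
</PARAM>
"""


def covering_geometry(param):
    """Return the covering region advertised in VALUES/MAX, or None.

    Circles come back as ("circle", (ra, dec), radius), polygons as
    ("polygon", [(ra, dec), ...]); all angles are ICRS degrees.
    """
    max_el = param.find("VALUES/MAX")
    if max_el is None:
        return None
    # The special cases pack all numbers into one blank-separated string.
    numbers = [float(tok) for tok in max_el.get("value", "").split()]
    xtype = param.get("xtype")
    if xtype == "circle":
        if len(numbers) != 3:
            raise ValueError("CIRCLE needs RA, Dec and radius")
        ra, dec, radius = numbers
        return ("circle", (ra, dec), radius)
    if xtype == "polygon":
        if len(numbers) < 6 or len(numbers) % 2:
            raise ValueError("POLYGON needs at least three RA/Dec pairs")
        # Even positions are RA, odd positions are Dec.
        return ("polygon", list(zip(numbers[0::2], numbers[1::2])))
    raise ValueError("not a CIRCLE or POLYGON parameter: %r" % xtype)


print(covering_geometry(ET.fromstring(CIRCLE_PARAM)))
```

A client supporting only POS could take the vertex list returned for
POLYGON as its starting point for sensible defaults.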

Sorry for coming in with this now, but the 

  <MAX value="300.0e-9 800.0e-9"/> 

that really shocked me only came in in July, so perhaps it's not my
fault alone.

Anyway: content-wise I claim the changes with my proposal are minimal
(essentially, just use MIN/MAX as intended), but implementors have a
much clearer guideline as to what to implement against.
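
To give an idea of what "use MIN/MAX as intended" means on the client
side, here is a minimal sketch under some assumptions: the names
advertised_values and overlaps are invented for illustration, the sample
PARAM carries no VOTable namespace, and OPTION values are read from the
value attribute with element text as a fallback.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: the BAND PARAM from the proposal, without
# the VOTable namespace that a real Datalink document would carry.
BAND_PARAM = """
<PARAM name="BAND" unit="m" ucd="em.wl"
  datatype="double" arraysize="2" xtype="interval" value="">
  <VALUES><MIN value="3e-7"/><MAX value="8e-7"/></VALUES>
</PARAM>
"""


def advertised_values(param):
    """Return (minimum, maximum, options) declared in a PARAM's VALUES.

    Missing limits are returned as None, missing options as [].
    """
    limits = {"MIN": None, "MAX": None}
    options = []
    values = param.find("VALUES")
    if values is not None:
        for tag in limits:
            el = values.find(tag)
            if el is not None:
                limits[tag] = float(el.get("value"))
        for opt in values.findall("OPTION"):
            # Prefer the value attribute, fall back to element text.
            options.append(opt.get("value") or (opt.text or "").strip())
    return limits["MIN"], limits["MAX"], options


def overlaps(param, lo, hi):
    """Tell whether the requested interval [lo, hi] can yield data."""
    vmin, vmax, _ = advertised_values(param)
    if vmin is not None and hi < vmin:
        return False
    if vmax is not None and lo > vmax:
        return False
    return True


band = ET.fromstring(BAND_PARAM)
print(advertised_values(band), overlaps(band, 5e-7, 9e-7))
```

With separate MIN and MAX elements the limits are plain scalars, so a
client needs no parameter-specific string splitting for BAND or TIME.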

So -- can we just replace 4.3 as suggested?

Cheers,

        Markus

