SODA, section 4.3
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Tue Nov 8 15:50:17 CET 2016
Dear Colleagues,
I'm sorry to open up this painful chapter again, but while doing a
review of the SODA PR with a view to implementing it, I was dismayed by
several aspects of section 4.3 (which is SODA in Datalink and
therefore what I really care about).
(1) I don't think the discussion of the various ways a DAL
response might convey datalink URLs helps the discussion of SODA. On
the contrary, I'd say it is confusing and should be removed; this is all
of 4.3 up to "normally use a single or small number of ID(s) per
invocation."
Instead, by way of introduction, I would suggest something like:
The alternative scenario has the discovery service return Datalink
documents (see \citep{std:Datalink} for ways to do that). These
Datalink documents can then contain one or more SODA descriptor(s),
most typically one per dataset described. To let SODA clients
infer parameter ranges and present useful user interfaces, data
providers SHOULD communicate the admissible ranges of the
parameters in question using the VOTable \xmlel{VALUES} element.
(2) This should be followed by a halfway formal definition of how
this should be done. Just giving an example is not good enough and
will lead to endless interoperability problems. Even worse, the
example for BAND runs against what a normal VOTable user would
expect; it suggests <MAX value="300.0e-9 800.0e-9"/> where, I
claim, almost everyone would expect
<MIN value="300.0e-9"/>
<MAX value="800.0e-9"/>
It's bad enough that we have the ugly hacks for CIRCLE and POLYGON,
let's not uglify VOTable where we don't absolutely need to. Also, the
example lacks DESCRIPTIONs -- these are important, in
particular further down the road when people have custom parameters.
So, I'd suggest taking out the rest of 4.3, too, and then continuing:
For float-valued intervals (e.g., the standard BAND and TIME
parameters), \xmlel{VALUES/MIN} and \xmlel{VALUES/MAX} are used to
communicate the range of values for which clients can expect to
receive data. Example:
\begin{lstlisting}[language=XML]
<PARAM name="BAND" unit="m" ucd="em.wl"
datatype="double" arraysize="2"
xtype="interval" value="">
<DESCRIPTION>The wavelength intervals to be extracted</DESCRIPTION>
<VALUES>
<MIN value="3e-7"/>
<MAX value="8e-7"/>
</VALUES>
</PARAM>
\end{lstlisting}
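As a rough illustration of how a client might read such limits, here is a minimal Python sketch. It assumes the PARAM is available as a standalone, namespace-free XML fragment (a real VOTable would carry the VOTable namespace), and the helper name interval_limits is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the BAND PARAM from the example, without namespace.
BAND_PARAM = """
<PARAM name="BAND" unit="m" ucd="em.wl"
       datatype="double" arraysize="2"
       xtype="interval" value="">
  <DESCRIPTION>The wavelength intervals to be extracted</DESCRIPTION>
  <VALUES>
    <MIN value="3e-7"/>
    <MAX value="8e-7"/>
  </VALUES>
</PARAM>
"""


def interval_limits(param_xml):
    """Return (min, max) as floats; None where a limit is not declared."""
    param = ET.fromstring(param_xml)

    def limit(tag):
        element = param.find("VALUES/" + tag)
        return float(element.get("value")) if element is not None else None

    return limit("MIN"), limit("MAX")


band_limits = interval_limits(BAND_PARAM)
```

A user interface could then use band_limits to bound a slider or to reject requests entirely outside the declared range.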
Enumerated values, both for integral and textual types, use
\xmlel{VALUES/OPTION} elements unless there are too many possible
values. Again, only values for which nonempty responses can be
expected for the described dataset should be listed. Example:
\begin{lstlisting}[language=XML]
<PARAM name="POL" ucd="meta.code;phys.polarization"
datatype="char" arraysize="*" value="">
<DESCRIPTION>Polarization states to be extracted.</DESCRIPTION>
<VALUES>
<OPTION value="I"/>
<OPTION value="V"/>
</VALUES>
</PARAM>
\end{lstlisting}
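A client could collect the enumerated values along the following lines. This is only a sketch under assumptions: the PARAM is a namespace-free fragment, the helper name allowed_options is invented here, and the code accepts both the value attribute that the VOTable schema defines for OPTION and plain element content.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a POL PARAM, without the VOTable namespace.
POL_PARAM = """
<PARAM name="POL" ucd="meta.code;phys.polarization"
       datatype="char" arraysize="*" value="">
  <DESCRIPTION>Polarization states to be extracted.</DESCRIPTION>
  <VALUES>
    <OPTION value="I"/>
    <OPTION value="V"/>
  </VALUES>
</PARAM>
"""


def allowed_options(param_xml):
    """Return the list of option values declared for a PARAM."""
    param = ET.fromstring(param_xml)
    options = []
    for option in param.findall("VALUES/OPTION"):
        value = option.get("value")
        if value is None:
            # Fall back to element content if no value attribute is given.
            value = (option.text or "").strip()
        options.append(value)
    return options


pol_options = allowed_options(POL_PARAM)
```

An empty result would signal that the provider did not enumerate the values, in which case the client has to fall back on the DESCRIPTION.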
If the option enumeration becomes too large, the description
of the parameter should carefully describe what values are
admissible, e.g., by providing a link to an enumeration in the
\xmlel{DESCRIPTION}.
Intervals of integers are described analogously to float-valued
intervals, i.e., using \xmlel{MIN} and \xmlel{MAX} elements.
Standard VOTable semantics are insufficient for the metadata of
the SODA POLYGON and CIRCLE parameters. We therefore define
special cases.
For CIRCLE, only a \xmlel{MAX} is given. It contains three
floating point values, separated by whitespace. These correspond
to the RA and Dec of the center of a spherical circle covering the
dataset, and a radius of such a covering circle. Data providers
SHOULD make sure they choose the center and radius such that the
covering circle is close to the minimal one of the dataset.
Example:
\begin{lstlisting}[language=XML]
<PARAM name="CIRCLE" unit="deg" ucd="phys.angArea;obs"
datatype="double" arraysize="3"
xtype="circle" value="">
<DESCRIPTION>A spherical circle to be contained by the cutout</DESCRIPTION>
<VALUES> <MAX value="12.0 34.0 0.5"/> </VALUES>
</PARAM>
\end{lstlisting}
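To make the CIRCLE convention concrete, the following Python sketch parses the three numbers from MAX and tests whether a requested circle can overlap the covering circle at all. The helper names (covering_circle, separation_deg, may_overlap) are illustrative only, the input is assumed to be a namespace-free fragment, and the overlap test is one plausible client-side check, not something the proposed text mandates.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical input: the CIRCLE PARAM from the example, without namespace.
CIRCLE_PARAM = """
<PARAM name="CIRCLE" unit="deg" ucd="phys.angArea;obs"
       datatype="double" arraysize="3"
       xtype="circle" value="">
  <VALUES> <MAX value="12.0 34.0 0.5"/> </VALUES>
</PARAM>
"""


def covering_circle(param_xml):
    """Return (ra, dec, radius) in degrees from VALUES/MAX."""
    param = ET.fromstring(param_xml)
    tokens = param.find("VALUES/MAX").get("value").split()
    if len(tokens) != 3:
        raise ValueError("CIRCLE MAX must hold exactly three numbers")
    ra, dec, radius = (float(token) for token in tokens)
    return ra, dec, radius


def separation_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    hav = (math.sin((dec2 - dec1) / 2) ** 2
           + math.cos(dec1) * math.cos(dec2)
           * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(hav))))


def may_overlap(request, cover):
    """True if two (ra, dec, radius) circles touch or intersect."""
    distance = separation_deg(request[0], request[1], cover[0], cover[1])
    return distance <= request[2] + cover[2]


cover = covering_circle(CIRCLE_PARAM)
```

With this, a client could warn the user before sending a request such as (13.0, 34.0, 0.1), which lies well outside the covering circle.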
For POLYGON, again only a \xmlel{MAX} is given. It consists of
a sequence of floating-point values, again separated by blanks,
describing RA and Dec of the vertices of a spherical polygon
covering the dataset. Data providers are encouraged to choose a
minimal polygon. Example:
\begin{lstlisting}[language=XML]
<PARAM name="POLYGON" unit="deg" ucd="phys.angArea;obs"
datatype="double" arraysize="*"
xtype="polygon" value="">
<DESCRIPTION>A polygon to be contained by the cutout</DESCRIPTION>
<VALUES>
<MAX value="11.5 33.5 12.5 33.5 12.5 34.5 11.5 34.5"/>
</VALUES>
</PARAM>
\end{lstlisting}
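Reading the POLYGON metadata amounts to pairing up the numbers in MAX. A minimal Python sketch, again assuming a namespace-free fragment and using the made-up helper name covering_polygon:

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the POLYGON PARAM from the example, without namespace.
POLYGON_PARAM = """
<PARAM name="POLYGON" unit="deg" ucd="phys.angArea;obs"
       datatype="double" arraysize="*"
       xtype="polygon" value="">
  <VALUES>
    <MAX value="11.5 33.5 12.5 33.5 12.5 34.5 11.5 34.5"/>
  </VALUES>
</PARAM>
"""


def covering_polygon(param_xml):
    """Return the polygon vertices as a list of (ra, dec) pairs in degrees."""
    param = ET.fromstring(param_xml)
    coords = [float(token)
              for token in param.find("VALUES/MAX").get("value").split()]
    # A polygon needs at least three vertices, i.e. six numbers, in pairs.
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("POLYGON MAX must hold at least three RA/Dec pairs")
    return list(zip(coords[0::2], coords[1::2]))


vertices = covering_polygon(POLYGON_PARAM)
```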
Angles in both CIRCLE and POLYGON are in degrees. As in the input,
the ICRS reference system is assumed.
For POS, useful metadata cannot be given. Services supporting POS
should therefore provide POLYGON as well, and clients wishing to
use POS can infer sensible values for that parameter from
\xmlel{VALUES} given for POLYGON.
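One way a client might do that inference is to reuse the POLYGON vertices verbatim as a default POS value. The sketch below assumes the POS syntax of the SODA draft ("POLYGON" followed by the blank-separated coordinates) and a namespace-free fragment; the helper name default_pos is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: POLYGON metadata as given by the provider.
POLYGON_PARAM = """
<PARAM name="POLYGON" unit="deg" datatype="double" arraysize="*"
       xtype="polygon" value="">
  <VALUES>
    <MAX value="11.5 33.5 12.5 33.5 12.5 34.5 11.5 34.5"/>
  </VALUES>
</PARAM>
"""


def default_pos(polygon_param_xml):
    """Build a POS value covering the dataset from the POLYGON metadata."""
    param = ET.fromstring(polygon_param_xml)
    tokens = param.find("VALUES/MAX").get("value").split()
    # Keep the provider's number formatting; only normalize whitespace.
    return "POLYGON " + " ".join(tokens)


pos_value = default_pos(POLYGON_PARAM)
```

The resulting string would still need URL encoding before it goes into a request.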
Sorry for coming in with this now, but the
<MAX value="300.0e-9 800.0e-9"/>
that really shocked me only came in in July, so perhaps it's not my
fault alone.
Anyway: content-wise I claim the changes with my proposal are minimal
(essentially, just use MIN/MAX as intended), but implementors have a
much clearer guideline as to what to implement against.
So -- can we just replace 4.3 as suggested?
Cheers,
Markus