SODA gripes (3): Explanatory introduction

Fri Feb 12 10:58:48 CET 2016

Hello,

I think that this kind of explanatory introduction is very helpful in 
getting people through the IVOA documents. To me, the right place for such 
an introduction is inside the spec rather than in a separate note.

I have a comment about the cases detailed in Markus's note.
The example of "a service containing relatively homogeneous results of a
single instrument," on which the 1.4.3 scheme is based, could be 
extended to DAL responses compliant with Obscore, since it is quite 
straightforward to build a SODA query from Obscore parameters. I know 
how reluctant Markus is to consider Obscore parameters as 
natural ranges for SODA parameter values, but I still believe that 
this remains legal as long as these values have not been overridden in the 
PARAM description of the service descriptor itself.
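
To illustrate how direct the mapping from Obscore to SODA can be, here is a
minimal Python sketch. It is only an illustration under stated assumptions,
not something taken from the draft: the function name, the row layout (a dict
keyed by Obscore column names) and the sync URL are hypothetical, and the
parameter syntax (POS as "CIRCLE ra dec radius", BAND and TIME as "min max"
pairs, one POL per state) is assumed and may differ from what the final spec
settles on. It also assumes that s_fov is a diameter, so half of it is used
as the circle radius.

```python
from urllib.parse import urlencode


def soda_query_from_obscore(row, sync_url):
    """Build a SODA sync URL covering the full Obscore extent of one row.

    `row` is a hypothetical dict keyed by Obscore column names;
    `sync_url` is the (assumed) base URL of the SODA sync resource.
    """
    params = [("ID", row["obs_publisher_did"])]

    # Assumption: s_fov is a diameter in degrees, POS takes a radius.
    radius = row["s_fov"] / 2
    params.append(
        ("POS", "CIRCLE {} {} {}".format(row["s_ra"], row["s_dec"], radius)))

    # em_min/em_max are wavelengths in metres, the unit declared for BAND.
    params.append(("BAND", "{} {}".format(row["em_min"], row["em_max"])))

    # t_min/t_max are MJD, i.e. days, the unit declared for TIME.
    params.append(("TIME", "{} {}".format(row["t_min"], row["t_max"])))

    # pol_states looks like "/I/Q/U/"; emit one POL parameter per state.
    for state in row["pol_states"].strip("/").split("/"):
        if state:
            params.append(("POL", state))

    return sync_url + "?" + urlencode(params)


example_row = {
    "obs_publisher_did": "ivo://example.com/data?ds1",
    "s_ra": 10.5, "s_dec": -20.0, "s_fov": 0.5,
    "em_min": 4e-07, "em_max": 7e-07,
    "t_min": 55000.0, "t_max": 55000.5,
    "pol_states": "/I/Q/",
}
print(soda_query_from_obscore(example_row, "http://example.com/soda/sync"))
```

A client would of course narrow these values to the region of interest; the
point is only that the Obscore columns give usable defaults when the
descriptor's PARAMs do not declare their own ranges.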

Laurent

On 27/01/2016 09:47, Markus Demleitner wrote:
> Dear Colleagues,
>
> In his mail
> http://mail.ivoa.net/pipermail/dal/2016-January/007235.html Mark
> wrote:
>
>> Finally (at least for now), it's not obvious to me from this document
>> how to actually use a SODA service.  Possibly that's because I'm
>> not familiar enough with Datalink or other associated standards,
>
> -- and from other discussions it seems to me that that's a fairly
> common sentiment.
>
> I hence think that we should have an explanatory introduction as to
> how SODA is intended to work, and what clients and servers have to do
> to implement it.  I've drafted such an explanatory introduction in a
> branch of the SODA document at
> https://volute.g-vo.org/svn/branches/SODA-markus[1].
>
> The new, explanatory chapter is 1.4, with three subsections (diffs
> below).  If you don't want to build the document yourself, I've
> uploaded a PDF to http://docs.g-vo.org/SODA-expl.pdf
>
> Opinions?  Problems with the general drift or individual questions
> addressed?  Does this even help understanding SODA?  What changes
> would people want before they'd consider this ready for merging into
> trunk?
>
> Cheers,
>
>          Markus
>
>
> [1] If you have the source checked out and want to switch to the
> branch, say
>
> svn switch https://volute.g-vo.org/svn/branches/SODA-markus
>
> To get back to the editor's draft, you'd say
>
> svn switch  https://volute.g-vo.org/svn/trunk/projects/dal/SODA/
>
>
>
> So, here's a diff against SODA-trunk (probably reading the PDF is
> easier on your eyes, and the changes are really just one diff hunk,
> but just in case):
>
> svn diff https://volute.g-vo.org/svn/trunk/projects/dal/SODA/ .
> Index: SODA.tex
> ===================================================================
> --- SODA.tex	(https://volute.g-vo.org/svn/trunk/projects/dal/SODA)	(revision 3231)
> +++ SODA.tex	(working copy)
> @@ -121,6 +121,230 @@
>   This use case will be developed and supported in the
>   SODA-1.1 (or later) specification.
>
> +\subsection{SODA Operation}
> +
> +In contrast to other IVOA protocols, SODA services are not usually
> +discovered through Registry queries.  Instead, clients encounter them
> +in Datalink \citep{std:datalink} declarations, which can either be
> +standalone or embedded within other services' responses.
> +
> +Since this pattern can appear somewhat confusing at first, this
> +introductory (non-normative) chapter discusses the usage scenarios for
> +SODA services.  In parallel, we provide advice on the server-side
> +implications of these scenarios.
> +
> +In all cases, the first step is a data discovery service; when used
> +below, this term could refer to, for instance, SIA, SSA, or ObsTAP, but also
> +to some sort of resolution engine for persistent identifiers.
> +
> +\subsubsection{Pure Datalink discovery}
> +\label{sect:pure-datalink}
> +
> +In the baseline scenario, the data discovery step has yielded a
> +result with the media type
> +$$\hbox{\texttt{application/x-votable+xml;content=datalink}}.$$
> +
> +To the client, this indicates that what is given in the access reference
> +(e.g., the \texttt{access\_url} column in ObsTAP or SIA version 2 or the column
> +with the UCD \texttt{VOX:Image\_AccessReference} in SIA version 1) is a datalink
> +document.  Within that document, there is a SODA service descriptor
> +written as specified by Datalink.  The whole document would look
> +somewhat like this:\todo{of course,
> +this needs descriptions and ranges;  if this text is accepted for the
> +main standard, MD will fill this in.}
> +
> +\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
> +<RESOURCE type="results">
> +  [datalink links, one of them being]
> +  <TR>[id=ivo://example.com/data?ds1 service-def=soda; semantics=#proc]</TR>
> +</RESOURCE>
> +
> +<RESOURCE type="meta" utype="adhoc:service" ID="soda">
> +
> +  <PARAM name="standardID" datatype="char" arraysize="*"
> +        value="ivo://ivoa.net/std/SODA#sync-1.0" />
> +
> +  <PARAM name="accessURL" datatype="char" arraysize="*"
> +        value="http://example.com/my-svcs/soda/sync?ID=ivo://example.com/data?ds1" />
> +
> +  <GROUP name="inputParams">
> +     <PARAM name="POS" ucd="phys.angArea;obs" datatype="char"
> +          arraysize="*" />
> +     <PARAM name="BAND" ucd="em.wl" unit="m" datatype="double"
> +          arraysize="*"/>
> +     <PARAM name="TIME" ucd="time.interval;obs.exposure"
> +          unit="d" datatype="double"
> +          arraysize="*" xtype="interval"  />
> +     <PARAM name="POL" ucd="meta.code;phys.polarization" datatype="char"
> +          arraysize="*" />
> +  </GROUP>
> +</RESOURCE>
> +\end{lstlisting}
> +
> +Of course, the service is free to choose the VOTable ID of the resource
> +with the utype
> +\texttt{adhoc:service}; the service will only declare the parameters it
> +(and the underlying data) actually supports.
> +
> +From the Datalink row with \texttt{\#proc} semantics,
> +the client sees that there is a service for the
> +dataset in question (identified here through its publisher DID,
> +\nolinkurl{ivo://example.com/data?ds1}), and from the service
> +descriptor's standardID \xmlel{PARAM} it learns that the service's
> +parameters follow the rules laid down here, in particular as regards the
> +three-factor semantics.  For instance, the client is guaranteed that
> +BAND, with UCD \texttt{em.wl} and unit meters, actually denotes the parameter
> +controlling where a cutout on the spectral axis will happen.
> +
> +SODA's role here is exactly this guarantee of a specific semantics, as
> +opposed to a non-standard service that could use BAND in an entirely
> +different way.
> +
> +An attractive implementation strategy for small-to-medium sized
> +installations is to pre-generate the datalink files.  In that way, no
> +extra endpoint is required besides the discovery service and the SODA
> +service.
> +
> +Here is a sketch of the query pattern in this case\todo{If people think
> +this is a good idea, I'll do SVGs of these}:
> +
> +\begin{verbatim}
> +Client ---- discovery query ----> DAL service
> +                                     |
> +     +----- Results with ------------+
> +     |      Datalink-valued accrefs
> +     v
> + DAL client --- retrieves accref ---> e.g., plain HTTP
> +                                              service
> +                                                |
> +    +-------- Datalink document with -----------+
> +    |         SODA descriptor
> +    v
> + SODA client -----> SODA instructions ----> SODA service
> +                                               |
> +Data viewer <------ sliced-and-diced data -----+
> +\end{verbatim}
> +
> +\subsubsection{Datalink Discovery with Backward Compatibility}
> +\label{sect:dlplusbackward}
> +
> +The problem with the scheme discussed in sect.~\ref{sect:pure-datalink} is
> +that legacy clients, i.e., those that do not understand Datalink, will
> +not be able to interpret the results of the discovery step.  While this
> +is probably desirable when services hand out large data cubes that
> +legacy clients probably will not properly handle anyway, in many other
> +situations services should deliver conventional (e.g., FITS) data
> +products to such legacy clients.  To still enable SODA and other
> +Datalink functionality, DAL services can add a service descriptor in the
> +DAL response that indicates the availability of a Datalink
> +\emph{service} accompanying the DAL service, looking more or less like
> +this:
> +
> +\begin{lstlisting}[language=XML]
> +<RESOURCE type="results">
> +  [a result from services like TAP, SIA, SSA]
> +  <TABLE>
> +    [in particular, we have one field like]
> +    <FIELD ID="primaryID" name="pubDID" datatype="char" arraysize="*">
> +      <DESCRIPTION>The publisher DID for the dataset</DESCRIPTION>
> +    </FIELD>
> +    ...
> +  </TABLE>
> +</RESOURCE>
> +<RESOURCE type="meta" utype="adhoc:service">
> +  <PARAM name="standardID" datatype="char" arraysize="*"
> +    value="ivo://ivoa.net/std/DataLink#links-1.0" />
> +  <PARAM name="accessURL" datatype="char" arraysize="*"
> +    value="http://example.com/mylinks/get" />
> +  <GROUP name="inputParams">
> +    <PARAM name="ID" datatype="char" arraysize="*"
> +      value="" ref="primaryID"/>
> +  </GROUP>
> +</RESOURCE>
> +\end{lstlisting}
> +
> +Note that while this looks very similar to the SODA descriptor above,
> +this fragment is in the DAL response rather than within a Datalink
> +document itself, and it also describes a Datalink rather than a SODA
> +service.
> +
> +It references one (or more)
> +field(s) from the DAL response.\footnote{That pattern can be used
> +within the Datalink document as in sect.~\ref{sect:pure-datalink}, too,
> +to refer to Datalink's ID column, which lets services use a constant
> +access URL in the SODA descriptor.}
> +This is explained in more detail in section 4.2 of
> +the Datalink recommendation 1.0.  The net result is that
> +datalink-enabled clients can find ancillary data and use SODA services
> +for data access by virtue of being able to retrieve Datalink documents,
> +whereas legacy clients still retain basic functionality.
> +
> +On the service side, this incurs the additional cost of having to
> +provide a datalink \{links\} resource; on the client side, some extra
> +dereferencing becomes necessary.  Hence, this pattern should be
> +preferred over the simpler pattern from sect.~\ref{sect:pure-datalink}
> +only if there is a significant advantage in serving data to legacy
> +clients.
> +
> +The query pattern in this case looks like this:
> +
> +\begin{verbatim}
> +
> +Client ---- discovery query ----> DAL service
> +                                     |
> +     +----- Results with ------------+
> +     |      pubDIDs and a {links} descriptor
> +     v
> + Datalink client ----- ID=pubDID -----> Datalink service
> +                                                |
> +    +-------- datalink document with -----------+
> +    |         SODA descriptor
> +    v
> + SODA client -----> SODA instructions ----> SODA service
> +                                               |
> +Data viewer <------ sliced-and-diced data -----+
> +\end{verbatim}
> +
> +\subsubsection{Sidestepping Datalink}
> +
> +In some situations, the extra request to retrieve the datalink document
> +for each dataset is inconvenient, while the client may have sufficient
> +information to operate the SODA service based on common metadata.  A
> +classic example would be a service containing relatively homogeneous
> +results of a single instrument, perhaps a spectrograph where all
> +spectra essentially have the same spectral coverage and a client may
> +want to only retrieve, say, the vicinity of a spectral line.
> +
> +In such cases a service may provide a shortcut by including a SODA
> +descriptor directly in the DAL response.  In essence the resulting
> +descriptor looks like a union of the one given in
> +sect.~\ref{sect:pure-datalink} and the one given in
> +sect.~\ref{sect:dlplusbackward}: It includes the SODA parameters, the ID
> +parameter with the reference to the column to take the publisher DID
> +from, but it has a SODA standardID from sect.~\ref{sect:pure-datalink}
> +rather than the Datalink one from sect.~\ref{sect:dlplusbackward}.
> +
> +While sidestepping the extra datalink request might appear attractive in
> +principle, the difficulty of determining the useful parameter ranges
> +makes this pattern interesting only in relatively few special cases.
> +Clients must not rely on the presence of full SODA descriptors in DAL
> +responses.  Normal SODA operation follows the pattern given in
> +sects.~\ref{sect:pure-datalink} and~\ref{sect:dlplusbackward}.
> +
> +The query pattern here is:
> +
> +\begin{verbatim}
> +Client ---- discovery query ----> DAL service
> +                                     |
> +     +----- Results with ------------+
> +     |      SODA descriptor
> +     v
> + SODA client -----> SODA instructions ----> SODA service
> +                                               |
> +Data viewer <------ sliced-and-diced data -----+
> +\end{verbatim}
> +
> +
>   \section{Resources}
>
>   SODA services are implemented as HTTP REST \citep{richardson07} web
>

-- 
---- Laurent MICHEL              Tel  (33 0) 3 68 85 24 37
      Observatoire de Strasbourg  Fax  (33 0) 3 68 85 24 32
      11 Rue de l'Universite      Mail laurent.michel at astro.unistra.fr
      67000 Strasbourg (France)   Web  http://astro.u-strasbg.fr/~michel
---