"DALI services" in VOSI and Registry

Wed Oct 17 10:18:50 CEST 2018

Dear Apps, Registry, DAL, and GWS,

First, apologies for the wide cross-post, but a decision here has a
fairly wide impact. I'd suggest to follow up to Registry, as I think
whatever we come up with here should end up in VODataService; let's
avoid group replies here.

This is again on the topic of registering "DALI services", i.e.,
services with (possibly) sync, async, tables, capabilities, and examples
endpoints[1]; apart from Registry and hence discovery, this also
concerns what should come back from /capabilities (which might in
particular matter for authenticated services).

What made me think a bit more is that client authors keep saying they're
really interested in "bundles" for TAP (which currently is our main
example for such DALI services). These are combinations of sync,
async, tables, capabilities, and perhaps examples.  Thinking a bit
further, I found I'm fairly convinced they'll want that with all future
DALI-compliant services: as a client, you'll quite usually need a
substantial subset of the various endpoints defined by DALI to operate a
service.

But if you need all the endpoints to reliably operate the service, aren't
they all part of the service's interface?

Did you notice the last "the" in the previous question?  Well, this
question is how I've convinced myself that all the DALI-prescribed
endpoints are really part of *one* interface.  Which to me indicates
that we'd be doing ourselves a favour if we modelled them as such in
VOResource.

That, in turn, would mean that future DALI service discovery should be
based on a DALI *interface* type, which I'd like to include in
VODataService.  Such an interface, for a TAP service, would look like
this:

  <capability standardID="ivo://ivoa.net/std/TAP">
    <interface xsi:type="vs:DALIInterface">
      <accessURL>http://dc.g-vo.org/tap</accessURL>
      <endpoint>sync</endpoint>
      <endpoint>async</endpoint>
      <endpoint>capabilities</endpoint>
      <endpoint>tables</endpoint>
      <endpoint>examples</endpoint>
    </interface>
  </capability>

In endpoint elements, you'd enumerate the bespoke names (defined, I'd
say, in a little vocabulary within DALI) of the interface endpoints
that are available on the service.  These happen to coincide with the
fixed URI segment to be appended to the accessURL to reach the
endpoint.

This would be compatible with what TAP 1.0 does but adds explicitness,
and it would cater to other DALI standards that may not require all of
these endpoints (e.g., Datalink, SIAv2,...).  One *could* do without the
endpoint enumeration, as clients could probe for their presence, but
let's save them the extra requests.
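
As a rough illustration of what a client would do with such a record,
here is a minimal Python sketch.  It assumes the proposed (not yet
standardised) vs:DALIInterface layout from above, with the xsi
namespace declared so the fragment parses on its own; the function name
endpoint_urls is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical record following the layout proposed in this mail.
RECORD = """
<capability xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    standardID="ivo://ivoa.net/std/TAP">
  <interface xsi:type="vs:DALIInterface">
    <accessURL>http://dc.g-vo.org/tap</accessURL>
    <endpoint>sync</endpoint>
    <endpoint>async</endpoint>
    <endpoint>capabilities</endpoint>
    <endpoint>tables</endpoint>
    <endpoint>examples</endpoint>
  </interface>
</capability>
"""

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def endpoint_urls(capability_xml):
    """Map each declared endpoint name to accessURL + "/" + name."""
    root = ET.fromstring(capability_xml.strip())
    urls = {}
    for interface in root.iter("interface"):
        # Only look at interfaces of the proposed type.
        if interface.get(XSI_TYPE) != "vs:DALIInterface":
            continue
        base = interface.findtext("accessURL").strip().rstrip("/")
        for endpoint in interface.iter("endpoint"):
            name = endpoint.text.strip()
            urls[name] = base + "/" + name
    return urls


if __name__ == "__main__":
    for name, url in endpoint_urls(RECORD).items():
        print(name, url)
```

No probing requests are needed: everything the client wants to know is
in the one interface element.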

The following is a longer argument for why I like this better than any
of the previous proposals and/or the current state, and what it would
mean for our current and future practices.

Right now (previously, we had some variations on this), the UWSRegExt
proposal suggests something like (rough sketch):

<capability TAP>
  <uwsInterface sync url="http://example.org/foo/sync">
  <uwsInterface async url="http://example.org/foo/async">
  <uwsInterface sync url="http://example.org/foo/sec-sync" security="basic">
  <uwsInterface async url="http://example.org/foo/sec-async" security="basic">
  <uwsInterface sync url="http://example.org/foo/x-sync" security="advanced">
  <uwsInterface async url="http://example.org/foo/x-async" security="advanced">
</capability>

<capability VOSI-capabilities>
  <paramHTTPInterface url="http://example.org/foo/capabilities">
</capability>  (just one of them)

       ** AND EITHER **
<capability VOSI-tables>
  <paramHTTPInterface url="http://example.org/foo/tables">
  <paramHTTPInterface url="http://example.org/foo/sec-tables" security="basic">
  <paramHTTPInterface url="http://example.org/foo/sec-tables" 
    security="advanced">
</capability>

        ** OR **

<capability VOSI-tables>
  <paramHTTPInterface url="http://example.org/foo/tables">
</capability>

Now, to build a complete service interface, a client that has just
discovered a TAP capability has to assemble matching sync and async
endpoints, presumably based on securityMethod/@ivoid (sketched as
security in the example) -- but who knows?  Which other criteria might
occur in the future?  It will then have to get the other capabilities
from the capabilities endpoint; it is, I think, the selling point of
this approach that this is unique and can be found using the sibling
rule "take away the last segment from the URL, glue on capabilities".
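
For concreteness, the sibling rule quoted above could be sketched in
Python like this.  The function name is invented for illustration, and
the sketch assumes the endpoint URL carries no query string worth
preserving.

```python
from urllib.parse import urlsplit, urlunsplit


def sibling_capabilities_url(endpoint_url):
    """Drop the last path segment and glue on "capabilities"."""
    parts = urlsplit(endpoint_url)
    # "/foo/sec-async" -> "/foo"
    parent = parts.path.rstrip("/").rsplit("/", 1)[0]
    return urlunsplit(
        (parts.scheme, parts.netloc, parent + "/capabilities", "", ""))


if __name__ == "__main__":
    print(sibling_capabilities_url("http://example.org/foo/sync"))
```

With the URLs from the sketch, both foo/sync and foo/sec-async lead to
the same capabilities document.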

Now, in the other capabilities, the client will have to see if there are
interfaces with the securityMethod/@ivoid it is currently gathering for.
If there are none, I suppose they are to accept whatever interface is
given without a securityMethod (what happens when there's neither the
exact security method nor an open interface?).  And they will have to
repeat that procedure for examples and whatever other DALI endpoints we
will come up with.  Savour all this -- and then imagine how much fun
this becomes when we throw in mirrorURLs.
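
One possible reading of that matching heuristic, written out in Python.
This is a guess at what clients would end up implementing, not
something any draft prescribes; the dict layout and the name
pick_interface are made up for this example.

```python
def pick_interface(interfaces, security):
    """Return the URL for `security`, else an open interface, else None.

    `interfaces` is a list of dicts with a "url" key and an optional
    "security" key (standing in for securityMethod/@ivoid).
    """
    by_security = {
        iface.get("security"): iface["url"] for iface in interfaces}
    if security in by_security:
        return by_security[security]
    # Fall back to the interface declared without a security method.
    return by_security.get(None)


# The two VOSI-tables variants from the sketch above.
TABLES_EITHER = [
    {"url": "http://example.org/foo/tables"},
    {"url": "http://example.org/foo/sec-tables", "security": "basic"},
]
TABLES_OR = [
    {"url": "http://example.org/foo/tables"},
]

if __name__ == "__main__":
    print(pick_interface(TABLES_EITHER, "basic"))
    print(pick_interface(TABLES_OR, "basic"))
```

The None return is exactly the unanswered case: neither the requested
security method nor an open interface is available, and the client has
to invent its own policy.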

That's a lot of complexity we pile upon our client writers.  And it's a
lot of moving parts that our registry record writers can break.

On the other hand, with the DALI interface, you'd have (sketch again):

<capability TAP>
  <paramHTTP url="http://example.org/foo"/>
  <daliInterface url="http://example.org/foo">
    <endpoints>sync async capabilities tables examples</endpoints>
  </daliInterface>
  <daliInterface url="http://example.org/sec-foo" security="basic">
    <endpoints>sync async capabilities tables examples</endpoints>
  </daliInterface>
  <daliInterface url="http://example.org/x-foo" security="advanced">
    <endpoints>sync async capabilities tables examples</endpoints>
  </daliInterface>
</capability>

[as you can see, this nicely coexists with the legacy TAP 1.0 record;
it's just a couple of extra interfaces] -- and that's it.  Also,
mirrorURLs won't complicate the situation much, neither on the service
nor on the client side nor on the Registry side --

The *client* selects a security method, fetches the accessURL and
immediately knows all endpoints.  I consider this simplicity to be
really important -- it's the client writers first and foremost we need
to get on board.
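
To make the comparison tangible, here is a minimal Python sketch of
that client-side step.  The list of dicts merely stands in for parsed
daliInterface elements from the sketch above; its layout and the name
resolve are assumptions made for this example.

```python
# Stand-in for two of the daliInterface elements sketched above.
INTERFACES = [
    {"url": "http://example.org/foo", "security": None,
     "endpoints": "sync async capabilities tables examples"},
    {"url": "http://example.org/sec-foo", "security": "basic",
     "endpoints": "sync async capabilities tables examples"},
]


def resolve(interfaces, security):
    """Return {endpoint name: URL} for the interface using `security`."""
    for iface in interfaces:
        if iface["security"] == security:
            base = iface["url"].rstrip("/")
            return {name: f"{base}/{name}"
                    for name in iface["endpoints"].split()}
    raise LookupError(f"no interface with security method {security!r}")


if __name__ == "__main__":
    print(resolve(INTERFACES, "basic")["tables"])
```

One lookup, one string concatenation per endpoint, and no
cross-referencing between capabilities.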

On the *service* side, I'll be glad if I can legally get rid of the
stupid VOSI capability declarations at some time in the future (that
RegTAP won't have to list five capabilities any more where there's really
only one service clients actually want to talk to is another plus).  

The maintenance of the resource tree doesn't worry me; it's a wee bit
ugly that there's one capabilities document per interface containing
pointers to the others, but if you really only have one capability
document, just use a 303 redirect.  For the other endpoints (tables, in
particular) I have heard use cases where their responses should vary by
authentication, so for them it's actually simpler this way.  I'd hence
say this has Pat's use case of having per-path authentication nicely
covered.

Donning my *Registry* enthusiast hat, I can't see any major fault with
this scheme, either.  The basic discovery scheme is as with TAP 1.0 --
we all know and like this; it seems even a bit slicker when it comes to
declaring auxiliary capabilities.

As to transitioning from our current scheme to the new one: As shown
above, the daliInterfaces can nicely sit within our conventional TAP
capabilities next to ParamHTTP, so we can keep these cheaply as long as
necessary.  When more than, say, 90% of our TAP services have
daliInterfaces, the clients can switch the discovery pattern to use
daliInterface (or they could discover both interface types and
disambiguate client-side based on ivoid and capability index if they
want to be really diligent).

For SIAv2 (and datalink, in particular within VOSI-capability), the
pattern would be

  <capability SIAv2>
    <paramHTTPInterface url="http://example.org/sky/sync">
    <daliInterface url="http://example.org/sky">
  </capability>

-- the different URLs might look a bit odd at first, but it's still
smooth evolution; and if you want async SIAv2, you'll simply *have* to
use daliInterface; as in the UWSRegExt draft, I'm sure we shouldn't
start abusing ParamHTTP for UWS services.

What would need to happen to get there?

* TAP 1.1 has to re-fix endpoint names and remove the new-style
  capability example (that I consider marginally appropriate at best 
  anyway, even as an example)
* VODataService would define the type.  I'll do this if I hear some
  encouragement
* TAPRegExt 1.1 would use that new type (in addition to the legacy
  TAPRegExt 1.0-compatible ParamHTTP interface)
* DALI would need to say "To register fully DALI compliant services, use
  daliInterface. For partially DALI compliant or legacy services, the
  use of ParamHTTP is allowed.  In that latter case the sibling rule
  applies, except for TAP 1.0 (and we apologize for that exception)."
* We'd need a quick change (or perhaps just an erratum?) for
  SimpleDALRegExt for SIAv2.
* We'd need a small update for Datalink; fixing the endpoint names might
  be considered a major version step, but frankly, I doubt such an
  update would break any actual services or clients, so we might ask for
  an exception.
* Services and clients could evolve at their leisure, as daliInterface
  harmonises with legacy ParamHTTP.

Yeah, looking at this list you might think it's painful, but I'd say the
horror of limping on with clients having to piece together their
components using what look to me like ill-defined heuristics is a lot
worse.

TAP clients, by the way, would immediately work with daliInterface;
clients for other protocols would need to be taught TAP-style URL
algebra (which, again, has served us pretty well up to now).  In
particular, the URL algebra to switch from sync to async if necessary
would be as straightforward in the presence of by-path authentication as
it is now.

So... what's not to like?  Have I overlooked something?

            -- Markus

(who apologises for having used ParamHTTP in TAPRegExt; that shouldn't
have happened, and I should have defined a new interface type.  I don't
think we'd have this discussion now if I hadn't made that mistake)

[1] I'm leaving the VOSI availability endpoint out of these
considerations; it's a weird beast that I'd be happy to discuss
some other time.  But I'll reveal already that I think availability
should be per-interface (rather than per-resource, as it is now because
it's given as a capability) and should then say something about mirrors,
too.