TAP discovery in 1.1

Mon Sep 11 14:13:38 CEST 2017

Dear DAL, dear Registry,

[since this concerns the TAP standard, I'd say followups should go to
the DAL list exclusively, but Registry should know about it, since
it's essentially a Registry problem]

On contemplating the implications of chapter 2.4 of PR-TAP 1.1, I
wonder what to make of its rules concerning the capabilities.  What's
in there now will most certainly complicate service discovery, and I
believe even confuse clients that do everything right according to
current rules and recommendations.

Background
----------

Leaving the "data discovery" use case[1] aside for now, many clients
want to enumerate all available services they can operate.  

Let's use the TAP version 1 example.  RegTAP has said to run

  SELECT ivoid, access_url 
  FROM rr.capability
  NATURAL JOIN rr.interface
  WHERE standard_id LIKE 'ivo://ivoa.net/std/tap%'
    AND intf_type='vs:paramhttp'

In the presence of new-style standard ids and auxiliary
capabilities this needs to be modified to

  SELECT ivoid, access_url
  FROM rr.capability
  NATURAL JOIN rr.interface
  WHERE standard_id LIKE 'ivo://ivoa.net/std/tap#sync-1%'
    AND intf_type='vs:paramhttp'

and it's probably wise to further constrain "and intf_role LIKE
'std%'", but that's not what current clients implement, anyway.  

What clients in all likelihood expect is that each service appears
only once in the result sets of these queries.

With current PR-TAP 1.1, both patterns would break: the uniqueness
expectation would be violated, and the three services a current
client would probably see with a capability element like the one in
section 2.4 have a high likelihood of appearing broken to it, because
the interfaces discovered would require authentication.
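
To illustrate the uniqueness problem, here is a minimal sketch that
runs the new-style query against toy tables in SQLite.  The table
layout is reduced to the columns the query needs, and the ivoid
ivo://example/tap is made up; the access URLs are the ones from the
section 2.4 style example below.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database as "rr" so that the table names
# rr.capability and rr.interface can be used exactly as in the queries.
conn.execute("ATTACH DATABASE ':memory:' AS rr")
conn.executescript("""
CREATE TABLE rr.capability (ivoid TEXT, cap_index INTEGER, standard_id TEXT);
CREATE TABLE rr.interface (ivoid TEXT, cap_index INTEGER, intf_index INTEGER,
                           intf_type TEXT, intf_role TEXT, access_url TEXT);
-- One hypothetical service with a single sync capability ...
INSERT INTO rr.capability VALUES
  ('ivo://example/tap', 1, 'ivo://ivoa.net/std/tap#sync-1.1');
-- ... that carries three interfaces, all vs:ParamHTTP with role std.
INSERT INTO rr.interface VALUES
  ('ivo://example/tap', 1, 1, 'vs:paramhttp', 'std',
   'http://example.net/myTAP/sync'),
  ('ivo://example/tap', 1, 2, 'vs:paramhttp', 'std',
   'http://example.net/myTAP/auth-sync'),
  ('ivo://example/tap', 1, 3, 'vs:paramhttp', 'std',
   'https://example.net/myTAP/sync');
""")

QUERY = """
SELECT ivoid, access_url
FROM rr.capability
NATURAL JOIN rr.interface
WHERE standard_id LIKE 'ivo://ivoa.net/std/tap#sync-1%'
  AND intf_type='vs:paramhttp'
"""
rows = conn.execute(QUERY).fetchall()
distinct_ivoids = {ivoid for ivoid, _ in rows}
print(len(rows), "rows for", len(distinct_ivoids), "service(s)")
```

The one service comes back once per interface, so a client that
treats every row as a separate service would list it three times.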

Let me try to discuss the two problems:

PR-TAP-1.1 changes 1: multiple interfaces
-----------------------------------------

PR-TAP 1.1 shows how each TAP capability can have multiple interfaces,
all of type vs:ParamHTTP and with @role="std".  The function of the
additional interfaces is to support different authentication methods.

That's trouble because clients right now typically, in effect, choose
a random interface.  In a case like this:

  <capability xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      standardID="ivo://ivoa.net/std/TAP#sync-1.1">
    <interface xsi:type="vs:ParamHTTP" role="std" version="1.1">
      <accessURL use="base">http://example.net/myTAP/sync</accessURL>
    </interface>
    <interface xsi:type="vs:ParamHTTP" role="std" version="1.1">
      <accessURL use="base">http://example.net/myTAP/auth-sync</accessURL>
      <securityMethod standardID="http://www.w3.org/Protocols/HTTP/1.0/spec.html#BasicAA"/>
    </interface>
    <interface xsi:type="vs:ParamHTTP" role="std" version="1.1">
      <accessURL use="base">https://example.net/myTAP/sync</accessURL>
      <securityMethod standardID="ivo://ivoa.net/sso#tls-with-certificate"/>
    </interface>
  </capability>

a client would end up at an interface that probably does not work for
it two times out of three.
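
The "two out of three" can be checked directly on the capability
element above.  This is only a sketch with Python's standard XML
parser; the rule "an interface with a securityMethod child needs
authentication" is my simplifying assumption for the illustration.

```python
import xml.etree.ElementTree as ET

CAPABILITY = """
<capability xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    standardID="ivo://ivoa.net/std/TAP#sync-1.1">
  <interface xsi:type="vs:ParamHTTP" role="std" version="1.1">
    <accessURL use="base">http://example.net/myTAP/sync</accessURL>
  </interface>
  <interface xsi:type="vs:ParamHTTP" role="std" version="1.1">
    <accessURL use="base">http://example.net/myTAP/auth-sync</accessURL>
    <securityMethod standardID="http://www.w3.org/Protocols/HTTP/1.0/spec.html#BasicAA"/>
  </interface>
  <interface xsi:type="vs:ParamHTTP" role="std" version="1.1">
    <accessURL use="base">https://example.net/myTAP/sync</accessURL>
    <securityMethod standardID="ivo://ivoa.net/sso#tls-with-certificate"/>
  </interface>
</capability>
"""

cap = ET.fromstring(CAPABILITY.strip())
interfaces = cap.findall("interface")

# Assumption: a securityMethod child means the interface needs credentials.
needs_auth = [i for i in interfaces if i.find("securityMethod") is not None]
open_urls = [i.findtext("accessURL").strip()
             for i in interfaces if i.find("securityMethod") is None]

print(len(needs_auth), "of", len(interfaces), "interfaces need authentication")
print("usable without credentials:", open_urls)
```

A client picking any one of the three interfaces without looking at
securityMethod has only one usable choice here.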

Possible fixes:

(1) Recommend that clients do discovery with securityMethod
constrained to None.

Plus: relatively clean

Minus: They currently don't, so legacy clients will break; also, at
least with current RegTAP, query patterns are ugly, and given
securityMethod is 0..n in interface, a relational mapping won't be
terribly pretty whatever we do.

(2) Drop role="std" from interfaces requiring authentication

Plus: simple, straightforward query pattern that was implicitly
recommended before (although RegTAP in effect discouraged the
practice).  No new features.

Minus: Fishy, because using auth doesn't make a TAP service non-TAP.
Also, current clients probably don't all constrain @role, either.

(2a) Actually, one of the reasons RegTAP discouraged the use of @role
in discovery is that VOResource lets people write role="std:foo"
and it's still supposed to be standard.  We could, of course, say
role="std" means exactly "unencumbered access to a standard
protocol" (and that's how they should be discovered), whereas, say,
role="std:encumbered" means that it's a standard interface that has
extra strings attached.  I'm not sure what other std:X terms we could
see after that, but that's not necessarily a disadvantage.

(3) Drop interfaces requiring authentication at the searchable
registries

Plus: Kinda makes sense, because at least right now you'll have to
get credentials anyway and hence perhaps don't need to discover the
services, learning about the access URLs when receiving your
credentials.  It's also simple to implement, and works flawlessly
with existing clients.

Minus: What happens once we have auth that works across services and
people actually want to discover services they may have access to?
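
To make options (1) and (2a) a bit more concrete, here is a rough
sketch on toy tables in SQLite.  The tables toy_interface and
toy_security are invented for this illustration and are not RegTAP
tables; in particular, putting securityMethod into a table of its own
is just one conceivable relational mapping for a 0..n element.  The
role value std:encumbered is the hypothetical term from (2a).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Toy tables for illustration only; these are NOT RegTAP tables.
CREATE TABLE toy_interface (ivoid TEXT, intf_index INTEGER,
                            intf_role TEXT, access_url TEXT);
CREATE TABLE toy_security (ivoid TEXT, intf_index INTEGER,
                           security_method_id TEXT);
INSERT INTO toy_interface VALUES
  ('ivo://example/tap', 1, 'std',
   'http://example.net/myTAP/sync'),
  ('ivo://example/tap', 2, 'std:encumbered',
   'http://example.net/myTAP/auth-sync'),
  ('ivo://example/tap', 3, 'std:encumbered',
   'https://example.net/myTAP/sync');
INSERT INTO toy_security VALUES
  ('ivo://example/tap', 2,
   'http://www.w3.org/Protocols/HTTP/1.0/spec.html#BasicAA'),
  ('ivo://example/tap', 3,
   'ivo://ivoa.net/sso#tls-with-certificate');
""")

# Option (1): keep only interfaces that have no securityMethod row.
OPTION_1 = """
SELECT ivoid, access_url FROM toy_interface AS i
WHERE NOT EXISTS (
  SELECT 1 FROM toy_security AS s
  WHERE s.ivoid = i.ivoid AND s.intf_index = i.intf_index)
"""
# Option (2a): match the role exactly, so std:encumbered is left out.
OPTION_2A = """
SELECT ivoid, access_url FROM toy_interface WHERE intf_role = 'std'
"""
# For comparison: a prefix match on the role still returns everything.
PREFIX = """
SELECT ivoid, access_url FROM toy_interface WHERE intf_role LIKE 'std%'
"""

open_by_security = conn.execute(OPTION_1).fetchall()
open_by_role = conn.execute(OPTION_2A).fetchall()
all_std = conn.execute(PREFIX).fetchall()
print(open_by_security, open_by_role, len(all_std))
```

Both constrained queries leave just the unauthenticated interface,
but (1) needs a subquery against a second table, while (2a) only works
if clients compare the role for equality rather than by prefix.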

Opinions?  Other alternatives?

We need to work out this stuff very soon -- some services requiring
some sort of authentication are online already, and current clients
can have fairly ugly failure modes on them, depending on what
technology is in use.

PR-TAP-1.1 changes 2: multiple capabilities
-------------------------------------------

This is orthogonal to problem 1: TAP 1.1 says there should be
separate capabilities for sync and async.  The standard query above
will then return each set of interfaces twice.  Also, at least I will
declare a TAP 1.0 capability for quite a while in addition because
otherwise my service will disappear from legacy clients.  In sum,
you'll have

<capability standardID="ivo://ivoa.net/std/TAP">
  ...

<capability standardID="ivo://ivoa.net/std/TAP#async-1.1">
  ...

<capability standardID="ivo://ivoa.net/std/TAP#sync-1.1">
  ...

as in the PR-TAP 1.1 example.  In addition, I suppose each of these
capabilities should contain all TAPRegExt metadata, which then
expands to quite a mouthful of XML on top.
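
As a quick check of which of these three standardIDs the two query
patterns from the Background section pick up, here is a small sketch
that uses SQLite's LIKE as a stand-in for the ADQL one.  The IDs are
lowercased first, on the assumption (suggested by the lowercase
patterns in the queries) that this is how the registry stores them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The three standardIDs from the example, lowercased (assumption: the
# registry's standard_id column holds lowercased values).
standard_ids = [s.lower() for s in (
    "ivo://ivoa.net/std/TAP",
    "ivo://ivoa.net/std/TAP#async-1.1",
    "ivo://ivoa.net/std/TAP#sync-1.1",
)]

patterns = {
    "legacy": "ivo://ivoa.net/std/tap%",
    "new-style": "ivo://ivoa.net/std/tap#sync-1%",
}


def matching(pattern):
    """Return the standard ids that the LIKE pattern matches."""
    return [sid for sid in standard_ids
            if conn.execute("SELECT ? LIKE ?", (sid, pattern)).fetchone()[0]]


matches = {name: matching(pattern) for name, pattern in patterns.items()}
for name, hits in matches.items():
    print(name, "matches", len(hits), "capabilities:", hits)
```

The legacy pattern matches all three capabilities of such a record,
so every interface declared under each of them shows up in the result;
the new-style pattern matches only the sync one.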

I think we all agree that all these capabilities should be shown in a
user interface as one single service.  How, then, does a client know
which capability to choose as representative, provided it is smart
enough to group by ivoid (which is already a bummer for backwards
compatibility)?  Can we come up with a query pattern that is all of
simple, transparent, backwards compatible, and workable in the
general case?

Since 

(a) these are fairly tricky questions, the answers to which I'd like
    to see proven in practice,
(b) we have the can of worms that is authenticated services and their
    relationship to discovery already for this standard, and
(c) this is a point release, so we shouldn't really do anything that
    breaks legacy clients (including discovery),

I wonder: Is there anything we need to have the split sync/async
capabilities in TAP 1 for?  Is it urgent (e.g., I'm not sure I find
the ability to declare two different limits on the two endpoints
proportional to the impacts the whole thing has)?  Are there
alternatives?

If not, then, frankly, I'd much rather keep a single capability with
the old ivo://ivoa.net/std/TAP standardID.

Yes, its form does not comply with the new-style standardIDs that we
want to have in the future, but then TAP 1 predates Identifiers 2,
and changes of the standard id are, given what TAP 1 said, at best
borderline within a major version series (unless you've planned for it,
of course, as with the new-style standard ids).

Cheers?  Boos?

         -- Markus

[1] see
http://ivoa.net/documents/Notes/discovercollections/20161111/PEN-discovercollections-1.1-20161111.html
if you're wondering what that is