VOSI availability

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Jun 15 13:52:05 CEST 2021


Dear DAL, dear GWS,

[I'd suggest followups to DAL only]

In order to take some steps to remedy the problems pointed out in
http://ivoa.net/documents/caproles/, I'd like to tackle the problem
of the VOSI availability endpoint, because I think it's the clearest,
simplest, and the one with the least legacy to keep in mind.  

To recapitulate, the problem in short is: availabilities of different
endpoints (e.g., TAP vs. SSAP, auth vs. unauth, primary site vs.
mirror) can vary independently as machines, network components and
support services fail and get fixed.  In contrast, we've been telling
people to declare VOSI availability (a10y for short) as a single
VOResource capability, and thus there is just one a10y per resource.

In particular, in the case of primary site vs. mirrors, that
essentially defeats the purpose of a10y.

Now... how do we fix this?

I see three possible ways forward:

(1) drop a10y from VOSI.  The a10y endpoint has now been defined for
more than 10 years, and I'm not aware of any interoperable client
meaningfully consuming it.  If I look at the a10y endpoints not
directly attached to a TAP service (use

  with avids as (
    select ivoid from rr.capability where standard_id like '%availability')
  select ivoid from avids
  where not exists(
    select 1 from rr.capability as c
    where standard_id='ivo://ivoa.net/std/tap'
      and c.ivoid=avids.ivoid)

on http://dc.zah.uni-heidelberg.de/tap), this list is dominated by
some stubborn DaCHSes (and don't look at how they produce their
availabilities...) and a few early Astrogrid services (with
standardIDs of ivo://org.astrogrid/std/vosi/v0.3#availability).

That would suggest to me that there's no really strong
interoperable use case for a10y *that's realistic to attain*.  

Sure, as a service operator, I'd like to be able to say something
like "Upgrading database cluster; expect to be back at 10 UTC" or so
when I have to take my box offline, perhaps even "This thing is gone
forever; use that thing instead".  But interest in this on the side
of client writers seems to be rather limited, and whether there's
user interest for it I doubt (they'll go for "just keep it running,
thank you" any day).

I'm told CADC is using availability in internal processes -- but for
that, we don't need to burden our standards, and we don't need to
accept the disgrace that 10 years into VOSI's declaration that
all services MUST implement VOSI a10y, only about 400 out of about
23000 actually do (and the rest are thus in violation of a REC).


(2) Treat a10y like VOSI capabilities.  We get away with declaring the VOSI
capability endpoint using a VOResource capability element because it
returns all info for the entire resource.

We could take this principle to a10y by having it return a mapping
(access-url -> human-readable-message).  This would work for a single
service with mirrors; it would even work when a big data centre has
that mapping for all of its services in one file in an off-site,
high-availability server, and thus could still provide meaningful
info even if their reverse proxy's parallel port is on fire.
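
To make the mapping idea a bit more concrete, here is a minimal sketch
of what a client could do with such a document.  Nothing in it is
specified anywhere: the JSON serialisation, the example.org URLs, and
the helper names load_availability_map and message_for are all
assumptions made up for illustration; an actual spec might well choose
XML or something else entirely.

```python
import json

# Hypothetical a10y mapping: one entry per access URL.
# The JSON layout and the URLs are assumptions for illustration only.
SAMPLE = """
{
  "http://dc.example.org/tap":
    "Upgrading database cluster; expect to be back at 10 UTC",
  "http://mirror.example.org/tap": ""
}
"""


def load_availability_map(text):
    """Parse the mapping access URL -> human-readable message."""
    mapping = json.loads(text)
    if not isinstance(mapping, dict):
        raise ValueError("expected a JSON object keyed by access URL")
    return mapping


def message_for(mapping, access_url):
    """Return the operator's message for access_url, or None.

    An empty or missing message is read as "nothing to report";
    that convention is an assumption of this sketch.
    """
    return mapping.get(access_url) or None


availability = load_availability_map(SAMPLE)
```

A client that fails to reach an access URL could then show
message_for(availability, url) instead of a bare network error.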

While I'm not overly happy with calling whatever that a10y endpoint
(well, the static file) is a "capability" of a service, at least
logically there's nothing wrong with attaching this as a capability
with a standard id http://ivoa.net/std/vosi/availability2.  I'd
implement that for DaCHS in some way, I think, but someone else would
have to actually write the spec and put it into VOSI.


(3) Change VOResource's mirrorURL and accessURL to include an
availabilityURL attribute pointing to your old VOSI 1.1 a10y
document.  You'd then write something like

  <capability>
    <interface xsi:type="vr:WebBrowser">
      <accessURL use="full"
        availabilityURL="http://big.iron/g-av/wirr-q-ui-fixed.xml"
        >http://localhost:8080/wirr/q/ui/fixed</accessURL>
      <mirrorURL use="full"
        availabilityURL="http://mirror.big.iron/g-av/wirr-q-ui-fixed.xml"
        >http://mirror.g-vo.org/wirr/q/ui/fixed</mirrorURL>
    </interface>
  </capability>

We'd have a minor change in VOResource for that. I'd volunteer for
guiding that change through the standards process.
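
On the client side, picking up the proposed attribute would be
cheap.  The following is a rough sketch using only the Python standard
library.  It reuses the capability fragment from above, with an
xmlns:xsi declaration added so the fragment parses on its own;
availabilityURL is the attribute proposed here and not part of
VOResource today, and the function name availability_urls is
hypothetical.

```python
import xml.etree.ElementTree as ET

# The capability fragment from the proposal; xmlns:xsi is added so it
# parses standalone.  availabilityURL is the *proposed* attribute.
CAPABILITY = """
<capability xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <interface xsi:type="vr:WebBrowser">
    <accessURL use="full"
      availabilityURL="http://big.iron/g-av/wirr-q-ui-fixed.xml"
      >http://localhost:8080/wirr/q/ui/fixed</accessURL>
    <mirrorURL use="full"
      availabilityURL="http://mirror.big.iron/g-av/wirr-q-ui-fixed.xml"
      >http://mirror.g-vo.org/wirr/q/ui/fixed</mirrorURL>
  </interface>
</capability>
"""


def availability_urls(capability_xml):
    """Return (endpoint URL, availability URL or None) pairs,
    in document order, for every accessURL and mirrorURL."""
    root = ET.fromstring(capability_xml.strip())
    pairs = []
    for interface in root.iter("interface"):
        for el in interface:
            if el.tag in ("accessURL", "mirrorURL"):
                endpoint = (el.text or "").strip()
                pairs.append((endpoint, el.get("availabilityURL")))
    return pairs


pairs = availability_urls(CAPABILITY)
```

Each endpoint then carries its own pointer to an availability
document, so a client can check the mirror independently of the
primary site.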


So... What do we do?  I don't have a major favourite among those;
given I've never *much* missed a10y, I think my first preference
would still be (1).  It would seem that (2) gives us what I think the
one plausible use case is ("In case of failure provide a meaningful
error message"), so that's nice.  (3) has the clear advantage that we
don't need to touch VOSI (except to say the thing with having the
VOSI endpoints in capabilities was a bad idea, but that we must do
anyway).  However, given that I think the typical mode of VOSI a10y
is static files on a rock-solid, ultra-plain web server, having to
potentially keep thousands of files updated when one of reasonable
size (about 100 bytes per endpoint) would do isn't what I'd call an
obvious solution.

What about you?  Is your order of preference (1), (2), (3), too?  Do
you have other strong use cases for VOSI a10y that I've missed so
far?  Do you see other alternatives?

           -- Markus

