Discovering Data Collections

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Tue Aug 11 15:51:18 CEST 2015


Dear Registry WG,

You may remember my talk in Sesto on discovering resources that are
part of other resources; currently, the main use case is tables in
TAP services.

As it is a somewhat cross-REC business, I thought I should write a
Note on it.  I've drafted one now, and I'd very much welcome your
input on it before submitting it to the document repository.

A formatted version is at http://docs.g-vo.org/discovercollections.pdf
-- sources are in volute at
https://volute.googlecode.com/svn/trunk/projects/registry/discovercollections/
and as usual I'm happy if people make changes there (or in volute's
replacement, when volute goes r/o).

I'd like to make this an Endorsed Note if these actually happen, or a
REC if it turns out that'll be needed; so, if you feel that's a
genuinely bad idea, this would be the perfect time to protest before
things move ahead any further.

If you think this is a great idea and want to help, I of course
welcome co-authors.


Also, I already have a question I'd welcome opinions on:

In a preliminary test with TOPCAT, Mark Taylor found it very much
necessary to be able to get from the "auxiliary" resource to the
"main" one, e.g., from a table to the embedding TAP service. Mark,
for instance, wants the title, the ivoid, and the number of tables
served for the main service (which he can get if he has the ivoid).

The current draft doesn't take this properly into account -- you can
go this way using the served-by relationship from the auxiliary to
the main resource, but that's only a SHOULD, and I've not really
planned for any important functionality to be built on it.

To satisfy Mark's requirements, I'm considering two solutions right
now (other proposals are of course highly welcome):

(a) Make the existing served-by relationship mandatory

Plus: it's a minimal change; relationship was pretty much created for
this kind of thing; it works already (Mark already has code that does
what he wants) -- but that's largely because we only have "tame"
resources so far.

Minus: the relationship element is lexically quite a bit removed from
the capability, so people might forget it, and there's no way to
enforce its presence via a schema; the relationship really isn't
between resource and resource but between capability and
capability (think of an SCS service, the table of which is queryable
by TAP -- is it proper to say the SCS service is served-by the TAP
service?); doesn't really work reliably when there are multiple
auxiliary capabilities (see below on this).

(b) Make an AuxiliaryCapability type in a little registry extension
that has an ivoid-valued mainService attribute.

Plus: All info (i.e., accessURL and ivoid) regarding the main
service is in one place; its presence is enforceable by XML schema;
semantically very precise ("look here for the service *this* is in");
hence, works with multiple auxiliary capabilities.

Minus: Major change, we'll probably need a proper REC for our schema;
very close in purpose to relationship; sucks a bit in RegTAP, as it
will end up in res_details, where value isn't case-normalised, which
IVORNs should be.
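
To illustrate the case-normalisation worry, here is a minimal Python
sketch of what a client might have to do when matching a mainService
value pulled from res_details against an ivoid it got elsewhere.  The
function names are made up for illustration, and lowercasing the whole
string is an assumption about what "normalised" would mean here, not
something the draft prescribes.

```python
def normalise_ivoid(ivoid: str) -> str:
    """Return a lowercased, whitespace-stripped form of an ivoid.

    Assumption: lowercasing the complete string is good enough for
    comparing resource identifiers.
    """
    return ivoid.strip().lower()


def same_ivoid(a: str, b: str) -> bool:
    """Compare two ivoids without regard to case."""
    return normalise_ivoid(a) == normalise_ivoid(b)


# A value as it might be written in a record vs. a lowercased form.
# Both identifiers are invented examples.
print(same_ivoid("ivo://Example.Org/TAP", "ivo://example.org/tap"))
```

With something like this on the client side, the missing
normalisation in res_details would be an annoyance rather than a
blocker.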


Any opinions you have on that are welcome.

Me, I'm leaning fairly strongly towards (b), quite a bit more strongly
than before I started with this mail.  The main reason is the case of
multiple auxiliaries.  I already have that -- for instance,
http://dc.g-vo.org/oai.xml?verb=GetRecord&metadataPrefix=ivo_vor&identifier=ivo://org.gavo.dc/maidanak/res/rawframes/rawframes
has both an aux SIA and TAP capability.  There's no sane way a client
can pick the right service from served-by.
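
To make the ambiguity concrete, here is a small Python sketch of a
record with two auxiliary capabilities, resolved once via served-by
(option a) and once via a per-capability mainService (option b).  The
class and function names are hypothetical, and the two main-service
ivoids under example.org are placeholders, not the real values from
the record above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AuxCapability:
    protocol: str                       # e.g. "TAP" or "SIA"
    main_service: Optional[str] = None  # hypothetical mainService, option (b)


@dataclass
class Record:
    ivoid: str
    served_by: List[str] = field(default_factory=list)  # option (a)
    capabilities: List[AuxCapability] = field(default_factory=list)


def candidates_via_served_by(record: Record, cap: AuxCapability) -> List[str]:
    """Option (a): relationships hang off the resource, not the
    capability, so the capability cannot narrow down the candidates."""
    return list(record.served_by)


def main_via_attribute(cap: AuxCapability) -> Optional[str]:
    """Option (b): each capability names its own main service."""
    return cap.main_service


tap_aux = AuxCapability("TAP", main_service="ivo://example.org/tap")
sia_aux = AuxCapability("SIA", main_service="ivo://example.org/sia")
rawframes = Record(
    ivoid="ivo://org.gavo.dc/maidanak/res/rawframes/rawframes",
    served_by=["ivo://example.org/tap", "ivo://example.org/sia"],
    capabilities=[tap_aux, sia_aux],
)

# Option (a) yields both services for the TAP capability; option (b)
# yields exactly one.
print(candidates_via_served_by(rawframes, tap_aux))
print(main_via_attribute(tap_aux))
```

Under (a), a client would have to fetch each candidate and guess from
its capabilities which one matches; under (b) there is nothing to
guess.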

And that case will become more important as we'll have, at least
temporarily, multiple versions of endpoints, TAP 1.0 and 1.1, say, or
SIA 1.0 and 2.0, you name it.

Cheers,

          Markus


