param-link-to-field (was: Datalink data access services)

Fri Feb 14 02:28:04 PST 2014

Hi Pat, hi DAL,

On Tue, Feb 11, 2014 at 11:10:06AM -0800, Patrick Dowler wrote:
[reordered, abridged]
>> On 12/12/13 05:04 AM, Markus Demleitner wrote:
> >     <FIELD name="obs_publisher_did" ID="datalinkID"
> >       utype="obscore:Curation.PublisherDID"
> >
> >     <PARAM name="ID" arraysize="*" datatype="char" ucd="meta.id;meta.main"
> >       value="">
> >        <DESCRIPTION>The publisher DID of the dataset of interest</DESCRIPTION>
> >        <LINK content-role="ddl:id-source" value="#datalinkID"/>
> >     </PARAM>
> >
> Is the ref in FIELDref explicitly defined to be an XML ID value in
> the document? Is a LINK value with a fragment explicitly an XML ID
> value? Or are one or both just conventions we are adopting?

The VOTable spec is not clear on what FIELDref's ref actually is. The
(non-normative) Schema says it's of type xs:IDREF; funnily enough,
xmlschema-2 then says on this that the "value space of IDREF is the
set of all strings that match the NCName production" -- i.e., schema
doesn't require referential integrity here.  This may be for the
better, as I've never quite wrapped my head around the suggested
interaction between XML ids and VOTable IDs.  Anyway, it's clear that
the intention is that it's a reference to *some* id within the
document.

For LINK, ahem, I just realized a bug in my proposal.  The #datalinkID
should fairly certainly be in LINK's href attribute rather than in its
value attribute.

With that, the VOTable XML schema would want us to have an xs:anyURI,
which "can be absolute or relative" (xmlschema-2).  I suppose we're
fine there with our fragment identifier (I've not dug down to the
actual schema definition, so I'm not 100% sure at this point).  The
question of what makes up a fragment in a concrete XML or SGML
instance is IIRC up to what's called the application in these circles
(HTML or VOTable or DocBook or what have you); we're free to say that
for VOTable, it's defined by ID, but since the VOTable REC doesn't say
anything about that, I suppose this counts as a convention at this
point.
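
As a rough sketch of that convention (fragment in LINK's href names a
VOTable ID), a client could resolve the link roughly as below.  The
sample document is made up for illustration and leaves out the VOTable
namespace for brevity; resolve_id_source is a hypothetical helper, not
part of any existing library.

```python
import xml.etree.ElementTree as ET

# Made-up, namespace-free sample using href (not value) on LINK.
VOTABLE = """<VOTABLE>
  <RESOURCE>
    <TABLE>
      <FIELD name="obs_publisher_did" ID="datalinkID"
        datatype="char" arraysize="*"
        utype="obscore:Curation.PublisherDID"/>
    </TABLE>
  </RESOURCE>
  <RESOURCE type="service">
    <PARAM name="ID" arraysize="*" datatype="char"
      ucd="meta.id;meta.main" value="">
      <LINK content-role="ddl:id-source" href="#datalinkID"/>
    </PARAM>
  </RESOURCE>
</VOTABLE>"""


def resolve_id_source(root):
    """Return the element the ddl:id-source LINK points at, or None."""
    # Index every element carrying a VOTable ID attribute.
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    for link in root.iter("LINK"):
        if link.get("content-role") != "ddl:id-source":
            continue
        href = link.get("href", "")
        # Convention assumed here: "#name" refers to ID="name".
        if href.startswith("#"):
            return by_id.get(href[1:])
    return None


field = resolve_id_source(ET.fromstring(VOTABLE))
print(field.get("name"))
```

Note that nothing here relies on schema-level referential integrity;
a dangling fragment simply yields None.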

> Is there any particular significance *right now* to the
> content-role="ddl:id-source"?  Is this to allow for other LINK(s) in
> there that are not so marked? Is this to enable some semantic magic?

It is, indeed, to be explicit about this link's role; I do not think
it would be wise to clobber the entire LINK element for children of
this particular parameter.

Clients might use it to discover the parameter in which to pass the
ID, but I am convinced we should recommend that they just assume the
parameter called ID is just that, and that they refuse to operate a
datalink service where no such parameter is present (dream: compare
names case-sensitively -- aahhhh).
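
A minimal sketch of that client behaviour, assuming a namespace-free
service RESOURCE; the accessURL PARAM, the inputParams GROUP and the
example.org URL are placeholders for illustration, and find_id_param
is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up service declaration; names other than "ID" are placeholders.
SERVICE = """<RESOURCE type="service">
  <PARAM name="accessURL" datatype="char" arraysize="*"
    value="http://example.org/dlget"/>
  <GROUP name="inputParams">
    <PARAM name="ID" datatype="char" arraysize="*"
      ucd="meta.id;meta.main" value=""/>
  </GROUP>
</RESOURCE>"""


def find_id_param(service_resource):
    """Return the PARAM named exactly "ID"; refuse the service otherwise."""
    for param in service_resource.iter("PARAM"):
        # Case-sensitive comparison: "id" or "Id" do not count.
        if param.get("name") == "ID":
            return param
    raise ValueError("no PARAM named ID; not operating this service")


id_param = find_id_param(ET.fromstring(SERVICE))
print(id_param.get("ucd"))
```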

> We are talking about a set of PARAM elements inside a RESOURCE with
> type="service" ... are there any other types of PARAMs than input?

The access URL, for one, would come in a param in my scheme.
Possibly other stuff that operators want, maybe for custom services.

For services having input parameters that happen to be STC roles
("minimal RA", say), there'd be PARAMs within the STC declarations,
too, if (bambi eyes) we agree to warmly recommend to helpfully
declare STC metadata.  Of course, these are nicely stowed away within
a group, too, but I take that as an indication that GROUPing things
that are related will help in the evolution of the standard.

> Are we intending to allow someone to describe the output parameters
> of the service as well? If not, do we really need the GROUP of input
> params?

On the need: after what I've said above, I'd say yes.  Output
parameters: You know, we're experimenting here with making Splat a
datalink client, and the fact that, with such a service, the client
only has a fairly hazy notion of what it is it'll get back from a
service is ugly.  So, I'd be all for a PARAM giving a list of MIME
types that can come out of a service, maybe, for the sake of
simplicity (web things probably know how to parse this), in the form
used in HTTP content negotiation.  But maybe that's becoming too
complex.
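
To give an idea of what clients would have to do with such a list, here
is a sketch that parses a value written like an HTTP Accept header.
The value syntax for such a PARAM is purely an assumption (nothing of
the sort is specified anywhere), and parse_mime_list is a hypothetical
helper that ignores media type parameters other than q.

```python
def parse_mime_list(value):
    """Parse "type/subtype;q=0.8, ..." into (mime, q) pairs, best first."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # as in HTTP, a missing q means full preference
        for p in parts[1:]:
            if p.startswith("q="):
                q = float(p[2:])
        result.append((parts[0], q))
    # sorted() is stable, so equal-q entries keep their declared order.
    return sorted(result, key=lambda pair: -pair[1])


print(parse_mime_list("image/png;q=0.5, application/fits"))
```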

But the bottom line: Yes, I believe at some point we may want
additional service metadata, and some of that may most conveniently
go into PARAMs.  I'm not quite as sure if I like the notion of
"output params" in this context, but maybe I just don't see an
otherwise obvious use case here.

My take on this is we need the client writer's input on this; the
fact that it's far harder to be sure what's coming back is IMHO one
of the two main differences between the old (and I contend somewhat
proven) getData hack and the Datalink server-side processing
proposal, so it's entirely possible more thought is needed here.

Cheers,

          Markus