VO-DML registration

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Jan 23 11:39:43 CET 2017


Hi Omar,

[restricting to DM to reduce cross-posting; I've said as much over on
Registry]

On Fri, Jan 20, 2017 at 10:15:51AM -0500, Laurino, Omar wrote:
> well. It is not clear to me if your proposal implicitly allows this or not.

Ok, attached I'm amending Friday's proposal so it explicitly allows
this and makes some extra provisions for that.

> This is relevant for, but not limited to, this use case (Let's call it UC1):
> 
> > * Ascertain that a chosen prefix is not being used by another data model
> 
> 
> More generally, how would I search for all registered data models so to
> ascertain that a prefix has not been taken yet?

Oh -- I just noticed that the careless author of RegTAP has decreed:

  Some complex metadata -- tr:languageFeature or vstd:key being
  examples -- cannot be kept in this [rr.res_detail] table. If a
  representation of such information in the relational registry is
  required, this standard will need to be changed.  [PDF p. 28]

Well, with VOResource 1.1 I'll touch RegTAP anyway, and the present
proposal would be a clear discovery case for vstd:key, so we'd just
be defining an additional table:

rr.key: ivoid, name, description

Then, it's just a matter of

SELECT
  ivoid, description
FROM
  rr.key
WHERE
  name='vodml-prefix'
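
To make the prefix check concrete, here is a minimal sketch in Python
that uses an in-memory SQLite database as a stand-in for a RegTAP
service.  Note the assumptions: rr.key does not exist in RegTAP yet,
its layout just follows the three columns proposed above, the sample
rows are made up, and the helper names make_registry and prefix_owners
are purely illustrative.

```python
import sqlite3


def make_registry():
    """Build an in-memory stand-in for the proposed rr.key table."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database called "rr" so the
    # schema-qualified table name looks like it would in RegTAP.
    conn.execute("ATTACH DATABASE ':memory:' AS rr")
    # "key" is quoted because it is an SQL keyword.
    conn.execute(
        'CREATE TABLE rr."key" (ivoid TEXT, name TEXT, description TEXT)'
    )
    conn.executemany(
        'INSERT INTO rr."key" VALUES (?, ?, ?)',
        [
            # Invented sample rows, not actual Registry content.
            ("ivo://gavo/std/example", "vodml-prefix", "foo"),
            ("ivo://gavo/std/example", "vodml-dmuri",
             "http://www.g-vo.org/xml/foo-1"),
        ],
    )
    return conn


def prefix_owners(conn, prefix):
    """Return the ivoids of all records that declare the given prefix."""
    rows = conn.execute(
        'SELECT ivoid FROM rr."key" WHERE name = ? AND description = ?',
        ("vodml-prefix", prefix),
    ).fetchall()
    return [row[0] for row in rows]
```

An empty result from prefix_owners(conn, "bar") would mean the prefix
is still free; a non-empty one tells you which record has claimed it.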

I'm not altogether happy with (ab-) using "description" for what's
obviously values, but as long as that's good enough[TM] I'd consider
that preferable to adding standards clutter.

The alternative would be to define an actual resource record type for
data models.  Doable, but then we'd have to start collecting use
cases fast.  Just doing it because we don't like "description"
doesn't seem proportional to me.

> The current approach is that DMs should be versioned so that backward
> compatible fixes and updates result in a new minor version, without a
> change of prefix, while major changes should result in a new major version
> and a new prefix.
> 
> Matched against your proposal this means that for each version (minor or
> major) one should create a new URI and add a registry entry, right? There

I am assuming that there is always a bijective mapping between
standard, prefix, and URI.  The VO-DML file itself might still
change, just as envisioned by Paul's schema versioning note (which is
in RFC right now -- check it out if you haven't yet:

http://wiki.ivoa.net/twiki/bin/view/IVOA/XMLVersRFC )

I believe the considerations made there largely apply to VO-DML, too.
This, in particular, means that DM URIs may not have minor versions
in them, at least if they have some operational meaning to clients
(which I'm not certain of).  Otherwise, if you change the DM URI,
you'd be breaking a client, which you're not allowed to do in a minor
update.

> is some ambiguity here, because the proposal talks about data models, but
> then in the example and in this sentence:
> 
> > It does not allow the automatic resolution between prefix and DM URI
> > from Registry data when a standard defines multiple data models.
> 
> 
> it looks like an entry corresponds to a standard document, which in turn
> might define multiple DMs. In that case you would have to define multiple
> prefixes and multiple URIs for the same entry, and the entry would fall
> short on both the main use cases. To me, unless I am missing something in

No -- you just have multiple prefix and uri definitions.  The RegTAP
table given above will then look like this:

ivo://gavo/std/example  vodml-prefix foo
ivo://gavo/std/example  vodml-dmuri http://www.g-vo.org/xml/foo-1
ivo://gavo/std/example  vodml-prefix bar
ivo://gavo/std/example  vodml-dmuri http://www.g-vo.org/xml/bar-1

As long as you don't need to work out which URI belongs to which
prefix, all is fine.  Going from URI to prefix, incidentally, should
be simple because, IIRC, dmuri should resolve to the document.

Going the other way looks hard.  If you absolutely need to do this
using the Registry, we need to brainstorm.
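
To illustrate why that direction is hard, here is a small Python
sketch working on the four rows shown above.  The function name
candidate_uris is hypothetical, and the rows are just the example
values from this mail, not real Registry content.

```python
from collections import defaultdict

# The example rows from above: (ivoid, key name, key description).
ROWS = [
    ("ivo://gavo/std/example", "vodml-prefix", "foo"),
    ("ivo://gavo/std/example", "vodml-dmuri", "http://www.g-vo.org/xml/foo-1"),
    ("ivo://gavo/std/example", "vodml-prefix", "bar"),
    ("ivo://gavo/std/example", "vodml-dmuri", "http://www.g-vo.org/xml/bar-1"),
]


def candidate_uris(rows, prefix):
    """Return every DM URI declared in a record that also declares prefix.

    The rows carry no link between an individual prefix and an
    individual URI, so all URIs of a matching record are candidates.
    """
    by_record = defaultdict(
        lambda: {"vodml-prefix": set(), "vodml-dmuri": set()})
    for ivoid, name, description in rows:
        if name in ("vodml-prefix", "vodml-dmuri"):
            by_record[ivoid][name].add(description)

    uris = set()
    for keys in by_record.values():
        if prefix in keys["vodml-prefix"]:
            uris |= keys["vodml-dmuri"]
    return sorted(uris)
```

With a single data model per record, candidate_uris returns exactly
one URI.  With the two-model record above, asking for "foo" returns
both URIs, and nothing in the table says which one is meant.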

> your proposal, it would seem more natural and useful to have a clear
> mechanism for mapping URIs to VODML descriptors in a 1-to-1 fashion in the
> registry. I don't know enough about the registry to formalize this in a

-- this would in effect require something like an extra table for
vodml registration in RegTAP.  If we go that way, I'd like to have an
extensible mechanism that would work for other record types facing
similar problems; actually, StandardKeys is such a similar problem,
so we even have something additional to try such a solution.

> Somewhat related to this, we should probably add a URI to the Model type in
> VODML. A VODML description document should be aware of its URI, analogously
> to what happens with XML schemata and namespaces. So, the URI should be

Yes!  Yes!  I think I've said as much in one of my RFC comments.

Actually, now that I think of it, the other thing that needs to go in
there to make it useful is the base URI of the HTML documentation
mentioned.  Again, the registry can't easily do that for you (as that
would require a sequence of complex types, which is always a
pain^Wextra table in a 1NF relational database), but in the VO-DML
file it's natural.

> To summarize, in my opinion we need to make sure that:
>   * models are searchable in the registry, so that UC1 is fulfilled.

True with the current proposal, at least with the RegTAP extension
mentioned above, which could have a reference implementation almost
immediately, and could be in REC by this time next year.

>   * URIs can be resolved to the VODML/XML descriptors, possibly for all
> versions and certainly for all major versions, so that UC2 is fulfilled,
> given the current agreement on prefixes and versions.

What you're saying is that something like
foo:something.else.altogether should be resolvable to
http://www.g-vo.org/xml/foo-1#altogether (or similar) by Registry
means alone?  That doesn't work in the current design.  I'd say
there's little need for that, however, given that such a VO-DML
reference should usually come from a serialised document that has a
Model declaration, mapping prefix to URI anyway.
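
As a rough sketch of what I mean, a client holding the Model
declarations of a serialised document could resolve references
locally, without asking the Registry at all.  The function name
resolve_vodml_ref and the dict-based representation of the
declarations are my assumptions here, as is using the full vodml-id as
the fragment (which is what the attached text suggests; the final rule
is for the VO-DML authors to fix).

```python
def resolve_vodml_ref(ref, models):
    """Turn a prefixed VO-DML reference into a URL with a fragment.

    ref    -- a string like "foo:something.else.altogether"
    models -- dict mapping prefixes to DM URIs, as taken from the
              Model declarations of the serialised document
    """
    prefix, sep, vodml_id = ref.partition(":")
    if not sep or not prefix or not vodml_id:
        raise ValueError(f"not a prefixed VO-DML reference: {ref!r}")
    if prefix not in models:
        raise KeyError(f"no Model declaration for prefix {prefix!r}")
    # Assumption: the element's vodml-id is used as the fragment.
    return f"{models[prefix]}#{vodml_id}"
```

For instance, with models = {"foo": "http://www.g-vo.org/xml/foo-1"},
the reference foo:something.else.altogether resolves to
http://www.g-vo.org/xml/foo-1#something.else.altogether.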

>   * it should be possible to register custom models and extensions.

Check.

>   * we need URIs to be part of the descriptors, by adding an identifier
> field to the Model type.

Agreed, but not a Registry matter.

> I claim some significant ignorance when it comes to the registry standards.
> As far as I can tell you can probably do all of the above without requiring
> a custom extension. The question to me is more whether it's more convenient

No, not as far as going from prefix to URI is concerned unless I'm
missing some nifty hack.  Perhaps one could shoehorn some existing
elements (relationship comes to mind) to somehow hack things, but I
guess you won't get around an extension if this is something you want
to solve using the Registry.

The quickest way to figure this out for database-savvy users is to go
to a TAP service with the rr schema (e.g., http://dc.g-vo.org/tap)
with your favourite TAP client and browse that schema.  If you find
an appropriate table, the rest should be fairly straightforward.


> to introduce a simple extension rather than relying on the standard
> Resource and patch a solution together, e.g., to reduce potential clutter in
> the registry. One might try and decouple the registration from the VODML
> standard per se, but I believe the way URIs are defined could depend on the
> selected approach (e.g. a possible approach, not necessarily one I would
> advocate for, might be to assign a unique URI and registry entry to a
> major version of a DM, while minor versions might be identified by
> fragments inside that document...).

Well, ideally the URI conventions/rules (in particular as regards the
relationship of the DM URI to a URI to pull the VO-DML from, and the
URI the HTML is under, and then the question what happens when new
minor versions come out) should be fixed by DM/the VO-DML authors
based on what makes sense for them.  Registry will then have to cope
with what you need (from Registry, which of course can't solve all of
these difficult problems; see, again, Paul's XML schema versioning
Note).

If what's attached would already work, that'd of course be great.

        -- Markus
-------------- next part --------------
I propose to replace the current section 5.1 with:


-----------------------8<------------------------------
5.1 IVOA-standardized data models

A data model specified in VO-DML can be endorsed by the IVOA to become a
standard data model.  A standard data model consists of at least three
artifacts:

* A standard text adopted by the IVOA according to the rules laid down
  in DocStd \citep{2010ivoa.spec.0413H}.  This must at least discuss 
  use cases and the general design of the model.  The level of detail to
  which individual data model items are discussed in the standard text
  is up to the authors.  The authoritative source on these details
  always is the VO-DML source.
* The data model itself, written in VO-DML.
* Detailed documentation in HTML format containing human-readable
  definitions for all elements of the data model, furnished with
  HTML-accessible anchors (a/@name or @id attributes) for
  the vodml-refs contained in the data model.  The intent is that the
  data model URL with an element's vodml-id as a fragment identifier
  will lead to element-specific documentation.  It is recommended to
  generate this HTML document from the VO-DML using the vo-dml2html.xsl
  script available from the IVOA document repository.

When a standard data model reaches the status of Proposed Recommendation, 
the VO-DML document and the HTML are made available in the IVOA document
repository at their final URLs.  They may, however, be modified there
without further notice until the document reaches REC status.

At the same time, a StandardsRegExt \citep{2012ivoa.spec.0508H} document
for the standard is uploaded to (or, for updated models, updated in)
the Registry.

In addition to the usual StandardsRegExt metadata, registry records for
standard data models define namespaces and prefixes for all data models
defined by the standard in StandardKey elements.  For this purpose, this
standard defines two key names:

* vodml-prefix -- the description child of the key contains a prefix
  defined by the model.
* vodml-dmuri -- the description child of the key contains a data model
  URI defined by the model.

Note that this allows Registry clients to support use cases like:

* Locate the specification for a data model based on either prefix or
  URI
* Ascertain that a chosen prefix is not being used by another data model

It does not allow the automatic resolution between prefix and DM URI
from Registry data when a standard defines multiple data models.  That
use case does not appear significant enough to warrant an extension of
the existing registry infrastructure.

A sample record for the present standard (which does not necessarily
match what the Registry actually contains at any point) is given in
appendix
[http://volute.g-vo.org/svn/trunk/projects/registry/StandardsRegExt/samples/VODML.xml].

5.2 Other registered data models

Data providers can register their application-specific data models
without going through IVOA specification.  In that case, only two
artifacts have to be publicly available, preferably on a web server
under the provider's control:

* The data model itself, written in VO-DML.
* Detailed documentation in HTML format containing human-readable
  definitions for all elements of the data model, furnished with
  HTML-accessible anchors (a/@name or @id attributes) for
  the vodml-refs contained in the data model.  The intent is that the
  data model URL with an element's vodml-id as a fragment identifier
  will lead to element-specific documentation.  It is recommended to
  generate this HTML document from the VO-DML using the vo-dml2html.xsl
  script available from the IVOA document repository.

As in 5.1, the party defining the data model constructs a
StandardsRegExt registry record defining vodml-prefix and vodml-dmuri
keys; in case the custom data model really is not accompanied by an IVOA
note (which we discourage), use \code{n/a} as
\xmlel{endorsedVersion/@status}.  This record can be uploaded to any
publishing registry \emph{except} the registry of registries.


------------------8<--------------------------------
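
For what it's worth, here is a rough Python sketch of how key
declarations as described in the text above could be turned into
(ivoid, name, description) rows.  The XML below is a heavily
simplified, hypothetical stand-in for a real registry record:
namespaces and all other metadata are left out, and the function name
key_rows is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical record; real records carry namespaces
# and much more metadata.
SAMPLE = """
<resource>
  <identifier>ivo://gavo/std/example</identifier>
  <key>
    <name>vodml-prefix</name>
    <description>foo</description>
  </key>
  <key>
    <name>vodml-dmuri</name>
    <description>http://www.g-vo.org/xml/foo-1</description>
  </key>
</resource>
"""


def key_rows(record_xml):
    """Extract (ivoid, name, description) tuples from key elements."""
    root = ET.fromstring(record_xml)
    ivoid = root.findtext("identifier")
    return [
        (ivoid, key.findtext("name"), key.findtext("description"))
        for key in root.iter("key")
    ]
```

Calling key_rows(SAMPLE) yields one row per key element, in document
order, which is the shape a table with ivoid, name, and description
columns would need.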

