[utypes] Versioning, open issues

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Feb 8 09:41:10 PST 2012


Hi Omar,

On Wed, Feb 08, 2012 at 11:09:50AM -0500, Omar Laurino wrote:
> I am wondering whether you received the email in which I presented the
> draft for use cases, requirements and offered an example of how my
Uh, I have to admit I just discovered
http://www.ivoa.net/pipermail/utypes/2011-December/000019.html and
suddenly understand much better what Mireille didn't like.

I have to say I largely agree with her.  In particular, I don't think
we want versions in the utype strings, neither before nor after the
colon.

As to use cases, I'd like to see more trivial ones added, too; most
importantly, I'd like to see basically this:

"""
Applications exchange metadata on positions (reference system,
reference position, epoch and equinox, etc) or similarly structured
physical quantities (magnitudes, radial velocities, etc) and can
reconstruct complete instances of classes defined in the data model
from VOTables.  Applications that do not understand a data model can
still easily preserve such data and metadata.
"""

The latter point might better be put down as a requirement.

Is your work in the SVN?  Should I add stuff like this there?

> In any case, let's call "The Thing" the stuff in front of the colon. My
> question was whether The Thing should be universally defined for all the
> instance documents or dynamically defined in the instance itself.
Universally defined.  Otherwise utypes aren't opaque strings, which I
think would be bad (that property, and [from my view, unfortunately]
case-insensitivity, have been "requirements" for utypes for a long
time; I don't think we can drop them now, even if we wanted).

>> So, we should *not* have versions in the data model
>> prefix.
> 
> 
> Having one string in one place ("dm=char-1.0") is equivalent to having two
> strings in two places ("dmname=char", "dmversion=1.0") and it requires the

Not exactly.  Having it in two places makes it easier to ignore it
when you don't care.  Since I still maintain that most clients will not
and should not care about the data model version, this makes a big
difference.
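
To make the "easier to ignore" point concrete, here is a minimal
sketch in Python.  The function names and the "-" separator rule are
assumptions made up for illustration; nothing here is taken from a
draft or standard.  With a combined string, even a client that does
not care about the version has to know how to cut it off; with two
separate items it simply never looks at the second one.

```python
from typing import Dict


def model_name_from_combined(dm: str) -> str:
    """Extract the model name from a combined string such as 'char-1.0'.

    Assumption: the version is whatever follows the last '-'.  The client
    has to know this convention even if it never uses the version.
    """
    name, _sep, _version = dm.rpartition("-")
    # rpartition returns an empty head when there is no '-', so fall back
    # to the whole string in that case.
    return name or dm


def model_name_from_separate(attrs: Dict[str, str]) -> str:
    """Read the model name when name and version are separate items.

    The version item ('dmversion') is simply never touched.
    """
    return attrs["dmname"]


combined = "char-1.0"
separate = {"dmname": "char", "dmversion": "1.0"}

print(model_name_from_combined(combined))
print(model_name_from_separate(separate))
```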

Example: When I interpret an ObsTAP result, my client may only need
to know the footprint.  So, it looks for a column with the utype

obscore:char.spatialaxis.coverage.support.area

and is done.  The string can be hardcoded, but the client will still
do about the right thing even as obscore evolves.  Pulling a number
from thin air, I'd say >95% of the clients would be fine with such a
scheme (plus maybe an optional version check as discussed below), and
we'd not need to worry too much about amending data models.
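
A version-unaware client of this kind could look roughly like the
following sketch.  It uses only the Python standard library; the
sample VOTable, the second utype in it, and the helper names are made
up for illustration, and the utype is compared as a whole,
case-insensitively, without being parsed.

```python
import xml.etree.ElementTree as ET
from typing import List

# The hardcoded utype the client cares about; no version anywhere.
FOOTPRINT_UTYPE = "obscore:char.spatialaxis.coverage.support.area"


def local_name(tag: str) -> str:
    """Reduce '{namespace}FIELD' to 'FIELD' so any VOTable namespace works."""
    return tag.rsplit("}", 1)[-1]


def find_field_names(votable_xml: str, utype: str) -> List[str]:
    """Return the names of all FIELDs whose utype equals `utype`, ignoring case."""
    wanted = utype.lower()
    root = ET.fromstring(votable_xml)
    return [
        el.get("name", "")
        for el in root.iter()
        if local_name(el.tag) == "FIELD"
        and (el.get("utype") or "").lower() == wanted
    ]


# Illustrative document only, not a real service response.
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
  <FIELD name="s_region" datatype="char" arraysize="*"
         utype="obscore:Char.SpatialAxis.Coverage.Support.Area"/>
  <FIELD name="other" datatype="double" utype="obscore:some.other.item"/>
</TABLE></RESOURCE></VOTABLE>"""

print(find_field_names(SAMPLE, FOOTPRINT_UTYPE))
```

As long as the utype string itself stays stable, nothing in this code
has to change when the data model is amended.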

I don't *quite* buy that the likelihood of silent failures on data
model changes with non-version-aware clients is noticeable, in
particular not compared to the real-world glitches we'd still have
even in a best-of-all-worlds VO.

> when parsing the file. So I am happy with any solution that lets the client
> know these two pieces of information. In either case, my question stands:
> is The Thing universally defined (if yes, where? Is there a list of IVOA
> Things somewhere?)
There's no such list.  It would be nice to have one, though, and I
could well imagine keeping a list of ivoa-sanctioned data model names
in the registry.  If we agree to have one, it would be easy to
maintain it as a StandardKeyEnumeration in the utype standard's
registry record.

> > name DataModel in each data model; in there there's an item URI
> > giving the data model URI (i.e., you can figure out the data model
> > URI by checking the value of the <modname>:DataModel.URI utype.
> >
> >
> You really want to require a 1-client-2-servers-2-stages operation for
> getting the data model version? (ask for resolution of the URI, then
No -- for figuring out the data model version you don't need any
network access at all.  You can simply compare the mod:DataModel.URI
values, and that's it.
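
As a sketch of that offline check: given the utype-value pairs found
in some utype context, a client that does care can pick out the
<modname>:DataModel.URI item and compare its value against the URIs it
was written for.  The URIs below are placeholders, the function names
are hypothetical, and plain string equality on the URI is an
assumption.

```python
from typing import Dict, Iterable, Optional


def data_model_uri(pairs: Dict[str, str], prefix: str) -> Optional[str]:
    """Return the value of '<prefix>:DataModel.URI' from utype-value pairs.

    The utype is matched case-insensitively; None means the item is absent.
    """
    wanted = f"{prefix}:DataModel.URI".lower()
    for utype, value in pairs.items():
        if utype.lower() == wanted:
            return value
    return None


def is_supported_model(
    pairs: Dict[str, str], prefix: str, supported_uris: Iterable[str]
) -> bool:
    """True if the declared data model URI is one the client knows about.

    No network access is involved; this is a local string comparison.
    """
    uri = data_model_uri(pairs, prefix)
    return uri is not None and uri in set(supported_uris)


# Placeholder URIs, invented for this example.
SUPPORTED = ["ivo://example.invalid/dm/char/v1.0"]
pairs = {
    "char:DataModel.URI": "ivo://example.invalid/dm/char/v1.0",
    "char:spatialaxis.coverage.support.area": "s_region",
}

print(is_supported_model(pairs, "char", SUPPORTED))
```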

I'd not be opposed to adding a mod:DataModel.version item if you want
easier access to the version.

Operations involving the registry would be exclusively for "Learn
more about this data model"-type operations directed at the user.
Requiring network access in such situations is IMHO reasonable.

Of course, you could add as much information under DataModel as you like,
except I think it should be terse since this information *should* be
repeated in any "utype context", which would include every group
using utypes...

> both options are functionally equivalent. I guess in the end it is a
> trade-off between self-consistency and compactness.
...as you rightly said.

> You didn't comment much on this part, so I guess we can happily iron out
> the non-functional details and get the document done.
Well, for me the mail body
http://www.ivoa.net/pipermail/utypes/2011-December/000019.html mainly
is about a data modelling process that integrates utypes -- right?
As you say, it largely wouldn't be part of the utypes REC but rather be
referred to by it.

As to utypes, I don't have much of an idea how far along we are in
terms of reaching a consensus.  The fields I'd most like to hear
people's opinions on are:

* Generation of utypes (how do we get from some DM to utypes?)
* Versioning (how do clients that care figure out the DM?)
* Typedness (is it ok to only talk about strings in the utype REC?)
* Serialization of structures of utype-value pairs in VOTable
  (e.g., would it be cool to push GROUPs as a means for embedding?)
* Any other serializations (e.g., do we want to recommend a means for
  binding together utypes and FITS header keys?)


Cheers,

        Markus


