RegTAP 1.1 PR

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Sep 5 08:48:05 CEST 2018


Hi Mark,

On Tue, Sep 04, 2018 at 03:51:02PM +0100, Mark Taylor wrote:
> On Wed, 1 Aug 2018, Markus Demleitner wrote:
> 
> > I've uploaded a Proposed Recommendation for RegTAP 1.1 to the
> > document repository.  Here's the changelog since WD-20171206:
> 
> I have read through this (PR-RegTAP-1.1-20180731).  The document
> is generally in good shape, but I found a few issues of varying
> degrees of triviality.  Some have been inherited from RegTAP 1.0.

Thanks for reviewing.

> Sec 8.1:
>    "... where the canonical prefixes are used."
>    It might increase readability to add a reference to Section 5 here.

Added one.

> Sec 8.2:
>    The text states that the base_role column can assume one of the
>    values "contact, publisher, contributor or creator", but the
>    tabulated description of this item mentions only "contact, publisher
>    and creator".

Fixed, thanks.

> Sec 8.7:
>    The tabulated descriptions of the "ucd" and "std" items say
>    "parameter" when they should say "column".

Fixed.

>    "The following columns MUST be lowercased during ingestion: 
>     ivoid, name, ucd, utype, datatype, type_system."
>    - should extended_type be added to this list?

Uh.  Skeleton in the closet.  Several people (including everyone
running DaCHS) have used extendedType to store VOTable xtypes, but as
far as I know[1], nobody bothered to declare the extendedSchema that
should go with this.

Here's what VODataService has to say on the pair:

  More descriptive information about the type can be provided via
  extendedType and extendedSchema, which provide an alternate data
  type name. It is expected that this name will only be understood by
  a special subset of applications. The name given in the element
  content, then, represents a more commonly understood "fall-back"
  type.

Hm.  Does anyone remember what the original plan was, in
particular regarding extendedSchema?

As for xtypes, a quick grep in DALI (where I think we should define
all our xtype values) doesn't yield anything about the case
sensitivity of xtypes, and in VOTable 1.3 I can't see anything
normative either.

This would lead me to assume that VOTable xtype is case-sensitive
(not that we should exploit that).  That alone makes me doubt
case-folding here should be done lightly.

I've added a todo.  In the context of VODataService 1.2, we'll have
another look at extendedType; let's hope we'll see more clearly then.
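
To make the concern concrete, here is a minimal Python sketch of
ingestion-time case folding restricted to the columns in the list
quoted above, with extended_type deliberately left alone.  The function
name normalize_row and the sample values are made up for illustration;
this is not how any actual registry implements it.

```python
# Columns the quoted text says MUST be lowercased during ingestion.
# extended_type is deliberately NOT in this list.
LOWERCASED = ("ivoid", "name", "ucd", "utype", "datatype", "type_system")


def normalize_row(row):
    """Return a copy of row with the forced-lower columns lowercased.

    Hypothetical helper; missing or None values are left untouched.
    """
    result = dict(row)
    for col in LOWERCASED:
        if result.get(col) is not None:
            result[col] = result[col].lower()
    return result


# Made-up sample row; "Timestamp" stands in for some mixed-case xtype.
row = {"name": "RAJ2000", "ucd": "POS.EQ.RA", "extended_type": "Timestamp"}
print(normalize_row(row))
```

With this, name and ucd are folded while whatever case the xtype
arrived in is preserved.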

> Sec 8.12:
>    "The content of incoming date/@type attributes must be normalized 
>     according to the rules laid down in sect. 4.5 before lowercasing."
>    - should that read "date/@role"?

Of course.

> Sec 9:
>    The documentation of ivo_hasword and ivo_hashlist_has ought
>    (like ivo_nocasematch) to make explicit that the return value
>    is 0 in the case of no match.

It does now.

> Sec 10.3:
>    ivo_hashlist_has arguments are the wrong way round.
>    "1=ivo_hashlist_has('infrared', waveband)" should read
>    "1=ivo_hashlist_has(waveband, 'infrared')"

Ouch.
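
For illustration, a small Python model of what the argument order
means.  It assumes the semantics as I understand them from the RegTAP
text (first argument is the hash-separated list, second the word,
comparison is case-insensitive, result is 1 or 0); it is a sketch, not
the reference implementation, and the waveband value is made up.

```python
def ivo_hashlist_has(hashlist, item):
    """Model of the UDF: 1 if item is one of the '#'-separated words
    in hashlist (compared case-insensitively), 0 otherwise."""
    words = [w.lower() for w in hashlist.split("#")]
    return 1 if item.lower() in words else 0


waveband = "Optical#Infrared"  # made-up column value

# Correct order: list first, word second.
print(ivo_hashlist_has(waveband, "infrared"))  # 1

# Swapped order, as in the old example: never matches a multi-word list.
print(ivo_hashlist_has("infrared", waveband))  # 0
```

With the arguments swapped, the query would only match rows where
waveband consists of the single word being searched for.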

> Sec 10.12:
>    The description tails off in mid-sentence.

I've completed the sentence, but somehow feel I should add some
warning about being careful as to which side of the join one takes
metadata from.

> Appendix C:
>    This appendix looks like it's up for removal, so my comments 
>    are pretty ignorable.  However in case it stays: I think the 
>    recommended VARCHAR(4) type for interface.query_type is inadequate, 
>    since the content is a hash-separated list, so could be, 
>    e.g. "get#post" (or "get#post#head"??).

True.  But that's not easy to fix, and since nobody came forward with
a wish to keep it, let's pretend it's already gone.

> Appendix E.1:
>    "inline XSLT utype maker" -> "inline XSLT utype marker"?

No, this is referring to the XSLT that makes utypes from XSDs.

> I have one other suggestion on readability: the information from
> the "the following columns MUST be lowercased during ingestion"
> boilerplate in sections 8.* might be easier to digest if it
> appeared as some sort of flag in the tabulated field descriptions
> rather than in the text (e.g. marked "string/lc" rather than "string").

I see, and I kind of agree -- the trouble is that the tables are
generated from TAP_SCHEMA, and that doesn't have this information.

But now that I think of it, having the information on case folding
would be handy in TAP_SCHEMA anyway, so perhaps I should just add a
"(lowercased)" to all forced-lower columns?  Compact, reasonably
non-ugly, probably helpful.

It's not in yet, though -- perhaps someone has a better idea?
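
A rough Python sketch of what I have in mind, in case it helps the
discussion.  The function name annotate, the FORCED_LOWER set (taken
from the list quoted for Sec 8.7) and the sample description are
assumptions for illustration only; nothing like this is in the
document build yet.

```python
# Columns forced to lower case, per the list quoted for Sec 8.7.
FORCED_LOWER = {"ivoid", "name", "ucd", "utype", "datatype", "type_system"}

MARKER = "(lowercased)"


def annotate(column_name, description):
    """Append the marker to the description of a forced-lower column.

    Hypothetical helper; it is idempotent, so running it twice does
    not duplicate the marker.
    """
    if column_name in FORCED_LOWER and not description.endswith(MARKER):
        return description.rstrip() + " " + MARKER
    return description


# Made-up description text.
print(annotate("ucd", "UCD for the column."))
print(annotate("description", "Free text description."))
```

Since the marker would live in the TAP_SCHEMA description itself, the
generated tables in the document would pick it up without further
changes.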

> What I haven't done is upgrade Taplint to be RegTAP-1.1 sensitive.
> I probably should do that at some point.

I solemnly vow to not even think of going for RFC before you have...

The changes mentioned in the mail are in Volute rev. 5129.

        -- Markus


[1] That's based on a query

select distinct extended_schema from rr.table_column

-- which is empty.  People might still have extendedSchema in their
VOSI endpoints, though.

