datatype values in TAP_SCHEMA.columns

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Jun 4 23:53:37 PDT 2014


Dear DAL,

On adql:VARCHAR vs. VARCHAR in TAP_SCHEMA:

On Wed, Jun 04, 2014 at 03:54:25PM -0700, Patrick Dowler wrote:
> I take it that the "non-prefix" argument is that the last column
> "database column type" is specifying what goes in TAP_SCHEMA, but
> there is some evidence that that is not intended. First, I don't

My take here is that at best it's up to exegesis.  Which would mean:
right now, there is no *real* right or wrong, but I believe that on a
reading of the TAP text as it stands, most people would choose the
non-prefix version.

> intended to be used in TAP_SCHEMA. However, the large number of
> blanks (blank here to indicate that xtype is not needed for most
> datatypes in an uploaded table) is not supposed to imply that
> TAP_SCHEMA.columns.datatype is blank :-) I think the trouble came

There's been a lone adql:VARCHAR in that column at one point or
another in the document's history, as I discovered when, prompted by
Mark's question, I tried to figure out why I had the prefixes in.  I
do not know who put it in, who removed it, or why [hint: this is an
advertisement for using version control in standards development].
But that would support Pat's hypothesis.

> So that looks like a mess. The prefixes definitely go into the xtype
> attribute and the VOSI-tables output. Whether the spec says they do

Well, if it's a mess, let's clean it up.  For xtype, things are
clear.  But what do you mean by VOSI tables?  None of the strings
allowed in VODataService datatypes has a colon in it, neither in
TAPType nor in VOTableType.  Am I missing something?

As far as TAP_SCHEMA goes, I think we should be asking ourselves why
we expose types there.  And I think the answer has to be: To indicate
to clients what operations are supported on references to these
columns.  Given that TAP results come in VOTables (unless people void
their warranty by ordering something else), the types are definitely
not necessary to interpret the results.  I don't even think they are
terribly useful in data discovery.

This means: we don't need 100% precision in the type descriptions.
Thus, I submit we can allow some information loss in going from
actual database types to what's in TAP_SCHEMA -- in general, it's
basically ok if clients can figure out whether sin(col) will work
or whether col || 'foo' is likely to be a good idea.
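
To illustrate how little precision a client needs for that, here is a
rough Python sketch.  Everything in it is an assumption made for
illustration: the function name coarse_kind, the two type sets, and
the choice to tolerate both "adql:VARCHAR" and "VARCHAR" by simply
discarding anything before a colon.  It is not taken from the TAP
text or from any existing client.

```python
# Hypothetical sketch: coarse classification of the strings found in
# TAP_SCHEMA.columns.datatype.  The sets below are illustrative, not
# an authoritative list of allowed values.
NUMERIC = {"SMALLINT", "INTEGER", "BIGINT", "REAL", "DOUBLE"}
TEXT = {"CHAR", "VARCHAR", "CLOB"}


def coarse_kind(datatype):
    """Return 'numeric', 'text', or 'other' for a datatype string."""
    # Accept prefixed and unprefixed forms alike: keep only what
    # follows the last colon (the whole string if there is none).
    bare = datatype.strip().rpartition(":")[2].upper()
    # Ignore a length specification such as VARCHAR(32).
    bare = bare.split("(")[0].strip()
    if bare in NUMERIC:
        return "numeric"   # sin(col) should be fine
    if bare in TEXT:
        return "text"      # col || 'foo' should be fine
    return "other"         # anything else: leave it to the human
```

With this, coarse_kind("adql:VARCHAR") and coarse_kind("VARCHAR")
both yield "text", and something like "NETMASK" falls through to
"other" without causing trouble.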

Additionally, TAP_SCHEMA is (presumably) shared between different
languages on the same TAP server (I, for one, am tempted to allow in
subsets of postgres, e.g., for WITH).  It hence would seem unwise to
hard-code ADQL types in TAP_SCHEMA, in particular if they aren't
really ADQL types in the first place.  

Thus, I'd argue we should drop the prefixes and just use "generic"
SQL types (while not disallowing stuffing any old junk in there
("NETMASK", "BOOLEAN"), as currently consumers of this are humans
anyway).

That this plays right into my less-colons preference is, of course, a
matter completely unrelated to the argument...

Cheers,

         Markus


