[TAP] data type for column metadata

Gerard gerard.lemson at mpe.mpg.de
Wed Mar 25 03:16:23 PDT 2009


Hi Pat 

> DOUBLE - also an ADQL reserved word
> SMALLINT - also an ADQL reserved word
Yes, these I forgot to add in my mail. I had them in the table on the wiki.
> 
> BINARY - not an ADQL reserved word
> VARBINARY - not an ADQL reserved word
> 
I have no experience with these. 
(how) is varbinary different from BLOB?

> Plus we would have to add at least one construct for region, 
> maybe two:
> 
> POINT     aka coordinate system + coordinates
> REGION  aka coordinate system + some shape
> 

> I think we need both types because some functions (COORDSYS, 
> COORD1, COORD2) take only a POINT as an argument. The other 
> geometric functions are still useful if we consider them as 
> utility functions that create these two types of things.
> 
> For simplicity, we could make a shorter list and avoid some 
> of the intricacies (for the first version, anyway). From the 
> above types, I have made lots of use of the ones that map to 
> primitive types as well as VARCHAR, VARBINARY, and TIMESTAMP. 
> I have never found a good use for the separate TIME and DATE, 
dateOfBirth maybe, but could well use TIMESTAMP. 
> nor the arbitrary precision numeric values 
I have seen DECIMAL (same as NUMERIC I think) a lot in business applications
(amounts of money).
not since then.

> and rarely/never use fixed size CHAR. So my short-list would include:
> 
I think the treatment of CHAR and VARCHAR is somewhat different per
database.
I see CHAR usage especially when you have a list of fixed length enumerated
values,
especially CHAR(1). I seem to remember that oracle (or was it db2, don't
remember) suggested
that for length 1 strings one should use CHAR, otherwise always VARCHAR.
I have used CHAR a few times but keep having to deal with trailing spaces.

> BOOLEAN
> INTEGER
> BIGINT
> REAL
> DOUBLE
> TIMESTAMP
> VARCHAR
> VARBINARY
> POINT
> REGION
> 
I would suggest adding SMALLINT for sure.
Are we sure we don't want CLOB and BLOB?
I believe those (used to?) have different semantics from shorter (VAR)CHARs
where ordering is concerned.
I.e. I have seen "order by <somclob>" to be illegal in some databases.

Btw, I suppose that ADQL can make no statement about the data types of
mathematical expressions.
For example the result of multiplying an integer by a float, or dividing
integer by integer. 
Does SQL92 say anything about this? Is it TAP's  task to define this
semantics? 

E.g. my SQLServer2005 instance truncates int/int to an int, which forces me
to either
cast numerator and/or denominator to a float, or multiply one of them by 1.0
(if cast is not available).
Try "select 3/4" in http://www.g-vo.org/Millennium/MyDB. 
I have also had to cast integers to bigints when used in an aggregate SUM if
the total was exceeding the limits on INTEGER.

It may be necessary to address this in TAP.

Gerard






More information about the dal mailing list