Sexagesimal metadata

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Fri Mar 11 09:41:39 CET 2022


Hi,

On Thu, Mar 10, 2022 at 05:17:23PM +0000, Mark Taylor wrote:
> But one issue that remains is how to flag up sexagesimal quantities,
> which are currently marked up like this:
> 
>    <FIELD name="RAJ2000" ucd="pos.eq.ra;meta.main" ref="J2000" 
>           datatype="char" arraysize="12" unit="'h:m:s'">
>    <FIELD name="DEJ2000" ucd="pos.eq.dec;meta.main" ref="J2000" 
>           datatype="char" arraysize="13" unit="'d:m:s'">
> 
> The unit, by long VizieR tradition, is quoted as 'h:m:s' or 'd:m:s'
> (including the single quotes), which is probably recognised as an
> ad hoc indication by quite a bit of client code out there.  

I'll need to start with a disclaimer:  my personal opinion has always
been that sexagesimal positions should be considered part of
provenance and hence be squeezed out of operational use as much as we
can.  I'll spare you some inappropriate joke about Babylon (where
these things came from).

But well, they're there, far too many people love them, and it's
clear we have to do something with them.

*If* there's substantial client code evaluating unit strings as
above, then I think we have little choice but to introduce an extra
section into VOUnits ("Units on string values") where we document
(and perhaps deprecate?) the practice.  Since VOUnits is currently in
WD, this would actually be a good time for that.  Do we have
indications for who actually looks at these strings?  Is the practice
widespread enough to justify this uglification of the spec?
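
For illustration, client code that keys on these unit strings might look roughly like the following Python sketch. It is not taken from any actual client. The function name `sexagesimal_fields` and the lookup table are hypothetical, and the sketch assumes an un-namespaced VOTable fragment, whereas real VOTables normally carry an XML namespace.

```python
import xml.etree.ElementTree as ET

# The two VizieR-style unit strings quoted above, single quotes included.
SEXAGESIMAL_UNITS = {"'h:m:s'": "hours", "'d:m:s'": "degrees"}


def sexagesimal_fields(votable_fragment):
    """Map FIELD names to 'hours' or 'degrees' for sexagesimal string columns.

    Assumption: the fragment has no XML namespace.
    """
    root = ET.fromstring(votable_fragment)
    found = {}
    for field in root.iter("FIELD"):
        kind = SEXAGESIMAL_UNITS.get(field.get("unit", ""))
        # Only string-typed columns can hold sexagesimal text.
        if kind and field.get("datatype") == "char":
            found[field.get("name")] = kind
    return found
```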

If, on the other hand, we're free to invent something less ugly, I'd
say Pat (in a sibling mail) is right, and this is partly a job for
xtypes; I'd say "hms" and "dms" might work to zeroth order, but as
Anita (in another sibling mail) has rightly pointed out, that is
regrettably not enough, as we may still encounter anything from
12.12.12.12 to +12:12:12.12 to 12 12 12.12 to 12h12m12.12s in the
wild, not to mention 12 12.12 (which was likely intended to mean
12.202 deg rather than 12.2033333, i.e., decimal minutes).
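
To make the decimal-minutes point concrete, here is a minimal Python sketch of a tolerant reader. The function `sexagesimal_to_degrees` and its `hours` flag are made up for illustration. It accepts colon, blank, and letter separators, and it rejects the dot-separated form because that cannot be told apart from a decimal fraction.

```python
import re


def sexagesimal_to_degrees(text, hours=False):
    """Convert a sexagesimal string to decimal degrees (illustrative only).

    Components are separated by ':', whitespace, or the letters h/d/m/s.
    Two components are read as degrees (or hours) plus decimal minutes.
    """
    s = text.strip()
    sign = -1.0 if s.startswith("-") else 1.0
    s = s.lstrip("+-")
    parts = [p for p in re.split(r"[:\shdms]+", s) if p]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"cannot interpret {text!r}")
    # Each further component is worth 1/60 of the previous one.
    value = sum(float(p) / 60 ** i for i, p in enumerate(parts))
    if hours:
        value *= 15.0  # 1 hour of right ascension = 15 degrees
    return sign * value
```

With this reading, "12 12.12" comes out as 12 + 12.12/60 = 12.202 deg, while "12 12 12" gives 12 + 12/60 + 12/3600 = 12.2033333 deg.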

We certainly do not want xtypes for all these variations, and we
probably don't want to invent unit strings to tell them apart,
either.  Hence, this problem would persist even if we were to
sanction units on sexagesimal strings in VOUnits.

I suppose we could define a canonical grammar that would cover most
of what's out there, but that wouldn't be pretty because, if we want
it to be context-free (and that we certainly do), we'd essentially
have to enumerate the various full specifications to ensure the
separator characters match.
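
As a rough sketch of what "enumerating the full specifications" means, here it is as a Python regular expression, with one alternative per separator style. The set of alternatives, the digit counts, and the name `is_well_formed` are assumptions for illustration, not a proposed grammar.

```python
import re

NUM = r"\d{1,3}"             # degrees, hours, or minutes
SEC = r"\d{1,2}(?:\.\d*)?"   # seconds with optional decimal fraction

# One full pattern per separator style. Mixing styles ("12:12 12.12")
# fails because no single alternative accepts it.
_ALTERNATIVES = [
    rf"{NUM}:{NUM}:{SEC}",
    rf"{NUM} {NUM} {SEC}",
    rf"{NUM}[hd]{NUM}m{SEC}s",
]
SEXAGESIMAL = re.compile(r"[+-]?(?:" + "|".join(_ALTERNATIVES) + r")")


def is_well_formed(text):
    """Return True if text matches one of the enumerated styles in full."""
    return SEXAGESIMAL.fullmatch(text) is not None
```

Every further separator style in the wild adds another alternative, which is where the lack of prettiness comes from.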

Or we could designate a single canonical format and require everyone
to comply.  But then many data providers would have to touch their
data, and then wouldn't it be more prudent to just tell people to go
all the way to decimal angles in VOTables (except for provenance) and
leave the sexagesimal formatting to clients?
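
Leaving the formatting to clients is not much work. As a sketch, with the function name `degrees_to_dms` and the colon-separated output style both being arbitrary choices for illustration, a client could render decimal degrees like this:

```python
def degrees_to_dms(deg, precision=2):
    """Format decimal degrees as a signed d:m:s string (illustrative only)."""
    sign = "-" if deg < 0 else "+"
    scale = 10 ** precision
    # Round once, in integer units of 10**-precision arcsec, so that
    # rounding can never produce a seconds field of "60.00".
    total = round(abs(deg) * 3600 * scale)
    total_minutes, secs_scaled = divmod(total, 60 * scale)
    d, m = divmod(total_minutes, 60)
    whole, frac = divmod(secs_scaled, scale)
    if precision:
        return f"{sign}{d:02d}:{m:02d}:{whole:02d}.{frac:0{precision}d}"
    return f"{sign}{d:02d}:{m:02d}:{whole:02d}"
```

For right ascension, a client would divide the value by 15 first and typically drop the sign.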

            -- Markus

