Proposal: Require "ref" for TIMESYS access

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon Feb 11 13:36:08 CET 2019


Hi,

On Mon, Feb 11, 2019 at 07:44:53AM +0100, François Bonnarel wrote:
> It's nice to see TIMESYS in VOTable now. Thanks.
> 
> However I don't understand the beginning of this sentence syntactically
> 
> "The TIMESYS element defines such a time system and gives itself the
> identifies itself with ID="time frame". Then
> the obs time FIELD indicates that its values should be interpreted in that
> time system by referring back to the
> TIMESYS element using ref="time frame".
> 
> And about the meaning : we cannot force the ID to be "time_frame". What
> happens if we have several TIMESYS.

Yes, the wording needs to be improved; I'd say "has an ID attribute"
is more or less good enough.

> On Tom's suggestion. I think it's too much asking. In many use cases there

Asking of whom?  True, a *few* data providers that can't make do with
a simple template but won't use a proper library, either, *may* have
a bit of extra work because they have to manage the ref-ID
relationship.  On the other hand, with mandatory @ref *all* data
providers are spared the extra thought of whether or not they should
have ref-ID, and the spec loses one piece of complicated and
potentially ambiguous language (see below).  So, overall I'd say the
majority of the producer side has it easier with mandatory @ref.

On the client side, the rules to associate a FIELD with a TIMESYS
become a whole lot simpler with mandatory @ref.  Just put yourself in
the shoes of people writing the astropy table parser.  How should
they decide which columns to furnish with a free-hanging TIMESYS?  By
unit?  By UCD?  Just guessing?
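
For comparison, here is a minimal sketch of what the client-side
linking could reduce to with mandatory @ref.  It uses only Python's
xml.etree, not astropy's actual parser; the function name
link_timesys and the sample document are made up for illustration,
and the sample omits the XML namespace a real VOTable would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (no XML namespace, for brevity).
SAMPLE = """<VOTABLE version="1.4">
  <RESOURCE>
    <TIMESYS ID="time_frame" timescale="TCB" refposition="BARYCENTER"
             timeorigin="2455197.5"/>
    <TABLE>
      <FIELD name="obs_time" datatype="double" unit="d"
             ucd="time.epoch" ref="time_frame"/>
      <FIELD name="exptime" datatype="float" unit="s"
             ucd="time.duration;obs.exposure"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def link_timesys(votable_xml):
    """Map FIELD/PARAM names to the attributes of the TIMESYS they reference."""
    root = ET.fromstring(votable_xml)
    # Index every TIMESYS that has an ID by that ID.
    timesys_by_id = {
        el.get("ID"): dict(el.attrib)
        for el in root.iter("TIMESYS")
        if el.get("ID") is not None
    }
    linked = {}
    for el in root.iter():
        # Only an explicit ref to a known TIMESYS creates a link.
        if el.tag in ("FIELD", "PARAM") and el.get("ref") in timesys_by_id:
            linked[el.get("name")] = timesys_by_id[el.get("ref")]
    return linked


links = link_timesys(SAMPLE)
```

Here only obs_time ends up with time metadata; exptime has no ref
and is left alone, with no guessing involved.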

The longer I think about it, the more I'm convinced that if we really
think @ref should be optional, we must give an algorithm for how
FIELDs, PARAMs, and TIMESYS are to be linked.  Something like

for all FIELDs and PARAMs:
  if there's a ref attribute referencing a TIMESYS, add the metadata
  else if thing is-timelike (TODO: how do we decide that?):
    if there's no TIMESYS:
      Forget it?  Complain?  Assign unknown metadata?
    else if there's exactly one TIMESYS:
      Assign the metadata from that timesys
    else:
      Forget it?  Complain?  Assign unknown metadata?

-- note that 80% of that code (in terms of lines) is only there
because of optional @ref, and we haven't even started to explain how
to find FIELDs and PARAMs eligible for TIMESYS annotation.  I'd say
*that* is asking quite a bit (and probably too much).
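
To make the cost concrete, here is a rough Python sketch of the
optional-@ref logic.  Two things in it are assumptions, since nothing
settles them: "time-like" is approximated by a UCD starting with
"time", and the undecided branches simply record "unknown".  The
function names and the sample document are hypothetical, and the
sample again omits the XML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one TIMESYS, and no FIELD references it.
SAMPLE = """<VOTABLE version="1.4">
  <RESOURCE>
    <TIMESYS ID="ts" timescale="TCB" refposition="BARYCENTER"/>
    <TABLE>
      <FIELD name="ra" datatype="double" ucd="pos.eq.ra"/>
      <FIELD name="obs_time" datatype="double" ucd="time.epoch"/>
      <FIELD name="exptime" datatype="float"
             ucd="time.duration;obs.exposure"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def is_timelike(el):
    # ASSUMPTION: a UCD prefix test stands in for the undefined
    # notion of a "time-like" FIELD or PARAM.
    return (el.get("ucd") or "").lower().startswith("time")


def link_with_optional_ref(votable_xml):
    """Return {name: (how, timescale)} for FIELDs/PARAMs given time metadata."""
    root = ET.fromstring(votable_xml)
    systems = list(root.iter("TIMESYS"))
    by_id = {s.get("ID"): s for s in systems if s.get("ID") is not None}
    linked = {}
    for el in root.iter():
        if el.tag not in ("FIELD", "PARAM"):
            continue
        name = el.get("name")
        if el.get("ref") in by_id:
            # Explicit reference: the only unambiguous case.
            linked[name] = ("ref", by_id[el.get("ref")].get("timescale"))
        elif is_timelike(el):
            if len(systems) == 1:
                # Free-hanging TIMESYS: attach it on a guess.
                linked[name] = ("guessed", systems[0].get("timescale"))
            else:
                # ASSUMPTION: with zero or several TIMESYS, record "unknown".
                linked[name] = ("unknown", None)
    return linked


result = link_with_optional_ref(SAMPLE)
```

With this sample, the prefix heuristic attaches TCB not only to
obs_time but also to exptime, a duration for which a time scale and
reference position make little sense.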

> will be no ambiguity, and the single TIMESYS is valid for any times in the
> table. So we absolutely require a ref when there is ambiguity.

I'd contest that multiple time fields/params are rare.  To get an
idea of the minimal level of the problem (i.e., ignoring PARAMs and
such, ignoring tables with insufficient UCD annotation, resource
records without table metadata), I ran the
following RegTAP query the other day:

select table_name, ivo_string_agg(name, ', ')
from rr.table_column natural join rr.res_table
where ucd like 'time%'
group by table_name
having count(*)>1

This gives you a table of about 5000 rows of table names and
aggregated column names for tables in the Registry having more than
one column annotated with time-like UCDs; that means that about 25%
of the tables in the VO have multiple time-like *columns*.  When you
consider that quite a few of these (should) have some additional
time-like params and we have quite a few tables with substandard
annotation, you'll see that the multi-times case is probably the rule
rather than the exception.


> I suggest
> "The TIMESYS element (introduced in VOTable 1.4) defines metadata for
> temporal coordinates. To reference the time system defined by a TIMESYS
> element, FIELDs (and possibly PARAMs) MUST reference the TIMESYS giving
> their frame using the VOTable ref attribute when there is an ambiguity;
> Otherwise the TIMESYS is considered pertinent for all time-like quantities..

...like, an exposure time?  Ok, I'm being a bit polemical here, but
"time-like quantities" is a lot easier to write than to define in a
way that gives computers a chance to decide whether an actual FIELD
or PARAM is one.

Let's save our implementors all that trouble and just require
TIMESYS/@ref.

       -- Markus

