Generic FIELD/PARAM metadata items in VOTable

Mark Taylor m.b.taylor at bristol.ac.uk
Fri May 26 13:06:20 CEST 2023


Hi FX.

On Wed, 24 May 2023, F.-X. Pineau wrote:

> 1 - General case of adding "arbitrary key/value pairs with a FIELD/PARAM"
> 
> Sorry if it has already been discussed (and let me know if it is dumb):
> what about allowing additional non-VOTable-reserved attributes in the
> FIELD/PARAM tags

That is a possibility, though I don't think it's my favourite one.

I was going to say it's not something that's done elsewhere in VOTable,
so it looks stylistically strange.  But it turns out that's not quite
true, since I see buried in VOTable-1.4.xsd:

   <xs:complexType name="Resource">
     ...
     <!-- Suggested Doug Tody, to include new RESOURCE attributes -->
     <xs:anyAttribute namespace="##other" processContents="lax"/>
   </xs:complexType>

though I don't know if anybody has ever used that facility.

It could complicate validation, since freely chosen attributes could
get in the way of predefined ones, e.g. if somebody writes

   <FIELD name="l" datatype="double" units="m">

they probably meant to write 'unit="m"' but it would be interpreted
instead as a bit of custom metadata.  But that could be mitigated
by requiring custom attributes to be in a namespace other than
VOTable's own (namespace="##other", as in the schema fragment above).
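
A minimal sketch of how a client could tell the two cases apart, using
only the Python standard library.  The "ex" prefix, its URI and the
"origin" attribute are invented for the example, and the KNOWN set is
meant to mirror the FIELD attributes in the 1.4 schema; check it
against the XSD before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the "ex" namespace and its attribute are invented.
XML = """
<FIELD xmlns:ex="http://example.org/custom"
       name="l" datatype="double" units="m" ex:origin="survey-A"/>
"""

# Attribute names assumed to be those defined for FIELD in VOTable 1.4.
KNOWN = {"ID", "unit", "datatype", "precision", "width", "xtype",
         "ref", "name", "ucd", "utype", "arraysize", "type"}

def split_attributes(elem):
    """Return (known, custom, suspicious) attribute dicts for one element."""
    known, custom, suspicious = {}, {}, {}
    for key, value in elem.attrib.items():
        if key.startswith("{"):      # ElementTree spells namespaced names {uri}local
            custom[key] = value
        elif key in KNOWN:
            known[key] = value
        else:                        # un-namespaced and unknown: probably a typo
            suspicious[key] = value
    return known, custom, suspicious

known, custom, suspicious = split_attributes(ET.fromstring(XML))
```

With a namespace requirement in place, the mistyped "units" ends up in
the suspicious bucket rather than being silently accepted as custom
metadata.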

> The fact that you (Mark) use the "key/value pairs" term seems to indicate
> that you have in mind the uniqueness of keys (so it is compatible with
> XML tag attributes), right?

Actually I would prefer not to prevent the use of multiple
occurrences of the same "key", since the point of this is largely
to facilitate usages that we haven't predicted.  Off the top of
my head I could imagine wanting to do something like:

   <FIELD name="identifier" datatype="char" arraysize="*">
     <META name="resolver_url" value="https://ned.ipac.caltech.edu/"/>
     <META name="resolver_url" value="https://simbad.u-strasbg.fr/simbad/"/>
   </FIELD>

There may be other examples that are more compelling; on the other
hand you could argue that allowing this complicates processing
(and the above could be achieved with e.g. a space-separated list).
But on the whole I'd prefer to do what we have with PARAM and FIELD,
which is RECOMMEND that a given name/key is not used more than once
in the same scope, without forbidding it.
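
To give an idea of the processing cost of allowing repeats, here is a
minimal Python sketch that collects META children into a mapping from
name to a list of values.  META is the element proposed in this
thread, not part of VOTable 1.4, and read_meta is a made-up helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# META is only a proposal in this thread; it is not in VOTable 1.4.
XML = """
<FIELD name="identifier" datatype="char" arraysize="*">
  <META name="resolver_url" value="https://ned.ipac.caltech.edu/"/>
  <META name="resolver_url" value="https://simbad.u-strasbg.fr/simbad/"/>
</FIELD>
"""

def read_meta(field):
    """Map each META name to the list of its values, in document order."""
    meta = defaultdict(list)
    for child in field.findall("META"):
        meta[child.get("name")].append(child.get("value"))
    return dict(meta)

meta = read_meta(ET.fromstring(XML))
```

A client that only expects one value per key can take the first list
entry; one that understands repeats gets all of them.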


> 2 - On the particular HEALPix example
> 
> VOTable represents tabular data, thus a very flat view on data.
> 
> I lack fantasy to imagine what to do from an HEALPix number without knowing
> its order.
> The order is an important piece of data going along with the HEALPix indices.
> When each row has a different order, it is natural to provide the order in a
> separate column.
> We do not consider the "order" column as a sub-column.
> So, if all rows have the same order, it seems natural to me to provide the
> info in a PARAM (= column of constant value), thus at a table level
> (and not at a column level).
> 
> From my point-of-view, it is the role of GROUPs or VODML or refs
> (or other mechanism possibly complementing but letting the FIELD/PARAM
> structure unchanged) to introduce a hierarchy/logic/semantic(?)
> (beyond UCDs) in the set of FIELDS/PARAMS.

You are right that it can be done like this, and indeed revisiting
the discussion from
http://mail.ivoa.net/pipermail/apps/2016-August/001131.html
I see that the consensus was more or less in that direction,
though the semantic details didn't get finalised.

> (I agree for the readability, but are VOTables made to be human readable?
> And from a human point-of-view, the column name (e.g. 'hpx8') and the
> DESCRIPTION of the column should be enough to know the order).

I do agree that it is not VOTable's job to be human readable;
I maybe didn't express myself very well about that.
I'm not so concerned that humans looking at the XML of individual
VOTables should be able to extract metadata (though that can be a nice
bonus e.g. during debugging).  It's more that if the metadata is
stored as child elements of the FIELD/PARAMs then it's absolutely
clear (e.g. to client authors) from the structure of the document
what the semantics are, and equally clear (e.g. to VOTable authors
or data providers) from the VOTable schema how to store per-column
metadata.  With the solutions based on additional PARAMs and GROUPs,
and to a lesser extent the ability to use freely assigned attributes,
it is necessary to read and understand the relevant parts of the
VOTable standard, and these details are not currently spelled out,
though the use of GROUPs is hinted at in Section 4.9.

So: I like something like the META children of FIELD/PARAM for clarity.
Using free attributes goes somewhat in the same direction, but I feel
it is a bit more problematic and less obvious.
But it is possible to solve the problem with existing mechanisms
(GROUPs and refs), so if people prefer that, we should just make
sure that it's properly documented.
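
For comparison, a minimal Python sketch of what a client has to do
under the GROUP-and-refs approach.  The layout is only an assumption:
the GROUP name "healpix" and PARAM name "order" are invented, and
params_by_field is a made-up helper, since the semantic details were
never finalised.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: the GROUP and PARAM names are not standardised.
XML = """
<TABLE>
  <GROUP name="healpix">
    <PARAM name="order" datatype="int" value="8"/>
    <FIELDref ref="hpx"/>
  </GROUP>
  <FIELD ID="hpx" name="hpx8" datatype="long"/>
  <FIELD ID="ra" name="ra" datatype="double" unit="deg"/>
</TABLE>
"""

def params_by_field(table):
    """Map FIELD ID -> {PARAM name: value} for PARAMs sharing a GROUP with it."""
    result = {}
    for group in table.iter("GROUP"):
        params = {p.get("name"): p.get("value") for p in group.findall("PARAM")}
        for fref in group.findall("FIELDref"):
            result.setdefault(fref.get("ref"), {}).update(params)
    return result

per_field = params_by_field(ET.fromstring(XML))
```

The code is short, but it only works if the client already knows the
convention that a PARAM sharing a GROUP with a FIELDref describes that
FIELD, which is the part that would need documenting.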

Mark

--
Mark Taylor  Astronomical Programmer  Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk          https://www.star.bristol.ac.uk/mbt/

