apps Digest, Vol 135, Issue 8

gilles landais gilles.landais at astro.unistra.fr
Fri May 26 18:59:09 CEST 2023


Hi all,

Regarding DCP, I am in favour, like Baptiste and Markus, of exploiting 
semantics in VOTable (using RDF).
It seems a good way to improve interoperability and to open doors in 
an interdisciplinary context.

For information, none of solutions (1), (2) or (3) breaks the astropy parser.
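
To make the three serializations concrete (they are quoted further down in this digest), here is a minimal sketch that reads each one and pulls out the same key/value pair. It uses Python's standard xml.etree.ElementTree, not astropy, so it says nothing about astropy's own behaviour. The helper name field_metadata is hypothetical, and the snippets are bare fragments without the VOTable namespace.

```python
import xml.etree.ElementTree as ET

# The three proposed serializations, copied from the thread as bare fragments.
OPTIONS = {
    "(1) INFO": '<FIELD name="healpix_id" datatype="int">'
                '<INFO name="healpix_order" value="8"/></FIELD>',
    "(2) META": '<FIELD name="healpix_id" datatype="int">'
                '<META key="healpix_order" value="8"/></FIELD>',
    "(3) LINK": '<FIELD name="healpix_id" datatype="int">'
                '<LINK action="rdf" content-role="#healpix_order" value="8"/></FIELD>',
}


def field_metadata(field):
    """Collect key/value pairs from any of the three proposed child elements.

    Hypothetical helper: the mapping rules below are an illustration only.
    """
    pairs = {}
    for child in field:
        if child.tag == "INFO":
            pairs[child.get("name")] = child.get("value")
        elif child.tag == "META":
            pairs[child.get("key")] = child.get("value")
        elif child.tag == "LINK" and child.get("action") == "rdf":
            # Drop the leading '#' of the content-role fragment to get the key.
            pairs[child.get("content-role").lstrip("#")] = child.get("value")
    return pairs


results = {label: field_metadata(ET.fromstring(xml)) for label, xml in OPTIONS.items()}
for label, pairs in results.items():
    print(label, pairs)
```

All three forms carry the same information; they differ only in which element and attribute names a client has to know about.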

Regards,

Gilles

On 25/05/2023 at 12:00, apps-request at ivoa.net wrote:
> Send apps mailing list submissions to
> 	apps at ivoa.net
>
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://mail.ivoa.net/mailman/listinfo/apps
> or, via email, send a message with subject or body 'help' to
> 	apps-request at ivoa.net
>
> You can reach the person managing the list at
> 	apps-owner at ivoa.net
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of apps digest..."
>
>
> Today's Topics:
>
>     1. Re: Generic FIELD/PARAM metadata items in VOTable (F.-X. Pineau)
>     2. Re: Generic FIELD/PARAM metadata items in VOTable
>        (Baptiste Cecconi)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 24 May 2023 12:34:41 +0200
> From: "F.-X. Pineau" <francois-xavier.pineau at astro.unistra.fr>
> To: Mark Taylor <m.b.taylor at bristol.ac.uk>
> Cc: IVOA Applications WG <apps at ivoa.net>
> Subject: Re: Generic FIELD/PARAM metadata items in VOTable
> Message-ID: <93a209b2-309d-565b-4f3b-be7452fa42b8 at astro.unistra.fr>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Mark and All,
>
> 1 - General case of adding "arbitrary key/value pairs with a FIELD/PARAM"
>
> Sorry if it has already been discussed (and let me know if it is dumb):
> what about allowing additional non-VOTable-reserved attributes in the
> FIELD/PARAM tags (yes, it will break overly restrictive existing parsers)?
> The fact that you (Mark) use the term "key/value pairs" seems to
> indicate that you have in mind the uniqueness of keys (so it is
> compatible with XML tag attributes), right?
>
> If we serialize in JSON or TOML, we just get additional key/value pairs
> in the FIELD/PARAM objects.
> It seems pretty straightforward and elegant (no additional complexity
> with sub-objects, ...).
>
> The Rust VOTable parser has supported this since the beginning
> https://github.com/cds-astro/cds-votable-rust/blob/81f8c481dca03f1766ab1d922c64e9726c29ef52/src/field.rs#L118-L120
> and it will be even simpler if we assume values are only strings (as
> you suggest).
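
A minimal sketch of the extra-attribute idea described above, assuming string-only values: any attribute outside a reserved set is treated as a free key/value pair, and the whole FIELD serializes to a flat JSON object. The RESERVED set is an assumed list of FIELD attribute names and should be checked against the VOTable schema; the healpix_order attribute is a hypothetical example. This is not how the Rust parser is implemented.

```python
import json
import xml.etree.ElementTree as ET

# Assumption: attribute names reserved for FIELD; verify against the VOTable schema.
RESERVED = {
    "ID", "name", "datatype", "arraysize", "unit", "ucd",
    "utype", "xtype", "ref", "precision", "width", "type",
}

# Hypothetical FIELD carrying one non-reserved attribute.
field = ET.fromstring(
    '<FIELD ID="healpix_id" name="healpix_id" datatype="int" healpix_order="8"/>'
)

# Split the attributes into standard ones and free key/value pairs.
standard = {k: v for k, v in field.attrib.items() if k in RESERVED}
extra = {k: v for k, v in field.attrib.items() if k not in RESERVED}

# In JSON the extra pairs sit next to the standard ones, with no sub-object.
as_json = json.dumps({**standard, **extra}, indent=2)
print(as_json)
```

Because XML attribute names are unique within a tag, the keys are unique by construction, which is what makes the flat JSON object possible.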
>
>
> 2 - On the particular HEALPix example
>
> VOTable represents tabular data, thus a very flat view on data.
>
> I lack the imagination to see what to do with a HEALPix number without
> knowing its order.
> The order is an important piece of data going along with the HEALPix
> indices.
> When each row has a different order, it is natural to provide the order
> in a separate column.
> We do not consider the "order" column as a sub-column.
> So, if all rows have the same order, it seems natural to me to provide
> the info in a PARAM
> (= column of constant value), thus at a table level (and not at a column
> level).
>
> From my point of view, it is the role of GROUPs or VODML or refs
> (or other mechanisms possibly complementing, but leaving unchanged, the
> FIELD/PARAM structure)
> to introduce a hierarchy/logic/semantics(?) (beyond UCDs) in the set of
> FIELDs/PARAMs.
>
>> You could also end up with multiple PARAMs having the same name, but
>> referring to different columns, but I don't think there is any rule
>> against that.
> I am probably missing your point, since I don't see this as problematic
> given that there is (at most) one ref per PARAM and the IDs are supposed
> to be unique.
> (A good practice, though, would be to have unique FIELD/PARAM names.)
> Are you thinking of several columns sharing the same PARAM (which would
> have to be duplicated)?
>
> (I agree about readability, but are VOTables made to be human-readable?
> And from a human point of view, the column name (e.g. 'hpx8') and the
> DESCRIPTION of the column should be enough to know the order.)
>
> Bonus: when serializing a VOTable in CSV, I tend to think that
> PARAMs should be represented as columns containing constant values
> (so even if the metadata is lost, we still have the PARAMs (redundant)
> values in output).
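
The CSV idea above can be sketched as follows, using only the Python standard library. The table content (the flux column and the two rows) is made up for illustration, and the fragment omits the VOTable namespace. Each PARAM is appended as an extra column whose value is repeated on every row.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical table: one PARAM (constant) and two FIELDs (per-row values).
VOTABLE = """
<TABLE>
  <PARAM name="healpix_order" datatype="int" value="8"/>
  <FIELD name="healpix_id" datatype="int"/>
  <FIELD name="flux" datatype="double"/>
  <DATA><TABLEDATA>
    <TR><TD>1234</TD><TD>0.5</TD></TR>
    <TR><TD>5678</TD><TD>1.5</TD></TR>
  </TABLEDATA></DATA>
</TABLE>
"""

table = ET.fromstring(VOTABLE)
params = [(p.get("name"), p.get("value")) for p in table.findall("PARAM")]
fields = [f.get("name") for f in table.findall("FIELD")]

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
# Header: the real columns first, then one column per PARAM.
writer.writerow(fields + [name for name, _ in params])
for tr in table.iter("TR"):
    cells = [td.text for td in tr.findall("TD")]
    # Repeat each PARAM value on every row, i.e. a column of constant value.
    writer.writerow(cells + [value for _, value in params])

csv_text = out.getvalue()
print(csv_text)
```

The units, UCDs and other metadata are still lost in CSV, but the PARAM values themselves survive, at the price of redundancy.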
>
> What do you think?
>
>
> fx
>
>
> On 23/05/2023 at 11:34, Mark Taylor wrote:
>> FX,
>>
>> I hadn't thought of that, it's definitely a possibility.
>> The semantics of the various ref/ID linkages are rather under-documented
>> in VOTable, so like the other options it would need to be written
>> in the standard what the meaning of this construction would be.
>> Compared to the other options it's less obvious to a human reader
>> what's going on, but it's a bonus that it doesn't require any changes
>> to the schema.
>>
>> One negative consideration is that legacy software (e.g. current
>> version of STIL/STILTS/TOPCAT) would see such PARAMs, ignore the ref,
>> and assume that this was table-level rather than column-level metadata -
>> but the same might happen for option (1).
>> You could also end up with multiple PARAMs having the same name, but
>> referring to different columns, but I don't think there is any rule
>> against that.
>>
>> Interested in other people's opinions.
>>
>> Mark
>>
>> On Mon, 22 May 2023, Francois-Xavier PINEAU wrote:
>>
>>> Hi Mark and all,
>>>
>>> Only considering the given example (so the following may be irrelevant), what
>>> about something like:
>>>
>>> <PARAM name="healpix_order" value="8" ref="healpix_id"/>
>>> <FIELD ID="healpix_id" name="healpix_id" datatype="int"/>
>>>
>>> which popped up as the more natural way of describing this to me
>>> (PARAM <=> constant column; with a ref to "link" it to another existing
>>> column).
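
A minimal sketch of how a client might interpret this construction, assuming (as proposed here, not as defined by the VOTable standard) that a PARAM whose ref matches a FIELD ID is column-level metadata for that FIELD. The second FIELD and the names column_meta and table_meta are hypothetical.

```python
import xml.etree.ElementTree as ET

# The PARAM/FIELD pair from the example, plus a hypothetical second column.
TABLE = """
<TABLE>
  <PARAM name="healpix_order" value="8" ref="healpix_id"/>
  <FIELD ID="healpix_id" name="healpix_id" datatype="int"/>
  <FIELD ID="flux" name="flux" datatype="double"/>
</TABLE>
"""

table = ET.fromstring(TABLE)
fields_by_id = {f.get("ID"): f for f in table.findall("FIELD") if f.get("ID")}

column_meta = {}  # FIELD ID -> {PARAM name: value}
table_meta = {}   # PARAMs with no resolvable ref stay table-level

for param in table.findall("PARAM"):
    target = param.get("ref")
    if target in fields_by_id:
        # Assumed interpretation: the ref attaches the PARAM to that column.
        column_meta.setdefault(target, {})[param.get("name")] = param.get("value")
    else:
        table_meta[param.get("name")] = param.get("value")

print(column_meta)
print(table_meta)
```

Software that ignores the ref would put healpix_order in table_meta instead, i.e. treat it as table-level metadata.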
>>>
>>> If the order is different for each row, it will naturally be described as
>>> (the ref and ID attributes are optional):
>>>
>>> <FIELD name="healpix_order" datatype="int" ref="healpix_id"/>
>>> <FIELD ID="healpix_id" name="healpix_id" datatype="int"/>
>>>
>>> Cheers,
>>>
>>>
>>> fx
>>>
>>>
>>> On 17/05/2023 at 18:07, Mark Taylor wrote:
>>>> Dear Applications,
>>>>
>>>> this mail is a summary of a proposed modification to VOTable that has
>>>> been discussed on Github (https://github.com/ivoa-std/VOTable/issues/29)
>>>> and that may make it into the proposed VOTable 1.5; I'm summarising it
>>>> for comment on the apps mailing list at the request of Tom Donaldson,
>>>> VOTable editor.
>>>>
>>>> Requirement
>>>> -----------
>>>>
>>>> People sometimes want to add arbitrary key=value metadata to VOTable FIELD
>>>> or PARAM columns, the sort of thing that doesn't fit into the existing
>>>> attributes (unit, UCD, xtype, utype).  Some examples:
>>>>
>>>>       - Labelling DataLink PARAMs as mandatory or optional
>>>>         (https://github.com/ivoa-std/DataLink/issues/51)
>>>>
>>>>       - Indicating HEALPix order for a column containing a HEALPix index
>>>>         (http://mail.ivoa.net/pipermail/apps/2016-August/001131.html)
>>>>
>>>>       - Domain-specific standard metadata items from outside of astronomy
>>>>         (CAIO ATTRIBUTE at
>>>>         https://www.cosmos.esa.int/web/csa-guide/tap-tables-and-views)
>>>>
>>>> At present there's really no way to do this, though in some cases it's
>>>> possible to achieve the required effect by ad hoc abuse of some underused
>>>> VOTable elements or attributes.
>>>>
>>>> I would like to see a way to associate arbitrary key/value pairs with a
>>>> FIELD/PARAM to address issues like the above, and others we haven't
>>>> foreseen.
>>>> The idea would not be to associate any semantics to such per-column metadata
>>>> within the VOTable standard, though other client standards or applications
>>>> could do that using their own key vocabularies if they wanted to.
>>>> I don't think the values need to be typed (i.e. key and value can just
>>>> be strings as far as VOTable is concerned).
>>>>
>>>> Solutions
>>>> ---------
>>>>
>>>> Since multiple instances per FIELD/PARAM might in principle be required,
>>>> the obvious thing is to use child elements each with a key and value
>>>> attribute.
>>>> Some possibilities:
>>>>
>>>>       (1) Allow FIELD/PARAM to contain INFO children:
>>>>
>>>>            <FIELD name="healpix_id" datatype="int">
>>>>              <INFO name="healpix_order" value="8"/>
>>>>            </FIELD>
>>>>
>>>>       (2) Invent a new element for this purpose, say META:
>>>>
>>>>            <FIELD name="healpix_id" datatype="int">
>>>>              <META key="healpix_order" value="8"/>
>>>>            </FIELD>
>>>>
>>>>       (3) Use the existing LINK element using RDF to indicate semantics:
>>>>
>>>>            <FIELD name="healpix_id" datatype="int">
>>>>              <LINK action="rdf" content-role="#healpix_order" value="8"/>
>>>>            </FIELD>
>>>>
>>>> (1) and (2) would require modifications to the VOTable schema.
>>>> (1) is arguably less disruptive since it doesn't introduce a new element;
>>>> however it may be more prone to confusing existing clients, which may assume
>>>> that an INFO anywhere within a TABLE represents table-level, rather than
>>>> column-level, metadata.
>>>>
>>>> (3) requires no change to the VOTable schema, the only change required is
>>>> an explanation somewhere in the document text about what this means,
>>>> and that this pattern is the recommended way to do this sort of thing.
>>>>
>>>> Markus and I have had discussions on the relative merits of these options
>>>> at https://github.com/ivoa-std/VOTable/issues/29.
>>>> Markus likes (3) because it fits into RDF semantic technology;
>>>> I find (3) obscure (not obvious when reading what it means, not obvious
>>>> when writing that this is how to communicate key=value intent)
>>>> and therefore tend to favour (1) or (2) (probably (2)).
>>>> But the fact that (3) requires no schema changes is clearly a significant
>>>> bonus.
>>>>
>>>> I think either of us could live with either solution.
>>>> Markus feel free to correct or clarify any of the above.
>>>>
>>>> Discussion
>>>> ----------
>>>>
>>>> So, do others have opinions on:
>>>>
>>>>      (a) whether this is a requirement worth expending effort to satisfy
>>>>      (b) which of options (1), (2), (3) or (other) is preferred
>>>>
>>>> I guess initial followups should go to this list, but presumably the
>>>> discussion will make its way back to
>>>> https://github.com/ivoa-std/VOTable/issues/29
>>>> eventually; feel free to consult that Issue for more detail on the summary
>>>> above.
>>>>
>>>> Mark
>>>>
>>>> --
>>>> Mark Taylor  Astronomical Programmer  Physics, Bristol University, UK
>>>> m.b.taylor at bristol.ac.uk            https://www.star.bristol.ac.uk/mbt/
>> --
>> Mark Taylor  Astronomical Programmer  Physics, Bristol University, UK
>> m.b.taylor at bristol.ac.uk           https://www.star.bristol.ac.uk/mbt/
>
> ------------------------------
>
> Message: 2
> Date: Thu, 25 May 2023 14:37:11 +0900
> From: Baptiste Cecconi <baptiste.cecconi at obspm.fr>
> To: Markus Demleitner <msdemlei at ari.uni-heidelberg.de>
> Cc: apps at ivoa.net
> Subject: Re: Generic FIELD/PARAM metadata items in VOTable
> Message-ID: <796388A7-E508-4A36-A5D5-B0A452DCCB97 at obspm.fr>
> Content-Type: text/plain;	charset=utf-8
>
> Hi all,
>
> (I am replying to this message, but I have read the more recent ones.)
>
> My first thought when Pierre mentioned the issue to me (I have just registered on the Apps list) was the solution proposed by FX: use GROUP/PARAM to define things and refer to them in FIELD.
>
> However, as the Semantics chair, I must admit that I very much like the RDFa-like solution in a <META ...> element. Even if it is not yet a full RDFa annotation, it moves us in the right direction for wider interoperability (e.g. reuse of non-IVOA ontologies/vocabularies).
>
> Anyway, thanks Markus for your last talk as a chair, and for staying around.
>
> Cheers
> Baptiste
>
>
>> On 19 May 2023 at 15:27, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
>>
>> Hi Apps,
>>
>> On Wed, May 17, 2023 at 05:07:59PM +0100, Mark Taylor wrote:
>>>    (1) Allow FIELD/PARAM to contain INFO children:
>>>
>>>         <FIELD name="healpix_id" datatype="int">
>>>           <INFO name="healpix_order" value="8"/>
>>>         </FIELD>
>>>
>>>    (2) Invent a new element for this purpose, say META:
>>>
>>>         <FIELD name="healpix_id" datatype="int">
>>>           <META key="healpix_order" value="8"/>
>>>         </FIELD>
>>>
>>>    (3) Use the existing LINK element using RDF to indicate semantics:
>>>
>>>         <FIELD name="healpix_id" datatype="int">
>>>           <LINK action="rdf" content-role="#healpix_order" value="8"/>
>>>         </FIELD>
>>>
>> [...]
>>> I think either of us could live with either solution.
>>> Markus feel free to correct or clarify any of the above.
>> I think Mark has nicely summarised the state of affairs, and yes, if
>> someone else does the work I won't stand in the way of either of
>> these proposals.
>>
>> Except... I hesitate to complicate the situation, but having somehow
>> repressed the memories of that discussion, at the Bologna Interop I
>> actually discussed "proper" RDF (that is, RDFa) in VOTable:
>>
>> https://wiki.ivoa.net/internal/IVOA/InterOpMay2023Semantics/rdfa-notes.pdf
>>
>> Given that, I'd suggest that *if* we create a new element as per
>> META, let's at least make it RDFa-ready *in case* we'd ever want to
>> go in that direction.  That is: literal values should go into the
>> element content (if we *really* don't want that because it's not
>> quite in line with the style of the rest of VOTable, then call the
>> attribute @content rather than @value), and references would go into
>> an @href attribute.  And we ought to use @property instead of @key.
>>
>> So:
>>
>>   <META property="healpix_order">8</META>
>>
>> or:
>>
>>   <META property="healpix_order" content="8"/>
>>
>> And then perhaps, as an illustration of using an RDF object:
>>
>>   <META property="https://www.ivoa.net/rdf/voresource/relationship_type#IsDerivedFrom"
>>     href="ivo://cds.vizier/j/a+a/658/a167"
>>     >Gobrecht et al: (Al2O3)n, n=1-10, clusters data</META>
>> Note that that's not enough to produce the right RDF triples (because
>> the subject will be wrong without doing something on the FIELD, and
>> that's not pretty to do because of ID vs. id), but at least it won't
>> build new barriers to RDFa in VOTable.
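
A minimal sketch of how a reader could treat the META forms suggested above uniformly. The precedence (an href reference first, then an @content literal, then the element text) is an assumption made for this illustration, not something the thread or any standard specifies, and the helper name meta_object is hypothetical. As noted above, this does not produce proper RDF triples, since the subject is not handled.

```python
import xml.etree.ElementTree as ET


def meta_object(meta):
    """Return (property, object) for a META element in any of the proposed forms.

    Assumed precedence: @href (reference), then @content, then element text.
    """
    prop = meta.get("property")
    if meta.get("href") is not None:
        return prop, meta.get("href")          # object is a reference
    if meta.get("content") is not None:
        return prop, meta.get("content")       # literal carried in @content
    return prop, (meta.text or "").strip()     # literal carried in element content


samples = [
    '<META property="healpix_order">8</META>',
    '<META property="healpix_order" content="8"/>',
    '<META property="https://www.ivoa.net/rdf/voresource/relationship_type#IsDerivedFrom"'
    ' href="ivo://cds.vizier/j/a+a/658/a167"'
    '>Gobrecht et al: (Al2O3)n, n=1-10, clusters data</META>',
]

parsed = [meta_object(ET.fromstring(s)) for s in samples]
for prop, obj in parsed:
    print(prop, "->", obj)
```

With this reading, the first two forms are interchangeable, and in the third the element text serves only as a human-readable label for the referenced resource.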
>>
>> Given my conclusion in the talk ("it's not low-hanging fruit, and the
>> fruit's not terribly sweet either"), however, I think I'd still go
>> for the (totally non-RDFa but otherwise nicely RDF-spirited) option (3).
>>
>>             -- Markus
>>
>> (who clearly is still struggling to let go of Semantics:-)
>
>