[VEP-005]<-- Re: Datalink vocabulary extension: sibling/co-generated

Patrick Dowler pdowler.cadc at gmail.com
Thu May 7 01:40:45 CEST 2020


If I think of these terms just as normal English words, I have two kids
that are siblings (same parents), and with computers we'd use those terms
with trees and graphs without confusing anyone. When I think of
co-generated, that is a much tighter relationship: if it was kids, they
would be twins, not just siblings :-)... with data it seems to mean
"generated by the same instance of processing from the same input" (same
instance also implying/saying "at the same time"). So I'm pretty sure
co-generated is a narrow term just because it seems quite
specific/restrictive. Is it narrower than sibling? I think sibling is
just the "from the same input" part, because I think we are talking about
siblings-by-provenance.
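
A minimal sketch of that reading, in Python. The record layout (`inputs`,
`activity`) and the function names are made up for illustration and are not
taken from any IVOA provenance or Datalink document; "sibling" here means
only "made from the same input", and "co-generated" additionally requires the
same processing instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    # Hypothetical provenance record, for illustration only.
    name: str
    inputs: frozenset  # identifiers of the inputs the product was made from
    activity: str      # identifier of one processing instance (one run)


def is_sibling(a, b):
    # Siblings-by-provenance: distinct products made from the same input.
    return a != b and a.inputs == b.inputs


def is_co_generated(a, b):
    # Tighter relationship: same input AND same processing instance ("twins").
    return is_sibling(a, b) and a.activity == b.activity


raw = frozenset({"raw-1"})
p1 = Product("spectrum", raw, "run-A")
p2 = Product("lightcurve", raw, "run-A")
p3 = Product("image", raw, "run-B")

print(is_co_generated(p1, p2))  # same input, same run
print(is_sibling(p1, p3))       # same input, different run
print(is_co_generated(p1, p3))  # different run, so not co-generated
```

Under these assumed definitions every co-generated pair is also a sibling
pair, which is what would make co-generated the narrower term.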

I don't really see that either sibling or co-generated captures the
relationship from the Alberto/ESO use case: those other files that would be
calibrated by the same set of calibrators are related, but not in this
way... that seems more like "you can do the same thing to these ...", but I
don't know what word to use :-)

My gut feeling is that vocabularies are easier to grow if one defines
general broad terms and adds narrower terms later, when differentiation is
needed. The other way around (define a term and later on realise it is a
narrower term for a new or existing broad term) is a kind of refactoring.
That could be harmless in practice or it could imply a subtle change in
meaning. Still, refactoring is also a normal kind of evolution so it isn't
wrong to define a narrow term and add a broad parent later.
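
To make the two growth paths concrete, here is a toy sketch in Python. The
dictionary layout and the helper names are hypothetical and not how the IVOA
vocabularies are actually stored; treating "sibling" as the broad parent of
"co-generated" is only the tentative reading from above.

```python
def add_narrower(vocab, term, parent):
    # Growing downwards: a new term is added under an existing broad term,
    # and no existing entry changes.
    if parent not in vocab:
        raise KeyError(parent)
    vocab[term] = parent


def add_broad_parent(vocab, parent, children):
    # Refactoring: existing terms are re-parented, so their entries change.
    vocab[parent] = None
    for child in children:
        vocab[child] = parent


def ancestors(vocab, term):
    # Walk up the "broader" links until a top-level term is reached.
    out = []
    while vocab[term] is not None:
        term = vocab[term]
        out.append(term)
    return out


# Path 1: define the broad term first, add the narrow one later.
grown = {"sibling": None}  # term -> broader term (None = top level)
add_narrower(grown, "co-generated", "sibling")

# Path 2: define the narrow term first, add the broad parent later.
refactored = {"co-generated": None}
add_broad_parent(refactored, "sibling", ["co-generated"])

print(grown == refactored)
print(ancestors(refactored, "co-generated"))
```

Both paths end with the same tree, but in the second one the entry for an
already published term is modified, which is where a subtle change in meaning
could creep in.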

my 2c,

--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada


On Wed, 6 May 2020 at 08:52, François Bonnarel <
francois.bonnarel at astro.unistra.fr> wrote:

> Hi,
>
> Le 06/05/2020 à 17:37, alberto micol a écrit :
> > Thanks François. The ESO need was for raw files taken as part of the
> > same observation template; all those (I was calling them “siblings”) are
> > subject to exactly the same set of calibrations.
> > Or to say it differently and maybe more clearly: the datalink service
> > activated on a particular raw file returns not only the link to the
> > calibration files that can be used to process the given raw file, but it
> > also links to the other raw (sibling) files that can be calibrated using
> > the same calibration files.
> >
> > Do I understand correctly that “sibling” is now called “co-generated” ?
> Yes, it's my proposal because a couple of people didn't like "sibling"
> at all. I thought something inspired by provenance like
> "co-generated" would fit better with the definition. I didn't know you
> were using "sibling" actually.
> >
> > I think “co-generated” could work for this case of mine, as indeed those
> > raw files have been generated by the same observation template (not from
> > other data, but from the same observing process). Would that work?
> Yes.
> >
> > If the above is correct, then I would not use “counterpart”.
>
> Hmm. My mistake. I wrongly remembered that your use case was
> something like "counterpart" and not "co-generated".
>
> Cheers
>
> François
>
> >
> > Many thanks,
> > Alberto
> >
> >> On 6. May 2020, at 14:42, François Bonnarel <
> >> francois.bonnarel at astro.unistra.fr> wrote:
> >>
> >> Hi Mireille, Markus, all
> >>
> >> Le 28/04/2020 à 19:02, Mireille LOUYS a écrit :
> >>> Sounds ok to me.
> >>>> Now, vocabularies 2 currently says on VEP review:
> >>>>
> >>>>     During the process, all parts of the VEP may be changed except the
> >>>>     term(s) proposed.
> >>>>
> >>>> and I still think that's largely a good idea.
> >>>>
> >>>> Hence, before I retract VEP-003 and replace it with an essentially
> >>>> identical VEP-004 with co-generated: Would anyone here object to that
> >>>> or strongly prefer #sibling?
> >>> I agree with "co-generated". The meaning is close to "sibling", it
> >>> has less connotation of graph theory, and it may be understandable
> >>> to a larger audience.
> >> :-)
> >>>>> p
> >>>>>
> >>>>>        I propose something like "#other" or "#alternate" (the latter was
> >>>>> already proposed by Markus ..... in 2015 !!! semantics session during
> >>>>> Interop in Sydney)
> >>>> #alternate was really intended when #this has multiple
> >>>> representations (classic example: a spectrum that you get as
> >>>> FITS-array, FITS-table, SDM VOTable, or CSV).  I still think this is
> >>>> a good idea because we ought to make it a SHOULD that there's just
> >>>> one #this per ID.  But that's for another VEP.
> >>> I agree. I understand this term means "same content in a different
> >>> representation", while we need a term to say that
> >>> we link to another dataset that is interesting to help/enrich the
> >>> interpretation of the data stored in #this.
> >>>
> >>> "related_data", "other", "see-also" seem too vague for this, but I
> >>> cannot make up a better term. :-(
> >>>
> >>>
> >>>
> >> I was still looking for a word which says it is
> >>
> >>        1) not metadata, but data; not calibration data; not auxiliary
> >>           (but main information); not documentation; and not a service
> >>
> >>        2) associated with the #this thing, but neither derived,
> >>           nor progenitor, nor co-generated.
> >>
> >>        We have plenty of use cases like that in VizieR, for XMM, and
> >>        probably for ESO, if I remember well an older request from
> >>        Alberto Micol. This can also be valid for "TimeSeries" outside the
> >>        Gaia use case, which is well represented by co-generated (see VEP-004).
> >>
> >>        Eventually I found the "counterpart" term.
> >>
> >>       Online Oxford dictionary definition reads "A person or thing
> >>       holding a position or performing a function that corresponds to that of
> >>       another person or thing in another place"
> >>
> >>       Sounds good to me. It is more general than "contains" (corresponds
> >>       to the source #this but in the "image" world), "followup" (corresponds
> >>       to #this but in the future), "cross-associated", "cross-correlated" ...
> >>       but still covers these terms as a head term if desired.
> >>
> >>       I think with "progenitor", "derived", "co-generated" and
> >>       "counterpart" we cover a wide field of relationships between #this and a
> >>       linked dataproduct.
> >>
> >>       As you know from the DataLink discussion, the actual
> >>       dataproduct_type will be given in the mime-type parameter of the
> >>       "content-type" field of the Datalink response.
> >>
> >>     So this is VEP-005:
> >>
> >> Vocabulary: http://ivoa.net/rdf/datalink/core
> >> Author: François Bonnarel
> >> Date: 2020-05-06
> >> Supercedes: VEP-001
> >>
> >> New Term: counterpart
> >> Action: Addition
> >> Label: counterpart dataproduct
> >> Description: Data products holding a position that corresponds to #this
> >> in another data space.
> >> Used-in:
> >>       VizieR or ESO example to be worked on
> >>
> >> Rationale:
> >>      Astronomers often want to associate astronomical objects,
> >>      sources or datasets with dataproducts of other provenance that share
> >>      some common features with them.
> >>
> >>      Examples of that are measurements in another band; images, cubes,
> >>      spectra, timeseries of a source; the same dataproduct type and location
> >>      in physical space but at another time; etc.
> >>
> >>
> >> Cheers
> >>
> >> François
> >>
> >>
> >>
> >>
>