Datalink vocabulary extension: sibling/co-generated

Mon May 11 13:05:18 CEST 2020

Hi,

> On 2020-05-11, at 10:37, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
> 
> And while you're right, we're categorising along provenance relations
> here, I think that absolutely makes sense for datalink (which is why
> we've had #progenitor and #derivation from day one) because it
> intersects with what you'd like to do with the data:
> 

I do not see a problem with going up the provenance tree: that is going to result in a fairly well-bounded, small number of results, which hopefully the data provider can ensure is complete. In any other direction, however, the true number of links is theoretically unbounded, which raises two problems:

* the data provider is potentially only aware of the things they have done to the data themselves, which leads to the completeness problem (especially in the #derivation direction)

* as a client, I risk being overwhelmed by the number of responses, and I don’t really have the tools that I have at the access protocol level to filter them down to the things that I am interested in.

> Debugging? -> #progenitor [wait for my VEP pushing #calibration below
>  #progenitor]
> Analyses other people have made? -> #derivation
> Other things people have done with the observation? -> #sibling/#co-derived
> 

"other things people have done with the data" is almost the raison d’être for the VO and it seems most naturally implemented by a query.

> What's still missing is "Other things I might like to know
> about...<ugh>", as in, perhaps "HST spectra taken for items shown on
> this image".   I think that's, indeed, a tough one because you'd have
> to say what <ugh> is, and datalink doesn't really have a way to have
> <ugh> anything but "the thing you've just discovered", and that
> absolutely does not cover the use case I've made up above (where
> <ugh> is "objects in this image" rather than "this image").
> 
> Given that, for now I'd like to agree with you on:
> 
>> Because linking is a very powerful tool, there is a tendency to
>> want to put in lots of links,  but can you be sure that you are
>> presenting a “complete” set of links. The IVOA has a whole lot of
>> “discovery” protocols which are in competition with these “hard”
> 
> I hope as people get used to things like the Aladin 10's discovery
> tree it'll be more natural to them to look for the HST spectra (or
> any spectra at all) using a generic client, rather than get them
> pre-packaged from the service operator.

Where CDS is the service operator, I can see that the non-progenitor directions are more appealing, as they collect more information about data relationships than most providers, and their “fixed” datalinks will be more trustworthy and complete than most. However, it just seems to me to be against the “spirit” of the way that the VO operates, and a query against the provenance DM would be more appropriate for getting this information. There are questions, of course, about whether TAP-style queries against this model would be expressive enough to make these queries ‘simple’, but that is really a question of potentially improving ProvDM and the queries against it.

Anyway, I realise that I have been 'semi-detached' from the VO for a while now, and it is pretty irritating when people come late to a design process. We will be putting effort into a proper VO-compliant e-MERLIN archive later this year, and I will have a go at doing things the ProvDM way as a prototype of that approach.

Cheers,	
	Paul.
