Datalink description scope [Was: [VEP-003]: datalink/core#sibling]

Mon Dec 9 10:09:44 CET 2019

Hi DAL,

On Fri, Dec 06, 2019 at 02:45:30PM +0100, ada nebot wrote:
> As I see it, the things we are discussing concerning Datalink fall into 4 independent levels or categories: 
> Level 0 - Data-format (fits, VOTable, PDF, png, …)
> Level 1 - Data-type (tabular, image, spectrum, cube, text, …)
> Level 2 - Data-information (Documentation, Calibration, Log, Preview, …)
> Level 3 - Data-relation (Derived from, Progenitor of, Sibling of, ...)

I think this is a useful way of categorising metadata, and it would
be a useful exercise to follow this thought, in particular as regards
use cases -- where in data discovery and analysis is which piece of
information interpreted by whom (machine or person)?

Personally, I'm not sure I'd present this as a hierarchy -- as you
say, the aspects are largely orthogonal.  But that's a detail.  If
anyone were to write a Note on this and research prior work (I'm
sure there has to be plenty), I think I'd like to take part.

Meanwhile, and towards datalink in particular:

> So, in my opinion, if I had to redo Datalink I would keep these
> different levels separated instead of putting everything into the
> semantics field. 

Well, as Mark says it's not all in the semantics field; your
Data-format essentially is the media type, and it could certainly be
extended to cover your Data-type as well (we've already done this for
datalink documents themselves with the content=datalink media type
parameter).
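
To illustrate what "already in the media type" means for a client,
here is a minimal sketch of reading the content parameter off a
link's media type.  It assumes the datalink media type
application/x-votable+xml;content=datalink; the function names
(split_media_type, is_datalink) are made up for this example and
are not part of any library.

```python
from email.message import Message


def split_media_type(content_type):
    """Split a media type string into (type, parameters).

    Uses the standard library's header parser so that quoting and
    whitespace around parameters are handled for us.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_params() returns [(main_type, ""), (key, value), ...];
    # everything after the first pair is a parameter.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


def is_datalink(content_type):
    """Return True if the media type announces a datalink document."""
    main_type, params = split_media_type(content_type)
    return (main_type == "application/x-votable+xml"
            and params.get("content") == "datalink")


if __name__ == "__main__":
    for ct in ("application/x-votable+xml;content=datalink",
               "application/x-votable+xml",
               "application/fits"):
        print(ct, "->", is_datalink(ct))
```

A hypothetical Data-type parameter could be read in the same way,
but no such parameter is defined at this point.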

But since we're changing datalink right now anyway, this is our
chance to add whatever is necessary.  And since your Data-type is
already represented in obscore and applications will want to handle
it there, I suspect doing it in datalink like we do in obscore would
be the most convenient solution from a consumer point of view.
Although:

> But applications might have a different point of view here —>
> Shouldn't we add Apps to this discussion? 

I'm not sure if Apps has a much larger readership than DAL, but yes,
for *datalink* evolution it makes a lot of sense to have Apps input.

I'd just be grateful if that particular discussion could be in a
different thread, and this one could concentrate on the merits or
demerits of datalink/core#sibling.  But while I'm talking:

> What we need is to be able to say is: 
> - This list of links are timeseries of tabular type 
> - This list of links are timeseries of spectrum type

As Mark says: Is the nature of the dependent vocabulary actually
something a machine needs to understand?  In other words: What's the
use case for having this in some formal way rather than just having
in the description something like "Dynamic spectrum of this source in
2019"?  [But that's a question of that other thread].

> But if were to add terms such as sibling and so on, there is
> already an IVOA relationship vocabulary: 
> http://ivoa.net/rdf/voresource/relationship_type/2016-08-17/relationship_type.html

That's a very good point that I'll respond to in a different mail
that, hopefully, will be the #sibling discussion thread.

Thanks,

         Markus