<div dir="ltr"><div><span class="gmail_default" style="font-size:small">On the ObsCore dataproduct_type and subtype, I also have the feeling that the (optional) subtype isn't a highly useful construct when I contrast it with the alternative of making dataproduct_type a real extensible vocabulary. The catch, of course, would be to make it feasible for people to query (e.g. in TAP+ADQL) a vocabulary column. Output is not a problem but querying right now would be by exact match only... it would be really cool if you could do something like "where ivo_vocab_match(dataproduct_type, 'cube')" and that would match "cube" and child terms... or you could be more specific (down to dataproduct_type = 'specific type').<br></span></div><div><span class="gmail_default" style="font-size:small"><br></span></div><div><span class="gmail_default" style="font-size:small">I think I could handle this feasibly if the vocab function just dealt with the base terms, e.g. "where ivo_vocab_base(dataproduct_type) = 'cube'" -- that would give the same query power as now but allow extending the vocabulary to more specific types. 
I think I like that better than a type/subtype pair...</span></div><div><span class="gmail_default" style="font-size:small"><br></span></div><div><span class="gmail_default" style="font-size:small">Thoughts?<br></span></div><div><span class="gmail_default" style="font-size:small"><br></span></div><div><span class="gmail_default" style="font-size:small"></span></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div>--<br></div><div>Patrick Dowler<br></div>Canadian Astronomy Data Centre<br></div>Victoria, BC, Canada<br></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 8 Nov 2019 at 09:57, François Bonnarel <<a href="mailto:francois.bonnarel@astro.unistra.fr">francois.bonnarel@astro.unistra.fr</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<p>Hi Pat, all,<br>
</p>
<br>
<div>On 06/11/2019 at 18:14, Patrick Dowler
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-size:small">I agree with
Markus' analysis, re-iterating I think the main points:</div>
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small">1.
associated-data: although the term itself is quite redundant
(all links are "associated" in datalink by definition) the
concept of "sibling" data is sound: other data (of the same
target?). To be clear, I think Markus is thinking that
something is one of progenitor, derivation, or sibling. I'd
like to find the best word for this but I like it.</div>
</div>
</blockquote>
The term "associated-data" has been used experimentally in VizieR for
a couple of years, outside DataLink usage. It means a dataproduct
associated with a catalogue or with a row (source or whatever) in a
catalogue.<br>
I think GAVO is also using something like that.<br>
<br>
Besides, is "sibling" appropriate for associating a row in a catalogue
with a dataproduct such as an image or a timeseries (the underlying use
cases)?<br>
<br>
Anyway, we need a widely accepted "top-branch" term for this kind of
use case. Should we open a page for proposals?<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small">To check
interpretation, I like to see if the tuple {link} {semantics}
{ID} can sensibly be spoken as a sentence (with some filler
articles):</div>
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small"><a href="http://example.net/foo" target="_blank">http://example.net/foo</a>
is-a-spectrum-of blah:123</div>
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small">In that
sense, it seems one can use dataproduct_type(s) to describe a
relationship between a resource and an identified thing. <br>
</div>
</div>
</blockquote>
Yes exactly what we had in mind for TimeDomain. All these are
sub-terms of "associated-data/sibling"<br>
But in addition, timeseries require sub-types (lightcurve,
radialvelocitycurve, etc...)<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small">2. At the
same time, the more SAMP-like use case of driving actions
depends on knowing what the resource at the end of the
access_url *is*, not what the relationship is. That sounds
more like a job for content-type or a new column and not for
semantics. It's also potentially orthogonal to semantics
(which I think gives rise to the explosion in number of terms
Markus mentioned). Given that the current range of content
types we work with (application/fits, text/x-votable+xml,
application/x-hdf5, eg) don't say much of anything about the
content to expect, parameterising like we do with
content=datalink is a pretty straightforward solution. I think
this works and conveys more information to clients independent
of other enhancements we might make to the vocabulary or
datalink spec.<br>
</div>
<div class="gmail_default" style="font-size:small">It could
generally be a good thing to do wherever content-type is
conveyed (ObsCore access_format, DataLink content_type, http
Content-Type headers, etc). <br>
</div>
</div>
</blockquote>
Just to understand: semantics will be "associated-data/sibling"
and in that case you look at the dataproduct_type string after the
semicolon in content-type?<br>
But the TimeDomain use cases (see Ada's talk at the last Interop)
require sub-typing (in ObsCore and DataLink).<br>
Can we use content-type further for that?<br>
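<br>
To make the question concrete, here is a minimal Python sketch of how a client could read a parameter after the semicolon in a content-type value. Only "content=datalink" comes from the discussion above; the function name split_media_type is hypothetical, and carrying a dataproduct_type in such a parameter is still just a proposal.<br>

```python
from email.message import Message


def split_media_type(value):
    """Return (media_type, params) for a Content-Type style string."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() lists the media type itself first, then the parameters.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


media_type, params = split_media_type("application/x-votable+xml;content=datalink")
print(media_type, params.get("content"))
```

A client that ignores parameters still sees the plain media type, so existing consumers would not be affected by the extra parameter.<br>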
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-size:small"><br>
</div>
<div class="gmail_default" style="font-size:small">As an aside,
I have been thinking about how to enable semantics to contain
multiple tags. I have a few use cases where it would be nice
to do that -- not sure how great an idea it is though. One
thing it does is it more or less removes the need/desire to
produce very similar-looking trees of terms with different
root terms. I intend to create a VOTable issue to explore how
exactly to convey a "bag of terms" in a single table cell and
a DataLink issue to explore multiple semantics tags. I wanted
to mention it here in case it tweaks someone's imagination and
because it seems peripherally related.<br>
</div>
</div>
</blockquote>
Indeed, this would allow using the
dataproduct_type/dataproduct_subtype branches in semantics in
combination with "sibling/associated-data", "progenitor", etc.<br>
<br>
But you are right, this probably requires a change in VOTable, which
has only a char (with dimension) datatype for strings. <br>
<br>
<br>
<br>
More discussion on all this is needed.<br>
<br>
Cheers<br>
François<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-size:small"><br clear="all">
</div>
<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Mon, 4 Nov 2019 at 05:57,
Markus Demleitner <<a href="mailto:msdemlei@ari.uni-heidelberg.de" target="_blank">msdemlei@ari.uni-heidelberg.de</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi
DAL,<br>
<br>
On Tue, Oct 22, 2019 at 06:23:32PM +0200, François Bonnarel
wrote:<br>
> On 22/10/2019 at 10:53, Markus Demleitner wrote:<br>
> > On Mon, Oct 21, 2019 at 05:38:32PM +0200, Petr Skoda
wrote:<br>
> > As far as I can see, there are two use cases in
general for datalink<br>
> > semantics:<br>
> > <br>
> > (a) link filtering: The client, based on the
semantics, selects a<br>
[...]<br>
> > <br>
> > (b) figure out what to do with a link: When Aladin
implemented<br>
> > datalink, they found that based on what's in a
datalink row, they<br>
> > didn't know how to deal with a link: they'd like to
send spectra to<br>
> > clients listening to spectrum.load.ssa-generic,
images to those<br>
> > listening to image.load.fits and so forth. The
datalink content_type<br>
> > column isn't quite sufficient for this, because<br>
> > application/x-votable+xml can be a spectrum or an
object catalog,<br>
> > whereas image/fits might be some kind of cube or a
plain image (or an<br>
> > IRAF spectrum, or still something else). That's the
"SAMP sending<br>
> > use case" that, I think, was largely missed when we
wrote datalink.<br>
><br>
> Well, that's strange because from the beginning some of
us (authors) had<br>
> something like that in mind. Well not exactly "samp" but
more generally.<br>
> What will the client do with this link. Try to manage it
herself and do<br>
<br>
Be that as it may, the actual spec has failed to cover that
use case<br>
-- which is why we are here.<br>
<br>
> > Having established this much, after a mail from Ada
I had another of<br>
> > my dangerous epiphanies. That is, if we really want
to deal with use<br>
> > case (b) in semantics, we'll end up reproducing the
distinction that<br>
> > VEP-0001 proposes in every branch: not only will
we have<br>
> > <br>
> > #associated-cube #associated-image
#associated-radialvelocitycurve ...<br>
> > <br>
> > but also<br>
> > <br>
> > #derivation-cube #derivation-image
#derivation-radialvelocitycurve ...<br>
> > <br>
> > and (we've already seen use cases for that)<br>
> > <br>
> > #progenitor-cube #progenitor-image
#progenitor-radialvelocitycurve ...<br>
><br>
> OK. This means that we are facing the three branches where
the links target<br>
> datasets or dataset excerpts.<br>
<br>
I doubt it would be limited to these three; look at error-map,
for<br>
instance -- it stands to reason that error maps would, in
general,<br>
follow their "main" dataset's type, and hence you'd have<br>
<br>
#error-cube #error-image #error-radialvelocitycurve...<br>
<br>
I could make that point for noise and weight, again, and I
suspect<br>
for quite a few of the terms we may see in the future.<br>
<br>
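To illustrate how quickly such combined terms multiply, here is a small Python sketch. The relation and type names are the ones quoted in this thread; the function name combined_terms is hypothetical, and the lists are only a sample, not a proposed vocabulary.<br>

```python
from itertools import product

# Sample relations and product types taken from the examples in this thread.
relations = ["associated", "derivation", "progenitor", "error"]
product_types = ["cube", "image", "radialvelocitycurve"]


def combined_terms(relations, product_types):
    """Build one '#relation-type' term for every pairing."""
    return [f"#{r}-{t}" for r, t in product(relations, product_types)]


terms = combined_terms(relations, product_types)
print(len(terms), terms[:3])
```

Every relation or product type added later multiplies the list again, whereas two separate columns would only grow by one entry each.<br>
<br>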
> > (3) Adding a dataproduct_type column in datalink.
If we started from<br>
> > scratch, this is probably what I'd do. As things
are now... don't<br>
> > know. As for (2), this can start immediately
(because datalink lets<br>
> > you add extra columns), and it would even have
the advantage that<br>
> > clients that don't parse media types would still
understand<br>
> > content_type.<br>
> Well, some other people (Alberto for example) have asked
for this. I'm<br>
> reluctant because for most of the links this column will
be unused (most of<br>
> the link use cases are not "dataproducts" at all). In
general I think we<br>
<br>
That a column is empty for many links is not unusual in
datalink (see<br>
service_def and error_message in 1.0). But also I suspect in
most<br>
datalink documents, the majority of links are actually
"sendable" in<br>
this sense: The progenitors and derivations of images and
spectra, in<br>
all likelihood, will be images and spectra again, as will
#error,<br>
#flat, #noise, #weight, and, of course, #this.<br>
<br>
> should try to avoid adding columns in DataLink response
and should try to<br>
> keep it simple. And especially when these columns come from
another spec<br>
<br>
About the simplicity, as someone wanting to put this stuff
into pyVO,<br>
my personal choice between<br>
<br>
Is semantics one of [#progenitor-image, #associated-image,<br>
#derivation-image, #noise-image, #bias-image, #dark-image,
...]?<br>
<br>
and<br>
<br>
check the dataproduct_type column and, if there's a value,
use that<br>
to determine the default SAMP destinations<br>
<br>
is fairly clear (in particular because I'll need the second
logic<br>
for Obscore anyway).<br>
<br>
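A minimal Python sketch of that second option follows, assuming a datalink row is available as a dict and that it carries a dataproduct_type column (which datalink 1.0 does not define). The two MTypes are the ones mentioned earlier in this thread; the mapping table and the function name default_samp_mtype are hypothetical.<br>

```python
# Hypothetical mapping from a dataproduct_type value to a SAMP MType.
SAMP_MTYPE_BY_PRODUCT_TYPE = {
    "image": "image.load.fits",
    "spectrum": "spectrum.load.ssa-generic",
}


def default_samp_mtype(row):
    """Return the default SAMP MType for a datalink row, or None."""
    product_type = row.get("dataproduct_type")
    if not product_type:
        # Column missing or empty: the link is not treated as sendable.
        return None
    return SAMP_MTYPE_BY_PRODUCT_TYPE.get(product_type)


print(default_samp_mtype({"semantics": "#progenitor", "dataproduct_type": "image"}))
```

The lookup does not depend on the semantics value at all, which is the point of keeping the two concerns in separate columns.<br>
<br>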
The one big downside that I can see with the dataproduct_type
column<br>
is that datalink 1.0 services won't have it for a long time
(though<br>
of course you can always just add the column to a 1.0 service,
too).<br>
<br>
But then even with a semantics-based solution for the
SAMP-sending<br>
case, the clients would depend on operators adopting the new
terms,<br>
which I wouldn't expect to be instantaneous.<br>
<br>
Again, I'd like to hear from Datalink producers and consumers
what<br>
they think. As for that, I'd still not count out the solution
via<br>
media type content parameters; this would be mighty useful
far<br>
beyond Datalink...<br>
<br>
-- Markus<br>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote></div>
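<div>To make the vocabulary-matching idea at the top of this message concrete, here is a minimal Python sketch of what a base-term function and a match function could compute. The term tree is a hypothetical fragment built from terms named in this thread, and vocab_base / vocab_match merely stand in for the proposed ivo_vocab_base / ivo_vocab_match, which do not exist in ADQL today.</div>

```python
# Hypothetical vocabulary fragment: each term maps to its parent
# (None marks a base term).
PARENT = {
    "cube": None,
    "timeseries": None,
    "lightcurve": "timeseries",
    "radialvelocitycurve": "timeseries",
}


def vocab_base(term):
    """Walk up the parent chain and return the base term."""
    while PARENT[term] is not None:
        term = PARENT[term]
    return term


def vocab_match(term, wanted):
    """True if term is wanted itself or one of its descendants."""
    while term is not None:
        if term == wanted:
            return True
        term = PARENT[term]
    return False


print(vocab_base("lightcurve"), vocab_match("lightcurve", "timeseries"))
```

<div>With only the base-term function, a query on 'timeseries' keeps today's query power while the column itself can hold the more specific term.</div>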