<div dir="ltr"><div>O<span class="gmail_default" style="font-size:small">n the ObsCore dataproduct_type and subtype, I also have the feeling that the (optional) subtype isn&#39;t a highly useful construct when I contrast it with the alternative of making dataproduct_type a real extensible vocabulary. The catch, of course, would be to make it feasible for people to query (e.g. in TAP+ADQL) a vocabulary column. Output is not a problem, but querying right now would be by exact match only... it would be really cool if you could do something like &quot;where ivo_vocab_match(dataproduct_type, &#39;cube&#39;)&quot; and that would match &quot;cube&quot; and child terms... or you could be more specific (down to dataproduct_type = &#39;specific type&#39;).<br></span></div><div><span class="gmail_default" style="font-size:small"><br></span></div><div><span class="gmail_default" style="font-size:small">I think I could handle this feasibly if the vocab function just dealt with the base terms, e.g. &quot;where ivo_vocab_base(dataproduct_type) = &#39;cube&#39;&quot; -- that would give the same query power as now but allow extending the vocabulary to more specific types. 
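<br><br>To make that concrete, here is a minimal sketch in Python of what the two functions could compute. ivo_vocab_base and ivo_vocab_match are only the names floated above, not existing ADQL functions, and the WIDER table (including the made-up &quot;spectral-cube&quot; term) is purely an assumption for illustration:<br>
<pre>
```python
# Hypothetical fragment of a product-type vocabulary: term -> wider (parent) term.
# The nesting and the "spectral-cube" term are assumptions made for this sketch.
WIDER = {
    "cube": None,
    "image": None,
    "timeseries": None,
    "spectral-cube": "cube",
    "lightcurve": "timeseries",
    "radialvelocitycurve": "timeseries",
}


def vocab_base(term):
    """Follow the wider links up to the root term (the ivo_vocab_base idea)."""
    while WIDER[term] is not None:
        term = WIDER[term]
    return term


def vocab_match(term, wanted):
    """True if term is wanted itself or one of its narrower terms (the ivo_vocab_match idea)."""
    while term is not None:
        if term == wanted:
            return True
        term = WIDER[term]
    return False


print(vocab_base("lightcurve"))
print(vocab_match("spectral-cube", "cube"))
```
</pre>
A service could, for instance, store the result of vocab_base next to dataproduct_type so that base-term queries stay a plain equality test.<br><br>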
I think I like that better than a type/subtype pair...</span></div><div><span class="gmail_default" style="font-size:small"><br></span></div><div><span class="gmail_default" style="font-size:small">Thoughts?<br></span></div><div><span class="gmail_default" style="font-size:small"><br></span></div><div><span class="gmail_default" style="font-size:small"></span></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div>--<br></div><div>Patrick Dowler<br></div>Canadian Astronomy Data Centre<br></div>Victoria, BC, Canada<br></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 8 Nov 2019 at 09:57, François Bonnarel &lt;<a href="mailto:francois.bonnarel@astro.unistra.fr">francois.bonnarel@astro.unistra.fr</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF">

    <p>Hi Pat, all,<br>

    </p>

    <br>

    <div>On 06/11/2019 at 18:14, Patrick Dowler

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_default" style="font-size:small">I agree with

          Markus&#39; analysis, re-iterating I think the main points:</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">1.

          associated-data: although the term itself is quite redundant

          (all links are &quot;associated&quot; in datalink by definition) the

          concept of &quot;sibling&quot; data is sound: other data (of the same

          target?). To be clear, I think Markus is thinking that

          something is one of progenitor, derivation, or sibling. I&#39;d

          like to find the best word for this but I like it.</div>

      </div>

    </blockquote>

    The term &quot;associated-data&quot; has been in experimental use in VizieR for a

    couple of years, outside DataLink. It means some data product

    associated with a catalog or a row (source or whatever) in a

    catalogue.<br>

    I think GAVO is also using something like that.<br>

    <br>

    Besides, is &quot;sibling&quot; appropriate for associating a row in a catalog

    with a data product such as an image or a timeseries (the underlying

    use cases)?<br>

    <br>

    Anyway, we need a widely accepted &quot;top-branch&quot; term for this kind of

    use case. Should we open a page for proposals?<br>

     <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">To check

          interpretation, I like to see if the tuple {link} {semantics}

          {ID} can sensibly be spoken as a sentence (with some filler

          articles):</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small"><a href="http://example.net/foo" target="_blank">http://example.net/foo</a>

          is-a-spectrum-of blah:123</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">In that

          sense, it seems one can use dataproduct_type(s) to describe a

          relationship between a resource and an identified thing. <br>

        </div>

      </div>

    </blockquote>

    Yes, exactly what we had in mind for TimeDomain. All these are

    sub-terms of &quot;associated-data/sibling&quot;.<br>

    But in addition, timeseries require sub-types (lightcurve,

    radialvelocitycurve, etc.)<br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">2. At the

          same time, the more SAMP-like use case of driving actions is

          depending on knowing what the resource at the end of the

          access_url *is*, not what the relationship is. That sounds

          more like a job for content-type or a new column and not for

          semantics. It&#39;s also potentially orthogonal to semantics

          (which I think gives rise to the explosion in number of terms

          Markus mentioned). Given that the current range of content

          types we work with (application/fits, text/x-votable+xml,

          application/x-hdf5, eg) don&#39;t say much of anything about the

          content to expect, parameterising like we do with

          content=datalink is a pretty straightforward solution. I think

          this works and conveys more information to clients independent

          of other enhancements we might make to the vocabulary or

          datalink spec.<br>

        </div>

        <div class="gmail_default" style="font-size:small">It could

          generally be a good thing to do wherever content-type is

          conveyed (ObsCore access_format, DataLink content_type, HTTP

          Content-Type headers, etc). <br>

        </div>
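        <div class="gmail_default" style="font-size:small">As a rough sketch of the client side of this, assuming the parameter is called &quot;content&quot; as in content=datalink, the media type can be split with Python&#39;s standard email.message module. The helper name split_media_type is made up for this example:</div>
<pre>
```python
from email.message import Message


def split_media_type(value):
    """Return (bare media type, value of the content parameter or None)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_param returns None when the parameter is absent
    return msg.get_content_type(), msg.get_param("content")


print(split_media_type("application/x-votable+xml;content=datalink"))
print(split_media_type("application/fits"))
```
</pre>
        <div class="gmail_default" style="font-size:small">A client that ignores parameters still sees the bare media type, so the extra information should degrade gracefully.</div>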

      </div>

    </blockquote>

    Just to understand: semantics will be &quot;associated-data/sibling&quot;,

    and in that case you look at the dataproduct_type string after the

    semicolon in content-type?<br>

    But the TimeDomain use cases (see Ada&#39;s talk at the last Interop)

    require sub-typing (in ObsCore and DataLink).<br>

    Can we further use content-type for that?<br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">As an aside,

          I have been thinking about how to enable semantics to contain

          multiple tags. I have a few use cases where it would be nice

          to do that -- not sure how great an idea it is though. One

          thing it does is it more or less removes the need/desire to

          produce very similar-looking trees of terms with different

          root terms. I intend to create a VOTable issue to explore how

          exactly to convey a &quot;bag of terms&quot; in a single table cell and

          a DataLink issue to explore multiple semantics tags. I wanted

          to mention it here in case it tweaks someone&#39;s imagination and

          because it seems peripherally related.<br>

        </div>

      </div>

    </blockquote>

    Indeed, this could allow using the

    dataproduct_type/dataproduct_subtype branches in semantics in

    combination with &quot;sibling/associated-data&quot;, &quot;progenitor&quot;, etc.<br>

    <br>

    But you are right, this probably requires a change in VOTable, which

    has only a char (with dimension) datatype for strings.<br>

    <br>

    <br>

    <br>

    More discussion on all this is needed.<br>

    <br>

    Cheers<br>

    François<br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_default" style="font-size:small"><br clear="all">

        </div>


        <br>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Mon, 4 Nov 2019 at 05:57,

          Markus Demleitner &lt;<a href="mailto:msdemlei@ari.uni-heidelberg.de" target="_blank">msdemlei@ari.uni-heidelberg.de</a>&gt;

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi

          DAL,<br>

          <br>

          On Tue, Oct 22, 2019 at 06:23:32PM +0200, François Bonnarel

          wrote:<br>

          &gt; On 22/10/2019 at 10:53, Markus Demleitner wrote:<br>

          &gt; &gt; On Mon, Oct 21, 2019 at 05:38:32PM +0200, Petr Skoda

          wrote:<br>

          &gt; &gt; As far as I can see, there are two use cases in

          general for datalink<br>

          &gt; &gt; semantics:<br>

          &gt; &gt; <br>

          &gt; &gt; (a) link filtering: The client, based on the

          semantics, selects a<br>

          [...]<br>

          &gt; &gt; <br>

          &gt; &gt; (b) figure out what to do with a link: When Aladin

          implemented<br>

          &gt; &gt; datalink, they found that based on what&#39;s in a

          datalink row, they<br>

          &gt; &gt; didn&#39;t know how to deal with a link: they&#39;d like to

          send spectra to<br>

          &gt; &gt; clients listening to spectrum.load.ssa-generic,

          images to those<br>

          &gt; &gt; listening to image.load.fits and so forth.  The

          datalink content_type<br>

          &gt; &gt; column isn&#39;t quite sufficient for this, because<br>

          &gt; &gt; application/x-votable+xml can be a spectrum or an

          object catalog,<br>

          &gt; &gt; whereas image/fits might be some kind of cube or a

          plain image (or an<br>

          &gt; &gt; IRAF spectrum, or still something else).  That&#39;s the

          &quot;SAMP sending<br>

          &gt; &gt; use case&quot; that, I think, was largely missed when we

          wrote datalink.<br>

          &gt;<br>

          &gt; Well, that&#39;s strange because from the beginning some of

          us (authors) had<br>

          &gt; something like that in mind. Well not exactly &quot;samp&quot; but

          more generally.<br>

          &gt; What will the client do with this link. Try to manage it

          herself and do<br>

          <br>

          Be that as it may, the actual spec has failed to cover that

          use case<br>

          -- which is why we are here.<br>

          <br>

          &gt; &gt; Having established this much, after a mail from Ada

          I had another of<br>

          &gt; &gt; my dangerous epiphanies.  That is, if we really want

          to deal with use<br>

          &gt; &gt; case (b) in semantics, we&#39;ll end up reproducing the

          distinction that<br>

          &gt; &gt; VEP-0001 proposes in every branch: not only will

          we have<br>

          &gt; &gt; <br>

          &gt; &gt; #associated-cube #associated-image

          #associated-radialvelocitycurve ...<br>

          &gt; &gt; <br>

          &gt; &gt; but also<br>

          &gt; &gt; <br>

          &gt; &gt; #derivation-cube #derivation-image

          #derivation-radialvelocitycurve ...<br>

          &gt; &gt; <br>

          &gt; &gt; and (we&#39;ve already seen use cases for that)<br>

          &gt; &gt; <br>

          &gt; &gt; #progenitor-cube #progenitor-image

          #progenitor-radialvelocitycurve ...<br>

          &gt;<br>

          &gt; OK. This means that we are facing the three branches where

          the links target<br>

          &gt; datasets or dataset excerpts.<br>

          <br>

          I doubt it would be limited to these three; look at error-map,

          for<br>

          instance -- it stands to reason that error maps would, in

          general,<br>

          follow their &quot;main&quot; dataset&#39;s type, and hence you&#39;d have<br>

          <br>

          #error-cube #error-image #error-radialvelocitycurve...<br>

          <br>

          I could make that point for noise and weight, again, and I

          suspect<br>

          for quite a few of the terms we may see in the future.<br>

          <br>

          &gt; &gt; (3) Adding a dataproduct_type column in datalink. 

          If we started from<br>

          &gt; &gt; scratch, this is probably what I&#39;d do.  As things

          are now... don&#39;t<br>

          &gt; &gt; know.  As for (2), this can start immediately

          (because datalink lets<br>

          &gt; &gt; you add extra columns), and it would even have

          the advantage that<br>

          &gt; &gt; clients that don&#39;t parse media types would still

          understand<br>

          &gt; &gt; content_type.<br>

          &gt; Well, some other people (Alberto for example) have asked

          for this. I&#39;m<br>

          &gt; reluctant because for most of the links this column will

          be unused (most of<br>

          &gt; the link use cases are not &quot;dataproducts&quot; at all). In

          general I think we<br>

          <br>

          That a column is empty for many links is not unusual in

          datalink (see<br>

          service_def and error_message in 1.0).  But also I suspect in

          most<br>

          datalink documents, the majority of links are actually

          &quot;sendable&quot; in<br>

          this sense: The progenitors and derivations of images and

          spectra, in<br>

          all likelihood, will be images and spectra again, as will

          #error,<br>

          #flat, #noise, #weight, and, of course, #this.<br>

          <br>

          &gt; should try to avoid adding columns in DataLink response

          and should try to<br>

          &gt; keep it simple. And especially when these columns come from

          another spec<br>

          <br>

          About the simplicity, as someone wanting to put this stuff

          into pyVO,<br>

          my personal choice between<br>

          <br>

            Is semantics one of [#progenitor-image, #associated-image,<br>

              #derivation-image, #noise-image, #bias-image, #dark-image,

          ...]?<br>

          <br>

          and<br>

          <br>

            check the dataproduct_type column and, if there&#39;s a value,

          use that<br>

            to determine the default SAMP destinations<br>

          <br>

          is fairly clear (in particular because I&#39;ll need the second

          logic<br>

          for Obscore anyway).<br>
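          <br>
          The second variant could look roughly like the sketch below. It is not pyVO code: the row is assumed to be a plain dict, and the table only lists the two MTypes named earlier in this thread, leaving every other product type unmapped:<br>
<pre>
```python
# Assumed mapping from dataproduct_type to a default SAMP MType.
# Only the two MTypes mentioned in this thread are included.
SAMP_MTYPE = {
    "image": "image.load.fits",
    "spectrum": "spectrum.load.ssa-generic",
}


def default_mtype(row):
    """Pick a default SAMP MType for a datalink row (a dict), or None."""
    product_type = row.get("dataproduct_type")
    if not product_type:
        # column missing or empty: the link is not something to send
        return None
    return SAMP_MTYPE.get(product_type)


print(default_mtype({"semantics": "#progenitor", "dataproduct_type": "image"}))
```
</pre>
          The semantics value plays no role in the decision here, which is the point of keeping the two concerns separate.<br>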

          <br>

          The one big downside that I can see with the dataproduct_type

          column<br>

          is that datalink 1.0 services won&#39;t have it for a long time

          (though<br>

          of course you can always just add the column to a 1.0 service,

          too).<br>

          <br>

          But then even with a semantics-based solution for the

          SAMP-sending<br>

          case, the clients would depend on operators adopting the new

          terms,<br>

          which I wouldn&#39;t expect to be instantaneous.<br>

          <br>

          Again, I&#39;d like to hear from Datalink producers and consumers

          what<br>

          they think.  As for that, I&#39;d still not count out the solution

          via<br>

          media type content parameters; this would be mighty useful

          far<br>

          beyond Datalink...<br>

          <br>

                  -- Markus<br>

        </blockquote>

      </div>

    </blockquote>

    <br>

  </div>

</blockquote></div>