<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Dear all,<br>

    </p>

    <ul>

      <li>When I proposed VEP0001 immediately after the Groningen Interop, I
        could not imagine that it would trigger such a controversial
        discussion. <br>

      </li>

      <ul>

        <li>Before considering our use case, I would like to go
          back to the current usages of DataLink that I know of.</li>

        <li>Then go back to the "new" use case</li>

        <li>And then check some of the proposed solutions on this list</li>

        <li>And then argue for my preference</li>

      </ul>

      <li>According to DataLink 1.0 <br>

      </li>

      <ul>

        <li>the semantics field contains a "Term from a controlled

          vocabulary describing the link" as stated in Table 1 and </li>

        <li>section 3.2.6 reads:</li>

        <li>"The semantics column contains a single term from an

          external RDF vocabulary that describes the meaning of this

          linked resource relative to the identified dataset. The

          semantics column is intended to be machine-readable and assist

          automating data retrieval and processing."</li>

        <li>Let's call the initial thing we are starting from and to

          which we want to link resources "Main" and the various linked

          resources "Target".</li>

        <ul>

          <li>Two remarks:</li>

          <ul>

            <li>The text in section 3.2.6, consistently with the use
              cases described in the introduction, considers that the
              "Main" is a dataset.</li>
            <li>The semantics field describes, as a whole, what the target
              is "with respect to the main".</li>

          </ul>

          <li>More classical is the group of columns access_url,
            content_type and content_length, which reference and describe
            the "Target" itself (independently of the "Main").</li>

          <li>I then took a brief look at the current usage of
            DataLink, using Aladin Desktop as a client and the
            following three SIAP2 servers:</li>

          <ul>

            <li>CADC : <br>

            </li>

            <ul>

              <li>In the example I found, the DataLink service had "this"
                in semantics for the full retrieval of the dataset,</li>

              <li> "cutout" for a SODA service <br>

              </li>

              <li>and a couple of "auxiliary" rows for additional data
                such as PSF images, etc.</li>
              <li>"cutout" reflects the fact that the target is a service,
                described by a "service descriptor". Aladin opens a
                specific menu in that case, whereas in the other cases it
                downloads the dataset because its "content_type" is
                application/fits.</li>

            </ul>

            <li>GAVO :  <br>

            </li>

            <ul>

              <li>In the example I found, the DataLink service had "this"
                in semantics, and also "preview", "proc" and "science".</li>

              <li> "this" and "preview" are self-explanatory. <br>

              </li>

              <li>"proc" is actually related to a SODA service (maybe it
                should be "cutout"?) <br>
              </li>
              <li>and "science" is a new term proposed by Markus to
                indicate that the target is related science data.</li>

            </ul>

            <li>CASDA : <br>

            </li>

            <ul>

              <li>In the example I found, "Main" was a cube. It had in
                semantics several "this", a "cutout" and a "proc".</li>
              <li>Each "this" row allowed the retrieval of the full
                dataset from different servers, sometimes in synchronous
                mode and sometimes in asynchronous mode.</li>

              <li> The "cutout" row is related to a SODA service. <br>

              </li>

              <li>The "proc" row links to a SODA-like service extracting

                a single integrated spectrum from the data cube.</li>

            </ul>

          </ul>

          <li>This shows that semantics is there in DataLink not only
            for selecting among rows in the {links} response table, but
            also to help the client figure out what to do with the
            target, in combination with content_type, content_length and
            the service descriptor (if one is defined).</li>
          <li>This also shows that semantics terms work like a flat
            vocabulary despite their tree presentation in the RDF
            document.</li>

          <ul>

            <li>"auxiliary" is a head term for bias, dark and flat, but
              can also be used on its own for cases that have no more
              specific registered term.</li>

            <li>Same for proc and cutout. </li>

            <li>The tree structure of the vocabulary is actually only

              descriptive. It's not functional at the time of writing. </li>

          </ul>

        </ul>

      </ul>
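      <li>As a rough illustration of the point above (semantics being used
        together with content_type and the service descriptor), a client
        policy could be sketched as follows. This is not the logic of
        Aladin or of any other existing client: the LinkRow class, the
        choose_action function, the "#"-prefixed term spelling and the
        returned action labels are all assumptions made for the example.
        <pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkRow:
    """One row of a {links} response, reduced to the columns used here."""
    semantics: str
    content_type: Optional[str] = None
    # Assumed to be set only when the row points at a service descriptor.
    service_def: Optional[str] = None


def choose_action(row: LinkRow) -> str:
    """Hypothetical policy: pick a client action for one links row."""
    # A service descriptor takes precedence: the client needs a parameter menu.
    if row.service_def is not None:
        return "open service menu"
    # Previews are shown rather than loaded as science data.
    if row.semantics == "#preview":
        return "display preview"
    # Plain FITS targets can be downloaded and loaded directly.
    if row.content_type is not None and row.content_type.startswith("application/fits"):
        return "download dataset"
    # Anything else is simply listed for the user.
    return "offer link to user"


rows = [
    LinkRow("#this", content_type="application/fits"),
    LinkRow("#cutout", service_def="soda-sync"),
    LinkRow("#preview", content_type="image/png"),
    LinkRow("#auxiliary", content_type="text/plain"),
]
for row in rows:
    print(row.semantics, "->", choose_action(row))
```
        </pre>
        The order of the tests matters: a "cutout" row carrying a service
        descriptor is handled as a service whatever its content_type says.
      </li>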

      <li>New use cases:</li>

      <ul>

        <li>Shortly after DataLink became an official IVOA Recommendation,
          some data providers became interested in using the DataLink
          functionalities for use cases where the "Main" is a source in
          a catalogue.</li>
        <li>This can work, of course, and proposals are currently being
          discussed to integrate these use cases within the scope of
          DataLink-1.1, but the existing vocabulary had no suitable
          semantics terms for this kind of relationship between the
          "Main" and the "Target".</li>
        <li>Often the "Target" related to the source "Main" is the
          result of an observation of the source, i.e. a dataset
          (image, spectrum, light curve, etc.).</li>

        <ul>

          <li>In VizieR we had a similar situation for what we call
            "associated data" attached to catalogue "rows".</li>
          <li>These "associated data" can indeed be images, TimeSeries,
            cubes, spectra...</li>

        </ul>

        <li>Hence the VEP0001 proposal, as presented on October 15th:<br>

        </li>

        <ul>

          <li>An associated_image is actually "an image of Main", which
            is a source.</li>
          <li>An associated_lightcurve is similarly "a light curve of
            Main", which is a source.</li>

        </ul>

        <li>It should be highlighted that such a term tells the client
          both that the target is an image or a light curve and that it
          is an observation result of the source.</li>
        <li>The proposal to define an item in the associated branch for
          each value of dataproduct_type, and moreover for each subtype
          of TimeSeries, introduced the idea of combining associated_data
          with the ObsCore vocabulary.</li>

        <ul>

          <li>It was pointed out (by Markus) that other head terms such
            as "progenitor" or "derived" could need this too, and that
            this could lead to a combinatorial explosion.</li>

        </ul>

        <li>By the way, the term "associated_data" itself has been
          criticized as a way to describe the concept of an observation
          result of a source.</li>

      </ul>
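      <li>To give an idea of the size of the combinatorial explosion
        mentioned above, here is a small counting sketch. The three head
        terms are just the ones named in this thread, the list of
        dataproduct_type values is the ObsCore one as I recall it (please
        check it against the ObsCore document), and the "head_type"
        naming pattern is only an assumption modelled on
        "associated_image".
        <pre>
```python
from itertools import product

# Head terms mentioned in this thread (illustrative, not an agreed vocabulary).
head_terms = ["associated", "progenitor", "derived"]

# ObsCore dataproduct_type values, quoted from memory (assumption).
dataproduct_types = [
    "image", "cube", "spectrum", "sed",
    "timeseries", "visibility", "event", "measurements",
]

# One combined term per (head term, product type) pair.
combined_terms = [
    f"{head}_{dtype}" for head, dtype in product(head_terms, dataproduct_types)
]

print(len(combined_terms))  # 3 head terms * 8 product types = 24
```
        </pre>
        Every additional head term or TimeSeries subtype multiplies this
        number further, which is the concern raised above.
      </li>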

      <li>The 4 concepts proposal</li>

      <ul>

        <li>Ada proposed to separate the description of the links into 4
          different concepts</li>

        <ul>

          <li>"4 independent levels or categories: </li>

          <li>Level 0 - Data-format (fits, VOTable, PDF, png, …)</li>

          <li>Level 1 - Data-type (tabular, image, spectrum, cube, text,

            …)</li>

          <li>Level 2 - Data-information (Documentation, Calibration,

            Log, Preview, …)</li>

          <li>Level 3 - Data-relation (Derived from, Progenitor of,

            Sibling of, ...)"</li>

        </ul>

        <li>I think this amounts to starting a real data modelling
          effort for DataLink. It would obviously be a major improvement
          in the way we link resources, but it may take some time to
          achieve.</li>
        <li>At the moment I don't see a clear distinction between level
          2 and level 3, because the "information" we have in the
          "Target" is always "relative" to a "Main", so not that far
          from level 3. At the very least, it may sometimes be difficult
          to know which "level" a given category value falls into.</li>

        <li>On the other hand, for links to dynamical services I am not
          sure which category their characterization belongs to. Is
          that a fifth level to add? Data-type in the context of
          DataLink should have a much wider scope than the ObsCore
          "dataproduct_type", because there are targets which are not
          data products: various metadata, auxiliary data, texts, plots,
          etc. If dataproduct_type is standardized, what about the
          other stuff? <br>

        </li>

        <li>To me, the levels proposed by Ada (and maybe a few others)
          look more like a matrix description than a flat one.

          <br>

        </li>

        <li>Taking all of the above into account, I think the levelling
          of the categories could be a really interesting project for
          DataLink 2. If we want a quick solution, I think we have to
          consider more modest ones.</li>

      </ul>

      <li>Among the different proposals:</li>

      <ul>

        <li>I see two possible simple solutions to tackle the use case</li>

        <ul>

          <li>Go back to a simplified version of VEP0001.</li>

          <ul>

            <li>Instead of reproducing the full range of ObsCore
              "dataproduct_type" values, we only define the terms we
              currently need, and we will see in the future if we need
              more.</li>
            <li>At the same time, I drop both the "associated_data" and
              "sibling" head terms and choose to use
              "Observation_Result_of_source" instead.</li>
            <li>ESO and SVO use cases: "image_of_source",
              "Spectrum_of_source"</li>

            <li>TimeDomain/Gaia use cases :  "LightCurve_Of_Source",

              "RadialVelocityCurve_Of_Source", "Movie_Of_Source",

              "SpectroChronogram_Of_Source"</li>

            <ul>

              <li>"TimeSeries_Of_Source" may be used as a head term for

                the four above, or when we don't know exactly what is

                varying in time.</li>

            </ul>

          </ul>

          <li>Adopt the proposal made by Pat Dowler: use the media type
            in content_type to convey the data type or product type, via
            the "content=" parameter.</li>

          <ul>

            <li>application/fits;content=image</li>

            <li>application/fits;content=spectrum</li>

            <li> application/fits;content=lightcurve or

              application/fits;content=timeseries;subtype=lightcurve</li>

            <li>application/fits;content=movie or
              application/fits;content=timeseries;subtype=movie</li>

            <li>etc ...</li>

          </ul>

          <ul>

            <li>The standard structure of media types allows us to reuse
              the current "dataproduct_type" vocabulary as the value of
              the content parameter and then to use an additional
              "subtype" parameter, or alternatively to use the
              timeseries subtype directly in "content=".</li>

            <li>a variant would be to create a new dataproduct_type

              parameter in the media type when appropriate<br>

            </li>

            <li>If we adopt that, semantics would simply be
              "Observation_Result_of_source" for all of these
              possibilities.<br>

            </li>

          </ul>

          <li>In the first solution, we directly introduce some kind of
            datatype in the "meaning of target relative to the main"
            semantics field, which I think is fine, except that it
            doesn't explicitly reuse the ObsCore dataproduct_type.</li>
          <li>In the second solution, clients will have to parse the
            media type to discover not only the format of the target but
            also its content. We still have to decide how to handle
            subtypes.

            <br>

          </li>

          <ul>

            <li>This will probably have to be explained explicitly in
              the upcoming DataLink-1.1 version.</li>

          </ul>

        </ul>
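        <li>To make the client-side cost of the second solution concrete,
          here is a minimal sketch of the parsing involved. The function
          name parse_media_type is made up for the example, the
          "subtype" parameter is only one of the options discussed above,
          and the parser deliberately ignores corner cases such as quoted
          parameter values containing ";".
          <pre>
```python
def parse_media_type(value):
    """Split 'type/subtype;key=val;...' into (base_type, params)."""
    parts = [part.strip() for part in value.split(";")]
    base_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        # Skip empty or malformed fragments that carry no '='.
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return base_type, params


examples = (
    "application/fits",
    "application/fits;content=image",
    "application/fits;content=timeseries;subtype=lightcurve",
    "application/x-votable+xml;content=datalink",
)
for example in examples:
    base_type, params = parse_media_type(example)
    # 'content' and 'subtype' come back as None when the parameter is absent.
    print(base_type, params.get("content"), params.get("subtype"))
```
          </pre>
          A client that does not know the "content" parameter can still
          fall back on the bare media type, so existing behaviour is not
          broken.
        </li>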

        <li>What do implementers / service providers prefer?</li>

      </ul>

    </ul>

    <p><br>

    </p>

    <p>I wish you all happy holidays for the coming days</p>

    <p>Cheers</p>

    <p>François<br>

    </p>

    <p><br>

    </p>


    <div class="moz-cite-prefix">On 10/12/2019 at 02:10, Patrick Dowler
      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAFK8nrosTHdKS3-WMuXLOFt-sKw9eafGyhFLnwd+5FeySuXEvg@mail.gmail.com">

      <div dir="ltr">

        <div class="gmail_default" style="font-size:small">One of the

          ideas we discussed at  the interop was for the contentType to

          cover both level 0 and 1 Of Ada's very useful list above. We

          already do this with Datalink itself via a param in the type:</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">application/x-votable+xml;content=datalink</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">The idea was

          to prototype using the ObsCore dataproduct_type values with

          the content param, which works quite nicely for several of the

          types and tells you alot more than the bare mime type, eg:</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">application/fits</div>

        <div class="gmail_default" style="font-size:small">application/fits;content=image<br>

        </div>

        <div class="gmail_default" style="font-size:small">application/fits;content=spectrum</div>

        <div class="gmail_default" style="font-size:small">image/png;content=spectrum

          along with semantics=#preview??<br>

        </div>

        <div class="gmail_default" style="font-size:small">application/x-votable+xml;content=spectrum</div>

        <div class="gmail_default" style="font-size:small">application/x-votable+xml;content=datalink</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">We could

          eventually sanction and give guidance for this sort of usage

          and I think it is something simple that could be used by the

          larger community. The thing is that services can do this now:

          in a DataLink links resource, in HTTP Content-Type headers, in

          VOSpace node metadata, etc... all allowed now and all adding

          more useful information for clients to act on... the

          usefulness of this idea beyond the links response is appealing

          and makes me not want a DataLink-specific solution (new

          field).<br>

        </div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div class="gmail_default" style="font-size:small">thoughts?</div>

        <div class="gmail_default" style="font-size:small"><br>

        </div>

        <div>

          <div dir="ltr" class="gmail_signature"

            data-smartmail="gmail_signature">

            <div dir="ltr">

              <div>

                <div dir="ltr">

                  <div>

                    <div>--<br>

                    </div>

                    <div>Patrick Dowler<br>

                    </div>

                    Canadian Astronomy Data Centre<br>

                  </div>

                  Victoria, BC, Canada<br>

                </div>

              </div>

            </div>

          </div>

        </div>

        <br>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Mon, 9 Dec 2019 at 01:31,

          Markus Demleitner &lt;<a

            href="mailto:msdemlei@ari.uni-heidelberg.de"

            moz-do-not-send="true">msdemlei@ari.uni-heidelberg.de</a>&gt;

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi

          DAL, again,<br>

          <br>

          On Fri, Dec 06, 2019 at 02:45:30PM +0100, ada nebot wrote:<br>

          &gt; But if were to add terms such as sibling and so on, there

          is already an IVOA relationship vocabulary: <br>

          &gt; <a

href="http://ivoa.net/rdf/voresource/relationship_type/2016-08-17/relationship_type.html"

            rel="noreferrer" target="_blank" moz-do-not-send="true">http://ivoa.net/rdf/voresource/relationship_type/2016-08-17/relationship_type.html</a>

          &lt;<a

href="http://ivoa.net/rdf/voresource/relationship_type/2016-08-17/relationship_type.html"

            rel="noreferrer" target="_blank" moz-do-not-send="true">http://ivoa.net/rdf/voresource/relationship_type/2016-08-17/relationship_type.html</a>&gt;<br>

          &gt; <br>

          &gt; Comments? <br>

          <br>

          This is an excellent point.  relationship_type currently

          reflects the<br>

          parts of DataCite's relationships relevant to VOResource.  But

          these<br>

          relationships by DataCite's goals are also (indeed, mainly)

          intended<br>

          for what we in the VO would call datasets.  And that happens

          to be<br>

          rather close to what Datalink is talking about.<br>

          <br>

          I also agree it looks a bit odd that we have #IsDerivedFrom

          and <br>

          #IsSourceOf in relationship_type and #progenitor and

          #derivation in<br>

          datalink/core -- it's one of these cases where things that

          appear to<br>

          be largely unrelated (Registry and Datalink) suddenly turn out

          to<br>

          have rather close relations after all.<br>

          <br>

          On the other hand, of course, I'm always for pragmatism when

          in<br>

          doubt.  The main use case for Datalink semantics has been (or

          so I<br>

          think) to let clients filter out or group links depending on

          what<br>

          they perceive the current user interest (which I think so far

          none<br>

          do): Hide #progenitor in science analysis, hide #derivation

          during<br>

          debugging.<br>

          <br>

          For that, #progenitor and #derivation would, I think, work

          rather<br>

          well (though I'm suddenly not sure any more why #calibration

          isn't a<br>

          child of #progenitor -- if someone puts in a VEP for that, I

          think<br>

          you'd have my vote).  And anyway, before we embark on a

          re-design of<br>

          that part of datalink with a view to unifying it with

          VOResource, I'd<br>

          frankly like to have opinions from datalink consumers if

          they'd like<br>

          a move towards relationship_type.<br>

          <br>

          So, I guess what I'm saying is: as long as we have #derivation

          and<br>

          #progenitor in datalink/core, there should be #sibling.  This

          at<br>

          least appears attractive to me from a producer side (which,

          yes,<br>

          isn't more than half the picture).<br>

          <br>

          The alternative would be to deprecate #derivation and

          #progenitor and<br>

          devise some way to pull relationship_type into Datalink.  I

          give you<br>

          that'd certainly be cleaner.  But, as said above, my<br>

          pragmatism-o-meter currently has an underflow when considering

          that.<br>

          But, again, it's been known to be off before.<br>

          <br>

                  -- Markus<br>

          <br>

          <br>

          PS: Incidentally, vocabularies should be cited with the

          namespaces<br>

          given on their HTML renditions, in this case<br>

          <a href="http://www.ivoa.net/rdf/voresource/relationship_type"

            rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.ivoa.net/rdf/voresource/relationship_type</a>

          and<br>

          <a href="http://www.ivoa.net/rdf/datalink/core"

            rel="noreferrer" target="_blank" moz-do-not-send="true">http://www.ivoa.net/rdf/datalink/core</a>. 

          I wonder if there's a good<br>

          way to encourage people to not just yank the URL out of their<br>

          browser...<br>

        </blockquote>

      </div>

    </blockquote>

    <br>

  </body>

</html>