DataLink issues

Douglas Tody dtody at nrao.edu
Thu Sep 6 20:08:44 PDT 2012


On Thu, 6 Sep 2012, François Bonnarel wrote:

> Dear all,
>       The last interop confirmed there was interest in the DataLink concept,
> defined as a "way to find related resources to datasets via a web service".
>      This is needed as a complement either to Data services (ObsTap, Tap,
> S*A service) or to SimDal...
> General agreement was reached on input/output (PublisherDID / VOTable with
> links) and on a general concept for the structure of the link "records".
>
> There are  still a couple of issues, however:
>
>         -- Is "DataLink" a real, full DAL service recorded in the IVOA
> registries? Or is it just a web service referred to in the main services'
> query responses?
> The first case requires the dataset ID to be a real PublisherDID. For the
> latter case an internal DatasetID could be sufficient....

The basic concept of DataLink is that we start with a Dataset (PubDID)
or maybe an Observation (a la TAP obs_id) and go get a list of data links
pointing to associated data files, services, or other resources, e.g. a
batch job for custom reprocessing of the dataset.

It would seem that, given a DAL query response (TAP, SIA, etc.) it
should be possible to directly query for the data links for a given
dataset or possibly obs_id, without having to try to infer the existence
of an associated datalink service via a registry query.  So the DAL
query response should point to whatever web service or service operation
is used to get datalinks.  If for the given service, a PubDID or obs_id
is global and persistent then there is no reason this datalink service
could not also be registered as a service in its own right, separate
from any shortcut links from related DAL query responses, but this need
not be required to get datalinks for an existing DAL service query
response.
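
As a rough illustration of the shortcut described above, here is a minimal
Python sketch. It assumes each row of the DAL query response carries a
pointer to the datalink service plus a dataset identifier. The field names
(datalink_service, pub_did, obs_id), the ID query parameter, and the
example.org URLs are all hypothetical; nothing here is taken from a draft.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def datalink_query_url(row, service_key="datalink_service", id_key="pub_did"):
    """Build the URL that asks for the data links of one response row.

    `row` is a dict standing in for one record of a DAL query response.
    All key names and the "ID" parameter are illustrative assumptions.
    """
    base = row.get(service_key)
    # Prefer the dataset identifier; fall back to an observation id.
    dataset_id = row.get(id_key) or row.get("obs_id")
    if not base or not dataset_id:
        return None  # this row advertises no datalink shortcut

    parts = urlsplit(base)
    # Keep any query parameters already present on the service URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("ID", dataset_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


row = {
    "datalink_service": "http://example.org/links",
    "pub_did": "ivo://example.org/data?obs1",
}
print(datalink_query_url(row))
```

No registry lookup is involved: the client follows the pointer found in the
row itself. A row without such a pointer simply yields None.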

>        -- One of our use cases is "related access to another DAL service".
> Another one is "Internal access to complex datasets (archives, MEFs ....)"....
> The first use case can be some "AccessData" method of the DAL service
> performing some dynamic transformation on the dataset. The actual
> transformation is driven by the parameter values of the method. Probably
> the "AccessData" URL of the service, without any parameter, could answer
> with a VOTABLE describing the available parameters for the dataset...
>            For the internal access to complex datasets, the May 2012 
> DataLink draft proposed a little model for internal structure "mappable" on 
> response VOTABLE FIELDs
> This has been criticized as an "ad hoc" solution for our MEF or tar archives 
> examples...
>            Another solution could be "extended URI" (Norman Gray proposal). 
> But the rightmost part of the URI is not interpretable anyway...
>            A new proposal (Laurent Michel) could be that the link to the
> complex dataset provides a list of parameters allowing the various
> subparts of the dataset to be extracted.
> Something rather similar to the "AccessData" link behavior, eventually

I think DataLink is a quite different mechanism from AccessData.

DataLink is a semantic Web type of capability, providing semantically
complex links (more complex than just URLs) which can be followed given
a starting object.  Although data links have more complex semantics I
think the basic mechanism should be kept fairly simple - given the right
software one might be able to just "click" on a link and something
happens.  Or at least one gets a list of associated data objects or
other resources, with more semantic detail than we get from just a Web
resource.

AccessData is very different.  This provides precise, client-driven,
quantitative access to a given dataset.  So for example the client can
say give me this exact cutout, slice, or other subset of the dataset,
expressed in pixel (in the case of an image) or WCS coordinates.  This
is very different from a datalink, as datalinks are predefined by the
data provider (here is a list of the standard datalinks we define)
whereas AccessData provides the client with direct, sample/pixel-level
access to a given dataset, to allow quantitative analysis without having
to download the full dataset.  The one place where they can come
together is where a datalink URL can invoke a specific AccessData
operation on a dataset, e.g.  to provide some standard view of the
dataset.
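
The contrast can be sketched in Python as follows. This is only an
illustration under assumed names: the DataLink record fields, the
accessData endpoint, and the ID/POS/SIZE parameters are hypothetical, and
the example.org URLs are placeholders.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

BASE = "http://example.org"  # placeholder service root


@dataclass(frozen=True)
class DataLink:
    """One predefined link record (field names are illustrative)."""
    dataset_id: str
    semantics: str  # what the link target is, e.g. "preview"
    url: str


def access_data_url(dataset_id, ra, dec, size_deg):
    """AccessData style: the client chooses the exact subset it wants."""
    params = {"ID": dataset_id, "POS": f"{ra},{dec}", "SIZE": f"{size_deg}"}
    return f"{BASE}/accessData?{urlencode(params)}"


def provider_links(dataset_id):
    """DataLink style: a fixed list decided in advance by the provider."""
    ident = urlencode({"ID": dataset_id})
    return [
        DataLink(dataset_id, "preview", f"{BASE}/preview?{ident}"),
        DataLink(dataset_id, "reprocess-job", f"{BASE}/jobs?{ident}"),
        # Where the two meet: a link whose URL is an AccessData call with
        # parameters fixed by the provider (a "standard view").
        DataLink(dataset_id, "standard-cutout",
                 access_data_url(dataset_id, 10.0, -5.0, 0.1)),
    ]


for link in provider_links("obs1"):
    print(link.semantics, link.url)
```

The client can only pick from what provider_links returns, whereas
access_data_url accepts whatever position and size the client supplies.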

 	- Doug


> Your comments are welcome. We have to modify the May draft to go forward at
> the next interop.
>
> Best regards
> François
>

