[obs-tap]:updates on the Proposed recommendation
Francois Bonnarel
francois.bonnarel at astro.unistra.fr
Fri Jul 29 17:11:13 PDT 2011
Hi Arnold, all DM people,

Let me come back to this, because this discussion apparently keeps going
on underground.

First, let us go back to the very beginning of the ObsTAP effort. There was
a strong commitment from the committee to build something fast, reusing the
TAP protocol and the Observation/Characterisation data model, for data
discovery covering most of the needs.

From the very beginning it was also obvious that data links and virtual
data access could not and would not be covered by ObsTAP.

The DataLink method or service concept has been around in various DAL notes
for years now. As far as I am concerned, I gave presentations on it at the
last three Interop meetings (Victoria, Nara and Napoli; see e.g. the latter:
http://www.ivoa.net/internal/IVOA/DAL-InteropMay2011/DataLink.pdf ).

This concept exists because you cannot imagine providing both data
discovery and complex linkage features (or linkage for complex data
structures) in one step and in a SINGLE table (a single table being required
by the TAP-ADQL protocol, as all may remember).

So ObsTAP is there for data discovery. The only thing you can imagine to
provide access to the various data sets in an observation is to duplicate
the observation rows until you reach full discovery of all
observation-related products, as was already explained. This is verbose, but
it works. So how can DataLink work in the future? See below on your use
case.
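
To make the row duplication concrete, here is a minimal Python sketch. It is
not taken from the ObsTAP document: the helper name expand_to_rows, the
sample values and the choice of columns (including obs_id) are assumptions
made only for illustration. It repeats the shared observation metadata once
per product so that everything fits in a single table.

```python
def expand_to_rows(observation, products):
    """Return one flat row per product, repeating the observation metadata."""
    rows = []
    for product in products:
        row = dict(observation)   # shared metadata, duplicated in every row
        row.update(product)       # product-specific columns
        rows.append(row)
    return rows


# Hypothetical sample data, loosely following the Chandra example in this thread.
observation = {"obs_id": "12345", "facility_name": "Chandra"}
products = [
    {"dataproduct_type": "event", "access_format": "application/x-tar"},
    {"dataproduct_type": "event", "access_format": "application/fits"},
    {"dataproduct_type": "image", "access_format": "image/fits"},
]

rows = expand_to_rows(observation, products)
for row in rows:
    print(row)
```

Every row carries the same obs_id and facility_name, which is exactly the
verbosity mentioned above.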

DataLink is now on the roadmap of the DAL working group, and an IVOA note is
in preparation as a very first drafting effort for this new "protocol". The
note will be available within three weeks or so.

Arnold Rots wrote:
> This is becoming unwieldy.
> Trying to make X-ray data (and I suspect the same is true for aperture
> synthesis data) fit into something that is designed with optical
> images in mind is reminiscent of round pegs and square holes.
>
> Service providers are free to define subtypes and titles, but you are
> saying that if they don't follow rules that are not spelled out,
> things won't work as envisaged.
> Also, if I understand the argument correctly, if data discovery
> software is to be helpful at all, it needs to be able to extract some
> information from the title field - but that is intended for human
> consumption.
>
> If I see this, it looks like I need to generate at least eight records
> for a single observation, some containing a mix of levels, and all
> duplicating pretty much the same metadata.
>
> This is not going to make it attractive to provide ObsTAP services.
>
>
> Maybe I should do what you did and provide an example of how I thought
> it should have worked.
>
> Here is how I would envisage data discovery of Chandra data to work:
> A single record per Obsid that provides the observational metadata and:
>    ObsId:                 12345
>    Dataset Identifier:    ivo://ADS/Sa.CXO#obs/12345
>    Data Types available:  Package, Event list, Image
>    Calibration level:     2
>    Title:                 Chandra/ACIS ObsId 12345
>
>
DataLink is a method or a service that returns a table describing links
between observations, identified by their obsid, and any kind of data
retrieval. An obsid known from an ObsTAP discovery phase can of course be
used directly to interrogate such a service. (By the way, if the ObsTAP
service is a TAP-PQL service, the DataLink table could be attached to the
main ObsTAP table in the same query response, because the single-table
requirement no longer applies in that case.)

But it is a qualified link, which means that the semantics or type of the
link is given in one field of the table, while the nature of the access is
given in another field: this can tell us whether it is a simple retrieval,
an SIA query service, an SSA AccessData method, etc.

So in your use case we will get three different links for the same
observation (obsid). The types (or semantics) will be Package, event list
and image, and the access nature could be, respectively, retrieval,
retrieval and SIA query (for example).
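
As a rough illustration of such a qualified-link table, the Python sketch
below models the three links of this use case. The field names (obsid,
semantics, access_nature), the function links_for and the in-memory table
are all hypothetical; the DataLink note is still in preparation, so nothing
here should be read as its actual schema.

```python
from typing import NamedTuple


class Link(NamedTuple):
    obsid: str          # observation identifier found in the discovery phase
    semantics: str      # type of the link (what the linked data is)
    access_nature: str  # how the linked data is accessed


# Hypothetical link table for the Chandra use case discussed in this thread.
DATALINK_TABLE = [
    Link("12345", "Package", "retrieval"),
    Link("12345", "event list", "retrieval"),
    Link("12345", "image", "SIA query"),
]


def links_for(obsid, table=DATALINK_TABLE):
    """Return every link recorded for the given obsid."""
    return [link for link in table if link.obsid == obsid]


for link in links_for("12345"):
    print(link.obsid, link.semantics, link.access_nature)
```

The point of keeping the semantics and the access nature in separate fields
is that a client can filter on either one independently.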

In addition, the "Access" package (the group of access fields in the table)
is proposed to be extended beyond the traditional "reference" and "format"
to describe which part of a complex "file" is to be retrieved (a path in a
directory or tar file, an extension in a MEF file, a table name in a
VOTable, etc.). A proposal for such an extended access package is described
in the Characterisation 2 draft at the moment.
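
One possible shape for such an extended access description is sketched
below in Python. The class name Access, the fields part_kind and part, and
the example.org URLs are assumptions introduced for illustration; the actual
proposal is the one in the Characterisation 2 draft, which this sketch does
not reproduce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Access:
    reference: str                   # traditional field: URL of the file
    format: str                      # traditional field: MIME type of the file
    part_kind: Optional[str] = None  # hypothetical: kind of sub-part locator
    part: Optional[str] = None       # hypothetical: the locator itself

    def describe(self):
        """Return a one-line, human-readable description of the target."""
        if self.part_kind is None:
            return f"whole file at {self.reference}"
        return f"{self.part_kind} '{self.part}' in {self.reference}"


# Placeholder examples covering the three cases named in the text.
examples = [
    Access("http://example.org/pkg.tar", "application/x-tar",
           "path", "primary/evt2.fits"),
    Access("http://example.org/obs.fits", "image/fits", "extension", "2"),
    Access("http://example.org/cat.xml", "application/x-votable+xml",
           "table", "sources"),
]

for access in examples:
    print(access.describe())
```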
Best regards
François
> Then a data access protocol that allows querying the archive using any
> of the above in a where clause, with either ObsId or DID required, and
> returning:
> ObsId  DataType  Contents  Level  Format    URL
> -----------------------------------------------------------
> 12345  Pkg_1     evt,img   2      tar       http://...
> 12345  Pkg_2     evt,img   1      tar       http://...
> 12345  Pkg_12    evt,img   2,1    tar       http://...
> 12345  evt       evt       2      fits-bin  http://...
> 12345  evt       evt       1      fits-bin  http://...
> 12345  img       img       2      fits      http://...
> 12345  img       img       2      jpg       http://...
> 12345  img       img       2      fits      http://...
> 12345  img       img       2      jpg       http://...
> This is an example where the client specified ObsId or DID, but no
> data type or format.
>
> Never mind the terms and abbreviations I used - you get the picture.
>
> Cheers,
>
> - Arnold
>
>
> Douglas Tody wrote:
>
>> More precisely what you might have is something like (display in a wide view):
>>
>> ObsId  Type   Subtype               Level  Format                  Title
>> ----------------------------------------------------------------------------------------------------------
>> 123    event  chandra.hrc.pkg       1      application/x-tar-gzip  Chandra ACS-XYZ observation package (event,refimage)
>> 123    image  chandra.hrc.refimage  2      image/fits              Chandra ACS-XYZ reference image
>> 123    image  chandra.hrc.preview   2      image/jpeg              Chandra ACS-XYZ preview image
>> 345    event  rosat.foo.pkg         1      application/x-tar-gzip  ROSAT whatever observation package (xxx)
>>
>> and so forth. The subtype could in principle be more generic but will
>> likely be instrument-specific for a level 1 observation.
>>
>> The Title should concisely describe the data product, e.g., origin,
>> instrument, ID, what it is (observation package, calibration, standard
>> view, etc.). The title string is what one normally wants to output on a
>> displayed image or plot to identify to a human the data being shown.
>> You can put whatever you want in there to describe the data product so
>> long as it is concise (one line of text).
>>
>> - Doug
>>
>>
>>
>>
>> On Mon, 11 Jul 2011, Douglas Tody wrote:
>>
>>
>>> On Thu, 7 Jul 2011, Arnold Rots wrote:
>>>
>>>
>>>> Aside from what I reported in a previous message, quoted below, there
>>>> are more discrepancies between Table 5 and Tables 6 and 7:
>>>>
>>>> obs_creator_did is missing from Table 7
>>>> o_units in Table 5 should be o_unit
>>>> pol_states is missing from Table 6
>>>> facility_name and instrument_name are spelled differently;
>>>> even though required, they show up in Table 7, rather than 6
>>>> em_unit is missing from Table 5
>>>> o_stat_error is missing from Table 7
>>>>
>>>> Also, note the comment I made on MJD in use case 1.6
>>>> and on the uselessness of bib_reference because of its murky
>>>> definition
>>>>
>>>> I still lament the fact that the data access functionality is
>>>> compromising the self-consistency and usefulness of the data discovery
>>>> function, but decided for our tarred packages to use:
>>>> dataproduct_type = NULL
>>>> dataproduct_subtype = package:event,image
>>>> access_format = application/x-tar
>>>> As far as I can tell, this is within the specifications.
>>>>
>>> Well we don't specify what the subtypes you provide for your archive
>>> should be so I suppose you could get away with this, but this example is
>>> not at all what we had in mind. The subtype should be the science type
>>> of the specific data product, *not* details about the content of the
>>> data product. I would expect the type to be "event" (meaning "event
>>> data" not "event list") and the subtype to be something more like
>>> "chandra.hrc.package", "chandra.hrc.refimage" (or "rosat.XX" etc.).
>>>
>>> Note subtypes are supposed to be fixed strings so that one can search
>>> the local archive for a particular type of data product; if you try to
>>> describe what is included in a particular data product then such
>>> selection won't be possible. So for example a client will do a generic
>>> query to see what subtypes Chandra defines, and then they can pose a
>>> more specific query to get a certain type of Chandra-specific data
>>> product. Likewise for ALMA etc.
>>>
>>> Note you also have obs.title where you can provide a short description
>>> of the data product and for this you can provide whatever you want.
>>>
>>> - Doug
>>>
>>>
> --------------------------------------------------------------------------
> Arnold H. Rots Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory tel: +1 617 496 7701
> 60 Garden Street, MS 67 fax: +1 617 495 7356
> Cambridge, MA 02138 arots at head.cfa.harvard.edu
> USA http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------
>
>
--
=====================================================================
François Bonnarel              Observatoire Astronomique de Strasbourg
CDS (Centre de données         11, rue de l'Université
astronomiques de Strasbourg)   F--67000 Strasbourg (France)
Tel: +33-(0)3 68 85 24 11      WWW: http://cdsweb.u-strasbg.fr/people/fb.html
Fax: +33-(0)3 68 85 24 25      E-mail: francois.bonnarel at astro.unistra.fr
---------------------------------------------------------------------