[obs-tap]: updates on the Proposed Recommendation

Fri Jul 29 17:11:13 PDT 2011

Hi Arnold, all dm people,

Let me come back to this, because this discussion is apparently going on
underground.

First, let us go back to the very beginning of the ObsTap effort.
The committee made a strong commitment to build something fast,
reusing the TAP protocol and the Observation/Characterisation data models
for data discovery, covering most of the needs.
From the very beginning it was also obvious that data links
and virtual data access could not and would not be covered by ObsTap.
The DataLink method or service concept has been around in various DAL notes
for years now. For my part, I gave presentations on it at the last three
Interop meetings (Victoria, Nara and Napoli; see e.g. the latter:
http://www.ivoa.net/internal/IVOA/DAL-InteropMay2011/DataLink.pdf ).

This concept exists because you cannot provide both data discovery
and complex linkage features (or linkage for complex data structures)
in one step and in a SINGLE table (a single table being required by the
TAP-ADQL protocol, as all may remember).
So ObsTap is there for data discovery. The only way to give access to the
various datasets in an observation is to duplicate the observation rows
until all observation-related products can be discovered, as was already
explained. This is verbose, but it works. So how can DataLink work in the
future? See below for your use case.
DataLink is now on the roadmap of the DAL working group, and an IVOA
Note is in preparation as a very first drafting effort for this new
"protocol". The note will be available within 3 weeks or so.
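
To make the row duplication concrete, here is a minimal Python sketch. It
is not taken from any ObsTap document. The column names dataproduct_type,
dataproduct_subtype, access_format and facility_name are the ones
discussed in this thread, and obs_id follows ObsCore usage; the row
values and the helper products_for are illustrative assumptions only.

```python
# Illustrative only: one ObsTap-style row per data product of one observation.
# The shared observation metadata is repeated on every row, which is the
# "verbose" part of the single-table approach.
shared = {"obs_id": "12345", "facility_name": "Chandra"}

# Hypothetical products for this observation (values are assumptions).
products = [
    {"dataproduct_type": None, "dataproduct_subtype": "package",
     "access_format": "application/x-tar"},
    {"dataproduct_type": "event", "dataproduct_subtype": "eventlist",
     "access_format": "application/fits"},
    {"dataproduct_type": "image", "dataproduct_subtype": "refimage",
     "access_format": "image/fits"},
]

# Duplicate the observation row once per product.
rows = [{**shared, **product} for product in products]


def products_for(table, obs_id):
    """Return every row (one per data product) belonging to one observation."""
    return [row for row in table if row["obs_id"] == obs_id]


for row in products_for(rows, "12345"):
    print(row["obs_id"], row["dataproduct_subtype"], row["access_format"])
```

Every row carries the same obs_id and facility_name, so a client that
discovers any one of them can find the others by selecting on obs_id.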

Arnold Rots wrote:
> This is becoming unwieldy.
> Trying to make X-ray data (and I suspect the same is true for aperture
> synthesis data) fit into something that is designed with optical
> images in mind is reminiscent of round pegs and square holes.
>
> Service providers are free to define subtypes and titles, but you are
> saying that if they don't follow rules that are not spelled out,
> things won't work as envisaged.
> Also, if I understand the argument correctly, if data discovery
> software is to be helpful at all, it needs to be able to extract some
> information from the title field - but that is intended for human
> consumption.
>
> If I see this, it looks like I need to generate at least eight records
> for a single observation, some containing a mix of levels, and all
> duplicating pretty much the same metadata.
>
> This is not going to make it attractive to provide ObsTAP services.
>
>
> Maybe I should do what you did and provide an example of how I thought
> it should have worked.
>
> Here is how I would envisage data discovery of Chandra data to work:
>   A single record per Obsid that provides the observational metadata and:
>     ObsId
>       12345
>     Dataset Identifier
>       ivo://ADS/Sa.CXO#obs/12345
>     Data Types available
>       Package
>       Event list
>       Image
>     Calibration level
>       2
>     Title
>       Chandra/ACIS ObsId 12345
>
>   
DataLink is a method or service that returns a table describing links
between observations, identified by their obsid, and any kind of data
retrieval. An obsid obtained in an ObsTap discovery phase can of course be
used directly to query such a service. (By the way, if the ObsTap service
is a TAP-PQL service, the DataLink table could be attached to the main
ObsTap table in the same query response, because the single-table
requirement no longer applies in that case.)
But it is a qualified link: the semantics or type of the link is given in
one field of the table, while the nature of the access is given in another
field. The latter tells us whether it is a simple retrieval, an SIA query
service, an SSA AccessData method, etc.
So in your use case we will get three different links for the same
observation (obsid). The types (or semantics) will be package, event list
and image, and the access nature could be, respectively, retrieval,
retrieval and SIA query (for example).
In addition, the "Access" package (the group of access fields in the
table) is proposed to be extended beyond the traditional "reference" and
"format" to describe which part of a complex "file" is to be retrieved
(path in a directory/tar file, extension in a MEF file, table name in a
VOTable, etc.). A proposal for such an extended Access package is
described in the Characterisation 2 draft at the moment.
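
As a rough illustration of the qualified-link table described above, here
is a Python sketch for the Chandra use case. The DataLink note is not
written yet, so every field name (semantics, access_nature, reference,
format, part), the example.org URLs and the helper links_for are my own
assumptions for illustration and not a proposed schema.

```python
from typing import List, NamedTuple, Optional


class Link(NamedTuple):
    """One qualified link; all field names are hypothetical."""
    obs_id: str
    semantics: str         # type of the link (package, event list, image)
    access_nature: str     # how the data is reached (retrieval, SIA query, ...)
    reference: str         # placeholder URL, not a real service
    format: Optional[str]  # format of the retrieved file, if any
    part: Optional[str]    # which part of a complex file (e.g. path in a tar)


# Three links for the same observation, as in the use case above.
LINKS = [
    Link("12345", "package", "retrieval",
         "http://example.org/chandra/12345.tar", "application/x-tar", None),
    Link("12345", "event list", "retrieval",
         "http://example.org/chandra/12345.tar", "application/x-tar",
         "evt.fits"),
    Link("12345", "image", "SIA query",
         "http://example.org/sia?obsid=12345", None, None),
]


def links_for(obs_id: str, semantics: Optional[str] = None) -> List[Link]:
    """Return the links of one observation, optionally filtered by semantics."""
    return [
        link for link in LINKS
        if link.obs_id == obs_id
        and (semantics is None or link.semantics == semantics)
    ]


for link in links_for("12345"):
    print(link.semantics, "->", link.access_nature, link.reference)
```

The second link shows the idea of the extended Access package: the
reference points to the same tar file as the package link, and the
hypothetical part field names the member to extract.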

Best regards
François
> Then a data access protocol that allows querying the archive using any
> of the above in a where clause, with either ObsId or DID required, and
> returning:
>   ObsId  DataType   Contents   Level   Format      URL
>   -----------------------------------------------------------
>   12345  Pkg_1      evt,img    2       tar         http://...
>   12345  Pkg_2      evt,img    1       tar         http://...
>   12345  Pkg_12     evt,img    2,1     tar         http://...
>   12345  evt        evt        2       fits-bin    http://...
>   12345  evt        evt        1       fits-bin    http://...
>   12345  img        img        2       fits        http://...
>   12345  img        img        2       jpg         http://...
>   12345  img        img        2       fits        http://...
>   12345  img        img        2       jpg         http://...
> This is an example where the client specified ObsId or DID, but no
> data type or format.
>
> Never mind the terms and abbreviations I used - you get the picture.
>
> Cheers,
>
>   - Arnold
>
>
> Douglas Tody wrote:
>   
>> More precisely what you might have is something like (display in a wide view):
>>
>>      ObsId     Type     Subtype               Level     Format                         Title
>>      ----------------------------------------------------------------------------------------------------------
>>      123      event    chandra.hrc.pkg         1      application/x-tar-gzip   Chandra ACS-XYZ observation package (event,refimage)
>>      123      image    chandra.hrc.refimage    2      image/fits               Chandra ACS-XYZ reference image
>>      123      image    chandra.hrc.preview     2      image/jpeg               Chandra ACS-XYZ preview image
>>      345      event    rosat.foo.pkg           1      application/x-tar-gzip   ROSAT whatever observation package (xxx)
>>
>> and so forth.  The subtype could in principle be more generic but will
>> likely be instrument-specific for a level 1 observation.
>>
>> The Title should concisely describe the data product, e.g., origin,
>> instrument, ID, what it is (observation package, calibration, standard
>> view, etc.).  The title string is what one normally wants to output on a
>> displayed image or plot to identify to a human the data being shown.
>> You can put whatever you want in there to describe the data product so
>> long as it is concise (one line of text).
>>
>>          - Doug
>>
>>
>>
>>
>> On Mon, 11 Jul 2011, Douglas Tody wrote:
>>
>>     
>>> On Thu, 7 Jul 2011, Arnold Rots wrote:
>>>
>>>       
>>>> Aside from what I reported in a previous message, quoted below, there
>>>> are more discrepancies between Table 5 and Tables 6 and 7:
>>>>
>>>> obs_creator_did is missing from Table 7
>>>> o_units in Table 5 should be o_unit
>>>> pol_states is missing from Table 6
>>>> facility_name and instrument_name are spelled differently;
>>>>  even though required, they show up in Table 7, rather than 6
>>>> em_unit is missing from Table 5
>>>> o_stat_error is missing from Table 7
>>>>
>>>> Also, note the comment I made on MJD in use case 1.6
>>>> and on the uselessness of bib_reference because of its murky
>>>> definition
>>>>
>>>> I still lament the fact that the data access functionality is
>>>> compromising the self-consistency and usefulness of the data discovery
>>>> function, but decided for our tarred packages to use:
>>>>  dataproduct_type = NULL
>>>>  dataproduct_subtype = package:event,image
>>>>  access_format = application/x-tar
>>>> As far as I can tell, this is within the specifications.
>>>>         
>>> Well we don't specify what the subtypes you provide for your archive
>>> should be so I suppose you could get away with this, but this example is
>>> not at all what we had in mind.  The subtype should be the science type
>>> of the specific data product, *not* details about the content of the
>>> data product.  I would expect the type to be "event" (meaning "event
>>> data" not "event list") and the subtype to be something more like
>>> "chandra.hrc.package", "chandra.hrc.refimage" (or "rosat.XX" etc.).
>>>
>>> Note subtypes are supposed to be fixed strings so that one can search
>>> the local archive for a particular type of data product; if you try to
>>> describe what is included in a particular data product then such
>>> selection won't be possible.  So for example a client will do a generic
>>> query to see what subtypes Chandra defines, and then they can pose a
>>> more specific query to get a certain type of Chandra-specific data
>>> product.  Likewise for ALMA etc.
>>>
>>> Note you also have obs.title where you can provide a short description
>>> of the data product and for this you can provide whatever you want.
>>>
>>>       - Doug
>>>
>>>       
> --------------------------------------------------------------------------
> Arnold H. Rots                                Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
> 60 Garden Street, MS 67                              fax:  +1 617 495 7356
> Cambridge, MA 02138                             arots at head.cfa.harvard.edu
> USA                                     http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------
>
>   

-- 
=====================================================================
François    Bonnarel           Observatoire Astronomique de Strasbourg
CDS (Centre de données         11, rue de l'Université
astronomiques de Strasbourg)  F--67000 Strasbourg (France)

Tel: +33-(0)3 68 85 24 11     WWW: http://cdsweb.u-strasbg.fr/people/fb.html
Fax: +33-(0)3 68 85 24 25     E-mail: francois.bonnarel at astro.unistra.fr
---------------------------------------------------------------------