[obs-tap]: updates on the Proposed Recommendation
Douglas Tody
dtody at nrao.edu
Mon Aug 1 15:12:55 PDT 2011
Arnold -
ObsTAP alone is sufficient for the simpler use cases; like a classical
archive, it provides direct discovery and download of static archive
data products.
Data linking (and association, for modeling complex data) will be a
powerful advanced capability. However, it is optional, is not required
for the more basic use cases, and will still be in prototyping after
we get the basic ObsTAP indexing in place.
If you really want to expose only your instrumental observations and
rely upon data linking for all access, you can; this is possible within
the interface design. However, it would be preferable, and more
consistent with usage at other sites, to expose both the observations
and the major data products, using data linking only to support the
more advanced queries. This is what we plan to do here, for example.
Yes, there is some duplication of metadata in the query response for
related products, but that considerably simplifies the interface and, as
noted earlier, this metadata can easily be autogenerated from the actual
fully normalized and probably non-standardized base tables. If a client
just wants to work off the observation index plus data links, that usage
mode will still work fine: they can simply restrict the query to a
single subtype and follow the links.
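
To make the autogeneration and the subtype restriction concrete, here is a
minimal sketch. It assumes two hypothetical normalized base tables (one row
per observation, one row per product); the table layout, the function names
and the instrument value are illustrative assumptions, while the column names
and subtypes are the ones used elsewhere in this thread.

```python
# Hypothetical normalized base tables (not taken from any real archive schema).
observations = {
    "123": {"facility_name": "Chandra", "instrument_name": "HRC"},
}
products = [
    {"obs_id": "123", "dataproduct_type": "event",
     "dataproduct_subtype": "chandra.hrc.pkg", "calib_level": 1,
     "access_format": "application/x-tar-gzip"},
    {"obs_id": "123", "dataproduct_type": "image",
     "dataproduct_subtype": "chandra.hrc.refimage", "calib_level": 2,
     "access_format": "image/fits"},
    {"obs_id": "123", "dataproduct_type": "image",
     "dataproduct_subtype": "chandra.hrc.preview", "calib_level": 2,
     "access_format": "image/jpeg"},
]


def obstap_rows(observations, products):
    """Build one flat index row per product, copying the parent
    observation's metadata into each row (the duplication discussed above)."""
    rows = []
    for product in products:
        row = dict(observations[product["obs_id"]])
        row.update(product)
        rows.append(row)
    return rows


def select_subtype(rows, subtype):
    """Keep only rows of one fixed subtype, e.g. the observation package."""
    return [row for row in rows if row["dataproduct_subtype"] == subtype]


rows = obstap_rows(observations, products)
packages = select_subtype(rows, "chandra.hrc.pkg")
```

Here `rows` plays the role of the full index with both observations and major
data products, and `packages` is what a client restricting itself to a single
subtype would see before following data links.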
This could be a good approach for dealing with complex instrumental data
such as for Chandra or a radio instrument. However, if a user is just
looking for smallish images with calib_level>=2 and a minimum spatial
resolution of 2 arcsec (or whatever), it can be done in one step with the
basic ObsTAP interface, and globally for the whole VO. A primary
characteristic of a good design is that simple things can be done
simply, while the design can also handle more complex use cases without
conflicting with the basic model.
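
As a rough illustration of such a one-step query, the sketch below builds an
ADQL string. Only calib_level and dataproduct_type are named in this thread;
the table name ivoa.ObsCore and the columns s_resolution (arcsec) and
access_estsize (kB) are assumed from the ObsCore data model, the function
name is hypothetical, and "minimum spatial resolution of 2 arcsec" is read
here as s_resolution <= 2.

```python
def basic_image_query(min_calib_level=2, max_resolution_arcsec=2.0,
                      max_size_kb=None):
    """Return an ADQL query for calibrated images at or better than the
    requested spatial resolution (all names are assumptions, see above)."""
    clauses = [
        "dataproduct_type = 'image'",
        f"calib_level >= {int(min_calib_level)}",
        # Smaller s_resolution means finer detail, hence "<=".
        f"s_resolution <= {float(max_resolution_arcsec)}",
    ]
    if max_size_kb is not None:
        # Optional cap for "smallish" images, using the estimated size.
        clauses.append(f"access_estsize <= {int(max_size_kb)}")
    return "SELECT * FROM ivoa.ObsCore WHERE " + " AND ".join(clauses)


query = basic_image_query()
small_query = basic_image_query(max_size_kb=10000)
```

The same string could be sent unchanged to every ObsTAP service, which is
what makes the query global rather than archive-specific.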
- Doug
On Mon, 1 Aug 2011, Arnold Rots wrote:
> See also my response to Francois.
>
> I would argue that it's better to give up that ability, since it
> yields a cleaner data discovery protocol that will therefore be more
> likely to survive future developments.
>
> Besides, if I understand your usage correctly, it will require
> separate records for the "reference" products and for the more
> involved ones.
>
> Cheers,
>
> - Arnold
>
> Douglas Tody wrote:
>> Arnold -
>>
>> As we see it, ObsTAP with association and data linking (which has long
>> been the plan) is capable of doing what you want, i.e., describing only
>> the observations and pointing to related data products or access
>> services via data linking.
>>
>> What you would give up with this approach, however, is the ability to
>> directly expose associated high level data products such as reference
>> images or spectra via ObsTAP so that they can be accessed directly
>> without having to follow data links or invoke additional services.
>>
>> As noted in an earlier email, a hybrid approach is possible: describing
>> the observation and overall packaged dataset with data links pointing to
>> the full list of individual data products or access services, as well as
>> selected high level data products such as precomputed reference images.
>>
>> - Doug
>>
>>
>> On Mon, 1 Aug 2011, Arnold Rots wrote:
>>
>>> Francois,
>>>
>>> Nothing is going on underground.
>>> I have shared our experiences in implementing ObsTAP with some local
>>> members of the TCG. I made it clear that the PR can be implemented,
>>> but that there are problems.
>>>
>>> But if I understand your argument below, we are in full agreement.
>>> You want to separate Data Link from Data Discovery and that is
>>> precisely what I was arguing; my complaint was that there are Data
>>> Link elements in the ObsTAP Data Discovery protocol that are causing a
>>> problem.
>>> Specifically: the access_* elements belong in Data Link, not in Data
>>> Discovery, and with them removed the data types available can be
>>> enumerated in a single record.
>>> So, the example I gave (the responses to a Data Discovery query and a
>>> Data Linking query) are in full agreement with what you are
>>> advocating, as far as I can tell.
>>>
>>> Is there still an issue, then?
>>>
>>> Cheers,
>>>
>>> - Arnold
>>>
>>>
>>> Francois Bonnarel wrote:
>>>> Hi Arnold, all dm people,
>>>>
>>>> Let me go back to this, because apparently this discussion is going on
>>>> underground.
>>>>
>>>> First, let us come back to the very beginning of the ObsTAP effort.
>>>> There was a strong commitment from the committee to build something
>>>> fast, reusing the TAP protocol and the observation/characterisation
>>>> data model, for data discovery covering most of the needs.
>>>> From the very beginning it was also obvious that data links and
>>>> virtual data access could not and would not be covered by ObsTAP.
>>>> The DataLink method or service concept has been around in various DAL
>>>> notes for years now. For my part, I made presentations at the last
>>>> three Interop meetings (Victoria, Nara and Napoli; see e.g. the latter:
>>>> http://www.ivoa.net/internal/IVOA/DAL-InteropMay2011/DataLink.pdf )
>>>>
>>>> This concept is there because you cannot imagine providing both data
>>>> discovery and complex linkage features (or linkage for complex data
>>>> structures) in one step and in a SINGLE table (a single table being
>>>> required by the TAP-ADQL protocol, as all may remember).
>>>> So ObsTAP is there for data discovery. The only thing you can imagine
>>>> doing to provide access to the various datasets in an observation is to
>>>> duplicate the observation rows until you reach full discovery of all
>>>> observation-related products, as was already explained. This is verbose,
>>>> but it works. So how can DataLink work in the future? See below on
>>>> your use case.
>>>> DataLink is now in the roadmap of the DAL working group, and an IVOA
>>>> note is in preparation as a very first drafting effort for this new
>>>> "protocol". The note will be available within 3 weeks or so.
>>>>
>>>> Arnold Rots wrote:
>>>>> This is becoming unwieldy.
>>>>> Trying to make X-ray data (and I suspect the same is true for aperture
>>>>> synthesis data) fit into something that is designed with optical
>>>>> images in mind is reminiscent of round pegs and square holes.
>>>>>
>>>>> Service providers are free to define subtypes and titles, but you are
>>>>> saying that if they don't follow rules that are not spelled out,
>>>>> things won't work as envisaged.
>>>>> Also, if I understand the argument correctly, if data discovery
>>>>> software is to be helpful at all, it needs to be able to extract some
>>>>> information from the title field - but that is intended for human
>>>>> consumption.
>>>>>
>>>>> If I see this, it looks like I need to generate at least eight records
>>>>> for a single observation, some containing a mix of levels, and all
>>>>> duplicating pretty much the same metadata.
>>>>>
>>>>> This is not going to make it attractive to provide ObsTAP services.
>>>>>
>>>>>
>>>>> Maybe I should do what you did and provide an example of how I thought
>>>>> it should have worked.
>>>>>
>>>>> Here is how I would envisage data discovery of Chandra data to work:
>>>>> A single record per Obsid that provides the observational metadata and:
>>>>>   ObsId                  12345
>>>>>   Dataset Identifier     ivo://ADS/Sa.CXO#obs/12345
>>>>>   Data Types available   Package, Event list, Image
>>>>>   Calibration level      2
>>>>>   Title                  Chandra/ACIS ObsId 12345
>>>>>
>>>>>
>>>> DataLink is a method or a service that retrieves a table describing
>>>> links between observations, identified by their obsid, and any kind of
>>>> data retrieval. An obsid known from an ObsTAP discovery phase can of
>>>> course be used directly to interrogate such a service.
>>>> (By the way, if the ObsTAP service is a TAP-PQL service, the
>>>> DataLink table could be attached to the main ObsTAP table in the same
>>>> query response, because the single-table requirement no longer applies
>>>> in that case.)
>>>> But it is a qualified link, which means that the semantics or type of
>>>> the link is given in one field of the table, while the nature of the
>>>> access is given in another field: this can tell us whether it is a
>>>> simple retrieval, an SIA query service, an SSA AccessData method, etc.
>>>> So in your use case we will get three different links for the same
>>>> observation (obsid). The types (or semantics) will be Package, event
>>>> list and image, and the access nature could be, respectively,
>>>> retrieval, retrieval and SIA query (for example).
>>>> In addition, it is proposed to extend the "Access" package (the group
>>>> of access fields in the table) beyond the traditional "reference" and
>>>> "format" to describe which part of a complex "file" is to be retrieved
>>>> (path in a directory/tar file, extension in a MEF file, table name in a
>>>> VOTable, etc.). A proposal for such an extended access package is
>>>> described in the Characterisation 2 draft at the moment.
>>>>
>>>> Best regards
>>>> François
>>>>> Then a data access protocol that allows querying the archive using any
>>>>> of the above in a where clause, with either ObsId or DID required, and
>>>>> returning:
>>>>> ObsId  DataType  Contents  Level  Format    URL
>>>>> -----------------------------------------------------------
>>>>> 12345  Pkg_1     evt,img   2      tar       http://...
>>>>> 12345  Pkg_2     evt,img   1      tar       http://...
>>>>> 12345  Pkg_12    evt,img   2,1    tar       http://...
>>>>> 12345  evt       evt       2      fits-bin  http://...
>>>>> 12345  evt       evt       1      fits-bin  http://...
>>>>> 12345  img       img       2      fits      http://...
>>>>> 12345  img       img       2      jpg       http://...
>>>>> 12345  img       img       2      fits      http://...
>>>>> 12345  img       img       2      jpg       http://...
>>>>> This is an example where the client specified ObsId or DID, but no
>>>>> data type or format.
>>>>>
>>>>> Never mind the terms and abbreviations I used - you get the picture.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> - Arnold
>>>>>
>>>>>
>>>>> Douglas Tody wrote:
>>>>>
>>>>>> More precisely what you might have is something like (display in a wide view):
>>>>>>
>>>>>> ObsId  Type   Subtype               Level  Format                  Title
>>>>>> ----------------------------------------------------------------------------------------------------------
>>>>>> 123    event  chandra.hrc.pkg       1      application/x-tar-gzip  Chandra ACS-XYZ observation package (event,refimage)
>>>>>> 123    image  chandra.hrc.refimage  2      image/fits              Chandra ACS-XYZ reference image
>>>>>> 123    image  chandra.hrc.preview   2      image/jpeg              Chandra ACS-XYZ preview image
>>>>>> 345    event  rosat.foo.pkg         1      application/x-tar-gzip  ROSAT whatever observation package (xxx)
>>>>>>
>>>>>> and so forth. The subtype could in principle be more generic but will
>>>>>> likely be instrument-specific for a level 1 observation.
>>>>>>
>>>>>> The Title should concisely describe the data product, e.g., origin,
>>>>>> instrument, ID, what it is (observation package, calibration, standard
>>>>>> view, etc.). The title string is what one normally wants to output on a
>>>>>> displayed image or plot to identify to a human the data being shown.
>>>>>> You can put whatever you want in there to describe the data product so
>>>>>> long as it is concise (one line of text).
>>>>>>
>>>>>> - Doug
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, 11 Jul 2011, Douglas Tody wrote:
>>>>>>
>>>>>>
>>>>>>> On Thu, 7 Jul 2011, Arnold Rots wrote:
>>>>>>>
>>>>>>>
>>>>>>>> Aside from what I reported in a previous message, quoted below, there
>>>>>>>> are more discrepancies between Table 5 and Tables 6 and 7:
>>>>>>>>
>>>>>>>> obs_creator_did is missing from Table 7
>>>>>>>> o_units in Table 5 should be o_unit
>>>>>>>> pol_states is missing from Table 6
>>>>>>>> facility_name and instrument_name are spelled differently;
>>>>>>>> even though required, they show up in Table 7, rather than 6
>>>>>>>> em_unit is missing from Table 5
>>>>>>>> o_stat_error is missing from Table 7
>>>>>>>>
>>>>>>>> Also, note the comment I made on MJD in use case 1.6
>>>>>>>> and on the uselessness of bib_reference because of its murky
>>>>>>>> definition
>>>>>>>>
>>>>>>>> I still lament the fact that the data access functionality is
>>>>>>>> compromising the self-consistency and usefulness of the data discovery
>>>>>>>> function, but decided for our tarred packages to use:
>>>>>>>> dataproduct_type = NULL
>>>>>>>> dataproduct_subtype = package:event,image
>>>>>>>> access_format = application/x-tar
>>>>>>>> As far as I can tell, this is within the specifications.
>>>>>>>>
>>>>>>> Well we don't specify what the subtypes you provide for your archive
>>>>>>> should be so I suppose you could get away with this, but this example is
>>>>>>> not at all what we had in mind. The subtype should be the science type
>>>>>>> of the specific data product, *not* details about the content of the
>>>>>>> data product. I would expect the type to be "event" (meaning "event
>>>>>>> data" not "event list") and the subtype to be something more like
>>>>>>> "chandra.hrc.package", "chandra.hrc.refimage" (or "rosat.XX" etc.).
>>>>>>>
>>>>>>> Note subtypes are supposed to be fixed strings so that one can search
>>>>>>> the local archive for a particular type of data product; if you try to
>>>>>>> describe what is included in a particular data product then such
>>>>>>> selection won't be possible. So for example a client will do a generic
>>>>>>> query to see what subtypes Chandra defines, and then they can pose a
>>>>>>> more specific query to get a certain type of Chandra-specific data
>>>>>>> product. Likewise for ALMA etc.
>>>>>>>
>>>>>>> Note you also have obs.title where you can provide a short description
>>>>>>> of the data product and for this you can provide whatever you want.
>>>>>>>
>>>>>>> - Doug
>>>>>>>
>>>>>>>
>>>>> --------------------------------------------------------------------------
>>>>> Arnold H. Rots Chandra X-ray Science Center
>>>>> Smithsonian Astrophysical Observatory tel: +1 617 496 7701
>>>>> 60 Garden Street, MS 67 fax: +1 617 495 7356
>>>>> Cambridge, MA 02138 arots at head.cfa.harvard.edu
>>>>> USA http://hea-www.harvard.edu/~arots/
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> =====================================================================
>>>> François Bonnarel             Observatoire Astronomique de Strasbourg
>>>> CDS (Centre de données        11, rue de l'Université
>>>> astronomiques de Strasbourg)  F--67000 Strasbourg (France)
>>>>
>>>> Tel: +33-(0)3 68 85 24 11 WWW: http://cdsweb.u-strasbg.fr/people/fb.html
>>>> Fax: +33-(0)3 68 85 24 25 E-mail: francois.bonnarel at astro.unistra.fr
>>>> ---------------------------------------------------------------------
>>>>
>>