[ObsCoreRFC] Minutes of the telco Monday June 6
Douglas Tody
dtody at nrao.edu
Tue Jul 5 13:56:36 PDT 2011
On Tue, 5 Jul 2011, Arnold Rots wrote:
>> First, the subtype may be used to define what the data object is in
>> collection or archive specific terms. For example if the data object is
>> a tar file containing all the files comprising a ROSAT observation the
>> data provider can define a subtype for this type of data. It is up to
>> the client to understand what the content of the proprietary data
>> product is, but if they are able to deal with such instrument-specific
>> data they probably do know what it is.
>
> This is precisely the case I was trying to solve: a tarfile containing
> a mix of data types: images, spectra, event lists.
> The way I would like to solve it is to allow "package" (or something
> similar) for the data type and enumerate the data files contained in
> the tarfile in the data subtype.
>
> It still leaves a similar issue for the access format: that would be
> tar, but it would be nice to be able to enumerate the formats of the
> files in the tarfile in a similar format subtype - that also would
> allow one to indicate whether or not the content of the tarfile is
> gzipped (as opposed to gzipping the tarfile itself).
>
> I realize that this constitutes a use of subtypes that is different
> from the original intent (at least, I think so), but it does seem a
> useful mechanism.
Arnold - I agree that in principle it would be useful to have this extra
information. However we had to argue for quite a while to get support
for instrumental data at this level included at all. One *can* expose
this data with ObsTAP 1.0 as outlined in my earlier email; in particular
exposing the individual data products separately allows them to be
described if the data provider wants to do so. Even exposing only the
tar/zip/MEF etc. file works so long as the client recognizes the
subtype.
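
A minimal sketch of what "the client recognizes the subtype" could look
like in practice. The column names follow ObsCore; the subtype string,
the URLs, the null data product type for the package row, and the helper
name usable_rows are assumptions made up for illustration only.

```python
# Hypothetical ObsTAP result rows; subtype strings and URLs are invented.
ROWS = [
    {   # whole-observation package, described only by a provider subtype
        "obs_id": "rosat-0001",
        "dataproduct_type": None,
        "dataproduct_subtype": "rosat.obs_package",
        "access_format": "application/x-tar",
        "access_url": "http://example.org/rosat-0001.tar",
    },
    {   # an individual product exposed separately, with a generic type
        "obs_id": "rosat-0001",
        "dataproduct_type": "image",
        "dataproduct_subtype": None,
        "access_format": "image/fits",
        "access_url": "http://example.org/rosat-0001-img.fits",
    },
]


def usable_rows(rows, known_subtypes):
    """Keep rows the client can interpret: a generic type, or a known subtype."""
    return [
        row for row in rows
        if row["dataproduct_type"] is not None
        or row["dataproduct_subtype"] in known_subtypes
    ]
```

A client that knows the provider's subtype keeps both rows; a generic
client keeps only the separately exposed image.
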
To attempt to describe the contents of arbitrarily complex
instrumental datasets is out of scope for ObsTAP, at least 1.0. Perhaps
we can address this issue in the next phase of development where we
prototype related mechanisms such as data linking.
> However, there is also the reverse problem: what do we do with data
> products based on multiple observations? Do we allow ObsId to be a
> list of ObsIds?
This was addressed in the document as I recall. In the case of complex
data products which are derived from multiple inputs (e.g. multiple
observations) we essentially have a new "software observation", and a
new obs_id should be assigned. To say more about the derivation of a
particular data product is complex and gets into the general issue of
provenance which is being addressed separately. Furthermore obs_id is a
database key used to uniquely identify specific "observations" (usable
as a foreign key in other tables for example) hence we cannot turn it
into a list of obs_ids.
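
The key constraint can be illustrated with a small sketch using SQLite.
This is not part of ObsTAP; the table names (observation, product), the
identifiers, and the helper functions are hypothetical. A derived
product gets its own new obs_id, while a comma-separated list of obs_ids
matches no key and is rejected.

```python
import sqlite3


def build_demo():
    """Create an in-memory database where obs_id is a key (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE observation (obs_id TEXT PRIMARY KEY);
        CREATE TABLE product (
            product_id TEXT PRIMARY KEY,
            obs_id     TEXT NOT NULL REFERENCES observation(obs_id)
        );
    """)
    # Two real observations plus one "software observation" for a derived product.
    conn.executemany("INSERT INTO observation VALUES (?)",
                     [("obs-A",), ("obs-B",), ("mosaic-1",)])
    conn.execute("INSERT INTO product VALUES (?, ?)", ("mosaic-img", "mosaic-1"))
    conn.commit()
    return conn


def list_of_ids_rejected(conn):
    """Return True if a list-valued obs_id fails the foreign key check."""
    try:
        conn.execute("INSERT INTO product VALUES (?, ?)",
                     ("bad-img", "obs-A,obs-B"))
    except sqlite3.IntegrityError:
        return True
    return False
```
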
- Doug