ObsCoreDM and DAL for proprietary data

Petr Skoda skoda at sunstel.asu.cas.cz
Wed May 19 18:16:53 PDT 2010


Hi all,

Although I have not been following CharacDM or ObsDM for a year or more
(sorry), I am glad that some of the ideas we started with during the
special "hotel session" at the Theory/GRID workshop in Garching and later
in Baltimore seem to be reaching the realisation phase. Thanks to all of
you for your effort on it!

As I have been following the Interop almost in real time on the wiki
(regretting that I cannot be with you, for financial reasons), I have just
briefly read the proposal WD 1.0-20100517 and all the presentations
available on the wiki.

I have comments on the presentations of both Mireille (on the wiki it is
named ObstapConvergence.pdf) and Francois (DataLink.pdf). They concern
access_url and access_format in the case of proprietary data.

I think a very important science use case not mentioned in Dave Schade's
list is the action taken in the case of proprietary data - very common in
stellar astronomy, with an unknown and often unlimited (lifetime ;-)
proprietary period.

The question is - should the astronomer be allowed to discover only
public data? The most important information, e.g. for proposal or
collaboration purposes, is:

"Has someone observed my target? In a given time (e.g. did someone take
spectra just after a nova outburst?), in a given spectral range (the
behaviour of line XY is crucial for the physics)? Or what was the time,
band ... coverage of the observations with a given instrument?"
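
A minimal sketch of such a coverage-only query, written as a Python
function that builds an ADQL string. The table name ivoa.obscore and the
columns obs_publisher_did, s_ra, s_dec, t_min, t_max, em_min and em_max
are assumptions made for illustration (the working draft may name them
differently); only data_rights is taken from the discussion here. The
point is that the query asks for metadata only and never touches
access_url.

```python
def coverage_query(ra_deg, dec_deg, radius_deg,
                   t_start_mjd, t_end_mjd, wave_min_m, wave_max_m):
    """Build an ADQL query asking only for existence and coverage.

    Column and table names are assumed, not taken from the draft.
    """
    return (
        # Metadata only: no access_url, so no data is requested.
        "SELECT obs_publisher_did, t_min, t_max, em_min, em_max, data_rights "
        "FROM ivoa.obscore "
        # Position: target inside a small circle around (ra, dec).
        "WHERE CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {ra_deg}, {dec_deg}, {radius_deg})) = 1 "
        # Time: observation interval overlaps the requested interval.
        f"AND t_max >= {t_start_mjd} AND t_min <= {t_end_mjd} "
        # Spectral range: observed band overlaps the requested band.
        f"AND em_max >= {wave_min_m} AND em_min <= {wave_max_m}"
    )
```

Such a query would return proprietary and public rows alike, which is
exactly the discovery behaviour asked about above.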

In this case I just want to know the EXISTENCE and COVERAGE (or the whole
characterization) of the observation - NOT THE DATA ITSELF!
The curation part would be very important - ideally a direct contact to
the people who possess the data sets.

With respect to this, it is necessary to make clear what to do with the
mandatory access_url in the case of data_rights=Proprietary (data_rights
itself being optional).

Should the service respond with some "dummy URL" causing an error like
"not found", or better with some "null" contents (which will not show
anything, zero length...)?

Or should it return some verbose marker of the coverage? E.g. a
proprietary image would show as an empty rectangle or the whole footprint
in Aladin, possibly with the text "proprietary, Sorry" ;-) A spectrum in
an SSA client would show as just a horizontal line at 0.0 extending over
the wavelength ranges covered.
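
The "verbose marker" option could be sketched on the client side roughly
as follows. This is only an illustration under assumptions: the row is a
plain dictionary, and the keys dataproduct_type, em_min, em_max and
s_region are hypothetical names for the product type, spectral limits and
footprint; only data_rights comes from the discussion above.

```python
def placeholder_for(row):
    """Return a drawable stand-in for a proprietary row, or None if public."""
    rights = (row.get("data_rights") or "").strip().lower()
    if rights != "proprietary":
        return None  # public data: the client fetches access_url as usual

    if row.get("dataproduct_type") == "spectrum":
        # Flat line at 0.0 spanning the covered wavelength range.
        return {
            "kind": "spectrum",
            "label": "proprietary",
            "points": [(row["em_min"], 0.0), (row["em_max"], 0.0)],
        }

    # Images (and anything else): draw only the outline of the footprint.
    return {
        "kind": "footprint",
        "label": "proprietary",
        "region": row.get("s_region"),
    }
```

With this approach the service would not need a dummy URL at all; the
client decides what to draw from the metadata alone.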

Another issue to consider is what to do when privileged people (e.g. the
PI) would like to use VO tools to work with public data AS WELL AS with
their PROPRIETARY data (which would allow many VO-reluctant sceptics to
publish their data and learn the VO tools).

I think that a simple and feasible way could be based on an idea similar
to the unlocking of GPS maps (e.g. for Garmin): you get the full product,
but without a special key you see only the contours of the maps,
shorelines, lakes and very coarse detail - e.g. large highways, not
tourist tracks, small roads etc ...

So the server could recognize the proprietary data and encrypt the real
contents, still sending all the metadata. The client would transparently
identify the presence of encryption from the metadata (and in the list of
discovered datasets would place a "small key" icon next to them).
Clicking on the key would bring up a description of the details - e.g. the
expiration of the proprietary period, contact, information on what key is
required etc ...

One option is then:
Inside this "more info" label there could be a link or window where one
enters the key to unlock the data contents (only then would the request to
stream the real data be sent).

Another option:
The best way to do this seems to me to be something very similar to
PGP/GPG.

First, some "restricted keys" (known to the PI or the whole team, e.g. a
department) are imported into the local "keyring".
The client (e.g. Aladin or Splat) could check for possession of the keys
transparently while loading the ObsCore Curation metadata (before creating
the key icon) and use the key directly to unlock the data stream (which
would be requested on the fly for a matching keyID). The server would send
the proprietary data already encrypted.
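
The keyring check could look roughly like the sketch below. It is not a
GPG implementation: the key_id field, the keyring dictionary and the
download/decrypt callables are all hypothetical stand-ins, with the
decryption itself assumed to be delegated to GPG or a similar tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Dataset:
    dataset_id: str
    data_rights: str
    key_id: Optional[str] = None  # hypothetical curation field


def classify(ds: Dataset, keyring: Dict[str, bytes]) -> str:
    """Decide how the client lists a dataset: open, unlockable or locked."""
    if ds.data_rights.strip().lower() != "proprietary":
        return "open"
    if ds.key_id is not None and ds.key_id in keyring:
        return "unlockable"  # no key icon needed, we hold the key
    return "locked"          # show the "small key" icon


def fetch(ds: Dataset, keyring: Dict[str, bytes],
          download: Callable[[str], bytes],
          decrypt: Callable[[bytes, bytes], bytes]) -> Optional[bytes]:
    """Request the stream only when it can actually be used."""
    state = classify(ds, keyring)
    if state == "locked":
        return None  # metadata only, no data request is sent
    payload = download(ds.dataset_id)
    if state == "unlockable":
        # The server sent encrypted bytes; unlock with the matching key.
        return decrypt(payload, keyring[ds.key_id])
    return payload
```

Passing download and decrypt in as functions keeps the decision logic
independent of the transport and of the actual encryption tool.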

I feel that this mechanism could easily be added to clients as part of
their understanding of the ObsCore structure (ideally in the form of a
library), and the servers would just add a function for GPG encryption,
used for the delivery of data marked in the internal database as
private/proprietary.

I think this problem is loosely connected with the DataLink issues and
Alberto's comments on the NULL or NOT NULL question ;-)

I want to emphasize that this is not competing with the Single-Sign-On
work, the wider GRID security issues, community servers in Astrogrid
CEAs etc ...

I am not sure how ObsCoreDM should be linked to the whole TAP machinery
(including UWS async ...). What I do not understand so far is the role of
the current SIA and SSA servers - should everyone support TAP access in
ADDITION to SSA/SIA in order to return the basic
provenance/characterization needed for some of David's simple use cases?

But I see some controlled access to proprietary/restricted data as a
crucial step for spreading VO awareness and pushing "hands-on" experience
on the conservative part of the community (which suffers from
data-jealousy) as well as on various closed consortia.

----------------------------------
Another comment concerns the mime_type or obs:Access.Format.
As was already mentioned in Mireille's presentation, there seems to be a
rich list of values (including FITS.GZ, TEXT, TAR ....),
and Doug mentioned in his examples (obsdp.pdf, pages 5 and 9)
fits.image.gz and even fits.uvfits ....

So I am glad that the strict restriction under which FITS in the VO means
BINTABLE FITS (which causes a lot of confusion when accessing some
spectra, e.g. ESO FITS spectra in VOSPEC) will be broken.

Sorry to bother you with this again ;-) but I have to emphasize again:

1) 1D image FITS spectra in linear or log rebinned form (CRVAL1,
CDELT1), as produced by most IRAF/MIDAS users/pipelines, are a substantial
part of the optical spectra produced by ground-based telescopes.
2) a service highly demanded of VO tools by stellar astronomers (including
students) is the possibility to use the VO to discover spectra that are
then downloaded and processed in IRAF splot - the final stage is then the
wspectxt conversion to an ASCII table for advanced processing....

I would just like to remind you of the need to distinguish 1D image FITS
spectra from BINTABLE FITS in Access.Format when ObsCoreDM is finalized.
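
To illustrate how the two layouts differ in practice, here is a rough
sketch that tells them apart from the raw header bytes, using only the
Python standard library. The labels "1d-image" and "bintable" are
hypothetical and are not proposed Access.Format values; the header parser
is deliberately naive (it does not handle "/" inside string values or
CONTINUE cards).

```python
BLOCK = 2880  # FITS headers are padded to multiples of 2880 bytes
CARD = 80     # each header card is 80 ASCII characters


def read_header(data, offset=0):
    """Parse one FITS header; return (cards, offset of what follows)."""
    cards = {}
    pos = offset
    while pos + CARD <= len(data):
        card = data[pos:pos + CARD].decode("ascii", errors="replace")
        pos += CARD
        key = card[:8].strip()
        if key == "END":
            blocks = -(-(pos - offset) // BLOCK)  # ceiling division
            return cards, offset + blocks * BLOCK
        if card[8:10] == "= ":
            # Drop the trailing comment, then quotes around string values.
            value = card[10:].split("/", 1)[0].strip().strip("'").strip()
            cards[key] = value
    raise ValueError("no END card found")


def classify_fits_spectrum(data):
    """Return '1d-image', 'bintable' or 'unknown' (hypothetical labels)."""
    primary, end = read_header(data)
    naxis = int(primary.get("NAXIS", "0"))
    if naxis == 1 and "CRVAL1" in primary and "CDELT1" in primary:
        return "1d-image"
    if naxis == 0 and len(data) > end:
        # Empty primary array: the first extension header follows directly.
        ext, _ = read_header(data, end)
        if ext.get("XTENSION") == "BINTABLE":
            return "bintable"
    return "unknown"


def linear_wavelength(pixel, crval1, cdelt1, crpix1):
    """Wavelength of a (1-based) pixel on a linear axis."""
    return crval1 + (pixel - crpix1) * cdelt1
```

For a log-rebinned spectrum the same formula would give the logarithm of
the wavelength rather than the wavelength itself.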

Best regards

Petr

*************************************************************************
*  Petr Skoda                         Phone : +420-323-649201, ext. 361 *
*  Stellar Department                         +420-323-620361           *
*  Astronomical Institute AS CR       Fax   : +420-323-620250           *
*  251 65 Ondrejov                    e-mail: skoda at sunstel.asu.cas.cz  *
*  Czech Republic                                                       *
*************************************************************************


