WD-AccessData-1.0-20140312
Petr Skoda
skoda at sunstel.asu.cas.cz
Thu Mar 20 18:08:59 PDT 2014
On Fri, 21 Mar 2014, Mark Allen wrote:
> An important part of the effort to make multi-dimensional data
> accessible and useable within the VO is the engagement with projects
> that produce (or will produce) data cubes. To name some of these
> projects that participated in the focus sessions and following
> discussions:
>
> ALMA
> LOFAR
> ASKAP
> (SKA)
> JVLA
> CGPS (survey)
> CALIFA
> MUSE
> (ESO IFUs)
> (JWST IFUs)
>
> Data from some of these have been shown in demonstrations and are being
> used to guide developments within VO projects. These are real data
> cubes. Further examples welcome.
OK - I was a little bit concerned whether the real data format of the
above projects had been examined to judge the suitability of the proposed
trio of standards (SIAP2, accessdata, datalink) with the current parameter
restrictions.
As I said in the special session, we were shown more nice images and
wishes than real data. The only real data were the ALMA data presented by
JVO visage - but (I have seen it at Hawaii as well) there, at least so
far, the "energy" axis was represented by the individual channel ID - not
by a real "energy" span in barycentric wavelength in m.
And to be honest the IFUs so far in VO (I mean the Euro3D and the tools of
Igor and Ivan) were rather presented as individual spectra referenced by
fiber ID (or slit ID).
So these seem to be cubes that are rather sparse in some axes in
comparison to e.g. the more densely covered "energy" axis.
In fact CALIFA as well is denser in the "energy" axis than in the spatial
axes. But I can say (after downloading) that the reduced product really is
a datacube with NAXIS3.
So I apologize for a somewhat simplified view.
> The current efforts on the standards are to satisfy minimal requirements
> on discovery and access. Of course, doing science with data cubes
> requires much more, and we will need to consider the roles of services,
> user interfaces and tools. I think that some of your comments are mixing
> these things together, in particular for BAND.
My objection is not about datacubes (e.g. from IFUs) but about the fact
that SIAP2 is declared as an image access protocol, and accessdata in some
sense as an image extraction protocol.
The BANDNAME is crucial for science discovery in multicolour photometry
surveys, just as the fiber ID is crucial for multiobject spectra (look at
SDSS - the plate-fiber pair is the crucial primary metadata - not the
position, which is derived).
If you want to insist on a simple numerical energy band as the primary
information, do not name it "Image protocol" but SDAP - simple datacube
access protocol - and then everyone would know not to expect a server of
images (in different filters) here.
> The standard needs to be
> uniform and simple, but tools or services could offer any number of ways
> for an astronomer to specify a wavelength/frequency/energy range. For
> example, I think that a user interface could use a look-up of the SVO
> filter service to do for BAND what the NED/CDS sesame name resolver does
> for coordinates, presenting a way for the astronomer to deal with filter
> names, but using the standard that speaks only in metres (or Hz).
As I said - it is not uniquely mappable and will require much more effort
to investigate the data before publishing. Moreover it will not allow you
to select exactly what you want and know.
The example with Sesame is a perfect example of how a generalized view
without deeper investigation makes it troublesome to achieve scientific
goals.
With a little simplification, suppose you want to search for the spectrum
of a double star. You have (at least) two spectra on one chip. Both may be
extracted and put in separate 1D spectra. But the RA DEC in the FITS
header belongs to only one position, and the DATE-OBS is the same.
Obviously you know during the reduction which spectrum belongs to which
star and you name the OBJECT accordingly.
For SSA you are obliged to use a POS query - but it obviously returns the
spectra of both stars even with a very small SIZE circle.
When you have 100+ spectra of both stars, how do you isolate only those
of (e.g.) the secondary component, which has Balmer line emission and is
the only interesting one?
This is our case with HR1847A and B.
Fortunately in SSA there is the (optional) TARGETNAME parameter, so I can
query by TARGETNAME and ignore the POS. And although there may be some
ambiguity in names (different spaces etc ...) I say in SSA
TARGETNAME=HR1847B and I fulfill my science goal.
Of course if I do not know the TARGETNAME immediately, I first perform a
discovery query in large circles, probably using SESAME names etc ....
But once I know it I can get precisely what I want.
I agree the protocol should be simple but IMHO if it is orthogonally
designed it should allow me to isolate individual entities (e.g. spectra,
images) by a proper combination of parameters. In the case of SSA I have
the combination of POS, SIZE (still yielding the ambiguity) + TARGETNAME.
In the case of SIAP2 I need BAND (with a wider range) + BANDNAME (or
better BANDID) to get exactly THAT image.
Does anybody see now how important this is?
In terms of "ontology" or semantics:
The object of my investigation is some entity described by an extremely
large number of different attributes, each characterizing one property or
another. I assign a label to it. So when I want to select it I use the
label.
If possible I can map one label to another in a unique - bijective - way.
So I can use this new label to point to it as well. If this label is
Halpha or 6562.8e-10m (in air) it's fine (but I must use some tolerance).
But if I want to map a symbol to an interval - it is not bijective
anymore, as a wider interval may contain several narrower subintervals
(narrow-band photometry images). Stating the wider interval
(490e-9/700e-9m) will return ALL the subband images and I cannot imagine
how to select only the Johnson V filter image.
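A small sketch of why the interval alone cannot do the selection, in
Python. The image list and the passband limits are rough illustrative
numbers, the names `Image`, `band_query` and `bandname_query` are
hypothetical, and `bandname` stands for the BANDNAME label argued for
above.

```python
from dataclasses import dataclass


@dataclass
class Image:
    name: str
    bandname: str  # the label, e.g. a filter name
    lo: float      # lower passband edge in metres
    hi: float      # upper passband edge in metres


# Hypothetical holdings of one image server (approximate passbands).
IMAGES = [
    Image("img1", "Johnson V", 500e-9, 590e-9),
    Image("img2", "Halpha narrow", 655e-9, 658e-9),
    Image("img3", "[OIII] narrow", 499e-9, 502e-9),
    Image("img4", "Sloan r", 550e-9, 690e-9),
]


def band_query(images, lo, hi):
    """Interval selection: every image whose passband overlaps lo/hi."""
    return [im for im in images if im.lo <= hi and im.hi >= lo]


def bandname_query(images, bandname):
    """Label selection: exact match on the band name."""
    return [im for im in images if im.bandname == bandname]


wide = band_query(IMAGES, 490e-9, 700e-9)       # all four images
narrower = band_query(IMAGES, 500e-9, 590e-9)   # still three of them
only_v = bandname_query(wide, "Johnson V")      # exactly one image

print([im.name for im in wide])
print([im.name for im in narrower])
print([im.name for im in only_v])
```

Even shrinking the interval to the V passband itself still picks up the
overlapping broad-band and narrow-band images; only the label isolates
the one image.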
If there are thousands of images I am doomed to die after manual
selection or get very angry at the whole stupid VO garbage ;-)