roadmap 2010-2011

Francois Bonnarel francois.bonnarel at astro.unistra.fr
Fri Sep 17 06:21:09 PDT 2010


Hi Petr,
just one comment below....
Regards
françois
Petr Skoda wrote:
>
> Dear all,
>
> I am glad that the question of next DAL development was opened.
> I am strongly biased towards optical stellar spectroscopy, where I see 
> a lack of interest from the VO community and, unfortunately, also a 
> lack of interest in the VO among spectroscopists. I would expect 
> similar problems in other fields as well; I just do not know enough 
> about the working style of other scientists.
>
> I see this all the time: even the ESO astronomers ;-) ignore the VO, 
> as it does not give them the tools they need.
>
> I gave a lecture about the VO at the Crimean Astrophysical Observatory 
> yesterday, a place from which some of the most advanced ideas about 
> the structure of stars, stellar evolution, complicated numerical 3D 
> simulations of mass exchange in binaries etc. come. They were excited 
> by a simple demonstration of vodesktop and SPLAT, but no one had seen 
> anything like SPLAT before, and they had only the vague notion that 
> the VO is something about complicated access to catalogues ;-)
>
> And what is worse, some people working in stellar research tried the 
> VO hoping to make their work more efficient, but after a short time 
> they had to conclude that they see no difference between downloading 
> the spectra directly from an archive and obtaining them through a VO 
> search in SPLAT.
> "No added value is provided!" they said.
>
>
> In fact I have to repeat all the time that the tools we have in the VO 
> for spectroscopy are not in the VO spirit. The only practical work you 
> can do with current tools is rudimentary plotting of as many spectra 
> as the limited memory of the tool can handle. In fact, the only work I 
> have seen successfully done with VO tools (by SVO people) is the 
> combination of several spectra in different regions and their 
> comparison with theoretical spectra - ONE by ONE.
>
> The problem is clearly seen in the case of the ESO spectra most wanted 
> among stellar astronomers (and not obtained through the VO tools): the 
> UVES echelle. If you want to download 100 spectra from UVES, it takes 
> minutes even in the local ESO network. From the rest of the world it 
> is practically impossible to work with series of spectra from UVES, 
> ELODIE, HARPS etc.
> And the client will often crash or slow down because of the large 
> memory requirements ...
>
> So this approach is not in the VO spirit as it has so often been 
> presented (avalanche of data, seamless way of working with large sets ...)
>
> So the solution is on-the-server processing (cutout of spectral 
> lines, rebinning, normalization ...). We have the SSA standard 
> (probably the only one consistent with a data model), but in practice 
> the only really large spectral set (SDSS) is not accessible through 
> the VO. It stands apart: for most astronomers I have seen working with 
> SDSS spectra, all the effort in VO development is obscured by SDSS's 
> own CasJobs interface (and very few people even know about the 
> spectraservices web).
>
> The rest of the SSA services are quite confusing: most queries (e.g. 
> from vodesktop) usually return errors or nothing, as the services do 
> not understand the more advanced parameters (like BAND, TIME ...). 
> Some pretend to have more spectra, but these are just different 
> representations of the same data, and it is difficult to distinguish 
> particular types of data (e.g. in ESO HARPS vs UVES). And some SSA 
> services return time series instead (e.g. COROT), but there is no way 
> for clients to recognize this and behave accordingly (e.g. to run a 
> period analysis).
>
> The problem with the current SSA is that there is no way to describe 
> the type of processing used in generating the "virtual data". E.g. for 
> my spectral cutout I have to use 2 services. What if I add e.g. 
> rebinning, convolution to a given resolution etc.? How many services 
> would we need then?
>
The SIA2 internal draft clearly distinguishes the AccessData method (the 
only one able to drive the various kinds of reprocessing) from
the query method for images and cubes. The distinction is not so obvious 
in SIA1. So SSA2, which is on the roadmap, will probably have to expand on this...
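
As a rough illustration only (nothing below comes from the SIA2 draft), the
query side of that distinction can be sketched for SSA as a plain queryData
URL restricted with BAND. REQUEST, POS, SIZE and BAND are SSA query
parameters; the base URL, the helper name and the chosen H-alpha window are
hypothetical. Any reprocessing request (rebinning, convolution ...) would
belong to a separate data-access call whose parameters are not sketched here.

```python
from urllib.parse import urlencode


def build_ssa_query_url(base_url, ra_deg, dec_deg, size_deg, band_m=None):
    """Build an SSA queryData URL (hypothetical helper, discovery only).

    band_m is an optional (min, max) wavelength range in metres.
    """
    params = {
        "REQUEST": "queryData",
        "POS": f"{ra_deg},{dec_deg}",   # position in decimal degrees
        "SIZE": str(size_deg),          # search diameter in degrees
    }
    if band_m is not None:
        lo, hi = band_m
        params["BAND"] = f"{lo:.4e}/{hi:.4e}"  # "min/max" range in metres
    # Keep "," and "/" readable instead of percent-encoding them.
    return base_url + "?" + urlencode(params, safe=",/")


# Assumed example: a 40 Angstrom window around H-alpha (6562.8 Angstrom).
halpha_window_m = (6.5428e-7, 6.5828e-7)
url = build_ssa_query_url("http://example.org/ssa", 83.82, -5.39, 0.1,
                          halpha_window_m)
print(url)
```
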
> So why am I so sceptical about the plans for the advanced GDS DAL 
> interface? We do not yet have a practical testbed for processing the 
> whole SSA specification (something that understands everything written 
> in the SSAP standard).
>
> We need obligatory keywords for the description of post-processing 
> operations, even for simple spectra (cutout, rebinning, convolution, 
> wavelength shift ...) and especially for theoretical spectra 
> (convolution with a given rotational velocity ...)
>
> And in addition to that, there is no practical description of how to 
> implement an SSA service if you have a bunch of FITS spectra (perhaps 
> SAADA has something, but it is only partial - not according to the 
> full SSA specification).
>
> I already pointed this out around November ...
>
>
> I think that Doug has precisely expressed the whole spirit of the VO 
> ideas. We should still think of the VO as a tool for astronomers who 
> expect to do their work more efficiently with the VO while keeping 
> similar capabilities. But the current development of the VO is all 
> about background infrastructure - who will write the tools that can 
> use it?
>
>
>> In general virtual data generation can involve some combination of
>> subsetting, filtering, or transformation.
>
>> This is vital to what we are trying to do with VO, to be able to scale
>> up to the very large datasets which are coming.
>
> EXACTLY !!!!
>
>
>> spectra, e.g. cutting out a small 2D image region (or hundreds of
>> them), reprojecting 2D image data, or cutting out a region around
>> a single spectral line in a high resolution spectrum.
>
> Not even high resolution: even low-resolution spectrographs (e.g. 
> LAMOST) now have 4000+ pixels, and for analysis of time evolution (for 
> which a whole series of spectra is needed) you have to zoom in on a 
> particular range only. In practice, downloading say 500 spectra takes 
> time (minutes ...), and the zooming takes time (e.g. large memory use, 
> swapping, plotting and interpolating pixels which are not used after 
> zooming) etc.
> Instead, cutting out short wavelength regions on the server, then 
> downloading and displaying them, is much faster, even if I need to 
> download another set for a different spectral range ...
>
> I have practical experience with this, using my cutout SSA server and 
> SPLAT-VO on a 3GB notebook all over the world (with different speeds 
> and network latency).
>
>> We need both simple whole-file and virtual data access capabilities.
>> Virtual data access capabilities are essential to enabling distributed
>> data analysis (analysis performed directly on remote data without
>> first downloading the data), and to scaling up.
>
> YES YES YES !!!!!!!
>
>
>> Discovering and downloading whole archive files for local processing
>> is of course a major use case - this is probably still the dominant
>> form of data access.
>
> I am afraid that downloading can be done easily by archive tools (e.g. 
> tar.gz creation on FTP servers etc., with retransmission in case of 
> failure by rsync or wget ...).
>
> The really big spectra may already be a problem. Concerning discovery: 
> when I need a series, it is usually from the same instrument, so I 
> know where it is. And many people doing "random" discovery (e.g. who 
> has observed my object, and when?) are interested only in a simple 
> feature (like: are lines of HeI seen in emission? or: did they also 
> observe a good-quality profile of the H-alpha line?), so they would 
> not like to download gigabytes of spectra, open them all and zoom in 
> on a given range.
>
> For the first question they need post-processing (cutout of a line); 
> for the second they need ObsTAP giving the range and SNR ...
>
>
>> But the typed interfaces like SIA/SSA already
>> support this simple mode of access; this is the "simple" mode these
>> interfaces support.
>
> Simple currently means (in practice) whole data.
>
>> Soon we will have ObsTAP with both ADQL and
>> PQL query interfaces, which will provide a simpler alternative for
>> whole file discovery and access, adding the capability to access and
>> associate any type of data, at the cost of some lost object-specific
>> metadata.
>
> That's the most wanted feature - to know IS IT SOMEWHERE?, WHERE IS 
> IT?, HOW and WHEN WAS IT OBSERVED?
> and only then comes WHAT DOES IT LOOK LIKE?
>
>> to add knowledge of and advanced access capabilities for a specific
>> type of astronomical data.
> YES - the astronomers want to work with data not just look at them.
>
>>
>> The typed interfaces extend the generic query interface in important
>> ways for each type of data, and add virtual data generation
>> capabilities (the queryData response can describe virtual data).
>
> That is nice - a self-describing response - but how will the client 
> work with it? (And who will write such a client?)
>
>
>> While the query may look
>> similar in each interface these added semantics are extremely important
>> as they represent the difference between (for example) a catalog or
>> an image or a spectrum or a theoretical model.
>
> Yes, you cannot measure RV on an image, or compare a synthetic spectrum 
> with one line cut from a 2D image of a galaxy.
>
>> Hence one might use ObsTAP with PQL to discover all the
>> data for a region on the sky, and then use SIA or SSA etc. for data
>> access more advanced than merely downloading entire archive files.
> BUT what if the data discovered by ObsTAP amount to TB volumes?
>
> We are approaching the 4th paradigm in astroinformatics ;-) and 
> people will want to dig inside PB volumes to find something new 
> about the Universe.
>
>> If whole file access is sufficient then ObsTAP alone might be enough.
> I can imagine some will be happy with just a small amount of full files, 
> but the real power of the VO can be achieved only with value-added services.
>
>> More complex data analysis use cases require the capabilities of the
>> typed DAL interfaces with their customized parameter interfaces.
>
> YES
>
> Thanks, Doug, for the concise and clear summary
>
> Petr Skoda
>
> *************************************************************************
> * Petr Skoda Phone : +420-323-649201, ext. 361 *
> * Stellar Department +420-323-620361 *
> * Astronomical Institute AS CR Fax : +420-323-620250 *
> * 251 65 Ondrejov e-mail: skoda at sunstel.asu.cas.cz *
> * Czech Republic *
> *************************************************************************
>


-- 
=====================================================================
François   Bonnarel           Observatoire Astronomique de Strasbourg
CDS (Centre de données        11, rue de l'Université
astronomiques de Strasbourg)  F--67000 Strasbourg (France)

Tel: +33-(0)3 68 85 24 11     WWW: http://cdsweb.u-strasbg.fr/people/fb.html
Fax: +33-(0)3 68 85 24 25     E-mail: francois.bonnarel at astro.unistra.fr
---------------------------------------------------------------------


