Error column in VOResource -> metadata standardization

Tony Linde ael at star.le.ac.uk
Wed May 19 06:01:50 PDT 2004


Hi Pierre,

I agree in principle with the levelled approach (after ten seconds' thought :)).

What we must have is some XML Schema-based metadata representation that is
common across all VObs components. This ensures the use of standard tools
and mechanisms, including the ability to define a standard metadata block
and then refer to it via a namespace.
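
For example (the namespace URI and element names are illustrative, loosely
after the current VOResource drafts), a standard block might look like:

   <vr:Resource xmlns:vr="http://www.ivoa.net/xml/VOResource/v0.10">
     <!-- illustrative instance only, not a normative example -->
     <vr:title>Example Survey Archive</vr:title>
     <vr:identifier>ivo://example.org/survey</vr:identifier>
   </vr:Resource>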

And namespaces are probably the way to get to your levelled approach: the
top level consists of the standard IVOA namespaced metadata blocks; the
next level reuses parts of these standards and perhaps adds certain
extensions; the third level is perhaps whole new blocks.
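
A sketch of how the levels might compose (the extension namespace and
element are hypothetical): a second-level block reuses the standard
elements and adds its own,

   <ex:Resource xmlns:vr="http://www.ivoa.net/xml/VOResource/v0.10"
                xmlns:ex="http://archive.example.org/xml/MyExtension/v0.1">
     <vr:title>Example Survey Archive</vr:title>
     <ex:errorColumn>FLUX_ERR</ex:errorColumn>
   </ex:Resource>

while a whole new third-level block would live entirely in its own
namespace.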

How all this might work I don't know - the discussion and trials need to
happen now - but we've got to dump the present way of representing metadata
in VOTable because it is a dead-end (with a 100m cliff beyond!).

Cheers,
Tony. 

> -----Original Message-----
> From: Pierre Didelon [mailto:pdidelon at cea.fr] 
> Sent: 19 May 2004 12:59
> To: Tony Linde
> Cc: 'Doug Tody'; registry at ivoa.net
> Subject: Re: Error column in VOResource -> metadata standardization
> 
> 
> 
> Tony Linde wrote:
> 
> > I completely agree, Doug. We should standardize on what we can agree
> > as a common standard - via the DM effort.
> Which will be materialised as a precise data structure in a specific
> format (XML or whatever else) with precisely defined ("semantic"?)
> content.
> > But any extensions should follow some
> > standard extension mechanism
> Why not, in this case, use a generic data structure (FITS header,
> VOTable header, or whatever else at the group's convenience), to allow
> generic look-up tools, visualisation and search, and all the generic
> behaviours/handling such a structure permits?
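> 
> For instance, such a generic block in a VOTable header could look like
> this (names and values invented for illustration):
> 
>    <GROUP name="providerKeywords">
>      <!-- generic name/value pairs, no standardized semantics -->
>      <PARAM name="DETNAM" datatype="char" arraysize="*" value="ACIS-S"/>
>      <PARAM name="EXPTIME" datatype="double" unit="s" value="19700.0"/>
>    </GROUP>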
> 
> > so that, as you say, they can at least be seen by users or included
> > and passed on by applications.
> A generic structure would certainly allow this.
> 
> So you can handle information/metadata as a three-level cake:
> - the first level having a precisely defined structure AND content
> (semantics?), which covers the most common and important part: fully
> searchable and directly operational for processing and analysis;
> - the second level having a defined structure but free content: less
> easily searchable or harvestable, but nevertheless searchable... You
> can always imagine and code a search looking for a certain value in a
> FITS keyword list (see the sketch after this list);
> - and even a third level with free structure and content, i.e. a
> proprietary format in the sense of a data-producer-specific format:
> this one non-searchable and perhaps not even visualisable, but it can
> be propagated to users, and if a user knows how to handle it, it could
> be useful to some of them.
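> 
> For example, a generic search over such a level-2 keyword block
> (expressed here as an XPath over the VOTable sketch above; purely
> illustrative) could be as simple as:
> 
>    //PARAM[@name='DETNAM' and @value='ACIS-S']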
> 
> You can even imagine a standard process to promote some information
> from level 2 to level 1: create a new piece of info (structure and
> content) at level 1, and process (in a generic way?) the level 2
> structure to feed the newly created level 1 piece of info.
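> 
> As a sketch (the level-1 element name is hypothetical), promotion
> might turn a generic level-2 keyword
> 
>    <PARAM name="ERRCOL" datatype="char" arraysize="*" value="FLUX_ERR"/>
> 
> into a precisely defined, schema-backed level-1 element
> 
>    <vs:errorColumn>FLUX_ERR</vs:errorColumn>
> 
> which generic code could then search and validate.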
> 
> Is this silly, already under discussion, or completely off topic?
> I must admit that I didn't follow the discussions on this list very
> closely, only keeping an eye on it, so forgive me any inappropriate
> intervention.
> Regards,
> Pierre
> >  
> >>-----Original Message-----
> >>From: owner-registry at eso.org [mailto:owner-registry at eso.org]
> >>On Behalf Of Doug Tody
> >>Sent: 18 May 2004 18:05
> >>To: Tony Linde
> >>Cc: registry at ivoa.net
> >>Subject: RE: Error column in VOResource
> >>
> >>We will never be able to standardize everything.  We will never even
> >>be able to know about all the telescopes, survey projects, etc.,
> >>being developed or underway around the world.  Even if we do know
> >>about a project it will be constantly changing.  All we can really
> >>hope to do is standardize the core, and define a standard framework
> >>for things like resource description, dataset characterization, data
> >>formatting, etc.
> >>
> >>People will use these standard mechanisms, try to adhere to the
> >>standard core, but will need to add nonstandard extensions to do new
> >>things, or to specialize the services, data model, or data packaging
> >>to fully describe their data.
> >>Sure, all applications will not be able to understand and deal with
> >>the extensions, but this is how new standards develop, and some
> >>subset of applications will really need those extensions to process
> >>certain classes of data, and will be written to do so.  So long as
> >>the service or dataset is compliant to some core model then all
> >>applications which support the core will work down to that level,
> >>ignoring the extensions.
> >>Even nonstandard extensions can be useful if packaged in a standard 
> >>way, e.g., a human can browse them to better understand the data, 
> >>generic searches can be performed, generic tools can be used in an 
> >>ad-hoc fashion, and so forth.
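> >>
> >>For instance (both namespaces hypothetical), a dataset description
> >>might carry
> >>
> >>   <core:Image xmlns:core="http://ivoa.example.org/core"
> >>               xmlns:xyz="http://xyz.example.org/ext">
> >>     <core:naxis>2</core:naxis>
> >>     <!-- nonstandard extension, packaged in a standard way -->
> >>     <xyz:eventFilter>TIME &gt; 52000</xyz:eventFilter>
> >>   </core:Image>
> >>
> >>and an application which supports only the core model simply ignores
> >>the xyz: elements while still processing the core ones.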
> >>
> >>Basically I am arguing that the standard VO framework should only
> >>try to go so far, but should be designed to be extensible.  If it
> >>tries to be all-inclusive it will be too complicated to be used, and
> >>will never work anyway.
> >>
> ...
> >>
> >>This is all true for static archive data products, e.g., precomputed
> >>survey images in an archive.  But what if we have, e.g., an image
> >>access service which generates images on the fly, e.g., image
> >>cutouts or mosaics?
> >>Or perhaps the service generates images on the fly from X-ray event
> >>data, applying a time filter in the process and generating the image
> >>with the desired celestial projection?
> >>SIA for example already supports all this.
> >>Basically what happens is the client application tells the service
> >>what it would ideally like to get back, the service decides what it
> >>can provide, and returns metadata for one or more virtual datasets
> >>which it can generate to satisfy the query.  The image is not
> >>actually generated until the access reference URL is invoked.
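> >>
> >>For instance (the endpoint is hypothetical), an SIA query such as
> >>
> >>   http://archive.example.org/sia?POS=180.0,1.5&SIZE=0.25&FORMAT=image/fits
> >>
> >>returns a VOTable describing the virtual images the service could
> >>generate; only when the access reference URL in a row of that table
> >>is dereferenced is the image actually created.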
> >>
> >>What we need the registry for is to tell us what services are out
> >>there, what they are capable of, and the characteristics of the data
> >>they can serve (specific data collections, bandpass, sky coverage,
> >>etc.).  We also need to register all data collections and be able to
> >>find services which can serve them up.  It could also be useful to
> >>register individual static datasets within a data collection,
> >>including caching dataset metadata of some type (at least that which
> >>uniformly characterizes the data at a high level).
> >>This would start to provide a replica management capability for
> >>managing large data collections.  One has to ask though, whether
> >>this is something which should be provided by the registry or by a
> >>separate replica management service.  If it gets complicated enough,
> >>it may be better to split it off as a separate service in order to
> >>avoid over-complicating the registry.
> >>
> >>Anyway, enough!  I have to get back to DAL stuff or I won't be ready
> >>for next week.
> >>
> >>	- Doug
> >>
> --
> Pierre
> ----------------------------------------------------------------------------
> DIDELON :@: pdidelon_at_cea.fr        Phone : 33 (0)1 69 08 58 89
> CEA SACLAY - Service d'Astrophysique  91191 Gif-Sur-Yvette Cedex
> ----------------------------------------------------------------------------
> 


