SED Data Model: Questions and Comments

Doug Tody dtody at nrao.edu
Wed Feb 16 10:59:26 PST 2005


I don't think we can expect older data external to the VO to be rewritten
to conform to a new standard - much of this is archival data which has
long since been frozen, published, replicated, software has been written
to deal with its peculiarities, etc.  Instead what we can do is mediate
the data to a common data model at access time in VO (in a DAL service).
You already have to do this for the metadata to write a valid SIA/SSA query
response.  In principle we should be doing this for the data itself too.
This is already the case for SSA; eventually, if we generate e.g. image
cutouts with SIA, the cutout datasets should conform to a standard image
data model and metadata as well.
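
As a minimal sketch of what such access-time mediation could look like
(the translation table and the field names below are illustrative
assumptions, not part of any SSA/SIA specification):

    # Map legacy unit spellings found in frozen archival files to the
    # standard form; examples only.
    NATIVE_TO_STANDARD = {
        "ANGSTROMS": "Angstrom",
        "ERGS/CM**2/S/A": "erg/s/cm**2/Angstrom",
    }

    def mediate_record(native_header):
        """Build a query-response record conforming to the common data
        model, without rewriting the frozen archival dataset."""
        unit = native_header.get("BUNIT", "").strip().upper()
        return {
            # standardized metadata for the SSA/SIA query response
            "FluxAxis.Unit": NATIVE_TO_STANDARD.get(unit, unit),
            # the original value is passed through unchanged
            "native_unit": native_header.get("BUNIT"),
        }

    print(mediate_record({"BUNIT": "ERGS/CM**2/S/A"}))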

In the case of units and SSA, either approach (FITS WCS Paper I units or
a dimensional equation) will work, since standard metadata has to be
returned in at least the query response, even if foreign data is passed
through.
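
For instance, a mediated query-response record could carry either (or
both) forms; the field names here are hypothetical:

    record = {
        "access_url": "http://archive.example.org/spec?id=42",
        "unit": "erg/s/cm**2/Angstrom",  # FITS WCS Paper I style string
        "dimeq": (1, -1, -3),            # dimensional equation: M L-1 T-3
        "scaleq": 1.0e7,                 # scale to SI: 1e7 W m-2 m-1
    }

 - Doug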



On Wed, 16 Feb 2005, David Berry wrote:

> Pedro,
>        I have a certain sympathy with your attitude towards handling
> legacy FITS files with all their diverse ad-hoc approaches to meta-data
> representation! The situation is somewhat similar to the handling of WCS.
> Many different schemes have been used in the past for representing WCS in
> FITS files. Now that there is a published standard, we are faced with the
> problem of what to do with all the non-conforming FITS files. The problem
> is similar to that of the use of non-standard units strings. In the case
> of WCS it looks like the solution is either to change all VO data to use
> the standard, or to use FITS interpreters that know how to interpret the
> common WCS variants (which is what happens at the moment with things like
> AST and WCSTOOLS). Of course you then have to define what you mean by
> "common"...
> 
> In the case of units, I'm just not sure that adding a dimensional analysis
> to every data set is any less work than correcting the units string of
> every data set. The process would presumably be:
> 
>    for every legacy VO data file
>       interpret the existing units string
>       create a corresponding dimensional analysis and add it to the file
>    next file
> 
> as opposed to:
> 
>    for every legacy VO data file
>       interpret the existing units string
>       create a corresponding standardised unit string and replace the
>          original units string in the file.
>    next file
> 
> The first doesn't seem any easier than the second. Or am I missing
> something?
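> 
> In runnable form (interpret_units and the record layout here are
> invented helpers for the sketch), the shared, hard first step is
> identical either way:
> 
>    # maps a legacy string to (standard name, dimensions, scale to SI)
>    LEGACY_UNITS = {"ANGSTROMS": ("Angstrom", "L", 1.0e-10)}
> 
>    def interpret_units(s):
>        return LEGACY_UNITS[s.strip().upper()]
> 
>    def add_dimensional_analysis(files):
>        for f in files:                  # option 1: annotate the file
>            _, dimeq, scaleq = interpret_units(f["unit"])
>            f["dimeq"], f["scaleq"] = dimeq, scaleq
> 
>    def standardise_unit_string(files):
>        for f in files:                  # option 2: rewrite the string
>            std, _, _ = interpret_units(f["unit"])
>            f["unit"] = std
> 
>    files = [{"unit": "ANGSTROMS"}]
>    add_dimensional_analysis(files)
>    # -> [{'unit': 'ANGSTROMS', 'dimeq': 'L', 'scaleq': 1e-10}]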
> 
> David
> 
> 
> 
> > [...]FITS WCS paper one suggests that unit strings should be
> > standardised[...]
> >
> > yes, and again the problem is that some data providers already have
> > their units written in other formats. Some of them appear as very old
> > "standard" names inside FITS files that will never be changed, just to
> > give an example.
> >
> > [...]So, given that some standardisation effort
> > > is necessary, and that data will presumably always include a human
> > > readable units string, why not standardise that string rather than
> > > introducing an additional dimensional analysis standard?[...]
> >
> > because the dimensional analysis standard consists of only one line,
> > whereas the units standard consists of many names. And I'm not asking
> > for removal of the string names, I'm asking for inclusion of dimensional
> > parameters.
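> >
> > As a minimal sketch (the exponent ordering and the names here are my
> > own choices, not a fixed standard, though the physical values are
> > standard):
> >
> >    # one line per unit: exponents of (M, L, T) plus a scale to SI
> >    UNITS = {
> >        "Jy":         ((1, 0, -2), 1.0e-26),  # 1 Jy = 1e-26 W m-2 Hz-1
> >        "W/m**2/Hz":  ((1, 0, -2), 1.0),
> >        "erg/s/cm**2/Angstrom": ((1, -1, -3), 1.0e7),  # 1e7 W m-2 m-1
> >    }
> >
> >    def convert(value, from_unit, to_unit):
> >        (d1, s1), (d2, s2) = UNITS[from_unit], UNITS[to_unit]
> >        if d1 != d2:
> >            raise ValueError("dimensionally incompatible units")
> >        return value * s1 / s2
> >
> >    print(convert(1.0, "Jy", "W/m**2/Hz"))   # 1e-26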
> >
> > For your interest, I was asked by the FITS community to send information
> > about this dimensional analysis thing, and I attach the answer back from
> > Greisen himself (one of the writers of the FITS WCS III paper). He
> > understands that the idea is nice and puts his reasons to not include it
> > in paper III (as he understood that was the proposal, which was
> > certainly not). Among them is the absence of rigorous formulation, and
> > that's the reason why we are writing something on it. Please see the
> > attached mail.
> >
> > [...]But it also introduces extra redundant meta-data, increasing data
> > size and complexity[...]
> >
> > as I say, there are just two dimensional parameters, and they are
> > normally the same for much of a provider's data. Not much overhead.
> >
> > [...]and requires more effort on the part of data providers (in
> > that they have to work out what the dimensional analysis and scale
> > factor are)[...]
> >
> > we can help people with this. On the other hand, data providers will
> > not have to change the units inside their files; they need only give
> > the correct dimeq-scaleq in the metadata. This would allow their data
> > (old as it might be) to play in the VO without modification, as
> > sketched below. Still, I believe it's worth the effort.
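> >
> > A sketch of that arrangement (the field names are invented for the
> > example): the frozen file keeps its original, possibly nonstandard
> > unit string, and the provider only publishes two extra values:
> >
> >    legacy_header = {"BUNIT": "ERGS/CM**2/S/A"}  # frozen, untouched
> >
> >    published_metadata = {
> >        "original_unit": legacy_header["BUNIT"],  # kept verbatim
> >        "dimeq": (1, -1, -3),  # M L-1 T-3: flux per unit wavelength
> >        "scaleq": 1.0e7,       # scale to SI: 1e7 W m-2 m-1
> >    }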
> >
> > Cheers,
> > P.
> >
> >
> > On Wed, 2005-02-16 at 12:51, David Berry wrote:
> > > Pedro,
> > >
> > > > parsing of strings is the traditional way to handle units, and we
> > > > believe there are more than enough examples of cases where units are
> > > > named wrongly, despite any effort to homogenise unit names (which
> > > > vary, by the way, from the FITS WCS Paper I conventions, to the A&A
> > > > recommended units (a la VizieR, I believe), to the CODATA ones, etc.).
> > >
> > > Sure, people need to abide by some standard language if communication is
> > > to be possible. FITS WCS paper one suggests that unit strings should be
> > > standardised, and you suggest that a dimensional analysis description should
> > > be standardised. Either way, data providers have to check that their data
> > > conforms with *something*. So, given that some standardisation effort
> > > is necessary, and that data will presumably always include a human
> > > readable units string, why not standardise that string rather than
> > > introducing an additional dimensional analysis standard?
> > >
> > > > However, we insist that for the superimposition of different
> > > > spectra in different units, the dimensional approach gives (even
> > > > algorithmically) a lot of benefits; see the sketch below.
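> > > >
> > > > A rough sketch of that benefit (the record layout is invented for
> > > > the example):
> > > >
> > > >    # reduce both spectra to SI via their (dimeq, scaleq) pairs;
> > > >    # no string parsing is needed at superimposition time
> > > >    spec_a = {"flux": [1.0, 2.0], "dimeq": (1, 0, -2),
> > > >              "scaleq": 1.0e-26}               # Jy
> > > >    spec_b = {"flux": [3.0e-26], "dimeq": (1, 0, -2),
> > > >              "scaleq": 1.0}                   # W m-2 Hz-1
> > > >
> > > >    def to_si(spec):
> > > >        return [v * spec["scaleq"] for v in spec["flux"]]
> > > >
> > > >    # compatibility check is one comparison of exponent tuples
> > > >    assert spec_a["dimeq"] == spec_b["dimeq"]
> > > >    overlay = to_si(spec_a) + to_si(spec_b)  # directly comparable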
> > >
> > > But it also introduces extra redundant meta-data, increasing data size and
> > > complexity, gives rise to the possibility of inconsistency within the
> > > meta-data, and requires more effort on the part of data providers (in
> > > that they have to work out what the dimensional analysis and scale
> > > factor are).
> > >
> > > David
> > --
> > Pedro Osuna Alcalaya
> >
> >
> > Software Engineer
> > Science Archive Team
> > European Space Astronomy Centre
> > (ESAC/ESA)
> > e-mail: Pedro.Osuna at esa.int
> > Tel + 34 91 8131314
> > ---------------------------------
> > European Space Astronomy Centre
> > European Space Agency
> > P.O. Box 50727
> > E-28080 Villafranca del Castillo
> > MADRID - SPAIN
> >
> 
> 


