SED Data Model: Questions and Comments

David Berry dsb at ast.man.ac.uk
Wed Feb 16 05:54:01 PST 2005


Pedro,
       I have a certain sympathy with your attitude towards handling
legacy FITS files with all their diverse ad-hoc approaches to meta-data
representation! The situation is somewhat similar to the handling of WCS.
Many different schemes have been used in the past for representing WCS in
FITS files. Now that there is a published standard, we are faced with the
problem of what to do with all the non-conforming FITS files. The problem
is similar to that of the use of non-standard units strings. In the case
of WCS it looks like the solution is either to change all VO data to use
the standard, or to use FITS interpreters that know how to interpret the
common WCS variants (which is what happens at the moment with things like
AST and WCSTOOLS). Of course you then have to define what you mean by
"common"...

In the case of units, I'm just not sure that adding a dimensional analysis
to every data set is any less work than correcting the units string of
every data set. The process would presumably be:

   for every legacy VO data file
      interpret the existing units string
      create a corresponding dimensional analysis and add it to the file
   next file

as opposed to:

   for every legacy VO data file
      interpret the existing units string
      create a corresponding standardised unit string and replace the
         original units string in the file.
   next file

The first doesn't seem any easier than the second. Or am I missing
something?
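
To make the comparison concrete, here is a minimal Python sketch of the two
loops above. It is only an illustration under stated assumptions: the
LEGACY_UNITS table, the UNITS/DIMS/SCALE keyword names and the function
names are all hypothetical, and a real interpreter would need far more than
a three-entry lookup.

```python
# Hypothetical table: legacy spelling -> (standard string, dimensions, SI scale).
# Entries and keyword names are illustrative assumptions, not from any standard.
LEGACY_UNITS = {
    "ANGSTROMS": ("Angstrom", {"L": 1}, 1e-10),
    "HERTZ": ("Hz", {"T": -1}, 1.0),
    "JANSKY": ("Jy", {"M": 1, "T": -2}, 1e-26),
}


def interpret(units_string):
    """Step common to both loops: make sense of the existing units string."""
    key = units_string.strip().upper()
    if key not in LEGACY_UNITS:
        raise ValueError(f"unrecognised units string: {units_string!r}")
    return LEGACY_UNITS[key]


def add_dimensional_analysis(header):
    """First loop body: keep the units string, add dimensions and scale."""
    _, dims, scale = interpret(header["UNITS"])
    return {**header, "DIMS": dims, "SCALE": scale}


def standardise_units(header):
    """Second loop body: replace the units string with the standard one."""
    standard, _, _ = interpret(header["UNITS"])
    return {**header, "UNITS": standard}


# Stand-in for "every legacy VO data file": each header is a plain dict here.
legacy_files = [{"UNITS": "angstroms"}, {"UNITS": "JANSKY"}]
with_dims = [add_dimensional_analysis(h) for h in legacy_files]
standardised = [standardise_units(h) for h in legacy_files]
```

In this sketch both loop bodies call interpret() first, and that shared
step is where the per-file effort sits; only the final write differs.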

David



> [...]FITS WCS paper one suggests that unit strings should be
> standardised[...]
>
> yes, and again the problem is that some data providers do already have
> their units written in other formats. Some of them are inside very old
> "standard" names inside FITS files that will never be changed, just to
> give an example.
>
> [...]So, given that some standardisation effort
> > is necessary, and that data will presumably always include a human
> > readable units string, why not standardise that string rather than
> > introducing an additional dimensional analysis standard?[...]
>
> because the dimensional analysis standard consists of only one line,
> whereas the units standard consists of many names. And I'm not asking
> for removal of the string names, I'm asking for inclusion of dimensional
> parameters.
>
> For your interest, I was asked by the FITS community to send information
> about this dimensional analysis thing, and I attach the answer back from
> Greisen himself (one of the writers of the FITS WCS III paper). He
> understands that the idea is nice and gives his reasons for not including it
> in paper III (as he understood that was the proposal, which was
> certainly not). Among them is the absence of rigorous formulation, and
> that's the reason why we are writing something on it. Please see the
> attached mail.
>
> [...]But it also introduces extra redundant meta-data, increasing data
> size and complexity[...]
>
> as I say, the dimensional parameters are just two, normally the same for
> many of the providers' data. Not much overhead.
>
> [...]and requires more effort on the part of data providers (in
> that they have to work out what the dimensional analysis and scale
> factor are)[...]
>
> we can help people on this. On the other hand, data providers will not
> have to change their units inside their files, but just give the correct
> dimeq-scaleq in the metadata. This would allow their data (though old as
> they might be) to be able to play in the VO without having to modify
> them. Still, I believe it's worth the effort.
>
> Cheers,
> P.
>
>
> On Wed, 2005-02-16 at 12:51, David Berry wrote:
> > Pedro,
> >
> > > parsing of strings is the traditional way to handle units, and we
> > > believe there are examples more than enough of cases where units are
> > > named wrongly, despite any effort to homogenise unit names (which vary,
> > > by the way, sometimes from FITS WCS paper I to A&A recommended units
> > > conventions (a la Vizier, I believe), to CODATA ones, etc.).
> >
> > Sure, people need to abide by some standard language if communication is
> > to be possible. FITS WCS paper one suggests that unit strings should be
> > standardised, and you suggest that dimensional analysis description should
> > be standardised. Either way, data providers have to check that their data
> > conforms with *something*. So, given that some standardisation effort
> > is necessary, and that data will presumably always include a human
> > readable units string, why not standardise that string rather than
> > introducing an additional dimensional analysis standard?
> >
> > > However, we insist that for superimposition of different
> > > spectra in different units, the dimensional approach gives -even
> > > algorithmically- a lot of benefits.
> >
> > But it also introduces extra redundant meta-data, increasing data size and
> > complexity, gives rise to the possibility of inconsistency within the
> > meta-data, and requires more effort on the part of data providers (in
> > that they have to work out what the dimensional analysis and scale
> > factor are).
> >
> > David
> --
> Pedro Osuna Alcalaya
>
>
> Software Engineer
> Science Archive Team
> European Space Astronomy Centre
> (ESAC/ESA)
> e-mail: Pedro.Osuna at esa.int
> Tel + 34 91 8131314
> ---------------------------------
> European Space Astronomy Centre
> European Space Agency
> P.O. Box 50727
> E-28080 Villafranca del Castillo
> MADRID - SPAIN
>


