VOunits draft

Francois Ochsenbein francois at vizier.u-strasbg.fr
Sun May 24 16:11:56 PDT 2009


>
>Hi Anita,
>
>> There are two issues here.  Firstly, we need to recognise unit
>> strings attached to published data.  My impression is that rather a
>> lot of SI prefixes are in common use, from milli to Tera for Hz for
>> example.
>
>One imagines all the SI prefixes are used between one set of measures
>or another.  But a matrix with base units (eg, meters) on one axis and
>prefixes on the other would be sparsely filled in real world usage.
>Even if the matrix were jam packed, however, there aren't so many
>prefixes that it wouldn't be perfectly practical to compile a flat
>list of a few thousand entries.
>
>This is especially true since the abbreviations generating the
>corresponding unit labels are quite idiosyncratic.  It is that usage
>we're trying to recognize and replicate, not to popularize a new
>scheme of nomenclature.
>
>Users can always provide or applications require the full names of
>units - millimeter instead of mm.  For that matter, what do you intend
>to do with instances like cc as an alias for a milliliter?  If you're
>not careful, the VO will take over the entire bailiwick of the
>Chemical Rubber Company :-)

I think I completely disagree with you on this point, Rob --
either we accept SI or we refuse it. Taking a part of it and
replacing a set of about a dozen symbols plus a dozen
prefixes, all well-defined and internationally accepted, with
an enumerated list of your own is nonsense. I don't see
the usefulness of adding 'cc' as a synonym of 'cm3', at least
for the data we expect to exchange between the data
providers and the VO applications. Of course if one application
prefers to write 'cc' instead of 'cm3', it must have absolute
freedom to do so -- but what we are talking about is the set
of units which have to be understood by _any_ VO application.

>> Secondly, one of the things which came up from the initial attempt
>> to get use cases was that users often need data in relatively short
>> floating point numbers with SI prefixes rather than with huge (or
>> tiny) exponents, since labelling axes on a plot or tabulating
>> results as 9.87 to 345.6 nJy is often much more convenient, and
>> intuitive for the human reader to visualise, than 9.87e-9 to
>> 3.456e-7 or 0.00000000987 to 0.0000003456 Jy.
>
>But what is the use case here?  Are we talking about generating plots
>labeled in nJy from some table that contains a column with units of
>nJy?  Or will VO compliance require that some user who desires nJy has
>to load a table in units of Jy with extremely small values just to
>create a plot rescaled to nJy?

Again, the final user must have the right to choose whatever
unit he would like to plot along the axes -- horsepower per
acre and Megacycle if he likes. Such choices do not imply
that the VO standard has to understand such units.
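
The rescaling discussed above (Jy to nJy for axis labels) can be done
entirely on the display side. A minimal sketch, assuming a hypothetical
helper named rescale and a small illustrative prefix table (neither is
part of the VOUnits draft):

```python
import math

# Illustrative subset of SI prefixes, keyed by power of ten (assumption:
# 'u' stands in for micro, and the range is limited to pico..tera).
PREFIXES = {-12: "p", -9: "n", -6: "u", -3: "m", 0: "",
            3: "k", 6: "M", 9: "G", 12: "T"}


def rescale(value, unit):
    """Return (scaled_value, prefixed_unit) with |scaled_value| in [1, 1000)
    whenever the required prefix is in the table above."""
    if value == 0:
        return 0.0, unit
    # Largest multiple of 3 not exceeding log10(|value|).
    exp3 = 3 * math.floor(math.log10(abs(value)) / 3)
    # Clamp to the prefixes this sketch knows about.
    exp3 = max(-12, min(12, exp3))
    return value / 10.0 ** exp3, PREFIXES[exp3] + unit


if __name__ == "__main__":
    for flux in (9.87e-9, 3.456e-7):
        scaled, label = rescale(flux, "Jy")
        print(f"{scaled:.4g} {label}")
```

The stored data stay in Jy; only the label and the plotted numbers change.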

>> Handling data internally using SI prefixes also helps to avoid
>> possible loss of precision - see my previous rant - although really
>> that should be fixed by making all tools format numbers sensibly,
>> but you often don't find out until you try and pass nJy through a
>> package written when 100 mJy was the depths of sensitivity...
>
>Indeed, but this seems a numerical computing issue, not a
>representation issue.  There is no reason that such scaling has to be
>quantized to powers of ten.
>
>> We need to make sure that we recognise SI prefixes to avoid Mcm, etc.
>
>And the most reliable way to recognize correct usage is to have a
>vetted vocabulary rather than generating it on the fly.
>
>> 'Decibels' does illustrate the point made by Paddy Leahy I think,
>> that we should be able to parse the whole prefix (deci) as well as
>> the abbreviation.  And if for a few units we have to have special
>> rules like 'don't convert to centibels', that is no big deal.
>
>What it suggests to me is that the rules are too complex (and possibly
>too expensive) to implement at runtime.  Rather, the goal should be to
>parse an expression against a static list of all viable combinations
>of prefix and base unit.  That will be hard enough to get correct.

Again I basically disagree here. These rules are simple and not
difficult to implement at run-time. Computers are excellent at such
operations: they can deal with all viable combinations faster, and
with 100% reliability.
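
A minimal sketch of such run-time parsing, assuming a small illustrative
subset of prefixes and base units. The tables, the special-case rule for
the bel, and the function name parse_unit are hypothetical and are not
taken from the VOUnits draft:

```python
# Illustrative subset only (assumption); 'u' stands in for micro.
SI_PREFIXES = {
    "da": 1e1, "h": 1e2, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12,
    "d": 1e-1, "c": 1e-2, "m": 1e-3, "u": 1e-6, "n": 1e-9, "p": 1e-12,
}
BASE_UNITS = {"m", "g", "s", "Hz", "Jy", "J", "W", "pc", "B"}

# Special rules of the "don't convert to centibels" kind:
# for these base units only the listed prefixes are accepted.
ALLOWED_PREFIXES = {"B": {"d"}}


def parse_unit(symbol):
    """Split a unit symbol into (scale_factor, base_unit).

    Raises ValueError for anything that is not a known base unit or a
    single known prefix followed by a known base unit (e.g. 'Mcm').
    """
    # An unprefixed base unit wins, so 'm' is metre and 'pc' is parsec.
    if symbol in BASE_UNITS:
        return 1.0, symbol
    for prefix, scale in SI_PREFIXES.items():
        if not symbol.startswith(prefix):
            continue
        base = symbol[len(prefix):]
        if base not in BASE_UNITS:
            continue
        allowed = ALLOWED_PREFIXES.get(base)
        if allowed is not None and prefix not in allowed:
            continue
        return scale, base
    raise ValueError(f"unrecognised unit symbol: {symbol!r}")


if __name__ == "__main__":
    for s in ("nJy", "THz", "mm", "dB"):
        print(s, parse_unit(s))
```

With these tables 'nJy', 'THz', 'mm' and 'dB' are accepted, while 'Mcm'
and 'cB' raise ValueError, all without enumerating the combinations.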

-- Francois
=======================================================================
Francois Ochsenbein    ------   Observatoire Astronomique de Strasbourg
   11, rue de l'Universite 67000 STRASBOURG  Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr (France)    Fax: +33-(0)390 24 24 17
=======================================================================


