VOUnits RFC

Frederic V. Hessman Hessman at Astro.physik.Uni-Goettingen.DE
Mon Jul 29 01:31:36 PDT 2013


On 29 Jul 2013, at 09:41, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:

> On Fri, Jul 26, 2013 at 12:12:34PM +0200, Frederic V. Hessman wrote:
>> On 26 Jul 2013, at 11:25, Norman Gray <norman at astro.gla.ac.uk> wrote:
>>> On 2013 Jul 26, at 07:38, Markus Demleitner <msdemlei at ari.uni-heidelberg.de> wrote:
>>> 
>>>> On Thu, Jul 25, 2013 at 05:54:40PM +0100, Norman Gray wrote:
>>>>> opposed to _quantity_): we restrict such scaling factors to round
>>>>> powers of ten, and in any case expect these to be rather rare.
>>>> 
>>>> For the record: I am still convinced that that is a restriction we
>>>> are going to regret -- allowing arbitrary factors is a small price to
>>>> pay for not having to touch data that has, say, "Jupiter mass" as
>>>> units while still having IVOA-valid unit strings.  It's *much* nicer to
>>>> have to commit to a choice for Jupiter mass, say, only in the
>>>> metadata rather than to bake that choice into the data itself.
>>> 
>>> That's an argument I hadn't thought of.  It's certainly a
>>> hygienically consistent position (so I stand beside you on that),
>>> but from a practical point of view, I suspect that data providers
>>> really do want to head columns 'jupMass', and I'm not sure it's
>>> up to us to say they oughtn't.
> 
> "head columns" in the sense of "naming" them?  They'd be welcome to
> do that.  Have jupMass in the unit string?  I'm pretty sure we should
> talk them out of that.  I see basically two purposes of defining a
> grammar for machine readable unit strings:
> 
> (1) let clients bring values in data retrieved from multiple sources
> to common units without user intervention
> 
> (2) let clients present query forms (or similar artifacts) to the
> user in units convenient to them and convert to whatever the service
> expects the values in on the fly.
> 
> For both, we just *have* to restrict what can be in the unit
> attribute.  I feel a bit silly to state this -- am I missing
> something?

This is a question of philosophy:  is VOUnits intended just for machine-machine interaction or human-machine/machine-human interaction as well?

Of course the protocol has to be restricted, but only to make things as simple and as useful as possible - we don't need semantic models that are useless because they are unmanageable, even if they are nominally elegant.

Your user has a table with data having the units "mas" and "M_Jupiter": do you help her get this data into or out of the VO universe so that it can be processed further, or not?  I'd say you do (this is the whole point of the VO) unless the task is so complicated that it's unmanageable.  This doesn't sound like too complicated a task if it is given a manageable and standardized framework:

	- always keep track of standard units if you do anything with the data - here are the standard units ...;
	- always be able to parse complex units in terms of standard units - here are the rules ...;
	- always be prepared to get a scale factor with your units - here are the rules ...;
	- if you don't understand a unit, ask someone who does - here is someone who knows ...;
	- if possible, keep track of unit metadata - you may not need it, but someone else down the road may.
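
As a rough illustration of the rules above, here is a minimal Python sketch of a resolver that tries its local table first, asks someone else only for labels it doesn't know, and keeps the original label as metadata.  Everything in it is an assumption made for illustration: the names resolve_unit, LOCAL_UNITS and stub_lookup are hypothetical, the local table is a toy, and the stub stands in for whatever lookup service might exist.  Only the Jupiter mass value is the one quoted in this thread.

```python
from typing import Callable, Optional, Tuple

# (scale factor, SI unit) pair; a hypothetical representation.
Conversion = Tuple[float, str]

# Toy local table of units the parser already understands.
LOCAL_UNITS = {
    "kg": (1.0, "kg"),
    "g": (1e-3, "kg"),
    "m": (1.0, "m"),
    "km": (1e3, "m"),
}


def resolve_unit(
    label: str,
    ask: Optional[Callable[[str], Optional[Conversion]]] = None,
) -> Tuple[float, str, str]:
    """Return (scale, si_unit, original_label) for a unit label.

    The local table is tried first; `ask` is consulted only for
    labels the local table does not know.  The original label is
    returned unchanged so the unit metadata is not lost.
    """
    key = label.strip()
    conversion = LOCAL_UNITS.get(key)
    if conversion is None and ask is not None:
        conversion = ask(key)
    if conversion is None:
        raise LookupError(f"unknown unit: {label!r}")
    scale, si_unit = conversion
    return scale, si_unit, label


def stub_lookup(label: str) -> Optional[Conversion]:
    """Stand-in for 'someone who knows'; case-insensitive by assumption."""
    known = {"m_jupiter": (1.89813e27, "kg")}
    return known.get(label.lower())


print(resolve_unit("km"))
print(resolve_unit("M_Jupiter", ask=stub_lookup))
```

The point of returning the original label alongside the SI equivalent is the last rule in the list: the tool can compute with kilograms while still remembering that the column was given in Jupiter masses.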

> Anyway, allowing arbitrary scale factors goes a long way to allow the
> description of existing tables for very little implementation effort.
> So -- why not just pick that kind of low-hanging fruit?

Because arbitrary scale factors have lost their semantic meanings.   Managed scale factors keep such meanings.

>> Indeed, so a set of simple rules (e.g. multiple "/" are OK, since
>> they are easily parsed and are actually more robust) and a
>> vocabulary of common non-standard units is all we need.
>> 
>> <skos:Concept rdf:about="vou:units#jupiterMass">
>> 	<skos:prefLabel>M_jupiter</skos:prefLabel>
>> 	<skos:definition>1.89813e27 kg</skos:definition>
>> 	<skos:altLabel>jupMass</skos:altLabel>
>> 	<skos:altLabel>Mjup</skos:altLabel>
>> 	<skos:altLabel>M_jup</skos:altLabel>
>> 	<skos:altLabel xml:lang="en">jupiter mass</skos:altLabel>
>> 	<skos:altLabel xml:lang="en">jupiter masses</skos:altLabel>
>> 	<skos:related rdf:resource="iau93:#jupiter"/>
>> 	<skos:scopeNote xml:lang="en">Case is not important.</skos:scopeNote>
>> </skos:Concept>
>> 
>> That way, your favourite unit parser could always simply ask
>> vou:units for help.  Maybe I'm slightly misusing skos:definition,
>> but it works just fine.
> 
> While I'd like such a resource, building basic VOUnits mechanism on
> it opens a whole new can of worms -- who's going to maintain it?

Content?  The IVOA semantics workgroup could, of course, unless someone comes up with a better candidate.  The number of actively used non-standard astronomical units is modest.  We have lots of other resources, so why not a units-lookup resource?  We come up with a suggested units conversion table and find someone to host a simple service.
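
To show how little is involved, here is a hedged Python sketch that turns a SKOS record like the jupiterMass concept quoted earlier into a case-insensitive conversion table.  It is a sketch under assumptions, not a proposal for the actual format: the rdf:RDF wrapper and its namespace declarations are added here so the fragment parses, the function name build_table is hypothetical, and the definition is assumed to always have the form "<number> <SI unit>".

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The concept from this thread, wrapped so that it is well-formed XML.
RECORD = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
<skos:Concept rdf:about="vou:units#jupiterMass">
  <skos:prefLabel>M_jupiter</skos:prefLabel>
  <skos:definition>1.89813e27 kg</skos:definition>
  <skos:altLabel>jupMass</skos:altLabel>
  <skos:altLabel>Mjup</skos:altLabel>
  <skos:altLabel>M_jup</skos:altLabel>
  <skos:altLabel xml:lang="en">jupiter mass</skos:altLabel>
  <skos:altLabel xml:lang="en">jupiter masses</skos:altLabel>
</skos:Concept>
</rdf:RDF>"""


def build_table(xml_text):
    """Map every label (lower-cased) to (scale, si_unit, concept_uri)."""
    table = {}
    root = ET.fromstring(xml_text)
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        definition = concept.findtext(f"{{{SKOS}}}definition")
        # Assumption: the definition reads "<number> <SI unit>".
        factor, si_unit = definition.split(None, 1)
        labels = [e.text for e in concept.findall(f"{{{SKOS}}}prefLabel")]
        labels += [e.text for e in concept.findall(f"{{{SKOS}}}altLabel")]
        for label in labels:
            # The scope note says case is not important.
            table[label.lower()] = (float(factor), si_unit, uri)
    return table


table = build_table(RECORD)
print(table["mjup"])
```

A hosted service would only need to serve lookups against such a table; the SKOS itself stays on the server side.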

> How?  Do we really want to require unit parsers to have to understand
> skos, and to have network access for parsing?  To, in all likelihood,
> be able to query the Registry for how to retrieve the resource?  To,
> in this way, depend in their behaviour on such a resource that can
> and will change over time?

No: your local units parser does the best it can using the standard (SI) units or other local favorites, and if it encounters something bizarre, it asks.

In principle this is not very different from what happens when you type "Jupiter Mass" into Google: you get "mass of Jupiter = 1.89813 x 10^27 kilograms".  Our service simply says "1.89813e27 kg" and gives a link to further semantic info if one is interested.  Your tool doesn't have to be interested, but it is very happy to have been given the SI equivalent using a conversion factor that is the same across all VO-compatible tools.  If your tool kept track of the link to further semantic info, it could compare this unit with other units using the link alone to see whether they are the same.
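
A small Python sketch of that last point, again under stated assumptions: the ResolvedUnit class and same_unit function are hypothetical names, the concept identifier is the one from the SKOS record quoted earlier, and the second scale factor below is a made-up value used only to show what happens when two providers bake slightly different numbers into bare scale factors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolvedUnit:
    scale: float                   # factor to the SI unit
    si_unit: str                   # e.g. "kg"
    concept: Optional[str] = None  # link to further semantic info, if kept


def same_unit(a: ResolvedUnit, b: ResolvedUnit) -> bool:
    """Compare by concept link when both units carry one.

    Without links, all that is left is the numeric scale factor,
    so two different adopted values look like two different units.
    """
    if a.concept is not None and b.concept is not None:
        return a.concept == b.concept
    return a.si_unit == b.si_unit and a.scale == b.scale


# Two tools that kept the link agree, whatever label the data used.
managed_a = ResolvedUnit(1.89813e27, "kg", "vou:units#jupiterMass")
managed_b = ResolvedUnit(1.89813e27, "kg", "vou:units#jupiterMass")

# Two bare scale factors; the second value is invented for illustration.
bare_a = ResolvedUnit(1.89813e27, "kg")
bare_b = ResolvedUnit(1.8986e27, "kg")

print(same_unit(managed_a, managed_b))
print(same_unit(bare_a, bare_b))
```

With the links, the comparison never has to look at the numbers at all; with bare scale factors, it has nothing else to look at.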

> To me, this seems a high price to pay to solve a problem 80% of which
> is solved by allowing arbitrary scale factors.  The remaining 20%
> (telling the user that 1.89813e27 kg really is meant to mean "mass of
> Jupiter assumed here") are interesting, true, but IMHO it's fine if this
> kind of -- human-oriented -- information is in the human-oriented
> pieces of metadata, i.e., the column name and its description.

No, we shouldn't give up the units metadata without a fight, because that information, once gone, is gone forever (well, until your software asks a human in a pop-up window).

Rick


