SODA, section 4.3

Wed Nov 16 19:55:27 CET 2016

For the interval case specifically: the value is technically an
array of length 2, but the semantics of an interval is a set of values
[a,b]... so Markus has convinced me that the normal usage of MIN/MAX
makes sense for 1-d intervals (instead of just a MAX interval).
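
As a rough illustration of that reading (a minimal sketch, not taken
from any existing VOTable library; the function names are made up for
this example), a client could check a requested interval against scalar
MIN/MAX limits such as the 3e-7 and 8e-7 in the BAND example quoted
below:

```python
def parse_double(text):
    # Scalar parser for datatype="double"; float() is used here as a
    # stand-in for a real VOTable value parser (an assumption).
    return float(text)


def interval_allowed(interval, min_text, max_text):
    """Return True if the interval [lo, hi] lies within [MIN, MAX].

    Both endpoints are bounded below by MIN and above by MAX, which is
    the "normal" scalar meaning of MIN/MAX applied to a 1-d interval.
    """
    lo, hi = interval
    lower = parse_double(min_text)
    upper = parse_double(max_text)
    return lower <= lo <= hi <= upper


if __name__ == "__main__":
    print(interval_allowed((4e-7, 5e-7), "3e-7", "8e-7"))
    print(interval_allowed((2e-7, 5e-7), "3e-7", "8e-7"))
```

The point of the sketch is only that nothing beyond the datatype is
needed to interpret the two limit strings.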

For circle and polygon the normal meaning of MIN/MAX doesn't really
work because there are 2 dimensions. So here we have to (ab)use the
meaning of MAX and the rather hazy constraints on what can go into
VALUES in order to convey param metadata.
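
To make the contrast concrete, here is a hedged sketch of the
datatype-only get_value_parser(datatype) that Markus describes in the
quoted message below. The registry, the parse_limit helper, and the use
of Python's built-in converters are assumptions for illustration and
cover only a few datatypes. A scalar limit parses fine, while a
multi-number literal such as "3e-7 8e-7" cannot be read without
knowing more than the datatype:

```python
# Hypothetical registry mapping a few VOTable datatypes to scalar
# parsers; a real implementation would cover every datatype.
_PARSERS = {
    "double": float,
    "int": int,
    "char": str,
}


def get_value_parser(datatype):
    # Depends on the datatype only: no arraysize, xtype, unit or utype.
    return _PARSERS[datatype]


def parse_limit(datatype, value):
    """Parse a MIN/MAX @value as one scalar; return None if it is not one."""
    try:
        return get_value_parser(datatype)(value)
    except ValueError:
        return None


if __name__ == "__main__":
    print(parse_limit("double", "3e-7"))
    print(parse_limit("double", "3e-7 8e-7"))
```

With this signature the second call has no way to succeed, which is
the sense in which array-valued MAX literals need special handling.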

Pat

On 16 November 2016 at 09:11, François Bonnarel
<francois.bonnarel at astro.unistra.fr> wrote:
> Hi all,
>
> Hum, hum...
>
>      I think the meaning (or absence of meaning) of MIN/MAX for an array of
> data is dependent on the xtype (if not on the utype) anyway.
>      a ) If we have arraysize=1 MIN/MAX is unambiguous
>      b ) If we have arraysize >= 2 then MIN/MAX may be the limit values for
> this array of numbers, but only if that makes sense. If we have a FIELD/PARAM
> containing RA and DEC, we probably have an xtype or utype to give a hint of
> the semantics of this FIELD, and we understand that MIN/MAX have no meaning.
>      c ) If arraysize > 2 and the numbers are of the same "nature" MIN/MAX
> makes sense but doesn't really encode an interval, just extrema
>      d ) if arraysize = 2 and the 2 numbers are of the same nature then
> MIN/MAX and interval limits are probably synonymous.
>    So I think whatever solution we choose, we have to recognize that this
> solution is xtype-dependent anyway.
>
>     Is this only an implementation conflict?
> Cheers
> François
>
> Le 11/11/2016 à 09:34, Markus Demleitner a écrit :
>>
>> Hi Pat,
>>
>> On Thu, Nov 10, 2016 at 08:56:22AM -0800, Patrick Dowler wrote:
>>>
>>> Comments inline...
>>>
>>> On 10 November 2016 at 01:56, Markus Demleitner
>>> <msdemlei at ari.uni-heidelberg.de> wrote:
>>>>
>>>> Hi Pat,
>>>>
>>>> On Wed, Nov 09, 2016 at 12:17:18PM -0800, Patrick Dowler wrote:
>>>>>
>>>>> The reason intervals are treated that way is consistency.  My reading
>>>>
>>>> Well, we're going to be inconsistent anyway -- there's no way to
>>>> reconcile our ugly hacks on CIRCLE and POLYGON with the semantics
>>>> intended by VOTable in (not only, I claim) my reading.
>>>
>>> Well, the VOTable xsd says that value in MIN/MAX is any string, so
>>> interpretation is completely up to the parser or application. I
>>> don't see any reason that  any of the attributes of the enclosing
>>> PARAM should be ignored when interpreting it.  I think it is clear
>>> they are intended to be used, and not just datatype. We are just
>>> taking this to the logical conclusion when  an xtype is specified.
>>
>> The reason to ignore them is very simple: Implementation sanity.  I
>> don't need to know the unit, or the utype, or indeed anything except
>> datatype to interpret MIN and MAX.  I retrieve the value parser
>> associated with the datatype, parse @value, and can immediately see
>> whether any value for a field or param is or is not ok.  Of course,
>> the same goes for VALUES/@null -- it would suck if the rules for
>> MAX/@value were different from the ones for VALUES/@null.
>>
>> If a VOTable processor needed to hedge against special rules for
>> whatever other attributes FIELD or PARAM might have, the
>> implementation becomes a nightmare.  No, if you want to have
>> something that *necessitates* a custom value parser, then you must
>> define a datatype, not an xtype.
>>
>> Put a bit more mathematically, the question is: what is the signature
>> of a get_value_parser function in VOTable?  Do we want it to be
>>
>> get_value_parser(datatype) -> callable
>>
>> or should it be
>>
>> get_value_parser(datatype, arraysize, xtype, ...) -> callable.
>>
>> I argue that the first is not only much more desirable but actually
>> completely sufficient. Of course, things break down for the
>> array-abusing CIRCLE and POLYGON; but that's because the modelling
>> for them is wrong in the first place; it should be cleaned up when we
>> have a good model for spherical regions.
>>
>> Conversely, saying
>>
>>      <PARAM name="BAND" unit="m" ucd="em.wl"
>>        datatype="double" arraysize="2"
>>        xtype="interval" value="">
>>        <DESCRIPTION>The wavelength intervals to be extracted</DESCRIPTION>
>>        <VALUES>
>>          <MIN value="3e-7"/>
>>          <MAX value="8e-7"/>
>>        </VALUES>
>>      </PARAM>
>>
>> is *at least* as expressive as when you have your
>> <MAX value="3e-7 8e-7"/>, I'd argue a lot more intuitive, and in
>> particular it's consistent with the other conceivable uses, both
>> within SODA metadata declaration and in normal VOTables.  I'm
>> mentioning
>>
>>      <PARAM name="POL" ucd="meta.code;phys.polarization"
>>        datatype="char" arraysize="*" value="">
>>        <DESCRIPTION>Polarization states to be extracted.</DESCRIPTION>
>>        <VALUES>
>>          <OPTION>I</OPTION>
>>          <OPTION>V</OPTION>
>>        </VALUES>
>>      </PARAM>
>>
>> -- you certainly wouldn't want <OPTION>I V</OPTION> here, even though
>> there's arraysize="*", right?
>>
>> And of course it's consistent with, say
>>
>>      <PARAM name="ATTENUATION"
>>        datatype="double" value="">
>>        <DESCRIPTION>A factor to dampen everything with</DESCRIPTION>
>>        <VALUES>
>>          <MIN value="1e-10"/>
>>          <MAX value="1"/>
>>        </VALUES>
>>      </PARAM>
>>
>> or other scalar parameters or table rows.
>>
>> Finally, I'd argue that <MAX value="3e-7 8e-7"/> is positively
>> confusing; even if one buys that you'll have one value per array
>> element. The (IMHO plausible) guess that array element 0 is bounded
>> by 3e-7 and array element 1 is bounded by 8e-7 is, of course, wrong.
>>
>> *Both* are bounded by 3e-7 downwards and by 8e-7 upwards.  That's why
>> an array is an acceptable representation (it's homogeneous), and
>> confusing that fact is something we'll regret later.
>>
>>>> Hm... no.  Admittedly, VOTable is a bit hazy here, which is why we
>>>> *might* just get away with what we do to VALUES for CIRCLE and
>>>> POLYGON. But even talking about minimum and maximum really precludes
>>>> using array literals (as they are not orderable preserving
>>>> arithmetic).  Language like "The domain may therefore be defined as a
>>>> single interval" (VOTable 1.3, p. 16) reinforces this notion.
>>>
>>> That was undoubtedly written before xtype was introduced in VOTable-1.2
>>> so I'd suggest that the full implications of xtype were not apparent.
>>
>> Well, perhaps, but as argued above at least *I* don't think xtypes
>> should have any implication on MIN/MAX, and that there actually are
>> no implications of xtype for them.  And hence I'm severely unhappy
>> to, by gentlemen's agreement, simply re-interpret the standard
>> language when I really see no good reason to.
>>
>>> Still, if we do this with circle and polygon then we can do it with
>>> interval, and I think that means xtype usage dictates interpreting values
>>> in MAX. VOTable-2.0?
>>
>> My opinion is, again, that we (really) shouldn't be doing it for
>> circle and polygon either.  Until the rest of the VO can tell us how
>> to sanely do geometries, we dare do an emergency hack here and plead
>> forgiveness from the VOTable implementors.
>>
>> Interval modelling with arrays and xtypes, on the other hand, is
>> sane, and we can confidently say: Dear VOTable crowd, thanks for
>> providing us with the facilities to properly model what we need.
>>
>> My bottom line, I guess, is: We should not complicate standards in
>> order to accommodate emergency hacks.
>>
>>            -- Markus

-- 
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada