SODA, section 4.3

Wed Nov 16 18:11:25 CET 2016

Hi all,

Hum, hum...

      I think the meaning (or absence of meaning) of MIN/MAX for an
array of data depends on the xtype (if not on the utype) anyway.
      a ) If we have arraysize=1 MIN/MAX is unambiguous
      b ) If we have arraysize >= 2, then MIN/MAX may be the limit
values for this array of numbers, but only if that makes sense. If we
have a FIELD/PARAM containing RA and DEC, we probably have an xtype or
utype to give a hint of the semantics of this FIELD, and we understand
that MIN/MAX have no meaning.
      c ) If arraysize > 2 and the numbers are of the same "nature",
MIN/MAX make sense but do not really encode an interval, just extrema
      d ) If arraysize = 2 and the 2 numbers are of the same nature, then
MIN/MAX and interval limits are probably synonymous.
    So I think that whatever solution we choose, we have to accept that
it is xtype-dependent anyway.
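
    To make the case distinction a) to d) above concrete, here is a
minimal sketch in Python. The function name and the same_nature flag
are hypothetical: same_nature stands for whatever a client would derive
from xtype/utype, and arraysize is assumed to be a fixed integer (no
"*").

```python
def minmax_meaning(arraysize, same_nature):
    """Classify what MIN/MAX can mean for a FIELD/PARAM.

    arraysize   -- number of elements, 1 for a scalar (assumed fixed)
    same_nature -- hypothetical flag: True when xtype/utype say that all
                   elements measure the same quantity (e.g. an interval),
                   False for mixed content such as an RA/DEC pair
    """
    if arraysize == 1:
        return "unambiguous"        # case a
    if not same_nature:
        return "no meaning"         # case b, e.g. RA and DEC
    if arraysize == 2:
        return "interval limits"    # case d
    return "extrema only"           # case c
```

    Whatever the exact rules end up being, the second argument is where
the xtype dependence enters.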

     Is this only an implementation conflict?

Cheers
François

On 11/11/2016 at 09:34, Markus Demleitner wrote:
> Hi Pat,
>
> On Thu, Nov 10, 2016 at 08:56:22AM -0800, Patrick Dowler wrote:
>> Comments inline...
>>
>> On 10 November 2016 at 01:56, Markus Demleitner
>> <msdemlei at ari.uni-heidelberg.de> wrote:
>>> Hi Pat,
>>>
>>> On Wed, Nov 09, 2016 at 12:17:18PM -0800, Patrick Dowler wrote:
>>>> The reason intervals are treated that way is consistency.  My reading
>>> Well, we're going to be inconsistent anyway -- there's no way to
>>> reconcile our ugly hacks on CIRCLE and POLYGON with the semantics
>>> intended by VOTable in (not only, I claim) my reading.
>> Well, the VOTable xsd says that value in MIN/MAX is any string, so
>> interpretation is completely up to the parser or application. I
>> don't see any reason that  any of the attributes of the enclosing
>> PARAM should be ignored when interpreting it.  I think it is clear
>> they are intended to be used, and not just datatype. We are just
>> taking this to the logical conclusion when  an xtype is specified.
> The reason to ignore them is very simple: Implementation sanity.  I
> don't need to know the unit, or the utype, or indeed anything except
> datatype to interpret MIN and MAX.  I retrieve the value parser
> associated with the datatype, parse @value, and can immediately see
> whether any value for a field or param is or is not ok.  Of course,
> the same goes for VALUES/@null -- it would suck if the rules for
> MAX/@value were different from the ones for VALUES/@null.
>
> If a VOTable processor needed to hedge against special rules for
> whatever other attributes FIELD or PARAM might have, the
> implementation becomes a nightmare.  No, if you want to have
> something that *necessitates* a custom value parser, then you must
> define a datatype, not an xtype.
>
> Put a bit more mathematically, the question is: what is the signature
> of a get_value_parser function in VOTable?  Do we want it to be
>
> get_value_parser(datatype) -> callable
>
> or should it be
>
> get_value_parser(datatype, arraysize, xtype, ...) -> callable.
>
> I argue that the first is not only much more desirable but actually
> completely sufficient. Of course, things break down for the
> array-abusing CIRCLE and POLYGON; but that's because the modelling
> for them is wrong in the first place; it should be cleaned up when we
> have a good model for spherical regions.
>
> Conversely, saying
>
>      <PARAM name="BAND" unit="m" ucd="em.wl"
>        datatype="double" arraysize="2"
>        xtype="interval" value="">
>        <DESCRIPTION>The wavelength intervals to be extracted</DESCRIPTION>
>        <VALUES>
>          <MIN value="3e-7"/>
>          <MAX value="8e-7"/>
>        </VALUES>
>      </PARAM>
>
> is *at least* as expressive as when you have your
> <MAX value="3e-7 8e-7"/>, I'd argue a lot more intuitive, and in
> particular it's consistent with the other conceivable uses, both
> within SODA metadata declaration and in normal VOTables.  I'm
> mentioning
>
>      <PARAM name="POL" ucd="meta.code;phys.polarization"
>        datatype="char" arraysize="*" value="">
>        <DESCRIPTION>Polarization states to be extracted.</DESCRIPTION>
>        <VALUES>
>          <OPTION>I</OPTION>
>          <OPTION>V</OPTION>
>        </VALUES>
>      </PARAM>
>
> -- you certainly wouldn't want <OPTION>I V</OPTION> here, even though
> there's arraysize="*", right?
>
> And of course it's consistent with, say
>
>      <PARAM name="ATTENUATION"
>        datatype="double" value="">
>        <DESCRIPTION>A factor to dampen everything with</DESCRIPTION>
>        <VALUES>
>          <MIN value="1e-10"/>
>          <MAX value="1"/>
>        </VALUES>
>      </PARAM>
>
> or other scalar parameters or table rows.
>
> Finally, I'd argue that <MAX value="3e-7 8e-7"/> is positively
> confusing; even if one buys that you'll have one value per array
> element. The (IMHO plausible) guess that array element 0 is bounded
> by 3e-7 and array element 1 is bounded by 8e-7 is, of course, wrong.
>
> *Both* are bounded by 3e-7 downwards and by 8e-7 upwards.  That's why
> an array is an acceptable representation (it's homogeneous), and
> confusing that fact is something we'll regret later.
>
>>> Hm... no.  Admittedly, VOTable is a bit hazy here, which is why we
>>> *might* just get away with what we do to VALUES for CIRCLE and
>>> POLYGON. But even talking about minimum and maximum really precludes
>>> using array literals (as they are not orderable preserving
>>> arithmetic).  Language like "The domain may therefore be defined as a
>>> single interval" (VOTable 1.3, p. 16) reinforces this notion.
>> That was undoubtedly written before xtype was introduced in VOTable-1.2
>> so I'd suggest that the full implications of xtype were not apparent.
> Well, perhaps, but as argued above at least *I* don't think xtypes
> should have any implication on MIN/MAX, and that there actually are
> no implications of xtype for them.  And hence I'm severely unhappy
> to simply re-interpret the standard language by gentlemen's
> agreement when I really see no good reason to.
>
>> Still, if we do this with circle and polygon then we can do it with
>> interval, and I think that means xtype usage dictates interpreting
>> values in MAX. VOTable-2.0?
> My opinion is, again, that we (really) shouldn't be doing it for
> circle and polygon either.  Until the rest of the VO can tell us how
> to sanely do geometries, we dare do an emergency hack here and plead
> forgiveness from the VOTable implementors.
>
> Interval modelling with arrays and xtypes, on the other hand, is
> sane, and we can confidently say: Dear VOTable crowd, thanks for
> providing us with the facilities to properly model what we need.
>
> My bottom line, I guess, is: We should not complicate standards in
> order to accommodate emergency hacks.
>
>            -- Markus
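
As a rough illustration of the first signature argued for in the quoted
message, the following Python sketch looks up a value parser from the
datatype alone and checks every element of an array literal against the
same scalar MIN/MAX. The helper in_domain and the small parser table are
assumptions made for this example, not part of any VOTable library, and
the bounds are treated as inclusive.

```python
def get_value_parser(datatype):
    """Return a callable parsing one literal of the given datatype."""
    # Only a few numeric datatypes are covered in this sketch.
    parsers = {
        "double": float,
        "float": float,
        "short": int,
        "int": int,
        "long": int,
    }
    return parsers[datatype]


def in_domain(datatype, literal, min_value=None, max_value=None):
    """Check a scalar or array literal against scalar MIN/MAX @value.

    Every whitespace-separated element must lie within the same
    (inclusive) bounds; arraysize and xtype are never consulted.
    """
    parse = get_value_parser(datatype)
    lower = parse(min_value) if min_value is not None else None
    upper = parse(max_value) if max_value is not None else None
    for token in literal.split():
        element = parse(token)
        if lower is not None and element < lower:
            return False
        if upper is not None and element > upper:
            return False
    return True
```

With the BAND example from the quoted message,
in_domain("double", "4e-7 5e-7", "3e-7", "8e-7") returns True, while
in_domain("double", "2e-7 5e-7", "3e-7", "8e-7") returns False because
the first element falls below MIN.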