SED data model v0.92
Ed Shaya
edward.j.shaya.1 at gsfc.nasa.gov
Tue Nov 23 06:29:55 PST 2004
Gilles DUVERT wrote:
>
>
> Ed Shaya wrote:
>
>>
>> You want the lower error bar to reach from the upper limit to the
>> x-axis.
>>
> I think, in view of your objection, that I would prefer upper limits
> to have a different status than measurements. Besides, I'm ill at ease
> thinking of measurements in geometrical terms (or, rather, in "plot+axis"
> terms); this looks too linked to our day-to-day external
> representation of data (a mathematician would perhaps read "measure"
> as a "norm" for topology).
>
Although it makes the model more complicated, it is probably worth
having an UpperLimitPoint and a LowerLimitPoint which contain only the
limit, to be used ONLY when the measured value is not known. Each could
have an optional NSigma attribute telling how many sigma the limit
corresponds to, with a default of 3. This way one does not confuse the
upper limit magnitude with the true error value or the true value.
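As an illustration only, here is a minimal sketch of what such limit
points might look like. The names (UpperLimitPoint, LowerLimitPoint,
NSigma) follow the ones used above, but the structure is my assumption,
not the actual SED data model:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class UpperLimitPoint:
        """Hypothetical sketch: holds only an upper limit, no measured value."""
        limit: float               # the upper-limit value itself
        n_sigma: float = 3.0       # how many sigma the limit corresponds to (default 3)
        unit: Optional[str] = None

    @dataclass
    class LowerLimitPoint:
        """Hypothetical sketch: holds only a lower limit, no measured value."""
        limit: float
        n_sigma: float = 3.0
        unit: Optional[str] = None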
>>> The risk would be, if any *measured* value is set, that it is taken
>>> at face value, when only the noise level makes sense. Could you use
>>> some kind of blanking value for the measured value in this? (Or is
>>> there a general concept of upper limit that would go in the
>>> Quantity::Accuracy data model?)
>>>
>> All I need say here is that if 4 independent experiments come up
>> with 0.9 sigma detections of some measurement, it would be
>> awful if they each published only the upper limits of the value.
>
>
> Here you suppose that the value "detected at 0.9 sigma" exists, or,
> rather, that I can pinpoint it in the graph, and put nice big
> errorbars on it for each of the 4 measurements. What I say is that
> this value does not exist until it is measured, and it is measured
> only when it is not an upper limit.
Are we having some subtle argument about when a measurement is a
measurement and when a detection is a detection?
If I measure 9 photons in a pixel but the sky noise is 10 photons, we
keep track of those 9 photons by placing the value in the point, just
like any other measured value.
With just this measurement, it is not a detection of anything except
those 9 photons. But if I go back through the archives and find that
this location has been studied 3 times before, perhaps with different
filters, and each time it shows a roughly 1 sigma positive measurement,
then I would say that an object is detected at roughly 4 sigma and that
my observation was just a 0.9 sigma detection. Perhaps you would phrase
this differently.
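To make the combination step concrete, here is a minimal sketch of how
independent archived measurements with quoted errors could be averaged.
It assumes independent Gaussian errors and uses the standard
inverse-variance weighting; the numbers are purely illustrative. The
point is that this is only possible if the individual values and errors
are published rather than replaced by upper limits:

    import math

    def combine_measurements(values, sigmas):
        """Inverse-variance weighted mean of independent Gaussian measurements.

        Returns (combined value, combined 1-sigma error).
        """
        weights = [1.0 / s**2 for s in sigmas]
        mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        err = math.sqrt(1.0 / sum(weights))
        return mean, err

    # Example: four low-significance measurements of the same quantity.
    values = [0.9, 1.1, 0.8, 1.0]   # arbitrary illustrative numbers
    sigmas = [1.0, 1.0, 1.0, 1.0]
    mean, err = combine_measurements(values, sigmas)
    print(mean, err, mean / err)    # combined significance exceeds any single one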
But I hope that you are not supporting the idea that the USUAL
treatment for a measurement of less than 3 sigma should be to
drop the value and just quote the upper limit.
Surely you would not support blanking out all of the pixels in an image
or spectrum where the signal is below 3 times the sky noise and
replacing them with upper limit values? Or would you? Would we then need
to reprocess all existing data archives to treat upper limits this way?
If 4 major physics laboratories come up with 1 sigma detections of the
mass of the neutrino at about 10 eV, should each publish just upper limits?
>
> Of the 4 groups coming up with this "0.9 sigma detection", the 3 last
> in time are morons: they were not able to devise an experiment with
> less uncertainty than the 1st pioneering group. Shall we continue to
> support them financially? ;^).
That is a political statement.
> Besides, since the experiments are not the same, and you want to get a
> measurement at the end by averaging values+errors, you have to prove
> that errors and "measure" in those different experimental setups can
> be averaged. Unless the 4 experiments are just 4 realizations of the
> 1st measurement, and, bingo, upper limits are still upper limits in
> this case....
>
We certainly need a system so that if the same measurement shows up in
several contexts, we and our applications can easily recognize it. That
is the point of putting IDs on observations.
Beyond that, it is the job of a distributed data system to be able to
pull together similar data entities for processing. Isn't the idea of
the VO to let us do analyses across multiple data sets and data
archives that are beyond our present capabilities?
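As a rough illustration of why the IDs matter here, the sketch below
groups records pulled from different archives by a shared observation
ID; the record layout and field names are assumptions for the example,
not part of the SED model:

    from collections import defaultdict

    # Hypothetical records from different archives; 'obs_id' is the shared ID.
    records = [
        {"obs_id": "ivo://example/obs/42", "archive": "A", "flux": 0.9},
        {"obs_id": "ivo://example/obs/42", "archive": "B", "flux": 0.9},
        {"obs_id": "ivo://example/obs/77", "archive": "A", "flux": 2.1},
    ]

    by_id = defaultdict(list)
    for rec in records:
        by_id[rec["obs_id"]].append(rec)

    # The same measurement is recognized even though it came from two archives.
    duplicates = {k: v for k, v in by_id.items() if len(v) > 1}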
> Fortunately, for people bold enough to claim (and they are numerous!)
> that their 0.9 sigma measurement (was really an upper limit) _is_ a
> measurement, they can use the normal value+error scheme.
This should be the normal mode, and our applications should be upgraded
to recognize measurements below 3 sigma, plot them in yellow, and
properly draw the upper limit symbol.
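Purely as an illustration of this plotting behavior (not an existing
application feature), here is a small matplotlib sketch that keeps the
measured values, colors points below 3 sigma yellow, and overdraws an
upper-limit arrow at value + 3*sigma; the threshold and symbol choices
are my assumptions:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.array([1.0, 2.0, 3.0, 4.0])
    flux = np.array([5.0, 0.9, 4.2, 1.1])     # measured values (kept even when faint)
    sigma = np.array([0.5, 1.0, 0.6, 1.0])    # 1-sigma errors

    low = flux < 3 * sigma                    # points below 3 sigma

    fig, ax = plt.subplots()
    # Ordinary detections: plain error bars.
    ax.errorbar(x[~low], flux[~low], yerr=sigma[~low], fmt="o", color="black")
    # Low-significance points: value still plotted (in yellow), plus a 3-sigma
    # upper-limit arrow so nobody mistakes them for firm detections.
    ax.errorbar(x[low], flux[low], yerr=sigma[low], fmt="o", color="gold")
    ax.errorbar(x[low], flux[low] + 3 * sigma[low], yerr=sigma[low],
                uplims=True, fmt="none", color="gold")
    plt.show()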
>
>> The possibility that some moron may not look at the quoted noise
>> levels before coming to some silly conclusion on a measurement does
>> not compensate for missing out on potentially important real
>> discoveries that properly archived data makes possible.
>
>
> a) The possibility of "some moron..." is huge.
Oh yes! This is a tradeoff between a bunch of people wasting some time
and losing precious, perhaps irretrievable, information.
> b) I would not place too much faith in discoveries (real, important)
> based on a sum of invalid measurements...
>
This is the crux of it. To me, a less-than-3-sigma measurement is a
valid measurement. It tells me the probability of something being
there, so that I can properly assess the risk of going deeper in that
direction. If it is an invalid measurement, then I don't want it
published at all, anywhere.
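For what it is worth, here is a small sketch of the kind of probability
such a value conveys, under the simple assumption of Gaussian noise:
the chance that noise alone would reach, say, a 0.9 sigma excess.

    from scipy.stats import norm

    sigma_level = 0.9
    # One-sided probability of noise alone producing at least this excess,
    # assuming Gaussian errors (an assumption for illustration).
    p_noise = norm.sf(sigma_level)   # survival function, ~0.18
    print(f"{p_noise:.3f}")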
>
> Best,
> Gilles
>
Cheers,
Ed