<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Dear Mireille, Sebastien, IVOA Semantics group and colleagues,<br>
<br>
Thank you very much for giving my question so much thought. Your
proposed new words are clearly useful for my use case. My comments
are:<br>
<br>
The proposed new word stat.percentile is clearly a good idea.<br>
<br>
The proposed new words stat.percentile.1sigma (and 2sigma and
3sigma) are also useful (and something I had not thought about
myself), as they provide more information about which percentile is
meant. Your scheme of adding either stat.min or stat.max, as in<br>
stat.percentile.1sigma;stat.min<br>
stat.percentile.1sigma;stat.max<br>
works, but I am not sure it's the most satisfying solution. As far
as I can see, one would never use stat.percentile.1sigma without
adding either stat.min or stat.max, so I would instead create
separate words for the percentiles below and above the median,
e.g.<br>
stat.percentile.lower1sigma<br>
stat.percentile.upper1sigma<br>
And similarly for 2sigma and 3sigma. I am not sure what the best
wording would be. If you want to use more characters, one could
insert the word "median", as in "1sigmabelowmedian". (And instead
of lower/upper one could use below/above.) One could also have
another level (stat.percentile.1sigma.lower), which could be more
readable.<br>
<br>
I want to note that e.g. the 16% percentile is only guaranteed to
be located 1 standard deviation ("sigma") below the median (and
mean) for a normal distribution, whereas for asymmetric
distributions that would not be the case. (Disclaimer: I am not a
statistics expert.) It should therefore be understood that
these new UCD words can be applied to the percentiles that in a
normal distribution would correspond to 1, 2 or 3 sigma below/above
the median, but which in the concrete case may not have that
property.<br>
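<br>
To illustrate this caveat, here is a minimal sketch in Python (my own illustration, assuming Python 3.8 or later for statistics.NormalDist). It uses a log-normal distribution, chosen only as an example of an asymmetric distribution, and the helper names are hypothetical. It compares the percentile that would lie 1 sigma below the median in a normal distribution with the median minus one standard deviation:<br>

```python
import math
from statistics import NormalDist

# Fraction of a normal distribution lying below -1 sigma (about 0.1587).
q_lower = NormalDist().cdf(-1.0)


def lognormal_quantile(q, mu=0.0, sigma=1.0):
    """Quantile q of exp(Z), where Z is normal with mean mu and std sigma."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(q))


def lognormal_std(mu=0.0, sigma=1.0):
    """Standard deviation of the same log-normal distribution."""
    return math.sqrt((math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2))


median = lognormal_quantile(0.5)
lower = lognormal_quantile(q_lower)
std = lognormal_std()

print(f"median                  = {median:.4f}")
print(f"percentile at {100 * q_lower:.2f}%   = {lower:.4f}")
print(f"median - 1 std          = {median - std:.4f}")
```

For this example distribution the roughly 16% percentile is exp(-1), about 0.37, whereas the median minus one standard deviation is negative, so the two quantities clearly differ.<br>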
<br>
I think that the 1sigma/2sigma/3sigma naming is fine. If you
instead wanted to have the actual numbers, a problem is the dot in
e.g. 2.5%. Instead of per cent one could use per mille. I have
looked up what the percentiles (in per mille!) are for a normal
distribution for -3,-2,-1,+1,+2,+3 sigma:<br>
1.3499<br>
22.75013<br>
158.655255<br>
841.344745<br>
977.24987<br>
998.6501<br>
So one could create the words<br>
stat.percentile.1permille<br>
stat.percentile.23permille<br>
stat.percentile.159permille<br>
stat.percentile.841permille<br>
stat.percentile.977permille<br>
stat.percentile.999permille<br>
But I am not sure it is more elegant. (I also note that my
catalogue, which was not created by me, has e.g. the 2.5% percentile
and not 2.3%, which would be the logical choice.)<br>
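<br>
For reference, the per mille values above can be reproduced with a few lines of Python. This is only a sketch, assuming Python 3.8 or later for statistics.NormalDist; the function name permille_below is my own and not part of any UCD tool:<br>

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def permille_below(k_sigma):
    """Per mille of a normal distribution lying below k_sigma standard deviations."""
    return 1000.0 * STANDARD_NORMAL.cdf(k_sigma)


for k in (-3, -2, -1, 1, 2, 3):
    value = permille_below(k)
    # The rounded value is the number that would go into the proposed word.
    print(f"{k:+d} sigma: {value:.4f} per mille, stat.percentile.{round(value)}permille")

# Position of the 2.5% percentile in units of sigma, for a normal distribution.
sigma_of_2p5_percent = STANDARD_NORMAL.inv_cdf(0.025)
print(f"2.5% percentile lies at {sigma_of_2p5_percent:.3f} sigma")
```

The rounded values give the numbers 1, 23, 159, 841, 977 and 999 used in the words listed above. The last line shows that, for a normal distribution, the 2.5% percentile corresponds to about -1.96 sigma rather than exactly -2 sigma.<br>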
<br>
I would like to use the proposed new UCD words (either exactly as
you wrote them, or in a modified version based on my suggestions
above) in my catalogues for publication in ESO's Phase 3. How long
would it take for the new words to be approved? I suppose they
need to be approved before ESO can accept them. I can say that we
found a small problem with one column in the catalogue, so the
final version will probably not be ready for another 1-2 weeks, as
the main author is finishing his PhD thesis these days.<br>
<br>
Kind regards, Bo<br>
</p>
<div class="moz-cite-prefix">On 3/17/22 12:39 PM, Mireille LOUYS
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:2503619f-1909-036c-310a-ca68ecafbe1a@unistra.fr">
<br>
Hi Bo, hi Semantics,<br>
<br>
We have re-examined your use case together with S. Derriere and A.
Preite Martinez and also checked how VizieR handles percentiles.<br>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">There is indeed currently no proper
way to describe with UCDs that a measurement is associated with
some percentile<br>
of a statistical model/distribution.</div>
<div class="moz-cite-prefix">Creating a new word could help
describe these values : <br>
<pre>Q stat.percentile Percentile in a statistical distribution</pre>
</div>
<div class="moz-cite-prefix">We could also have a few more precise
words to address exactly what you are trying to describe :</div>
<div class="moz-cite-prefix">
<pre>Q stat.percentile.1sigma Percentile corresponding to one standard deviation from the median</pre>
</div>
<div class="moz-cite-prefix">
<div class="moz-cite-prefix">
<pre>Q stat.percentile.2sigma Percentile corresponding to two standard deviations from the median</pre>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">With these words, we could use :<br>
<pre style="margin-left: 2em; font-family: monospace; caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: -webkit-left; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;">ucd="src.redshift.phot;stat.percentile.2sigma;stat.min" for EAZY 2.5% percentile of photo-z
</pre>
<pre style="margin-left: 2em; font-family: monospace; caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: -webkit-left; text-indent: 0px; text-transform: none; widows: auto; word-spacing: 0px; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; text-decoration: none;">ucd="src.redshift.phot;stat.percentile.1sigma;stat.min" for EAZY 16% percentile of photo-z AND LePhare photo-z lower limit, 68% conf. level
ucd="src.redshift.phot;stat.median" for EAZY 50% percentile of photo-z
ucd="src.redshift.phot;stat.percentile.1sigma;stat.max" for EAZY 84% percentile of photo-z AND LePhare photo-z upper limit, 68% conf. level
ucd="src.redshift.phot;stat.percentile.2sigma;stat.max" for EAZY 97.5% percentile of photo-z
</pre>
</div>
<div class="moz-cite-prefix">In the UCD vocabulary, maybe an
extra word would cover all possible cases :</div>
<div class="moz-cite-prefix">
<pre>Q stat.percentile.3sigma Percentile corresponding to three standard deviations from the median
</pre>
I hope this helps.<br>
I have created a VEP-UCD for this term, and will circulate it
in the UCD Board to discuss it for adoption.<br>
<br>
Tell us whether you can use this, and send us your feedback if you have any.
<br>
Thanks in advance.<br>
<br>
Mireille & Sebastien <br>
CDS, Strasbourg <br>
----------------<br>
</div>
<div class="moz-cite-prefix">
<hr width="100%" size="2"></div>
</div>
</blockquote>
</body>
</html>