New UCD1+ for SIA protocol

Arnold Rots arots at head.cfa.harvard.edu
Thu Apr 1 07:20:52 PST 2004


I was asked to write something about the SCORE parameter for SIAP:

SCORE
This parameter is intended to aid the user in selecting an image in
cases where a SIAP query would return multiple images.  This is
especially intended for non-expert users and for users who have no
need, for their present purposes, to check through the metadata
provided in other parameters.
The criteria for determining the SCORE are entirely up to the service
provider, since different considerations come into play for different
archives.  Some hints are provided below.  We implicitly trust the
providers to choose the best algorithm for the majority of the users,
based on the fact that they know their data properties better than
anybody else.

Output
SCORE is a floating point value that ranks returned images according
to their relevance for the query as perceived by the server.  This is
meant to aid the client (especially the non-specialist client) in
choosing from a list of images in case more than one image satisfies
the query criteria.  The scale is always relative and only meaningful
in the context of the result in which it is provided.  The highest
number represents the "best" image available that satisfies the
query. There is no specified range.  It may measure things like
exposure time, image quality, proximity to the specified position,
resolution, etc.

Input
SCORE is a string and may assume two values:
SCORE=TOP
If multiple images satisfy the query criteria, only the one with the
highest value of SCORE (in output) is returned.
SCORE=ALL
(default) All images satisfying the query criteria are returned.
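
As a rough sketch of the input semantics above (not taken from any SIAP
specification; the function name apply_score_param and the record layout
with a "score" key are hypothetical), a service might filter its matches
along these lines:

```python
def apply_score_param(matches, score_param="ALL"):
    """Filter query matches according to the SCORE input parameter.

    `matches` is a list of dicts, each with a float under the
    (hypothetical) key "score". SCORE=ALL is the default.
    """
    if score_param == "ALL":
        # Return every image that satisfies the query criteria.
        return list(matches)
    if score_param == "TOP":
        # Return only the highest-scoring image, if there is one.
        if not matches:
            return []
        return [max(matches, key=lambda m: m["score"])]
    raise ValueError(f"unsupported SCORE value: {score_param!r}")
```

How ties and unrecognized values are handled is not specified in the text
above; here the first of several tied images wins and an unknown value
raises an error, both of which are assumptions.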


This issue is important for archives of pointed observations where
dozens of images may satisfy a single query.  Expert users may want to
see the metadata on all of them and choose, but less sophisticated
users and more general services (the issue came up in some of our
prototypes/demos as a real problem!) are more likely to get annoyed and
say: just get me what you think is the best one.  And that's what this
is about.

Note that SCORE is not necessarily a measure of data quality: a 1000 s
exposure of high-quality data may still score lower than a 100000 s
exposure of somewhat poorer quality.  Nor does it say anything about
how well each image satisfies the query: all returned images are
expected to match the query's criteria.

The question really is: among all these matches, how well do we think
the user will like them, or how well will they fit the user's purposes?
I would like to emphasize two things:
1. The scoring scale should be relative and only have meaning within a
particular response list.  I.e., you cannot necessarily compare the
scores that come from two different queries and deduce any meaningful
conclusion.  The scoring may not even be linear.
2. The user is free to like or to dislike, to trust or to distrust the
scores that are returned.  If the user does not like the scores, (s)he
should look at the actual metadata returned with each image in the
response.


Designing a scoring algorithm takes some thought and will be very
mission/observatory specific.  To illustrate this let me list the
considerations that went into the Chandra archive's algorithm:
 - Exposure time; clearly, longer exposures produce higher-quality
images
 - Instrument: ACIS-I/S, HRC-I/S; this is based on the different
sensitivities and spectral responses
 - Co-alignment with requested position; Chandra's PSF quickly gets
worse off-axis
 - Image resolution; we have two canned images at different
resolutions and with different FOV.

One might consider adding other factors, such as seeing/aspect
quality.  It is likely that we will continue to fine-tune the
algorithm.
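
To make the idea concrete, here is a minimal sketch of how such factors
could be folded into one number.  This is NOT the Chandra archive's
algorithm: the function name score_image, the functional forms, and the
two caller-supplied weights are all assumptions made for illustration.

```python
import math

def score_image(exposure_s, instrument_weight, off_axis_arcmin,
                resolution_weight=1.0):
    """Combine the listed considerations into one relative score.

    All terms are illustrative assumptions:
    - exposure_s: exposure time in seconds (longer scores higher)
    - instrument_weight: caller-chosen weight for the instrument
    - off_axis_arcmin: distance of the requested position from the
      optical axis (larger scores lower)
    - resolution_weight: caller-chosen weight for the canned image
    """
    # Logarithmic so that very long exposures do not swamp everything.
    exposure_term = math.log10(max(exposure_s, 1.0))
    # Falls from 1 on-axis towards 0 far off-axis.
    off_axis_term = 1.0 / (1.0 + off_axis_arcmin ** 2)
    return (exposure_term * instrument_weight
            * off_axis_term * resolution_weight)
```

The result is only meant to order the images within a single response;
nothing about its absolute value or linearity should be relied upon.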

I don't think it is particularly useful to publish the algorithm or
its description.  Frankly, I suspect that those who rely on SCORE
don't really care and that those who care want to inspect the metadata
to make their own decision.


There is no UCD that fits SCORE: it is a context-dependent quantity
and, besides, it has a different datatype on input and output.


  - Arnold

--------------------------------------------------------------------------
Arnold H. Rots                                Chandra X-ray Science Center
Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
60 Garden Street, MS 67                              fax:  +1 617 495 7356
Cambridge, MA 02138                             arots at head.cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------


