SIA Evolution Proposal

Pedro Osuna posuna at iso.vilspa.esa.es
Fri Apr 16 03:06:03 PDT 2004


Dear Alberto, dear all

thanks for your thorough comments on our note. Please find below our
response to your concerns. We hope it helps a bit.


The main goal of our proposal is to be able to add structure to the
results of a SIAP query.

In our view, the CDS proposal using the current DTD allows for the
inclusion of several "parallel" <RESOURCE> elements with metadata
descriptions, but in the end it does not solve the "structurization"
problem, as the results remain essentially flat. Parallel (i.e., same
level) <RESOURCE>s cannot provide structure inside the results. The CDS
proposal gives a <RESOURCE> of type "Results" and several others at the
same level. This is because, to include real data inside the resource
(i.e., to allow for structure), the resource's TABLE _should_ be allowed
to contain another RESOURCE (or another TABLE, as we propose).
This, however, cannot currently be done, as -according to the current
DTD- neither the TABLE element nor the FIELD element allows for either
TABLEs or RESOURCEs.


We would therefore have two options to solve the problem:
- modify the DTD to allow RESOURCEs inside TABLEs
- modify the DTD to allow TABLEs inside TABLEs

We dislike the first option, as we believe it would introduce
redundancy: a resource carries a lot of header information and
descriptions, and the same resource header would appear in every row.

The second option, which we proposed, is to enable the inclusion
of tables inside tables. This option can be summarized
as ONE "results" <RESOURCE> whose <TABLE> may contain nested tables.
The nested tables contain the minimal information needed for the
representation, and the scheme would be backwards compatible with the
current SIA protocol if the client only took the first level (i.e. a
flat table). (On the other hand, additional metadata information could
be added in other resources, following the CDS approach.)
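
To make the backwards-compatibility argument concrete, here is a minimal
sketch in Python. It is not VOTable syntax and not part of the proposal
itself: the in-memory model, the column names and the function
first_level() are all assumptions made for illustration. A cell is
either a plain value or a list of rows (a nested table), and a legacy
client keeps only the plain values.

```python
# Hypothetical in-memory model of a nested result table (not VOTable syntax).
# A cell holding a list of rows stands for a nested table.
results = [
    {"obs_id": "0011020", "image": "obs1 image",
     "filters": [{"filter": "A", "image": "obs1 Filter A image"},
                 {"filter": "B", "image": "obs1 Filter B image"}]},
    {"obs_id": "0011021", "image": "obs2 image",
     "filters": [{"filter": "A", "image": "obs2 Filter A image"},
                 {"filter": "B", "image": "obs2 Filter B image"}]},
]


def first_level(rows):
    """Return a flat copy of the rows, dropping every nested-table cell."""
    return [
        {name: cell for name, cell in row.items() if not isinstance(cell, list)}
        for row in rows
    ]


# What a client that only understands flat tables would see.
flat = first_level(results)
```

A nesting-aware client would instead descend into the "filters" cells;
the first-level view is unchanged either way.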

You said:
[...]
The example by Pedro and Jesus shows a single record; but suppose that
multiple records (say 4) are returned, and suppose that 2 observations
were taken in energy band A and 2 observations were taken in energy band
B. The same Energy_Band_Table will be seen in two different records for
both the A filter and the B cases.
[...]

Even when one filter appears in more than one "observation", the
energy band table is not the same one. The energy band table contained
in each observation row is the table holding the images taken through
the different filters for that observation. As the image for filter A
for obs 0011020 is different from the image for filter A for obs
0011021, there is no redundancy. As we said, the description of filter A
could appear in a separate filters resource, but not in the results
resource itself.

Following your example, and using a tree representation and a "pseudo
xml/votable format" representation:
<results type resource>
  <observation table>
    <tr>
      <td>obs1 Image</td>
      <td>
        <filter table for obs1>
          <tr> obs1 Filter A image </tr>
          <tr> obs1 Filter B image </tr>
        </filter table for obs1>
      </td>
    </tr>

    <tr>
      <td>obs2 Image</td>
      <td>
        <filter table for obs2>
          <tr> obs2 Filter A image </tr>
          <tr> obs2 Filter B image </tr>
        </filter table for obs2>
      </td>
    </tr>
  </observation table>
</results type resource>

(i.e., something like:)
results _______ obs 1 _______ Filter A
         |            |______ Filter B
         |_____ obs 2 _______ Filter A
                      |______ Filter B
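
The same tree can be written down as a small Python sketch to check the
claim about redundancy. The dictionary layout and the helper leaves()
are hypothetical and only mirror the example above: filter names repeat
across observations, but each image appears exactly once.

```python
# Hypothetical nested result: observation -> filter name -> image.
results = {
    "obs1": {"A": "obs1 Filter A image", "B": "obs1 Filter B image"},
    "obs2": {"A": "obs2 Filter A image", "B": "obs2 Filter B image"},
}


def leaves(tree):
    """Yield (observation, filter, image) for every leaf of the tree."""
    for obs, filters in tree.items():
        for band, image in filters.items():
            yield obs, band, image


images = [image for _, _, image in leaves(results)]
# Filter names repeat across observations, but no image is listed twice.
no_redundancy = len(images) == len(set(images))
```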

As you can see, there is no inherent redundancy in this approach.
Redundancy might still appear in the client display, which raises the
following question:

How does the client "group" (and this is the key word, we believe) the
information coming from different projects, if we do not know _what_
each of the resource tables inside it is describing?

The answer to this is in the UCDs. Every table field should
be described by a proper UCD. In the case of our energy bands, we ought
to find a proper UCD that would describe the _type_ of information
contained in that table (i.e., in the "sub-structure" of our SIAP
result). The clients should then, in our view, group things by the
_type_ of quantity described by their UCD, rather than by their name
(which would imply _what_ they are instead of what _type_ they are).

By doing this, clients could group results in a hierarchy different from
the original one in the Data Provider data model. This would pose no
problem, as -in the end- the image is what matters for SIAP, and
whether it sits at the first level or the third in the original
hierarchy is of no importance to a user who just wants to
compare things of the same type.
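
Grouping by UCD rather than by name or position could look roughly like
the following Python sketch. Everything in it is an assumption made for
illustration: the table model, the function group_by_ucd(), the file
names, and in particular the UCD strings ("IMAGE_REF", "FILTER_NAME",
"OBS_ID"), which are placeholders and not actual UCD vocabulary. Two
providers expose images at different depths, and the client still ends
up with all the images in one group.

```python
from collections import defaultdict


def group_by_ucd(table, groups):
    """Collect cell values by the UCD of their field, at any nesting depth."""
    for row in table["rows"]:
        for field, cell in zip(table["fields"], row):
            if isinstance(cell, dict):
                # The cell is a nested table: descend into it.
                group_by_ucd(cell, groups)
            else:
                groups[field["ucd"]].append(cell)
    return groups


# Provider A: images directly at the first level (placeholder UCDs).
provider_a = {
    "fields": [{"name": "img", "ucd": "IMAGE_REF"}],
    "rows": [["a1.fits"], ["a2.fits"]],
}

# Provider B: images nested one level down, inside a per-observation table.
provider_b = {
    "fields": [{"name": "obs", "ucd": "OBS_ID"},
               {"name": "bands", "ucd": None}],
    "rows": [[
        "0011020",
        {"fields": [{"name": "band", "ucd": "FILTER_NAME"},
                    {"name": "image", "ucd": "IMAGE_REF"}],
         "rows": [["A", "b_obs1_A.fits"], ["B", "b_obs1_B.fits"]]},
    ]],
}

groups = defaultdict(list)
group_by_ucd(provider_a, groups)
group_by_ucd(provider_b, groups)
# groups["IMAGE_REF"] now holds the images of both providers.
```

The field names ("img" versus "image") and the depth at which the images
sit differ between the two providers; only the UCD is used to group them.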

We hope this clarifies our view a little bit more, and wait for your
hopefully not too discouraging comments.

Best regards,
Pedro & Jesus


