A note on a possible extension to SIAP

Tue Apr 20 07:18:29 PDT 2004

Dear Pedro and Jesus,

>Dear Francois,
>
>this new approach, where you include different tables in the same
>RESOURCE  results, is closer to what we propose.
>
>Although we like it more than your previous proposal, we have been
>looking at the pros and cons of your approach vs ours, and these are the
>conclusions we have drawn. We think there is no single solution for this
>issue, but also believe it is good to post these ideas to the whole
>community for discussion. Here they go:
>
>
>In your example, a table "points" to a structure which is outside itself
>(another table). This is to avoid modification of the DTD to allow
>tables inside tables. All the tables appear, therefore, at the same
>level.
>
>This means that the external tables do have to have an identifier to be
>able to do the joins among tables. 
>
>This would be an exact mapping of how an RDBMS works: independent tables
>joined through identifiers. 
>
>The problem with this approach is that it leaves the whole of the
>interpretation of the structure to the client. 
>  
>
I would rephrase it with:

The BEAUTY with this approach is that it leaves the whole of the
interpretation of the structure to the client.

>As the service is hiding the structure (it does not have any structure
>by itself, only pointers and identifiers to be able to do joins), the
>client will have to re-interpret it. In the case of a couple of
>observations, this could be done fast. In the case of hundreds of
>observations, in your approach the client would have to go through the
>whole results for each and every observation to figure out which
>exposures belong to it (and later, for each and every exposure to find
>the sources, etc.).
>
Do you really think that the typical SIAP user will want to receive back
thousands of images from his/her/its (the latter for a robotic client) 
query? (*)

Maybe I'm looking at things too much from a human user's point of view here,
probably because I cannot imagine many common use cases where a computer
uses SIAP to retrieve huge lists of datasets. But I might be wrong: maybe
the super-duper Registry/Portal will need that.

But if this is the case, we first of all need to clarify to ourselves
what the main context for SIA, SSA, etc. is:

     Human Interaction or Computer Interoperability?

>
>As the results given in your way do not contain any specific ordering,
>the display client will have to make this ordering by itself.
>
Not by itself, but following the user's needs.

> This
>reordering would be done _without_ indexes (in the database terminology
>sense) and therefore would be a very inefficient task. In our case,
>however, indexes are implicit in the structure itself, as
>"substructures" (as exposures, sources, etc.) are already inserted below
>a specific "superstructure" (like observation) and therefore the client
>only travels the structures once. 
>
Indexes are only worthwhile for tables with at least, let's say, 1000 records;
for smaller datasets a table scan is good enough. But go back to (*) above.
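
Even where the result set is large, a client does not have to rescan the
exposures once per observation. As a minimal sketch (the rows, the column
names "obs_id" and "exp_id", and the helper functions are all made up for
illustration and are not part of SIAP or VOTable), a single pass over the
flat exposure table can build an in-memory lookup, after which the join is
one dictionary access per observation:

```python
from collections import defaultdict

# Hypothetical flat rows, as a client might read them from two sibling tables.
observations = [{"obs_id": "O1"}, {"obs_id": "O2"}]
exposures = [
    {"exp_id": "E1", "obs_id": "O1"},
    {"exp_id": "E2", "obs_id": "O2"},
    {"exp_id": "E3", "obs_id": "O1"},
]


def group_by_key(rows, key):
    """Build an in-memory index in one pass: key value -> list of rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index


def join_observations(observations, exposures):
    """Attach exposure identifiers to each observation via the index."""
    by_obs = group_by_key(exposures, "obs_id")
    return {
        obs["obs_id"]: [exp["exp_id"] for exp in by_obs.get(obs["obs_id"], [])]
        for obs in observations
    }


joined = join_observations(observations, exposures)
```

The cost is one pass over each table rather than one pass over the exposures
per observation, so the flat layout does not by itself force a quadratic scan.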

>
>In our proposal, therefore, the join issues are handled at the server
>level, and the structure is already unveiled to the client, which would
>only have to display it in one go.
>
>  
>
It might prove quite difficult to translate a complicated "data provider"
structure into what, and how, the user wants to see displayed (see Doug's
answer/experience).

>Another "artifact" of your proposal is the use of those "dummy"
>placeholders:
>
><TD>Exposure_Table</TD>
>
>These ones appear due to the fact that VOTable is imposing to have an
>entry per FIELD, and thus the <FIELD ID=Exposure_Table>, which in our
>case goes _inside_ the TABLE, has to have a dummy <TD> in your case as
>it points to an outside table. 
>
>This might look like a cosmetic thing, but in our view it reveals an
>inconsistency between the VOTable real requirements and the way we would
>use the VOTables in your example. Of course, this problem would be
>solved by modifying somehow the DTD.
>
>As a last example we were trying to use to clarify our own ideas, we
>were identifying the structure we wanted to include in the SIAP to a
>file system directory structure. In our approach, we send the directory
>structure directly to the client, allowing it to display it the way it
>wants. In your approach, a very simple directory structure would be
>completely flat, with each subdirectory containing a pointer to the
>directory where it hangs from, allowing to "join" directories and
>subdirectories. Instead of giving the structure directly to the client,
>you give it the tools to create the structure, but then it is the client
>which has to do the whole process of "structurization". Although a valid
>approach, we believe that the time spent by the client in doing this
>should be spent by the server, which in the end is doing an asynchronous
>work with the rest of the servers being requested.
>
>
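
For what it's worth, the "structurization" step in the directory analogy is
small. The sketch below assumes each flat record carries its own identifier
and a pointer to its parent; the field names "id" and "parent" and the sample
records are invented for illustration only. It rebuilds the nested structure
in two passes over the flat list:

```python
def build_tree(records):
    """Turn flat records with parent pointers into nested nodes."""
    # First pass: one node per record, addressable by identifier.
    nodes = {r["id"]: {"id": r["id"], "children": []} for r in records}
    roots = []
    # Second pass: hang each node below its parent, or keep it as a root.
    for r in records:
        parent = r.get("parent")
        if parent is None:
            roots.append(nodes[r["id"]])
        else:
            nodes[parent]["children"].append(nodes[r["id"]])
    return roots


# Hypothetical flat result: observation -> exposures -> source.
flat = [
    {"id": "obs1", "parent": None},
    {"id": "exp1", "parent": "obs1"},
    {"id": "exp2", "parent": "obs1"},
    {"id": "src1", "parent": "exp1"},
]
tree = build_tree(flat)
```

The client is still free to skip this step and sort or group the flat rows
in whatever other way the user asks for.
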
>Anyway, we believe this approach is much closer to our idea of how things
>should work, and we think we are working towards a positive evolution
>for SIAP results. 
>  
>
Good!

>Best regards,
>P&J
>
Ciao,

Alberto