DAL2 interface consistency (was [VOSI] Re: TAP 1.0: sync vs async)

Douglas Tody dtody at nrao.edu
Mon Jul 20 17:36:34 PDT 2009


Hi -

Thinking about this rehash of TAP/DAL2 standards a bit more, I want
to expand on the importance of interface consistency, particularly
in the DAL interfaces.  These recent discussions have been focused
very much on detailed technical issues such as how we provide async
and how hard it is to program a redirect.  But perhaps we are losing
track of what we need to provide to the user.

For the most part the VO middleware is system software which the
user never sees.  The DAL interfaces are different from most VO
middleware as they are the main interface used to access science data,
in particular by client applications which are often written by users
(e.g. the folks who come to our summer schools).

So let's stand back and look at this from the point of view of a user
trying to write or adapt science applications to talk to the VO.
Such a user will primarily see things like:

     - The types of services provided and the implicit classification
       of data (catalog/table, image, spectrum, etc.).  Such a
       classification is fundamentally object oriented, i.e., a class
       structure.  It should resemble common practice within astronomy.

     - The operations one can perform on each type of data, defining the
       functionality provided by the data access services.

     - The data models and metadata used to describe data (most of
       the effort for the user in fact involves this, at least for
       data access/analysis).

     - The data objects which are returned.

A science user will rarely see things like the details of how
asynchronous operations are performed, or authorization etc.; usually
some higher level interface will need to be provided.

So with that in mind let's go back and look at some of the issues which
have been discussed in the recent mail.

     - Object model.  Primary in the user interface.  Since data
       is inherently OO with a class structure (e.g., an image or
       spectrum is an object) this is quite important.  Data access
       involves a data object with defined properties (metadata) and
       operations which can be performed upon the object.  In VO we are
       dealing with virtual data not just static files or resources,
       hence these operations can be nontrivial.  REST has limited
       capabilities for virtual data in that files can be dynamically
       generated, but only if everything can be modeled as a resource
       (a file-like hierarchy essentially).  REST-like interfaces with
       parameters can however work since this basically provides a
       class with methods capability.  It is important to observe REST
       semantics at the HTTP level for this to work well with the Web.

     - Interface consistency.  Since what we have is a class hierarchy
       with a high degree of inheritance of both functionality and
       metadata, 90% of the service interface is common to each member
       of the family of services.  Since users often program directly
       at the HTTP level they see these interfaces and write tools to
       use these interfaces, and it is important to provide consistency
       at this level.  The details of what the interface looks like
       are largely arbitrary but need to support the object model and
       need to be standardized; otherwise we fail to provide consistency
       (hence for DAL we tried to do this 3-4 years ago prior to the
       roll out of all the DAL2 service interfaces).

     - Sync/async.  This is an important capability which users will
       have to deal with at some level, however it has nothing to do
       with science data.  Few users will need to understand the details
       of how we define this interface, e.g., in terms of /sync and
       /async HTTP endpoints, or how UWS is modeled.  The resource
       model works well for things like kernel/process/job state,
       so it is reasonable to use at this level.  For DAL it is important
       for the services to be compliant with the GWS standards so that
       we can share code, but the interface can look different than
       what DAL defines for OO data access.

     - Redirect of a static URL for load balancing.  It takes more
       time for us to discuss this here than it would take to write
       the code to provide this at the service rather than applications
       server level.  In any case we will never have a load problem with
       something like getCapabilities or getAvailability.  Where we
       will have load issues is with long running operations, for
       which we already have UWS, which does not rely upon an
       applications server to automate load balancing.
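
As a rough illustration of the object model and interface consistency
points above, here is a minimal Python sketch of a family of services
that share most of their query interface.  Everything in it is an
assumption made for illustration: the class names (DALService,
ImageService, SpectrumService), the query_url method, the choice of
extra parameters per service, and the example.org URLs are hypothetical
and are not taken from any IVOA document.

```python
from urllib.parse import urlencode


class DALService:
    """Hypothetical base class holding the parameters common to all services."""

    # Names a subclass accepts on top of the common POS/SIZE set (assumed).
    extra_params = ()

    def __init__(self, base_url):
        self.base_url = base_url

    def query_url(self, ra, dec, size, **extra):
        """Build a query URL; positions and sizes are in decimal degrees."""
        unknown = set(extra) - set(self.extra_params)
        if unknown:
            raise ValueError(f"unsupported parameters: {sorted(unknown)}")
        # The common part of the interface, identical for every subclass.
        params = {"POS": f"{ra},{dec}", "SIZE": str(size)}
        # The small service-specific part.
        params.update({name: str(value) for name, value in extra.items()})
        return f"{self.base_url}?{urlencode(params)}"


class ImageService(DALService):
    extra_params = ("FORMAT",)


class SpectrumService(DALService):
    extra_params = ("BAND", "FORMAT")


if __name__ == "__main__":
    # Placeholder endpoints, not real services.
    print(ImageService("http://example.org/sia").query_url(180.0, -30.0, 0.5))
    print(SpectrumService("http://example.org/ssa").query_url(
        180.0, -30.0, 0.5, BAND="1e-7/3e-7"))
```

A user who has learned to query one service in such a family can query
the others by changing only the endpoint and the few service-specific
parameters, which is the kind of consistency argued for here.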

In summary the object model, service functionality provided, and
interface consistency are primary for user-developers whereas the
more technical aspects of the VO middleware, while critical, will
rarely be seen by scientific users of the VO.

Getting back to the interface consistency issue, I am reminded of a
conversation we had with science users at a recent NVO summer school.
They were complaining that cone search uses RA,DEC,SR whereas SSA and
DAL2 use POS,SIZE, which is inconsistent.  Our response was mainly
that this was due to the evolution of VO and that with DAL2 this
would all be resolved, at least for one generation of interfaces,
and everything would be standardized across all the second generation
interfaces so far as possible.
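
To make the complaint concrete, the following is a small Python sketch
of the glue code a user ends up writing to move between the two
conventions.  The function name cone_to_pos_size is hypothetical, and
the factor of two rests on an assumption that should be checked against
the relevant specifications: that SR is a search radius while SIZE is
the full diameter of the search region, both in decimal degrees.

```python
def cone_to_pos_size(ra, dec, sr):
    """Map cone search style RA, DEC, SR onto POS, SIZE style parameters.

    Assumption: SR is a radius and SIZE is a diameter, both in degrees.
    """
    if not 0.0 <= ra <= 360.0:
        raise ValueError("RA must be within 0..360 degrees")
    if not -90.0 <= dec <= 90.0:
        raise ValueError("DEC must be within -90..90 degrees")
    if sr < 0.0:
        raise ValueError("SR must be non-negative")
    return {"POS": f"{ra},{dec}", "SIZE": str(2 * sr)}


if __name__ == "__main__":
    print(cone_to_pos_size(180.0, -30.0, 0.25))
```

The conversion is trivial, but every user who programs at the HTTP
level has to discover and write it for themselves.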

It would be very hard to explain to either such users or to the funding
agencies that all the interfaces look different because the mix of
people involved varied in each instance (or whatever the cause) and
thus the IVOA failed to successfully address such a basic concern.
Can we have a successful multi-year effort which addresses such
concerns or is it really a random walk depending upon whatever group
of people are actively involved in the discussions at a given time?

Cheers,

 	- Doug


