Comments on SSAP V0.95

Doug Tody dtody at nrao.edu
Thu Jun 22 14:09:20 PDT 2006


Hi Randy -

Thanks again for your detailed comments on the draft SSA specification.
Many of the more detailed comments appear to be due to the fact that
the version of the draft you reviewed was incomplete, as was noted at
the time it was released.  I don't think it is worthwhile to discuss
document issues at this time, but we will take your comments into
account in generating the next version.  In what follows we review
the broader issues you raised instead.

 	- Doug


> 0) Versioning:
>  Versioning is useful and important, but we see
>  the following problems with the current proposal:
>  - the main problem is that there are several standards
>    involved in completing an SSAP request, including:
>    1) the VOTable format
>    2) UCD syntax
>    3) Data Models including Spectrum/SED/Characterization
>       and external(?)
>    4) serialization changes (FITS, XML, etc)
>    5) registry
>    Assuming these standards can evolve independently, requiring
>    versioning solely for the SSAP interface would not be sufficient.
>    The fact that 2 types of UCDs are now defined is already
>    causing confusion.

The version referred to here is the protocol version, which is well
defined by the protocol specification.  It is important to verify
the version at the level of the runtime protocol to ensure that the
client and server can talk to each other: for example, a parameter
unit change between two versions could result in incorrect operation
which might not be detected without explicit version checking.
In a GET interface, the only way explicit version checking can be
implemented is with something such as the proposed VERSION parameter
(which we adopted from OpenGIS/WMS by the way).
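
As a rough illustration of explicit version checking in a GET interface, the sketch below attaches a VERSION parameter to a request URL and checks it on the receiving side.  The base URL, the REQUEST parameter name, and the function names are assumptions made for the example; they are not taken from the draft.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_query_url(base_url, version, **params):
    """Return a GET URL that carries an explicit protocol version."""
    query = {"VERSION": version}
    query.update(params)
    return base_url + "?" + urlencode(query)


def version_supported(url, supported):
    """Service side: True if the request names a version we implement."""
    args = parse_qs(urlsplit(url).query)
    requested = args.get("VERSION", [None])[0]
    return requested in supported


# Hypothetical endpoint and operation name, for illustration only.
url = build_query_url("http://example.org/ssa", "1.0", REQUEST="queryData")
```

A service that receives a request with a missing or unsupported VERSION can then report the mismatch instead of silently misinterpreting the other parameters.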

The intention is that the protocol be defined in a largely
self-contained fashion.  Hence, where UCDs or UTYPEs are defined by
the protocol, the protocol will specify exactly what is required or
permitted without reference to external documents.  If version 1.0 of
the SSA protocol specifies a V1.1 UCD, that is what is required by the
V1.0 SSA protocol.  Likewise no knowledge of the registry should be
required to implement SSA services or clients.  The getCapabilities
method, used among other things to populate service metadata in the
registry, will be fully defined by the SSA protocol without reference
to the registry.

Major service elements such as VOTable and the various data
serializations will however be separately versioned documents.
That is, when a VOTable is returned, the VOTable document itself
will specify the version implemented by the document.  The same is
also true for a data serialization, although in general we expect
to keep the data model and protocol versions in sync for something
like SSA since they are interdependent.  Versioning is important in
all these cases; however, this does not make it any less important to
have version control for the protocol itself.


>  - there [are] a large number of "optionally supported" SSAP
>    parameters so version negotiation will never guarantee compatibility
>    between requester and service.

The design of these protocols is such that many parameters are optional
for the service to implement without affecting correct operation of the
protocol.  In any case, the service capabilities describe what the
service supports.  This is not a protocol version issue.


>  - downward compatible changes would change the SSAP version
>    number and cause a mismatch even though the request would
>    otherwise be successful. Likewise, some changes may not be
>    relevant to the users specific request.

This is an important issue.  To have a good versioning system we need
to be able to specify when versions are incompatible or backwards
compatible.  The current proposal (discussed since the V0.95 document
was released) is that any change to the version number below the
second digit is a minor, backwards-compatible change which would
not normally trigger a version mismatch.  That is, V1.1.0 and V1.1.3
(V1.10 and V1.13 in the IVOA syntax) would not trigger a runtime
version mismatch, whereas V1.1 and V1.2 are deemed to be incompatible.
The exact semantics of version verification, however, are up to the
client application - the client application, or user, may know enough
about specific protocol versions and how they are used by the client
to override this behavior.
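
A minimal sketch of the proposed rule, assuming dotted version strings such as "V1.1.3" (the fused IVOA form, e.g. V1.13, is not handled here).  The function names are invented for the example.

```python
def parse_version(text):
    """Split 'V1.1.3' or '1.1.3' into a tuple of integers, e.g. (1, 1, 3)."""
    return tuple(int(part) for part in text.lstrip("Vv").split("."))


def compatible(client, server):
    """True when the first two version numbers agree.

    Anything below the second digit is treated as a minor,
    backwards-compatible change and is ignored.
    """
    a = parse_version(client) + (0, 0)  # pad so 'V1' compares as 1.0
    b = parse_version(server) + (0, 0)
    return a[:2] == b[:2]
```

With this rule compatible("V1.1.0", "V1.1.3") is true and compatible("V1.1", "V1.2") is false.  A client that knows more about the versions involved is free to skip the check.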


>  - I suspect most data providers will not support old
>    versions particularly when multiple standards are involved.

They are not required to, but then no one may be able to use their
new services when an upgrade occurs, and many existing external
applications may break.  In general we probably have no choice but to
continue to support recent versions for some time when a new version is
deployed.  It can also be useful to support experimental new versions
with new capabilities simultaneously with the production services.


>  - Versioning negotiation can require considerably more maintenance by
>    the data providers
>  - Having to do version negotiation for every single request
>    can be a burden

In most cases actual runtime version negotiation will not be needed
since we will already have matched versions via the registry.  The main
purpose of explicit versioning at the protocol level is to verify that
things are functioning correctly, both at runtime and post-facto via
system logs.


> We also question whether it is necessary to
> make all input/query/request parameter values case-sensitive 
> (section 5.7.1). Do we really need to distinguish "getData" 
> from "getdata" or "GETDATA"? Or 1E06 from 1e06? Obviously 
> some may be required to be case-sensitive (e.g., references 
> to file or filter names), but why not allow parameters to be 
> case-insensitive unless specified otherwise.

At the level of the protocol, parameter value strings have to be
defined to be case-sensitive or you will likely lose case information
somewhere in the implementation.  On a per-parameter basis,
case-insensitivity can be specified, for example operation names
(getData) are case-insensitive even when passed as parameter values.
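
One way a service might apply this is sketched below: parameter names are folded to a single case, values are passed through untouched, and only parameters explicitly listed as case-insensitive have their values folded.  The REQUEST parameter name and the contents of the set are assumptions for the example.

```python
# Parameters whose *values* are case-insensitive.  Hypothetical: assumes
# the operation name is passed in a parameter called REQUEST.
CASE_INSENSITIVE_VALUES = {"REQUEST"}


def normalize(params):
    """Fold parameter names to upper case; keep value case by default."""
    out = {}
    for name, value in params.items():
        key = name.upper()
        if key in CASE_INSENSITIVE_VALUES:
            value = value.lower()
        out[key] = value
    return out
```

Here normalize({"request": "GetData", "targetname": "Mars"}) folds the operation name to "getdata" but leaves "Mars" exactly as the client sent it.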


> We would suggest having all the input parameters described
> in one section? Specify:
>    a) if they are required or optional for request (consistent with
>       theoretical data?),
>    b) if they "must", "should" or "may" be supported,
>    c) all default values (and allowed values if appropriate),
>    d) data types, units, and utype values,
>    e) whether they allow single values, ranges, and or lists
>    f) examples

The key point is that a service may have multiple operations:
queryData, getCapabilities, etc.  These are independent operations
hence they need to be described separately.  The service-specific
operations all share the same basic protocol, hence this common
protocol needs to be described separately as well.  I agree, however,
that for a given operation it is desirable to describe all the
operation parameters together in one section.

What is probably confusing here is that in the past (SIA) the service
only defined a single operation.  One of the important changes being
introduced with SSA is support for services with multiple operations.


> 2) Some sections of the SSAP paper are very general. Section 
> 4 on requirements for compliance and the HTTP rules in 
> section 5 could apply to all VO protocols. Perhaps these 
> sections could be described in a separate document and 
> later referenced by all standards.

As you say, this is general material which applies to all operations
and probably to multiple services as well.  If it were complex enough
it might be moved out into a separate document, but at present it is
probably simpler to have the document be self-contained and include
this material in all service specifications.  We can consider moving
it to an appendix.


> 3) Type of Data (section 6.1)- There are many references to various 
> types of supported data but with the recent decision to support 
> only single-vector spectra, it is not clear what is actually 
> to be supported in Version 1.0. Does it still support "spectra,
> time series, and SEDs"? The paper should be specific
> as to what is to be supported now (V1.0) and what is planned 
> for the future.

This is a change from earlier versions - this version of SSA only
supports Spectrum.  The issue of how SED and TimeSeries will be handled
in future interfaces is being deferred until after SSA V1.0 is out.
The core data model and basic interface are expected to be the same
for all three types of data, but in general a given service instance
is expected to support only a single type of data.


> 7) List/Range Syntax - There is some general confusion regarding
> which parameters allow the list/range syntax, and how unique
> values should be interpreted (e.g., what does REDSHIFT=5.0 mean?) 
> Is it reasonable to say ALL numerical and time parameters should 
> allow this syntax? If so, then perhaps SPECRES, SPATRES, SNR, should 
> not be defined as minimum values and SINCE renamed.

Parameters only support the range-list syntax where it explicitly
says so; the generality of the syntax permitted also varies with the
individual parameter, and is or will be specified for each individual
parameter.  In general semantics are always defined at the level of
the individual parameter.

In theory we could allow all or many parameters to support the full
range-list syntax, but implementation concerns require that this feature
only be supported for parameters that really need it.
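
To make the per-parameter approach concrete, here is a small sketch.  It assumes a syntax in which "," separates list items and "/" separates the two ends of a range, with an omitted end meaning open-ended; that syntax, and the set of parameters shown as accepting it, are assumptions for the example and are not taken from this message.

```python
def parse_range_list(text):
    """Parse e.g. '1/3,5' into [(1.0, 3.0), (5.0, 5.0)].

    Assumed syntax: ',' separates items, '/' separates range ends,
    and a missing end is returned as None (open-ended).
    """
    ranges = []
    for item in text.split(","):
        if "/" in item:
            lo, hi = item.split("/", 1)
            ranges.append((float(lo) if lo else None,
                           float(hi) if hi else None))
        else:
            value = float(item)
            ranges.append((value, value))
    return ranges


# Hypothetical: only parameters listed here accept the range-list syntax.
RANGE_LIST_PARAMS = {"BAND", "TIME"}


def parse_param(name, value):
    """Apply range-list parsing only where the parameter allows it."""
    if name.upper() in RANGE_LIST_PARAMS:
        return parse_range_list(value)
    return value
```

Any parameter not listed is passed through as a plain string, so its meaning stays under the control of that parameter's own definition.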

(As an aside, virtual data generation - required to reduce data volumes
when remotely accessing large datasets - requires both data subsetting
and filtering.  An example of data subsetting is a cutout.  Range-lists
are needed to support data filtering.)


> 9) Specific Parameters - 
>
>     BAND - The paper should state the default rest frame 
> (presumably observer).

This is already addressed in the document.


>     REDSHIFT - WAVELENGTHSHIFT would be a more accurate (and
> less astrophysically provincial) parameter name. Redshift has
> developed a connotation of a cosmological yardstick. Also a blueshifted
> redshift does not make sense.

Perhaps.  This has been discussed several times in the working
group with the consensus that REDSHIFT, while imperfect, was the
simplest approach.  A negative redshift (blueshift) is not commonly
used terminology, but is physically valid, and in terms of interface
simplicity may be preferable to alternatives such as adding another
parameter.  Many of our spectral data collections are redshift
surveys and redshift is a common attribute for existing spectral
data collections.


>     APERTURE - We would like to see this description reworded. 
> It currently says "only circular apertures are currently supported", 
> but then allows a slit width for non-circular apertures???  Why not 
> state that APERTURE can be either a diameter for circular apertures or a 
> width for non-circular apertures?

I agree that APERTURE needs further clarification.


>     TARGETNAME - The example in 7.4.2 is "Mars"? Should TARGETNAME
> be restricted to resolvable target names? Also, how does input 
> parameter TARGETNAME (section 7.4.2) differ from the target metadata 
> entry "Target.Name" (section 7.5.3)? Why do they have different 
> utypes (SSA.TargetName vs. Target.Name)?

The reason for including TARGETNAME was to allow reference to data for
moving objects, for which POS is undefined or of little use.  The text
should be revised to clarify this.

In general there is no exact correspondence between query parameters
(which may have complex semantics) and data model attributes.  They
are different interface elements, although they may describe the same
thing in some cases.


>     TOP - A more straight forward approach, in our opinion, would
> be to define a "SORT" parameter to allow the requester to optionally
> determine the sort order. This would also allow sorts on parameters 
> other than those used in the query, and optionally multilevel and
> ascending/descending sorts could also be specified. It is unclear to us 
> how one would "rank" results based on multi-parameter queries and 
> there is no guarantee that services would rank results in a consistent 
> manner.

In general sorting is best done on the client side as there are many
alternative ways to view the same table.  The purpose of TOP is not to
sort the query response, but to select the top-ranked items matching
the query.  Algorithms for scoring queries have been extensively
discussed elsewhere (a Google query illustrates the utility of this
feature).
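
The distinction between TOP and sorting can be sketched as follows.  The scoring function is purely hypothetical (the message does not define how a service ranks matches), and the record layout is invented for the example.

```python
import heapq


def top_matches(records, score, top):
    """Return the `top` highest-scoring records, best first.

    `score` is any function mapping a record to a number; how a real
    service would compute it is not specified here.
    """
    return heapq.nlargest(top, records, key=score)


# Invented records with a precomputed score, for illustration only.
records = [
    {"id": "a", "score": 0.2},
    {"id": "b", "score": 0.9},
    {"id": "c", "score": 0.5},
]
best = top_matches(records, lambda r: r["score"], 2)
```

The service selects which records to return; the client remains free to re-sort the returned table on any column it likes.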


>     SPECRES - (delta Lambda)/Lambda would be a more general definition
> and provides the option to support multi-wavelength and future
> multi-order queries?

Agreed.


> 10) Section 7.4.3 Service-defined Parameters - These names
> should be case-insensitive like the reserved parameter names. Making 
> them lower case does not necessarily distinguish them from reserved
> parameters.

Agreed.  All parameter names are case-insensitive (5.7.1).


