The IVOA in 2007: Assessment and Future Roadmap
Roy Williams
roy at cacr.caltech.edu
Mon Jun 18 09:45:47 PDT 2007
On behalf of the IVOA Technical Coordination Committee, we are pleased
to release a report
"The IVOA in 2007: Assessment and Future Roadmap"
which is available at:
http://www.ivoa.net/internal/IVOA/TechnicalMilestones/IVOARoadMap-2007-final.pdf
This document summarizes many of the issues that were discussed before
and during the recent Interop meeting in Beijing, as reproduced below.
Each issue comes with one or more Recommendations intended to cement the
strong IVOA collaboration, and it is hoped that the Working Groups and
national-VO projects can help to maintain this unity.
The report also lists the roadmap for each Working Group, in terms of
dates at which standards documents will reach the stages of Working
Draft, Proposed Recommendation, and Recommendation.
Your comments are welcome, either just to us, or to the 22 members of
the TCG as listed in the report, who can be reached at tcg at ivoa.net.
Roy Williams
TCG Chair
Christophe Arviset
TCG vice-Chair
---------------------------------
6. Leading Issues
(1) Registry Graininess
Registry graininess has been an ongoing issue throughout the
development of the registry framework. The so-called “fine-grained
registry” approach encourages capturing detailed, possibly dynamic
metadata in the registry. The motivation is not just to enable
sophisticated resource discovery but also to aid automated planning
and execution of service-driven applications. In contrast, the
“coarse-grained registry” approach prefers a registry that restricts
itself to more general metadata, with the expectation that more
detailed information would be accessed directly from the service. The
major concern about fine-grained information is one of registry metadata
curation: detailed metadata is more likely to be either incorrect or not
provided at all. This concern applies especially to metadata that is
available directly from the resource (e.g. table metadata); without a
tight coupling between the registry and services, it’s possible for the
metadata in the registry to become out of sync or out of date with
respect to the resource. This adds to the already large curation costs
that registries face in keeping metadata quality high enough for the
registries to be practically useful. Today, we are beginning to see
applications being built around fine-grained metadata in a registry,
though we have not yet effectively addressed the curation issues.
Because all metadata are shared across all registries via the
harvesting stream, every registry must deal at some level with the
associated curation costs regardless of whether it wishes to support
fine-grained applications.
Recommendation: Since the current registry upgrade is a necessary step
prior to putting into place more effective curation practices, we
recommend that the upgrade be completed as soon as possible and at the
highest priority. Further changes in the relevant standards that could
delay the completion of the upgrade should be avoided.
Recommendation: Curation practices aimed at improving metadata quality
are needed to catch up with the desire to develop applications based on
fine-grained registries. After the upgrade is complete, we recommend
shifting greater focus to putting such practices into place, including
effective use of automated validation of resource metadata and the
standard services they describe.
Recommendation: Extension schemas that are expected to be widely
supported across all VO registries must be put through the IVOA
standardization process. Projects that wish to introduce extensions that
are intended only for local support should consult with the Registry
Working Group (RWG) regarding possible impact on all registries.
Documentation in the form of an IVOA Note or, at least, an RWG wiki page is
recommended.
Recommendation: After the completion of the upgrade, the Registry WG and
Grid/web service WG should develop mechanisms for harvesting more of the
fine-grained metadata directly from services (through the VO Standard
Interface (VOSI) specification), and for reducing the metadata that gets
shared on the harvesting stream. A registry will then have greater
control over how much information it manages within the context of its
store.
(2) GetCapabilities method for Services
Another driver for making more detailed information available from the
service directly has been pursued by the DAL WG: they wish to make the
next generation of services more self-describing, independent of the
registry. In particular, if the service can reveal its capabilities and
behaviors directly, then service clients can directly negotiate with the
service. It is expected that such information might often be generated
either transparently or dynamically by the service implementation, and
(therefore) it will be more up-to-date than the registry. The proposed
way of getting this information to clients is via a getCapabilities
method. There is still considerable discussion going on regarding the
details of exactly what information is returned and in what form, and
this has been holding up the advancement of critical service
specifications (SSA, TAP). Further complicating the discussion is the
issue of registry graininess and how registries should get this
information -- see (1) above.
Recommendation: In an effort to allow critical specifications to go
forward, first-generation techniques for accessing service behaviors
from the services should be adopted for current protocols, and the
getCapabilities method should be spun off and incorporated into the VO
Standard Interfaces (VOSI) specification. This will allow client
development based on getCapabilities to go forward without holding back
first implementations of SSA and TAP.
(3) Dependencies in IVOA Recommendations
Examples are appearing of IVOA standards that proceed to
Recommendation while depending on IVOA documents that are not
Recommendations. One of the first was VOEvent, with its dependency on STC
(not a Rec as of writing, May 2007). Other places where dependency could
occur are UCD versioning, the VODataService (not a Rec) and VO Support
Interfaces.
Recommendation: The rules for IVOA standards should state that a
Rec cannot in principle depend on non-Recs.
(4) Footprints in the Registry
It would be very useful for some registry records to contain a footprint
specification, so that machines can decide if a given point or region
intersects the coverage of a dataset or service. Currently the registry
record can contain either (de facto) free text, or a full STC (Space
Time Coordinates) record.
Recommendation: The registry WG should allow and encourage multiple ways
to specify a footprint, including: free text; full STC; a restricted
subset of STC (e.g. BOX, CIRCLE); pointers to footprint services; and
ways by which footprints can be created by probing a service directly.
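As a rough illustration of why a restricted subset is attractive, a
CIRCLE footprint reduces to a single angular-separation test. The Python
sketch below is only that: the function names and the (center_ra,
center_dec, radius) tuple layout are assumptions made for this example
and are not taken from any IVOA schema.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees; all inputs in decimal degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def point_in_circle(ra, dec, circle):
    """circle is a hypothetical (center_ra, center_dec, radius) tuple in degrees."""
    center_ra, center_dec, radius = circle
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius
```

A machine could apply a test like this to each registry record that
carries a CIRCLE, and fall back to free text or a footprint service for
records that do not.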
(5) Registry Harvesting and concatenated XML
A problem has emerged in the last year concerning the XML documents that
registries exchange in the process of harvesting each other, and this is
blocking the progress to Recommendation of the VOResource standard. A
set of these documents (instances of VOResource) is handled by the
registry with the (false) assumption that a concatenation of valid XML
documents is also valid. The problem is with the ID construct in XML,
which states that such ID values must be unique. In particular, the STC
schema uses these IDs to identify coordinate systems for spatial
coverage, although we should say this is a general XML problem, not
specific to STC. A user might write
ID="UTC-FK5-GEO" href="ivo://STClib/CoordSys#UTC-FK5-GEO", meaning the ID
value can be used as an abbreviation of the referent (href value).
However, if the same abbreviation is declared elsewhere in the document,
the XML rules make it invalid, hence the problem with concatenating
documents that all use the same coordinate system. A solution is
emerging based on the following agreements: (a) the ID value can and will
be changed arbitrarily in an XML document without changing the essential
information, and (b) this is easier to do if all ID values are easy to
find in the XML; therefore (c) parsing software for the XML document
must make decisions based on the referent value, not the ID value, and
(d) the referent of the ID must be well-defined and stable, so that
parsing software can recognize it.
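Agreements (a) through (d) can be sketched in Python as follows. This is
a minimal illustration, not registry code: the element names, the
wrapping Harvest element, and the coord_system_id reference attribute are
all hypothetical, and only the ID and href values come from the example
above. Each ID is renamed so that it stays unique after concatenation,
while the referent (href) is left untouched for parsers to rely on.

```python
import xml.etree.ElementTree as ET

# Two hypothetical records that declare the same ID abbreviation.
RECORDS = [
    '<Resource><CoordSys ID="UTC-FK5-GEO" href="ivo://STClib/CoordSys#UTC-FK5-GEO"/>'
    '<Circle coord_system_id="UTC-FK5-GEO"/></Resource>',
    '<Resource><CoordSys ID="UTC-FK5-GEO" href="ivo://STClib/CoordSys#UTC-FK5-GEO"/>'
    '<Box coord_system_id="UTC-FK5-GEO"/></Resource>',
]

def concatenate(records, ref_attr="coord_system_id"):
    """Wrap records in one root element, renaming each ID so it stays unique."""
    root = ET.Element("Harvest")
    for n, text in enumerate(records):
        rec = ET.fromstring(text)
        renamed = {}
        # First pass: give every ID a per-record suffix.
        for el in rec.iter():
            if "ID" in el.attrib:
                new_id = f"{el.attrib['ID']}-{n}"
                renamed[el.attrib["ID"]] = new_id
                el.set("ID", new_id)
        # Second pass: update references that pointed at the old IDs.
        for el in rec.iter():
            if el.get(ref_attr) in renamed:
                el.set(ref_attr, renamed[el.get(ref_attr)])
        root.append(rec)
    return root

def referents(root):
    """Map every ID in the merged document to its stable referent (href)."""
    return {el.get("ID"): el.get("href") for el in root.iter() if "ID" in el.attrib}
```

After merging, the ID values differ between records but every one still
resolves to the same referent, which is the value a parser should
inspect.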
Recommendation: IVOA standards should try to avoid use of the ID/IDREF
mechanism, unless they have good reason to believe that conforming
document instances are unlikely ever to be concatenated.
Recommendation: The IVOA registry group should develop a general
approach for recognizing this pattern and handling such documents in the
registry.
(6) SOAP and REST
In the IVOA, the term "web service" generally implies either a SOAP or a
GET/POST/REST type of service protocol. The latter are simpler to
understand and implement and the software is much less complex and
bug-infested, and therefore preferable for simple services; however, in
some cases the extra sophistication of SOAP makes it optimal. A
significant advantage for SOAP services is that it is easy to create a
formal interface document (WSDL), whereas this is more difficult for
GET/POST/REST services (done by hand).
Recommendation: The Grid/Web Services WG should create a study to
understand where SOAP is sufficiently advantageous and where the easier
GET/POST/REST can do the job just as well. The Grid/Web Services group
should re-examine the utility of the “VO WS Basic Profile” document in
the light of the results of the study.
(7) Asynchronous services
As the VO concept matures, asynchronous services are emerging, where the
response to a request is not the answer, but rather a way to check on
the running service, which will eventually produce the answer. There is
already deployment of asynchronous services (UK-VO, US-VO, France-VO,
Euro-VO), and standards are converging. The GWS-WG proposal (called UWS)
has the paradigm Initialize job / Upload input / Receive quote / Run job
/ Poll status / Fetch results; and the DAL proposal integrates
asynchrony with astronomical services through the stageData / getData /
AccessReference attributes of the S*AP protocols. The Table Access
Protocol (TAP) (see (10)) is being developed with an asynchronous
capability.
Recommendation: Implementors of asynchronous services should utilize the
UWS pattern. The DAL stageData protocol should be implemented using the
UWS pattern. The TAP should base its asynchronous operations on UWS.
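The Initialize / Upload / Quote / Run / Poll / Fetch paradigm can be
pictured as a small state machine. The Python sketch below is an
in-memory stand-in written for this note, not a UWS client: the class,
its method names, the phase strings, and the "one polling step per input
item" quote are all assumptions chosen to keep the example
self-contained.

```python
class Job:
    """In-memory stand-in for an asynchronous job; not a real UWS client."""

    def __init__(self, task):
        self.task = task            # callable that computes the result
        self.inputs = None
        self.phase = "INITIALIZED"  # hypothetical phase names
        self._steps_left = 0
        self._result = None

    def upload(self, inputs):
        self.inputs = inputs

    def quote(self):
        # Hypothetical cost estimate: one polling step per input item.
        return len(self.inputs)

    def run(self):
        self._steps_left = self.quote()
        self.phase = "RUNNING"

    def poll(self):
        # Each poll simulates some work being done on the server side.
        if self.phase == "RUNNING":
            self._steps_left -= 1
            if self._steps_left <= 0:
                self._result = self.task(self.inputs)
                self.phase = "COMPLETED"
        return self.phase

    def fetch(self):
        if self.phase != "COMPLETED":
            raise RuntimeError("results not ready")
        return self._result
```

A client would call upload(), inspect quote(), call run(), and then loop
on poll() until the phase changes before calling fetch(). The point of
the pattern is that the first response is a handle on the job rather than
the answer.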
(8) Data Models and utypes
The concept of "utype" was defined in the IVOA as a response to the
fuzzy nature of the UCD descriptor: if a quantity has a utype, then it
must be part of a specific data model. Proper utypes would allow queries
to be built independent of the underlying database structure ("where
STC.coords.FK5.RA between 300 and 302"), and would provide a strong
framework for parameter-based queries ("http://.....? STC.coords.FK5.RA
= 300 &..."). However, many of the data models in use in the IVOA have
XML representation only, and do not have representation as a hierarchy
of utype values. We note that the syntax of utypes is not well defined
in the IVOA, and also that in simple cases the utype can be cleanly
derived from the Xpath representation of an XML element, so this should
be a straightforward matter.
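For the simple cases mentioned above, the derivation from an XML path
might look like the Python sketch below. Since the utype syntax is not
yet well defined, the dotted form is an assumption modeled on the
"STC.coords.FK5.RA" example, and the function name is made up for this
illustration. It ignores namespaces, attributes, and repeated sibling
elements, which is exactly where the "simple cases" stop.

```python
import xml.etree.ElementTree as ET

def utypes(xml_text):
    """Return {dotted_path: text} for every leaf element of the document."""
    out = {}

    def walk(el, path):
        path = path + [el.tag]
        children = list(el)
        if not children:
            # A leaf element: its path from the root becomes the utype.
            out[".".join(path)] = (el.text or "").strip()
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), [])
    return out
```

Given a document whose root is STC and which nests coords, FK5, and RA,
this yields the key "STC.coords.FK5.RA", matching the query example in
the text.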
Recommendation: A subcommittee of the IVOA, consisting of the relevant
persons across the various WGs (at least DM, UCD, VOQL) should review
the situation of utypes within IVOA. The syntax of utype and its
namespaces should be well-defined. Just as with UCDs, there should be
services to find relevant data models and their utypes from search
words, and there should be services to trace a given utype back to its
precise meaning.
(9) Space-Time Coordinates
This large and comprehensive working draft has become a de facto
standard in the IVOA through multiple implementations, and yet it is not
yet a Recommendation. The IVOA should take firm action on this matter to
resolve the status of STC. While there are several software packages
that use STC, none of them exercises *every* part of the proposed
standard. Further, there is often complaint from implementers about the
complexity of STC -- countered by the contention that astronomical
coordinate systems are complex by nature. What astronomers want in this
area is both assurance that full rigor and precise coordinates are
available in the IVOA; and the release from complexity when that full
rigor is not deemed necessary by the astronomer.
Recommendation: In addition to STC, there must be a simpler system for
everyday use, with mappings to full STC well-defined. It is a matter of
defaults. For example if the information in the simple system is just RA
and Dec numbers, this can map to the FK5 system with reference point at
the barycenter of the solar system and the epoch 2000.0. Regions that
are disks and RA/Dec intervals should be expressible in just a few
characters. Alternate syntaxes should provide a straightforward way for
a client not only to recognize their use, but also to recognize their
mapping into full STC.
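The "matter of defaults" can be made concrete with a short Python
sketch. The class and field names here are invented for illustration and
do not correspond to STC element names; only the default values (FK5,
solar system barycenter, epoch 2000.0) come from the recommendation
above. The "RA Dec" string format in decimal degrees is likewise an
assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FullPosition:
    """A position with the frame details spelled out (hypothetical field names)."""
    ra_deg: float
    dec_deg: float
    frame: str = "FK5"
    reference_position: str = "BARYCENTER"
    epoch: float = 2000.0

def parse_simple(text):
    """Parse 'RA Dec' in decimal degrees into a FullPosition using the defaults."""
    ra, dec = (float(tok) for tok in text.split())
    if not (0.0 <= ra < 360.0 and -90.0 <= dec <= 90.0):
        raise ValueError(f"coordinates out of range: {text!r}")
    return FullPosition(ra, dec)
```

The simple form carries two numbers; everything else is filled in by a
documented default, so the mapping to the full description is
unambiguous.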
Recommendation: Applications and standards should clearly describe the
subset of STC that they are using, for example the Registry uses CIRCLE
and BOX; VOEvent uses longitude, latitude, and error radius. This will
allow consumers to build applications against these common subsets.
However, STC beyond these subsets should be recognized, and applications
should either use it fully or fail gracefully.
(10) Table Access Protocol
The TAP is under development by an IVOA subcommittee. The TCG expects
that it will specify how an ADQL (or optionally an SQL) query can be
submitted to a service for processing. The response will be in VOTable
format.
Recommendation: The TAP should build on existing IVOA standards. Initial
versions of the protocol should state clearly what it will eventually
define, but mandate the minimum necessary to ensure a public release of
the protocol is achieved without delay.
(11) Multiple Data Access
A principal justification of the VO itself is to encourage statistical
studies of populations of astronomical objects, as well as the more
traditional single object study. The IVOA should encourage this through
multi-point protocols, bulk data access, and scalability of services to
the grid.
Recommendation: Data access protocols should be re-considered in terms
of their ability to handle multiple requests and bulk data. The Cone and
SIA services, in particular, do not handle multiple requests.
(12) VO interoperability with popular software
Most astronomers do most of their work with software packages like IDL,
IRAF, DS9, MIDAS, Sextractor, etc. It is highly desirable that these be
interoperable with the VO framework through use of VO services and
desktop messaging.
Recommendation: The VO national projects and Applications WG should
assess VO interoperability with these popular astronomy software
packages and environments.
(13) Bundling of VO software
Bundling of astronomy software such as the Scisoft and ex-Starlink
collections provides a convenient way of distributing many packages at
once to ease the burden of installation. Bundled distributions of VO
software would assist in up-take of VO tools, and we note that Scisoft
VII will contain a selection of VO software.
Recommendation: The list of VO Applications maintained on the (publicly
editable) Apps WG wiki pages should serve as a place for Applications to
be visible for parties compiling collections of VO tools.
(14) Interoperable Security
Security and authentication are being implemented in several new
efforts. The Astrogrid (UK-VO) project has
built a sophisticated workflow system for asynchronous computations and
is adding authentication; a complementary project from the US NVO
project is exploring the idea of “graduated security” for giving
community access to high-performance computing. While the IVOA has a
mature Single-signon standard for security, using X.509 certificates,
there has been little discussion of which VO projects are issuing
certificates and the levels of authentication taking place, and which VO
projects will accept certificates from which other projects.
Recommendation: The Grid-Web Services WG should create a listing of
certificate authorities in the national projects, how to get a
certificate from each, what can be done with the certificate, and
compliance with accreditation guidelines (e.g. PMA).
Recommendation: VO organizations should encourage their users to obtain
certificates from PMA-accredited Certificate Authorities where possible,
or, failing this, from properly accredited certificate authorities
inside the VObs movement.
Recommendation: Service providers in the VO should be encouraged to
accept by default only certificates from PMA-accredited certificate
authorities and certificate authorities accredited within the VO
movement. They may choose also to accept "weak certificates" for cases
where the providers deem this to be sufficiently safe.
Recommendation: The IVOA should choose a set of guidelines for
VO-accredited certificate authorities, basing the guidelines on those
for the PMA-accredited authorities.
(15) Units
Most scientific quantities carry units, and data returned from IVOA
services should also carry explicit unit information when it is not
clear implicitly. Units should follow the IAU recommendation and/or
the VOTable recommendation. When a user makes a query based on a
quantity, units can either be user-defined or fixed. In the former case,
the user has the freedom to express the quantity in arbitrary units (e.g.
calories per square furlong per hour!), or to pick from an enumerated
choice (e.g. Angstroms OR nanometers). In the case of fixed units, the
data model of the query is bound to specific units (e.g. all angles must
be in decimal degrees).
Recommendation: The Data Model Working Group should conduct a study of
how units are used in IVOA views and services, where it would be
appropriate to simply fix the units, and where it is necessary to allow
freedom of choice, distinguishing between unit choice in the user
interface and in the back-end services. In the latter case, the report
should also recommend how unit conversion is implemented: who is
responsible and the nature of the software.
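To illustrate the enumerated-choice case, a user interface could accept
a small set of units and convert to the one fixed unit the back-end
service expects. The Python sketch below assumes nanometers as the fixed
back-end unit and uses a made-up function name and unit labels; it is
one possible division of responsibility, not a proposal from the report.

```python
# Conversion factors to metres for an enumerated set of wavelength units.
TO_METRES = {"Angstrom": 1e-10, "nm": 1e-9}

def to_fixed_unit(value, unit, fixed="nm"):
    """Convert a wavelength from an allowed unit to the service's fixed unit."""
    if unit not in TO_METRES:
        # Arbitrary user-defined units are rejected in this sketch.
        raise ValueError(f"unsupported unit: {unit!r}")
    return value * TO_METRES[unit] / TO_METRES[fixed]
```

Here the conversion lives on the client side, so the service only ever
sees its own fixed unit.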
(16) IVOA Newsletter
Recommendation: The global VO community would be well-served by an IVOA
newsletter, including announcements from national projects and working
groups, events, press coverage of VO issues, etc.