The IVOA in 2007: Assessment and Future Roadmap

Roy Williams roy at cacr.caltech.edu
Mon Jun 18 09:45:47 PDT 2007


On behalf of the IVOA Technical Coordination Committee, we are pleased 
to release a report

"The IVOA in 2007: Assessment and Future Roadmap"

which is available at:
http://www.ivoa.net/internal/IVOA/TechnicalMilestones/IVOARoadMap-2007-final.pdf

This document summarizes many of the issues that were discussed before 
and during the recent Interop meeting in Beijing, as reproduced below. 
Each issue comes with one or more Recommendations intended to cement the 
strong IVOA collaboration, and it is hoped that the Working Groups and 
national-VO projects can help to maintain this unity.

The report also lists the roadmap for each Working Group, in terms of 
dates at which standards documents will reach the stages of Working 
Draft, Proposed Recommendation, and Recommendation.

Your comments are welcome, either just to us, or to the 22 members of 
the TCG as listed in the report, who can be reached at tcg at ivoa.net.

Roy Williams
TCG Chair

Christophe Arviset
TCG vice-Chair

---------------------------------



6. Leading Issues
(1) Registry Graininess
Registry graininess has been an ongoing issue throughout the 
development of the registry framework. The so-called “fine-grained 
registry” approach encourages capturing detailed, possibly dynamic 
metadata in the registry. The motivation is not just to enable 
sophisticated resource discovery but also to aid automated planning for 
and execution of service-driven applications. In contrast, the 
“coarse-grained registry” approach prefers a registry that restricts 
itself to more general metadata with the expectation that more 
detailed information would be accessed directly from the service. The 
major concern about fine-grained information is one of registry metadata 
curation: detailed metadata is more likely to be either incorrect or not 
provided at all. This concern applies especially to metadata that is 
available directly from the resource (e.g. table metadata); without a 
tight coupling between the registry and services, it’s possible for the 
metadata in the registry to become out of sync or out of date with 
respect to the resource. This adds to the already large curation costs 
registries are faced with to ensure that the quality of the metadata in 
the registry is sufficiently good so that registries are practically 
useful. Today, we are beginning to see applications being built around 
fine-grained metadata in a registry, though we have not yet effectively 
addressed the curation issues. Because all metadata are shared across 
all registries via the harvesting stream, every registry must deal at 
some level with the associated curation costs regardless of whether it 
wishes to support fine-grained applications.
Recommendation: Since the current registry upgrade is a necessary step 
prior to putting into place more effective curation practices, we 
recommend that the upgrade be completed as soon as possible and at the 
highest priority. Further changes in the relevant standards that could 
delay the completion of the upgrade should be avoided.
Recommendation: Curation practices aimed at improving metadata quality 
need to catch up with the desire to develop applications based on 
fine-grained registries. After the upgrade is complete, we recommend 
shifting greater focus to putting such practices into place, including 
effective use of automated validation of resource metadata and the 
standard services they describe.
Recommendation: Extension schemas that are expected to be widely 
supported across all VO registries must be put through the IVOA 
standardization process. Projects that wish to introduce extensions that 
are intended only for local support should consult with the Registry 
Working Group (RWG) regarding possible impact on all registries. 
Documentation in the form of an IVOA Note or, at least, an RWG wiki 
page is recommended.
Recommendation: After the completion of the upgrade, the Registry WG and 
Grid/web service WG should develop mechanisms for harvesting more of the 
fine-grained metadata directly from services (through the VO Standard 
Interface (VOSI) specification), and for reducing the metadata that gets 
shared on the harvesting stream. A registry will then have greater 
control over how much information it manages within the context of its 
store.

(2) GetCapabilities method for Services
Another driver for making more detailed information available from the 
service directly has been pursued by the DAL WG: they wish to make the 
next generation of services more self-describing, independent of the 
registry. In particular, if the service can reveal its capabilities and 
behaviors directly, then service clients can directly negotiate with the 
service. It is expected that such information might often be generated 
either transparently or dynamically by the service implementation, and 
(therefore) it will be more up-to-date than the registry. The proposed 
way of getting this information to clients is via a getCapabilities 
method. There is still considerable discussion going on regarding the 
details of exactly what information is returned and in what form, which 
has been holding up the advancement of critical service specifications 
(SSA, TAP). Further complicating the discussion is the issue of registry 
graininess and how registries should get this information -- see (1) above.
Recommendation: In an effort to allow critical specifications to go 
forward, first-generation techniques for accessing service behaviors 
from the services should be adopted for current protocols, and the 
getCapabilities method should be spun off and incorporated into the VO 
Standard Interfaces (VOSI) specification. This will allow client 
development based on getCapabilities to go forward without holding back 
first implementations of SSA and TAP.

(3) Dependencies in IVOA Recommendations
Some IVOA standards are proceeding to Recommendation while depending on 
IVOA documents that are not themselves Recommendations. One of the 
first was VOEvent, with its dependency on STC 
(not a Rec as of writing, May 2007). Other places where dependency could 
occur are UCD versioning, the VODataService (not a Rec) and VO Support 
Interfaces.
Recommendation: The rules for IVOA standards should state that a Rec 
cannot in principle be dependent on non-Recs.

(4) Footprints in the Registry
It would be very useful for some registry records to contain a footprint 
specification, so that machines can decide if a given point or region 
intersects the coverage of a dataset or service. Currently the registry 
record can contain either (de facto) free text, or a full STC (Space 
Time Coordinates) record.
Recommendation: The registry WG should allow and encourage multiple ways 
to specify a footprint, including: free text; full STC; a restricted 
subset of STC (e.g. BOX, CIRCLE); pointers to footprint services; and 
ways by which footprints can be created by probing a service directly.
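
As an illustration of how a machine might decide whether a point falls 
inside a restricted footprint, here is a minimal Python sketch. It 
assumes a CIRCLE is given as a center and radius in degrees and a BOX as 
plain RA/Dec intervals; the function names and argument order are 
illustrative only and are not taken from STC or any registry schema.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions, all in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def in_circle(ra, dec, center_ra, center_dec, radius):
    """True if (ra, dec) lies within `radius` degrees of the center."""
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius


def in_box(ra, dec, ra_min, ra_max, dec_min, dec_max):
    """True if (ra, dec) lies in the RA/Dec intervals (assumed BOX form).

    If ra_min > ra_max the RA interval is taken to wrap through 0/360.
    """
    ra %= 360.0
    if ra_min <= ra_max:
        ra_ok = ra_min <= ra <= ra_max
    else:
        ra_ok = ra >= ra_min or ra <= ra_max
    return ra_ok and dec_min <= dec <= dec_max
```

A registry client could run such a test against every record that 
carries a restricted footprint, and fall back to free text or a 
footprint service for the rest.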

(5) Registry Harvesting and concatenated XML
A problem has emerged in the last year concerning the XML documents that 
registries exchange in the process of harvesting each other, and this is 
blocking the progress to Recommendation of the VOResource standard. A 
set of these documents (instances of VOResource) is handled by the 
registry with the (false) assumption that a concatenation of valid XML 
documents is also valid. The problem is with the ID construct in XML, 
which states that such ID values must be unique. In particular, the STC 
schema uses these IDs to identify coordinate systems for spatial 
coverage, although we should say this is a general XML problem, not 
specific to STC. A user might write
ID="UTC-FK5-GEO" href="ivo://STClib/CoordSys#UTC-FK5-GEO", meaning the ID 
value can be used as an abbreviation of the referent (href value). 
However, if the same abbreviation is declared elsewhere in the document, 
the XML rules make it invalid, hence the problem with concatenating 
documents that all use the same coordinate system. A solution is 
emerging based on the following agreements: (a) the ID value can and will 
be changed arbitrarily in an XML document without changing the essential 
information, and (b) this is easier to do if all ID values are easy to 
find in the XML; therefore (c) parsing software for the XML document 
must make decisions based on the referent value, not the ID value, and 
(d) the referent of the ID must be well-defined and stable, so that 
parsing software can recognize it.
Recommendation: IVOA standards should try to avoid use of the ID/IDREF 
mechanism, unless they have good reason to believe that conforming 
document instances are unlikely ever to be concatenated.
Recommendation: The IVOA registry group should develop a general 
approach for recognizing this pattern and handling such documents in the 
registry.
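
The ID clash described in (5), and the agreed way around it, can be 
sketched in Python as follows. This is only an illustration: the element 
names Resource, CoordSys, Region and Harvest and the attribute 
coord_system_id are assumptions made for the example, not the 
VOResource or STC schema. Note that xml.etree.ElementTree does not 
validate ID uniqueness, so the sketch counts ID values itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# One toy record; the ID/href pair is the one quoted in the text above.
RECORD = (
    '<Resource>'
    '<CoordSys ID="UTC-FK5-GEO" href="ivo://STClib/CoordSys#UTC-FK5-GEO"/>'
    '<Region coord_system_id="UTC-FK5-GEO"/>'
    '</Resource>'
)


def concatenate(records):
    """Put several record documents under one (hypothetical) wrapper."""
    wrapper = ET.Element("Harvest")
    for text in records:
        wrapper.append(ET.fromstring(text))
    return wrapper


def duplicate_ids(root):
    """Return ID values that occur more than once in the document."""
    counts = Counter(el.get("ID") for el in root.iter()
                     if el.get("ID") is not None)
    return sorted(value for value, n in counts.items() if n > 1)


def make_ids_unique(root):
    """Agreement (a): rewrite IDs per record; the href is left untouched."""
    for n, record in enumerate(root):
        mapping = {}
        for el in record.iter():
            old = el.get("ID")
            if old is not None:
                mapping[old] = f"{old}-{n}"
                el.set("ID", mapping[old])
        # Keep references inside the same record pointing at the new ID.
        for el in record.iter():
            ref = el.get("coord_system_id")
            if ref in mapping:
                el.set("coord_system_id", mapping[ref])


def coordinate_systems(root):
    """Agreement (c): consumers read the referent (href), not the ID."""
    return [el.get("href") for el in root.iter("CoordSys")]


harvest = concatenate([RECORD, RECORD])
before = duplicate_ids(harvest)   # the shared abbreviation clashes
make_ids_unique(harvest)
after = duplicate_ids(harvest)    # no clash once IDs are rewritten
```

Because the rewritten IDs are arbitrary, software that keys on the ID 
value would break here, while software that keys on the stable href 
value is unaffected.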

(6) SOAP and REST
In the IVOA, the term "web service" generally implies either a SOAP or a 
GET/POST/REST type of service protocol. The latter are simpler to 
understand and implement and the software is much less complex and 
bug-infested, and therefore preferable for simple services; however, in 
some cases the extra sophistication of SOAP makes it optimal. A 
significant advantage for SOAP services is that it is easy to create a 
formal interface document (WSDL), whereas this is more difficult for 
GET/POST/REST services (done by hand).
Recommendation: The Grid/Web Services WG should create a study to 
understand where SOAP is sufficiently advantageous and where the easier 
GET/POST/REST can do the job just as well. The Grid/Web Services group 
should re-examine the utility of the “VO WS Basic Profile” document in 
the light of the results of the study.

(7) Asynchronous services
As the VO concept matures, asynchronous services are emerging, where the 
response to a request is not the answer, but rather a way to check on 
the running service, which will eventually produce the answer. There is 
already deployment of asynchronous services (UK-VO, US-VO, France-VO, 
Euro-VO), and standards are converging. The GWS-WG proposal (called UWS) 
has the paradigm Initialize job / Upload input / Receive quote / Run job 
/ Poll status / Fetch results; and the DAL proposal integrates 
asynchrony with astronomical services through the stageData / getData / 
AccessReference attributes of the S*AP protocols. The Table Access 
Protocol (TAP) (see (10)) is being developed with an asynchronous 
capability.
Recommendation: Implementors of asynchronous services should utilize the 
UWS pattern. The DAL stageData protocol should be implemented using the 
UWS pattern. The TAP should base its asynchronous operations on UWS.
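
The paradigm listed in (7), Initialize job / Upload input / Receive 
quote / Run job / Poll status / Fetch results, can be pictured as a 
small state machine. The Python sketch below is an in-memory toy under 
stated assumptions: the class, method and phase names are invented for 
illustration and do not come from the UWS proposal, and the tick() 
method merely stands in for work a real service would do in the 
background.

```python
from enum import Enum


class Phase(Enum):
    PENDING = "pending"        # job exists, inputs may still be uploaded
    EXECUTING = "executing"    # job has been started
    COMPLETED = "completed"    # results are available


class Job:
    def __init__(self, work):
        # Initialize job: `work` is a callable turning inputs into a result.
        self.work = work
        self.inputs = []
        self.phase = Phase.PENDING
        self._result = None

    def upload_input(self, item):
        if self.phase is not Phase.PENDING:
            raise RuntimeError("inputs can only be added before the job runs")
        self.inputs.append(item)

    def quote(self):
        # Toy estimate: one second per uploaded input.
        return float(len(self.inputs))

    def run(self):
        # Returns immediately; the answer is not part of this response.
        self.phase = Phase.EXECUTING

    def tick(self):
        # Stand-in for the server finishing the work in the background.
        if self.phase is Phase.EXECUTING:
            self._result = self.work(self.inputs)
            self.phase = Phase.COMPLETED

    def poll_status(self):
        return self.phase

    def fetch_results(self):
        if self.phase is not Phase.COMPLETED:
            raise RuntimeError("results are not ready yet")
        return self._result


job = Job(work=sum)
job.upload_input(2)
job.upload_input(3)
estimate = job.quote()
job.run()
status_while_running = job.poll_status()
job.tick()
result = job.fetch_results()
```

The point of the pattern is visible in run(): the response to the 
request is not the answer but a job whose status can be checked later.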

(8) Data Models and utypes
The concept of "utype" was defined in the IVOA as a response to the 
fuzzy nature of the UCD descriptor: if a quantity has a utype, then it 
must be part of a specific data model. Proper utypes would allow queries 
to be built independent of the underlying database structure ("where 
STC.coords.FK5.RA between 300 and 302"), and would provide a strong 
framework for parameter-based queries ("http://.....? STC.coords.FK5.RA 
= 300 &..."). However, many of the data models in use in the IVOA have 
XML representation only, and do not have representation as a hierarchy 
of utype values. We note that the syntax of utypes is not well defined 
in the IVOA, and also that in simple cases the utype can be cleanly 
derived from the XPath representation of an XML element, so this should 
be a straightforward matter.
Recommendation: A subcommittee of the IVOA, consisting of the relevant 
persons across the various WGs (at least DM, UCD, VOQL) should review 
the situation of utypes within IVOA. The syntax of utype and its 
namespaces should be well-defined. Just as with UCDs, there should be 
services to find relevant data models and their utypes from search 
words, and there should be services to trace a given utype back to its 
precise meaning.
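
For the simple case mentioned in (8), where a utype follows directly 
from the path to an XML element, a derivation might look like the Python 
sketch below. Since the syntax of utypes is not yet well defined, the 
dotted form is an assumption modelled on the "STC.coords.FK5.RA" example 
above; the sample document and its Dec element are invented for 
illustration, and namespaces, attributes and repeated sibling elements 
are deliberately ignored.

```python
import xml.etree.ElementTree as ET

# Toy document whose element nesting mirrors the utype quoted in the text.
DOC = "<STC><coords><FK5><RA>300.5</RA><Dec>-20.1</Dec></FK5></coords></STC>"


def utype_values(root):
    """Map dotted element paths (candidate utypes) to leaf text values."""
    found = {}

    def walk(element, path):
        path = path + [element.tag]
        children = list(element)
        if not children:
            # Leaf element: its path from the root is the candidate utype.
            found[".".join(path)] = (element.text or "").strip()
        for child in children:
            walk(child, path)

    walk(root, [])
    return found


values = utype_values(ET.fromstring(DOC))
```

Here values maps "STC.coords.FK5.RA" to "300.5", which is the kind of 
name a query such as "where STC.coords.FK5.RA between 300 and 302" would 
refer to.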

(9) Space-Time Coordinates
This large and comprehensive working draft has become a de facto 
standard in the IVOA through multiple implementations, but it is not 
yet a Recommendation. The IVOA should take firm action on this matter to 
resolve the status of STC. While there are several software packages 
that use STC, none of them exercises *every* part of the proposed 
standard. Further, there is often complaint from implementers about the 
complexity of STC -- countered by the contention that astronomical 
coordinate systems are complex by nature. What astronomers want in this 
area is both assurance that full rigor and precise coordinates are 
available in the IVOA; and the release from complexity when that full 
rigor is not deemed necessary by the astronomer.
Recommendation: In addition to STC, there must be a simpler system for 
everyday use, with mappings to full STC well-defined. It is a matter of 
defaults. For example if the information in the simple system is just RA 
and Dec numbers, this can map to the FK5 system with reference point at 
the barycenter of the solar system and the epoch 2000.0. Regions that 
are disks and RA/Dec intervals should be expressible in just a few 
characters. Each alternate syntax should provide a straightforward way 
for a client to recognize both its use and its mapping into full STC.
Recommendation: Applications and standards should clearly describe the 
subset of STC that they are using, for example the Registry uses CIRCLE 
and BOX; VOEvent uses longitude, latitude, and error radius. This will 
allow consumers to build applications against these common subsets. 
However, STC beyond these subsets should be recognized, and applications 
should either use it fully or fail gracefully.
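
The "matter of defaults" in (9) can be made concrete with a short Python 
sketch. The defaults (FK5, barycenter, epoch 2000.0) are those given in 
the Recommendation above; everything else is an assumption for 
illustration, in particular the compact "CIRCLE ra dec radius" and "BOX 
ra_min ra_max dec_min dec_max" strings, which are a hypothetical 
few-character syntax and not part of STC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FullPosition:
    """A bare RA/Dec pair with the defaults needed to map it to full STC."""
    ra_deg: float
    dec_deg: float
    frame: str = "FK5"
    reference_position: str = "BARYCENTER"
    epoch: float = 2000.0


def expand_position(ra_deg, dec_deg):
    """Simple system in, fully qualified position out."""
    return FullPosition(ra_deg, dec_deg)


def parse_region(text):
    """Parse the hypothetical compact region syntax described above."""
    parts = text.split()
    if not parts:
        raise ValueError("empty region")
    kind = parts[0].upper()
    numbers = [float(p) for p in parts[1:]]
    if kind == "CIRCLE" and len(numbers) == 3:
        return {"shape": "CIRCLE",
                "center": expand_position(numbers[0], numbers[1]),
                "radius_deg": numbers[2]}
    if kind == "BOX" and len(numbers) == 4:
        return {"shape": "BOX",
                "ra_range": (numbers[0], numbers[1]),
                "dec_range": (numbers[2], numbers[3]),
                "frame": "FK5", "epoch": 2000.0}
    # Anything outside the supported subset is rejected explicitly.
    raise ValueError(f"unsupported region: {text!r}")
```

Raising a clear error for anything outside the supported subset is one 
way to "fail gracefully" rather than silently misreading a region.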

(10) Table Access Protocol
The TAP is under development by an IVOA subcommittee. The TCG expects 
that it will specify how an ADQL (or optionally an SQL) query can be 
submitted to a service for processing. The response will be in VOTable 
format.
Recommendation: The TAP should build on existing IVOA standards. Initial 
versions of the protocol should state clearly what it will eventually 
define, but mandate the minimum necessary to ensure a public release of 
the protocol is achieved without delay.

(11) Multiple Data Access
A principal justification of the VO itself is to encourage statistical 
studies of populations of astronomical objects, as well as the more 
traditional single object study. The IVOA should encourage this through 
multi-point protocols, bulk data access, and scalability of services to 
the grid.
Recommendation: Data access protocols should be re-considered in terms 
of their ability to handle multiple requests and bulk data. The Cone and 
SIA services, in particular, do not handle multiple requests.
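
To see what the lack of multiple-request support means in practice for 
the Cone service, consider the Python sketch below. It relies on the 
Cone Search query parameters RA, DEC and SR (decimal degrees); the base 
URL is a placeholder and the helper name is invented for the example.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/cone"  # placeholder, not a real service


def cone_search_urls(base_url, positions, radius_deg):
    """Build one Cone Search request URL per (ra, dec) position."""
    return [
        f"{base_url}?{urlencode({'RA': ra, 'DEC': dec, 'SR': radius_deg})}"
        for ra, dec in positions
    ]


positions = [(300.0, -20.0), (301.5, -19.25), (10.0, 45.0)]
urls = cone_search_urls(BASE_URL, positions, 0.1)
```

Three positions yield three separate requests, so a study of a large 
population of objects needs one round trip per object unless the 
protocol gains a multi-point form.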

(12) VO interoperability with popular software
Most astronomers do most of their work with software packages like IDL, 
IRAF, DS9, MIDAS, Sextractor, etc. It is highly desirable that these be 
interoperable with the VO framework through use of VO services and 
desktop messaging.
Recommendation: The VO national projects and Applications WG should 
assess VO interoperability with these popular astronomy software 
packages and environments.

(13) Bundling of VO software
Bundling of astronomy software such as the Scisoft and ex-Starlink 
collections provides a convenient way of distributing many packages at 
once to ease the burden of installation. Bundled distributions of VO 
software would assist in up-take of VO tools, and we note that Scisoft 
VII will contain a selection of VO software.
Recommendation: The list of VO Applications maintained on the (publicly 
editable) Apps WG wiki pages should serve as a place for Applications 
to be visible to parties compiling collections of VO tools.

(14) Interoperable Security
Security and authentication are being implemented in several new 
efforts. The Astrogrid (UK-VO) project has 
built a sophisticated workflow system for asynchronous computations and 
is adding authentication; a complementary project from the US NVO 
project is exploring the idea of “graduated security” for giving 
community access to high-performance computing. While the IVOA has a 
mature Single-signon standard for security, using X.509 certificates, 
there has been little discussion of which VO projects are issuing 
certificates and the levels of authentication taking place, and which VO 
projects will accept certificates from which other projects.
Recommendation: The Grid-Web Services WG should create a listing of 
certificate authorities in the national projects, how to get a 
certificate from each, what can be done with the certificate, and 
compliance with accreditation guidelines (e.g. PMA).
Recommendation: VO organizations should encourage their users to obtain 
certificates from PMA-accredited Certificate Authorities where possible, 
or, failing this, from properly accredited certificate authorities 
inside the VO movement.
Recommendation: Service providers in the VO should be encouraged to 
accept by default only certificates from PMA-accredited certificate 
authorities and certificate authorities accredited within the VO 
movement. They may choose also to accept "weak certificates" for cases 
where the providers deem this to be sufficiently safe.
Recommendation: The IVOA should choose a set of guidelines for 
VO-accredited certificate authorities, basing the guidelines on those 
for the PMA-accredited authorities.

(15) Units
Most scientific quantities carry units, and data returned from IVOA 
services should also carry explicit unit information when the units are 
not clear implicitly. Units should follow the IAU recommendation and/or 
the VOTable recommendation. When a user makes a query based on a 
quantity, units can either be user-defined or fixed. In the former case, 
the user has the freedom to express the quantity in arbitrary units (e.g. 
calories per square furlong per hour!), or to pick from an enumerated 
choice (e.g. Angstroms OR nanometers). In the case of fixed units, the 
data model of the query is bound to specific units (e.g. all angles must 
be in decimal degrees).
Recommendation: The Data Model Working Group should study how units are 
used in IVOA views and services, where it would be appropriate to simply 
fix the units, and where it is necessary to allow freedom of choice, 
distinguishing between unit choice in the user interface and in the 
back-end services. In the latter case, the report should also recommend 
how unit conversion is implemented: who is responsible and the nature 
of the software.
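
The "enumerated choice" case in (15) amounts to a small conversion table 
in front of a service with fixed units. The Python sketch below assumes, 
purely for illustration, that the back end fixes wavelengths in metres 
and that the user interface offers Angstroms or nanometers; the table 
and function names are hypothetical.

```python
import math

# Conversion factors from each offered unit to the assumed fixed unit (m).
TO_METRES = {"Angstrom": 1e-10, "nm": 1e-9}


def to_fixed_units(value, unit):
    """Convert a wavelength from an enumerated unit to metres."""
    try:
        factor = TO_METRES[unit]
    except KeyError:
        # Units outside the enumerated choice are rejected, not guessed.
        raise ValueError(f"unsupported unit: {unit!r}") from None
    return value * factor


same_wavelength = math.isclose(
    to_fixed_units(5000, "Angstrom"), to_fixed_units(500, "nm")
)
```

Where this conversion lives, in the user interface or in the back-end 
service, is exactly the question the study above is asked to settle.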

(16) IVOA Newsletter
Recommendation: The global VO community would be well-served by an IVOA 
newsletter, including announcements from national projects and working 
groups, events, press coverage of VO issues, etc.




