Extensions to VOResource - was Error column in VOResource

Roy Williams roy at caltech.edu
Tue May 18 16:42:24 PDT 2004


I believe that each service can be described in several metadata formats:

1 -- The SOAP clients want WSDL
2 -- Some astronomers want VOResource
3 -- Some astronomers want the VOTable header of the output.
4 -- The Librarians want Dublin Core and METS

A service can emit 1 itself if it is a SOAP service rather than a GET service.
O'Mullane and Rixon are suggesting that each compliant service must emit 2.
I believe that 3 would be easy and useful. The OAI interface emits 4.

We can think of each service as its own micro-registry that has only one
entry (itself). The "real" registries are just caches accumulated from them.
Rather than "publishing" a service, we simply point a registry at it and
everything necessary (1-4) is harvested.
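
A minimal sketch of the "registry as cache" idea, assuming each service can
be asked for a description in a named format. The class names, the format
labels and the describe() method are hypothetical, for illustration only;
they are not part of any VO specification.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the four descriptions listed above (1-4).
FORMATS = ("wsdl", "voresource", "votable_header", "dublin_core")


@dataclass
class Service:
    """A service acting as its own one-entry micro-registry (illustrative)."""
    url: str
    metadata: dict  # format label -> metadata document, as text

    def describe(self, fmt):
        # Return the document for one format, or None if it is not offered.
        return self.metadata.get(fmt)


@dataclass
class Registry:
    """A "real" registry: nothing more than a cache accumulated from services."""
    cache: dict = field(default_factory=dict)

    def harvest(self, service):
        # Point the registry at a service and keep whatever it emits.
        found = {}
        for fmt in FORMATS:
            doc = service.describe(fmt)
            if doc is not None:
                found[fmt] = doc
        self.cache[service.url] = found
        return found
```

Nothing is "published" in this sketch: the registry entry exists only because
harvest() was pointed at the service, and it can be rebuilt at any time by
harvesting again.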

We could demand that each service expose an OAI interface with some of these
metadata formats.
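
As a rough illustration of what a harvester would send to such an interface,
here is a sketch that builds OAI-PMH request URLs. The verbs
(ListMetadataFormats, GetRecord) and the oai_dc prefix come from the OAI-PMH
protocol; the base URL, the record identifier and the helper name
oai_request are made up for the example.

```python
from urllib.parse import urlencode


def oai_request(base_url, verb, **args):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    return base_url + "?" + urlencode({"verb": verb, **args})


# Hypothetical endpoint exposed by a single service.
BASE = "http://example.org/oai"

# Ask the service which metadata formats it can emit.
formats_url = oai_request(BASE, "ListMetadataFormats")

# Fetch the service's own record in Dublin Core (format 4 above).
record_url = oai_request(
    BASE,
    "GetRecord",
    identifier="ivo://example/svc",  # made-up identifier
    metadataPrefix="oai_dc",
)
```

A service offering VOResource as well would list a further metadataPrefix in
its ListMetadataFormats response; the name of that prefix is not fixed here.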

Roy

--------
California Institute of Technology
roy at caltech.edu
626 395 3670

----- Original Message ----- 
From: "Doug Tody" <dtody at aoc.nrao.edu>
To: "Martin Hill" <mchill at dial.pipex.com>
Cc: <registry at ivoa.net>
Sent: Tuesday, May 18, 2004 4:03 PM
Subject: Re: Extensions to VOResource - was Error column in VOResource


> I wonder if we are confusing what we mean by the term metadata?
> By metadata in the earlier discussion I referred primarily to descriptive
> metadata used to describe a dataset - sky coverage, bandpass, WCS,
> and so forth.  WSDL however refers to the service interface.  Indeed,
> the service type and interface, the capability matrix for a particular
> service instance, etc., are all fairly static, well standardized things
> which should be available via the registry.  I certainly agree that all
> VO services should be registered and described in the registry.  Doug
>
>
> On Tue, 18 May 2004, Martin Hill wrote:
>
> > I'll second that.
> >
> > I am concerned though that avoiding the standardisation process *too*
> > much will leave us with n x m connections between n applications and m
> > services.  Using the registry as a central connection point for metadata
> > gives us long term resilience - in fact 'loosely couples' the
> > applications and services (and indeed the service <-> service
> > connections).
> >
> > Services that dynamically generate data still have consistent metadata -
> > SIAP for example *could* be defined using WSDL and it strikes me is such
> > a common VO service that we should describe it in VOResource so that
> > discovery tools can find them and 'understand' what is available at each
> > one.  Such things as sky coverage, units used to select regions, other
> > query parameters (incl UCDs), etc are all things that apply to any
> > queryable dataset, whether the data itself is static or dynamically
> > generated.
> >
> > One-off or rare services are candidates for not trying to describe using
> > VO metadata.  And certainly to begin with we should concentrate on the
> > very common services.
> >
> > Cheers (looking forward to more discussions on this over a few
> > microbrewery pints)
> >
> > Martin
> >
> > Tony Linde wrote:
> >
> > > I completely agree, Doug. We should standardize on what we can agree as a
> > > common standard - via the DM effort. But any extensions should follow some
> > > standard extension mechanism so that, as you say, they can at least be seen
> > > by users or included and passed on by applications.
> > >
> > > Cheers,
> > > Tony.
> > >
> > >
> > >>-----Original Message-----
> > >>From: owner-registry at eso.org [mailto:owner-registry at eso.org]
> > >>On Behalf Of Doug Tody
> > >>Sent: 18 May 2004 18:05
> > >>To: Tony Linde
> > >>Cc: registry at ivoa.net
> > >>Subject: RE: Error column in VOResource
> > >>
> > >>On Tue, 18 May 2004, Tony Linde wrote:
> > >>
> > >>
> > > >>>Right back at ya :) ... How does an application know how to handle
> > > >>>metadata that conforms to no known standard? Whatever the problems for
> > > >>>the registry, they are a thousand times worse for the apps since
> > > >>>there'll be thousands of applications wanting to use the resources.
> > >>
> > >>We will never be able to standardize everything.  We will
> > >>never even be able to know about all the telescopes, survey
> > >>projects, etc., being developed or underway around the world.
> > >> Even if we do know about a project it will be constantly
> > >>changing.  All we can really hope to do is standardize the
> > >>core, and define a standard framework for things like
> > >>resource description, dataset characterization, data formatting, etc.
> > >>
> > >>People will use these standard mechanisms, try to adhere to
> > >>the standard core, but will need to add nonstandard
> > >>extensions to do new things, or to specialize the services,
> > >>data model, or data packaging to fully describe their data.
> > >>Sure, all applications will not be able to understand and
> > >>deal with the extensions, but this is how new standards
> > >>develop, and some subset of applications will really need
> > >>those extensions to process certain classes of data, and will
> > >>be written to do so.  So long as the service or dataset is
> > >>compliant to some core model then all applications which
> > >>support the core will work down to that level, ignoring the
> > >>extensions.
> > >>Even nonstandard extensions can be useful if packaged in a
> > >>standard way, e.g., a human can browse them to better
> > >>understand the data, generic searches can be performed,
> > >>generic tools can be used in an ad-hoc fashion, and so forth.
> > >>
> > >>Basically I am arguing that the standard VO framework should
> > >>only try to go so far, but should be designed to be
> > >>extensible.  If it tries to be all-inclusive it will be too
> > >>complicated to be used, and will never work anyway.
> > >>
> > >>
> > >>
> > > >>>I don't understand how the metadata can be dynamic (other than by a
> > > >>>data centre accumulating more data). Surely the coverage of a dataset,
> > > >>>say, is based on the data in it? Even virtual data has to be generated
> > > >>>from some real data and it is on that data that the metadata is based.
> > > >>>Maybe some examples would help, Doug.
> > >>
> > >>This is all true for static archive data products, e.g.,
> > >>precomputed survey images in an archive.  But what if we
> > >>have, e.g., an image access service which generates images on
> > >>the fly, e.g., image cutouts or mosaics?
> > >>Or perhaps the service generates images on the fly from X-ray
> > >>event data, applying a time filter in the process and
> > >>generating the image with the desired celestial projection?
> > >>SIA for example already supports all this.
> > >>Basically what happens is the client application tells the
> > >>service what it would ideally like to get back, the service
> > >>decides what it can provide, and returns metadata for one or
> > >>more virtual datasets which it can generate to satisfy the
> > >>query.  The image is not actually generated until the access
> > >>reference URL is invoked.
> > >>
> > >>What we need the registry for is to tell us what services are
> > >>out there, what they are capable of, and the characteristics
> > >>of the data they can serve (specific data collections,
> > >>bandpass, sky coverage, etc.).  We also need to register all
> > >>data collections and be able to find services which can serve
> > >>them up.  It could also be useful to register individual
> > >>static datasets within a data collection, including caching
> > >>dataset metadata of some type (at least that which uniformly
> > >>characterizes the data at a high level).
> > >>This would start to provide a replica management capability
> > >>for managing large data collections.  One has to ask though,
> > >>whether this is something which should be provided by the
> > >>registry or by a separate replica management service.  If it
> > >>gets complicated enough, it may be better to split it off as
> > >>a separate service in order to avoid over-complicating the registry.
> > >>
> > >>Anyway, enough!  I have to get back to DAL stuff or I won't
> > >>be ready for next week.
> > >>
> > >> - Doug
> > >>
> > >>
> > >>
> > >>
> > >>>Tony.
> > >>>
> > >>>
> > >>>>-----Original Message-----
> > >>>>From: owner-registry at eso.org [mailto:owner-registry at eso.org] On
> > >>>>Behalf Of Doug Tody
> > >>>>Sent: 18 May 2004 17:18
> > >>>>To: Tony Linde
> > >>>>Cc: registry at ivoa.net
> > >>>>Subject: RE: Error column in VOResource
> > >>>>
> > >>>>Tony, how does your approach handle services which return virtual
> > >>>>data, or datasets which contain metadata which has not been
> > >>>>standardized?
> > >>>>
> > > >>>>In the case of virtual data, the metadata for the virtual dataset is
> > > >>>>not static hence cannot be cached in the registry.
> > > >>>>One has to ask the actual service what it can generate to service a
> > > >>>>specific query, and the metadata for the virtual dataset is
> > > >>>>generated on the fly.  Probably this sort of thing will be the case
> > > >>>>for most sophisticated VO services.  Hence, the registry is limited
> > > >>>>primarily to service discovery based on fairly high
> > > >>>>level, static resource descriptors.  Doug
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>On Tue, 18 May 2004, Tony Linde wrote:
> > >>>>
> > >>>>
> > > >>>>>As I keep saying, the coupling is an issue of implementation, not
> > > >>>>>design. We design the registry interface so that it is a one-stop shop
> > > >>>>>for all metadata but if one implementation gets the metadata from the
> > > >>>>>resource every time it is asked and another caches that metadata, it
> > > >>>>>is transparent to the calling application.
> > >>>>>
> > >>>>>
> > > >>>>>>Making the registry a one-stop shop for metadata demands a tight
> > > >>>>>>coupling with the services they describe.  Any change
> > >>>>>
> > > >>>>>I don't see why - the registry only needs to know how to ask for the
> > > >>>>>metadata and how to return it to the calling app. We will have a
> > > >>>>>standard way of getting the metadata (from Wil's proposal for a
> > > >>>>>standard baseline for all services), a standard representation
> > > >>>>>(VOResource which includes the DM-based schema) and a standard way
> > > >>>>>for apps to get that metadata (RI spec).
> > > >>>>>This is about as loose a coupling as I can think of.
> > >>>>>
> > > >>>>>And if it *did* require tight coupling, all the more reason to put all
> > > >>>>>this processing into the registry, otherwise you end up with every
> > > >>>>>single application having to be tightly coupled to every resource
> > > >>>>>- but this is not the case.
> > >>>>>
> > > >>>>>Making the registry the source of all metadata means that all
> > > >>>>>applications only have to manage one interface right down until they
> > > >>>>>select the service they want to invoke - they don't have to each and
> > > >>>>>every one be coded to fish around lots of services looking for the
> > > >>>>>metadata they want.
> > >>>>
> > >>>>>T.
> > >>>>>
> > >>>>>
> > >>>>>>-----Original Message-----
> > >>>>>>From: Ray Plante [mailto:rplante at ncsa.uiuc.edu]
> > >>>>>>Sent: 18 May 2004 16:51
> > >>>>>>To: Tony Linde
> > >>>>>>Cc: registry at ivoa.net
> > >>>>>>Subject: RE: Error column in VOResource
> > >>>>>>
> > >>>>>>On Tue, 18 May 2004, Tony Linde wrote:
> > >>>>>>
> > >>>>>>>the registry is a one stop shop for all metadata.
> > >>>>>>
> > > >>>>>>I disagree with this statement in general.  Besides various practical
> > > >>>>>>reasons of scaling and scope, there is an issue of volatility.
> > > >>>>>>
> > > >>>>>>Making the registry a one-stop shop for metadata demands a tight
> > > >>>>>>coupling with the services they describe.  Any change in the service
> > > >>>>>>must be reflected back into the registry.  If the registry is simply
> > > >>>>>>about discovering services, the coupling is looser, and the service
> > > >>>>>>is more flexible to changes in implementation.
> > >>>>>>
> > > >>>>>>It can be argued that the tighter the coupling, the more costly the
> > > >>>>>>system in terms of software development and coordination of people.
> > > >>>>>>A tightly coupled design may be appropriate for a particular VO
> > > >>>>>>project that can manage that coordination; however, it's less
> > > >>>>>>appropriate for the IVOA as a whole.
> > > >>>>>>
> > > >>>>>>It's an interesting issue that I expect we'll learn more about with
> > > >>>>>>experience.
> > >>>>>>
> > >>>>>>cheers,
> > >>>>>>Ray
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>
> > >
> > >
> >
> >
> >
>


