RWP04: Registry Replication
Tony Linde
ael at star.le.ac.uk
Wed Apr 30 09:17:37 PDT 2003
<<Type of services: >>
<<Type of objects: >>
I don't think Keith is trying to represent himself as an expert on
astronomy: his example is generic. For AstroGrid, we're defining a narrow
set of parameters that can go into a registry query.
<<Who's coming up with things like a standard list of object types ?>>
For IVOA, we need to define those parameters: that is the job of Rwp03:
http://www.ivoa.net/twiki/bin/view/IVOA/IVOARegWp03
Cheers,
Tony.
-----Original Message-----
From: amicol at eso.org [mailto:amicol at eso.org] On Behalf Of Alberto Micol
Sent: 30 April 2003 12:18
To: Keith Noddle
Cc: IVOA Registry mailing list; dm at ivoa.net
Subject: Re: RWP04: Registry Replication
SELECT * FROM REGISTRY WHERE
(
TYPE="white dwarf star" AND
(WAVELENGTH="optical" OR WAVELENGTH="uv") AND
(KEYWORD="BPM 16274" OR
KEYWORD="GD 50" OR
KEYWORD="HST photometric standards"
)
)
then one way of presenting this in XML is shown ... etc
Dear Keith,
Type of services:
The first problem I have is that the query does not explicitly
define what type of services you want to identify in the registry.
I think a SERVICE_TYPE is required. It could take some values
like "catalogue browser", "data archive", "documentation", etc.
If someone is interested in ANY service type, then the query should
make that explicit with a constraint like:
SERVICE_TYPE="ANY"
Type of objects:
You use the constraint TYPE="white dwarf star"
This is not a service_type, but specifies what class of objects the user
is interested in. It should probably be called OBJECT_TYPE.
The main problem here is in the value: "white dwarf star"
How is that going to be used ?
Will only resources matching exactly "white dwarf star" be returned ?
What if the resource I maintain lists OBJECT_TYPE="white dwarf" ?
It will not match ...
We have to come up with a standard list, a thesaurus,
to homogenise those types!
Probably the best thing is to start with the IAU thesaurus
http://msowww.anu.edu.au/library/thesaurus/english/
(but I remember other similar efforts like the IUE object class ...).
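To illustrate the matching problem above, here is a minimal Python sketch of comparing OBJECT_TYPE values through a thesaurus rather than by exact string. The THESAURUS table, its entries and both function names are hypothetical and are not taken from the IAU thesaurus; a real list would have to be agreed by the working groups.

```python
# Hypothetical thesaurus: maps variant terms to one preferred term.
# The entries are illustrative only, not copied from the IAU thesaurus.
THESAURUS = {
    "white dwarf": "white dwarf stars",
    "white dwarfs": "white dwarf stars",
    "white dwarf star": "white dwarf stars",
}


def normalise_object_type(value):
    """Return the preferred term for value, or the cleaned value if unknown."""
    key = " ".join(value.lower().split())  # fold case and repeated spaces
    return THESAURUS.get(key, key)


def object_type_matches(query_value, resource_value):
    """Compare two OBJECT_TYPE values after thesaurus normalisation."""
    return normalise_object_type(query_value) == normalise_object_type(resource_value)
```

With such a table, a query for "white dwarf star" would match a resource that lists "white dwarf", which a plain string comparison would miss.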
Wavelength:
Wavelength is another item value you defined.
Here the Data Model should intervene with a proper definition both for
the name of the item (I remember a joke by Jonathan McDowell that
introduced the FREWAVERGY!), and for the values (optical and uv are ok,
but we need to define many more).
Keyword:
As you presented this, it looks like KEYWORD is a generic container,
which could take very different values spanning from object names (GD 50)
to more or less free text (HST photometric standards).
Maybe this is too generic ... ?
One of the problems I have with the registry is that I do not know
in advance whether a service will list a certain characteristic
in its metadata, or within the data itself.
Example:
The "GD 50" white dwarf might be listed in a set of resource keywords,
in some other cases it will be a record in a resource (eg an entry
in a catalogue of white dwarfs);
in this second case your query will fail, even though the object is to
be found in the resource.
I described the same problem in the Rwp02 astrovirtel use case for the
Distance attribute, where a catalogue might offer the distance as a field
in its records, or there could be a metadata keyword saying that the
objects in a catalog are all to be found within so many megaparsecs.
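The difference between the two cases can be sketched in Python as follows. The resource layout (an "id", a list of registry "keywords" and a list of catalogue "records") and the sample entries are assumptions made up for this illustration, not a proposed registry format.

```python
# Two hypothetical resources: one lists the object as a registry keyword,
# the other only holds it as a record inside its data.
resources = [
    {"id": "standards-list", "keywords": ["GD 50", "BPM 16274"], "records": []},
    {"id": "wd-catalogue", "keywords": ["white dwarf"], "records": ["GD 50", "GD 71"]},
]


def registry_match(resource, name):
    """True if the name appears in the resource-level keywords."""
    return name in resource["keywords"]


def data_match(resource, name):
    """True if the name appears as a record inside the resource."""
    return name in resource["records"]


# A query against registry keywords alone misses the catalogue.
registry_hits = [r["id"] for r in resources if registry_match(r, "GD 50")]

# Also looking inside the data finds both resources.
all_hits = [
    r["id"]
    for r in resources
    if registry_match(r, "GD 50") or data_match(r, "GD 50")
]
```

Here registry_hits contains only "standards-list", while all_hits also contains "wd-catalogue".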
Summary:
My main point here is that the registry must describe things using
not only a well defined set of keywords,
but also a well defined set of keyword values!
Photometric aspects (the Wavelength values): it will be
the Data Model WG to come up with a proper description and set of values;
service_types and other service (or resource) specific metadata
(I mean not the level 0 keywords in Bob's document, but other keywords
for deeper levels) will be defined by the Registry WG.
What I do not know is:
Who's coming up with things like a standard list of object types ?
Do we need one ? (I think so ... the IAU thesaurus might show the way)
Alberto
--
Alberto.Micol at eso.org Tel: +49 89 32006365
HST Science Archive ST-ECF Fax: +49 89 32006480
ESA/RSSD/SN c/o ESO Karl Schwarzschild Str.2,
http://archive.eso.org/ No ads, thanks. Garching bei Muenchen,
http://www.stecf.org/ HTML emails D-85748 Germany
Keith Noddle wrote:
Thanks for the feedback Ray - much appreciated!
I think we're all converging on the model proposed by Tony which, to
paraphrase Ray, is the full-(specialist/limited)-(source/private) model.
I think this pretty much satisfies the points raised by Wil and Ray as
well as meeting most of the useful requirements I originally proposed.
I'll start work on the design and get something posted on the IVOA Wiki
for comment - I'm conscious of the need to get a presentation together
in short order for Cambridge!
The other major aspect of the RWP04 work is the development of a
registry query schema. Tony, Elizabeth and I have been working on
something similar(!) for AstroGrid and whilst we don't yet have a schema,
we have a simple example XML query (below) upon which I would welcome
everyone's comments. We are looking into XQuery/XPath, but that might
not be appropriate for the current iteration of AstroGrid. Again, your
comments would be welcome.
Keith.
--
Keith Noddle Phone: +44 (0)116 223 1894
AstroGrid Technical Lead Fax: +44 (0)116 252 3311
Dept of Physics & Astronomy Mobile: +44 (0)7721 926 461
University of Leicester Email: ktn at star.le.ac.uk
Leicester, UK LE1 7RH Web: http://www.astrogrid.org
--------------------------------------------
If the query we are trying to satisfy can be expressed in pseudo-SQL as:
SELECT * FROM <registry> WHERE
(
TYPE="white dwarf star" AND
(WAVELENGTH="optical" OR WAVELENGTH="uv") AND
(KEYWORD="BPM 16274" OR
KEYWORD="GD 50" OR
KEYWORD="HST photometric standards"
)
)
then one way of presenting this in XML is shown below
<query>
<selectionSequence>
<selection>
<item>type</item>
<value>white dwarf star</value>
</selection>
<operator>AND</operator>
<selectionSequence>
<selection>
<item>wavelength</item>
<value>optical</value>
</selection>
<operator>OR</operator>
<selection>
<item>wavelength</item>
<value>uv</value>
</selection>
</selectionSequence>
<operator>AND</operator>
<selectionSequence>
<selection>
<item>keyword</item>
<value>BPM 16274</value>
</selection>
<operator>OR</operator>
<selection>
<item>keyword</item>
<value>GD 50</value>
</selection>
<operator>OR</operator>
<selection>
<item>keyword</item>
<value>HST photometric standards</value>
</selection>
</selectionSequence>
</selectionSequence>
</query>
This only requires 6 tags and is sufficiently flexible to cope with most
queries I can think of - but I'm not an astronomer...!
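As a rough check that the nesting carries the same information as the pseudo-SQL, the following Python sketch walks a query of this shape and rebuilds the WHERE clause. It assumes only the six tags used above; the function names and the shortened sample query are illustrative and are not part of any AstroGrid code.

```python
import xml.etree.ElementTree as ET

# Shortened sample using the same tags as the example query.
QUERY_XML = """
<query>
  <selectionSequence>
    <selection><item>type</item><value>white dwarf star</value></selection>
    <operator>AND</operator>
    <selectionSequence>
      <selection><item>wavelength</item><value>optical</value></selection>
      <operator>OR</operator>
      <selection><item>wavelength</item><value>uv</value></selection>
    </selectionSequence>
  </selectionSequence>
</query>
"""


def to_where(node):
    """Recursively turn one element into its pseudo-SQL text."""
    if node.tag == "selection":
        item = node.findtext("item").upper()
        value = node.findtext("value")
        return f'{item}="{value}"'
    if node.tag == "operator":
        return node.text.strip()
    if node.tag == "selectionSequence":
        # Children alternate between selections/sequences and operators.
        return "(" + " ".join(to_where(child) for child in node) + ")"
    raise ValueError(f"unexpected element: {node.tag}")


def query_to_sql(xml_text):
    """Convert a <query> document into a pseudo-SQL SELECT statement."""
    root = ET.fromstring(xml_text)
    sequence = root.find("selectionSequence")
    return "SELECT * FROM <registry> WHERE " + to_where(sequence)


print(query_to_sql(QUERY_XML))
```

Each selectionSequence becomes one parenthesised group, so the round trip only works if every sequence alternates strictly between terms and operators; a schema would need to enforce that.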