RWP04: Registry Replication

Alberto Micol Alberto.Micol at eso.org
Wed Apr 30 04:17:33 PDT 2003


> SELECT * FROM REGISTRY WHERE
>    (
>       TYPE="white dwarf star" AND
>       (WAVELENGTH="optical" OR WAVELENGTH="uv") AND
>       (KEYWORD="BPM 16274" OR
>          KEYWORD="GD 50" OR
>          KEYWORD="HST photometric standards"
>       )
>    )
>
> then one way of presenting this in XML is shown ... etc

Dear Keith,


Type of services:

   The first problem I have is that the query does not explicitly
   define what type of services you want to identify in the registry.

   I think a SERVICE_TYPE is required. It could take some values
   like "catalogue browser", "data archive", "documentation", etc.
   If someone is interested in ANY service type, then the query should
   state that explicitly with a constraint like:

       SERVICE_TYPE="ANY"
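
   A minimal sketch of how such a constraint could behave, assuming a
   registry held as a plain list of records. The record layout, the
   resource names and the helper names are hypothetical; only the
   SERVICE_TYPE values come from the list suggested above.

```python
# Hypothetical registry records; names and layout are illustrative only.
RESOURCES = [
    {"name": "WD catalogue", "service_type": "catalogue browser"},
    {"name": "HST archive", "service_type": "data archive"},
    {"name": "Handbook", "service_type": "documentation"},
]


def match_service_type(resource, wanted):
    """Return True if the resource satisfies the SERVICE_TYPE constraint.

    The special value "ANY" states the "no restriction" case explicitly.
    """
    return wanted == "ANY" or resource["service_type"] == wanted


def find(resources, service_type):
    """Return the names of all resources matching the requested service type."""
    return [r["name"] for r in resources if match_service_type(r, service_type)]


print(find(RESOURCES, "data archive"))
print(find(RESOURCES, "ANY"))
```

   With SERVICE_TYPE="ANY" every record is returned; any other value
   restricts the result to resources declaring exactly that type.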

Type of objects:

   You use the constraint TYPE="white dwarf star"
   This is not a service_type, but specifies what class of objects the user
   is interested in. It should probably be called OBJECT_TYPE.

   The main problem here is in the value: "white dwarf star"

   How is that going to be used ?
   Will only resources matching exactly "white dwarf star" be returned ?
   What if the resource I maintain lists OBJECT_TYPE="white dwarf" ?
   It will not match ...
   We have to come up with a standard list, a thesaurus,
   to homogenise those types!

   Probably the best thing is to start with the IAU thesaurus
   http://msowww.anu.edu.au/library/thesaurus/english/
   (but I remember other similar efforts like the IUE object class ...).
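
   To illustrate the matching problem, here is a small sketch in which
   both the query value and the resource value are mapped to one
   preferred term before they are compared. The mapping table is
   invented for the example and is not taken from the IAU thesaurus;
   the function names are hypothetical too.

```python
# Hypothetical mapping from free-text labels to one preferred term.
# The entries are invented for illustration, not taken from the IAU thesaurus.
PREFERRED_TERM = {
    "white dwarf": "white dwarf stars",
    "white dwarf star": "white dwarf stars",
    "white dwarf stars": "white dwarf stars",
    "wd": "white dwarf stars",
}


def normalise(label):
    """Map a free-text object type to its preferred term, if one is known."""
    key = " ".join(label.lower().split())  # fold case and collapse whitespace
    return PREFERRED_TERM.get(key, key)


def exact_match(query_type, resource_type):
    """Plain string comparison, as the original query would do."""
    return query_type == resource_type


def thesaurus_match(query_type, resource_type):
    """Compare the two values after mapping both to preferred terms."""
    return normalise(query_type) == normalise(resource_type)


print(exact_match("white dwarf star", "white dwarf"))      # False
print(thesaurus_match("white dwarf star", "white dwarf"))  # True
```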

Wavelength:

   Wavelength is another item value you defined.
   Here the Data Model should intervene with a proper definition both for the
   name of the item (I remember a joke by Jonathan McDowell that introduced
   the FREWAVERGY!), and for the values (optical and uv are ok, but we need
   to define many more).

Keyword:

   As you presented this, it looks like KEYWORD is a generic container,
   which could take very different values spanning from object names (GD 50)
   to more or less free text (HST photometric standards).

   Maybe this is too generic ... ?
   One of the problems I have with the registry is that I do not know
   in advance whether a service will list a certain characteristic
   in its metadata, or within the data itself.
   Example:
   The "GD 50" white dwarf might be listed in a set of resource keywords;
   in other cases it will be a record in a resource (e.g. an entry
   in a catalogue of white dwarfs).
   In this second case your query will fail, even though the object is to
   be found in the resource.

   I described the same problem in the Rwp02 astrovirtel use case for the
   Distance attribute, where a catalogue might offer the distance as a field
   in its records, or there could be a metadata keyword saying that the objects
   in a catalogue are all to be found within so many megaparsecs.
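
   The "GD 50" case can be sketched as follows, assuming two resources:
   one that lists the object among its keywords, and one that only holds
   it as a catalogue record. The record layout, the resource names and
   both search functions are hypothetical.

```python
# Two hypothetical resources: one lists "GD 50" among its keywords,
# the other only contains it as a record in the data itself.
RESOURCES = [
    {
        "name": "HST standards page",
        "keywords": ["GD 50", "HST photometric standards"],
        "records": [],
    },
    {
        "name": "White dwarf catalogue",
        "keywords": ["white dwarf"],
        "records": ["BPM 16274", "GD 50"],
    },
]


def search_metadata(resources, keyword):
    """Look only at the resource-level keywords, as a registry query would."""
    return [r["name"] for r in resources if keyword in r["keywords"]]


def search_metadata_and_data(resources, keyword):
    """Look at the keywords and also inside the records of each resource."""
    return [
        r["name"]
        for r in resources
        if keyword in r["keywords"] or keyword in r["records"]
    ]


print(search_metadata(RESOURCES, "GD 50"))
print(search_metadata_and_data(RESOURCES, "GD 50"))
```

   The metadata-only search misses the catalogue, even though the
   object is in it.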

Summary:

My main point here is that the registry must describe things using
not only a well defined set of keywords,
but also a well defined set of keyword values!

Photometric aspects (the Wavelength values): it will be up to
the Data Model WG to come up with a proper description and set of values;

service_types and other service (or resource) specific metadata
(I mean not the level 0 keywords of Bob's document, but other keywords for deeper levels)
will be defined by the Registry WG.

What I do not know is:
Who's coming up with things like a standard list of object types ?
Do we need one ? (I think so ... the IAU thesaurus might show the way)

Alberto

--
Alberto.Micol at eso.org                         Tel: +49 89 32006365
HST Science Archive       ST-ECF              Fax: +49 89 32006480
ESA/RSSD/SN               c/o ESO             Karl Schwarzschild Str.2,
http://archive.eso.org/   No ads, thanks.     Garching bei Muenchen,
http://www.stecf.org/     HTML emails         D-85748 Germany

Keith Noddle wrote:

> Thanks for the feedback Ray - much appreciated!
>
> I think we're all converging on the model proposed by Tony which, to
> paraphrase Ray, is the full-(specialist/limited)-(source/private) model.
> I think this pretty much satisfies the points raised by Wil and Ray as
> well as meeting most of the useful requirements I originally proposed.
> I'll start work on the design and get something posted on the IVOA Wiki
> for comment - I'm conscious of the need to get a presentation together
> in short order for Cambridge!
>
> The other major aspect of the RWP04 work is the development of a
> registry query schema. Tony, Elizabeth and I have been working on
> something similar(!) for AstroGrid and whilst we don't yet have a schema,
> we have a simple example XML query (below) upon which I would welcome
> everyone's comments. We are looking into XQuery/XPath, but that might
> not be appropriate for the current iteration of AstroGrid. Again, your
> comments would be welcome.
>
> Keith.
>
> --
>
> Keith Noddle                    Phone:  +44 (0)116 223 1894
> AstroGrid Technical Lead        Fax:    +44 (0)116 252 3311
> Dept of Physics & Astronomy     Mobile: +44 (0)7721 926 461
> University of Leicester         Email:  ktn at star.le.ac.uk
> Leicester, UK   LE1 7RH         Web:    http://www.astrogrid.org
>
> --------------------------------------------
>
> If the query we are trying to satisfy can be expressed in pseudo-SQL as:
>
> SELECT * FROM <registry> WHERE
>    (
>       TYPE="white dwarf star" AND
>       (WAVELENGTH="optical" OR WAVELENGTH="uv") AND
>       (KEYWORD="BPM 16274" OR
>          KEYWORD="GD 50" OR
>          KEYWORD="HST photometric standards"
>       )
>    )
>
> then one way of presenting this in XML is shown below
>
> <query>
>    <selectionSequence>
>       <selection>
>          <item>type</item>
>          <value>white dwarf star</value>
>       </selection>
>       <operator>AND</operator>
>       <selectionSequence>
>          <selection>
>             <item>wavelength</item>
>             <value>optical</value>
>          </selection>
>          <operator>OR</operator>
>          <selection>
>             <item>wavelength</item>
>             <value>uv</value>
>          </selection>
>       </selectionSequence>
>       <operator>AND</operator>
>       <selectionSequence>
>          <selection>
>             <item>keyword</item>
>             <value>BPM 16274</value>
>          </selection>
>          <operator>OR</operator>
>          <selection>
>             <item>keyword</item>
>             <value>GD 50</value>
>          </selection>
>          <operator>OR</operator>
>          <selection>
>             <item>keyword</item>
>             <value>HST photometric standards</value>
>          </selection>
>       </selectionSequence>
>    </selectionSequence>
> </query>
>
> This only requires 6 tags and is sufficiently flexible to cope with most
> queries I can think of - but I'm not an astronomer...!
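
A minimal sketch of how a query in the quoted XML form could be
evaluated against a single resource description. It uses a shortened
version of the query and Python's standard xml.etree.ElementTree
parser. The resource layout (a dict of value lists) and the function
names are hypothetical, and applying the operators strictly left to
right within one selectionSequence is an assumption, since the
proposal does not state a precedence rule.

```python
import xml.etree.ElementTree as ET

# Shortened version of the quoted query, using the same six tags.
QUERY = """
<query>
   <selectionSequence>
      <selection><item>type</item><value>white dwarf star</value></selection>
      <operator>AND</operator>
      <selectionSequence>
         <selection><item>wavelength</item><value>optical</value></selection>
         <operator>OR</operator>
         <selection><item>wavelength</item><value>uv</value></selection>
      </selectionSequence>
   </selectionSequence>
</query>
"""


def evaluate(node, resource):
    """Evaluate a <selection> or <selectionSequence> against one resource."""
    if node.tag == "selection":
        item = node.findtext("item")
        value = node.findtext("value")
        # Exact string match against the values the resource declares.
        return value in resource.get(item, [])
    # selectionSequence: operands and operators alternate.
    # Assumption: operators are applied left to right, with no precedence.
    result, op = None, None
    for child in node:
        if child.tag == "operator":
            op = child.text.strip().upper()
            continue
        current = evaluate(child, resource)
        if result is None:
            result = current
        elif op == "AND":
            result = result and current
        elif op == "OR":
            result = result or current
        else:
            raise ValueError(f"unsupported operator: {op!r}")
    return bool(result)


def matches(query_xml, resource):
    """Parse the query and test it against one resource description."""
    root = ET.fromstring(query_xml.strip())
    return evaluate(root.find("selectionSequence"), resource)


# Hypothetical resource descriptions.
uv_resource = {"type": ["white dwarf star"], "wavelength": ["uv"]}
radio_resource = {"type": ["white dwarf star"], "wavelength": ["radio"]}

print(matches(QUERY, uv_resource))     # True
print(matches(QUERY, radio_resource))  # False
```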