VO and ADEC identifiers

Thu Sep 18 10:48:30 PDT 2003

On Thursday 18 September 2003 08:21, Tony Linde wrote:
> If you want to then send that service the dataset pointer of
> "t421/c110.ori", will you need an extra standard method built into the
> SkyServices which can resolve that pointer? At the moment, we're likely to
> have some sort of doQuery method (in addition to getMetadata, getStatus
> etc). Do we need to add a getDataset(string DatasetID) method?

Absolutely. Otherwise the Identifier would be useless.

> Next question is about how you expect to use these datasets. Since they are
> not registered in the VO as resources, a search of a registry will not find
> them. I'm not sure you could query a dataset either without yet another
> special method (eg queryDataset(string DatasetID, document AdqlQuery)) in
> the data access services. It is getting a bit silly if we have to add new
> methods to all the services in order to handle an entity that the VO does
> not recognise.

Generally, if you design a service interface that is simple yet complete, it
will have those get?() methods anyway. What the registry does or doesn't do
doesn't really affect service design, since the service is usable without the
registry. The registry is just the way to find the service(s).

> How do you see this being handled?

In the CVO prototype we have two types of services: Archive and
Catalog. An Archive delivers data products (1+ files): it stores or creates
them and provides URLs to them. A Catalog is a queryable collection of
Entry(s).

We have a class called ArchiveLink which is a globally unique identifier for 
a thing in an archive (a data product, dataset, whatever you like to call it). 
It has two parts: archiveName and archiveID. The archiveName is public
and is the "name" of the Archive service in our registry. The archiveID is
an archive-specific identifier, completely under the control of the archive.
An Archive has get???() methods that take an ArchiveLink object as the
argument and return 1+ URLs (technically they only need the archiveID).

Essentially: the archiveName says which service and the archiveID lets you
get a dataset that service has. The service type (Archive) explicitly means 
that the service has the appropriate get methods (in our case getData(...)
and getPreview(...), with optional cutout capabilities).
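As a rough illustration of the two-part identifier and the get methods, here
is a minimal Python sketch. It is not the CVO prototype code: the class
InMemoryArchive, the snake_case method names, the archive name
"ExampleArchive", the file name "c110.fits" and the example.org URLs are all
assumptions made up for the example. Only the ArchiveLink fields and the
getData/getPreview idea come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ArchiveLink:
    """Globally unique identifier for a thing held by an Archive."""
    archive_name: str  # public name of the Archive service in the registry
    archive_id: str    # archive-specific identifier, opaque to everyone else


class InMemoryArchive:
    """Hypothetical toy Archive: maps archiveIDs to file names, serves URLs."""

    def __init__(self, name: str, base_url: str, holdings: Dict[str, List[str]]):
        self.name = name
        self.base_url = base_url.rstrip("/")
        self.holdings = holdings

    def _files(self, link: ArchiveLink) -> List[str]:
        # The archiveName says which service; refuse links meant for another one.
        if link.archive_name != self.name:
            raise ValueError(f"link is for {link.archive_name!r}, not {self.name!r}")
        # The archiveID is entirely under this archive's control.
        if link.archive_id not in self.holdings:
            raise KeyError(f"unknown archiveID {link.archive_id!r}")
        return self.holdings[link.archive_id]

    def get_data(self, link: ArchiveLink) -> List[str]:
        """Return 1+ URLs for the data product."""
        return [f"{self.base_url}/data/{name}" for name in self._files(link)]

    def get_preview(self, link: ArchiveLink) -> List[str]:
        """Return 1+ URLs for previews of the data product."""
        return [f"{self.base_url}/preview/{name}" for name in self._files(link)]


# Made-up holdings; the archiveID reuses the dataset pointer quoted earlier.
archive = InMemoryArchive(
    name="ExampleArchive",
    base_url="http://archive.example.org",
    holdings={"t421/c110.ori": ["c110.fits"]},
)
link = ArchiveLink("ExampleArchive", "t421/c110.ori")
data_urls = archive.get_data(link)
```

Making ArchiveLink immutable and hashable means it can be stored in a Catalog
entry and compared by value, which is all a global identifier needs.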

> I guess the simple solution is that if you want to do anything with a
> dataset within the VO then it first needs to be registered as a resource.
> No?

The service needs to be registered. We can't register every dataset. Our model
is that an archive would publish a dataset into a Catalog (so it can be found
via query) and part of what it publishes is the ArchiveLink global 
identifier. By publishing, the archive must ensure the ArchiveLink is a valid 
one, so it must run an Archive service of the appropriate name and honour 
that archiveID.

So, when you query a Catalog, you may find some ArchiveLinks. These you
can resolve by (i) looking up the Archive by name in the service registry,
and (ii) using the service you find to retrieve the data.
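
The two resolution steps can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the registry is modelled as a
plain dict from Archive name to a function, the Catalog as a list of dicts,
and the names resolve, example_get_data, "ExampleArchive", the "target" field
and the example.org URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ArchiveLink:
    archive_name: str  # which Archive service
    archive_id: str    # which dataset within that service


# Hypothetical registry: Archive name -> function returning URLs for an archiveID.
ServiceRegistry = Dict[str, Callable[[str], List[str]]]


def resolve(link: ArchiveLink, registry: ServiceRegistry) -> List[str]:
    # (i) look up the Archive by name in the service registry
    get_data = registry.get(link.archive_name)
    if get_data is None:
        raise LookupError(f"no Archive registered as {link.archive_name!r}")
    # (ii) use the service found to retrieve the data (here: 1+ URLs)
    return get_data(link.archive_id)


def example_get_data(archive_id: str) -> List[str]:
    # Stand-in for a real Archive's getData; the URL layout is made up.
    return [f"http://archive.example.org/data/{archive_id}"]


registry: ServiceRegistry = {"ExampleArchive": example_get_data}

# Toy Catalog: each published entry carries some metadata plus its ArchiveLink.
catalog = [
    {"target": "t421", "link": ArchiveLink("ExampleArchive", "t421/c110.ori")},
]

# Query the Catalog, then resolve every ArchiveLink that was found.
found = [entry["link"] for entry in catalog if entry["target"] == "t421"]
urls = [url for link in found for url in resolve(link, registry)]
```

Only the service is registered; the datasets themselves appear nowhere in the
registry and are reached through the ArchiveLinks published in the Catalog.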

-- 
Patrick Dowler
Tel/Tél: (250) 363-6914                  | fax/télécopieur: (250) 363-0045
Canadian Astronomy Data Centre   | Centre canadien de donnees astronomiques
National Research Council Canada | Conseil national de recherches Canada
Government of Canada                  | Gouvernement du Canada
5071 West Saanich Road               | 5071, chemin West Saanich
Victoria, BC                                  | Victoria (C.-B.)