How should the Registry handle mirrors?

Clive Page cgp at star.le.ac.uk
Thu Jun 24 06:58:14 PDT 2004


On Thu, 24 Jun 2004, Tony Linde wrote:

> In general, the problem of mirrors has been sidelined as 'too difficult for
> now' but is one of the issues for post-2005-demo.

Sure, solving all these problems is clearly too difficult for now.  What I
wanted to point out is that we need to think ahead a little to work out
how we may eventually want to handle mirroring, so that we do not take
wrong development routes now.

> There is a facility to cope with mirrors using the Relationship element with
> a relationshipType of 'mirror-of'.

Yes, that should at least identify mirrors, which is a useful first step,
and the 'derived-from' case can, I suppose, flag near copies, which
might be useful to the (human) user.
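
As a rough illustration of how the two relationship types might be used
(a sketch only: the record layout, field names and identifiers below are
assumed for illustration and are not the actual registry schema), a client
could group declared mirrors under their original and list near copies
separately:

```python
from collections import defaultdict

# Hypothetical, simplified registry records: the field names and
# identifiers are assumptions for illustration, not the real schema.
RECORDS = [
    {"id": "ivo://example.org/sdss", "relationships": []},
    {"id": "ivo://example.ac.uk/sdss-copy",
     "relationships": [("mirror-of", "ivo://example.org/sdss")]},
    {"id": "ivo://example.org/usno-b1", "relationships": []},
    {"id": "ivo://example.net/usno-subset",
     "relationships": [("derived-from", "ivo://example.org/usno-b1")]},
]


def group_mirrors(records):
    """Map each original resource id to the ids of its declared mirrors."""
    groups = defaultdict(list)
    for rec in records:
        for rel_type, target in rec["relationships"]:
            if rel_type == "mirror-of":
                groups[target].append(rec["id"])
    return dict(groups)


def near_copies(records):
    """List (copy, original) pairs flagged as 'derived-from'."""
    return [(rec["id"], target)
            for rec in records
            for rel_type, target in rec["relationships"]
            if rel_type == "derived-from"]
```

Keeping the two relationship types apart means a 'derived-from' copy is
only flagged to the user and is never treated as interchangeable with the
original.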

> The issue is what constitutes a mirror. If a dataset is a bit-for-bit copy,
> then probably yes. But what if one field in one record is changed - is it
> still a mirror? And what if only records of a certain type are mirrored? Is
> it still a mirror?

As far as I know, *most* mirrors are exact clones; differences may exist
briefly when the original is updated, before the updates propagate (update
frequency is often once per day, e.g. overnight).  Other cases nearly
always involve substantial changes, not just the odd record.  Most of these
cases should be covered by adequate versioning, e.g. DSS-1 vs DSS-2,
USNO-A1/-A2/-B1 etc.  There are also cases in which one underlying
resource has several query interfaces, e.g. the SDSS clone at ROE
documents three different ways of querying it:
http://www-wfau.roe.ac.uk/sdss/catalog.html
How these are handled again depends on the query context.

> The reason we've not worried too much about it up to now is that we don't
> have the infrastructure to determine the best mirror to use.

Even without that ability (which is clearly some way off) I think we need
to be able to distinguish between two cases: (a) queries which should
always return a list of mirrors so the user can choose, and (b) queries
which must never return more than one (which may or may not be the "best")
because in that usage context duplicate results would be wrong.
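
The two cases could be sketched roughly as follows. This is illustrative
only: the `mirror_of` field, the function names and the identifiers are
assumptions, and case (b) simply takes the first copy listed instead of
trying to pick the "best" mirror.

```python
# Hypothetical records: "mirror_of" is None for an original, otherwise
# the id of the original.  Names and ids are assumed for illustration.
RESOURCES = [
    {"id": "ivo://example.org/usno-b1", "mirror_of": None},
    {"id": "ivo://example.ac.uk/usno-b1",
     "mirror_of": "ivo://example.org/usno-b1"},
    {"id": "ivo://example.org/dss-2", "mirror_of": None},
]


def all_copies(resources):
    """Case (a): return every copy, grouped under its original."""
    groups = {}
    for res in resources:
        key = res["mirror_of"] or res["id"]
        groups.setdefault(key, []).append(res["id"])
    return groups


def one_per_resource(resources):
    """Case (b): return exactly one copy of each resource.

    The first copy listed is taken; no attempt is made to choose
    the "best" mirror.
    """
    return [copies[0] for copies in all_copies(resources).values()]
```

With the sample data, case (a) reports both USNO-B1 copies under one
heading, while case (b) yields a single entry per resource, so a query
fanned out over that list would not return duplicate rows.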


-- 
Clive Page
Dept of Physics & Astronomy,
University of Leicester,
Leicester, LE1 7RH,  U.K.



