On ID "sameness"

Arnold Rots arots at head-cfa.cfa.harvard.edu
Wed Feb 5 14:16:02 PST 2003


Ray Plante wrote:
> Hey Arnold,
> 
> ...
> > As far into those remainders as they are identical there
> > is sameness, but you would probably have to look at the metadata to
> > determine the degree of sameness.  For instance, CXO:2000 in
> > NASA:HEASARC:CXO:2000 and SAO:CDA:CXO:2000 indicate that we are
> > pointing to the same observation
> 
> I think this is a repeat of what you just said, but I would think that one
> would have to look at the metadata for NASA:HEASARC:CXO:2000 to know that it
> refers to the same observation as SAO:CDA:CXO:2000, as there is no a priori
> way of knowing whether those common parts are intentional or accidental.

It is a repeat, but the mission (observatory) identifier is a special
one.  The general ID is datacenterID:missionID:datasetID, where
datasetID (or maybe all of them) may be compound.
What I was saying here is that there need to be two lists of
prescribed (and unique) IDs:
- Data center identifiers (such as NASA:HEASARC or HEASARC and SAO:CDA
  or SAO or CDA)
- Mission or telescope or observatory identifiers (such as CXO and VLA)

There are three rules:
- The registry needs to know which mission (telescope) archives are
  served by each data center.
- The original issuing data center (CDA for CXO) defines the mission
  sub-IDs (i.e., the dataset IDs).
- Other data centers that carry data from those missions (mirrored or
  otherwise) ARE REQUIRED to use the sub-ID hierarchy defined by the
  issuing data center.
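
As an illustration of why the two prescribed lists matter, here is a
minimal sketch in Python.  Because a data center ID may itself be compound
(NASA:HEASARC), an ID cannot be split on colons alone; the lists make the
split unambiguous.  The names DATA_CENTERS, MISSIONS, split_id and
same_dataset are hypothetical, and the list contents are just the examples
used in this thread, not an actual registry.

```python
# Hypothetical prescribed lists; a real registry would maintain these.
DATA_CENTERS = {"NASA:HEASARC", "SAO:CDA"}
MISSIONS = {"CXO", "VLA"}


def split_id(identifier):
    """Split datacenterID:missionID:datasetID using the prescribed lists.

    The data center ID and the dataset ID may both be compound, so the
    split point is the first position where a known data center ID is
    followed by a known mission ID.
    """
    parts = identifier.split(":")
    for i in range(1, len(parts)):
        center = ":".join(parts[:i])
        if center in DATA_CENTERS and parts[i] in MISSIONS:
            dataset = ":".join(parts[i + 1:])
            return center, parts[i], dataset
    raise ValueError(f"cannot parse {identifier!r}")


def same_dataset(a, b):
    """True if two IDs agree on missionID:datasetID.

    This is only meaningful if every data center follows the sub-ID
    hierarchy defined by the issuing data center (the third rule).
    """
    return split_id(a)[1:] == split_id(b)[1:]
```

With these assumptions, same_dataset("NASA:HEASARC:CXO:2000",
"SAO:CDA:CXO:2000") is true without consulting the metadata, which is the
point of requiring mirrors to reuse the issuing center's sub-IDs.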

The justification is two-fold:
First, there is a similar discussion unfolding within ADEC, related to
tying datasets to bibcodes, which involves the ADS, the data centers, and
the journals.  This seems to gravitate toward an identifier that
consists of missionID:datasetID.  Clearly, in order to provide an ID
that will work in perpetuity, the mission ID and dataset ID are
crucial, but a data center ID may do more harm than good.  In order to
turn a journal article's link into a "real" link, it would need to be
prefixed by a data center ID, which is where the registry comes in.
However, this will only work if the sub-ID space for each mission or
telescope is immutable.  [I am assuming here that we would want both
ID systems to be compatible.]  If you like, you can think of
missionID:datasetID as a logical ID, and
datacenterID:missionID:datasetID as a physical ID.
Second, it may be that a secondary archive does not store a complete
mission archive.  It would be nice if it were able to detect a link to a
missing data object from the ID and then pass the whole thing on to a
primary archive that does have everything.
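
The logical-to-physical step and the fallback to a primary archive could
be sketched as follows.  This is only an illustration under stated
assumptions: REGISTRY, HOLDINGS and resolve are hypothetical names, the
holdings are invented for the example (including the dataset ID 2001), and
no actual registry interface is implied.

```python
# Hypothetical registry content: for each mission, the issuing (primary)
# data center and every data center that carries its data.
REGISTRY = {
    "CXO": {"issuer": "SAO:CDA", "carriers": ["SAO:CDA", "NASA:HEASARC"]},
}

# Hypothetical holdings, keyed by data center ID.  The secondary archive
# is assumed here to store only part of the mission archive.
HOLDINGS = {
    "SAO:CDA": {"CXO:2000", "CXO:2001"},
    "NASA:HEASARC": {"CXO:2000"},
}


def resolve(logical_id, preferred=None):
    """Turn missionID:datasetID into datacenterID:missionID:datasetID.

    If the preferred data center carries the mission and holds the
    dataset, it is used; otherwise the request is passed on to the
    issuing data center, which is assumed to hold everything.
    """
    mission = logical_id.split(":", 1)[0]
    entry = REGISTRY[mission]
    holds_it = logical_id in HOLDINGS.get(preferred, set())
    if preferred in entry["carriers"] and holds_it:
        center = preferred
    else:
        center = entry["issuer"]
    return f"{center}:{logical_id}"
```

Under these assumptions, resolve("CXO:2000", "NASA:HEASARC") stays with the
secondary archive, while resolve("CXO:2001", "NASA:HEASARC") is handed to
SAO:CDA.  Both depend on the mission's sub-ID space being identical at
every data center.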

  - Arnold


> 
> ...
> cheers,
> Ray
> 
> 
--------------------------------------------------------------------------
Arnold H. Rots                                Chandra X-ray Science Center
Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
60 Garden Street, MS 67                              fax:  +1 617 495 7356
Cambridge, MA 02138                             arots at head-cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------


