ADEC and VO data registration
Doug Tody
dtody at nrao.edu
Thu Sep 18 14:59:51 PDT 2003
Arnold -
This one may come down to a matter of preference, so here is my view. Both
forms can work but since the authority ID and resource ID are fundamental to
what is being described, they are best represented explicitly in the syntax.
Some advantages of the first form:
o The distinction between the authority ID and the resource (data
  collection in this case) is clear from the syntax. Otherwise
  one has to look up the authority ID metadata and apply some
  heuristics to determine what is being referred to, what is the
  authority ID and what is the resource. This would be much more
  prone to interpretation error, and would be more complex, as
  runtime queries would be needed to resolve what otherwise would
  be clear from the syntax alone.

o Separating out the resource in this way makes it easier to associate
  multiple resources with the same authority, e.g., different data
  collections, or a service which goes along with a data collection.
  For example, if Sa.HST.STIS is an authority ID, then
  Sa.HST.STIS/STIS-V1.1 might be a versioned data collection
  controlled by the authority Sa.HST.STIS, and Sa.HST.STIS/sia might
  be a SIA service for this data collection, again controlled by
  the same authority. I included STIS in the authority ID here
  to illustrate how hard it is to tell from the name alone what
  is being referred to - Sa.HST.STIS might well be the naming
  authority for STIS data, and not the/a STIS data collection.

o By separating out the resource ID there are fewer restrictions
  on the form this takes, e.g., a longer name could be used to
  describe different versions of a data collection in the naming
  syntax alone (you could do this with the second form as well but
  it would result in less consistent authority ID names).
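
To make the first point concrete, here is a minimal Python sketch of how the
three-element form <AuthorityId>/<ResourceKey>#<DatasetId> could be split
using the syntax alone, with no registry lookup. The names Identifier and
parse_identifier are hypothetical, chosen only for illustration; they are not
taken from any ADEC or VO specification. The sketch assumes that the authority
ID contains neither "/" nor "#" and that the resource key contains no "#".

```python
from typing import NamedTuple, Optional


class Identifier(NamedTuple):
    """Hypothetical container for the three elements of an identifier."""
    authority_id: str
    resource_key: Optional[str]
    dataset_id: Optional[str]


def parse_identifier(text: str) -> Identifier:
    """Split '<AuthorityId>/<ResourceKey>#<DatasetId>' on its delimiters.

    The resource key and dataset ID are both optional in this sketch;
    a missing element is returned as None.
    """
    # Everything after the first '#' is the resource-specific dataset ID.
    path, _, dataset_id = text.partition("#")
    # Everything before the first '/' is the authority ID.
    authority_id, _, resource_key = path.partition("/")
    if not authority_id:
        raise ValueError(f"missing authority ID in {text!r}")
    return Identifier(authority_id, resource_key or None, dataset_id or None)


if __name__ == "__main__":
    for example in ("Sa.HST/STIS#O4LT010E0", "Sa.HST.STIS/sia", "Sa.HST"):
        print(example, "->", parse_identifier(example))
```

With the two-element form, the same split cannot tell whether Sa.HST.STIS
names an authority or a data collection; that still takes a lookup.
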
Although current ADEC proposals emphasize naming individual datasets,
any scheme intended for publications should recognize data collections as
well as datasets. One project might analyze only a few datasets which
are explicitly referenced in a paper, while another project may perform
statistical analysis of many datasets and it will be more appropriate to
reference the entire data collection in a published paper.
I like the use of # to delimit the resource-specific namespace (e.g.
dataset ID), so long as this does not change when the ID is used in
different contexts.
> Sa.HST.STIS/O4LT010E0
> Sa.HST.WFPC2/U32L0104T
Ignoring for the moment the blending of authority and resource, it might be
better here to use names like
STIS.HST.Sa
WFPC2.HST.Sa
to be more consistent with existing DNS usage. I prefer left-to-right
myself from a logical point of view, but unless there are other existing
conventions pushing us in this direction we should be consistent with
common URL usage or it will just confuse everyone.
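
As a small illustration of the two orderings, the following Python sketch
converts a left-to-right authority name into the DNS-like order. The function
name to_dns_order is hypothetical, and the sketch assumes the components are
separated by "." only.

```python
def to_dns_order(authority_id: str) -> str:
    """Reverse the dot-separated components of an authority ID.

    Left-to-right names put the most general component first; DNS-style
    names put it last.
    """
    return ".".join(reversed(authority_id.split(".")))


if __name__ == "__main__":
    for name in ("Sa.HST.STIS", "Sa.HST.WFPC2"):
        print(name, "->", to_dns_order(name))
```

Applying the function twice returns the original name, so either ordering
could be derived mechanically from the other.
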
- Doug
On Thu, 18 Sep 2003, Arnold Rots wrote:
> In the interest of simplicity for authors, can anyone explain what the
> advantage is of this three-element Identifier definition:
>
> <AuthorityId>/<ResourceKey>#<DatasetId>
>
> which would result in things like:
>
> Sa.CXO/4000
> Sa.HST/STIS#O4LT010E0
> Sa.HST/WFPC2#U32L0104T
> Sa.IUE/LWP25899
>
> Over the two-element identifier:
>
> <AuthorityId>/<DatasetId>
>
> that would result in identifiers like:
>
> Sa.CXO/4000
> Sa.HST.STIS/O4LT010E0
> Sa.HST.WFPC2/U32L0104T
> Sa.IUE/LWP25899
>
> In both cases the same number of resources have to be registered,
> though in the second case they are all different authority Ids, while
> in the first case some of them are resource keys.
> Actually, come to think of it, the first case requires more registry
> records since the authority Ids as well as the resource keys need to
> be registered.
> In either case Sa.HST.STIS and Sa.HST/STIS need to be resolved to a
> physical location. What's the difference?
>
> I don't see any advantage and unless someone can convince us that it's
> a much better idea, I propose that we drop the #-sign and return to
> the two-element model - it's simpler and cleaner.
>
> - Arnold
>
> --------------------------------------------------------------------------
> Arnold H. Rots Chandra X-ray Science Center
> Smithsonian Astrophysical Observatory tel: +1 617 496 7701
> 60 Garden Street, MS 67 fax: +1 617 495 7356
> Cambridge, MA 02138 arots at head-cfa.harvard.edu
> USA http://hea-www.harvard.edu/~arots/
> --------------------------------------------------------------------------