Cube model - Dataset IDs

CresitelloDittmar, Mark mdittmar at cfa.harvard.edu
Thu Mar 19 16:25:58 CET 2015


Thanks for the background Doug.

So the use-case for the DataID.datasetID is:
  + data center has a registered publisher ID
  + assigns its own dataset ids, which may change over time
     (these are IVOA IDs, so globally unique and with proper syntax)
  + the dataset has ALSO been assigned a persistent dataset id from
     a 'global index service' such as ADS which the publisher wants to
     retain in the dataset.

  I'll use some MAST file info as an example (although it doesn't have a
  publisherDID). The resulting file would contain:
    Curation.publisher = "MAST"
    Curation.publisherID = "ivo://mast.stsci.edu"
    Curation.publisherDID = "ivo://mast.stsci.edu?obsid=1234"
        <some MAST-specific ID (using the above as a basis?)>
    DataID.datasetID = "ads/sh.hut#ngc4151_141"
    DataID.creatorDID = "ngc4151_141"
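
  A minimal sketch of the record above, written in Python purely for
  illustration. The class name DatasetIdentity and the helper
  publisher_did_is_local are hypothetical, and the check that a
  publisherDID starts with the publisherID is an assumption about how a
  data center might mint its own IDs, not something the model requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetIdentity:
    """Hypothetical container for the identifier fields discussed above."""
    publisher: str
    publisher_id: Optional[str]   # Curation.publisherID
    publisher_did: Optional[str]  # Curation.publisherDID
    dataset_id: Optional[str]     # DataID.datasetID
    creator_did: Optional[str]    # DataID.creatorDID


def publisher_did_is_local(rec: DatasetIdentity) -> bool:
    """Assumption: a locally minted publisherDID begins with the publisherID."""
    if rec.publisher_id is None or rec.publisher_did is None:
        return False
    return rec.publisher_did.startswith(rec.publisher_id)


# Values copied from the MAST example in the message.
mast_example = DatasetIdentity(
    publisher="MAST",
    publisher_id="ivo://mast.stsci.edu",
    publisher_did="ivo://mast.stsci.edu?obsid=1234",
    dataset_id="ads/sh.hut#ngc4151_141",
    creator_did="ngc4151_141",
)
```

  Here the publisherDID and the datasetID are distinct values: the first
  is minted by the data center, the second comes from the external
  index service.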


Questions:
1)  Is it possible for an archive/data center/data provider to NOT have a
     registered publisherID? In other words, to NOT be able to assign
     identifiers, and instead rely on an external 'global index service'
     to provide identifiers for its holdings. In that case there would be
     just the one identifier, which could be either the publisherDID OR
     the datasetID. Maybe this is the 'more on this' case?

     Curation.publisher = "MAST"
     Curation.publisherID = <none>
     Curation.publisherDID = "ads/sh.hut#ngc4151_141"
     DataID.datasetID = "ads/sh.hut#ngc4151_141"
     DataID.creatorDID = "ngc4151_141"

    I'm not sure which location this 'global index id' should go in, so I
    put it in both.
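
    To make the two cases concrete, here is a rough sketch of how the
    fields might be filled. The function assign_identifiers and its
    local_key parameter are hypothetical, and copying the global index
    ID into publisherDID when no publisherID exists is just the
    "put it in both" choice described above, not an agreed rule.

```python
from typing import Dict, Optional


def assign_identifiers(
    publisher: str,
    global_index_id: str,
    creator_did: str,
    publisher_id: Optional[str] = None,
    local_key: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Fill the identifier fields for one dataset (illustrative only)."""
    if publisher_id is not None and local_key is not None:
        # Registered publisher: mint a local ID based on the publisherID.
        publisher_did = f"{publisher_id}?{local_key}"
    else:
        # No registered publisherID: reuse the global index ID.
        publisher_did = global_index_id
    return {
        "Curation.publisher": publisher,
        "Curation.publisherID": publisher_id,
        "Curation.publisherDID": publisher_did,
        "DataID.datasetID": global_index_id,
        "DataID.creatorDID": creator_did,
    }


# Question 1 case: no registered publisherID.
unregistered = assign_identifiers("MAST", "ads/sh.hut#ngc4151_141", "ngc4151_141")

# First example: registered publisherID plus a local key.
registered = assign_identifiers(
    "MAST", "ads/sh.hut#ngc4151_141", "ngc4151_141",
    publisher_id="ivo://mast.stsci.edu", local_key="obsid=1234",
)
```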

 2) My ignorance surrounding identifiers may become apparent here, but...
     I'm not sure whether a single dataset can be tagged with >1 identifier
     from any given 'global index service', but there are, presumably,
     multiple 'global index services'. So there is a question about
     multiplicity for that attribute.

     If the ADS IDs are publication based, then this would be a growing
     list as the dataset is used in more research. Keeping this sort of
     metadata accurate and current would require frequent updates to the
     dataset itself.

     While it seems useful for an archive/center to keep track of the IDs
     which reference a particular dataset, it doesn't seem right to store
     that information IN the dataset. This sounds something like Getty
     Images storing metadata about every usage of a particular image IN
     the png file (which I don't think they do).


http://www.gettyimages.com/detail/news-photo/apollo-8-view-of-earthrise-over-the-moon-news-photo/50580029
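
One way to picture the alternative is an archive-side index that maps a
dataset to the external IDs that reference it, so the list can grow
without touching the dataset file. This is only a sketch: the class
ReferenceIndex is hypothetical, and the second identifier below
("example-index/0001") is invented purely to show multiplicity.

```python
from collections import defaultdict
from typing import Dict, List


class ReferenceIndex:
    """Hypothetical archive-side map: dataset ID -> external reference IDs."""

    def __init__(self) -> None:
        self._refs: Dict[str, List[str]] = defaultdict(list)

    def add(self, publisher_did: str, external_id: str) -> None:
        # Record each external ID once, preserving insertion order.
        if external_id not in self._refs[publisher_did]:
            self._refs[publisher_did].append(external_id)

    def lookup(self, publisher_did: str) -> List[str]:
        # Return a copy so callers cannot modify the index by accident.
        return list(self._refs.get(publisher_did, []))


index = ReferenceIndex()
index.add("ivo://mast.stsci.edu?obsid=1234", "ads/sh.hut#ngc4151_141")
index.add("ivo://mast.stsci.edu?obsid=1234", "example-index/0001")  # invented ID
```

The dataset itself would then carry only its own identifiers, and the
index could be updated as new references appear.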

Mark