Single or collection resources

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Aug 5 11:22:34 PDT 2009


Doug, Robert, DAL folks,

Thanks for your responses, but let me follow up --


On Tue, Aug 04, 2009 at 09:02:57AM -0400, Robert Hanisch wrote:
> Hello, Markus.  My preference would be for option 2.
[collection services]

> The resource metadata was intended to describe a data collection, e.g., data
> from a particular instrument, of a particular class of objects, or
> pertaining to some phenomenon.  In cases where the metadata elements are not
> unique for the collection (your example of CREATOR) it is perfectly ok to
> use values like "Various".  The metadata for data collections is supposed to
> aid discovery, not be a full description of each and every image in the
> collection.  The FITS keywords for the individual files should contain
> additional metadata specific to each image.
Yes, well; I see that you can embed a fair amount of provenance in
FITS.  On the other hand, such "what *have* you found"-metadata
is not the only application of RMI metadata.  I'm much more worried
about people looking for "services exposing data from instrument X"
or "services that have 'carte du ciel' in their description" in the
registry.  With collection services, this would likely not yield the
desired results.

And of course, the people who supplied you with the data are happy
if you can query *their* data using VOExplorer.  While that's kind of
silly, we are (or at least I am) still not in a situation where
people queue up to get their data into the VO, so things like that
help.

I guess that's, in a few more words, the case for single services I
made in the original mail.

> I agree that having the same data served through multiple services is likely
> to confuse users.  And we don't really want an explosion of SIA services,
> each with a separate registry entry.
Ye-es... Well, I wasn't sure about that, and this was in part why I
wrote the first mail: is that "keep the number of services down"
policy rough consensus within the VO?  If it is, that's of course a
strong case for collection services.


On Tue, 4 Aug 2009 13:56:35 -0600, Doug Tody wrote:

> Any of these approaches would work.  In general the same data can
> be available from multiple services (e.g. due to replication of a
> popular collection) hence this situation cannot really be avoided.
Yes -- but at least as long as we do not have reliable "artifact ids"
that would allow clients to weed out duplicates, I have a feeling Bob
is right and we should *try* to avoid it as best we can.  But that's
just a feeling not based on actual user feedback.

> Re option #2, while the RMI cannot fully describe the data in this case
> since there are multiple individual collections, the SIA query response
> can, since each image is separately described.  Hence something
> like CREATOR (DataID.Creator), COLLECTION, PublisherDID, etc. can
> be specified separately for each image.  This metadata is included
Hm, I suspected that SIAv2 would make such collection services a bit
more attractive, but still that metadata is only included in the
*response* but is not available while people are trying to locate
resources in the registry, so even SIAv2 won't help me too much.

I guess what I'd really be looking for would be some way of
registering the individual data sets and pointing them all to one SIAP
service.  But that's bad as well, since then data will come back that
doesn't match the registry metadata.

Still a bit at a loss,

          Markus


