registering collections

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Mon May 26 01:15:14 PDT 2014


Dear colleagues,

On Fri, May 23, 2014 at 05:39:25AM -0500, Ray Plante wrote:
> Over the last few interops, there has been growing support for 
> endorsing a common practice for registering data collections.  I 
> presented this issue at ESAC; see my slides at 
> http://wiki.ivoa.net/internal/IVOA/InterOpMay2014Registry/Plante-RWGMay2014.pdf. 
> 
> I presented 3 possible ways to achieve this proposal.  I believe the 
> "consensus" at the meeting was to simply use the existing 
> CatalogService resource type for registering data collections and 
> their services; use of the DataCollection resource type would be 
> deprecated.
> 
> Any further comments?  Can we encourage our publishing registries to 
> adopt this?  

As a reminder: CatalogService's content model is 

 	validationLevel*, title, shortName?, identifier, curation, content,
 	rights*, capability*, facility*, instrument*, coverage?, tableset?

against DataCollection's

 	validationLevel*, title, shortName?, identifier, curation, content,
 	facility*, instrument*, rights*, format*, coverage?, tableset?,
 	accessURL?

-- as capability is 0..n in CatalogService, CatalogService can in
principle stand in for DataCollection essentially everywhere.
What's missing in CatalogService vs. DataCollection is format and
accessURL; I'd argue they can be replaced by a generic vr:Capability
with, as appropriate, a vr:WebBrowser interface (for landing pages)
or vs:ParamHTTP (if the result type actually matters).
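
To make the comparison above easy to check, here is a minimal sketch that
diffs the two content models. The element names are copied from the lists
above; the variable and function names (CATALOG_SERVICE, DATA_COLLECTION,
missing_from) are made up for illustration and are not part of any
registry software. Cardinalities (*, ?) are ignored.

```python
# Child elements of each resource type, copied from the content models
# quoted above (cardinality markers dropped).
CATALOG_SERVICE = [
    "validationLevel", "title", "shortName", "identifier", "curation",
    "content", "rights", "capability", "facility", "instrument",
    "coverage", "tableset",
]
DATA_COLLECTION = [
    "validationLevel", "title", "shortName", "identifier", "curation",
    "content", "facility", "instrument", "rights", "format",
    "coverage", "tableset", "accessURL",
]


def missing_from(source, target):
    """Return the elements of source that target lacks, in source order."""
    target_set = set(target)
    return [name for name in source if name not in target_set]


# Elements a DataCollection record can carry that CatalogService cannot.
only_in_data_collection = missing_from(DATA_COLLECTION, CATALOG_SERVICE)
# Elements CatalogService adds on top of DataCollection.
only_in_catalog_service = missing_from(CATALOG_SERVICE, DATA_COLLECTION)

print(only_in_data_collection)
print(only_in_catalog_service)
```

The first list contains format and accessURL, the second only capability,
which matches the argument that a generic capability can absorb the two
missing elements.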

Hence, I agree we'd not be losing anything when essentially dropping
DataCollection.  Let's go for it.

However, I'd still like to see some mention of the pattern of a
central service with several data collections (or capability-less
catalog services, now).  This would be for the cases of S*AP services
exposing multiple data collections, all of which would declare
relationships to the master service. See this guy here:

http://dc.zah.uni-heidelberg.de/wirr/q/ui/fixed?field0=ivorn&operator0=%3D&operand0=ivo%3A%2F%2Forg.gavo.dc%2Flensunion%2Fq%2Fim&MAXREC=20&OFFSET=0

(click on the relationship icon to see what's going on) -- if the
four contributing data collections all had the SIA capability,
current clients would hit the service four times and hence get the
same results four times. The alternative way around that undesirable
effect would be to require clients to deduplicate the access URLs
they discover from the registry, but I wouldn't like that.
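
For illustration only, this is roughly what that client-side workaround
would look like. It is a sketch under assumptions: the record layout
(pairs of identifier and access URL) and all identifiers and URLs below
are hypothetical and do not come from any real registry response.

```python
# Hypothetical discovery result: (ivoid, access URL) pairs, where four
# data collections all advertise the same SIA endpoint.
records = [
    ("ivo://example.org/coll1", "http://example.org/sia/master?"),
    ("ivo://example.org/coll2", "http://example.org/sia/master?"),
    ("ivo://example.org/coll3", "http://example.org/sia/master?"),
    ("ivo://example.org/coll4", "http://example.org/sia/master?"),
    ("ivo://example.org/other", "http://example.org/sia/other?"),
]


def unique_access_urls(records):
    """Return each access URL once, preserving first-seen order."""
    seen = set()
    unique = []
    for _ivoid, url in records:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


# A client would query each remaining URL once instead of once per record.
urls_to_query = unique_access_urls(records)
print(urls_to_query)
```

Note that this compares URLs as plain strings, so two spellings of the
same endpoint would still be queried twice; every client would have to
carry this kind of logic, which is part of why declaring relationships
to a single master service looks preferable.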

Why have relations here and embedded capabilities in other cases?
Well, I'd say: If the main access pattern is global discovery ("give
me *all* SIA services in the infrared"), have relationships.  If the
main access pattern is individual discovery ("give me *a* service
exposing both microwave fluxes and an indication of morphology"),
embed the capabilities.

Cheers,

           Markus


