new take on resource registration best practice

Marco Molinaro molinaro at oats.inaf.it
Thu Oct 24 02:43:03 PDT 2013


Dear Reg-WG

2013/10/23 Markus Demleitner <msdemlei at ari.uni-heidelberg.de>

> Dear Reg-WG,
>
> On Wed, Oct 23, 2013 at 01:46:26PM +0200, Pierre Le Sidaner wrote:
> > Thank you for raising this problem
> > I just want us to separate two points:
> > the way we want the services and collections to be registered by the
> > users (example: registering one TAP service with multiple collections);
> > the way we want to ingest them in the registry (or the way we would
> > like to retrieve information).
>
> This is true to some extent -- but of course the design of the
> registry data model (and its actual usage) has to keep both the needs
> of the publishers and the needs of the searching users in mind.  If
> these needs are so different that, in effect, two different data
> models (i.e., we shove around pieces of information on resource
> ingestion) are necessary, so be it.
>
> But if we can avoid this, we should.
>

I'm not sure I fully grasp the meaning of this distinction.
If we treat the registry data model as a single model, the requirements
from the searchers and those from the publishers should live in the
same space. I see the distinction as an interface-related
topic. The caveat, it seems to me, is that the model has to allow
feasible interfaces to plug into it, and that is where the
collection-service proposal got stuck with RegTAP, if I understood
it right.
Am I missing the point?
If not, I'm in favor of avoiding two different models.


> If we figure all that out, we could specify that some sort of copying
> of capabilities has to happen on ingestion; that specification needs
> to be done regardless if you're querying through ADQL or Solr's query
> language.  In any case, you're doing this capabilities-copying, you're
> creating a VOResource-user data model distinct from
> VOResource-publisher.  How bad is that?  Given the limited extent,
> it's probably not catastrophic.
>
> But still: Before we do this, let's think again if we can't keep the
> two together.
>

I agree the impact could be minimal, but I think it could complicate
registry maintenance.
I mean, you add cross-resource manipulation at the ingestion step:
doesn't this add a point of failure, especially if you consider mirrored
resources?
I don't know, maybe I'm just worrying too much about this point.
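
To make the worry concrete, here is a minimal sketch of what
capability-copying at ingestion could look like. It is only an
illustration under assumptions: the dictionary layout, the "served_by"
and "capabilities" keys, the function name and the example identifiers
are all hypothetical, not taken from VOResource or from any existing
registry code.

```python
from typing import Dict, List


def copy_capabilities(records: Dict[str, dict]) -> List[str]:
    """Copy capabilities from services onto the collections they serve.

    `records` maps a (hypothetical) identifier to a record dictionary.
    A collection lists the identifiers of its services under "served_by".
    Returns the identifiers of collections whose service was not found,
    which is the failure case a registry would have to handle.
    """
    unresolved = []
    for ivoid, rec in records.items():
        for service_id in rec.get("served_by", []):
            service = records.get(service_id)
            if service is None:
                # The service record is missing, e.g. not harvested yet
                # or registered only on another (mirrored) registry.
                unresolved.append(ivoid)
                continue
            rec.setdefault("capabilities", []).extend(
                service.get("capabilities", [])
            )
    return unresolved


records = {
    "ivo://example/coll1": {"served_by": ["ivo://example/tap"]},
    "ivo://example/coll2": {"served_by": ["ivo://example/missing"]},
    "ivo://example/tap": {"capabilities": ["ivo://ivoa.net/std/TAP"]},
}
unresolved = copy_capabilities(records)
```

In this toy case the second collection cannot be completed because its
service record is not available at ingestion time, so the registry has
to decide what to do with it.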


> One possibility still is that we do nothing in VOResource.  Under the
> assumption that there's not going to be thousands of "federated"
> services, maybe clients could cope with resolving relationships by
> just memoizing the most common federated services?  Maybe queries
> against the original VOResource DM can be made natural enough that
> this can work?  I believe the three-worlds approach I've described in
> my Interop talk --
> http://wiki.ivoa.net/internal/IVOA/InterOpSep2013Registry/regtap.pdf
> -- is at least workable, for example, even if it is not too
> beautiful. Similar approaches are, I guess, possible using Lucene.
>
> Frankly, I personally would probably still rather go with Ray and add
> capabilities in what are now data collection records.  It's simple,
> it'll not derail the old RI1 registries, and I believe it can be
> pulled off with fairly minimal changes -- if at all -- to VOResource.
>

I'm not happy with leaving that part of resource management to clients.
They could do it anyway if they want, but I'd prefer registries to be
robust by themselves, so if the changes to VOResource are a better
solution I'll go for them. (BTW: it might be a good idea to make all the
desirable changes to VOResource in one step; I'm thinking of the
ContentLevel fine-grained reduction, for example.)



> There's a catch, however: Let's say someone wants to enumerate all TAP
> services -- if all the little data collections all say they have this
> TAP capability, they'll have a lot of records.  Things are even worse
> for the typed services as for them, all-VO-queries do make sense.  If
> all contributing data collections say they have the SSAP capability
> for a federated service, a naive all-VO SSA search would hit the
> service containing that capability fairly often.
>
> Therefore, I'd say these "served-by" capabilities should have special
> standardIds (maybe just the normal standard ids with "?service-for"
> appended?).
>

I'd prefer something that does not require parsing (am I being
monotonous?), but I think clearly stating the "service-for" would be
useful for clients.
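
As a rough illustration of the parsing I mean, this is what a client
would have to do with the suffixed standardIds before it could
enumerate the real services. It is just a sketch under assumptions: the
"?service-for" suffix is only the idea floated above, and the
Capability tuple, the function names and the example identifiers are
hypothetical.

```python
from typing import List, NamedTuple, Tuple

# Suffix floated in this thread for "served-by" capabilities; not a standard.
SERVICE_FOR_SUFFIX = "?service-for"
TAP_STANDARD_ID = "ivo://ivoa.net/std/TAP"


class Capability(NamedTuple):
    ivoid: str        # hypothetical identifier of the record
    standard_id: str  # standardID declared by the capability


def split_standard_id(standard_id: str) -> Tuple[str, bool]:
    """Return (base standard id, True if it carries the suffix)."""
    if standard_id.endswith(SERVICE_FOR_SUFFIX):
        return standard_id[: -len(SERVICE_FOR_SUFFIX)], True
    return standard_id, False


def real_services(caps: List[Capability], base_id: str) -> List[str]:
    """Identifiers of records that host the service themselves."""
    return [
        c.ivoid
        for c in caps
        if split_standard_id(c.standard_id) == (base_id, False)
    ]


caps = [
    Capability("ivo://example/tap", TAP_STANDARD_ID),
    Capability("ivo://example/coll1", TAP_STANDARD_ID + SERVICE_FOR_SUFFIX),
    Capability("ivo://example/coll2", TAP_STANDARD_ID + SERVICE_FOR_SUFFIX),
]
tap_services = real_services(caps, TAP_STANDARD_ID)
```

A separate attribute or element carrying the same information would let
clients skip the string handling and filter on a plain equality.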

I have a couple of other points.

The first one is about authorities, organizations and so on.
I'm fine with a best practice for their usage, but could this also lead
to an accompanying best practice for IVORNs? New publishers entering the
VO may find it useful to have some guidelines for them.
Is this only a dream of mine?

The second is just a question about "case 2: data repository".
Shouldn't each collection in it have a "part-of" relationship
to the repository DataCollection2, as happens with the individual
mission resources of a Data Center?
If not, can you explain why? (It's probably my fault, but I cannot see it.)

Cheers,
    Marco