Harvesting

Tony Linde ael at star.le.ac.uk
Wed Sep 10 08:54:25 PDT 2003


> I do not recall what kind of pointers the VOResource can 
> have. How can I

One suggestion I made in the discussions with Ray was to have (0-many)
Relationship complex types for each resource. Each would contain an attribute
or subsidiary element whose value comes from a namespace-defined enumerated
list of relationship types (eg RefersToCollection, Mirrors,
IsBinaryCompatibleTo or whatever), plus the ResourceID of the target resource.
I don't think we put this into the final schema but it might be worth adding
(if it is technically feasible).
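
To make the shape of that suggestion concrete, here is a minimal Python
sketch. It is only an illustration: the class names, the method name and the
resource identifiers are all made up for the example, and none of this is
taken from the VOResource schema itself.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class RelationshipType(Enum):
    # Stand-in for the namespace-defined enumerated list (example values only).
    REFERS_TO_COLLECTION = "RefersToCollection"
    MIRRORS = "Mirrors"
    IS_BINARY_COMPATIBLE_TO = "IsBinaryCompatibleTo"


@dataclass
class Relationship:
    type: RelationshipType
    target_id: str  # ResourceID of the target resource


@dataclass
class Resource:
    resource_id: str
    # 0-many relationships per resource.
    relationships: List[Relationship] = field(default_factory=list)

    def targets(self, rel_type: RelationshipType) -> List[str]:
        """Return the ResourceIDs this resource points at with rel_type."""
        return [r.target_id for r in self.relationships if r.type is rel_type]


# Hypothetical identifiers, purely for illustration.
service = Resource("example/service-a")
service.relationships.append(
    Relationship(RelationshipType.MIRRORS, "example/service-b")
)
service.relationships.append(
    Relationship(RelationshipType.REFERS_TO_COLLECTION, "example/collection-c")
)
```

With this layout, service.targets(RelationshipType.MIRRORS) lists the mirrors
of a service, and a resource with no relationships simply has an empty list.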

> What are the metadata formats that will be exchanged through 
> OAI harvesting?

Our schemas will be so much more complex than DC that I think it better if
we have dual harvesting. Normal VO harvesting will spread the VO schemas.
Optionally, a registry might offer DC-style harvesting which maps certain
VO Resource metadata to DC metadata.
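
A rough Python sketch of what dual harvesting could look like follows. The
VO-side field names, the "vo" and "dc" format labels and the function names
are assumptions for the example; only the Dublin Core element names on the
right-hand side are real DC elements.

```python
# Hypothetical VO Resource field name -> Dublin Core element name.
VO_TO_DC = {
    "Title": "title",
    "Identifier": "identifier",
    "Publisher": "publisher",
    "Creator": "creator",
    "Subject": "subject",
    "Description": "description",
}


def to_dublin_core(vo_record):
    """Keep only the fields that have a DC counterpart; drop the rest."""
    return {VO_TO_DC[k]: v for k, v in vo_record.items() if k in VO_TO_DC}


def harvest(vo_record, metadata_format="vo"):
    """Return a record in the requested format ("vo" is the default)."""
    if metadata_format == "vo":
        return dict(vo_record)  # full VO metadata, unchanged
    if metadata_format == "dc":
        return to_dublin_core(vo_record)  # reduced DC-style view
    raise ValueError("unsupported metadata format: " + metadata_format)


# Made-up record; "Waveband" stands for VO-only metadata with no DC mapping.
record = {
    "Title": "Example catalogue",
    "Identifier": "example/catalogue",
    "Waveband": "optical",
}
dc_view = harvest(record, "dc")
```

The point of the sketch is that the DC view is lossy: anything without a DC
counterpart is only visible through normal VO harvesting.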

> How are registries populated and clean?
>     -- How to get astronomers to fill in the forms (snowball effect of
> usefulness)

I think you're right. If people want their data to be used, they will
complete the forms. That said, if the metadata is already available (as with
CDS VizieR) or can be automatically generated (eg from FITS headers), we
should look to build such tools.

>     -- How to make sure their entries are meaningful
>     -- How to stop maniacs filling the registry with nonsense

Difficult. We want to make sure that any astronomer (even amateurs) can make
data available via the VO with the minimum of fuss but also want to avoid
the nutters.

I don't think IVOA should police this. Maybe we leave it up to individual
registries to police. AstroGrid-based registries could define anyone
registered within any AstroGrid community to be authorised; NVO can base
theirs on authorised organisations etc.

Any registry which has such lax rules that it ends up full of crap will be
switched out of the harvesting regime, thus effectively deregistering ALL
its resources, so I think responsible people will take care.
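
As a sketch of how this might hang together (all class, method and registry
names here are invented for the example, not an agreed design): each registry
applies its own authorisation rule, and a harvester that drops a registry
stops seeing all of that registry's resources.

```python
class Registry:
    def __init__(self, name, is_authorised):
        self.name = name
        self.is_authorised = is_authorised  # local policy, chosen by the registry
        self.resources = {}  # resource id -> submitter

    def register(self, submitter, resource_id):
        """Accept the entry only if the local policy allows the submitter."""
        if not self.is_authorised(submitter):
            return False
        self.resources[resource_id] = submitter
        return True


class Harvester:
    def __init__(self, registries):
        self.registries = {r.name: r for r in registries}

    def switch_out(self, name):
        """Stop harvesting a registry; all of its resources disappear."""
        self.registries.pop(name, None)

    def visible_resources(self):
        return sorted(
            rid for r in self.registries.values() for rid in r.resources
        )


community = {"alice"}  # hypothetical list of registered community members
strict = Registry("strict", lambda submitter: submitter in community)
lax = Registry("lax", lambda submitter: True)  # accepts anyone

strict.register("alice", "res-1")
strict.register("mallory", "junk-0")  # rejected by the strict policy
lax.register("mallory", "junk-1")
lax.register("bob", "res-2")

harvester = Harvester([strict, lax])
before = harvester.visible_resources()
harvester.switch_out("lax")
after = harvester.visible_resources()
```

Note that bob's legitimate resource vanishes along with mallory's junk, which
is exactly the pressure on registries to keep their own house in order.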

> What if registries are operating different versions of the 
> VOResource schema?

A registry needs to support multiple schemas and multiple versions of the
interface (backward compatibility), so, yes, we need some way of
identifying which ones each registry supports.
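
One possible way of doing that, sketched in Python: each registry advertises
the schema versions it supports and the harvester picks the newest one they
share. The version strings and function names are invented for the example;
this is not a proposed protocol.

```python
def parse_version(version):
    """Turn "0.10" into (0, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def common_version(mine, theirs):
    """Return the newest version both sides support, or None if none."""
    shared = set(mine) & set(theirs)
    if not shared:
        return None
    return max(shared, key=parse_version)


# Hypothetical version lists advertised by two registries.
registry_a = ["0.8", "0.9", "0.10"]
registry_b = ["0.9", "0.10"]
registry_c = ["0.7"]

agreed = common_version(registry_a, registry_b)
no_overlap = common_version(registry_a, registry_c)
```

Comparing parsed tuples rather than raw strings matters here, since "0.10"
sorts before "0.9" as text. Two registries with no version in common get
None, and would have to ignore each other.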

Cheers,
Tony. 

> -----Original Message-----
> From: owner-registry at eso.org [mailto:owner-registry at eso.org] 
> On Behalf Of Roy Williams
> Sent: 09 September 2003 16:29
> To: registry at ivoa.net
> Subject: Re: Harvesting
> 
> 
> Here are my questions about registries. I guess I see the 
> "mirror" question in the context of specifying relationships 
> in general -- "A is a mirror of B" or "A is derived from B" 
> or "A is a product of project B". Anyway, here are the 
> questions that come to mind about harvesting.
> 
> Roy
> -----------------------------------------------------
> 
> A given registry harvests some others on a regular basis.
>     -- Which others? How does the list of friends get changed?
>     -- ie How do I make my private registry into a public one?
>     -- Will there be a central clearinghouse of NVO-compliant 
> registries?
> 
> I do not recall what kind of pointers the VOResource can 
> have. How can I
> say:
>     -- this service returns the data from this DataCollection
>     -- this service has an identical mirror which is this 
> other service
>     -- this DataCollection results from the computations 
> defined by this Project
> 
> What are the metadata formats that will be exchanged through 
> OAI harvesting?
>     -- VOResource?
>     -- Dublin Core?
>     -- We could ask a friendly librarian to point his OAI 
> harvester at our Dublin Core!
> 
> How are registries populated and clean?
>     -- How to get astronomers to fill in the forms (snowball effect of
> usefulness)
>     -- How to make sure their entries are meaningful
>     -- How to stop maniacs filling the registry with nonsense
> 
> What if registries are operating different versions of the 
> VOResource schema?
>     -- Do they just ignore each other
>     -- Or should we have a version negotiation protocol?
> 
