AstroGrid registry structure
Ray Plante
rplante at poplar.ncsa.uiuc.edu
Sun Jun 15 23:29:47 PDT 2003
Hi Tony,
On Fri, 13 Jun 2003, Tony Linde wrote:
> Put simply, my biggest problem with VOResource schema is that, if I expand,
> say, Project, I see that (unless I am misreading) I must include Curation
> information and can include Coverage and Content information. To my mind,
> these are not relevant to the description of a project.
Okay, I think we can work through this one. As I've mentioned, I
have felt that Coverage should be pulled out of the generic Resource
metadata. Curation and Content, I think, are the main issues.
I think Roy's argument of conformance with DC is compelling--in fact, that
was an important consideration from the beginning of the RSM. We
shouldn't throw it out lightly; nevertheless, let's look at it.
Let's take your three classes--Service, Community, and perSpace--and make
a list of attributes that *might* be used to describe them, a la...
    Service              Community            perSpace
    ---------------      -------------------  -------------------
    title                title                ...
    publisher            mission
    ...                  ...
I would concentrate on concepts rather than names. I would emphasize
concepts we think might apply to all resources in that set; however, I
would not worry about whether they actually do. Afterwards, we can go
and see which are common to all sets, where there are similar concepts,
and which seem limited in applicability.
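
A minimal sketch of that comparison step in Python, assuming each class's
candidate attributes are simply collected as a set of names. The Service and
Community entries are the ones from the table above; the perSpace entries and
the function name classify_attributes are hypothetical, chosen only to make
the example run.

```python
from collections import Counter


def classify_attributes(candidates):
    """Sort candidate attributes by how many resource classes list them.

    Returns three sorted lists: attributes common to every class,
    attributes shared by some but not all classes, and attributes
    that appear in only one class.
    """
    counts = Counter(
        attr for attrs in candidates.values() for attr in set(attrs)
    )
    n_classes = len(candidates)
    common = sorted(a for a, c in counts.items() if c == n_classes)
    shared = sorted(a for a, c in counts.items() if 1 < c < n_classes)
    limited = sorted(a for a, c in counts.items() if c == 1)
    return common, shared, limited


candidates = {
    "Service": {"title", "publisher"},
    "Community": {"title", "mission"},
    # Hypothetical entries: the real perSpace list has not been drawn up.
    "perSpace": {"title", "owner"},
}

common, shared, limited = classify_attributes(candidates)
print("common to all:", common)
print("shared by some:", shared)
print("limited to one class:", limited)
```

Matching on exact names is the weakest part of this sketch: two differently
named attributes that express the same concept would have to be mapped to one
label by hand before the comparison.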
We can also use this list to discuss some general issues, such as
precision, optional/required attributes, etc.
> Where do I store the
> information about the project funders or other project information? We've
> not yet even had a debate about what information is stored about project,
> person, group, etc. Organisation has the same structure - why?
The reason is that, as you say, we have not had that debate, yet. In the
first version of VOResource, we concentrated on Organizations and
Services. For registry prototyping, we want to be able to sufficiently
describe cone search and SIA services and the organizations that manage
them. We can and probably should add Organization-specific metadata;
however, at the time no more were needed beyond what was then categorized
by a generic resource. Project and DataCollection were added because some
of those that participated in the modeling discussions thought they should
be called out as separate types of resources. However, work had not
proceeded to call out metadata specific for these classes either.
> I want to see metadata for a type of entity that is unambiguous and clearly
> describes the entity. If your model does this, please explain how and where.
> I think that if we can sort out this issue, we'll be able to move forward.
I think this touches on some of the general issues in our approach that we
should discuss: precision of metadata definitions, cost (to data/service
providers) of using/understanding them, and extensibility.
By precision, I mean how specifically metadata are defined. It's been my
contention that if you want to get a group of diverse resources to
interoperate, you must squint your eyes a bit--that is, blur the details
that make them different. That is, metadata should not be too precisely
defined. The opposite extreme would have a different schema for every
resource. The question is, what level of precision is enough? Can we
define things loosely and allow registrants to apply them in the way that
makes the most sense?
The answer to that I think has a lot to do with cost. If our metadata is
very precise, we'll need a lot of terms which will become costly for the
provider that has to come up with values. (BIB-1, a flat Z39.50 schema,
has ~100 terms in it; GEO-1 has ~300 terms.) On the other hand, if the
terms seem too ambiguous, registrants filling out forms may be stymied
by what a term means in a registration form. Confusion is a cost,
too.
In the US (at least, those of us that worked in the ISAIA project), I
think we've tended at this high level toward looser definitions that are
interpreted as is appropriate for the resource. Further precision could
be added in extensions to the lower level as needed. Note, however, that
imprecision does not imply inaccuracy. It may make sense to alter names
and definitions so it is clearer how they might be applied in different
situations (e.g. see InterfaceURL definition in VOResource).
If we take curation as an example, I think the main theme here is, who is
responsible. This, I think, applies to organizations as well as services.
If this theme can be rendered into metadata that stretch across all
resources, will that make general searches easier? Will that allow us to
find all resources that are managed by HEASARC or that have so-and-so as a
contact?
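
To make the question concrete, here is a rough Python sketch, assuming every
resource type carries the same two curation fields. The Resource class, its
field names, the function find_by_curation, and all of the sample records are
hypothetical and are not taken from VOResource.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Resource:
    """Curation metadata assumed to be shared by every resource type."""
    kind: str       # e.g. "Service" or "Organization"
    title: str
    publisher: str  # who is responsible for the resource
    contact: str    # whom to ask about it


def find_by_curation(resources: Iterable[Resource],
                     publisher: Optional[str] = None,
                     contact: Optional[str] = None) -> List[Resource]:
    """Return resources that match the given publisher or the given contact.

    A criterion left as None is ignored, so calling with no criteria
    returns an empty list.
    """
    return [
        r for r in resources
        if (publisher is not None and r.publisher == publisher)
        or (contact is not None and r.contact == contact)
    ]


# Made-up records, for illustration only.
registry = [
    Resource("Organization", "Example archive", "HEASARC", "A. Person"),
    Resource("Service", "Example cone search", "HEASARC", "B. Person"),
    Resource("Service", "Another cone search", "Example Observatory", "A. Person"),
]

by_publisher = find_by_curation(registry, publisher="HEASARC")
by_contact = find_by_curation(registry, contact="A. Person")
print([r.title for r in by_publisher])
print([r.title for r in by_contact])
```

Because the curation fields sit on the generic resource, one query covers both
organizations and services without knowing which type each record is.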
cheers,
Ray