AstroGrid registry structure

Ray Plante rplante at poplar.ncsa.uiuc.edu
Sun Jun 15 23:29:47 PDT 2003


Hi Tony,

On Fri, 13 Jun 2003, Tony Linde wrote:
> Put simply, my biggest problem with VOResource schema is that, if I expand,
> say, Project, I see that (unless I am misreading) I must include Curation
> information and can include Coverage and Content information. To my mind,
> these are not relevant to the description of a project. 

Okay, I think we can work through this one.  As I've mentioned, I 
have felt that Coverage should be pulled out of the generic Resource 
metadata.  Curation and Content, I think, are the main issues.  

I think Roy's argument of conformance with DC is compelling--in fact, that 
was an important consideration from the beginning of the RSM.  We 
shouldn't throw it out lightly; nevertheless, let's look at it.  

Let's take your three classes--Service, Community, and perSpace--and make 
a list of attributes that *might* be used to describe them, a la...

  Service                  Community                 perSpace
  ---------------          -------------------       -------------------
  title                    title                     ...
  publisher                mission                   
  ...                      ...

I would concentrate on concepts rather than names.  I would emphasize 
concepts we think might apply to all resources in that set; however, I 
would not worry about whether they actually do.  Afterwards, we can go 
and see which are common to all sets, where there are similar concepts, 
and which seem limited in applicability.   
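
As a rough sketch of that comparison step (not a proposal for the actual 
lists), one could hold each class's candidate concepts as a set and let 
intersections and differences do the sorting.  Only title, publisher, and 
mission come from the table above; contact, interface_url, and owner are 
placeholders I've assumed purely for illustration, as are the function 
names.

```python
# Candidate concepts per resource class.  Only "title", "publisher" and
# "mission" appear in the table above; the other names are hypothetical
# placeholders used to make the comparison visible.
candidate_attributes = {
    "Service": {"title", "publisher", "contact", "interface_url"},
    "Community": {"title", "mission", "contact"},
    "perSpace": {"title", "owner", "contact"},
}


def common_to_all(attrs_by_class):
    """Return the concepts that appear in every class's list."""
    sets = list(attrs_by_class.values())
    return set.intersection(*sets) if sets else set()


def limited_to_one(attrs_by_class):
    """Return, per class, the concepts that no other class lists."""
    result = {}
    for name, attrs in attrs_by_class.items():
        others = set().union(
            *(a for n, a in attrs_by_class.items() if n != name)
        )
        result[name] = attrs - others
    return result


shared = common_to_all(candidate_attributes)
specific = limited_to_one(candidate_attributes)

print("common to all sets:", sorted(shared))
for name, attrs in specific.items():
    print(f"limited to {name}:", sorted(attrs))
```

Similar-but-not-identical concepts (say, publisher versus owner) would 
not be caught by a plain set comparison; those still need a human pass.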

We can also use this list to discuss some general issues, such as 
precision, optional/required attributes, etc.  

> Where do I store the
> information about the project funders or other project information? We've
> not yet even had a debate about what information is stored about project,
> person, group, etc. Organisation has the same structure - why?

The reason is that, as you say, we have not had that debate, yet.  In the
first version of VOResource, we concentrated on Organizations and
Services.  For registry prototyping, we want to be able to sufficiently
describe cone search and SIA services and the organizations that manage
them.  We can and probably should add Organization-specific metadata;  
however, at the time nothing more was needed beyond what was already
categorized under a generic resource.  Project and DataCollection were added because some
of those that participated in the modeling discussions thought they should
be called out as separate types of resources.  However, work had not
proceeded to call out metadata specific for these classes either.

> I want to see metadata for a type of entity that is unambiguous and clearly
> describes the entity. If your model does this, please explain how and where.
> I think that if we can sort out this issue, we'll be able to move forward.

I think this touches on some of the general issues in our approach that we 
should discuss:  precision of metadata definitions, cost (to data/service 
providers) of using/understanding them, and extensibility. 

By precision, I mean how specifically metadata are defined.  It's been my 
contention that if you want to get a group of diverse resources to 
interoperate, you must squint your eyes a bit--that is, blur the details 
that make them different.  That is, metadata should not be too precisely 
defined.  The opposite extreme would have a different schema for every 
resource.  The question is, what level of precision is enough?  Can we 
define things loosely and allow registrants to apply them in the way that 
makes the most sense?

The answer to that I think has a lot to do with cost.  If our metadata is 
very precise, we'll need a lot of terms which will become costly for the 
provider that has to come up with values.  (BIB-1, a flat Z39.50 schema, 
has ~100 terms in it; GEO-1 has ~300 terms.)  On the other hand, if the 
terms seem too ambiguous, registrants filling out forms may be stymied 
by what something means in a registration form.  Confusion is a cost, 
too.

In the US (at least, those of us that worked in the ISAIA project), I
think we've tended at this high level toward looser definitions that are
interpreted as is appropriate for the resource.  Further precision could
be added in extensions to the lower level as needed.  Note, however, that
imprecision does not imply inaccuracy.  It may make sense to alter names
and definitions so it is clearer how they might be applied in different
situations (e.g. see InterfaceURL definition in VOResource).

If we take curation as an example, I think the main theme here is, who is 
responsible.  This, I think, applies to organizations as well as services.  
If this theme can be rendered into metadata that stretch across all 
resources, will that make general searches easier?  Will that allow us to 
find all resources managed by HEASARC or that have so-and-so as a contact?
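
To make the question concrete, here is a minimal sketch of what such a 
general search could look like if curation metadata were shared by every 
resource type.  The Resource class, its field names, the find_by_curation 
function, and the three sample entries are all made up for illustration; 
none of them are taken from VOResource.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical generic resource carrying shared curation fields."""
    resource_type: str  # e.g. "Service", "Organization", "Project"
    title: str
    publisher: str      # who is responsible for the resource
    contact: str        # a person to ask about it


# Made-up sample entries; not real registry records.
registry = [
    Resource("Service", "Example cone search", "HEASARC", "A. Person"),
    Resource("Organization", "Example archive", "HEASARC", "B. Person"),
    Resource("Project", "Example survey", "Some Institute", "A. Person"),
]


def find_by_curation(resources, publisher=None, contact=None):
    """Return resources matching every curation criterion that was given.

    The resource type is never inspected, which only works because the
    curation fields are assumed to exist on all resources.
    """
    matches = []
    for res in resources:
        if publisher is not None and res.publisher != publisher:
            continue
        if contact is not None and res.contact != contact:
            continue
        matches.append(res)
    return matches


for res in find_by_curation(registry, publisher="HEASARC"):
    print(res.resource_type, "-", res.title)
```

If curation were instead defined separately per class, the same query 
would need one branch per resource type, which is the cost I'm worried 
about.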

cheers,
Ray



