OAI coordination, the root element question
Ray Plante
rplante at poplar.ncsa.uiuc.edu
Wed Nov 26 10:36:55 PST 2003
Hey Gerard,
On Wed, 26 Nov 2003, Gerard Lemson wrote:
> Are these proposals, in whatever form they'll end up being adopted,
> going to be formalized in the schema definitions as well ?
No, this is not my intention. Most of the items I mention have more
to do with OAI and are not controllable by the schema. The only item that
is *possible* to control by the schema is the choice of the root element.
Note: the rest of this message has nothing to do with our January demos.
> The current schema allows for any of the many (global) elements
> (defined under <xs:schema>) to be used as the root element of a
> valid document.
As many of you know, I am of the mind that the root element should
be the choice of the application. That is, another application may wish
to pass contact information and therefore would use <Contact> as the root
element. By making all elements global, you maximize flexibility,
reusability, and extensibility. I feel this approach is closer to how
we traditionally define metadata in which we attach unique names to
concepts.
Note that when we were using DTDs, there was no way to enforce one element
to be the root. This had little effect on our applications.
> As you say, this makes validating documents a much more
> difficult job, especially if you use some source generator like JAXB or
> Castor for which validation is almost a no-brainer.
The difficulty I referred to is specific to using <Resource> and its
substitution group. Checking for the root element outside of the
validation process is quite trivial. Consider how you would do this in
JAXB:
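    try {
        Unmarshaller un = context.createUnmarshaller();
        // the cast fails if the root element is not bound to VOResource
        VOResource vod = (VOResource) un.unmarshal(inputStream);
    }
    catch (ClassCastException ex) {
        System.err.println("Wrong Doc Type: root element is not VOResource!");
    }
    catch (JAXBException ex) {
        // unmarshal() declares this checked exception, so it must be handled
        System.err.println("Unmarshalling failed: " + ex.getMessage());
    }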
Here it is using DOM:
    // parser is assumed to be a javax.xml.parsers.DocumentBuilder
    Document doc = parser.parse(input);
    if (! doc.getDocumentElement().getTagName().equals("VOResource"))
        throw new Exception("Wrong Doc Type: root element is not VOResource!");
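For anyone who wants to try the DOM check end to end, here is a minimal, self-contained sketch using the standard JAXP parser. The class name RootCheck, the helper names, and the urn:example:vor namespace are made up for illustration; they are not part of any VO schema or tool.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class RootCheck {
    // Returns the local (prefix-free) name of the document's root element.
    static String rootName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getLocalName()
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getLocalName();
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse document", ex);
        }
    }

    // True if the root element has the expected local name.
    static boolean hasRoot(String xml, String expected) {
        return expected.equals(rootName(xml));
    }

    public static void main(String[] args) {
        // Hypothetical sample documents; the namespace URI is a placeholder.
        String good = "<VOResource xmlns=\"urn:example:vor\"><Resource/></VOResource>";
        String bad = "<Contact xmlns=\"urn:example:vor\"/>";
        System.out.println(hasRoot(good, "VOResource"));
        System.out.println(hasRoot(bad, "VOResource"));
    }
}
```

The check runs independently of schema validation, so it can be applied before or after validating, whichever suits the application.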
> If not, if the current schemas are defining valid VO documents as well,
> the current design methodology, which heavily relies on using element
> "ref"-s with their required global elements is problematic.
Specifically, what is the problem? Is it something more than
incorporating the check of the root element into the validation process?
> If you're interested I've created an alternative "port" of your schema's
> that uses element "type" definitions. It carries the same semantics.
Yes, this would be good to look at. (Thanks!)
> It validates XML documents that are almost identical to those validated
> by the current schemas, but has currently only two global elements,
> VODescription and VOResource.
Given that you have more than one possible root element, don't you still
have to do something similar to the above? For example, if we agreed on
VOResource for the OAI interface, if someone sent you a VODescription, it
would still validate. But now you would have to decide what to do if someone
erroneously sent you multiple resource descriptions where there should only
be one.
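To make that concrete, here is a rough sketch of the extra check an application would need in that case. It assumes, purely for illustration, that a VODescription carries its resource descriptions as direct child elements; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ResourceCount {
    // Counts the element children of the root, ignoring text and comments.
    static int countChildElements(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList kids = doc.getDocumentElement().getChildNodes();
            int count = 0;
            for (int i = 0; i < kids.getLength(); i++) {
                if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                    count++;
                }
            }
            return count;
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse document", ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: two descriptions where only one was expected.
        String xml = "<VODescription><Resource/><Service/></VODescription>";
        if (countChildElements(xml) != 1) {
            System.err.println("Expected exactly one resource description");
        }
    }
}
```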
> They do moreover have a lower "impedance
> mismatch" with obvious object-data modeling approaches.
I disagree with this last statement. Use of substitutionGroups, which
requires all members to be globally defined, is a *better* match to the
OO-Modeling approach than its alternative, <xs:choice>. (Consider: does
UML support the notion of an arbitrary choice relationship?) All members
of a substitution group must have the same type or be derived from the
type of the group's head element. This provides polymorphism for
elements: VOResource's child is defined to be Resource. Service
"inherits" from Resource and, therefore, can be substituted in for
Resource.
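As a rough illustration of the analogy, the relationship maps onto ordinary Java inheritance as sketched below. This is hand-written code, not the output of JAXB or Castor, and the fields title and accessURL are placeholders rather than the actual schema content.

```java
public class SubstitutionSketch {
    // Plays the role of the substitution group's head element.
    public static class Resource {
        public final String title;
        public Resource(String title) { this.title = title; }
        public String describe() { return "Resource: " + title; }
    }

    // A member of the group: its type derives from the head's type.
    public static class Service extends Resource {
        public final String accessURL;
        public Service(String title, String accessURL) {
            super(title);
            this.accessURL = accessURL;
        }
        @Override
        public String describe() { return "Service: " + title + " at " + accessURL; }
    }

    // The container declares its child as Resource, yet accepts any subtype.
    public static class VOResource {
        public final Resource child;
        public VOResource(Resource child) { this.child = child; }
    }

    public static void main(String[] args) {
        VOResource plain = new VOResource(new Resource("Sky survey"));
        VOResource svc = new VOResource(new Service("Cone search", "http://example.org/cone"));
        System.out.println(plain.child.describe());
        System.out.println(svc.child.describe());
    }
}
```

An <xs:choice> has no such direct counterpart: the container would have to enumerate each alternative explicitly instead of relying on the type hierarchy.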
The weak part of my argument is that the software binding tools do not
fully support substitution groups.
> I guess an alternative approach is to say that the current schemas are
> not normative for valid documents sent to/from (registry) services, but
> that separate schemas using the current ones are being created ?
This was not my idea; however, if we used an approach where standard
schemas just defined global types (as Wil has suggested), then it would
make sense to do something like this. That is, there would be a special
Registry schema that defined the root element needed by the Registry
schema.
cheers,
Ray