handling metadata with multiple values

Thomas McGlynn tam at lheapop.gsfc.nasa.gov
Tue Aug 12 13:44:56 PDT 2003


Hmmmm.  I suspect I'm going to regret this, but I'm not
sure I'm happy about the idea that, for essentially any
value where we might have a list as a response, we
need to define two distinct keywords for that item.

I.e., we need
    Date and DateList, UCD and UCDList, Instrument and InstrumentList,
and so forth.  It seems like we're mixing structure and semantics here.

If I understand what has been proposed we have (using UCDs as the
example):

1. Scalar or vector:
    <UCD> x1 </UCD>
    <UCD> x2 </UCD>

2. Scalar value is <UCD> x1 </UCD>
    Vector is:
    <UCD>
         <item> x1 </item>
         <item> x2 </item>
    </UCD>

3. Scalar or vector:
    <UCDList>
         <UCD> x1 </UCD>
         <UCD> x2 </UCD>
    </UCDList>
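
To make the three layouts concrete, here is a rough sketch of how a
reader might pull the values out of each one, using Python's standard
xml.etree.ElementTree module. The <resource> wrapper element and the
function names are made up for illustration; only the UCD markup comes
from the examples above.

```python
import xml.etree.ElementTree as ET

# Hypothetical <resource> wrapper; only the UCD markup is from the post.
LAYOUT_1 = "<resource><UCD> x1 </UCD><UCD> x2 </UCD></resource>"
LAYOUT_2 = "<resource><UCD><item> x1 </item><item> x2 </item></UCD></resource>"
LAYOUT_3 = "<resource><UCDList><UCD> x1 </UCD><UCD> x2 </UCD></UCDList></resource>"


def read_layout_1(root):
    # Repeated elements: every <UCD> child is one value.
    return [e.text.strip() for e in root.findall("UCD")]


def read_layout_2(root):
    # One <UCD> element holding <item> children.
    return [e.text.strip() for e in root.findall("UCD/item")]


def read_layout_3(root):
    # A <UCDList> container one level above the <UCD> values.
    return [e.text.strip() for e in root.findall("UCDList/UCD")]


values = [
    read_layout_1(ET.fromstring(LAYOUT_1)),
    read_layout_2(ET.fromstring(LAYOUT_2)),
    read_layout_3(ET.fromstring(LAYOUT_3)),
]
print(values)
```

All three readers recover the same list; what differs is the path the
reader has to know and the number of element names involved.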


Case for 1:

     + Scalars handled transparently.
     + Fewest keywords.
     + Easy access to underlying data.

     - ??

Case for 2:

     + Single keyword handles all kinds of lists
     - Need to handle scalars and lists separately
     - Harder to find data in lists.

Case for 3:

     + Scalars handled transparently.
     + Consistent access to data but need to go
       one layer deeper.
     - Need two keywords defined for each data type.
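
The "handle scalars and lists separately" point against 2 can be
sketched as follows, again in Python with xml.etree.ElementTree. This is
only an illustration under assumptions: the <resource> wrapper and the
function names are hypothetical, not part of any proposal.

```python
import xml.etree.ElementTree as ET


def read_ucd_layout_2(root):
    # Layout 2: a scalar is text inside <UCD>, a vector is <item> children,
    # so the reader has to check which form it was given.
    ucd = root.find("UCD")
    items = ucd.findall("item")
    if items:
        return [e.text.strip() for e in items]
    return [ucd.text.strip()]


def read_ucd_layout_1(root):
    # Layout 1: a scalar is simply a list of length one, no branch needed.
    return [e.text.strip() for e in root.findall("UCD")]


scalar_2 = read_ucd_layout_2(ET.fromstring(
    "<resource><UCD> x1 </UCD></resource>"))
vector_2 = read_ucd_layout_2(ET.fromstring(
    "<resource><UCD><item> x1 </item><item> x2 </item></UCD></resource>"))
scalar_1 = read_ucd_layout_1(ET.fromstring(
    "<resource><UCD> x1 </UCD></resource>"))
print(scalar_2, vector_2, scalar_1)
```

The layout 2 reader needs the extra branch; the layout 1 reader treats
one value and many values the same way.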


The + that was suggested for 3 versus 1 is that if a user properly indents
an XML file, then it's easier to see the structure of the file.
I think that's debatable. Is

      <SUBJECT> x </SUBJECT>
      <SUBJECT> y </SUBJECT>

      <OBJECT> 1 </OBJECT>
      <OBJECT> 2 </OBJECT>

  really less readable than:

      <SUBJECTS>
        <SUBJECT> x </SUBJECT>
        <SUBJECT> y </SUBJECT>
      </SUBJECTS>

      <OBJECTS>
         <OBJECT> 1 </OBJECT>
         <OBJECT> 2 </OBJECT>
      </OBJECTS>

If writers pay attention to formatting, I think the terseness of the
first at least counterbalances the explicit punctuation of the second.
This might not hold if the values inside were themselves structures,
but for the moment I believe we are discussing cases where the
values are simple strings or numbers.

In practice I don't think real XML files are human readable anyway,
so this is a red herring.  Regardless, doubling the number of
keywords in our dictionaries seems a large price to pay to handle
this relatively small issue.

	Cheers,
	Tom



