[CATALOGUE] Starting Data Model Subgroup - terms like 'column'

Tue Aug 3 10:14:10 PDT 2004

Martin Hill wrote:

> Ed Shaya wrote:
>
>> Here.  The mere mention of columns is, in my opinion, out of place.  
>> The concept of rows and columns should not appear in any component of 
>> our data model.  They belong in a relational database data model.  
>> Here I think we are working on a more abstract level in which objects 
>> may contain other objects.  This results in tree-like structures.
>> We should worry about transformation into a set of interrelated
>> relational tables only after the VO data model for this is complete.
>> I believe that Roy correctly chimed in that VOTable can already do
>> this only because Pedro incorrectly brought up the issue of
>> describing rows and columns.
>
>
> Being a bit pedantic, but our data models won't necessarily be trees 
> either.  In fact our data models are 'relational' - they consist of 
> various bits of information in 'lumps' that make sense to us, related 
> to other 'lumps'.  Some of these relations will be tree-like, but some 
> won't.  We *could* write down our models where the 'lumps' are 'table 
> definitions' and the 'bits of information' are 'columns' in those 
> tables.  I believe, however, that we are intending to model these lumps as
> 'objects' and the bits of information as 'properties' of those 
> objects, and use UML relational diagrams to write it down.

My use of the word relational was a poor choice.  I was just saying that
a catalog should not be restricted to two-dimensional datasets.
You are right that tree-like is also not general enough, since there
could be explicit relationships from any object to any other object.
If we agree to extend the meaning of the word column to mean a set of
similarly classed objects, then I could accept its use as well.  But
a Catalog still may not have any columns, since each object may have
a differing set of properties.  Take, for instance, a list of two
clusters of galaxies: for one we know its richness and X-ray properties;
for the other we know its member names and its mass.
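
As a minimal sketch of the two-cluster case (in Python, purely for
illustration; the property names and values are hypothetical placeholders,
not real measurements and not part of any agreed VO model), one can treat
a "column" as a property present on every object and see that this
catalog has none:

```python
# Hypothetical catalog of two galaxy clusters. Property names and values
# are illustrative placeholders only, not real measurements.
catalog = [
    {"richness": 2, "xray_luminosity": 1.0e44},                  # cluster 1
    {"member_names": ["galaxy-a", "galaxy-b"], "mass": 1.0e15},  # cluster 2
]


def shared_properties(objects):
    """Return the property names present on every object.

    These are the only properties that could serve as table 'columns'.
    """
    if not objects:
        return set()
    common = set(objects[0])
    for obj in objects[1:]:
        common &= set(obj)  # keep only names this object also has
    return common


def all_properties(objects):
    """Return every property name that appears on at least one object."""
    names = set()
    for obj in objects:
        names |= set(obj)
    return names


columns = shared_properties(catalog)   # empty: no property is common to both
everything = all_properties(catalog)   # four distinct property names
print(sorted(columns), sorted(everything))
```

A table could still be forced onto this by using all four property names
as columns, but then half the cells would be empty.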

>
> Using UML to represent our data and its relationships is fine, but we 
> must also remember that our data may be stored and processed in non-OO 
> languages, such as FORTRAN.  If some find it easy to think in columns 
> and tables, and others in terms of objects and properties, we should 
> be able to cope with both.
>
I'm afraid it will be just too difficult to handle the general Catalog
in FORTRAN77.  But for certain common subclasses of Catalog it should
be fine.

> But we should avoid using particular implementations of
> representations; we shouldn't try and describe *models* in terms of 
> Java Objects/Interfaces or Sybase or VOTables or FORTRAN structs or 
> XML Schemas.  These are specific implementations of representations, 
> not suitable for our general models, but we may want to use them for 
> 'worked examples' of how our models might be used in practice.
>
Of course.  One needs either a modeling language or an ontology.  Along 
these lines, I believe that modeling languages like UML are best for 
processing and data flow architectures.  An ontology is best for
information and knowledge statement architectures.  Most of what we are
trying to do in DM is the latter.

> Cheers,
>
> Martin
>