[QUANTITY] Choice of DM representation (Was: Re: [QUANTITY] An object- and domain-oriented proposal in UML and XML)

Wed May 28 07:33:29 PDT 2003

	Hi David,

On Tuesday 27 May 2003 06:42 pm, Giaretta, DL (David)  wrote:
> [snip]
> I'm a little concerned that your paper seems to be an XML specific data
> model - things like the Id and IdRef.

	Well, it's not XML-specific per se. I describe the model in terms of UML
	diagrams rather than in XML Schema for this very reason. That said,
	*some* representation of the data model *will* have to be chosen
	sooner or later. XML has a number of advantages which I think make
	it the likely candidate. I merely wished to show that the model isn't
	inconsistent with XML technology. Of course, IF some other technology
	is chosen to represent the model (and what would you choose over XML
	at this point?) the id/idref stuff can easily come out (I think it only
	appears in 1 or 2 places in the model at this point!)
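
	To illustrate why I consider id/idref a serialization detail rather than
	part of the model, here is a minimal Python sketch. The class names
	(Unit, Quantity) and the XML element and attribute names are hypothetical
	and are NOT taken from the whitepaper. In the model a quantity simply
	holds a reference to its unit; only when writing XML does that reference
	become an id/idref-style pair.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Unit:
    # Hypothetical class, for illustration only.
    name: str


@dataclass
class Quantity:
    # Hypothetical class; 'unit' is a plain object reference in the model.
    name: str
    value: float
    unit: Unit


def to_xml(quantities):
    """Write each shared Unit once and point at it with an idref-style attribute."""
    root = ET.Element("quantities")
    unit_ids = {}  # maps id(unit object) -> XML id string
    for q in quantities:
        key = id(q.unit)
        if key not in unit_ids:
            unit_ids[key] = f"u{len(unit_ids) + 1}"
            ET.SubElement(root, "unit", {"id": unit_ids[key], "name": q.unit.name})
        ET.SubElement(
            root,
            "quantity",
            {"name": q.name, "value": str(q.value), "unitRef": unit_ids[key]},
        )
    return root


jy = Unit("Jy")
root = to_xml([Quantity("flux_a", 1.5, jy), Quantity("flux_b", 2.0, jy)])
print(ET.tostring(root, encoding="unicode"))
```

	A different representation would carry the same reference in whatever
	way suits it, which is why the id/idref attributes can come out without
	touching the model.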

>
> It seems to me that we should be very clear that the data model should be
> independent of the format of the data. The same data can - and probably
> will - be represented as FITS, XML, Ascii, etc. We should clearly separate
> the representation from the actual information and one of the tests of the

	I agree and disagree mildly with various parts of this statement. 

	First, of the choices you offer, XML is the clear leader. FITS cannot store
	object or hierarchical data (unless you believe a single level of tables is
	a hierarchy...), and ASCII is just a broader category that XML belongs to.
	Why choose plain ASCII when XML already exists, is accepted by the WWW
	community as a standard, AND comes with supporting technologies like
	Web services, validation, transformation, etc.?

	That said, I DO agree that the data model and its representation *must*
	be able to represent and/or wrap all existing data that the VO will have.
	This means that the data model must be able to encapsulate all FITS
	meta-data, be able to serve as a web-ready wrapper around a FITS file,
	and be translatable into a FITS file as needed.

	To be clear about what is needed here, I think the data model has the
	following requirements:

	- must have an internet-ready, easily exchangeable representation (format)
	- must have some inheritance and domain properties to allow for local
	  expansion of the meta-data as desired
	- must be able to represent all the data we currently have, and plan to have
	- must be easily searchable
	- must be easily archivable (e.g. supports knowledge capture, is easy to
	  read and understand)
	- must be easily interchangeable and transformable
	- *perhaps* support future scientific publishing standards

	XML appears (to me) to best fit the bill for all of these needs insofar as
	choice of a representation goes. Furthermore, and I think many will find
	this a surprising statement, we DON'T need the data model to support loads
	of analysis and plotting software. Why? Because we can translate the DM
	into FITS as needed. Why reinvent the wheel here if we don't need to? I
	don't foresee the death of FITS anytime soon for the average scientist.
	The data model we are talking about, if successful, will generally be used
	sight unseen, supporting the machinery of the VO and its data providers.

> data model must surely be - "does it work if this piece of data is a FITS
> file" and "does it work if the data is pure XML" - and we'd better get an
> answer "yes" for both.
>

	With this thought, I definitely agree. The DM should be able to represent
	information in any current data format used by VO participants (e.g. FITS
	is a big one to target).

	I certainly don't believe that XML is incapable of holding FITS information. We
	tested this out with our FITSML project (see http://xml.gsfc.nasa.gov) quite
	some time ago. It's the other way around that is impossible, i.e. using FITS
	to represent everything that XML can (generally because FITS doesn't support
	hierarchical data structures).
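
	As a rough sketch of the "XML can hold FITS information" direction,
	the Python below wraps a few FITS-style header cards (keyword, value,
	comment) in XML and reads them back unchanged. The element names
	(header, card, value, comment) are made up for this example and are
	NOT the FITSML schema; values are kept as strings to keep it simple.

```python
import xml.etree.ElementTree as ET

# A few FITS-style header cards as (keyword, value, comment) tuples.
cards = [
    ("SIMPLE", "T", "conforms to FITS standard"),
    ("BITPIX", "16", "bits per pixel"),
    ("NAXIS", "2", "number of axes"),
]


def cards_to_xml(cards):
    """Wrap flat header cards in a (hypothetical) XML header element."""
    header = ET.Element("header")
    for keyword, value, comment in cards:
        card = ET.SubElement(header, "card", {"keyword": keyword})
        ET.SubElement(card, "value").text = value
        ET.SubElement(card, "comment").text = comment
    return header


def xml_to_cards(header):
    """Recover the flat list of cards from the XML wrapper."""
    return [
        (c.get("keyword"), c.findtext("value"), c.findtext("comment"))
        for c in header.findall("card")
    ]


header = cards_to_xml(cards)
# Serialize to bytes and parse again to mimic exchanging the document.
round_tripped = xml_to_cards(ET.fromstring(ET.tostring(header)))
print(round_tripped == cards)
```

	The flat keyword list maps into XML without loss here; it is nested
	element structure going the other way that has no natural home in a
	flat header.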

> The Open Archival Information System Reference Model (OAIS)
> (http://wwwclassic.ccsds.org/documents/pdf/CCSDS-650.0-B-1.pdf) - now an
> ISO standard - carefully distinguishes between Content Information and
> Representation Information and I think this is an important distinction.
>

	Thanks for the link, I'll take a look.

							Regards,

							=b.t.

> ...David
>
>
>
> -----Original Message-----
> From: Brian Thomas [mailto:thomas at mail630.gsfc.nasa.gov]
> Sent: 27 May 2003 18:33
> To: dm at ivoa.net
> Subject: [QUANTITY] An object- and domain-oriented proposal in UML and
> XML
>
>
>
> 	Hi all,
>
> 	Ok, since I see others have put together proposals for the simple quantity,
> 	now Ed and I have also done so. What is different about this proposal?
> 	Concisely, it describes an object- and domain-oriented simple quantities
> 	data model. We use these technologies in order to enable a number of
> 	important requirements, including the ability for the quantity data model
> 	to describe data at every data repository using machine readable
> 	inheritable concepts/classes/standards.
>
> 	Whitepaper is available in a variety of formats from the following URLs:
>
> 	http://nvo.gsfc.nasa.gov/DataModel/VOSimpleQuantityDataModel.html
> 	http://nvo.gsfc.nasa.gov/DataModel/VOSimpleQuantityDataModel.doc
> 	http://nvo.gsfc.nasa.gov/DataModel/VOSimpleQuantityDataModel.sxw
>
> 	(where the extension tells you what you are dealing with...the last is the
> 	OpenOffice format).
>
>
> 					Regards,
>
> 					=b.t.

-- 

  * Dr. Brian Thomas 

  * Code 630.1 
  * Goddard Space Flight Center NASA

  *   fax: (301) 286-1775
  * phone: (301) 286-6128

QOTD:
	"I thought I saw a unicorn on the way over, but it was just a
	horse with one of the horns broken off."