[QUANTITY] and [OBSERVATION] data models: new drafts

Brian Thomas brian.thomas at gsfc.nasa.gov
Mon May 3 09:15:58 PDT 2004


Francois, all,

On Friday 23 April 2004 07:35 pm, Francois Bonnarel wrote:
> Dear DM  partners,
>     As I mailed already I am a little concerned about the lack of
> reactions on the two drafts: Quantity and Observation. Not to speak about
> STC! Because even if there is no real opposition to these drafts, are we
> sure that the people  will use this work if they are not convinced?
> [snip]

	Well, please review these docs then. As far as practicality of application
	is concerned, I am preparing a prototype reference package (in Java) for
	Quantity. This package should allow one to create, and serialize to/from XML,
	various core classes which support general data models. While the code is not
	ready for list-wide release at this time, I would be very happy to share
	the code with anyone interested (even better, anyone who wants to help
	code the project is *very* welcome to join and add their name as an author).
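
	As a rough illustration of "create, then serialize to/from XML" (this is
	not the prototype package itself; the class name QuantitySketch, its fields
	and the XML element names are all hypothetical), a scalar quantity could
	round-trip through XML using only the standard Java XML APIs:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical sketch: a named scalar with a value, units and an error.
final class QuantitySketch {
    final String name;
    final double value;
    final String units;
    final double error;

    QuantitySketch(String name, double value, String units, double error) {
        this.name = name;
        this.value = value;
        this.units = units;
        this.error = error;
    }

    // Build a small DOM tree and write it out as an XML string.
    String toXml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element q = doc.createElement("quantity"); // assumed element name
            q.setAttribute("name", name);
            q.setAttribute("units", units);
            Element v = doc.createElement("value");
            v.setTextContent(Double.toString(value));
            q.appendChild(v);
            Element err = doc.createElement("error");
            err.setTextContent(Double.toString(error));
            q.appendChild(err);
            doc.appendChild(q);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("could not serialize quantity", ex);
        }
    }

    // Parse the XML produced by toXml() back into an object.
    static QuantitySketch fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element q = doc.getDocumentElement();
            double value = Double.parseDouble(
                    q.getElementsByTagName("value").item(0).getTextContent());
            double error = Double.parseDouble(
                    q.getElementsByTagName("error").item(0).getTextContent());
            return new QuantitySketch(q.getAttribute("name"), value,
                    q.getAttribute("units"), error);
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse quantity", ex);
        }
    }

    public static void main(String[] args) {
        QuantitySketch flux = new QuantitySketch("flux", 1.5, "Jy", 0.1);
        System.out.println(flux.toXml());
    }
}
```

	Calling QuantitySketch.fromXml(q.toXml()) should give back an object with
	the same name, value, units and error as q.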


> and
>
>
> Brian:
> Simply put, that's just not the reality of the astrophysical concepts. Many
> (if not most) concepts belong to more than one group, and when they belong
> to a group, their relationship may be qualified (ex. I belong to this group
> under the following conditions..)
>         As a result, I don't see how pushing the strings as id's/concepts
> within UCD is going to be effective in the long run in terms of meeting VO
> requirements. We need to take a more advanced approach, such as has been
> frequently suggested by UCD3. (Brian)
	
	Where/when did you "hear" me state this opinion?!?

	I think you are confused about my point of view. As per the paper that
	Ed and I presented at ADASS 2003, we completely believe that the issue
	of overlapping concepts must be handled. Currently, the only technologies
	which appear to be able to handle this are the RDF/OWL ontology-based ones
	(in essence, "UCD3").

	I have always maintained that the string-based UCD approach is very
	flawed, and is the root cause of many of the issues in trying to transition
	from UCD1 to 1+/2. I am very much a believer that we should just keep
	UCD1 as is, and immediately start work on UCD3. Skip the "intermediate"
	steps which aren't going to be very productive as the gap between what
	is really needed (a flexible, multi-node interconnected set of semantics)
	and what is currently implemented (somewhat inflexible 2-level hierarchy based
	on strings) is too wide. An incremental approach will likely be a failure here. 
	We need to admit the failures of the present approach and move on.


>      D )   Serialization ?
>
>     What kind of serialization are we looking for ? In a SIAP or "AVO demo"
> oriented perspective, we are looking for a serialization which is a
> standardized data description. But I think the full XML serialization in
> the Quantity/Brian/Ed perspective is serialization for the data themselves
> (replacing older data formats like FITS).

	Yes, the quantity should underpin the other data models. Its reason for being
	may be simply stated:

	"To provide the basic mechanism for storage, retrieval (search) and transport
	between VO services/repositories."
 
	Hence, the Quantity gives semantic meaning only to those components
	which relate to the above requirement. In other words, it only defines
	things like dataTypes, units, etc., which relate to _all scientific data_. When
	you start talking about astronomical images, spectra, etc., then you are
	talking about higher-level data models. When you start talking about concepts
	like "Flux", "phenomena" (as per some theorists) then you are talking
	about higher level data models again. The quantity only provides a minimal
	framework for holding these things, and allows interchange between them.

	Now, given this, the serialization for the Quantity should be fairly mutable. It
	should be able to support all of the higher level datamodels. Most astronomers
	will not be aware of it, but it will make many things within the VO much easier
	(such as search across the VO for concepts like "galaxies which have h-alpha
	emission of X and are observed in the V and I bands simultaneously").
 
	Once the data is discovered, and perhaps downloaded, it will probably be
	translated back into FITS. I say: Why re-invent the wheel? We currently have
	plenty of analysis software that astronomers can use with FITS, and plenty
	of other VO-related work to do without worrying about trying to redesign how
	astronomers do their analysis (at this time).


>
> Other question: is the serialization controlled by Developer/Astronomer
> like in the various VOTable serialization proposals or automatically
> generated?

	Why not both? (Seriously!!)

	By having namespaces where models are applicable, we can control,
	appropriately, what is a valid data model.

	I would guess that perhaps 3 levels of namespace could exist (we
	could have more or fewer), these being:

	"VO"		= You need a serious proposal/community agreement to have a model here.
				If some model is in the VO namespace, then most repositories will
				need to implement it.

	"Archive"	= For archives that need to design internal models which the public won't
				see, or which are highly specialized for a very small community of
				astronomers. Oversight here is by the archive. The model may change
				more frequently, and changes won't break the VO as a whole.

	"Personal"	= For individual astronomers who want to group information in a particular way.
				I can see this as useful if the astronomer may use their model as a search tool.
				For example, they may submit this model to the VO repositories as a "bag" to
				be filled when they initiate a search. If this was the search paradigm, then there
				probably would be some standard "search models" so that Joe Astronomer
				doesn't have to design his own model in order to search the VO. I'm thinking
				data model design is more of a power-user type of activity.

	Perhaps the last namespace level isn't merited, but I think the other 2 are. 
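
	To make the three levels a little more concrete, here is a minimal sketch
	of how a model name could be qualified by its namespace level. Nothing
	here comes from an agreed VO design: the type names NamespaceLevel and
	ModelName, the "level:name" syntax and the example model names are all
	assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical sketch of the three namespace levels described above.
enum NamespaceLevel {
    VO("community agreement", true),
    ARCHIVE("the owning archive", false),
    PERSONAL("the individual astronomer", false);

    final String oversight;                   // who approves changes
    final boolean expectedOfMostRepositories; // must repositories implement it?

    NamespaceLevel(String oversight, boolean expectedOfMostRepositories) {
        this.oversight = oversight;
        this.expectedOfMostRepositories = expectedOfMostRepositories;
    }
}

// A data model name qualified by the level it lives in (assumed syntax).
final class ModelName {
    final NamespaceLevel level;
    final String localName;

    ModelName(NamespaceLevel level, String localName) {
        this.level = level;
        this.localName = localName;
    }

    // e.g. level VO and local name "Image" give "vo:Image"
    String qualified() {
        return level.name().toLowerCase(Locale.ROOT) + ":" + localName;
    }
}
```

	A repository could then decide from the level alone whether a model is
	one it is expected to support (VO) or one it may ignore (Archive, Personal).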


>
>      And for serialization we need attributes: Jonathan defined a few of
> them by defining utypes for SIA. Should we go on before the next version of
> the draft?
>
>
>      E ) Packaging
>
>      This is somewhat related to the previous point. If we are using the
> datamodel for data description, we need a description of the data which
> will come next? Is Quantity what we need for that ? What about the coding
> compression/format stuff? Don't we have to add a packaging class to the
> draft ?

	I guess it depends on what you mean by "packaging". I'd say packaging 
	depends on the model.

	Quantity controls the position/content of general scientific and IO (like compression)
	information at a low-level.
	
	So, to provide some concrete examples,  if you ask, "where do I find the errors for 
	this number?" the Quantity model tells you. If you are asking "where do I find the 
	name of the observer for this telescope?" then the UCD/phenomena model tells you 
	(which inherits from the Quantity, at least, that's how I see it in UCD3). If you are asking 
	"what is the fundamental information that I need to include to describe an astronomical 
	image?" then the SIAP/Image data model tells you what concepts from UCD and Quantity 
	are needed and where they go.
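
	The layering in these examples might be sketched as follows. This is only
	one possible reading: the class names and the concept names "observerName"
	and "flux" are placeholders made up for the example, not taken from any
	UCD list or data model draft.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Lowest layer: generic scientific data only (value, units, error).
class LayeredQuantity {
    final String value; // kept as text so numbers and names both fit
    final String units; // may be empty
    final String error; // may be empty; "where do I find the errors?"

    LayeredQuantity(String value, String units, String error) {
        this.value = value;
        this.units = units;
        this.error = error;
    }
}

// Middle layer (hypothetical): a semantic concept inheriting from the quantity.
class LayeredConcept extends LayeredQuantity {
    final String concept; // e.g. "observerName"

    LayeredConcept(String concept, String value, String units, String error) {
        super(value, units, error);
        this.concept = concept;
    }
}

// Top layer (hypothetical): an image model names the concepts it needs.
final class ImageModelSketch {
    // Placeholder requirements, chosen only for the example.
    static final List<String> REQUIRED = Arrays.asList("observerName", "flux");

    // Return the required concepts that the supplied data does not cover.
    static List<String> missing(List<LayeredConcept> supplied) {
        List<String> absent = new ArrayList<>(REQUIRED);
        for (LayeredConcept c : supplied) {
            absent.remove(c.concept);
        }
        return absent;
    }
}
```

	Here the error lives on the lowest layer, the meaning of a value lives on
	the concept layer, and the image model only states which concepts it needs.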

	Thus, to provide a short answer to your question, the Quantity should contain a prescription
	for how to do the IO/compression. It is currently not complete in that regard, although the
	Quantity DM group does have some early drafts of how to achieve this.

	Regards,

	=b.t.

>
>    Just to be discussed here.
>
> Cheers
> François
>
> SAUVONS LA RECHERCHE :  <http://recherche-en-danger.apinc.org/>
>
> =====================================================================
> Francois   Bonnarel               Observatoire Astronomique de Strasbourg
> CDS (Centre de donnees          11, rue de l'Universite
> astronomiques de Strasbourg)    F--67000 Strasbourg (France)
>
> Tel: +33-(0)3 90 24 24 11       WWW: http://cdsweb.u-strasbg.fr/people/fb.html
> Fax: +33-(0)3 90 24 24 25       E-mail: bonnarel at astro.u-strasbg.fr
> ---------------------------------------------------------------------

-- 

  * Dr. Brian Thomas 

  * Dept of Astronomy/University of Maryland-College Park 
  * Code 630.1/Goddard Space Flight Center-NASA

  *   fax: (301) 286-1775
  * phone: (301) 286-6128 [GSFC]
           (301) 405-2312 [UMD] 



