Putting the pieces together...

Doug Tody dtody at aoc.nrao.edu
Fri May 14 11:26:35 PDT 2004


Hi Ivo -

Having a uniform quality metric such as 0-1.0 or 0-100 is a good idea;
that would work too and seems general enough.  Again, it would be useful
to survey what is actually done currently in spectral archives, so that
we can be sure that what we come up with is reasonable.

> In short, my point is: shouldn't the data model allow some room for
> non-standard extra bits of info such as data quality threshold values?

This sounds like what I meant by extension records.  At the level of a
dataset I think we should always allow "nonstandard" extensions which extend
the core model using the same mechanisms as the core itself.  In many
cases this just means adding some extra keywords, which is probably how
one would encode a threshold value.  Alternatively, this extra information
could take the form of additional structured entities in the dataset.  Of
course it would take extra work to make use of such information, but I
think many people would like to have it available even if it is not yet
standardized, so long as it does not obscure the core model.

	- Doug



On Fri, 14 May 2004, Ivo Busko wrote:

> 
> Doug Tody wrote:
> <snip>
> > Ivo - regarding your point about data quality vectors:  As you know,
> > the SSA data model has a data quality vector.  We don't really know what
> > to put in it though.  I don't think we should put anything instrumental
> > in nature in the general SSA data model (this can be done but it would
> > go into nonstandard extension records).  Simple models for the quality
> > vector would be binary (good or bad) or trinary (known good, known bad
> > or flagged, or questionable).  Perhaps once we get more experience with
> > real data from archives it will be possible to develop a more refined
> > quality model.  (Note this should not be confused with the error vectors
> > which we already have).
> > 
> >         - Doug
> 
> Thanks, Doug, that sounds good enough. I agree that nothing
> instrument-specific should be put in the data model. However, something
> must be done to accommodate cases that do not follow the norm.
> 
> I have in mind cases where a binary or trinary model wouldn't be enough
> to summarize the data quality information available in the original file.
> A good example is FUSE data; it uses a continuously variable 2-byte
> integer value to store a kind of "weight" (between 0 and 100), instead of
> the more commonly found bit-encoded mask. To cast that data into, say, a
> binary good/bad model, one needs an additional piece of information, in
> the form of a threshold value.
> 
> Ideally, a VO transaction involving such data should allow the threshold
> value either to be specified by the requestor or to be set by the data
> provider in an instrument-dependent way.
> 
> In short, my point is: shouldn't the data model allow some room for
> non-standard extra bits of info such as data quality threshold values?
> 
> Cheers,
> 
> -Ivo
> 
> 


