Putting the pieces together - Quality...

Martin Hill mch at roe.ac.uk
Fri May 14 06:14:42 PDT 2004


The data model would be the wrong place to put non-standard things :-)

I would have thought quality threshold values would be set by the datacenters 
- after all, these are the people who have already chosen effective threshold 
values when working out binary or trinary quality flags. Later we can model 
quality more carefully.
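
For concreteness, here is a minimal sketch (in Python) of the kind of 
threshold cast Ivo describes below. The function name, the 0-100 weight 
convention and the provider-default mechanism are illustrative assumptions 
only, not part of any SSA interface:

    # Hypothetical helper: collapse FUSE-style 0-100 quality weights
    # into binary good/bad flags. The requestor may supply a threshold;
    # otherwise an instrument-dependent default set by the data
    # provider is used.
    def cast_to_binary(weights, threshold=None, provider_default=50):
        cut = threshold if threshold is not None else provider_default
        return [w >= cut for w in weights]

    # e.g. cast_to_binary([0, 30, 75, 100], threshold=60)
    # -> [False, False, True, True]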

Which brings us to another point: if we can model the quality of an 
individual bit of data, how do we model the quality of large groups such as 
datasets?  Presumably some datasets are better than others, so one dataset 
'as a whole' has greater value than another - particularly since, as I 
understand it, even undergraduates should be able to post their data on the 
VO in order to use VO tools, and to allow other people to use their data :-).
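
As a strawman, a dataset-level figure of merit could be rolled up from the 
per-point quality vector - say, the fraction of samples flagged good. This 
is purely illustrative; nothing of the sort exists in the model yet:

    # Hypothetical roll-up: fraction of good samples in a binary
    # quality vector, as one crude dataset-level quality measure.
    def dataset_quality(flags):
        return sum(flags) / len(flags) if flags else 0.0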

Cheers,

Martin

On Friday 14 May 2004 1:45 pm, Ivo Busko wrote:
> Doug Tody wrote:
> <snip>
>
> > Ivo - regarding your point about data quality vectors:  As you know,
> > the SSA data model has a data quality vector.  We don't really know what
> > to put in it though.  I don't think we should put anything instrumental
> > in nature in the general SSA data model (this can be done but it would
> > go into nonstandard extension records).  Simple models for the quality
> > vector would be binary (good or bad) or trinary (known good, known bad
> > or flagged, or questionable).  Perhaps once we get more experience with
> > real data from archives it will be possible to develop a more refined
> > quality model.  (Note this should not be confused with the error vectors
> > which we already have).
> >
> >         - Doug
>
> Thanks, Doug, that sounds good enough. I agree that nothing
> instrument-specific should be put in the data model. However, something
> must be done to accommodate cases that do not follow the norm.
>
> I have in mind cases where a binary or trinary model wouldn't be enough
> to summarize the data quality information available in the original file.
> A good example is FUSE data; it uses a continuously variable 2-byte
> integer value to store a kind of "weight" (between 0 and 100), instead
> of the more commonly found bit-encoded mask. To cast that data into a,
> say, binary good/bad model, one needs an additional piece of
> information, in the form of a threshold value.
>
> Ideally, a VO transaction involving such data should allow the threshold
> value to be either specified by the requestor, or alternatively set by
> the data provider in an instrument-dependent way.
>
> In short, my point is: shouldn't the data model allow some room for
> non-standard extra bits of info such as data quality threshold values?
>
> Cheers,
>
> -Ivo

-- 
Martin Hill
Astrogrid/AVO, ROE
Tel: 07901 55 24 66


