Characterization data model
Francois Bonnarel
bonnarel at alinda.u-strasbg.fr
Thu Sep 1 10:18:25 PDT 2005
Anita was commenting on the Characterization pre-draft published before
the Kyoto meeting (http://alinda.u-strasbg.fr/Model/Characterisation/characterizationMay16.pdf)
Anita Richards wrote:
>
> Dear Francois, Mireille, Jonathan et al.,
>
> Notes on Characterisation
>
> I have been looking at the ALMA data model as well as other
> interferometry models and general data access issues arising from
> AstroGrid use cases.
>
> I am being very pedantic because I think that this model will be very
> useful and we need to make sure that there are no ambiguities which
> will confuse data providers or software.
>
Thank you for your careful reading of this draft. Your feedback is very
helpful.
> v 0.1
> 1.1
>
> Axes
> What is 'cinematic' - is this the temporal+spatial analogy of a
> spectral+spatial data cube?
>
No; for us, kinematic was just the velocity axis.
> I tried to introduce velocity earlier, has it been deliberately
> dropped or just deferred? It is not the same as frequency, if you have
> data on a single object in multiple spectral transitions, each of
> which has a velocity structure which overlaps; you make line
> assignments based on position, relative intensity etc, and you might
> publish a velocity spectrum or data cubes whilst the frequency data
> were only available in visibility data form. No need to add it yet if
> it complicates things but I hope it hasn't been dismissed.
>
We are aware that "spectral transition" and "radial velocity"
information are mixed on the wavelength axis. We probably had in mind
simple cases where the relationship between lambda and the velocity is
simple and unique (a single emission line, for example). The overlapping
case you are addressing is "strange" because one single data dimension
(the "spectrum") is described by two independent axes: wavelength and
velocity. But if you present this kind of data as a velocity spectrum,
how many do you need? One per spectral structure?
We also remember that Jonathan was not in favour of having two
different axes. Jonathan, could you explain your argument?
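As a minimal sketch of why the overlapping case is awkward, the snippet below converts one observed frequency to a velocity for two transitions, using the radio-convention Doppler formula v = c (f0 - f) / f0. The rest frequencies and the observed frequency are made-up illustrative numbers, and the names are hypothetical, not part of the draft model.

```python
C_KM_S = 299792.458  # speed of light in km/s


def radio_velocity(obs_freq, rest_freq):
    """Radial velocity (km/s) in the radio convention: v = c * (f0 - f) / f0."""
    return C_KM_S * (rest_freq - obs_freq) / rest_freq


# Hypothetical rest frequencies (GHz) of two transitions, for illustration only.
rest_freqs = {"line_A": 100.00, "line_B": 100.02}
observed = 100.01  # one observed channel frequency (GHz)

# The same channel maps to a different velocity for each transition.
velocities = {name: radio_velocity(observed, f0) for name, f0 in rest_freqs.items()}
```

With these numbers the single channel is blueshifted relative to line_A and redshifted relative to line_B, so a velocity axis is only well defined once a transition has been chosen.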
> Tables 1 and 2
>
> I am not sure if these are meant to contain generic terms or to be
> specific examples, can this be stated? I guess that they are just
> examples but in some cases this isn't very clear -
Yes. These two examples only illustrate what metadata would be filled in
for the different properties and levels.
> what is Quantum
> Efficiency in Sensitivity - Spatial? Is vignetting another (maybe
> more widely understood) example?
The Sensitivity property (a function) describes the variation of the
response (the observable) with respect to one characterisation axis.
Sensitivity-Spatial combines various effects such as vignetting,
variation of quantum efficiency across the detector, etc.
> One of the most important things many users (and data providers) want
> to know is, what is the faintest detectable emission - commonly called
> sensitivity or noise. I think that many people will be confused that
> it is not the Sensitivity of the Observable. In many cases this will
> be the Resolution of the Observable, in which case can we explicitly
> call it e.g. statistical error or noise. I guess that it is actually
> meant to come in as Bounds of the Observable, in which case can we
> explicitly call that Max Flux, Min Flux please?
We agree, we made a mistake here for Bounds-Observable: for a specific
observation, we will store the min and max flux recorded.
The LimitingFlux depends on the statistical error or noise and fits well
under Resolution-Observable: if we consider Resolution to be
defined as "the smallest interpretable quantity along one axis", then the
LimitingFlux, representing the smallest number of photons or the minimum flux
value above the statistical error (or noise level), can logically be hooked here.
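A minimal sketch of this split, assuming a hypothetical container class (the class and field names are illustrative and do not come from the draft schema): the recorded min and max flux go under Bounds-Observable, and the limiting flux goes under Resolution-Observable.

```python
from dataclasses import dataclass


@dataclass
class ObservableCharacterisation:
    """Hypothetical container for the Observable (flux) axis of one observation."""
    min_flux: float       # Bounds-Observable: lowest flux recorded
    max_flux: float       # Bounds-Observable: highest flux recorded
    limiting_flux: float  # Resolution-Observable: smallest flux above the noise

    def is_detectable(self, flux: float) -> bool:
        """True if a flux value is at or above the limiting flux."""
        return flux >= self.limiting_flux


# Illustrative values in arbitrary flux units, not taken from any real data set.
obs = ObservableCharacterisation(min_flux=0.002, max_flux=1.5, limiting_flux=0.005)
```

Note that the recorded minimum may lie below the limiting flux, since pixel values inside the noise are still recorded.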
But is this still true if we consider a photographic plate, where the "absolute
detection limit" and the "flux separation power" are different things?
That was one reason to distinguish them in the table. The saturation is now
also missing. We accept that putting the detection limit and saturation in the
Observation's Observable Bounds was wrong. But can we have them in the bounds of a
"detector characterization"?
> When describing a potential image/spectrum from a visibility data set,
> it is useful to be able to express the fact that the input data may
> have multiple spectral channels in each (sub)band. In this case
> Sampling - Spectral would be the smallest spectral scale available, so
> in Table 1 this is the channel width which is pixel-like - not a FWHM
> (determining the Field of View, inter alia!). The Resolution may be
> different and can be a channel FWHM if spectral smoothing has been
> applied without rebinning. Not important, but I mention this in case
> someone thinks that the Resolution-Spectral and Sampling-Spectral are
> always the same and uses one element.
OK, we agree. BTW, are the channel width and channel spacing the same for
visibility data?
> Similarly, in the Temporal domain, I suggest that Resolution and
> Sampling are allowed to be different since for interferometry
> the min. temporal resolution to make synthesis images is usually
> considerably longer than the sampling or integration time but it is
> useful to know the latter not only if you want a light curve but
> because it also determins the FoV. Whilst the user should not need to
> know that Bounds-Spatial is actually a function of sampling, it will
> be useful to have the information in the model since it is not
> inconceivable that interferometry data providers would provide
> functions to use this information.
OK
> Sensitivity - Spectral
> For interferometry data, we usually provide a bandpass correction
> function which is applied to data before imaging, extraction of
> spectra etc. Should an entry for Sensitivity - Spectral describe the
> state of the data after applying the correction, or should it describe
> the uncorrected data? I think that it should describe the state of the
> data as supplied, but that will need to be made clear.
OK, we could consider that characterisation (e.g. the sensitivity
function) describes the data as they are provided. If the bandpass correction
function has already been applied to the data, the SensitivityFunction should
include it. If not, we could mention it as "calibration metadata", but that is
no longer Characterisation.
>
> Can all the quantities be expressed as ranges (eventually functions
> but never mind that now)?
Yes. In fact, we extended the coarse-to-fine description of Coverage to
Resolution and Sampling as well. This is not shown in the tables, but is
clearer in the UML diagram (Fig. 1) and the XML schema
(http://alinda.u-strasbg.fr/Model/Characterisation/char0505.xsd).
Resolution, Sampling, etc. may then have Bounds and store [min, max]
intervals. The text and tables obviously have to be updated in this respect.
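A minimal sketch of the coarse-to-fine idea, assuming a hypothetical `AxisProperty` class (the class, field names and numbers are illustrative and are not taken from the UML diagram or the XML schema): each property carries one typical value at the coarse level and, optionally, a [min, max] interval at the finer level.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AxisProperty:
    """Hypothetical coarse-to-fine description of one property on one axis."""
    ref_value: Optional[float] = None             # coarse level: one typical value
    bounds: Optional[Tuple[float, float]] = None  # finer level: (min, max) interval

    def contains(self, value: float) -> bool:
        """True if value lies inside the bounds; falls back to ref_value equality."""
        if self.bounds is not None:
            lo, hi = self.bounds
            return lo <= value <= hi
        return self.ref_value == value


# Illustrative spectral-axis values in kHz (made up for this sketch).
sampling = AxisProperty(ref_value=125.0)                           # channel width
resolution = AxisProperty(ref_value=250.0, bounds=(125.0, 500.0))  # smoothed FWHM range
```

Keeping Sampling and Resolution as separate objects also reflects the earlier point that the channel width and the channel FWHM need not be equal.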
> Most of the characterisation slots are
> non-unique for potential images. The trickiest one is how to
> characterise the maximum spatial scale present as well as the minimum
> (i.e. resolution).
>It could be done by giving a range of resolution
> and relying on people knowing that larger scales are actually going to
> be missing; I think that anyone who hasn't grasped that is not going
> to understand a description of the visibility data in terms of uv
> distances either.
> Filling factor and
> 2.1 - bottom para on p.6 - can we drop this? I am not sure what it is
> adding to the model. Systematic small gaps in coverage arise in other
> situations than the ones described and are dealt with in different
> ways, e.g. visibility data for an imaging field containing a pulsar
> where only on-pulse data have been recorded;
OK, we agree: this concept is not fully worked out. It was meant to be
used for the description of data quality. There are probably other ways
to characterize data with a complex fine structure. Let's drop this paragraph
for now.
> 3.1
>
> We need to be careful with regard to Data Retrieval; we already have a
> Registry standard. There are many differences between describing a
> data archive - which may contain very heterogenous data - and
> describing a single image or a collection of images (or spectra
> etc. etc.) in enough detail for extraction and processing. As far as
> I can see some of the top layers of the Characterisation model could
> map to the Registry model but I would like to see this done explicitly
> with cut-offs.
We do not quite follow you here: what do you mean by "cut-offs"?
>If some aspects of the Registry model (e.g. using the
> 7 or so band names - radio, mm, IR...) don;t fit comfortably because
> they are too coarse that isn't a problem as long as the
> Characterisation and Registry models don't pretend that they are
> describing the _same_ thing in different ways.
Well, that's a tricky point:
1) The Registry RM deals with "Resources",
and Characterization with individual DataSets (but also collections of them).
These may intersect in some cases, but generally do not: not all Observations
or even collections will gain "Resource" status.
2) Characterization is layered while the Registry RM is not.
Anyway, it would have been nice if Characterization level 1 and the RM
could have shared the same formalism. Historically that was not possible, but
a mapping should be done. The actual values will differ, of course.
> I also think that we should see VOs in general and Characterisation in
> particular as being mainly aimed at providing data which will be
> passed between other VO tools. Thus 'Characterisation ... is necesary
> .. but not sufficient .. also requires details of observing...'
> worries me. In many cases I hope that the Characterisation model will
> be sufficient because we should be encouraging data providers to
> supply data with instrumetal signatures already removed, so that other
> VO tools can tackle the images etc. Observational details should be
> provided as part of giving the data a history, and we do already have
> that in the Observation DM, but they are only directly relevant to
> Characterisation if we have tools to use them (e.g. a special tool to
> convert Chandra counts or 2MASS magnitudes to physical units).
Hmmm! Does the VO have to be a data "police"? What about "old data"? What
about reprocessing of raw data? The potential use of the data differs
according to the level of description available. E.g. uncalibrated spectra
are useful for studying line ratios within the same domain, no?
> Fig. 2
>=20
> I was amused to see Observatory Location under Coverage_SPATIAL - is
> this so that you at least know what hemisphere the image is in? I
> think that level of coarseness is OK for the Registry but
> Characterisation should be describing specific data a bit better than
> that (unless it is an all-sky map...)
This class comes from the re-use of the STC classes for coordinates.
It appears in the diagram because we show the STC subclasses, and not
one big STC black box.
Is this misleading?
>
> best wishes
>
> Anita
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> Dr. Anita M. S. Richards, AstroGrid Astronomer
> MERLIN/VLBI National Facility, University of Manchester, Jodrell Bank
> Observatory, Macclesfield, Cheshire SK11 9DL, U.K. tel +44 (0)1477
> 572683 (direct); 571321 (switchboard); 571618 (fax).
>
-- 
--------------------------------------------------------------
Mireille LOUYS mailto: Mireille.Louys at astro.u-strasbg.fr
L S I I T & CDS,
Ecole Nationale Superieure Observatoire de Strasbourg
de Physique de Strasbourg, 11, Rue de l'Universite
Boulevard Sébastien Brant, BP 10413 67000 STRASBOURG
67412 ILLKIRCH Cedex Tel: +33 3 90 24 24 34
---------------------------------------------------------------
Mireille Louys and François Bonnarel
=====================================================================
Francois Bonnarel Observatoire Astronomique de Strasbourg
CDS (Centre de donnees 11, rue de l'Universite
astronomiques de Strasbourg) F--67000 Strasbourg (France)
Tel: +33-(0)3 90 24 24 11 WWW: http://cdsweb.u-strasbg.fr/people/fb.html
Fax: +33-(0)3 90 24 24 25 E-mail: bonnarel at astro.u-strasbg.fr
---------------------------------------------------------------------