Vocabularising dataproduct_type

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Wed Mar 25 11:51:23 CET 2020


Dear Alberto,

On Wed, Mar 25, 2020 at 10:15:53AM +0100, alberto micol wrote:
> 1.- I’d like to reassure that “measurements” is used in ObsCore, at least at ESO, where currently:
> 
[...]
> 
> (maybe, Markus, your query was for “measurement” without the ending “s” ?)

Whoops.  Now that I check the query
(http://mail.ivoa.net/pipermail/dal/2020-March/008291.html), that is
quite obviously what happened.  Apologies all around for the fake
news.
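
For what it's worth, here is a minimal Python sketch of how to avoid
this kind of slip: list the values actually present instead of testing
for one hand-typed literal.  This is not the query from the message
linked above; the helper names and the sample values are made up for
illustration, and only ivoa.obscore and dataproduct_type come from the
ObsCore standard.

```python
def count_by_type_query(table="ivoa.obscore"):
    """Build ADQL that counts rows per dataproduct_type value.

    Grouping shows every spelling in use, so a missing plural "s"
    in a hand-written literal cannot silently yield zero rows.
    """
    return (
        "SELECT dataproduct_type, COUNT(*) AS n "
        f"FROM {table} "
        "GROUP BY dataproduct_type"
    )


def count_exact(values, term):
    """Count exact, case-sensitive matches of term in values."""
    return sum(1 for value in values if value == term)


# Made-up sample of dataproduct_type values, for illustration only.
sample = ["image", "measurements", "spectrum", "measurements"]

singular_hits = count_exact(sample, "measurement")   # the mistyped term
plural_hits = count_exact(sample, "measurements")    # the term in use
```

With the made-up sample, the singular form matches nothing while the
plural matches both rows, which is exactly how an exact-match query can
report that a term is unused.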

> 2.- Our measurements products are FITS binary tables of 3 subtypes: 
> - catalog: scientific catalogue (typically all-sky) in single FITS
> binary table (26 such catalogs)

That, I think, is what the original catalog -> measurements renaming
was meant to prevent.

I don't feel strongly about having them in obscore, but right now no
recommended discovery pattern for this kind of thing will find them
(they'll show up in GloTS and hence TOPCAT, ok, but that's
non-standard).

To fit well into the VO, it would be great if these catalogues got
proper registry records as well so TAP and (as applicable) SCS
clients will find them, and they're present in the VO with reasonably
complete metadata.  I'm happy to help if you're not sure how to go
about that.

> - catalogtile: one FITS binary table for each of the tiles an ESO Public Survey (or other observing programmes) is partitioned into 
>   (22,502 such catalog tiles)

Hm... what's the scenario here, i.e., why would people looking for
such a tile run an obscore query in the first place?  Is this a
pattern you expect other data centres to follow?

> - srctbl: source tables derived from individual images (~370,000 such srctbl)

I think that is what the Obscore authors had in mind when they put in
#measurements -- is that right?

> 3- I want to stress that we make a distinction between “sources” and physical “objects”
> 
> sources: are detections on single images (single bands). It is not
> given that a detection is for a real object, it could be just only
> a spurious detection. In this sense, sources are not yet objects,
> unless they get confirmed into “objects” by the analysis process
> (see physical objects)
> 
> physical objects, e.g.:
> - objects in catalog tiles: sources in different images (e.g. in
> multiple wavelength bands) recognised to be detections of the same
> object (cross-correlation implied)
> - objects in all-sky catalogs: whereby typically the measurements
> are derived from 1 or multiple spectra of the same object

I think this is a very interesting distinction to make, and one that
could help us improve the definition of #measurements.  What if we
said:

  [#measurements is] tabular data containing a list of sources, i.e.,
  simple detections in some observation, not necessarily
  corresponding to physical objects.  Catalogues of physical objects
  derived from further analysis of such measurements are not covered
  by this term.

Does that convey the original obscore intention?

This would also make #measurements a parent of #event, I think.
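
To make concrete what such a parent relation would mean for a client,
here is a minimal Python sketch.  It assumes a plain child -> parent
mapping; the single entry encodes only the proposal above, not the
content of any published vocabulary, and the names WIDER and expand
are hypothetical.

```python
# Hypothetical child -> parent ("wider term") mapping.  The one entry
# reflects only the proposal in this message.
WIDER = {"event": "measurements"}


def expand(term, wider=WIDER):
    """Return term together with all of its transitive narrower terms."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in wider.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


measurement_terms = expand("measurements")
event_terms = expand("event")
```

A client searching for #measurements would then also match rows typed
as #event, while a search for #event would not pick up other
measurements.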

> From the above you can immediately understand that I fully second
> Laurent: A catalog can be derived from many source tables (e.g. via
> cross-correlation of source tables in different bands).

Ye...es -- the question is: do we *want* "measurements" to mean that?
So far, I think the obscore authors' answer would be no.  But given
that this is not entirely clear from the obscore spec and, more
importantly, that the term is actually being used in the wider sense
(i.e., for both object and source lists), I'd be open to changing the
meaning of #measurements to, perhaps, "any tabular data containing
coordinates" (though again I wonder if that's a concept useful for
discovery).

> 4- multi-typed catalogs: some all-sky catalogs (e.g. the PESSTO
> multi-epoch and multi-band photometry) are actually time series of
> SEDs, (I should probably change the subtype to SED to make it
> discoverable), while others are simpler (e.g. NGTS) light curves,
> ie time series of photometric points in one single band (that’s
> where the bulk of the 40 billion records (31E9) come from).

I'd say these, again, should be separate services (probably SSA, in
these cases).  I don't think you're doing anyone a service by dumping
huge and complex FITS tables into obscore.  But that, again, is an
orthogonal discussion that should take place in a separate thread and
exclusively on DAL.

Thanks,

          Markus

