[Cube/vo-dml] ivoa datatypes

Carlos Rodrigo crb at cab.inta-csic.es
Tue May 6 09:04:29 PDT 2014


Hi

I have long had a question that may be related to this discussion (unless I have
misunderstood everything).

I want to serialize a spectrum in a VOTable.
I have two fields: wavelength and flux.

<FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value" ucd="em.wl" unit="angstrom"
datatype="double"/>
<FIELD name="FLUX" utype="spec:Data.FluxAxis.Value" ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A"
datatype="double"/>

The ucd, the unit and also the name of the spectral and flux axes are all given there.

But reading the Spectrum DM (at least version 2.0, but I think that it was similar in the previous
one and in other DataModels) I get the impression that I must duplicate this information in a
Characterization group:

<GROUP name="Characterization">
 <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
  <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name" value="FLUX"  .../>
  <PARAM name="FluxAxisUcd"  utype="spec:Char.FluxAxis.ucd"  value="phot.flux.density;em.wl" .../>
  <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit" value="erg/cm2/s/A" .../>
 </GROUP>
 <GROUP name="Char.SpectralAxis">
  <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name" value="WAVELENGTH" .../>
  <PARAM name="SpectralAxisUcd"  utype="spec:Char.SpectralAxis.ucd"  value="em.wl" .../>
  <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit" value="angstrom" .../>
 </GROUP>
</GROUP>

where I state the name, ucd and unit of the spectral and flux axes a second time.
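
To make the overlap concrete, here is a minimal Python sketch that reads both places and compares them. It is only an illustration: the fragment is a simplified copy of the example above, the VOTable namespace is left out, and the datatype/arraysize attributes on the PARAMs as well as the function name duplicated_metadata are assumptions of mine, not something taken from the Spectrum DM.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment modeled on the example above.
# Assumptions: no XML namespace, and datatype/arraysize filled in on the PARAMs.
FRAGMENT = """
<TABLE>
  <GROUP name="Characterization">
    <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
      <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name" value="FLUX" datatype="char" arraysize="*"/>
      <PARAM name="FluxAxisUcd" utype="spec:Char.FluxAxis.ucd" value="phot.flux.density;em.wl" datatype="char" arraysize="*"/>
      <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit" value="erg/cm2/s/A" datatype="char" arraysize="*"/>
    </GROUP>
    <GROUP name="Char.SpectralAxis">
      <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name" value="WAVELENGTH" datatype="char" arraysize="*"/>
      <PARAM name="SpectralAxisUcd" utype="spec:Char.SpectralAxis.ucd" value="em.wl" datatype="char" arraysize="*"/>
      <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit" value="angstrom" datatype="char" arraysize="*"/>
    </GROUP>
  </GROUP>
  <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value" ucd="em.wl" unit="angstrom" datatype="double"/>
  <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value" ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A" datatype="double"/>
</TABLE>
"""


def duplicated_metadata(xml_text):
    """Return {axis: {key: (value_in_FIELD, value_in_PARAM)}} for name, ucd and unit."""
    root = ET.fromstring(xml_text.strip())
    fields = {f.get("utype"): f for f in root.iter("FIELD")}
    params = {p.get("utype"): p.get("value") for p in root.iter("PARAM")}
    report = {}
    for axis in ("SpectralAxis", "FluxAxis"):
        field = fields[f"spec:Data.{axis}.Value"]
        report[axis] = {
            key: (field.get(key), params[f"spec:Char.{axis}.{key}"])
            for key in ("name", "ucd", "unit")
        }
    return report


report = duplicated_metadata(FRAGMENT)
for axis, pairs in report.items():
    for key, (in_field, in_param) in pairs.items():
        # Each line shows whether the FIELD attribute and the PARAM value agree.
        print(axis, key, in_field == in_param)
```

Every pair compared here carries the same string twice, which is exactly the redundancy I am asking about.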

Is that really needed? What for? I've always found this odd.

Carlos

On 06/05/14 17:03, Laurino, Omar wrote:
> Hi Pierre,
> 
>  
> 
>     Let me clarify my position.
> 
> 
> Your feedback has been valuable in the Tiger Team and is always welcome.
> 
> =====
> TL;DR reply (more details follow):
> 
>     I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to move some
>     meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD entity towards
>     the proposed GROUP VO-DML hierarchy extension. I noted that this point would be extremely
>     annoying for all VOTable clients such as TOPcat or Aladin for which this metadata information
>     must stay in the FIELD entities.
> 
> 
> I am not sure exactly what you are referring to. If it is what Gerard commented on, yes, this was
> fixed long ago after you made this comment.
> 
> If it is not, I am giving more information in the second part, but in summary we are trying to
> standardize the serialization of Data Models also for the reason you mention: allowing clients to
> know where to look for metadata, which is tricky, to say the least, with the current usages and
> standards (see the second part of the email for details and examples).
> 
>     For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic IVOA
>     model description
>     (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml), the
>     "quantity" entry duplicates now the UCD role and unit role.
> 
>  
> We are not duplicating existing standards, we are defining a standardized way to describe and
> serialize data models in a machine-readable way. You might be confusing the two levels of the
> solution, which correspond to two different documents: VODML descriptions of data models, and the
> serialization of such data models in VOTable. In the second document we use the standardized units
> and ucd and the corresponding VOTable standard attributes.
> 
>  
> 
>       And I have to say that the current basic IVOA model appears to me too heterogeneous to be used
>     without fear: "identity, rational, complex, duration, anyURI, boolean, real, nonnegativeInteger,
>     datetime, integer, string, quantity". For a non-DM person, it is quite difficult to understand
>     why one data type or another is considered a basic datatype (duration? datetime? anyURI?),
>     and why others are not (char? range? frequency? ...).
> 
> 
> Where to draw the line is a good question, and the current descriptions have been available for
> comment for about a year, so we are happy we are finally discussing them!
> 
> =====
> 
> 
> More detailed responses below.
> 
>  
> 
>     I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to move some
>     meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD entity towards
>     the proposed GROUP VO-DML hierarchy extension. I noted that this point would be extremely
>     annoying for all VOTable clients such as TOPcat or Aladin for which this metadata information
>     must stay in the FIELD entities.
> 
> 
> 
> I am not sure whether you refer to the fact that in an early proof of concept serialization there
> were standalone PARAMs for unit and ucd. If that's the case, as Gerard pointed out this was fixed
> long ago in response to your feedback and the result is in section 6.8 of the UTYPEs draft we
> presented in Heidelberg one year ago, as well as in the actual examples (Reference 1 below).
> 
> It may also sound like you are worried about FIELDref having the UCD metadata as opposed to FIELDs.
> If that's the case, there are several current standards and production implementations that use UCDs
> in FIELDrefs. I am not going to elaborate too much on this, since I am not sure whether this is
> really what you meant, but I will give a couple of references, just in case. The PhotDM, in section
> C.2 (Reference 2) provides an example of a Cone Search response, and uses FIELDrefs (with UCDs).
> FIELDs are not even mentioned. This is, I believe, taken directly from the note by Sebastien et al.
> (Reference 3) on how to serialize Photometry Measurements in VOTable. The only examples that make
> use of FIELDs (section 4.1 and 4.2) have two sets of (different) UCDs, one for the FIELDs and one
> for the FIELDrefs. The other examples do not mention FIELDs.
> 
> 
> In any case, whether you meant the first or the second interpretation, more generally, the problem
> is that the current standards make it hard for clients to make sense of the metadata, and this is
> one of the reasons why we are trying to standardize the serialization of data models: to make
> clients' life easier. 
> 
> As far as I know this only applies to UCDs and UTYPEs, because FIELDrefs can only have these
> attributes (Reference 4, Section 7.2).
> 
> Some models (e.g. Spectrum 1.1, Reference 5) reify UCDs by creating UCD fields in the model
> (thus creating many *.ucd UTYPEs). For instance, see the VOTable example in section 8.2 (I'm
> including a snippet for convenience):
> 
>     <PARAM ID="DataFluxUcd" datatype="char" name="DataFluxUcd"
>     utype="spec:Spectrum.Data.FluxAxis.Ucd" value="phot.flux.density;em.wl" arraysize="*">
>     <DESCRIPTION>UCD for flux</DESCRIPTION>
>     </PARAM>
> 
> 
> Notice that, as opposed to Gerard's 2012 proof of concept, this is stated in a *standard* document.
> 
> The status quo is that a client parsing a *standard* Spectrum 1.1 VOTable (I am using the example
> above, but there may be other examples in other models) can find a UCD in many different places:
>   - a FIELDref with @utype spec:Spectrum.Data.FluxAxis
>   - a FIELD referenced by a FIELDref and without a @utype
>   - a FIELD with @utype spec:Spectrum.Data.FluxAxis
>   - a PARAM with @utype spec:Spectrum.Data.FluxAxis.Ucd
>   - a TD relative to a FIELD with @utype spec:Spectrum.Data.FluxAxis.Ucd
> 
> This is what we are trying to standardize, so that it is clear to clients how to look for metadata
> in an unambiguous way. Even better, with a standard like the one suggested by the Tiger Team,
> parsing a VOTable according to a data model becomes a mechanical effort, so that users and
> developers can use libraries, which is currently impossible (if not convinced by the above example
> see the Current Usages document, Reference 6).
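
As an illustration of the lookup problem described in the quoted list, here is a minimal Python sketch of a client that checks several of those places for the flux UCD. It is a sketch under stated assumptions: the sample document is invented, the VOTable namespace is omitted, the name find_flux_ucds is hypothetical, and the TD case (a UCD stored as cell data) is left out for brevity.

```python
import xml.etree.ElementTree as ET

FLUX = "spec:Spectrum.Data.FluxAxis"

# Hypothetical document (no namespace) that stores the same UCD in three
# of the places a Spectrum 1.1 client may have to inspect.
DOC = """
<TABLE>
  <GROUP utype="spec:Spectrum.Data">
    <FIELDref ref="flux" utype="spec:Spectrum.Data.FluxAxis" ucd="phot.flux.density;em.wl"/>
    <PARAM name="DataFluxUcd" datatype="char" arraysize="*"
           utype="spec:Spectrum.Data.FluxAxis.Ucd" value="phot.flux.density;em.wl"/>
  </GROUP>
  <FIELD ID="flux" name="FLUX" datatype="double" ucd="phot.flux.density;em.wl"/>
</TABLE>
"""


def find_flux_ucds(root):
    """Collect (location, ucd) pairs from the places a client has to check."""
    fields_by_id = {f.get("ID"): f for f in root.iter("FIELD")}
    found = []
    for ref in root.iter("FIELDref"):
        if ref.get("utype") != FLUX:
            continue
        # Case 1: the UCD sits on the FIELDref itself.
        if ref.get("ucd"):
            found.append(("FIELDref", ref.get("ucd")))
        # Case 2: the UCD sits on the referenced FIELD, which has no utype.
        target = fields_by_id.get(ref.get("ref"))
        if target is not None and target.get("utype") is None and target.get("ucd"):
            found.append(("FIELD via FIELDref", target.get("ucd")))
    # Case 3: a FIELD that carries the flux utype directly.
    for field in root.iter("FIELD"):
        if field.get("utype") == FLUX and field.get("ucd"):
            found.append(("FIELD", field.get("ucd")))
    # Case 4: a PARAM whose value is the UCD (the reified *.Ucd utype).
    for param in root.iter("PARAM"):
        if param.get("utype") == FLUX + ".Ucd":
            found.append(("PARAM", param.get("value")))
    return found


locations = find_flux_ucds(ET.fromstring(DOC.strip()))
for where, ucd in locations:
    print(where, ucd)
```

Even in this small case the client needs four separate code paths, and nothing tells it which location wins if the values disagree.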
> 
>  
> 
>     For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic IVOA
>     model description
>     (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml), the
>     "quantity" entry duplicates now the UCD role and unit role.
> 
> 
> I believe you are confusing two levels, which are represented by two documents. One level is the
> data model description. Data Models can (in fact they do, see Spectrum 1.1) define ucd and unit as
> fields of their models, reifying them. Even when they don't, there are some cases (I can provide
> examples from production services) where the data publisher needs to reify some of the metadata. For
> instance consider a column where the same quantity is expressed in different units: in this case the
> unit piece of metadata becomes data and you need a column to store them.
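
A small Python sketch of the mixed-unit case mentioned in the quoted paragraph, where the unit has to travel with each row. The rows, the conversion table and the function name to_angstrom are all made up for illustration; they do not come from any production service.

```python
import math

# Hypothetical rows: the same quantity (wavelength) with a different unit per
# row, so the unit is stored as data in its own column.
ROWS = [
    (5500.0, "angstrom"),
    (550.0, "nm"),
    (0.55, "um"),
]

# Assumed conversion factors to angstrom for the units used above.
TO_ANGSTROM = {"angstrom": 1.0, "nm": 10.0, "um": 1.0e4}


def to_angstrom(value, unit):
    """Convert one (value, unit) cell pair to angstrom."""
    return value * TO_ANGSTROM[unit]


wavelengths = [to_angstrom(value, unit) for value, unit in ROWS]
# All three rows describe the same wavelength once the per-row unit is applied.
same = all(math.isclose(w, wavelengths[0]) for w in wavelengths)
print(wavelengths, same)
```

A single unit attribute on the FIELD cannot express this table, which is why the unit ends up being modeled as a column.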
> 
> So, VODML supports all of these real world examples. This has nothing to do with VOTable or any
> other serialization. As a matter of fact, VODML is indeed an effort to make serializations of Data
> Models interoperable.
> 
> The other level is the one of serialization. Since VOTable has a @ucd attribute, it's smart to use
> it, and that's what we do in the serialization document.
> 
>  
> 
>     Personally, I am not sure that duplicating this kind of information is the most appropriate
>     approach: 1) we redo the VO efforts already made on UCDs and units...
> 
> 
> Nope. When you serialize a data model instance in VOTable you use the standard UCDs and Units, and
> the standard VOTable attributes for them (again, section 6.8 in the UTYPEs WD).
>  
> 
>     2) we will have to manage correspondences between FIELD-UCD/FIELD-unit and VO-DML-quantity.
> 
> 
> You already need to do that now, but with VODML and the serialization document there is a standard
> to be implemented, so application developers do not need to "guess" or to assume conventions.
>  
> 
>       And I have to say that the current basic IVOA model appears to me too heterogeneous to be used
>     without fear: "identity, rational, complex, duration, anyURI, boolean, real, nonnegativeInteger,
>     datetime, integer, string, quantity". For a non-DM person, it is quite difficult to understand
>     why one data type or another is considered a basic datatype (duration? datetime? anyURI?),
>     and why others are not (char? range? frequency? ...).
> 
> 
> Where to draw the line is a good question, and the current descriptions have been available for
> comment for about a year, so we are happy we are finally discussing them!
> 
> Notice, however, that a non-DM person shouldn't care: VODML descriptions are meant to be used by
> software developers who need to know how to map the IVOA types to their language, and only DM people
> need to create models, so...
> 
> Primitive types are special in that they need to be defined beforehand so that developers can map
> them to their own "primitive" classes or structures. All other types can be derived from them, and
> that can be done mechanically in any language (we have prototypes and reference implementations in
> Java and Python already, as I showed in Hawaii).
> 
> I believe primitive types should all be domain-independent: frequency is a physics concept, you
> won't find it as a primitive type in MySQL or Java, while datetime is general and can be found in
> both (I am using "primitive" in a broad sense, not in a language-specific sense... e.g. Java doesn't
> have a datetime "primitive", but datetime and duration have corresponding classes in the standard
> Java library).
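
To illustrate what mapping the primitive types to a language could look like, here is a minimal Python sketch. The mapping is my own guess for illustration and is not taken from the VO-DML documents or from the prototypes mentioned in the quoted text; the names PRIMITIVES, coerce and Quantity are hypothetical, and reading a duration as a number of seconds is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from fractions import Fraction

# Assumed mapping from the type names quoted above to Python types.
# The choice of Python type for each name is illustrative, not normative.
PRIMITIVES = {
    "boolean": bool,
    "integer": int,
    "nonnegativeInteger": int,
    "real": float,
    "rational": Fraction,
    "complex": complex,
    "string": str,
    "anyURI": str,
    "datetime": datetime,
    "duration": timedelta,
}


def coerce(type_name, text):
    """Turn a serialized value into the Python type assumed for type_name."""
    if type_name == "boolean":
        return text.strip().lower() in ("true", "1")
    if type_name == "datetime":
        return datetime.fromisoformat(text)
    if type_name == "duration":
        return timedelta(seconds=float(text))  # assumption: value given in seconds
    value = PRIMITIVES[type_name](text)
    if type_name == "nonnegativeInteger" and value < 0:
        raise ValueError("nonnegativeInteger must be >= 0")
    return value


@dataclass
class Quantity:
    """A derived type built from two primitives: a real value and a string unit."""
    value: float
    unit: str


wavelength = Quantity(coerce("real", "5500"), coerce("string", "angstrom"))
print(wavelength, coerce("rational", "3/4"), coerce("datetime", "2014-05-06T09:04:29"))
```

Only the primitives need hand-written handling here; a derived type such as Quantity is assembled from them, which is the part that could be generated mechanically.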
> 
> Also, they should map at least to the VOTable concepts.
> 
> Of course, this is all to some extent arbitrary and fuzzy. For instance, you mention char and
> duration: the first one would be good to include because it maps directly to a VOTable datatype. The
> second one is really on the fuzzy edge. I think it makes sense to include it among the primitive
> types, but I wouldn't be against leaving it out of the list.
> 
> Thanks for the feedback!
> 
> Omar.
> 
> Reference 1. UTYPEs WD
> http://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/UTYPEs-WD-v1.0.pdf
> 
> Reference 2. PhotDM REC
> http://www.ivoa.net/documents/PHOTDM/20130928/PR-PhotDM-1.0-20130928.pdf
> 
> Reference 3. PhotDM in VOTable NOTE
> http://wiki.ivoa.net/internal/IVOA/PhotometryDataModel/NOTE-PPDMDesc-0.1-20101202.pdf
> 
> Reference 4. VOTable 1.3 REC
> http://www.ivoa.net/documents/VOTable/20130920/REC-VOTable-1.3-20130920.html
> 
> Reference 5. Spectrum 1.1 REC
> http://www.ivoa.net/documents/SpectrumDM/20111120/REC-SpectrumDM-1.1-20111120.pdf
> 
> Reference 6. UTYPEs: Current Usages NOTE
> http://www.ivoa.net/documents/Notes/UTypesUsage/20130213/NOTE-utypes-usage-1.0-20130213.html
> 
> 
> -- 
> Omar Laurino
> Smithsonian Astrophysical Observatory
> Harvard-Smithsonian Center for Astrophysics
> 100 Acorn Park Dr. R-377 MS-81
> 02140 Cambridge, MA
> (617) 495-7227


