[Cube/vo-dml] ivoa datatypes

Laurino, Omar olaurino at cfa.harvard.edu
Tue May 6 10:48:02 PDT 2014


Carlos,

You implemented one of the prototypes I presented in Hawaii, so you have
seen this in action, even though that was a while ago and we were using
prototype model descriptions. I also realize now that we made some
mistakes, which is to be expected for a prototype. Whenever you want, we
can look at those issues and fix them.

In any case, in the prototype you correctly use the pattern that Pierre
suggested, with the UCD, unit, and datatype in the FIELDs.
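
As a rough illustration of that pattern, the sketch below reads the UCD,
unit, and datatype straight from the FIELD elements, the way an existing
VOTable client would. It uses the two FIELDs from your message below; the
function name `field_metadata` is made up for this example, and the
fragment omits the VOTable namespace for brevity.

```python
import xml.etree.ElementTree as ET

# The two FIELDs from the quoted message, wrapped in a bare TABLE element.
# Assumption: no XML namespace, so plain tag names can be used in lookups.
VOTABLE_FRAGMENT = """
<TABLE>
  <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value" ucd="em.wl"
         unit="angstrom" datatype="double"/>
  <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value"
         ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A" datatype="double"/>
</TABLE>
"""


def field_metadata(xml_text):
    """Return {field name: (ucd, unit, datatype)} read from FIELD elements only."""
    root = ET.fromstring(xml_text)
    return {
        field.get("name"): (field.get("ucd"), field.get("unit"), field.get("datatype"))
        for field in root.iter("FIELD")
    }


metadata = field_metadata(VOTABLE_FRAGMENT)
for name, (ucd, unit, datatype) in metadata.items():
    print(name, ucd, unit, datatype)
```

Nothing here looks at GROUPs or PARAMs: everything a display client needs
is on the FIELD itself.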

Your prototype is also a good example of how the proposal is backward
compatible: your FIELDs have the "old-style" UTYPEs, while the FIELDrefs
have the "new-style" ones, so an existing client can make sense of the
file even if it ignores the VODML GROUPs.
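
A minimal sketch of why this stays backward compatible, under stated
assumptions: the GROUP layout, the `vodml-example:` UTYPE strings, and both
function names are hypothetical placeholders, not taken from the
serialization draft. A legacy client reads only the FIELDs and never looks
inside the GROUP; a GROUP-aware client follows each FIELDref back to its
FIELD to pick up the UCD, unit, and datatype.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: FIELDs keep "old-style" UTYPEs, while the FIELDrefs
# inside a GROUP carry placeholder "new-style" ones (namespace omitted).
VOTABLE_FRAGMENT = """
<TABLE>
  <GROUP utype="vodml-example:SpectralPoint">
    <FIELDref ref="wl" utype="vodml-example:SpectralPoint.wavelength"/>
    <FIELDref ref="fl" utype="vodml-example:SpectralPoint.flux"/>
  </GROUP>
  <FIELD ID="wl" name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value"
         ucd="em.wl" unit="angstrom" datatype="double"/>
  <FIELD ID="fl" name="FLUX" utype="spec:Data.FluxAxis.Value"
         ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A" datatype="double"/>
</TABLE>
"""


def legacy_view(root):
    """What an existing client sees: FIELD attributes only, GROUPs ignored."""
    return {f.get("name"): f.get("utype") for f in root.findall("FIELD")}


def group_view(root):
    """What a GROUP-aware client sees: new-style UTYPE -> referenced FIELD metadata."""
    fields_by_id = {f.get("ID"): f for f in root.findall("FIELD")}
    view = {}
    for ref in root.iter("FIELDref"):
        field = fields_by_id[ref.get("ref")]
        view[ref.get("utype")] = (
            field.get("ucd"), field.get("unit"), field.get("datatype"))
    return view


root = ET.fromstring(VOTABLE_FRAGMENT)
legacy = legacy_view(root)
aware = group_view(root)
print(legacy)
print(aware)
```

The metadata is stored once, on the FIELD; the FIELDref only adds the
mapping to the model.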

Omar.




On Tue, May 6, 2014 at 12:04 PM, Carlos Rodrigo <crb at cab.inta-csic.es> wrote:

> Hi
>
> I have always had a doubt that may be related to this discussion (if I'm
> not misunderstanding everything).
>
> I want to serialize a spectrum in a VOTable.
> I have two fields: wavelength and flux.
>
> <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value" ucd="em.wl"
>        unit="angstrom" datatype="double"/>
> <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value"
>        ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A" datatype="double"/>
>
> The information about the ucd, unit, and name of the spectral and flux
> axes is given there.
>
> But reading the Spectrum DM (at least version 2.0, but I think it was
> similar in the previous one and in other data models), I get the
> impression that I must duplicate this information in a Characterization
> group:
>
> <GROUP name="Characterization">
>   <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
>     <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name"
>            value="FLUX" .../>
>     <PARAM name="FluxAxisUcd" utype="spec:Char.FluxAxis.ucd"
>            value="phot.flux.density;em.wl" .../>
>     <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit"
>            value="erg/cm2/s/A" .../>
>   </GROUP>
>   <GROUP name="Char.SpectralAxis">
>     <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name"
>            value="WAVELENGTH" .../>
>     <PARAM name="SpectralAxisUcd" utype="spec:Char.SpectralAxis.ucd"
>            value="em.wl" .../>
>     <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit"
>            value="angstrom" .../>
>   </GROUP>
> </GROUP>
>
> where I state the name, ucd and unit of the spectral and flux axes again.
>
> Is that really needed? What for? I've always found this odd.
>
> Carlos
>
> On 06/05/14 17:03, Laurino, Omar wrote:
> > Hi Pierre,
> >
> >
> >
> >     May I clarify my position.
> >
> >
> > Your feedback has been valuable in the Tiger Team and is always welcome.
> >
> > =====
> > TL;DR reply (more details follow):
> >
> >     I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to move some
> >     meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD entity towards
> >     the proposed GROUP VO-DML hierarchy extension. I noted that this point would be extremely
> >     annoying for all VOTable clients such as TOPCAT or Aladin for which this metadata information
> >     must stay in the FIELD entities.
> >
> >
> > I am not sure exactly what you are referring to. If it is what Gerard commented on, then yes, this
> > was fixed long ago, after you made this comment.
> >
> > If it is not, I give more information in the second part. In summary, we are trying to
> > standardize the serialization of Data Models also for the reason you mention: allowing clients to
> > know where to look for metadata, which is tricky, to say the least, with the current usages and
> > standards (see the second part of the email for details and examples).
> >
> >     For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic IVOA
> >     model description
> >     (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml), the
> >     "quantity" entry duplicates now the UCD role and unit role.
> >
> >
> > We are not duplicating existing standards; we are defining a standardized way to describe and
> > serialize data models in a machine-readable way. You might be confusing the two levels of the
> > solution, which correspond to two different documents: VODML descriptions of data models, and the
> > serialization of such data models in VOTable. In the second document we use the standardized units
> > and UCDs and the corresponding standard VOTable attributes.
> >
> >
> >
> >     And I have to say that the current basic IVOA model appears for me too heteroclite to be used
> >     without fear: "identity, rational, complex, duration, anyURI, boolean, real, nonnegativeInteger,
> >     datetime, integer, string, quantity". For a no-DM person, it is quite difficult to understand
> >     why such or such data type is considered as a basic datatype (duration ? datetime ? anyURI ?),
> >     and why others are not (char ?, range ? frequency ? ...).
> >
> >
> > Where to draw the line is a good question. The current descriptions have been available for
> > comment for about a year, so we are happy we are finally discussing them!
> >
> > =====
> >
> >
> > More detailed responses below.
> >
> >
> >
> >     I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to move some
> >     meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD entity towards
> >     the proposed GROUP VO-DML hierarchy extension. I noted that this point would be extremely
> >     annoying for all VOTable clients such as TOPCAT or Aladin for which this metadata information
> >     must stay in the FIELD entities.
> >
> >
> >
> > I am not sure whether you refer to the fact that in an early proof-of-concept serialization there
> > were standalone PARAMs for unit and ucd. If that's the case, as Gerard pointed out, this was fixed
> > long ago in response to your feedback, and the result is in section 6.8 of the UTYPEs draft we
> > presented in Heidelberg one year ago, as well as in the actual examples (Reference 1 below).
> >
> > It may also sound like you are worried about FIELDrefs having the UCD metadata as opposed to FIELDs.
> > If that's the case, there are several current standards and production implementations that use UCDs
> > in FIELDrefs. I am not going to elaborate too much on this, since I am not sure whether this is
> > really what you meant, but I will give a couple of references, just in case. The PhotDM, in section
> > C.2 (Reference 2), provides an example of a Cone Search response and uses FIELDrefs (with UCDs).
> > FIELDs are not even mentioned. This is, I believe, taken directly from the note by Sebastien et al.
> > (Reference 3) on how to serialize Photometry Measurements in VOTable. The only examples that make
> > use of FIELDs (sections 4.1 and 4.2) have two sets of (different) UCDs, one for the FIELDs and one
> > for the FIELDrefs. The other examples do not mention FIELDs.
> >
> >
> > In any case, whether you meant the first or the second interpretation, the more general problem
> > is that the current standards make it hard for clients to make sense of the metadata, and this is
> > one of the reasons why we are trying to standardize the serialization of data models: to make
> > clients' lives easier.
> >
> > As far as I know this only applies to UCDs and UTYPEs, because FIELDrefs can only have these
> > attributes (Reference 4, Section 7.2).
> >
> > Some models (e.g. Spectrum 1.1, Reference 5) reify UCDs by creating UCD fields in the model
> > (thus creating many *.ucd UTYPEs). For instance, see the VOTable example in section 8.2 (I'm
> > including a snippet for convenience):
> >
> >     <PARAM ID="DataFluxUcd" datatype="char" name="DataFluxUcd"
> >            utype="spec:Spectrum.Data.FluxAxis.Ucd"
> >            value="phot.flux.density;em.wl" arraysize="*">
> >       <DESCRIPTION>UCD for flux</DESCRIPTION>
> >     </PARAM>
> >
> >
> > Notice that, as opposed to Gerard's 2012 proof of concept, this is stated in a *standard* document.
> >
> > The status quo is that a client parsing a *standard* Spectrum 1.1 VOTable (I am using the example
> > above, but there may be other examples in other models) can find a UCD in many different places:
> >   - a FIELDref with @utype spec:Spectrum.Data.FluxAxis
> >   - a FIELD referenced by a FIELDref and without a @utype
> >   - a FIELD with @utype spec:Spectrum.Data.FluxAxis
> >   - a PARAM with @utype spec:Spectrum.Data.FluxAxis.Ucd
> >   - a TD relative to a FIELD with @utype spec:Spectrum.Data.FluxAxis.Ucd
> >
> > This is what we are trying to standardize, so that it is clear to clients how to look for metadata
> > in an unambiguous way. Even better, with a standard like the one suggested by the Tiger Team,
> > parsing a VOTable according to a data model becomes a mechanical effort, so that users and
> > developers can use libraries, which is currently impossible (if you are not convinced by the
> > example above, see the Current Usages document, Reference 6).
> >
> >
> >
> >     For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic IVOA
> >     model description
> >     (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml), the
> >     "quantity" entry duplicates now the UCD role and unit role.
> >
> >
> > I believe you are confusing two levels, which are represented by two documents. One level is the
> > data model description. Data Models can (in fact they do, see Spectrum 1.1) define ucd and unit as
> > fields of their models, reifying them. Even when they don't, there are some cases (I can provide
> > examples from production services) where the data publisher needs to reify some of the metadata. For
> > instance, consider a column where the same quantity is expressed in different units: in this case the
> > unit piece of metadata becomes data and you need a column to store it.
> >
> > So, VODML supports all of these real-world examples. This has nothing to do with VOTable or any
> > other serialization. As a matter of fact, VODML is indeed an effort to make serializations of Data
> > Models interoperable.
> >
> > The other level is that of serialization. Since VOTable has a @ucd attribute, it's smart to use
> > it, and that's what we do in the serialization document.
> >
> >
> >
> >     Personally, I am not sure that this solution to duplicate this kind of information will be the
> >     more appropriate approach: 1) we redo our VO efforts already done on UCDs and units...
> >
> >
> > Nope. When you serialize a data model instance in VOTable you use the standard UCDs and Units, and
> > the standard VOTable attributes for them (again, section 6.8 in the UTYPEs WD).
> >
> >
> >     2) we will have to manage correspondences between FIELD-UCD/FIELD-unit and VO-DML-quantity.
> >
> >
> > You already need to do that now, but with VODML and the serialization document there is a standard
> > to be implemented: application developers do not need to "guess" or to assume conventions.
> >
> >
> >     And I have to say that the current basic IVOA model appears for me too heteroclite to be used
> >     without fear: "identity, rational, complex, duration, anyURI, boolean, real, nonnegativeInteger,
> >     datetime, integer, string, quantity". For a no-DM person, it is quite difficult to understand
> >     why such or such data type is considered as a basic datatype (duration ? datetime ? anyURI ?),
> >     and why others are not (char ?, range ? frequency ? ...).
> >
> >
> > Where to draw the line is a good question. The current descriptions have been available for
> > comment for about a year, so we are happy we are finally discussing them!
> >
> > Notice, however, that a no-DM person shouldn't care: VODML descriptions are meant to be used by
> > software developers who need to know how to map the IVOA types to their language, and only DM people
> > need to create models, so...
> >
> > Primitive types are special in that they need to be defined beforehand so that developers can map
> > them to their own "primitive" classes or structures. All other types can be derived from them, and
> > that can be done mechanically in any language (we have prototypes and reference implementations in
> > Java and Python already, as I showed in Hawaii).
> >
> > I believe primitive types should all be domain-independent: frequency is a physics concept, so you
> > won't find it as a primitive type in MySQL or Java, while datetime is general and can be found in
> > both (I am using "primitive" in a broad sense, not in a language-specific sense: e.g. Java doesn't
> > have a datetime "primitive", but datetime and duration have corresponding classes in the standard
> > Java library).
> >
> > Also, they should map at least to the VOTable concepts.
> >
> > Of course, this is all to some extent arbitrary and fuzzy. For instance, you mention char and
> > duration: the first one would be good to include because it maps directly to a VOTable datatype. The
> > second one is really on the fuzzy edge. I think it makes sense to include it among the primitive
> > types, but I wouldn't be against leaving it out of the list.
> >
> > Thanks for the feedback!
> >
> > Omar.
> >
> > Reference 1. UTYPEs WD
> > http://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/UTYPEs-WD-v1.0.pdf
> >
> > Reference 2. PhotDM REC
> > http://www.ivoa.net/documents/PHOTDM/20130928/PR-PhotDM-1.0-20130928.pdf
> >
> > Reference 3. PhotDM in VOTable NOTE
> > http://wiki.ivoa.net/internal/IVOA/PhotometryDataModel/NOTE-PPDMDesc-0.1-20101202.pdf
> >
> > Reference 4. VOTable 1.3 REC
> > http://www.ivoa.net/documents/VOTable/20130920/REC-VOTable-1.3-20130920.html
> >
> > Reference 5. Spectrum 1.1 REC
> > http://www.ivoa.net/documents/SpectrumDM/20111120/REC-SpectrumDM-1.1-20111120.pdf
> >
> > Reference 6. UTYPEs: Current Usages NOTE
> > http://www.ivoa.net/documents/Notes/UTypesUsage/20130213/NOTE-utypes-usage-1.0-20130213.html
> >
> >
> > --
> > Omar Laurino
> > Smithsonian Astrophysical Observatory
> > Harvard-Smithsonian Center for Astrophysics
> > 100 Acorn Park Dr. R-377 MS-81
> > 02140 Cambridge, MA
> > (617) 495-7227
>
>


-- 
Omar Laurino
Smithsonian Astrophysical Observatory
Harvard-Smithsonian Center for Astrophysics
100 Acorn Park Dr. R-377 MS-81
02140 Cambridge, MA
(617) 495-7227