[Cube/vo-dml] ivoa datatypes

Laurino, Omar olaurino at cfa.harvard.edu
Tue May 6 12:35:11 PDT 2014


Carlos,


> My question was actually more about how data models are expected to be
> serialized right now, before
> vo-dml. That is, a typical votable serialization of any data model
> (spectrum, photometry, etc).
> There is where I'm still confused.
>

I understand now, sorry for the confusion. The thing is that, right now, how
to serialize a data model depends on the data model. VODML and the VODML
mapping documents are trying to provide a single, consistent,
model-independent framework to make life easier for implementors, so that
your question would have a simple answer that is not "it depends". :)
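
To make the "it depends" concrete, here is a rough Python sketch of what a
client has to do today to find the flux UCD in a Spectrum 1.1 style VOTable,
following the list of possible locations in my earlier message quoted below.
The VOTable fragment, the constant FLUX_UTYPE and the function find_flux_ucds
are made up for illustration (no namespaces, no real service output), and the
TD case is left out. It is only a sketch under those assumptions, not a
reference implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free VOTable fragment, for illustration only.
SAMPLE = """
<TABLE>
  <GROUP>
    <FIELDref ref="flux" utype="spec:Spectrum.Data.FluxAxis"
              ucd="phot.flux.density;em.wl"/>
  </GROUP>
  <PARAM name="DataFluxUcd" datatype="char" arraysize="*"
         utype="spec:Spectrum.Data.FluxAxis.Ucd"
         value="phot.flux.density;em.wl"/>
  <FIELD ID="flux" name="FLUX" datatype="double"
         ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A"/>
</TABLE>
"""

FLUX_UTYPE = "spec:Spectrum.Data.FluxAxis"


def find_flux_ucds(table):
    """Return (location, ucd) pairs for every place a flux UCD shows up."""
    found = []
    fields_by_id = {f.get("ID"): f for f in table.iter("FIELD")}

    for ref in table.iter("FIELDref"):
        if ref.get("utype") != FLUX_UTYPE:
            continue
        # Case 1: the UCD sits on the FIELDref itself.
        if ref.get("ucd"):
            found.append(("FIELDref", ref.get("ucd")))
        # Case 2: the UCD sits on the FIELD that the FIELDref points to.
        target = fields_by_id.get(ref.get("ref"))
        if target is not None and target.get("ucd"):
            found.append(("FIELD via FIELDref", target.get("ucd")))

    # Case 3: a FIELD that carries the utype directly.
    for field in table.iter("FIELD"):
        if field.get("utype") == FLUX_UTYPE and field.get("ucd"):
            found.append(("FIELD", field.get("ucd")))

    # Case 4: the UCD reified as the value of a PARAM.
    for param in table.iter("PARAM"):
        if param.get("utype") == FLUX_UTYPE + ".Ucd":
            found.append(("PARAM", param.get("value")))

    return found


hits = find_flux_ucds(ET.fromstring(SAMPLE))
for location, ucd in hits:
    print(location, ucd)
```

Every branch in that function is a convention the client has to know in
advance for this one model, which is the kind of guessing a single
model-independent mapping is meant to remove.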

I don't think your question was out of place, I just thought you were
referring to the tiger team proposal.

Omar.


>
> But I didn't notice that we were talking about vo-dml so the question may
> be out of place.
>
> Thanks
>
> Carlos
>
> On 06/05/14 19:48, Laurino, Omar wrote:
> > Carlos,
> >
> > you implemented one of the prototypes I presented in Hawaii, so you saw
> > that in action, even though that happened a while ago, and we were using
> > prototype model descriptions. Also, I now realize we made some mistakes,
> > which is to be expected for a prototype. Whenever you want, we can look
> > at those issues and fix them.
> >
> > In any case, in the prototype you correctly use the pattern that Pierre
> > suggested, with the UCD, Unit, and datatype in the FIELDs.
> >
> > Your prototype is also a good example of how the proposal is backward
> > compatible, since your FIELDs have the "old-style" UTYPEs, while the
> > FIELDrefs have the "new-style" ones, so that an existing client can make
> > sense of the file even if it ignores VODML GROUPs.
> >
> > Omar.
> >
> >
> >
> >
> > On Tue, May 6, 2014 at 12:04 PM, Carlos Rodrigo <crb at cab.inta-csic.es>
> > wrote:
> >
> >     Hi
> >
> >     I have always had a doubt that could have something to do with this
> >     discussion (if I'm not understanding everything wrong).
> >
> >     I want to serialize a spectrum in a VOTable.
> >     I have two fields: wavelength and flux.
> >
> >     <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value"
> >            ucd="em.wl" unit="angstrom" datatype="double"/>
> >     <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value"
> >            ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A"
> >            datatype="double"/>
> >
> >     The information about ucd, unit and also name for the Spectral and
> >     Flux axes is given there.
> >
> >     But reading the Spectrum DM (at least version 2.0, but I think that
> >     it was similar in the previous one and in other data models) I get the
> >     impression that I must duplicate this information in a
> >     Characterization group:
> >
> >     <GROUP name="Characterization">
> >      <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
> >       <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name"
> >              value="FLUX" .../>
> >       <PARAM name="FluxAxisUcd" utype="spec:Char.FluxAxis.ucd"
> >              value="phot.flux.density;em.wl" .../>
> >       <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit"
> >              value="erg/cm2/s/A" .../>
> >      </GROUP>
> >      <GROUP name="Char.SpectralAxis">
> >       <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name"
> >              value="WAVELENGTH" .../>
> >       <PARAM name="SpectralAxisUcd" utype="spec:Char.SpectralAxis.ucd"
> >              value="em.wl" .../>
> >       <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit"
> >              value="angstrom" .../>
> >      </GROUP>
> >     </GROUP>
> >
> >     where I say again the name, ucd and unit for the spectral and flux
> >     axis.
> >
> >     Is that really needed? What for? I've always found this odd.
> >
> >     Carlos
> >
> >     On 06/05/14 17:03, Laurino, Omar wrote:
> >     > Hi Pierre,
> >     >
> >     >
> >     >
> >     >     May I clarify my position.
> >     >
> >     >
> >     > Your feedback has been valuable in the Tiger Team and is always welcome.
> >     >
> >     > =====
> >     > TL;DR reply (more details follow):
> >     >
> >     >     I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to
> >     >     move some meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD
> >     >     entity towards the proposed GROUP VO-DML hierarchy extension. I noted that this point
> >     >     would be extremely annoying for all VOTable clients such as TOPCAT or Aladin for which
> >     >     this metadata information must stay in the FIELD entities.
> >     >
> >     >
> >     > I am not sure what exactly you are referring to. If it is what Gerard commented on, yes, this
> >     > was fixed long ago after you made this comment.
> >     >
> >     > If it is not, I am giving more information in the second part, but in summary we are trying to
> >     > standardize the serialization of Data Models also for the reason you mention: allowing clients
> >     > to know where to look for metadata, which is tricky, to say the least, with the current usages
> >     > and standards (see the second part of the email for details and examples).
> >     >
> >     >     For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic
> >     >     IVOA model description
> >     >     (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml),
> >     >     the "quantity" entry now duplicates the UCD role and unit role.
> >     >
> >     >
> >     > We are not duplicating existing standards; we are defining a standardized way to describe and
> >     > serialize data models in a machine-readable way. You might be confusing the two levels of the
> >     > solution, which correspond to two different documents: VODML descriptions of data models, and
> >     > the serialization of such data models in VOTable. In the second document we use the
> >     > standardized units and UCDs and the corresponding VOTable standard attributes.
> >     >
> >     >
> >     >
> >     >     And I have to say that the current basic IVOA model appears to me too heterogeneous to be
> >     >     used without fear: "identity, rational, complex, duration, anyURI, boolean, real,
> >     >     nonnegativeInteger, datetime, integer, string, quantity". For a no-DM person, it is quite
> >     >     difficult to understand why this or that data type is considered as a basic datatype
> >     >     (duration? datetime? anyURI?), and why others are not (char? range? frequency? ...).
> >     >
> >     >
> >     > Where to draw the line is a good question, and the current descriptions have been available
> >     > for comment for about a year, so we are happy we are finally discussing them!
> >     >
> >     > =====
> >     >
> >     >
> >     > More detailed responses below.
> >     >
> >     >
> >     >
> >     >     I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to
> >     >     move some meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD
> >     >     entity towards the proposed GROUP VO-DML hierarchy extension. I noted that this point
> >     >     would be extremely annoying for all VOTable clients such as TOPCAT or Aladin for which
> >     >     this metadata information must stay in the FIELD entities.
> >     >
> >     >
> >     >
> >     > I am not sure whether you refer to the fact that in an early proof-of-concept serialization
> >     > there were standalone PARAMs for unit and ucd. If that's the case, as Gerard pointed out, this
> >     > was fixed long ago in response to your feedback, and the result is in section 6.8 of the
> >     > UTYPEs draft we presented in Heidelberg one year ago, as well as in the actual examples
> >     > (Reference 1 below).
> >     >
> >     > It may also sound like you are worried about FIELDrefs having the UCD metadata as opposed to
> >     > FIELDs. If that's the case, there are several current standards and production implementations
> >     > that use UCDs in FIELDrefs. I am not going to elaborate too much on this, since I am not sure
> >     > whether this is really what you meant, but I will give a couple of references, just in case.
> >     > The PhotDM, in section C.2 (Reference 2), provides an example of a Cone Search response, and
> >     > uses FIELDrefs (with UCDs). FIELDs are not even mentioned. This is, I believe, taken directly
> >     > from the note by Sebastien et al. (Reference 3) on how to serialize Photometry Measurements in
> >     > VOTable. The only examples that make use of FIELDs (sections 4.1 and 4.2) have two sets of
> >     > (different) UCDs, one for the FIELDs and one for the FIELDrefs. The other examples do not
> >     > mention FIELDs.
> >     >
> >     >
> >     > In any case, whether you meant the first or the second interpretation, more generally, the
> >     > problem is that the current standards make it hard for clients to make sense of the metadata,
> >     > and this is one of the reasons why we are trying to standardize the serialization of data
> >     > models: to make clients' life easier.
> >     >
> >     > As far as I know this only applies to UCDs and UTYPEs, because FIELDrefs can only have these
> >     > attributes (Reference 4, Section 7.2).
> >     >
> >     > Some models (e.g. Spectrum 1.1, Reference 5) reify UCDs by creating UCD fields in the model
> >     > (thus creating many *.ucd UTYPEs). For instance, see the VOTable example in section 8.2 (I'm
> >     > including a snippet for convenience):
> >     >
> >     >     <PARAM ID="DataFluxUcd" datatype="char" name="DataFluxUcd"
> >     >            utype="spec:Spectrum.Data.FluxAxis.Ucd"
> >     >            value="phot.flux.density;em.wl" arraysize="*">
> >     >       <DESCRIPTION>UCD for flux</DESCRIPTION>
> >     >     </PARAM>
> >     >
> >     >
> >     > Notice that, as opposed to Gerard's 2012 proof of concept, this is stated in a *standard*
> >     > document.
> >     >
> >     > The status quo is that a client parsing a *standard* Spectrum 1.1 VOTable (I am using the
> >     > example above, but there may be other examples in other models) can find a UCD in many
> >     > different places:
> >     >   - a FIELDref with @utype spec:Spectrum.Data.FluxAxis
> >     >   - a FIELD referenced by a FIELDref and without a @utype
> >     >   - a FIELD with @utype spec:Spectrum.Data.FluxAxis
> >     >   - a PARAM with @utype spec:Spectrum.Data.FluxAxis.Ucd
> >     >   - a TD relative to a FIELD with @utype spec:Spectrum.Data.FluxAxis.Ucd
> >     >
> >     > This is what we are trying to standardize, so that it is clear to clients how to look for
> >     > metadata in an unambiguous way. Even better, with a standard like the one suggested by the
> >     > Tiger Team, parsing a VOTable according to a data model becomes a mechanical effort, so that
> >     > users and developers can use libraries, which is currently impossible (if not convinced by the
> >     > above example, see the Current Usages document, Reference 6).
> >     >
> >     >
> >     >
> >     >     For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic
> >     >     IVOA model description
> >     >     (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml),
> >     >     the "quantity" entry now duplicates the UCD role and unit role.
> >     >
> >     >
> >     > I believe you are confusing two levels, which are represented by two documents. One level is
> >     > the data model description. Data Models can (in fact they do, see Spectrum 1.1) define ucd and
> >     > unit as fields of their models, reifying them. Even when they don't, there are some cases (I
> >     > can provide examples from production services) where the data publisher needs to reify some of
> >     > the metadata. For instance, consider a column where the same quantity is expressed in
> >     > different units: in this case the unit piece of metadata becomes data and you need a column to
> >     > store it.
> >     >
> >     > So, VODML supports all of these real-world examples. This has nothing to do with VOTable or
> >     > any other serialization. As a matter of fact, VODML is indeed an effort to make serializations
> >     > of Data Models interoperable.
> >     >
> >     > The other level is that of serialization. Since VOTable has a @ucd attribute, it's smart to
> >     > use it, and that's what we do in the serialization document.
> >     >
> >     >
> >     >
> >     >     Personally, I am not sure that this solution of duplicating this kind of information will
> >     >     be the most appropriate approach: 1) we redo our VO efforts already done on UCDs and units...
> >     >
> >     >
> >     > Nope. When you serialize a data model instance in VOTable you use the standard UCDs and Units,
> >     > and the standard VOTable attributes for them (again, section 6.8 in the UTYPEs WD).
> >     >
> >     >
> >     >     2) we will have to manage correspondences between FIELD-UCD/FIELD-unit and
> >     >     VO-DML-quantity.
> >     >
> >     >
> >     > You already need to do that now, but with VODML and the serialization document there is a
> >     > standard to be implemented, so application developers do not need to "guess" or to assume
> >     > conventions.
> >     >
> >     >
> >     >     And I have to say that the current basic IVOA model appears to me too heterogeneous to be
> >     >     used without fear: "identity, rational, complex, duration, anyURI, boolean, real,
> >     >     nonnegativeInteger, datetime, integer, string, quantity". For a no-DM person, it is quite
> >     >     difficult to understand why this or that data type is considered as a basic datatype
> >     >     (duration? datetime? anyURI?), and why others are not (char? range? frequency? ...).
> >     >
> >     >
> >     > Where to draw the line is a good question, and the current descriptions have been available
> >     > for comment for about a year, so we are happy we are finally discussing them!
> >     >
> >     > Notice, however, that a no-DM person shouldn't care: VODML descriptions are meant to be used
> >     > by software developers who need to know how to map the IVOA types to their language, and only
> >     > DM people need to create models, so...
> >     >
> >     > Primitive types are special in that they need to be defined beforehand so that developers can
> >     > map them to their own "primitive" classes or structures. All other types can be derived from
> >     > them, and that can be done mechanically in any language (we have prototypes and reference
> >     > implementations in Java and Python already, as I showed in Hawaii).
> >     >
> >     > I believe primitive types should all be domain-independent: frequency is a physics concept,
> >     > you won't find it as a primitive type in MySQL or Java, while datetime is general and can be
> >     > found in both (I am using "primitive" in a broad sense, not in a language-specific sense...
> >     > e.g. Java doesn't have a datetime "primitive", but datetime and duration have corresponding
> >     > classes in the standard Java library).
> >     >
> >     > Also, they should map at least to the VOTable concepts.
> >     >
> >     > Of course, this is all to some extent arbitrary and fuzzy. For instance, you mention char and
> >     > duration: the first one would be good to include because it maps directly to a VOTable
> >     > datatype. The second one is really on the fuzzy edge. I think it makes sense to include it
> >     > among the primitive types, but I wouldn't be against leaving it out of the list.
> >     >
> >     > Thanks for the feedback!
> >     >
> >     > Omar.
> >     >
> >     > Reference 1. UTYPEs WD
> >     > http://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/UTYPEs-WD-v1.0.pdf
> >     >
> >     > Reference 2. PhotDM REC
> >     > http://www.ivoa.net/documents/PHOTDM/20130928/PR-PhotDM-1.0-20130928.pdf
> >     >
> >     > Reference 3. PhotDM in VOTable NOTE
> >     > http://wiki.ivoa.net/internal/IVOA/PhotometryDataModel/NOTE-PPDMDesc-0.1-20101202.pdf
> >     >
> >     > Reference 4. VOTable 1.3 REC
> >     > http://www.ivoa.net/documents/VOTable/20130920/REC-VOTable-1.3-20130920.html
> >     >
> >     > Reference 5. Spectrum 1.1 REC
> >     > http://www.ivoa.net/documents/SpectrumDM/20111120/REC-SpectrumDM-1.1-20111120.pdf
> >     >
> >     > Reference 6. UTYPEs: Current Usages NOTE
> >     > http://www.ivoa.net/documents/Notes/UTypesUsage/20130213/NOTE-utypes-usage-1.0-20130213.html
> >     >
> >     >
> >     > --
> >     > Omar Laurino
> >     > Smithsonian Astrophysical Observatory
> >     > Harvard-Smithsonian Center for Astrophysics
> >     > 100 Acorn Park Dr. R-377 MS-81
> >     > 02140 Cambridge, MA
> >     > (617) 495-7227
> >
> >
> >
> >
> > --
> > Omar Laurino
> > Smithsonian Astrophysical Observatory
> > Harvard-Smithsonian Center for Astrophysics
> > 100 Acorn Park Dr. R-377 MS-81
> > 02140 Cambridge, MA
> > (617) 495-7227
>
>


-- 
Omar Laurino
Smithsonian Astrophysical Observatory
Harvard-Smithsonian Center for Astrophysics
100 Acorn Park Dr. R-377 MS-81
02140 Cambridge, MA
(617) 495-7227