[Cube/vo-dml] ivoa datatypes
Laurino, Omar
olaurino at cfa.harvard.edu
Tue May 6 12:35:11 PDT 2014
Carlos,
> My question was actually more about how data models are expected to be
> serialized right now, before vo-dml. That is, a typical votable
> serialization of any data model (spectrum, photometry, etc).
> That is where I'm still confused.
>
I understand now, sorry for the confusion. The thing is that right now how
to serialize a data model depends on the data model. VODML and the VODML
mapping documents are trying to provide a single, consistent,
model-independent framework to make life easier for implementors, so that
your question would have a simple answer that is not "it depends". :)
I don't think your question was out of place, I just thought you were
referring to the tiger team proposal.
Omar.
>
> But I didn't notice that we were talking about vo-dml so the question may
> be out of place.
>
> Thanks
>
> Carlos
>
> On 06/05/14 19:48, Laurino, Omar wrote:
> > Carlos,
> >
> > you implemented one of the prototypes I presented in Hawaii, so you saw
> > that in action, even though that happened a while ago, and we were using
> > prototype model descriptions. Also, I now realize we made some mistakes,
> > which is to be expected for a prototype. Whenever you want, we can look
> > at those issues and fix them.
> >
> > In any case, in the prototype you correctly use the pattern that Pierre
> > suggested, with the UCD, Unit, and datatype in the FIELDs.
> >
> > Your prototype is also a good example of how the proposal is backward
> > compatible, since your FIELDs have the "old-style" UTYPEs, while the
> > FIELDrefs have the "new-style" ones, so that an existing client can make
> > sense of the file even though it ignores VODML GROUPs.
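
A minimal sketch of that backward compatibility, in Python with only the
standard library. It assumes a namespace-free VOTable fragment, and the
"new-style" utype values are placeholders made up for illustration; they are
not taken from any VODML document. The function names are hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free VOTable fragment. The FIELD keeps the
# old-style utype; the FIELDref carries a placeholder "new-style" one.
VOTABLE = """
<TABLE>
  <GROUP utype="hypothetical:new.Spectrum">
    <FIELDref ref="wl" utype="hypothetical:new.spectralValue"/>
  </GROUP>
  <FIELD ID="wl" name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value"
         ucd="em.wl" unit="angstrom" datatype="double"/>
</TABLE>
"""


def legacy_view(table):
    """What an existing client sees: FIELD attributes only, GROUPs ignored."""
    return {f.get("name"): f.get("utype") for f in table.findall("FIELD")}


def grouped_view(table):
    """What a GROUP-aware client sees: the FIELDref utype, plus the name,
    UCD and unit that still live on the referenced FIELD."""
    fields = {f.get("ID"): f for f in table.findall("FIELD")}
    view = {}
    for ref in table.iter("FIELDref"):
        field = fields[ref.get("ref")]
        view[ref.get("utype")] = (
            field.get("name"), field.get("ucd"), field.get("unit"))
    return view


table = ET.fromstring(VOTABLE)
legacy = legacy_view(table)
grouped = grouped_view(table)
```

Both views read the UCD and unit from the same FIELD, so nothing is
duplicated: the GROUP only adds a second way of naming the column.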
> >
> > Omar.
> >
> >
> >
> >
> > On Tue, May 6, 2014 at 12:04 PM, Carlos Rodrigo <crb at cab.inta-csic.es>
> > wrote:
> >
> > Hi
> >
> > I have always had a doubt that could have something to do with this
> > discussion (if I'm not understanding everything wrong)
> >
> > I want to serialize a spectrum in a votable.
> > I have two fields: wavelength and flux.
> >
> > <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value"
> >        ucd="em.wl" unit="angstrom" datatype="double"/>
> > <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value"
> >        ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A" datatype="double"/>
> >
> > the information about ucd, unit and also name for the Spectral and Flux
> > axis is given there.
> >
> > But reading the Spectrum DM (at least version 2.0, but I think that it
> > was similar in the previous one and in other DataModels) I get the
> > impression that I must duplicate this information in a Characterization
> > group:
> >
> > <GROUP name="Characterization">
> >   <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
> >     <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name"
> >            value="FLUX" .../>
> >     <PARAM name="FluxAxisUcd" utype="spec:Char.FluxAxis.ucd"
> >            value="phot.flux.density;em.wl" .../>
> >     <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit"
> >            value="erg/cm2/s/A" .../>
> >   </GROUP>
> >   <GROUP name="Char.SpectralAxis">
> >     <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name"
> >            value="WAVELENGTH" .../>
> >     <PARAM name="SpectralAxisUcd" utype="spec:Char.SpectralAxis.ucd"
> >            value="em.wl" .../>
> >     <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit"
> >            value="angstrom" .../>
> >   </GROUP>
> > </GROUP>
> >
> > where I say again the name, ucd and unit for the spectral and flux axis.
> >
> > Is that really needed? What for? I've always found this odd.
> >
> > Carlos
> >
> > On 06/05/14 17:03, Laurino, Omar wrote:
> > > Hi Pierre,
> > >
> > >
> > >
> > > > May I clarify my position.
> > >
> > >
> > > Your feedback has been valuable in the Tiger Team and is always welcome.
> > >
> > > =====
> > > TL;DR reply (more details follow):
> > >
> > > > I said one year ago that the VO-DML VOTable serialization proposed by
> > > > Gerard tended to move some meta information such as *UCD*, *unit* or
> > > > *datatype* outside the VOTable FIELD entity towards the proposed GROUP
> > > > VO-DML hierarchy extension. I noted that this point would be extremely
> > > > annoying for all VOTable clients such as TOPCAT or Aladin for which
> > > > this metadata information must stay in the FIELD entities.
> > >
> > >
> > > I am not sure what exactly you are referring to. If it is what Gerard
> > > commented on, yes, this was fixed long ago after you made this comment.
> > >
> > > If it is not, I am giving more information in the second part, but in
> > > summary we are trying to standardize the serialization of Data Models
> > > also for the reason you mention: allowing clients to know where to look
> > > for metadata, which is tricky, to say the least, with the current usages
> > > and standards (see the second part of the email for details and
> > > examples).
> > >
> > > > For bypassing this issue, and if I correctly understand the current
> > > > 2014-05-03 XML basic IVOA model description
> > > > (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml),
> > > > the "quantity" entry duplicates now the UCD role and unit role.
> > >
> > >
> > > We are not duplicating existing standards; we are defining a
> > > standardized way to describe and serialize data models in a
> > > machine-readable way. You might be confusing the two levels of the
> > > solution, which correspond to two different documents: VODML
> > > descriptions of data models, and the serialization of such data models
> > > in VOTable. In the second document we use the standardized units and
> > > ucd and the corresponding VOTable standard attributes.
> > >
> > >
> > >
> > > > And I have to say that the current basic IVOA model appears to me too
> > > > heteroclite to be used without fear: "identity, rational, complex,
> > > > duration, anyURI, boolean, real, nonnegativeInteger, datetime,
> > > > integer, string, quantity". For a non-DM person, it is quite difficult
> > > > to understand why such or such data type is considered as a basic
> > > > datatype (duration? datetime? anyURI?), and why others are not (char?
> > > > range? frequency? ...).
> > >
> > >
> > > Where to draw the line is a good question, and the current descriptions
> > > have been available for comment for about a year, so we are happy we are
> > > finally discussing them!
> > >
> > > =====
> > >
> > >
> > > More detailed responses below.
> > >
> > >
> > >
> > > > I said one year ago that the VO-DML VOTable serialization proposed by
> > > > Gerard tended to move some meta information such as *UCD*, *unit* or
> > > > *datatype* outside the VOTable FIELD entity towards the proposed GROUP
> > > > VO-DML hierarchy extension. I noted that this point would be extremely
> > > > annoying for all VOTable clients such as TOPCAT or Aladin for which
> > > > this metadata information must stay in the FIELD entities.
> > >
> > >
> > >
> > > I am not sure whether you refer to the fact that in an early proof of
> > > concept serialization there were standalone PARAMs for unit and ucd. If
> > > that's the case, as Gerard pointed out this was fixed long ago in
> > > response to your feedback and the result is in section 6.8 of the UTYPEs
> > > draft we presented in Heidelberg one year ago, as well as in the actual
> > > examples (Reference 1 below).
> > >
> > > It may also sound like you are worried about FIELDrefs having the UCD
> > > metadata as opposed to FIELDs. If that's the case, there are several
> > > current standards and production implementations that use UCDs in
> > > FIELDrefs. I am not going to elaborate too much on this, since I am not
> > > sure whether this is really what you meant, but I will give a couple of
> > > references, just in case. The PhotDM, in section C.2 (Reference 2),
> > > provides an example of a Cone Search response, and uses FIELDrefs (with
> > > UCDs). FIELDs are not even mentioned. This is, I believe, taken directly
> > > from the note by Sebastien et al. (Reference 3) on how to serialize
> > > Photometry Measurements in VOTable. The only examples that make use of
> > > FIELDs (sections 4.1 and 4.2) have two sets of (different) UCDs, one for
> > > the FIELDs and one for the FIELDrefs. The other examples do not mention
> > > FIELDs.
> > >
> > >
> > > In any case, whether you meant the first or the second interpretation,
> > > more generally, the problem is that the current standards make it hard
> > > for clients to make sense of the metadata, and this is one of the
> > > reasons why we are trying to standardize the serialization of data
> > > models: to make clients' life easier.
> > >
> > > As far as I know this only applies to UCDs and UTYPEs, because FIELDrefs
> > > can only have these attributes (Reference 4, Section 7.2).
> > >
> > > Some models (e.g. Spectrum 1.1, Reference 5) reify UCDs by creating UCD
> > > fields in the model (thus creating many *.ucd UTYPEs). For instance, see
> > > the VOTable example in section 8.2 (I'm including a snippet for
> > > convenience):
> > >
> > > <PARAM ID="DataFluxUcd" datatype="char" name="DataFluxUcd"
> > >        utype="spec:Spectrum.Data.FluxAxis.Ucd"
> > >        value="phot.flux.density;em.wl" arraysize="*">
> > >   <DESCRIPTION>UCD for flux</DESCRIPTION>
> > > </PARAM>
> > >
> > >
> > > Notice that, as opposed to Gerard's 2012 proof of concept, this is
> > > stated in a *standard* document.
> > >
> > > The status quo is that a client parsing a *standard* Spectrum 1.1
> > > VOTable (I am using the example above, but there may be other examples
> > > in other models) can find a UCD in many different places:
> > > - a FIELDref with @utype spec:Spectrum.Data.FluxAxis
> > > - a FIELD referenced by a FIELDref and without a @utype
> > > - a FIELD with @utype spec:Spectrum.Data.FluxAxis
> > > - a PARAM with @utype spec:Spectrum.Data.FluxAxis.Ucd
> > > - a TD relative to a FIELD with @utype spec:Spectrum.Data.FluxAxis.Ucd
> > >
> > > This is what we are trying to standardize, so that it is clear to
> > > clients how to look for metadata in an unambiguous way. Even better,
> > > with a standard like the one suggested by the Tiger Team, parsing a
> > > VOTable according to a data model becomes a mechanical effort, so that
> > > users and developers can use libraries, which is currently impossible
> > > (if you are not convinced by the above example, see the Current Usages
> > > document, Reference 6).
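
To make the five lookup locations listed above concrete, here is a rough
Python sketch of what a client has to do today. It assumes a namespace-free
VOTable fragment; the function name `find_flux_ucds`, the labels it returns
and the sample table are all made up for illustration and do not come from
any standard or library.

```python
import xml.etree.ElementTree as ET

AXIS = "spec:Spectrum.Data.FluxAxis"
AXIS_UCD = "spec:Spectrum.Data.FluxAxis.Ucd"


def find_flux_ucds(table):
    """Return (where, ucd) pairs for every place a flux UCD turns up."""
    fields = table.findall("FIELD")
    by_id = {f.get("ID"): f for f in fields if f.get("ID")}
    found = []
    for ref in table.iter("FIELDref"):
        if ref.get("utype") != AXIS:
            continue
        if ref.get("ucd"):                       # UCD on the FIELDref itself
            found.append(("FIELDref", ref.get("ucd")))
        target = by_id.get(ref.get("ref"))       # UCD on the referenced FIELD
        if target is not None and not target.get("utype") and target.get("ucd"):
            found.append(("FIELD via FIELDref", target.get("ucd")))
    for index, field in enumerate(fields):
        if field.get("utype") == AXIS and field.get("ucd"):
            found.append(("FIELD", field.get("ucd")))
        if field.get("utype") == AXIS_UCD:       # UCD stored as table data
            for row in table.iter("TR"):
                found.append(("TD", row.findall("TD")[index].text))
    for param in table.iter("PARAM"):            # UCD reified as a PARAM value
        if param.get("utype") == AXIS_UCD:
            found.append(("PARAM", param.get("value")))
    return found


# Illustrative fragment: the same UCD is reachable through three routes.
SAMPLE = """
<TABLE>
  <GROUP>
    <FIELDref ref="flux" utype="spec:Spectrum.Data.FluxAxis"
              ucd="phot.flux.density;em.wl"/>
  </GROUP>
  <PARAM name="DataFluxUcd" datatype="char" arraysize="*"
         utype="spec:Spectrum.Data.FluxAxis.Ucd"
         value="phot.flux.density;em.wl"/>
  <FIELD ID="flux" name="FLUX" datatype="double"
         ucd="phot.flux.density;em.wl"/>
</TABLE>
"""

hits = find_flux_ucds(ET.fromstring(SAMPLE))
places = [where for where, _ in hits]
```

Every branch in that function is a convention a client currently has to
know about in advance, and nothing says what to do when the routes disagree.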
> > >
> > >
> > >
> > > > For bypassing this issue, and if I correctly understand the current
> > > > 2014-05-03 XML basic IVOA model description
> > > > (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml),
> > > > the "quantity" entry duplicates now the UCD role and unit role.
> > >
> > >
> > > I believe you are confusing two levels, which are represented by two
> > > documents. One level is the data model description. Data Models can (in
> > > fact they do, see Spectrum 1.1) define ucd and unit as fields of their
> > > models, reifying them. Even when they don't, there are some cases (I can
> > > provide examples from production services) where the data publisher
> > > needs to reify some of the metadata. For instance consider a column
> > > where the same quantity is expressed in different units: in this case
> > > the unit piece of metadata becomes data and you need a column to store
> > > it.
> > >
> > > So, VODML supports all of these real world examples. This has nothing
> > > to do with VOTable or any other serialization. As a matter of fact,
> > > VODML is indeed an effort to make serializations of Data Models
> > > interoperable.
> > >
> > > The other level is the one of serialization. Since VOTable has a @ucd
> > > attribute, it's smart to use it, and that's what we do in the
> > > serialization document.
> > >
> > >
> > >
> > > > Personally, I am not sure that this solution to duplicate this kind of
> > > > information will be the most appropriate approach: 1) we redo our VO
> > > > efforts already done on UCDs and units...
> > >
> > >
> > > Nope. When you serialize a data model instance in VOTable you use the
> > > standard UCDs and Units, and the standard VOTable attributes for them
> > > (again, section 6.8 in the UTYPEs WD).
> > >
> > >
> > > > 2) we will have to manage correspondences between
> > > > FIELD-UCD/FIELD-unit and VO-DML-quantity.
> > >
> > >
> > > You already need to do that now, but with VODML and the serialization
> > > document there is a standard to implement, so application developers do
> > > not need to "guess" or to assume conventions.
> > >
> > >
> > > > And I have to say that the current basic IVOA model appears to me too
> > > > heteroclite to be used without fear: "identity, rational, complex,
> > > > duration, anyURI, boolean, real, nonnegativeInteger, datetime,
> > > > integer, string, quantity". For a non-DM person, it is quite difficult
> > > > to understand why such or such data type is considered as a basic
> > > > datatype (duration? datetime? anyURI?), and why others are not (char?
> > > > range? frequency? ...).
> > >
> > >
> > > Where to draw the line is a good question, and the current descriptions
> > > have been available for comment for about a year, so we are happy we are
> > > finally discussing them!
> > >
> > > Notice, however, that a non-DM person shouldn't care: VODML descriptions
> > > are meant to be used by software developers who need to know how to map
> > > the IVOA types to their language, and only DM people need to create
> > > models, so...
> > >
> > > Primitive types are special in that they need to be defined beforehand
> > > so that developers can map them to their own "primitive" classes or
> > > structures. All other types can be derived from them, and that can be
> > > done mechanically in any language (we have prototypes and reference
> > > implementations in Java and Python already, as I showed in Hawaii).
> > >
> > > I believe primitive types should all be domain-independent: frequency
> > > is a physics concept, you won't find it as a primitive type in MySQL or
> > > Java, while datetime is general and can be found in both (I am using
> > > "primitive" in a broad sense, not in a language-specific sense... e.g.
> > > Java doesn't have a datetime "primitive", but datetime and duration have
> > > corresponding classes in the standard Java library).
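
As an illustration of what "mapping primitive types to a language" could
look like, here is a small Python sketch. The type names are the ones from
the list quoted in this thread; the Python targets, the `parse_value` helper
and the choice to read a duration as a number of seconds are assumptions
made for this example, not something the VODML documents prescribe.
"identity" and "quantity" are left out because they have no obvious
standard-library counterpart.

```python
from datetime import datetime, timedelta
from fractions import Fraction

# Assumed mapping from IVOA primitive type names to Python types.
IVOA_TO_PYTHON = {
    "boolean": bool,
    "integer": int,
    "nonnegativeInteger": int,   # Python has no unsigned int; checked below
    "real": float,
    "rational": Fraction,
    "complex": complex,
    "string": str,
    "anyURI": str,
    "datetime": datetime,
    "duration": timedelta,
}


def parse_value(ivoa_type, text):
    """Convert a serialized value to the Python type mapped to ivoa_type."""
    if ivoa_type == "boolean":
        return text.strip().lower() == "true"
    if ivoa_type == "datetime":
        return datetime.fromisoformat(text)
    if ivoa_type == "duration":
        # Assumption for this sketch: the duration is given in seconds.
        return timedelta(seconds=float(text))
    value = IVOA_TO_PYTHON[ivoa_type](text)
    if ivoa_type == "nonnegativeInteger" and value < 0:
        raise ValueError("nonnegativeInteger cannot be negative: " + text)
    return value


three_quarters = parse_value("rational", "3/4")
sent_at = parse_value("datetime", "2014-05-06T12:35:11")
```

All of the targets are general-purpose types; a domain concept such as
frequency would have to be built on top of them as a derived type.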
> > >
> > > Also, they should map at least to the VOTable concepts.
> > >
> > > Of course, this is all to some extent arbitrary and fuzzy. For instance,
> > > you mention char and duration: the first one would be good to include
> > > because it maps directly to a VOTable datatype. The second one is really
> > > on the fuzzy edge. I think it makes sense to include it among the
> > > primitive types, but I wouldn't be against leaving it out of the list.
> > >
> > > Thanks for the feedback!
> > >
> > > Omar.
> > >
> > > Reference 1. UTYPEs WD
> > > http://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/UTYPEs-WD-v1.0.pdf
> > >
> > > Reference 2. PhotDM REC
> > > http://www.ivoa.net/documents/PHOTDM/20130928/PR-PhotDM-1.0-20130928.pdf
> > >
> > > Reference 3. PhotDM in VOTable NOTE
> > > http://wiki.ivoa.net/internal/IVOA/PhotometryDataModel/NOTE-PPDMDesc-0.1-20101202.pdf
> > >
> > > Reference 4. VOTable 1.3 REC
> > > http://www.ivoa.net/documents/VOTable/20130920/REC-VOTable-1.3-20130920.html
> > >
> > > Reference 5. Spectrum 1.1 REC
> > > http://www.ivoa.net/documents/SpectrumDM/20111120/REC-SpectrumDM-1.1-20111120.pdf
> > >
> > > Reference 6. UTYPEs: Current Usages NOTE
> > > http://www.ivoa.net/documents/Notes/UTypesUsage/20130213/NOTE-utypes-usage-1.0-20130213.html
> > >
> > >
> > > --
> > > Omar Laurino
> > > Smithsonian Astrophysical Observatory
> > > Harvard-Smithsonian Center for Astrophysics
> > > 100 Acorn Park Dr. R-377 MS-81
> > > 02140 Cambridge, MA
> > > (617) 495-7227
> >
> >
> >
> >
>
>
--
Omar Laurino
Smithsonian Astrophysical Observatory
Harvard-Smithsonian Center for Astrophysics
100 Acorn Park Dr. R-377 MS-81
02140 Cambridge, MA
(617) 495-7227