[Cube/vo-dml] ivoa datatypes
Carlos Rodrigo
crb at cab.inta-csic.es
Tue May 6 11:45:22 PDT 2014
Ah, OK. That is a good example that I should remember hehe :)
My question was actually more about how data models are expected to be serialized right now, before
vo-dml. That is, a typical votable serialization of any data model (spectrum, photometry, etc).
That is where I'm still confused.
But I didn't notice that we were talking about vo-dml so the question may be out of place.
Thanks
Carlos
On 06/05/14 19:48, Laurino, Omar wrote:
> Carlos,
>
> you implemented one of the prototypes I presented in Hawaii, so you saw that in action, even though
> that happened a while ago, and we were using prototype model descriptions. Also, I now realize we
> made some mistakes, which is to be expected for a prototype. Whenever you want, we can look at those
> issues and fix them.
>
> In any case, in the prototype you correctly use the pattern that Pierre suggested, with the UCD,
> Unit, and datatype in the FIELDs.
>
> Your prototype is also a good example of how the proposal is backward compatible, since your FIELDs
> have the "old-style" UTYPEs, while the FIELDrefs have the "new-style" ones, so that an existing
> client can make sense of the file even though they ignore VODML GROUPS.
>
> Omar.
>
>
>
>
> On Tue, May 6, 2014 at 12:04 PM, Carlos Rodrigo <crb at cab.inta-csic.es> wrote:
>
> Hi
>
> I have always had a doubt that could have something to do with this discussion (if I'm not
> understanding everything wrong)
>
> I want to serialize a spectrum in a votable.
> I have two fields: wavelength and flux.
>
> <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value" ucd="em.wl" unit="angstrom"
> datatype="double"/>
> <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value" ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A"
> datatype="double"/>
>
> the information about ucd, unit and also name for the Spectral and Flux axis is given there.
>
> But reading the Spectrum DM (at least version 2.0, but I think that it was similar in the previous
> one and in other DataModels) I get the impression that I must duplicate this information in a
> Characterization group:
>
> <GROUP name="Characterization">
> <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
> <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name" value="FLUX" .../>
> <PARAM name="FluxAxisUcd" utype="spec:Char.FluxAxis.ucd" value="phot.flux.density;em.wl" .../>
> <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit" value="erg/cm2/s/A" .../>
> </GROUP>
> <GROUP name="Char.SpectralAxis">
> <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name" value="WAVELENGTH" .../>
> <PARAM name="SpectralAxisUcd" utype="spec:Char.SpectralAxis.ucd" value="em.wl" .../>
> <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit" value="angstrom" .../>
> </GROUP>
> </GROUP>
>
> where I say again the name, ucd and unit for the spectral and flux axis.
>
> Is that really needed? What for? I've always found this odd.
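
The duplication described above can be checked mechanically. The following is a minimal sketch in Python, assuming a simplified copy of the snippet: the attributes elided with "..." in the PARAMs are left out, a utype is added to the Char.SpectralAxis GROUP, and a wrapping TABLE element is added only to make the fragment well-formed. The function name `duplicated_metadata` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the snippet from the message (see assumptions above).
VOTABLE_FRAGMENT = """
<TABLE>
  <FIELD name="WAVELENGTH" utype="spec:Data.SpectralAxis.Value" ucd="em.wl"
         unit="angstrom" datatype="double"/>
  <FIELD name="FLUX" utype="spec:Data.FluxAxis.Value"
         ucd="phot.flux.density;em.wl" unit="erg/cm2/s/A" datatype="double"/>
  <GROUP name="Characterization">
    <GROUP name="Char.FluxAxis" utype="spec:Char.FluxAxis">
      <PARAM name="FluxAxisName" utype="spec:Char.FluxAxis.name" value="FLUX"/>
      <PARAM name="FluxAxisUcd" utype="spec:Char.FluxAxis.ucd"
             value="phot.flux.density;em.wl"/>
      <PARAM name="FluxAxisUnit" utype="spec:Char.FluxAxis.unit" value="erg/cm2/s/A"/>
    </GROUP>
    <GROUP name="Char.SpectralAxis" utype="spec:Char.SpectralAxis">
      <PARAM name="SpectralAxisName" utype="spec:Char.SpectralAxis.name"
             value="WAVELENGTH"/>
      <PARAM name="SpectralAxisUcd" utype="spec:Char.SpectralAxis.ucd" value="em.wl"/>
      <PARAM name="SpectralAxisUnit" utype="spec:Char.SpectralAxis.unit"
             value="angstrom"/>
    </GROUP>
  </GROUP>
</TABLE>
"""


def duplicated_metadata(xml_text):
    """Return (axis, key, value) triples stated both in a FIELD and in a Char PARAM."""
    root = ET.fromstring(xml_text)
    # Index FIELD metadata by axis, e.g. "spec:Data.FluxAxis.Value" -> "FluxAxis".
    fields = {}
    for field in root.iter("FIELD"):
        axis = field.get("utype").split(".")[-2]
        fields[axis] = {
            "name": field.get("name"),
            "ucd": field.get("ucd"),
            "unit": field.get("unit"),
        }
    duplicates = []
    for param in root.iter("PARAM"):
        # e.g. "spec:Char.FluxAxis.ucd" -> axis "FluxAxis", key "ucd"
        _, axis, key = param.get("utype").split(".")
        if fields.get(axis, {}).get(key) == param.get("value"):
            duplicates.append((axis, key, param.get("value")))
    return duplicates


for axis, key, value in duplicated_metadata(VOTABLE_FRAGMENT):
    print(f"{axis}.{key} repeats the FIELD attribute: {value}")
```

With this fragment, every PARAM in the Characterization group repeats a value already carried by one of the two FIELDs.
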
>
> Carlos
>
> On 06/05/14 17:03, Laurino, Omar wrote:
> > Hi Pierre,
> >
> >
> >
> > May I clarify my position.
> >
> >
> > Your feedback has been valuable in the Tiger Team and is always welcome.
> >
> > =====
> > TL;DR reply (more details follow):
> >
> > I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to move some
> > meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD entity towards
> > the proposed GROUP VO-DML hierarchy extension. I noted that this point would be extremely
> > annoying for all VOTable clients such as TOPcat or Aladin for which this metadata information
> > must stay in the FIELD entities.
> >
> >
> > I am not sure what you are exactly referring to. If it is what Gerard commented on, yes, this was
> > fixed long ago after you made this comment.
> >
> > If it is not, I am giving more information in the second part, but in summary we are trying to
> > standardize the serialization of Data Models also for the reason you mention: allowing clients to
> > know where to look for metadata, which is tricky, to say the least, with the current usages and
> > standards (see the second part of the email for details and examples).
> >
> > For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic IVOA
> > model description
> > (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml), the
> > "quantity" entry duplicates now the UCD role and unit role.
> >
> >
> > We are not duplicating existing standards, we are defining a standardized way to describe and
> > serialize data models in a machine-readable way. You might be confusing the two levels of the
> > solution, which correspond to two different documents: VODML descriptions of data models, and the
> > serialization of such data models in VOTable. In the second document we use the standardized units
> > and ucd and the corresponding VOTable standard attributes.
> >
> >
> >
> > And I have to say that the current basic IVOA model appears to me too heteroclite to be used
> > without fear: "identity, rational, complex, duration, anyURI, boolean, real, nonnegativeInteger,
> > datetime, integer, string, quantity". For a no-DM person, it is quite difficult to understand
> > why such or such data type is considered as a basic datatype (duration ? datetime ? anyURI ?),
> > and why others are not (char ?, range ? frequency ? ...).
> >
> >
> > Where to draw the line is a good question, and the current descriptions have been open for
> > comment for about a year, so we are happy we are finally discussing them!
> >
> > =====
> >
> >
> > More detailed responses below.
> >
> >
> >
> > I said one year ago that the VO-DML VOTable serialization proposed by Gerard tended to move some
> > meta information such as *UCD*, *unit* or *datatype* outside the VOTable FIELD entity towards
> > the proposed GROUP VO-DML hierarchy extension. I noted that this point would be extremely
> > annoying for all VOTable clients such as TOPcat or Aladin for which this metadata information
> > must stay in the FIELD entities.
> >
> >
> >
> > I am not sure whether you refer to the fact that in an early proof of concept serialization there
> > were standalone PARAMs for unit and ucd. If that's the case, as Gerard pointed out this was fixed
> > long ago in response to your feedback and the result is in section 6.8 of the UTYPEs draft we
> > presented in Heidelberg one year ago, as well as in the actual examples (Reference 1 below).
> >
> > It may also sound like you are worried about FIELDrefs having the UCD metadata as opposed to FIELDs.
> > If that's the case, there are several current standards and production implementations that use UCDs
> > in FIELDrefs. I am not going to elaborate too much on this, since I am not sure whether this is
> > really what you meant, but I will give a couple of references, just in case. The PhotDM, in section
> > C.2 (Reference 2), provides an example of a Cone Search response and uses FIELDrefs (with UCDs).
> > FIELDs are not even mentioned. This is, I believe, taken directly from the note by Sebastien et al.
> > (Reference 3) on how to serialize Photometry Measurements in VOTable. The only examples that make
> > use of FIELDs (sections 4.1 and 4.2) have two sets of (different) UCDs, one for the FIELDs and one
> > for the FIELDrefs. The other examples do not mention FIELDs.
> >
> >
> > In any case, whether you meant the first or the second interpretation, more generally, the problem
> > is that the current standards make it hard for clients to make sense of the metadata, and this is
> > one of the reasons why we are trying to standardize the serialization of data models: to make
> > clients' life easier.
> >
> > As far as I know this only applies to UCDs and UTYPEs, because FIELDrefs can only have these
> > attributes (Reference 4, Section 7.2).
> >
> > Some models (e.g. Spectrum 1.1, Reference 5) reify UCDs by creating UCD fields in the model
> > (thus creating many *.ucd UTYPEs). For instance, see the VOTable example in section 8.2 (I'm
> > including a snippet for convenience):
> >
> > <PARAM ID="DataFluxUcd" datatype="char" name="DataFluxUcd"
> > utype="spec:Spectrum.Data.FluxAxis.Ucd" value="phot.flux.density;em.wl" arraysize="*">
> > <DESCRIPTION>UCD for flux</DESCRIPTION>
> > </PARAM>
> >
> >
> > Notice that, as opposed to Gerard's 2012 proof of concept, this is stated in a *standard* document.
> >
> > The status quo is that a client parsing a *standard* Spectrum 1.1 VOTable (I am using the example
> > above, but there may be other examples in other models) can find a UCD in many different places:
> > - a FIELDref with @utype spec:Spectrum.Data.FluxAxis
> > - a FIELD referenced by a FIELDref and without a @utype
> > - a FIELD with @utype spec:Spectrum.Data.FluxAxis
> > - a PARAM with @utype spec:Spectrum.Data.FluxAxis.Ucd
> > - a TD relative to a FIELD with @utype spec:Spectrum.Data.FluxAxis.Ucd
> >
> > This is what we are trying to standardize, so that it is clear to clients how to look for metadata
> > in an unambiguous way. Even better, with a standard like the one suggested by the Tiger Team,
> > parsing a VOTable according to a data model becomes a mechanical effort, so that users and
> > developers can use libraries, which is currently impossible (if not convinced by the above example
> > see the Current Usages document, Reference 6).
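
To make the ambiguity listed above concrete, here is a minimal sketch in Python of the lookups a client would need today. It assumes a simplified, namespace-free TABLE element (real VOTables declare an XML namespace); the function name `find_ucd_candidates` and the sample document are hypothetical, and only the utype strings come from the list above.

```python
import xml.etree.ElementTree as ET


def find_ucd_candidates(table, axis_utype):
    """List (location, ucd) pairs for every place a UCD for axis_utype may appear."""
    found = []
    fields = table.findall("FIELD")
    by_id = {f.get("ID"): f for f in fields if f.get("ID")}
    for ref in table.iter("FIELDref"):
        if ref.get("utype") != axis_utype:
            continue
        # 1. UCD carried by the FIELDref itself.
        if ref.get("ucd"):
            found.append(("FIELDref", ref.get("ucd")))
        # 2. UCD on the referenced FIELD, which has no utype of its own.
        target = by_id.get(ref.get("ref"))
        if target is not None and target.get("utype") is None and target.get("ucd"):
            found.append(("FIELD via FIELDref", target.get("ucd")))
    for index, field in enumerate(fields):
        # 3. UCD on a FIELD annotated directly with the axis utype.
        if field.get("utype") == axis_utype and field.get("ucd"):
            found.append(("FIELD", field.get("ucd")))
        # 5. UCD stored as data in a column whose utype is "<axis>.Ucd".
        if field.get("utype") == axis_utype + ".Ucd":
            for row in table.iter("TR"):
                found.append(("TD", row.findall("TD")[index].text))
    # 4. UCD reified as a PARAM value.
    for param in table.iter("PARAM"):
        if param.get("utype") == axis_utype + ".Ucd":
            found.append(("PARAM", param.get("value")))
    return found


# Hypothetical sample: the FIELDref and the FIELD it points to disagree.
SAMPLE = """
<TABLE>
  <GROUP utype="spec:Spectrum.Data">
    <FIELDref ref="flux" utype="spec:Spectrum.Data.FluxAxis"
              ucd="phot.flux.density;em.wl"/>
  </GROUP>
  <PARAM name="DataFluxUcd" datatype="char" arraysize="*"
         utype="spec:Spectrum.Data.FluxAxis.Ucd" value="phot.flux.density;em.wl"/>
  <FIELD ID="flux" name="FLUX" datatype="double" ucd="phot.flux.density"/>
</TABLE>
"""

candidates = find_ucd_candidates(ET.fromstring(SAMPLE), "spec:Spectrum.Data.FluxAxis")
for location, ucd in candidates:
    print(location, ucd)
```

In this sample the client gets three candidates that do not all agree, and nothing in the file says which one takes precedence.
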
> >
> >
> >
> > For bypassing this issue, and if I correctly understand the current 2014-05-03 XML basic IVOA
> > model description
> > (https://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/models/ivoa/IVOA.vo-dml.xml), the
> > "quantity" entry duplicates now the UCD role and unit role.
> >
> >
> > I believe you are confusing two levels, which are represented by two documents. One level is the
> > data model description. Data Models can (in fact they do, see Spectrum 1.1) define ucd and unit as
> > fields of their models, reifying them. Even when they don't, there are some cases (I can provide
> > examples from production services) where the data publisher needs to reify some of the metadata. For
> > instance, consider a column where the same quantity is expressed in different units: in this case the
> > unit piece of metadata becomes data and you need a column to store it.
> >
> > So, VODML supports all of these real world examples. This has nothing to do with VOTable or any
> > other serialization. As a matter of fact, VODML is indeed an effort to make serializations of Data
> > Models interoperable.
> >
> > The other level is the one of serialization. Since VOTable has a @ucd attribute, it's smart to use
> > it, and that's what we do in the serialization document.
> >
> >
> >
> > Personally, I am not sure that this solution to duplicate this kind of information will be the
> > more appropriate approach: 1) we redo our VO efforts already done on UCDs and units...
> >
> >
> > Nope. When you serialize a data model instance in VOTable you use the standard UCDs and Units, and
> > the standard VOTable attributes for them (again, section 6.8 in the UTYPEs WD).
> >
> >
> > 2) we will have to manage correspondences between FIELD-UCD/FIELD-unit and VO-DML-quantity.
> >
> >
> > You already need to do that now, but with VODML and the serialization document there is a standard
> > to implement, so application developers do not need to "guess" or to assume conventions.
> >
> >
> > And I have to say that the current basic IVOA model appears to me too heteroclite to be used
> > without fear: "identity, rational, complex, duration, anyURI, boolean, real, nonnegativeInteger,
> > datetime, integer, string, quantity". For a no-DM person, it is quite difficult to understand
> > why such or such data type is considered as a basic datatype (duration ? datetime ? anyURI ?),
> > and why others are not (char ?, range ? frequency ? ...).
> >
> >
> > Where to draw the line is a good question, and the current descriptions have been open for
> > comment for about a year, so we are happy we are finally discussing them!
> >
> > Notice, however, that a no-DM person shouldn't care: VODML descriptions are meant to be used by
> > software developers who need to know how to map the IVOA types to their language, and only DM people
> > need to create models, so...
> >
> > Primitive types are special in that they need to be defined beforehand so that developers can map
> > them to their own "primitive" classes or structures. All other types can be derived from them, and
> > that can be done mechanically in any language (we have prototypes and reference implementations in
> > Java and Python already, as I showed in Hawaii).
> >
> > I believe primitive types should all be domain-independent: frequency is a physics concept, you
> > won't find it as a primitive type in MySQL or Java, while datetime is general and can be found in
> > both (I am using "primitive" in a broad sense, not in a language-specific sense... e.g. Java doesn't
> > have a datetime "primitive", but datetime and duration have corresponding classes in the standard
> > Java library).
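
As an illustration of what mapping the primitive types to a language might look like, here is a minimal sketch in Python. The mapping is an assumption made for this example and is not taken from the VO-DML documents or from the reference implementations mentioned above. "identity" and "quantity" are left out because the thread does not say how they are structured, and the text form assumed for duration (a number of seconds) is also hypothetical.

```python
import datetime
import fractions

# Hypothetical mapping from the type names listed in the thread to Python types.
IVOA_TO_PYTHON = {
    "boolean": bool,
    "integer": int,
    "nonnegativeInteger": int,  # the >= 0 constraint is checked in to_python
    "real": float,
    "rational": fractions.Fraction,
    "complex": complex,
    "string": str,
    "anyURI": str,
    "datetime": datetime.datetime,
    "duration": datetime.timedelta,
}


def to_python(ivoa_type, text):
    """Convert the text form of a value into the mapped Python type."""
    if ivoa_type == "boolean":
        return text.strip().lower() in ("true", "1")
    if ivoa_type == "datetime":
        return datetime.datetime.fromisoformat(text)
    if ivoa_type == "duration":
        return datetime.timedelta(seconds=float(text))  # assumes seconds
    value = IVOA_TO_PYTHON[ivoa_type](text)
    if ivoa_type == "nonnegativeInteger" and value < 0:
        raise ValueError("nonnegativeInteger must be >= 0")
    return value


print(to_python("rational", "3/4"))
print(to_python("datetime", "2014-05-06T11:45:22"))
```

Every entry maps to a built-in or standard-library type, which is the sense of "domain-independent" used above: no physics concept such as frequency is needed to define any of them.
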
> >
> > Also, they should map at least to the VOTable concepts.
> >
> > Of course, this is all to some extent arbitrary and fuzzy. For instance, you mention char and
> > duration: the first one would be good to include because it maps directly to a VOTable datatype. The
> > second one is really on the fuzzy edge. I think it makes sense to include it among the primitive
> > types, but I wouldn't be against leaving it out of the list.
> >
> > Thanks for the feedback!
> >
> > Omar.
> >
> > Reference 1. UTYPEs WD http://volute.googlecode.com/svn/trunk/projects/dm/vo-dml/doc/UTYPEs-WD-v1.0.pdf
> >
> > Reference 2. PhotDM REC http://www.ivoa.net/documents/PHOTDM/20130928/PR-PhotDM-1.0-20130928.pdf
> >
> > Reference 3. PhotDM in VOTable NOTE http://wiki.ivoa.net/internal/IVOA/PhotometryDataModel/NOTE-PPDMDesc-0.1-20101202.pdf
> >
> > Reference 4. VOTable 1.3 REC http://www.ivoa.net/documents/VOTable/20130920/REC-VOTable-1.3-20130920.html
> >
> > Reference 5. Spectrum 1.1 REC http://www.ivoa.net/documents/SpectrumDM/20111120/REC-SpectrumDM-1.1-20111120.pdf
> >
> > Reference 6. UTYPEs: Current Usages NOTE http://www.ivoa.net/documents/Notes/UTypesUsage/20130213/NOTE-utypes-usage-1.0-20130213.html
> >
> >
> > --
> > Omar Laurino
> > Smithsonian Astrophysical Observatory
> > Harvard-Smithsonian Center for Astrophysics
> > 100 Acorn Park Dr. R-377 MS-81
> > 02140 Cambridge, MA
> > (617) 495-7227