To Omar Re: VO-DML specification document

Jesus Salgado Jesus.Salgado at sciops.esa.int
Mon May 12 03:25:18 PDT 2014


Dear all,

I think we are all doing our best to create the best possible
specifications for our community. It is often difficult, and the effort
is not always well rewarded, but we have to remember that we all have
the same objectives and that nothing is possible without collaboration.

On this particular point, although the positions look quite opposed,
I think we are not so far apart (string utypes and the VO-DML preamble
are not fully incompatible).

We have time allocated in DM II where we will recap the VO-DML related
ongoing specs and the implementations that some people have been
developing. We can concentrate the discussion on the mapping.

Let's use this time together to discuss it face to face as, on many
occasions, emails can be misinterpreted. I am sure we can all reach a
common view.

Cheers,
Jesus


On Fri, 2014-05-09 at 09:21 -0400, Laurino, Omar wrote:
> 
> 
>         I don't think it's a good idea to doubt that I read the
>         documents. I don't think this kind of assumption creates good
>         conditions for a fruitful discussion.
>         
> 
> 
> I said I didn't understand from your email whether you read it or not.
> You said you didn't find it.
> 
> 
> And your email didn't sound very friendly either.
> 
> 
> Notice that by assuming that you didn't read the document I can
> understand why you missed the Frequently Asked Questions in the
> document itself. It contains some replies to your objections, and it
> has been there since before Heidelberg.
>  
>         Life didn't stop at Sao Paulo. There has been extensive
>         controversial discussion at the Heidelberg TCG, the Heidelberg
>         Interop and the Hawaii Interop.
> 
> 
> So why are we still having the same discussion we had at all of those
> Interops that you mention? Why are you suggesting that we go back to a
> solution that is not compatible with the Current Usages document? Why
> are you stating that the publicly collected requirements that the
> Tiger Team tried to meet are "useless"?
> 
> 
> That doesn't sound very constructive either.
>  
> Cheers,
> 
> 
> Omar.
> 
> 
>         My understanding was that the "Mapping of Complex data models"
>         was an open subject after Hawaii.
>         I will answer all the technical points from your mail and from
>         Markus' one, and the issues I see, later by mail or in Madrid.
>         
>         Best regards
>         François   
>         > 
>         > In particular, there are now a number of reference
>         > implementations and prototypes that show that the mapping
>         > document:
>         >  - simplifies the interpretation and interoperability of
>         > serializations and applications. You now have a means to
>         > allow people to publish to the VO without having to read the
>         > VO standard documents!
>         >  - is backward compatible, allowing *all* of the current
>         > usages to be still valid in a transitional period or for
>         > implementing local requirements.
>         >  - meets the requirements it was supposed to meet.
>         > 
>         > To be perfectly sure you grasp the meaning of the second
>         > point: CDS services and Aladin can keep using the UTYPEs
>         > they are using now. The Tiger Team’s specification is
>         > perfectly fine with that. If you want to add one more level
>         > of interoperability, we can provide a point-and-click
>         > graphical user interface that produces the metadata section
>         > you need to add to your service responses. You can make this
>         > extra section a mere pointer to the existing PARAMs and
>         > FIELDs, and you are all set. You mentioned that most of your
>         > tables do not have many annotations, so you can probably
>         > automate the process very easily.
>         > 
>         > Your suggestion to keep the current lack of standardization
>         > for UTYPEs means throwing away two years of work under the
>         > Exec mandate and going back to the pre-Tiger Team state. Why
>         > would the Exec ask a Tiger Team to find a solution if there
>         > was no problem to solve?
>         > 
>         > You also seem to forget that this work was started in order
>         > to overcome some issues in some of the work for the IVOA
>         > science priorities. In particular, there is a general lack
>         > of interoperability when building SEDs, with applications
>         > required to ask the user to “import” data coming from the VO
>         > into VO applications. We have been stuck with these issues
>         > for three years, and the solution itself is stuck because
>         > you don’t seem to take this into account in your objections.
>         > Now SED is not a priority anymore, but this problem is
>         > coming back with Cubes and Time Series.
>         > 
>         > Some of your arguments actually support the Tiger Team’s
>         > proposal of matching the local schemata to a global,
>         > implementation-independent one. As Markus has shown you can
>         > indeed apply this to TAP, and it’s not rocket science. So I
>         > am not going to argue with those points.
>         > 
>         > A more general comment about some of your more general
>         > points:
>         > 
>         > “Simple String Matching”. You claim that with the Tiger
>         > Team's proposal you cannot just compare a single string to
>         > get the information you need.
>         > This is a very old point that we discussed before, but I am
>         > willing to try once more to explain why it’s wrong.
>         > The use of standard libraries is only one of the
>         > counterarguments, but there are plenty of others that we
>         > have made in the past and that you are ignoring in reviving
>         > this point.
>         > 
>         > In particular, you seem to forget that Simple String
>         > Matching is something you cannot do even now with the
>         > current standards and the lack of a standard for UTYPEs.
>         > 
>         > The most trivial example is that you cannot parse a VOTable
>         > by matching one, static string.
>         > 
>         > Also, the PhotDM standard requires GROUPs and FIELDrefs, as
>         > the spec points to a Note by Sebastien Derriere after a
>         > EuroVO ICE meeting in 2010 in which the use of GROUPs and
>         > FIELDrefs is deemed beneficial and necessary. The Tiger
>         > Team’s proposal is basically generalizing and standardizing
>         > that note and the PhotDM serialization strategies so that
>         > all serialization and models are interoperable through a
>         > single specification.
>         > 
>         > Refer to Sebastien’s presentation in Naples 2011 for the
>         > features and the benefits of this approach.
>         > 
>         > Admittedly, in order to generalize his note and PhotDM, and
>         > include other Notes and standards (Markus’ note about STC,
>         > SimDM) under a single framework along with all other models,
>         > we needed to leverage a standard VOTable feature: nested
>         > GROUPs. Technically, this is far from being a revolution,
>         > but the result is powerful and fixes a number of issues we
>         > have been stuck with for years.
>         > 
>         > For a real-case example of this I will use one of the
>         > production implementations at CDS.
>         > Consider this snippet from one of your VizieR production
>         > services:
>         > 
>         > <GROUP ID="gsed" name="_sed" ucd="phot"
>         > utype="spec:PhotometryPoint">
>         >       <DESCRIPTION>The SED group is made of 4 columns: mean
>         > frequency, flux, flux error, and filter
>         > designation</DESCRIPTION>
>         >       <FIELDref ref="sed_freq"
>         > utype="photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"/>
>         >       <FIELDref ref="sed_flux"
>         > utype="spec:PhotometryPoint"/>
>         >       <FIELDref ref="sed_eflux"
>         > utype="spec:PhotometryPointError"/>
>         >       <FIELDref ref="sed_filter"
>         > utype="photdm:PhotometryFilter.identifier"/>
>         >  </GROUP>
>         > […]
>         > <FIELD ID="sed_freq" name="_sed_freq" ucd="em.freq"
>         > unit="GHz" datatype="double" width="10" precision="E6">
>         > <FIELD ID="sed_flux" name="_sed_flux"
>         > ucd="phot.flux.density" unit="Jy" datatype="float" width="9"
>         > precision="E3">
>         > <FIELD ID="sed_eflux" name="_sed_eflux"
>         > ucd="stat.error;phot.flux.density" unit="Jy"
>         > datatype="float" width="8" precision="E2">
>         > <FIELD ID="sed_filter" name="_sed_filter"
>         > ucd="meta.id;instr.filter" unit="" datatype="char"
>         > width="32" arraysize="32*">
>         > 
>         > [I omitted the descriptions for clarity]
>         > 
>         > I will assume that the “spec:” UTYPEs were defined in some
>         > standard. [They are not, as far as I know, but it’s hard to
>         > tell because there are several thousand UTYPEs defined in
>         > many documents and their versions, and there is no mechanism
>         > to know what “spec:” is pointing to. These are all issues
>         > fixed by VO-DML and the mapping strategy we suggested, by the
>         > way.]
>         > 
>         > How can you get to the FIELD with a single string match? You
>         > can’t, you need to match one string, parse the parent
>         > element according to the VOTable spec (more conditional
>         > string matching, if you prefer), find the “ref” attribute,
>         > match one more string and find the FIELD. Yet, you haven’t
>         > accomplished much because you need to know much more
>         > information in order to get to the data. In general, I don't
>         > think this Turing Machine approach to VOTable is useful, and
>         > it's certainly not robust. 
>         > 
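As a rough illustration of the lookup described above (match a utype, inspect the parent structure, follow the "ref" attribute, then find the FIELD), here is a minimal Python sketch using only the standard library. It assumes a trimmed, well-formed copy of the snippet: the FIELD elements are self-closed, the fragment is wrapped in a TABLE element, and no XML namespace is declared (real VOTables usually declare one). The function name fields_for_utype is made up for this example and is not part of any VO library.

```python
import xml.etree.ElementTree as ET

# Assumption: a well-formed, namespace-free copy of the VizieR fragment quoted
# above, with DESCRIPTIONs dropped and FIELD elements self-closed.
SNIPPET = """
<TABLE>
  <GROUP ID="gsed" name="_sed" ucd="phot" utype="spec:PhotometryPoint">
    <FIELDref ref="sed_freq"
      utype="photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"/>
    <FIELDref ref="sed_flux" utype="spec:PhotometryPoint"/>
    <FIELDref ref="sed_eflux" utype="spec:PhotometryPointError"/>
    <FIELDref ref="sed_filter" utype="photdm:PhotometryFilter.identifier"/>
  </GROUP>
  <FIELD ID="sed_freq" name="_sed_freq" ucd="em.freq" unit="GHz"
    datatype="double"/>
  <FIELD ID="sed_flux" name="_sed_flux" ucd="phot.flux.density" unit="Jy"
    datatype="float"/>
  <FIELD ID="sed_eflux" name="_sed_eflux" ucd="stat.error;phot.flux.density"
    unit="Jy" datatype="float"/>
  <FIELD ID="sed_filter" name="_sed_filter" ucd="meta.id;instr.filter"
    unit="" datatype="char" arraysize="32*"/>
</TABLE>
"""


def fields_for_utype(table, utype):
    """Return the FIELD elements pointed to by FIELDrefs carrying `utype`.

    Hypothetical helper: one string match is not enough, so this does
    three steps (index FIELDs, match FIELDrefs, follow each ref).
    """
    # Step 1: index every FIELD by its ID attribute.
    fields_by_id = {f.get("ID"): f for f in table.iter("FIELD")}
    # Step 2: match the utype on FIELDref elements only. The same utype
    # string also appears on the GROUP, so the element type matters.
    refs = [r.get("ref") for r in table.iter("FIELDref")
            if r.get("utype") == utype]
    # Step 3: follow each ref to the FIELD it designates.
    return [fields_by_id[ref] for ref in refs if ref in fields_by_id]


table = ET.fromstring(SNIPPET.strip())
flux_fields = fields_for_utype(table, "spec:PhotometryPoint")
print([f.get("name") for f in flux_fields])  # prints ['_sed_flux']
```

Even in this toy form the code has to know about GROUP, FIELDref and FIELD, which is the point being made: the utype string alone does not lead to the column.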
>         > 
>         > You can also refer to the Current Usages document and find
>         > that applications already need to work around the old UTYPEs,
>         > parse them, make assumptions on them, because the simple
>         > string matching is utterly naive and, in real life, it
>         > just doesn't work.
>         > 
>         > This statement, in particular:
>         >         For many applications the proposed mechanism will
>         >         make the recognition of model attributes associated
>         >         with FIELDS in our table a much more complex and
>         >         heavy process than the current one. Instead of
>         >         simple string matching recognition it will require
>         >         development of a complex hierarchical structure
>         >         which has to be fully created and filled from
>         >         VOTABLE parsing and explored for recognition. The
>         >         objection to my point is that standard libraries
>         >         could do it for the developer, but application
>         >         developers may want to avoid using them and it may
>         >         also not be useful (see c for details)
>         > 
>         > Apart from being wrong, as demonstrated by the
>         > implementations and by the rich literature on the Tiger Team
>         > proposal, since they show that the process is actually
>         > simplified and standardized, this statement questions
>         > VOTable itself.
>         > 
>         > 
>         > In fact, VOTable is "a complex hierarchical structure which
>         > has to be fully created and filled from VOTABLE parsing and
>         > explored for recognition”. If a developer doesn’t want to
>         > use standard libraries they are free to knock themselves out
>         > and reimplement a VOTable parser, to do which they need to
>         > read the specs. The same applies to FITS, to XML, to JSON,
>         > etc. So this applies also to VODML. I don’t see the
>         > problem. 
>         > 
>         > 
>         > 
>         > Cheers,
>         > 
>         > 
>         > Omar.
>         > 
>         > 
>         > 
>         > 
>         > On Wed, May 7, 2014 at 12:38 PM, François Bonnarel
>         > <francois.bonnarel at astro.unistra.fr> wrote:
>         >         Hi all,
>         >         Starting from the last TCG teleconf and from
>         >         that email from Gerard below, there has been a lot
>         >         of discussion around VO-DML these days and there
>         >         are some aspects which give me some concerns:
>         >               *     I think many people are still mixing two
>         >                 aspects which were to be separated into two
>         >                 different drafts according to the
>         >                 conclusions of our controversial discussion
>         >                 held in Hawaii, as they were summarized by
>         >                 Jesus here:
>         >         
>         >         http://wiki.ivoa.net/internal/IVOA/PlenarySessionsSep2013/DM_Closing_Hawaii2013_JSalgado.pdf (slides 6 and 7)
>         >              
>         >         I see that a lot of work has been done on the
>         >         update of the first draft "VO-DML: A Data Modeling
>         >         Language for the VO" but nothing new has been done
>         >         on the second draft "Mapping of Complex Data
>         >         Models". The title of this second draft reflected
>         >         the difficulties that appear when using the VO-DML
>         >         description to map the models into VOTable, making
>         >         an extensive but seriously modified usage of the
>         >         utype attribute. However, I can read several
>         >         sentences which show that for many people nothing
>         >         has changed since the time when the introduction of
>         >         the VO-DML draft was first written.
>         >            Here is a quotation from the abstract of the VO-DML
>         >         document "VO-DML a consistent modelling language for
>         >         IVOA data models":
>         >         > "Arguably the most important use case for VO-DML
>         >         > is the UTYPE specification [2]
>         >         > which uses it to provide a translational semantics
>         >         > for VOTable annotations.
>         >         > These annotations allow one to explicitly describe
>         >         > how instances of types from a
>         >         > data model are stored in the VOTable."
>         >           and the whole introduction still emphasizes that the
>         >         main use case for the VO-DML modelling language is utype
>         >         specification in VOTABLE. It also implies that the
>         >         VO-DML GROUP mechanism is THE (unique) way to do the
>         >         mapping.
>         >         
>         >             Last but not least, I see nothing like a
>         >         "Mapping of Complex Data Models" document in the
>         >         repository.
>         >         
>         >             I think this is not the spirit of the decision
>         >         taken in Hawaii. The title chosen in Hawaii,
>         >         reflecting the discussion held there, was unambiguous
>         >         in not pre-deciding any particular solution.
>         >               *      The mechanism proposed so far to map
>         >                 the models into VOTABLE presents several
>         >                 severe issues which I would like to develop.
>         >              It must be clear to everybody that, apart from
>         >         this, most of the effort put into building a
>         >         consistent modelling language for IVOA looks very
>         >         promising to me. Having a description language with
>         >         an XML serialization makes it possible to share
>         >         diagrams and models built with different modelling
>         >         software and helps to generate interoperable
>         >         documentation and code. This is real progress and
>         >         I appreciate the effort made by Mark (and now
>         >         Arnold) to map various models in the work done by
>         >         Gerard, Omar and others. For me this is the core of
>         >         "VO-DML a consistent modelling language for IVOA
>         >         data models" and this is progress. We probably
>         >         still have things to discuss (the utype attribute
>         >         stuff and the ivoa datatypes among others) but I see
>         >         no objection to going forward along this path
>         >         towards recommendation.
>         >         
>         >               *       So let's talk about "Mapping of
>         >                 Complex data models"
>         >               I see three severe issues in adopting the
>         >         mapping of VO-DML structures to VOTABLE
>         >         
>         >              1.        For many applications the proposed
>         >                 mechanism will make the recognition of model
>         >                 attributes associated with FIELDS in our
>         >                 table a much more complex and heavy process
>         >                 than the current one. Instead of simple
>         >                 string matching recognition it will require
>         >                 development of a complex hierarchical
>         >                 structure which has to be fully created and
>         >                 filled from VOTABLE parsing and explored for
>         >                 recognition. The objection to my point is
>         >                 that standard libraries could do it for the
>         >                 developer, but application developers may
>         >                 want to avoid using them and it may also
>         >                 not be useful (see c for details)
>         >              2. For probably more than 90% of tables
>         >                 exchanged in the VO the application of this
>         >                 mapping seems to be simply impossible (or at
>         >                 least awkward).
>         >                       *  A large majority of the huge number
>         >                         of columns available in the VO
>         >                         (those of the catalogs) are not
>         >                         associated with a model attribute.
>         >                         Probably many can have one. It has
>         >                         started with PhotDM ones for SED
>         >                         building and a lot can be associated
>         >                         with STC or others. But as we
>         >                         add models, the number of VO-DML
>         >                         GROUPS will increase for very
>         >                         partial matching.
>         >                       * Astronomical catalogs are (or will
>         >                         be) distributed with TAP. TAP
>         >                         provides TABLES where the number of
>         >                         columns is variable, dependent on
>         >                         the actual ADQL query sent. This
>         >                         will imply that either the
>         >                         VO-DML GROUPS are also dependent on
>         >                         the QUERY (and not unique for a
>         >                         service implementing a model) OR
>         >                         (alternatively) that the VO-DML
>         >                         GROUPS contain some empty (or
>         >                         absent) FIELDS.
>         >              3. In many actual and current VO use cases it
>         >                 is more or less useless.
>         >                 
>         >                       * Why? It is well accepted that
>         >                         IVOA data models are not (in general)
>         >                         the internal data models of servers and
>         >                         archives. They are models for the
>         >                         interaction of the archive with the
>         >                         outside world.
>         >                       * What is the situation for
>         >                         applications (desktop client
>         >                         applications I mean)? I think the
>         >                         assumption of the VO-DML to VOTABLE
>         >                         mapping is that applications will
>         >                         contain a full implementation of the
>         >                         IVOA data model (this can probably
>         >                         be done by preparing the IVOA model
>         >                         classes when creating/modifying the
>         >                         application code or by creating them
>         >                         dynamically when reading the
>         >                         VOTABLE). Then the parsing of the
>         >                         VOTABLE makes it possible to populate
>         >                         the objects with values contained in
>         >                         the columns and rows of the VOTABLE.
>         >                         So everything in the IVOA model is
>         >                         mapped one to one to the application
>         >                         model.
>         >                       * I don't say that this cannot be used
>         >                         and is not useful in some cases. I
>         >                         say it's not useful in general,
>         >                         because, as already discussed in the
>         >                         past, applications use model
>         >                         attributes (known through
>         >                         current-style utypes) as roles to
>         >                         know what to do with the content of
>         >                         the column.
>         >                       * Let me now try to formalize this a
>         >                         little bit. Let's call it the "current
>         >                         life model mapping mechanism". I am
>         >                         proposing a formalization (that
>         >                         means everything SEEMS to work as
>         >                         if it was built like this, although I
>         >                         know it is actually NOT TRUE and is
>         >                         more dispersed in the code).
>         >                             - The application has its own model
>         >                 (its classes and methods, let's say in
>         >                 Java).
>         >                             - There is the IVOA model
>         >                 described somewhere. Each attribute and
>         >                 class has its fully qualified name
>         >                 (a.b.c.d ...). This is the current-style
>         >                 utype.
>         >                              - The application implements a
>         >                 VOTABLE parser. When it runs and reads a
>         >                 given VOTABLE, the current-style utypes are
>         >                 recognized. An action is driven by this
>         >                 recognition, which is basically to either
>         >                 populate the objects of the application
>         >                 model with values taken from the VOTABLE
>         >                 cells or to launch application model methods
>         >                 implied by the occurrence of this IVOA DM
>         >                 attribute. This is also a kind of mapping but
>         >                 it is not one to one and may be incomplete.
>         >                 Several (incomplete) IVOA data models can be
>         >                 used in the same VOTABLE document.
>         >                 I think all the current VO applications work
>         >                 like this and I assume that in the future a
>         >                 majority of application developers would
>         >                 like to work like this again. For example,
>         >                 that would be the case for developers
>         >                 maintaining existing software and eager to
>         >                 connect their application nearly "as is" to
>         >                 the VO world. I don't say that nobody would
>         >                 like to use direct implementations of IVOA
>         >                 data models in their applications, but I claim
>         >                 that not all of them will want to do it or
>         >                 need to do it.
>         >             ---------------------------------------------
>         >         
>         >         By way of conclusion to this mail, I would say
>         >         I wrote all this to (re)open the discussion on the
>         >         second slot of Jesus' Hawaii summary: "Mapping
>         >         of Complex data models"
>         >             I see an alternative to the mapping mechanism
>         >         proposed so far for VO-DML:
>         >               * Let's keep the utypes as Java-like fully
>         >                 qualified names (a.b.c.d), more or less as
>         >                 they are now.
>         >               * This allows existing and future applications
>         >                 to work according to my little "current
>         >                 life model mapping mechanism".
>         >               * This doesn't forbid populating an
>         >                 application model structure exactly mapping
>         >                 the IVOA model. The fully qualified name can
>         >                 be decomposed to find out which class
>         >                 and which member the FIELD is associated with.
>         >                 So WE POINT FROM the FIELDS to the model and
>         >                 NOT the REVERSE way.
>         >               * Transporting the structure of the model in
>         >                 the VOTABLE will not be forbidden (mechanism
>         >                 probably rather similar to the one proposed
>         >                 so far, utype usage apart), but will not be
>         >                 mandatory and in many cases not useful or
>         >                 not possible (e.g. TAP).
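The decomposition of a fully qualified name mentioned in the list above could look roughly like the following Python sketch. It assumes utypes of the form "prefix:a.b.c.d", where the last dotted component is the member and the preceding ones are the enclosing classes; this convention, the function names split_utype and populate, and the sample value are illustrative assumptions, not something defined by an IVOA document.

```python
def split_utype(utype):
    """Split 'prefix:a.b.c.d' into (prefix, [classes...], member).

    Assumption: the last dotted component names the member and the ones
    before it name the enclosing classes. A missing prefix yields ''.
    """
    prefix, sep, path = utype.partition(":")
    if not sep:
        # No "prefix:" part at all, e.g. a bare "a.b.c.d".
        prefix, path = "", utype
    *classes, member = path.split(".")
    return prefix, classes, member


def populate(model, utype, value):
    """Store `value` in a nested dict laid out like the model hierarchy.

    This points FROM the FIELD (via its utype) TO the model structure.
    """
    prefix, classes, member = split_utype(utype)
    node = model.setdefault(prefix, {})
    for name in classes:
        node = node.setdefault(name, {})
    node[member] = value
    return model


# Usage with one utype from the snippet quoted earlier in the thread; the
# value "some-filter" is a placeholder, not real catalogue data.
model = populate({}, "photdm:PhotometryFilter.identifier", "some-filter")
print(model)
```

A nested dict stands in here for whatever classes an application would really use; the point is only that the dotted name carries enough structure to rebuild the hierarchy when an application wants it.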
>         >         Best regards
>         >         François
>         >         On 24/04/2014 14:31, Gerard Lemson wrote:
>         >         > Dear data modelers
>         >         > 
>         >         > After urging from the DM chairs, I would like to direct your attention to
>         >         > the VO-DML page on the IVOA wiki:
>         >         > http://wiki.ivoa.net/twiki/bin/view/IVOA/VODML
>         >         > 
>         >         > There you will find links to the VO-DML specification document and the
>         >         > associated technical xml schema  and schematron files.
>         >         > Notice that the document is one of the three documents that came from the
>         >         > UTYPEs Tiger Team. This one, in particular, describes how to express Data
>         >         > Models in a standard, machine-readable way.
>         >         > Please read the comment at the start of the spec file for info on which parts
>         >         > are not quite done (mainly some paragraphs in the intro still have to be added).
>         >         > 
>         >         > The core part can be commented on freely.
>         >         > 
>         >         > The wiki page needs some updating with links to, and descriptions of, some
>         >         > reference implementations.
>         >         > Much code has been available since Heidelberg (and before) as part of the
>         >         > prototyping effort mandated after that Interop. Stable code includes a bunch
>         >         > of XSLT scripts for validation, Java code generation, and hypertext
>         >         > documentation generation which includes DM figures and cross-references
>         >         > between elements. For two UML tools (Magic Draw CE 12.1 and Modelio) there
>         >         > are also scripts available to generate VO-DML documents from a properly
>         >         > designed UML representation. Pointers to the code and to the documentation
>         >         > are or will be available on the aforementioned wiki page asap.
>         >         > 
>         >         > Implementations related to the mapping of VO-DML to VOTable, like the
>         >         > "VO-DML Mapper" (http://gavo.mpa-garching.mpg.de/dev/vodml-mapper/) created
>         >         > for helping users and data providers through a point-and-click interface and
>         >         > the photometry service prototype presented in Hawaii are not included, since
>         >         > they target a different document. But the vo-dml mapper in particular shows
>         >         > how one can make use of the machine readable DM documents at runtime and
>         >         > might be seen as another proof of concept implementation also for this
>         >         > specification.
>         >         > 
>         >         > We thank those that, in the past year, have sent comments to the editors
>         >         > directly, but we would urge members to address comments directly to the dm
>         >         > mailing list.
>         >         > 
>         >         > Best regards
>         >         > 
>         >         > Gerard Lemson
>         >         
>         >         
>         > 
>         > 
>         > 
>         > 
>         > -- 
>         > Omar Laurino
>         > Smithsonian Astrophysical Observatory
>         > Harvard-Smithsonian Center for Astrophysics 
>         > 100 Acorn Park Dr. R-377 MS-81 
>         > 02140 Cambridge, MA 
>         > (617) 495-7227
>         
>         
> 
> 
> 
> 
> -- 
> Omar Laurino
> Smithsonian Astrophysical Observatory
> Harvard-Smithsonian Center for Astrophysics
> 100 Acorn Park Dr. R-377 MS-81
> 02140 Cambridge, MA
> (617) 495-7227
-- 
Jesus J. SALGADO
ESA Science Archives and VO Team
ISDEFE for ESA - European Space Agency

Science Operations Department (SRE-O)
Science Archives and Computer Engineering Unit (SRE-OE)

ESAC, European Space Astronomy Centre
P.O. Box 78
E-28691 Villanueva de la Cañada, Madrid, Spain
Jesus.Salgado at sciops.esa.int | www.esa.int
Tel. +34 918 131 271         | Fax  +34 918 131 218

