To Omar Re: VO-DML specification document

François Bonnarel francois.bonnarel at astro.unistra.fr
Tue May 13 10:14:53 PDT 2014


Hi all,
Hi Jesus
    I am ready to discuss some of my concerns in Madrid.
This is the topic I chose to illustrate them (as DAL vice-chair I have
some interest in the subject):
"Mapping of ImageDM in SIAV2 metadata resource response: the case of
WCS information / issues and discussion"

Cheers
François

On 12/05/2014 12:25, Jesus Salgado wrote:
> Dear all,
>
> I think we are all trying to do our best to create the best
> specifications possible for our community. It is many times difficult
> and not always the effort is well rewarded but we have to remember that
> we all have the same objectives and nothing is possible without
> collaboration.
>
> In this particular point and, although the positions look to be quite
> opposed, I think we are not so far (string utypes and VO-DML preamble
> are not fully incompatible).
>
> We have time allocated in DM II where we will recap the VO-DML related
> ongoing specs and the implementations that some people have been
> developing. We can concentrate the discussion on the mapping.
>
> Let's use this time all together to discuss it face by face as, in many
> occasions, emails can be misinterpreted. I am sure we can all get a
> common view.
>
> Cheers,
> Jesus
>
>
> On Fri, 2014-05-09 at 09:21 -0400, Laurino, Omar wrote:
>>
>>          I don't think it's a good idea to doubt that I read the
>>          documents. I don't think this kind of assumption creates good
>>          conditions for a fruitful discussion.
>>          
>>
>>
>> I said I didn't understand from your email whether you read it or not.
>> You said you didn't find it.
>>
>>
>> And your email didn't sound very friendly either.
>>
>>
>> Notice that by assuming that you didn't read the document I can
>> understand why you missed the Frequently Asked Questions in the
>> document itself. It contains some replies to your objections, and it
>> has been there since before Heidelberg.
>>   
>>          Life didn't stop at Sao Paulo. There has been extensive
>>          controversial discussion at the Heidelberg TCG, the Heidelberg
>>          Interop and the Hawaii Interop.
>>
>>
>> So why are we still having the same discussion we had at all of those
>> Interops that you mention? Why are you suggesting to go back to a
>> solution that is not compatible with the current usages document? Why
>> are you stating that the publicly collected requirements that the
>> Tiger Team tried to meet are "useless"?
>>
>>
>> That doesn't sound very constructive either.
>>   
>> Cheers,
>>
>>
>> Omar.
>>
>>
>>          My understanding was that the "Mapping of Complex data models"
>>          was an open subject after Hawaii.
>>          I will answer all the technical points from your mail and from
>>          Markus', and the issues I see, later by mail or in Madrid.
>>          
>>          Best regards
>>          François
>>          >
>>          > In particular, there are now a number of reference
>>          > implementations and prototypes that show that the mapping
>>          > document:
>>          >  - simplifies the interpretation and interoperability of
>>          > serializations and applications. You now have a means to
>>          > allow people to publish to the VO without having to read the
>>          > VO standard documents!
>>          >  - is backward compatible, allowing *all* of the current
>>          > usages to be still valid in a transitional period or for
>>          > implementing local requirements.
>>          >  - meets the requirements it was supposed to meet.
>>          >
>>          > To be perfectly sure you grasp the meaning of the second
>>          > point: CDS services and Aladin can keep using the UTYPEs
>>          > they are using now. The Tiger Team’s specification is
>>          > perfectly fine with that. If you want to add one more level
>>          > of interoperability, we can provide a point-and-click
>>          > graphical user interface that produces the metadata section
>>          > you need to add to your service responses. You can make this
>>          > extra section a mere pointer to the existing PARAMs and
>>          > FIELDs, and you are all set. You mentioned that most of your
>>          > tables do not have many annotations, so you can probably
>>          > automate the process very easily.
>>          >
>>          > Your suggestion to keep the current lack of standardization
>>          > for UTYPEs means throwing away two years of work under the
>>          > Exec mandate and going back to the pre-Tiger Team state. Why
>>          > would the Exec ask a Tiger Team to find a solution if there
>>          > was no problem to solve?
>>          >
>>          > You also seem to forget that this work was started in order
>>          > to overcome some issues in some of the work for the IVOA
>>          > science priorities. In particular, there is a general lack
>>          > of interoperability when building SEDs, with applications
>>          > required to ask the user to “import” data coming from the VO
>>          > into VO applications. We have been stuck with these issues
>>          > for three years, and the solution itself is stuck because
>>          > you don’t seem to take this into account in your objections.
>>          > Now SED is not a priority anymore, but this problem is
>>          > coming back with Cubes and Time Series.
>>          >
>>          > Some of your arguments actually support the Tiger Team’s
>>          > proposal of matching the local schemata to a global,
>>          > implementation-independent one. As Markus has shown you can
>>          > indeed apply this to TAP, and it’s not rocket science. So I
>>          > am not going to argue with those points.
>>          >
>>          > A more general comment about some of your more general
>>          > points:
>>          >
>>          > “Simple String Matching”. You claim that with the Tiger
>>          > Team's proposal you cannot just compare a single string to
>>          > get the information you need.
>>          > This is a very old point that we discussed before, but I am
>>          > willing to try once more to explain why it’s wrong.
>>          > The use of standard libraries is only one of the
>>          > counterarguments, but there are plenty of others that we
>>          > have made in the past and that you are ignoring in reviving
>>          > this point.
>>          >
>>          > In particular, you seem to forget that Simple String
>>          > Matching is something you cannot do even now with the
>>          > current standards and the lack of a standard for UTYPEs.
>>          >
>>          > The most trivial example is that you cannot parse a VOTable
>>          > by matching one, static string.
>>          >
>>          > Also, the PhotDM standard requires GROUPs and FIELDrefs, as
>>          > the spec points to a Note by Sebastien Derriere after a
>>          > EuroVO ICE meeting in 2010 in which the use of GROUPs and
>>          > FIELDrefs is deemed beneficial and necessary. The Tiger
>>          > Team’s proposal is basically generalizing and standardizing
>>          > that note and the PhotDM serialization strategies so that
>>          > all serializations and models are interoperable through a
>>          > single specification.
>>          >
>>          > Refer to Sebastien’s presentation in Naples 2011 for the
>>          > features and the benefits of this approach.
>>          >
>>          > Admittedly, in order to generalize his note and PhotDM, and
>>          > include other Notes and standards (Markus’ note about STC,
>>          > SimDM) under a single framework along with all other models,
>>          > we needed to leverage a standard VOTable feature: nested
>>          > GROUPs. Technically, this is far from being a revolution,
>>          > but the result is powerful and fixes a number of issues we
>>          > have been stuck with for years.
>>          >
>>          > For a real-case example of this I will use one of the
>>          > production implementations at CDS.
>>          > Consider this snippet from one of your Vizier production
>>          > services:
>>          >
>>          > <GROUP ID="gsed" name="_sed" ucd="phot"
>>          > utype="spec:PhotometryPoint">
>>          >       <DESCRIPTION>The SED group is made of 4 columns: mean
>>          > frequency, flux, flux error, and filter
>>          > designation</DESCRIPTION>
>>          >       <FIELDref ref="sed_freq"
>>          > utype="photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"/>
>>          >       <FIELDref ref="sed_flux"
>>          > utype="spec:PhotometryPoint"/>
>>          >       <FIELDref ref="sed_eflux"
>>          > utype="spec:PhotometryPointError"/>
>>          >       <FIELDref ref="sed_filter"
>>          > utype="photdm:PhotometryFilter.identifier"/>
>>          >  </GROUP>
>>          > […]
>>          > <FIELD ID="sed_freq" name="_sed_freq" ucd="em.freq"
>>          > unit="GHz" datatype="double" width="10" precision="E6">
>>          > <FIELD ID="sed_flux" name="_sed_flux"
>>          > ucd="phot.flux.density" unit="Jy" datatype="float" width="9"
>>          > precision="E3">
>>          > <FIELD ID="sed_eflux" name="_sed_eflux"
>>          > ucd="stat.error;phot.flux.density" unit="Jy"
>>          > datatype="float" width="8" precision="E2">
>>          > <FIELD ID="sed_filter" name="_sed_filter"
>>          > ucd="meta.id;instr.filter" unit="" datatype="char"
>>          > width="32" arraysize="32*">
>>          >
>>          > [I omitted the descriptions for clarity]
>>          >
>>          > I will assume that the “spec:” UTYPEs were defined in some
>>          > standard. [They are not as far as I know, but it’s hard to
>>          > tell because there are several thousand UTYPEs defined in
>>          > many documents and their versions, and there is no mechanism
>>          > to know what “spec:” is pointing to. These are all issues
>>          > fixed by VODML and the mapping strategy we suggested, by the
>>          > way]
>>          >
>>          > How can you get to the FIELD with a single string match? You
>>          > can’t, you need to match one string, parse the parent
>>          > element according to the VOTable spec (more conditional
>>          > string matching, if you prefer), find the “ref” attribute,
>>          > match one more string and find the FIELD. Yet, you haven’t
>>          > accomplished much because you need to know much more
>>          > information in order to get to the data. In general, I don't
>>          > think this Turing Machine approach to VOTable is useful, and
>>          > it's certainly not robust.
>>          >
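The lookup chain described above (match the GROUP's utype, read each FIELDref's "ref", then find the FIELD with that ID) can be sketched as follows. This is only an illustration under stated assumptions: the XML is a simplified, well-formed cut of the quoted snippet (no namespace, self-closed FIELDs, a wrapping TABLE element), and the function name resolve_fields is hypothetical, not part of any VO library.

```python
import xml.etree.ElementTree as ET

# Simplified version of the quoted snippet (assumption: no XML namespace,
# FIELD elements self-closed, only two of the four columns kept).
VOTABLE = """
<TABLE>
  <GROUP ID="gsed" name="_sed" ucd="phot" utype="spec:PhotometryPoint">
    <FIELDref ref="sed_freq"
      utype="photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"/>
    <FIELDref ref="sed_flux" utype="spec:PhotometryPoint"/>
  </GROUP>
  <FIELD ID="sed_freq" name="_sed_freq" ucd="em.freq" unit="GHz"
         datatype="double"/>
  <FIELD ID="sed_flux" name="_sed_flux" ucd="phot.flux.density" unit="Jy"
         datatype="float"/>
</TABLE>
"""


def resolve_fields(root, group_utype):
    """Return {FIELDref utype: FIELD element} for GROUPs with the given utype."""
    # Step 0: index every FIELD by its ID so FIELDref/@ref can be followed.
    fields_by_id = {f.get("ID"): f for f in root.iter("FIELD")}
    resolved = {}
    for group in root.iter("GROUP"):
        # Step 1: first string match, on the GROUP's utype.
        if group.get("utype") != group_utype:
            continue
        for ref in group.findall("FIELDref"):
            # Step 2: follow the "ref" attribute to the FIELD with that ID.
            resolved[ref.get("utype")] = fields_by_id[ref.get("ref")]
    return resolved


root = ET.fromstring(VOTABLE)
fields = resolve_fields(root, "spec:PhotometryPoint")
flux = fields["spec:PhotometryPoint"]
print(flux.get("name"), flux.get("unit"))  # prints: _sed_flux Jy
```

Even in this reduced form, getting from a utype to a column takes two matches plus an ID lookup, and the data rows themselves are still to be read.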
>>          >
>>          > You can also refer to the Current Usages document and find
>>          > that applications already need to work around the old UTYPEs,
>>          > parse them, and make assumptions about them, because simple
>>          > string matching is utterly naive and, in real life, it
>>          > just doesn't work.
>>          >
>>          > This statement, in particular:
>>          >         For many applications the proposed mechanism will
>>          >         make the recognition of model attributes associated
>>          >         with FIELDS in our table a much more complex and
>>          >         heavy process than the current one. Instead of
>>          >         simple string matching recognition it will require
>>          >         development of a complex hierarchical structure
>>          >         which has to be fully created and filled from
>>          >         VOTABLE parsing and explored for recognition. The
>>          >         objection to my point is that standard libraries
>>          >         could do it for the developer, but application
>>          >         developers may want to avoid using them, and it may
>>          >         also be of little use (see c for details)
>>          >
>>          > Apart from being wrong, as demonstrated by the
>>          > implementations and by the rich literature on the Tiger Team
>>          > proposal, since they show that the process is actually
>>          > simplified and standardized, this statement questions
>>          > VOTable itself.
>>          >
>>          >
>>          > In fact, VOTable is "a complex hierarchical structure which
>>          > has to be fully created and filled from VOTABLE parsing and
>>          > explored for recognition”. If a developer doesn’t want to
>>          > use standard libraries they are free to knock themselves out
>>          > and reimplement a VOTable parser, to do which they need to
>>          > read the specs. The same applies to FITS, to XML, to JSON,
>>          > etc. So this applies also to VODML. I don’t see the
>>          > problem.
>>          >
>>          >
>>          >
>>          > Cheers,
>>          >
>>          >
>>          > Omar.
>>          >
>>          >
>>          >
>>          >
>>          > On Wed, May 7, 2014 at 12:38 PM, François Bonnarel
>>          > <francois.bonnarel at astro.unistra.fr> wrote:
>>          >         Hi all,
>>          >         Since the last TCG teleconf, and since the email from
>>          >         Gerard below, there has been a lot of discussion
>>          >         around VO-DML, and some aspects of it give me
>>          >         concern:
>>          >               *     I think many people are still mixing two
>>          >                 aspects which were to be separated into two
>>          >                 different drafts, according to the
>>          >                 conclusions of our controversial discussion
>>          >                 held in Hawaii as they were summarized by
>>          >                 Jesus here:
>>          >
>>          >         http://wiki.ivoa.net/internal/IVOA/PlenarySessionsSep2013/DM_Closing_Hawaii2013_JSalgado.pdf (slide 6 and 7)
>>          >
>>          >         I see that a lot of work has been done for the
>>          >         update of the first draft "VO-DML: A Data Modeling
>>          >         Language for the VO" but nothing new has been done
>>          >         about the second draft "Mapping of Complex Data
>>          >         Models". The title of this second draft reflected
>>          >         the difficulties that appear when using the VO-DML
>>          >         description to map the models into VOTable, making
>>          >         an extensive but seriously modified use of the
>>          >         utype attribute. However, I can read several
>>          >         sentences which show that for many people nothing
>>          >         has changed since the time when the introduction of
>>          >         the VO-DML draft was first written.
>>          >            Here is a quotation from the abstract of the VO-DML
>>          >         document "VO-DML a consistent modelling language for
>>          >         IVOA data models":
>>          >         > "Arguably the most important use case for VO-DML
>>          >         > is the UTYPE specification [2]
>>          >         > which uses it to provide a translational semantics
>>          >         > for VOTable annotations.
>>          >         > These annotations allow one to explicitly describe
>>          >         > how instances of types from a
>>          >         > data model are stored in the VOTable."
>>          >           and the whole introduction still emphasizes that the
>>          >         main use case for the VO-DML modelling language is utype
>>          >         specification in VOTABLE. It also implies that the
>>          >         VO-DML GROUP mechanism is THE (unique) way to do the
>>          >         mapping.
>>          >
>>          >             Last but not least, I see nothing like a
>>          >         "Mapping of Complex Data Models" document in the
>>          >         repository.
>>          >
>>          >             I think this is not in the spirit of the decision
>>          >         taken in Hawaii. The title chosen in Hawaii,
>>          >         reflecting the discussion held there, was unambiguous
>>          >         in not pre-deciding any particular solution.
>>          >               *      The mechanism proposed so far to map
>>          >                 the models into VOTABLE presents several
>>          >                 severe issues, which I would like to develop.
>>          >              It must be clear to everybody that, apart from
>>          >         this, most of the effort put into building a
>>          >         consistent modelling language for IVOA looks very
>>          >         promising to me. Having a description language with
>>          >         an XML serialization makes it possible to share
>>          >         diagrams and models built with different modelling
>>          >         software, and helps generate interoperable
>>          >         documentation and code. This is real progress, and
>>          >         I appreciate the effort made by Mark (and now
>>          >         Arnold) to map various models, within the work done
>>          >         by Gerard, Omar and others. For me this is the core of
>>          >         "VO-DML a consistent modelling language for IVOA
>>          >         data models" and this is progress. We probably
>>          >         still have things to discuss (the utype attribute
>>          >         stuff and the ivoa datatypes among others) but I see
>>          >         no objection to going forward along this path
>>          >         towards recommendation.
>>          >
>>          >               *       So let's talk about "Mapping of
>>          >                 Complex data models"
>>          >               I see three severe issues in adopting the
>>          >         mapping of VO-DML structures to VOTABLE
>>          >
>>          >              1.        For many applications the proposed
>>          >                 mechanism will make the recognition of model
>>          >                 attributes associated with FIELDS in our
>>          >                 table a much more complex and heavy process
>>          >                 than the current one. Instead of simple
>>          >                 string matching recognition it will require
>>          >                 development of a complex hierarchical
>>          >                 structure which has to be fully created and
>>          >                 filled from VOTABLE parsing and explored for
>>          >                 recognition. The objection to my point is
>>          >                 that standard libraries could do it for the
>>          >                 developer, but application developers may
>>          >                 want to avoid using them, and it may also be
>>          >                 of little use (see c for details)
>>          >              2. For probably more than 90% of tables
>>          >                 exchanged in the VO the application of this
>>          >                 mapping seems to be simply impossible (or at
>>          >                 least awkward).
>>          >                       *  A large majority of the huge number
>>          >                         of columns available in the VO
>>          >                         (those of the catalogs) are not
>>          >                         associated with a model attribute.
>>          >                         Probably many could have one. It has
>>          >                         started with PhotDM ones for SED
>>          >                         building, and a lot can be associated
>>          >                         with STC or others. But as we add
>>          >                         models, the number of VO-DML GROUPS
>>          >                         will increase, each with only a very
>>          >                         partial match.
>>          >                       * Astronomical catalogs are (or will
>>          >                         be) distributed with TAP. TAP
>>          >                         provides TABLES where the number of
>>          >                         columns is variable, depending on
>>          >                         the actual ADQL query sent. This
>>          >                         implies that either the VO-DML
>>          >                         GROUPS also depend on the QUERY
>>          >                         (and are not unique for a service
>>          >                         implementing a model) OR
>>          >                         (alternatively) that the VO-DML
>>          >                         GROUPS contain some empty (or
>>          >                         absent) FIELDS.
>>          >              3. In many actual and current VO use cases  it
>>          >                 is more or less useless.
>>          >
>>          >
>>          >
>>          >                       *   Why? It is well accepted that
>>          >                         IVOA datamodels are not (in general)
>>          >                         the internal datamodels of servers and
>>          >                         archives. They are models for the
>>          >                         interaction of the archive with the
>>          >                         outside world.
>>          >                       * What is the situation for
>>          >                         applications (desktop client
>>          >                         applications I mean)? I think the
>>          >                         assumption of the VO-DML to VOTABLE
>>          >                         mapping is that the application will
>>          >                         contain a full implementation of the
>>          >                         IVOA data model (this can probably
>>          >                         be done by preparing the IVOA model
>>          >                         classes when creating/modifying the
>>          >                         application code or by creating them
>>          >                         dynamically when reading the
>>          >                         VOTABLE). Then parsing the VOTABLE
>>          >                         makes it possible to populate the
>>          >                         objects with values contained in
>>          >                         the columns and rows of the VOTABLE.
>>          >                         So everything in the IVOA model is
>>          >                         mapped one to one to the application
>>          >                         model.
>>          >                       * I don't say that this cannot be used
>>          >                         and is not useful in some cases. I
>>          >                         say it's not useful in general,
>>          >                         because, as already discussed in the
>>          >                         past, applications use model
>>          >                         attributes (known through
>>          >                         current-style utypes) as roles to
>>          >                         know what to do with the content of
>>          >                         the column.
>>          >                       * Let me now try to formalize this a
>>          >                         little bit. Let's call it the "current
>>          >                         life model mapping mechanism". (I am
>>          >                         proposing a formalization: that
>>          >                         means everything SEEMS to work as
>>          >                         if it was built like this, although I
>>          >                         know it is actually NOT TRUE and is
>>          >                         more dispersed in the code.)
>>          >                             - Application has its own model
>>          >                 (its classes and methods, let's say in
>>          >                 java)
>>          >                             - There is the IVOA model
>>          >                 described somewhere. Each attribute and
>>          >                 class has its fully qualified name
>>          >                 (a.b.c.d ...). This is the current-style
>>          >                 utype.
>>          >                              - The application implements a
>>          >                 VOTABLE parser. When it runs and reads a
>>          >                 given VOTABLE, the current-style utypes are
>>          >                 recognized. This recognition drives an
>>          >                 action, which is basically either to
>>          >                 populate the objects of the Application
>>          >                 model with values taken from the VOTABLE
>>          >                 cells or to launch Application model methods
>>          >                 implied by the occurrence of this IVOA DM
>>          >                 attribute. This is also a kind of mapping, but
>>          >                 it is not one to one and may be incomplete.
>>          >                 Several (incomplete) IVOA data models can be
>>          >                 used in the same VOTABLE document.
>>          >                 I think all the current VO applications work
>>          >                 like this and I assume that in the future a
>>          >                 majority of application developers will
>>          >                 want to keep working like this. For example,
>>          >                 that would be the case for developers
>>          >                 maintaining existing software and eager to
>>          >                 connect their application nearly "as is" to
>>          >                 the VO world. I don't say that nobody would
>>          >                 like to use a direct implementation of IVOA
>>          >                 datamodels in their applications, but I claim
>>          >                 that not all of them will want to do it or
>>          >                 need to do it.
>>          >             ---------------------------------------------
>>          >
>>          >         To conclude this mail, I would say
>>          >         I wrote all this to (re)open the discussion on the
>>          >         second slot of Jesus' Hawaii summary: "Mapping
>>          >         of Complex data models"
>>          >             I see an alternative to the mapping mechanism
>>          >         proposed so far for VO-DML:
>>          >               * Let's keep the utypes as Java-like fully
>>          >                 qualified names (a.b.c.d), more or less as
>>          >                 they are now.
>>          >               * This allows existing and future applications
>>          >                 to work according to my little "current
>>          >                 life model mapping mechanism".
>>          >               * This doesn't prevent populating an
>>          >                 application model structure exactly mapping
>>          >                 the IVOA model. The fully qualified name can
>>          >                 be decomposed to find out with which class
>>          >                 and which member the FIELD is associated.
>>          >                 So WE POINT FROM the FIELDS to the model and
>>          >                 NOT the REVERSE way.
>>          >               * Transporting the structure of the model in
>>          >                 the VOTABLE will not be forbidden (the mechanism
>>          >                 would probably be rather similar to the one
>>          >                 proposed so far, utype usage apart), but will
>>          >                 not be mandatory, and in many cases will be
>>          >                 not useful or not possible (e.g. TAP).
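Decomposing a fully qualified utype into its class path and member, as the third bullet suggests, might look like the sketch below. The UtypePath type and the decompose function are hypothetical names, and the convention assumed here (an optional "prefix:" followed by dot-separated segments, the last one being the member) is an illustration rather than anything fixed by a standard.

```python
from typing import NamedTuple, Tuple


class UtypePath(NamedTuple):
    prefix: str                # model prefix before ":", empty if absent
    classes: Tuple[str, ...]   # class path leading to the member
    member: str                # last segment of the qualified name


def decompose(utype: str) -> UtypePath:
    """Split 'prefix:a.b.c.d' into prefix, class path (a, b, c) and member d."""
    prefix, _, path = utype.rpartition(":")
    parts = path.split(".")
    return UtypePath(prefix, tuple(parts[:-1]), parts[-1])


print(decompose("photdm:PhotometryFilter.identifier"))
print(decompose("a.b.c.d"))
```

With this, the pointer goes from the FIELD to the model: an application that wants the full structure can rebuild it from the class path, and one that only needs the role can keep comparing the whole string.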
>>          >         Best regards
>>          >         François
>>          >         On 24/04/2014 14:31, Gerard Lemson wrote:
>>          >         > Dear data modelers
>>          >         >
>>          >         > After urging from the DM chairs, I would like to direct your attention to
>>          >         > the VO-DML page on the IVOA wiki:
>>          >         > http://wiki.ivoa.net/twiki/bin/view/IVOA/VODML
>>          >         >
>>          >         > There you will find links to the VO-DML specification document and the
>>          >         > associated technical xml schema  and schematron files.
>>          >         > Notice that the document is one of the three documents that came from the
>>          >         > UTYPEs Tiger Team. This one, in particular, describes how to express Data
>>          >         > Models in a standard, machine-readable way.
>>          >         > Please read the comment at the start of the spec file for information on which
>>          >         > parts are not quite done (mainly some paragraphs in the intro still have to be added).
>>          >         >
>>          >         > The core part can be commented on freely.
>>          >         >
>>          >         > The wiki page needs some updating with links to, and descriptions of, some
>>          >         > reference implementations.
>>          >         > Much code has been available since Heidelberg (and before) as part of the
>>          >         > prototyping effort mandated after that Interop. Stable code includes a bunch
>>          >         > of XSLT scripts for validation, java code generation, and hypertext
>>          >         > documentation generation which includes DM figures and cross-references
>>          >         > between elements. For two UML tools (Magic Draw CE 12.1 and Modelio) there
>>          >         > are also scripts available to generate VO-DML documents from a properly
>>          >         > designed UML representation. Pointers to the code and to the documentation
>>          >         > are or will be available on the aforementioned wiki page asap.
>>          >         >
>>          >         > Implementations related to the mapping of VO-DML to VOTable, like the
>>          >         > "VO-DML Mapper" (http://gavo.mpa-garching.mpg.de/dev/vodml-mapper/) created
>>          >         > for helping users and data providers through a point-and-click interface and
>>          >         > the photometry service prototype presented in Hawaii are not included, since
>>          >         > they target a different document. But the vo-dml mapper in particular shows
>>          >         > how one can make use of the machine readable DM documents at runtime and
>>          >         > might be seen as another proof of concept implementation also for this
>>          >         > specification.
>>          >         >
>>          >         > We thank those that, in the past year, have sent comments to the editors
>>          >         > directly, but we would urge members to address comments directly to the dm
>>          >         > mailing list.
>>          >         >
>>          >         > Best regards
>>          >         >
>>          >         > Gerard Lemson
>>          >
>>          >
>>          >
>>          >
>>          >
>>          >
>>          > --
>>          > Omar Laurino
>>          > Smithsonian Astrophysical Observatory
>>          > Harvard-Smithsonian Center for Astrophysics
>>          > 100 Acorn Park Dr. R-377 MS-81
>>          > 02140 Cambridge, MA
>>          > (617) 495-7227
>>          
>>          
>>
>>
>>
>>
>> -- 
>> Omar Laurino
>> Smithsonian Astrophysical Observatory
>> Harvard-Smithsonian Center for Astrophysics
>> 100 Acorn Park Dr. R-377 MS-81
>> 02140 Cambridge, MA
>> (617) 495-7227

