To Omar Re: VO-DML specification document
François Bonnarel
francois.bonnarel at astro.unistra.fr
Tue May 13 10:14:53 PDT 2014
Hi all,
Hi Jesus
I am ready to discuss some of my concerns in Madrid.
This is the topic I chose to illustrate them (as DAL vice-chair I have
some interest in the subject):
"Mapping of ImageDM in SIAV2 metadata resource response: the case of
WCS information / issues and discussion"
Cheers
François
On 12/05/2014 12:25, Jesus Salgado wrote:
> Dear all,
>
> I think we are all trying to do our best to create the best
> specifications possible for our community. It is often difficult,
> and the effort is not always well rewarded, but we have to remember that
> we all have the same objectives and nothing is possible without
> collaboration.
>
> On this particular point, although the positions look quite
> opposed, I think we are not so far apart (string utypes and VO-DML preamble
> are not fully incompatible).
>
> We have time allocated in DM II where we will recap the VO-DML related
> ongoing specs and the implementations that some people have been
> developing. We can concentrate the discussion on the mapping.
>
> Let's use this time all together to discuss it face to face as, on many
> occasions, emails can be misinterpreted. I am sure we can all reach a
> common view.
>
> Cheers,
> Jesus
>
>
> On Fri, 2014-05-09 at 09:21 -0400, Laurino, Omar wrote:
>>
>> I don't think it's a good idea to doubt that I read the
>> documents. I don't think this kind of assumption creates good
>> conditions for a fruitful discussion.
>>
>>
>>
>> I said I didn't understand from your email whether you read it or not.
>> You said you didn't find it.
>>
>>
>> And your email didn't sound very friendly either.
>>
>>
>> Notice that by assuming that you didn't read the document I can
>> understand why you missed the Frequently Asked Questions in the
>> document itself. It contains some replies to your objections, and it
>> has been there since before Heidelberg.
>>
>> Life didn't stop at Sao Paulo. There has been extensive
>> controversial discussion at the Heidelberg TCG, the Heidelberg
>> Interop and the Hawaii Interop.
>>
>>
>> So why are we still having the same discussion we had at all of those
>> Interops that you mention? Why are you suggesting that we go back to a
>> solution that is not compatible with the current usages document? Why
>> are you stating that the publicly collected requirements that the
>> Tiger Team tried to meet are "useless"?
>>
>>
>> That doesn't sound very constructive either.
>>
>> Cheers,
>>
>>
>> Omar.
>>
>>
>> My understanding was that the "Mapping of Complex data models"
>> was an open subject after Hawaii.
>> I will answer all the technical points from your mail and from
>> Markus's, and the issues I see, later by mail or in Madrid.
>>
>> Best regards
>> François
>> >
>> > In particular, there are now a number of reference
>> > implementations and prototypes that show that the mapping
>> > document:
>> > - simplifies the interpretation and interoperability of
>> > serializations and applications. You now have a means to
>> > allow people to publish to the VO without having to read the
>> > VO standard documents!
>> > - is backward compatible, allowing *all* of the current
>> > usages to be still valid in a transitional period or for
>> > implementing local requirements.
>> > - meets the requirements it was supposed to meet.
>> >
>> > To be perfectly sure you grasp the meaning of the second
>> > point: CDS services and Aladin can keep using the UTYPEs
>> > they are using now. The Tiger Team’s specification is
>> > perfectly fine with that. If you want to add one more level
>> > of interoperability, we can provide a point-and-click
>> > graphical user interface that produces the metadata section
>> > you need to add to your service responses. You can make this
>> > extra section a mere pointer to the existing PARAMs and
>> > FIELDs, and you are all set. You mentioned that most of your
>> > tables do not have many annotations, so you can probably
>> > automate the process very easily.
>> >
>> > Your suggestion to keep the current lack of standardization
>> > for UTYPEs means throwing away two years of work under the
>> > Exec mandate and going back to the pre-Tiger Team state. Why
>> > would the Exec ask a Tiger Team to find a solution if there
>> > was no problem to solve?
>> >
>> > You also seem to forget that this work was started in order
>> > to overcome some issues in some of the work for the IVOA
>> > science priorities. In particular, there is a general lack
>> > of interoperability when building SEDs, with applications
>> > required to ask the user to “import” data coming from the VO
>> > into VO applications. We have been stuck with these issues
>> > for three years, and the solution itself is stuck because
>> > you don’t seem to take this into account in your objections.
>> > Now SED is not a priority anymore, but this problem is
>> > coming back with Cubes and Time Series.
>> >
>> > Some of your arguments actually support the Tiger Team’s
>> > proposal of matching the local schemata to a global,
>> > implementation-independent one. As Markus has shown you can
>> > indeed apply this to TAP, and it’s not rocket science. So I
>> > am not going to argue with those points.
>> >
>> > A more general comment about some of your more general
>> > points:
>> >
>> > “Simple String Matching”. You claim that with the Tiger
>> > Team's proposal you cannot just compare a single string to
>> > get the information you need.
>> > This is a very old point that we discussed before, but I am
>> > willing to give one more try in explaining why it’s wrong.
>> > The use of standard libraries is only one of the
>> > counterarguments, but there are plenty of others that we
>> > have made in the past and that you are ignoring in reviving
>> > this point.
>> >
>> > In particular, you seem to forget that Simple String
>> > Matching is something you cannot do even now with the
>> > current standards and the lack of a standard for UTYPEs.
>> >
>> > The most trivial example is that you cannot parse a VOTable
>> > by matching one, static string.
>> >
>> > Also, the PhotDM standard requires GROUPs and FIELDrefs, as
>> > the spec points to a Note by Sebastien Derriere after a
>> > EuroVO ICE meeting in 2010 in which the use of GROUPs and
>> > FIELDrefs is deemed beneficial and necessary. The Tiger
>> > Team’s proposal is basically generalizing and standardizing
>> > that note and the PhotDM serialization strategies so that
>> > all serialization and models are interoperable through a
>> > single specification.
>> >
>> > Refer to Sebastien’s presentation in Naples 2011 for the
>> > features and the benefits of this approach.
>> >
>> > Admittedly, in order to generalize his note and PhotDM, and
>> > include other Notes and standards (Markus’ note about STC,
>> > SimDM) under a single framework along with all other models,
>> > we needed to leverage a standard VOTable feature: nested
>> > GROUPs. Technically, this is far from being a revolution,
>> > but the result is powerful and fixes a number of issues we
>> > have been stuck with for years.
>> >
>> > For a real-case example of this I will use one of the
>> > production implementations at CDS.
>> > Consider this snippet from one of your VizieR production
>> > services:
>> >
>> > <GROUP ID="gsed" name="_sed" ucd="phot"
>> > utype="spec:PhotometryPoint">
>> > <DESCRIPTION>The SED group is made of 4 columns: mean
>> > frequency, flux, flux error, and filter
>> > designation</DESCRIPTION>
>> > <FIELDref ref="sed_freq"
>> > utype="photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"/>
>> > <FIELDref ref="sed_flux"
>> > utype="spec:PhotometryPoint"/>
>> > <FIELDref ref="sed_eflux"
>> > utype="spec:PhotometryPointError"/>
>> > <FIELDref ref="sed_filter"
>> > utype="photdm:PhotometryFilter.identifier"/>
>> > </GROUP>
>> > […]
>> > <FIELD ID="sed_freq" name="_sed_freq" ucd="em.freq"
>> > unit="GHz" datatype="double" width="10" precision="E6">
>> > <FIELD ID="sed_flux" name="_sed_flux"
>> > ucd="phot.flux.density" unit="Jy" datatype="float" width="9"
>> > precision="E3">
>> > <FIELD ID="sed_eflux" name="_sed_eflux"
>> > ucd="stat.error;phot.flux.density" unit="Jy"
>> > datatype="float" width="8" precision="E2">
>> > <FIELD ID="sed_filter" name="_sed_filter"
>> > ucd="meta.id;instr.filter" unit="" datatype="char"
>> > width="32" arraysize="32*">
>> >
>> > [I omitted the descriptions for clarity]
>> >
>> > I will assume that the “spec:” UTYPEs were defined in some
>> > standard. [They are not as far as I know, but it’s hard to
>> > tell because there are several thousand UTYPEs defined in
>> > many documents and their versions, and there is no mechanism
>> > to know what “spec:” is pointing to. These are all issues
>> > fixed by VODML and the mapping strategy we suggested, by the
>> > way]
>> >
>> > How can you get to the FIELD with a single string match? You
>> > can’t, you need to match one string, parse the parent
>> > element according to the VOTable spec (more conditional
>> > string matching, if you prefer), find the “ref” attribute,
>> > match one more string and find the FIELD. Yet, you haven’t
>> > accomplished much because you need to know much more
>> > information in order to get to the data. In general, I don't
>> > think this Turing Machine approach to VOTable is useful, and
>> > it's certainly not robust.
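
The two-step lookup described above (match a utype on a FIELDref, then follow its "ref" to the FIELD) can be sketched in Python with the standard-library XML parser. This is only an illustrative sketch, not code from the thread: the snippet is a trimmed copy of the one quoted above, with the FIELD elements self-closed and everything wrapped in a TABLE element so that it parses on its own, and `field_for_utype` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the quoted snippet (assumption: FIELDs are
# self-closed and wrapped in TABLE so the fragment parses as one document).
VOTABLE = """
<TABLE>
  <GROUP ID="gsed" name="_sed" ucd="phot" utype="spec:PhotometryPoint">
    <FIELDref ref="sed_freq"
      utype="photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"/>
    <FIELDref ref="sed_flux" utype="spec:PhotometryPoint"/>
    <FIELDref ref="sed_eflux" utype="spec:PhotometryPointError"/>
    <FIELDref ref="sed_filter" utype="photdm:PhotometryFilter.identifier"/>
  </GROUP>
  <FIELD ID="sed_freq" name="_sed_freq" ucd="em.freq" unit="GHz" datatype="double"/>
  <FIELD ID="sed_flux" name="_sed_flux" ucd="phot.flux.density" unit="Jy" datatype="float"/>
  <FIELD ID="sed_eflux" name="_sed_eflux" ucd="stat.error;phot.flux.density" unit="Jy" datatype="float"/>
  <FIELD ID="sed_filter" name="_sed_filter" ucd="meta.id;instr.filter" datatype="char" arraysize="32*"/>
</TABLE>
"""


def field_for_utype(table, utype):
    """Return the FIELD that a FIELDref carrying this utype points to, or None."""
    # Step 1: match the utype string on a FIELDref.
    for fieldref in table.iter("FIELDref"):
        if fieldref.get("utype") == utype:
            # Step 2: follow "ref" and match a second string, the FIELD ID.
            for field in table.iter("FIELD"):
                if field.get("ID") == fieldref.get("ref"):
                    return field
    return None


table = ET.fromstring(VOTABLE)
flux = field_for_utype(table, "spec:PhotometryPoint")
print(flux.get("name"), flux.get("unit"))  # prints: _sed_flux Jy
```

Even in this small case the lookup already depends on the element structure (FIELDref inside GROUP, FIELD resolved by ID), not on one static string.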
>> >
>> >
>> > You can also refer to the Current Usages document and find
>> > that applications already need to work around the old UTYPEs,
>> > parse them, and make assumptions about them, because simple
>> > string matching is utterly naive and, in real life, it
>> > just doesn't work.
>> >
>> > This statement, in particular:
>> > For many applications the proposed mechanism will
>> > make the recognition of model attributes associated
>> > with FIELDS in our table a much more complex and
>> > heavy process than the current one. Instead of
>> > simple string matching recognition it will require
>> > development of a complex hierarchical structure
>> > which has to be fully created and filled from
>> > VOTABLE parsing and explored for recognition. The
>> > objection to my point is that standard libraries
>> > could do it for the developer, but application
>> > developers may want to avoid using this and it may
>> > also be not useful (see c for details)
>> >
>> > Apart from being wrong, as demonstrated by the
>> > implementations and by the rich literature on the Tiger Team
>> > proposal, since they show that the process is actually
>> > simplified and standardized, this statement questions
>> > VOTable itself.
>> >
>> >
>> > In fact, VOTable is "a complex hierarchical structure which
>> > has to be fully created and filled from VOTABLE parsing and
>> > explored for recognition". If a developer doesn’t want to
>> > use standard libraries they are free to knock themselves out
>> > and reimplement a VOTable parser, to do which they need to
>> > read the specs. The same applies to FITS, to XML, to JSON,
>> > etc. So this applies also to VODML. I don’t see the
>> > problem.
>> >
>> >
>> >
>> > Cheers,
>> >
>> >
>> > Omar.
>> >
>> >
>> >
>> >
>> > On Wed, May 7, 2014 at 12:38 PM, François Bonnarel
>> > <francois.bonnarel at astro.unistra.fr> wrote:
>> > Hi all,
>> > Since the last TCG teleconf and the email from
>> > Gerard below, there has been a lot of discussion
>> > around VO-DML, and there are some aspects which
>> > give me some concerns:
>> > * I think many people are still mixing two
>> > aspects which were to be separated into two
>> > different drafts, according to the
>> > conclusions of our controversial discussion
>> > held in Hawaii, as they were summarized by
>> > Jesus here:
>> >
>> > http://wiki.ivoa.net/internal/IVOA/PlenarySessionsSep2013/DM_Closing_Hawaii2013_JSalgado.pdf (slides 6 and 7)
>> >
>> > I see that a lot of work has been done to
>> > update the first draft, "VO-DML: A Data Modeling
>> > Language for the VO", but nothing new has been done
>> > on the second draft, "Mapping of Complex Data
>> > Models". The title of this second draft reflected
>> > the difficulties that appear when using the VO-DML
>> > description to map the models into VOTable, making
>> > an extensive but seriously modified usage of the
>> > utype attribute. However, I can read several
>> > sentences which show that for many people nothing
>> > has changed since the time when the introduction of
>> > the VO-DML draft was first written.
>> > Here is a quotation from the abstract of the VO-DML
>> > document "VO-DML a consistent modelling language for
>> > IVOA data models":
>> > > "Arguably the most important use case for VO-DML
>> > > is the UTYPE specification [2]
>> > > which uses it to provide a translational semantics
>> > > for VOTable annotations.
>> > > These annotations allow one to explicitly describe
>> > > how instances of types from a
>> > > data model are stored in the VOTable."
>> > and the whole introduction still emphasizes that the
>> > main use case for the VO-DML modelling language is utype
>> > specification in VOTABLE. It also implies that the
>> > VO-DML GROUP mechanism is THE (unique) way to do the
>> > mapping.
>> >
>> > Last but not least, I see nothing like a
>> > "Mapping of Complex Data Models" document in the
>> > repository.
>> >
>> > I think this is not in the spirit of the decision
>> > taken in Hawaii. The title chosen in Hawaii,
>> > reflecting the discussion held there, was unambiguous
>> > in not pre-deciding any particular solution.
>> > * The mechanism proposed so far to map
>> > the models into VOTABLE presents several
>> > severe issues which I would like to develop.
>> > It must be clear to everybody that, apart from
>> > this, most of the effort put into building a
>> > consistent modelling language for the IVOA looks very
>> > promising to me. Having a description language with
>> > an XML serialization allows diagrams and
>> > models built with different modelling software to be
>> > shared, and helps to generate interoperable
>> > documentation and code. This is real progress and
>> > I appreciate the effort made by Mark (and now
>> > Arnold) to map various models in the work done by
>> > Gerard, Omar and others. For me this is the core of
>> > "VO-DML a consistent modelling language for IVOA
>> > data models", and this is progress. We probably
>> > still have things to discuss (the utype attribute
>> > stuff and the ivoa datatypes among others) but I have
>> > no objection to going forward along this path
>> > towards recommendation.
>> >
>> > * So let's talk about "Mapping of
>> > Complex data models"
>> > I see three severe issues in adopting the
>> > mapping of VO-DML structures to VOTABLE
>> >
>> > 1. For many applications the proposed
>> > mechanism will make the recognition of model
>> > attributes associated with FIELDS in our
>> > table a much more complex and heavy process
>> > than the current one. Instead of simple
>> > string matching recognition it will require
>> > development of a complex hierarchical
>> > structure which has to be fully created and
>> > filled from VOTABLE parsing and explored for
>> > recognition. The objection to my point is
>> > that standard libraries could do it for the
>> > developer, but application developers may
>> > want to avoid using this and it may also be
>> > not useful (see c for details)
>> > 2. For probably more than 90% of tables
>> > exchanged in the VO the application of this
>> > mapping seems to be simply impossible (or at
>> > least awkward).
>> > * A large majority of the huge number
>> > of columns available in the VO
>> > (those of the catalogs) are not
>> > associated with a model attribute.
>> > Probably many could have one. It has
>> > started with the PhotDM ones for SED
>> > building, and a lot can be associated
>> > with STC or others. But as
>> > we add models the number of VO-DML
>> > GROUPs will increase for very
>> > partial matching.
>> > * Astronomical catalogs are (or will
>> > be) distributed with TAP. TAP
>> > provides tables where the number of
>> > columns is variable, dependent on
>> > the actual ADQL query sent. This
>> > will imply that either the
>> > VO-DML GROUPs are also dependent on
>> > the query (and not unique for a
>> > service implementing a model) OR
>> > (alternatively) that the VO-DML
>> > GROUPs contain some empty (or
>> > absent) FIELDs.
>> > 3. In many actual and current VO use cases it
>> > is more or less useless.
>> >
>> >
>> >
>> > * Why? It is widely accepted that
>> > IVOA data models are not (in general)
>> > the internal data models of servers and
>> > archives. They are models for the
>> > interaction of the archive with the
>> > outside world.
>> > * What is the situation for
>> > applications (desktop client
>> > applications, I mean)? I think the
>> > assumption of the VO-DML to VOTABLE
>> > mapping is that the application will
>> > contain a full implementation of the
>> > IVOA data model (this can probably
>> > be done by preparing the IVOA model
>> > classes when creating/modifying the
>> > application code or by creating them
>> > dynamically when reading the
>> > VOTABLE). Then the parsing of the
>> > VOTABLE allows the objects to be
>> > populated with values contained in
>> > the columns and rows of the VOTABLE.
>> > So everything in the IVOA model is
>> > mapped one to one to the application
>> > model.
>> > * I don't say that this cannot be used
>> > and is not useful in some cases. I
>> > say it's not useful in general,
>> > because, as already discussed in the
>> > past, applications use model
>> > attributes (known through
>> > current-style utypes) as roles to
>> > know what to do with the content of
>> > the column.
>> > * Let me now try to formalize this a
>> > little bit. Let's call it the "current
>> > life model mapping mechanism". I am
>> > proposing a formalization (that
>> > means everything SEEMS to work as
>> > if it were built like this, although I
>> > know it is actually NOT TRUE and is
>> > more dispersed in the code).
>> > - The application has its own model
>> > (its classes and methods, let's say in
>> > Java).
>> > - There is the IVOA model
>> > described somewhere. Each attribute and
>> > class has its fully qualified name
>> > (a.b.c.d ...). This is the current-style
>> > utype.
>> > - The application implements a
>> > VOTABLE parser. When it runs and reads a
>> > given VOTABLE the current-style utypes are
>> > recognized. An action is driven by this
>> > recognition, which is basically either to
>> > populate the objects of the application
>> > model with values taken from the VOTABLE
>> > cells or to launch application model methods
>> > implied by the occurrence of this IVOA DM
>> > attribute. This is also a kind of mapping, but
>> > it is not one to one and may be incomplete.
>> > Several (incomplete) IVOA data models can be
>> > used in the same VOTABLE document.
>> > I think all the current VO applications work
>> > like this, and I assume that in the future a
>> > majority of application developers would
>> > like to keep working like this. For example,
>> > that would be the case for developers
>> > maintaining existing software and eager to
>> > connect their application nearly "as is" to
>> > the VO world. I don't say that nobody would
>> > like to use a direct implementation of IVOA
>> > data models in their applications, but I claim
>> > that not all of them will want to do it or
>> > need to do it.
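
The "current life model mapping mechanism" formalized above amounts to a dispatch from recognized utype strings to actions on the application's own model. A minimal Python sketch of that idea follows, under loud assumptions: `SedPlot`, `HANDLERS` and `load_rows` are hypothetical names, the two utypes are borrowed from the VizieR snippet quoted elsewhere in this thread, and the row values are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SedPlot:
    """Hypothetical application-side model; not an IVOA data model class."""
    fluxes: list = field(default_factory=list)
    filters: list = field(default_factory=list)


# Recognized utype string -> action on the application model.
HANDLERS = {
    "spec:PhotometryPoint":
        lambda plot, cell: plot.fluxes.append(float(cell)),
    "photdm:PhotometryFilter.identifier":
        lambda plot, cell: plot.filters.append(cell),
}


def load_rows(column_utypes, rows):
    """Populate a SedPlot from table rows, driven only by per-column utypes."""
    plot = SedPlot()
    for row in rows:
        for utype, cell in zip(column_utypes, row):
            handler = HANDLERS.get(utype)
            if handler is not None:  # columns with no or unknown utype are skipped
                handler(plot, cell)
    return plot


# Made-up example: first column carries no utype at all.
columns = [None, "spec:PhotometryPoint", "photdm:PhotometryFilter.identifier"]
rows = [("1.4", "0.25", "filterA"), ("4.85", "0.11", "filterB")]
plot = load_rows(columns, rows)
print(plot.fluxes, plot.filters)
```

The mapping here is partial by construction: only the attributes the application cares about trigger an action, and nothing requires the full IVOA model to be instantiated.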
>> > ---------------------------------------------
>> >
>> > To conclude this mail, I would say
>> > I wrote all this to (re)open the discussion on the
>> > second slot of Jesus' Hawaii summary: "Mapping
>> > of Complex data models".
>> > I see an alternative to the mapping mechanism
>> > proposed so far for VO-DML:
>> > * Let's keep the utypes as Java-like fully
>> > qualified names (a.b.c.d), more or less as
>> > they are now.
>> > * This allows existing and future applications
>> > to work according to my little "current
>> > life model mapping mechanism"
>> > * This doesn't prevent populating an
>> > application model structure that exactly maps
>> > the IVOA model. The fully qualified name can
>> > be decomposed to find out which class
>> > and which member the FIELD is associated with.
>> > So WE POINT FROM the FIELDs to the model and
>> > NOT the REVERSE way.
>> > * Transporting the structure of the model in
>> > the VOTABLE will not be forbidden (the mechanism
>> > would probably be rather similar to the one proposed
>> > so far, utype usage apart), but it will not be
>> > mandatory and in many cases not useful or
>> > not possible (e.g. TAP).
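
The decomposition of a fully qualified name mentioned in the list above could look like the following Python sketch. It rests on assumptions the thread does not spell out: an optional `prefix:` before the dotted path (as in the `photdm:` utypes quoted elsewhere in the thread), and the last dotted segment being the member while the preceding segments name the class path. `split_utype` is a hypothetical helper name.

```python
def split_utype(utype):
    """Split 'prefix:A.B.c' into (prefix, [class path segments], member).

    Assumes the last dotted segment is the member and everything before it
    is the class path; the prefix is '' when no ':' is present.
    """
    prefix, sep, path = utype.partition(":")
    if not sep:  # no prefix given, e.g. "a.b.c.d"
        prefix, path = "", utype
    *class_path, member = path.split(".")
    return prefix, class_path, member


print(split_utype("photdm:PhotometryFilter.SpectralAxis.Coverage.Location.Value"))
print(split_utype("a.b.c.d"))
```

With this direction of pointing, each FIELD carries enough information on its own to be placed in a model structure, with no GROUP needing to be present in the table.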
>> > Best regards
>> > François
>> > On 24/04/2014 14:31, Gerard Lemson wrote:
>> > > Dear data modelers
>> > >
>> > > After urging from the DM chairs, I would like to direct your attention to
>> > > the VO-DML page on the IVOA wiki:
>> > > http://wiki.ivoa.net/twiki/bin/view/IVOA/VODML
>> > >
>> > > There you will find links to the VO-DML specification document and the
>> > > associated technical xml schema and schematron files.
>> > > Notice that the document is one of the three documents that came from the
>> > > UTYPEs Tiger Team. This one, in particular, describes how to express Data
>> > > Models in a standard, machine-readable way.
>> > > Please read the comment at the start of the spec file for information on which parts
>> > > are not quite done (mainly, some paragraphs in the intro still have to be added).
>> > >
>> > > The core part can be commented on freely.
>> > >
>> > > The wiki page needs some updating with links to, and descriptions of, some
>> > > reference implementations.
>> > > Much code has been available since Heidelberg (and before) as part of the
>> > > prototyping effort mandated after that Interop. Stable code includes a bunch
>> > > of XSLT scripts for validation, Java code generation, and hypertext
>> > > documentation generation which includes DM figures and cross-references
>> > > between elements. For two UML tools (Magic Draw CE 12.1 and Modelio) there
>> > > are also scripts available to generate VO-DML documents from a properly
>> > > designed UML representation. Pointers to the code and to the documentation
>> > > are or will be available on the aforementioned wiki page asap.
>> > >
>> > > Implementations related to the mapping of VO-DML to VOTable, like the
>> > > "VO-DML Mapper" (http://gavo.mpa-garching.mpg.de/dev/vodml-mapper/) created
>> > > for helping users and data providers through a point-and-click interface and
>> > > the photometry service prototype presented in Hawaii are not included, since
>> > > they target a different document. But the vo-dml mapper in particular shows
>> > > how one can make use of the machine readable DM documents at runtime and
>> > > might be seen as another proof of concept implementation also for this
>> > > specification.
>> > >
>> > > We thank those that, in the past year, have sent comments to the editors
>> > > directly, but we would urge members to address comments directly to the dm
>> > > mailing list.
>> > >
>> > > Best regards
>> > >
>> > > Gerard Lemson
>> >
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Omar Laurino
>> > Smithsonian Astrophysical Observatory
>> > Harvard-Smithsonian Center for Astrophysics
>> > 100 Acorn Park Dr. R-377 MS-81
>> > 02140 Cambridge, MA
>> > (617) 495-7227
>>
>>
>>
>>
>>
>>
>> --
>> Omar Laurino
>> Smithsonian Astrophysical Observatory
>> Harvard-Smithsonian Center for Astrophysics
>> 100 Acorn Park Dr. R-377 MS-81
>> 02140 Cambridge, MA
>> (617) 495-7227