Reference implementations

Laurino, Omar olaurino at cfa.harvard.edu
Tue May 10 19:07:56 CEST 2016


To respond to Tom: I think everybody in this conversation is sane enough
to agree with you on the general goal of the reference implementations.

Technically speaking, though, a data model alone does not achieve any
purpose other than describing entities and their relationships in a
particular domain. It doesn't serve them in protocols, it doesn't serialize
them in VOTables, but it is used by other standards for achieving these
other goals.

We have a process for defining standards that ensures that use cases are
established first, and the standards need to be reviewed to make sure that
they fulfill them.

What does this mean for a data model? Since we are introducing standard
representations for data models, they have to pass the "human" review and
reference implementation process to make sure they capture the universe of
discourse they were supposed to capture, but now they can also be validated
against an XML schema and a schematron.

As for what qualifies as a reference serialization (which I thought was in
the scope of the Mapping document, which is why we split them years ago),
you probably need at least three things:
  * A serialization (fake if necessary) that touches all of the entities
and all of the relationships in the model. You can't possibly have all the
possible combinations/permutations. That would be insane and I don't know
of anybody that validates UML class diagrams by building all possible
object diagrams that can be created from it (and if there is recursion,
that's not even possible).

  * One or more real life serializations from actual or prototype services
covering the use cases the model was designed to fulfill.

  * Client software that can read the serializations and prove that it can
"understand" the information that was encoded in them.
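
The first item can be checked mechanically. Here is a minimal sketch, assuming a toy model and a made-up XML instance format in which every node names the model element it represents through a "vodml-ref" attribute. The element identifiers, the attribute name, and the function names are all hypothetical and are not taken from any IVOA standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical identifiers for the entities and relationships of a toy model.
MODEL_ELEMENTS = {
    "toy:Source",
    "toy:Source.name",
    "toy:Source.position",
    "toy:Position",
}

# Hypothetical "fake" serialization. The tag names and the "vodml-ref"
# attribute are illustrative assumptions, not a standard instance format.
FAKE_SERIALIZATION = """
<instance vodml-ref="toy:Source">
  <attribute vodml-ref="toy:Source.name">M31</attribute>
  <reference vodml-ref="toy:Source.position">
    <instance vodml-ref="toy:Position"/>
  </reference>
</instance>
"""


def touched_elements(xml_text):
    """Return the set of model element ids referenced by a serialization."""
    root = ET.fromstring(xml_text.strip())
    return {node.get("vodml-ref") for node in root.iter() if node.get("vodml-ref")}


def untouched_elements(model_elements, xml_text):
    """Return the model elements that the serialization never references."""
    return set(model_elements) - touched_elements(xml_text)


# An empty result means the serialization touches every element of the model.
print(sorted(untouched_elements(MODEL_ELEMENTS, FAKE_SERIALIZATION)))
```

Note that this only checks that each element appears at least once. It says nothing about combinations of elements, which is the point made above.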

I would argue that if there are no concrete, real-life implementations of
parts of a data model, then the model was over-specified and covers more
use cases than were needed. If those parts have not been tested, they
should not be there to begin with. In this sense, while it is fine if a
reference implementation does not cover the whole model, the union of all
of the implementations (minus the possible "fake" serialization, which
works as some sort of "unit test", so it doesn't count in this context)
should.

Reference implementations are supposed to be complete. If they (alone or
together) are not complete, the standard does not have enough reference
implementations by definition.
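
The completeness criterion above amounts to a set computation. A minimal sketch, assuming each reference implementation can report which model elements it exercises; the element identifiers, the implementation records, and the function name are hypothetical:

```python
def uncovered_elements(model_elements, implementations):
    """Return model elements not exercised by any real implementation.

    Implementations flagged as "fake" (the unit-test style serialization)
    are skipped, so they do not count towards coverage.
    """
    covered = set()
    for impl in implementations:
        if impl["fake"]:
            continue
        covered |= set(impl["elements"])
    return set(model_elements) - covered


# Hypothetical toy model and implementation reports.
MODEL = {"toy:Source", "toy:Source.name", "toy:Source.position", "toy:Position"}
IMPLEMENTATIONS = [
    {"name": "unit-test serialization", "fake": True, "elements": MODEL},
    {"name": "archive A service", "fake": False,
     "elements": {"toy:Source", "toy:Source.name"}},
    {"name": "client B", "fake": False,
     "elements": {"toy:Source", "toy:Position"}},
]

# Prints ['toy:Source.position']: no real implementation exercises that
# element, so under this criterion the set of implementations is incomplete.
print(sorted(uncovered_elements(MODEL, IMPLEMENTATIONS)))
```

Under this reading, a non-empty result means either another reference implementation is needed or the uncovered part of the model should be dropped.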

Omar.


On Tue, May 10, 2016 at 12:32 PM, Mireille Louys <mireille.louys at unistra.fr>
wrote:

> Hi Gerard, Hi all,
>
> Le 10/05/2016 18:04, Gerard Lemson a écrit :
>
> Hi Tom
> I think that writing a model in VO-DML is *not* an implementation of the
> model, but its definition. It *is* an implementation of VO-DML,
>
> Yes, I agree.
>
> An *instance* of the model is a valid implementation of the model, but we
> have no generic standard way (yet) for representing instances of data
> models. The mapping document will fit that, but protocols can also do that.
> Showing interoperability of the data model could be an application that
> uses two or more independent instances of the data model serialized in some
> standard way.
> One way this might happen is that votables annotated with the same data
> model are interpreted as instances of the data model.
> And it would be nice if something interesting is done with them. Ideally
> this could be an implementation of one of the use cases the model was
> supposed to support.
>
> The different scenarios proposed in the use cases should be checked, I
> guess.
> This is sometimes difficult if the model tries to encompass many different
> situations (as Char tried, and ND-Cube probably will).
> Could we envisage a partial validation where the main scenarios are
> checked first and the secondary ones later?
>
> In other words, should we be happy with a 75% validation rate for
> serialisations obtained from a new data model?
>
> my 2c, Mireille.
>
>
> Gerard
>
>
> On Tue, May 10, 2016 at 10:46 AM, Tom McGlynn (NASA/GSFC Code 660.1)
> <tom.mcglynn at nasa.gov> wrote:
>
>> If you feel that data models are not subject to the requirement of two
>> reference implementations, that's fine but then this discussion is moot
>> regardless.  If you think they are then you are quibbling about my choice
>> of words. Feel free to substitute whichever you like for 'protocol'.
>>
>>     Regards,
>>     Tom
>>
>> Matthew Graham wrote:
>>
>>> But a data model is not a protocol.
>>>
>>> -- Matthew
>>>
>>> On May 10, 2016, at 4:37 PM, Tom McGlynn (NASA/GSFC Code 660.1) wrote:
>>>
>>> Counting the fact that you have written a definition of the model
>>>> as an implementation of the model sounds awfully incestuous.
>>>> I have always read the requirement as having two different groups using the
>>>> protocol in some service, ideally one that supports doing astronomy.
>>>>
>>>>     Tom McGlynn
>>>>
>>>> Matthew Graham wrote:
>>>>
>>>>> So once we have a mapping standard with reference implementations any
>>>>> DM model specified in VO-DML could automatically have a reference
>>>>> implementation (according to these criteria) which would make life easier.
>>>>> Time to get that mapping spec out :)
>>>>>
>>>>> -- Matthew
>>>>>
>>>>>
>>>>>
>>>>> On May 10, 2016, at 4:00 PM, Laurino, Omar wrote:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> I agree with Gerard that items 1) and 2) look the same, unless
>>>>>> one brings in a standard for instance serializations. Since we can have
>>>>>> standardized mapping strategies (as different recommendations) that map
>>>>>> instances of valid VODML/XML models and their standard serializations, I
>>>>>> don't think a valid serialization of an instance should be required for
>>>>>> data models: this should be guaranteed by the mapping standard(s) and their
>>>>>> reference implementations.
>>>>>>
>>>>>> As for the ModelImport requirement, it makes sense for "low level"
>>>>>> models but not for "high level" ones, plus there is no way to guarantee
>>>>>> that all types defined by a model are extendable/usable by other models. It
>>>>>> gets too complicated.
>>>>>>
>>>>>> I would suggest we include the "model import" evidence as a "soft
>>>>>> requirement", to be evaluated on a case-by-case basis depending on the use
>>>>>> cases of the model. For STC, it makes a lot of sense to require this
>>>>>> additional proof of interoperability, because the model is intended to be a
>>>>>> building block for other models.
>>>>>>
>>>>>> Or, we might *require* that at least one reference
>>>>>> implementation is a mission/archive/service-specific model that extends the
>>>>>> standard one. So, if we had a model for Sources/Catalogs, at least one
>>>>>> reference implementation should be a mission/archive/system-specific model
>>>>>> that proves the model can be meaningfully extended by actual
>>>>>> specializations. Other than properly validating as VODML/XML, this model
>>>>>> should be evaluated for its domain-specific content, and that's probably
>>>>>> not something you can automate. This should also be a good way to involve
>>>>>> implementors from the community, so it's probably my preferred one.
>>>>>>
>>>>>> Omar.
>>>>>>
>>>>>>
>>>>>> On Tue, May 10, 2016 at 9:38 AM, Gerard Lemson
>>>>>> <gerard.lemson at gmail.com> wrote:
>>>>>>
>>>>>>     Hi
>>>>>>     What is meant by item 2, "An XML serialization of the DM".
>>>>>>     The standard representation (serialization?) of a VO-DML data
>>>>>>     model is VO-DML/XML, i.e. XML. And that is the representation
>>>>>>     that can be validated (step 1?) using automated means, for
>>>>>>     example using XSLT scripts in the vo-dml/xslt folder on
>>>>>> volute at gavo.
>>>>>>     If 2) is meant to imply an XML serialization of an *instance* of
>>>>>>     the model, that we can only do once we have a standard XML
>>>>>>     representation of instances of models. That does not yet exist.
>>>>>>     The original VO-URP framework does contain an automated XML
>>>>>>     Schema generator for its version of VO-DML, that has not yet been
>>>>>>     ported to VO-DML.
>>>>>>     And of course the mapping document describes how one can describe
>>>>>>     instances serialized in VOTable, but that is a different standard.
>>>>>>
>>>>>>     For what it's worth, I think that an "implementation of VO-DML"
>>>>>>     is a data model expressed using that language (in VO-DML/XML to
>>>>>>     be precise) and validated using software. The latter enforces
>>>>>>     that the language should allow automated validation.
>>>>>>     I think interoperable implementations of VO-DML are two or more
>>>>>>     valid models that are linked by "modelimport" relationships.
>>>>>>     I.e. one model "imports" the other(s) and uses types from the
>>>>>>     other as roles or super types in the definition of its own types.
>>>>>>     This is supported by the VODMLID/VODMLREF mechanism of the
>>>>>> language.
>>>>>>
>>>>>>     Cheers
>>>>>>     Gerard
>>>>>>
>>>>>>
>>>>>>     On Tue, May 10, 2016 at 9:04 AM, Matthew Graham
>>>>>>     <mjg at cd3.caltech.edu> wrote:
>>>>>>
>>>>>>         Hi,
>>>>>>
>>>>>>         We're trying to define specifically what would satisfy the
>>>>>>         reference implementation requirement for an IVOA Spec in the
>>>>>>         context of a data model. The proposal is that:
>>>>>>
>>>>>>         (1) If the DM has been described using VO-DML it can be
>>>>>>         validated as valid VO-DML
>>>>>>
>>>>>>         (2) An XML serialization of the DM can be validated
>>>>>>
>>>>>>         so is the combination of the two sufficient to
>>>>>>         demonstrate the validity and potential interoperability of
>>>>>>         the data model (which is the purpose of the reference
>>>>>>         implementations)?
>>>>>>
>>>>>>                 Cheers,
>>>>>>
>>>>>>                 Matthew
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Omar Laurino
>>>>>> Smithsonian Astrophysical Observatory
>>>>>> Harvard-Smithsonian Center for Astrophysics
>>>>>> 100 Acorn Park Dr. R-377 MS-81
>>>>>> 02140 Cambridge, MA
>>>>>> (617) 495-7227
>>>>>>
>>>>>
>>
>
>


-- 
Omar Laurino
Smithsonian Astrophysical Observatory
Harvard-Smithsonian Center for Astrophysics
100 Acorn Park Dr. R-377 MS-81
02140 Cambridge, MA
(617) 495-7227