UType proposals

Norman Gray norman at astro.gla.ac.uk
Tue Jul 21 10:56:01 PDT 2009


Mireille, hello.

[I note that we've gone up to three mailing lists again -- to avoid  
boiling the oceans, I vote we take it down to one or at most two, and  
I'd vote for the DM list]

On 2009 Jul 3, at 17:31, Mireille Louys wrote:

> I just want to remind you that we would like to agree on a simple  
> Utype representation from the data model point of view, well focused  
> on implementations at the application level as well as the protocol  
> level.
> Our first focus is to discover, access and manipulate data of  
> various archives.
> This is why we need to stick to the object-oriented programming
> properties underlying the data modeling effort and the Utype string
> definition.

In order to avoid misunderstanding: I don't in any sense disagree with  
the use of UML or the like as a data modelling language, nor with the  
suggested algorithm for generating utypes from UML model elements.  I  
do feel that the algorithm can safely be made a good practice rather  
than a requirement, on the grounds that it's always best to  
standardise as little as possible, but I don't think that's an  
important issue.

> What I find interesting in Norman's proposal is the possibility to  
> enhance the documentation aspect of data models, which is tricky to  
> manage properly.
> I think we can have a compromise in the Utype string definition, so  
> that the URI mechanism would allow us to define the namespace and
> build up links to proper documentation pages describing classes of
> the model and their attributes in a well-defined structured document.

Thank you.  Although I believe that the documentation aspect is  
useful, I intended it less as a core feature than as an illustration  
of the sort of useful functionality which could be retrofitted to  
UTypes in future, as long as they were in principle URIs.

> Remember that a Utype string is derived from a data model, its root  
> name always derives from a class name, identifying a concept in the  
> model so the criticism that it is a simple string unaware of its  
> context is not valid.

I think that 'context' is emerging as a key word in this discussion.   
It may mean slightly different things to different people.

I understand you to mean that, for example, 'PublisherId' is an item  
which exists within the 'context' of 'Curation', and gains some of its  
meaning from there.  Each item in the data model, and each utype- 
labelled element in a VOTable (say), acquires meaning from the context  
within the data model.  I certainly agree with this.  [perhaps we  
could call this 'concept context'?]

My understanding is that Doug (who will surely correct me if I've  
misunderstood) sees 'context' as referring to the version of the SSA  
protocol through which a data item was retrieved, and possibly other  
information, and that this information must remain associated with the  
data, and available to an application which wishes to interpret the  
UType in future. [perhaps 'retrieval context'?]

I intend 'context' to mean pretty much the same as 'UType namespace',  
and suggest that it is prudent to have this as the _only_ information  
required to interpret a UType correctly.  Put another way, I think  
that the 'retrieval context' should be discardable.  Thus a Char'n  
v1.1 PublisherId means the same thing whether it appeared through an  
SSA transaction or from a FITS file on a floppy disk.  Having the  
namespace and the DM item conjoined in a single URI seems a very  
robust way of stopping them getting separated. [perhaps 'namespace  
context'?]
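
To make the 'namespace context' point concrete, here is a minimal sketch in Python. It assumes a UType written as a single URI of the form namespace + '#' + item, using the Observation v0.9 URL quoted further down; the record layout and the function names are hypothetical, invented only for illustration.

```python
from urllib.parse import urldefrag

def split_utype(utype_uri):
    """Split a URI-form UType into (namespace, data model item)."""
    namespace, item = urldefrag(utype_uri)
    return namespace, item

# Hypothetical records: the same UType obtained by two different routes.
UTYPE = "http://ivoa.net/DM/Observationv0.9/#Curation.PublisherId"
from_ssa = {"utype": UTYPE, "source": "SSA transaction"}
from_fits = {"utype": UTYPE, "source": "FITS file on a floppy disk"}

def interpret(record):
    # Only the UType itself is consulted; the 'retrieval context'
    # (the "source" field) is deliberately ignored.
    return split_utype(record["utype"])

print(interpret(from_ssa))
print(interpret(from_fits))
```

Both calls give the same namespace and item, which is the sense in which the retrieval context is discardable.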

> The documentation can be organised so that a Utype string like :
> obs:Curation.PublisherId is interpreted on the fly :
> a) resolve the name space  obs --> http://ivoa.net/DM/Observationv0.9/
> b) build up a link to the proper documentation page
> http://ivoa.net/DM/Observationv0.9/Documentation/Curation.html
> c) open it and highlight the
> http://ivoa.net/DM/Observationv0.9/Documentation/Curation.html#Curation.PublisherId
> section within this page.

One could certainly do this, but it might be simpler, for both the  
namespace provider and the application, to simply have
<http://ivoa.net/DM/Observationv0.9/> return HTML, so that
<http://ivoa.net/DM/Observationv0.9/#Curation.PublisherId> went
straight to the relevant section.  As was pointed out (most  
recently I think by Doug, but apologies if it was someone else), you  
would typically want to understand a data model element in context  
(that word again, here used in your 'concept context' sense), and so  
would want to retrieve the whole DM documentation at once, even if you  
wanted to start reading it at the section talking about  
#Curation.PublisherId.
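
The two ways of getting from a prefixed UType to its documentation can be compared in a short Python sketch. The prefix table and the function names are assumptions made for illustration; the URL patterns are the ones quoted in this message.

```python
# Hypothetical prefix table; only the 'obs' mapping appears in this thread.
NAMESPACES = {"obs": "http://ivoa.net/DM/Observationv0.9/"}

def expand(utype):
    """Resolve 'obs:Curation.PublisherId' into (namespace URI, item)."""
    prefix, _, item = utype.partition(":")
    return NAMESPACES[prefix], item

def per_class_page(utype):
    """Steps a) to c) as quoted: one documentation page per class."""
    namespace, item = expand(utype)
    class_name = item.split(".")[0]
    return f"{namespace}Documentation/{class_name}.html#{item}"

def single_page(utype):
    """The simpler alternative: the namespace URI itself serves the
    documentation, and the item is just a fragment within it."""
    namespace, item = expand(utype)
    return f"{namespace}#{item}"

print(per_class_page("obs:Curation.PublisherId"))
print(single_page("obs:Curation.PublisherId"))
```

In the second form the application needs no knowledge of how the documentation is laid out beyond the namespace URI.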

The same URI could of course return HTML to a browser, or an XMI file,  
or something else, depending on the details of the HTTP request.
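
As a rough illustration of that last point, a server could choose the representation from the request's Accept header. This is only a sketch: the file names are hypothetical, serving XMI as "application/xml" is an assumption, and the header parsing is deliberately simplified (no wildcard subtypes, no error handling).

```python
# Hypothetical representations available at the namespace URI.
REPRESENTATIONS = {
    "text/html": "model.html",       # human-readable documentation
    "application/xml": "model.xmi",  # assumed media type for the XMI file
}

def choose_representation(accept_header, default="text/html"):
    """Return the media type to serve for a given Accept header."""
    candidates = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, quality = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                quality = float(field[2:])
        candidates.append((quality, media_type))
    # Highest quality first; sorted() is stable, so ties keep header order.
    for quality, media_type in sorted(candidates, key=lambda c: -c[0]):
        if quality > 0 and media_type in REPRESENTATIONS:
            return media_type
    # Nothing matched: fall back to the default rather than refusing.
    return default

print(choose_representation("text/html,application/xhtml+xml;q=0.9"))
print(choose_representation("application/xml"))
```

A browser would then receive the HTML page, while a tool asking for XML would receive the model file from the same URI.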

> This is not as sophisticated as an ontology, but it would certainly  
> help data providers and users to implement and use models.

Ah, but I think the DMs _are_ ontologies, and ontologies perfectly  
matched to the problem at hand, for the reasons you state (a logician  
would think they were rather dull ontologies, but that's fine -- we  
are not here to provide amusement for logicians).

Job done, almost.  The only remaining job is the inter-WG cooperation,  
making sure that the DM group's models, and the UTypes they include,  
are expressed in such a way that the Semantics WG can make progress  
offering the glue that will potentially link DMs, vocabularies, UCDs,  
VOEvents, and sophisticated Registry searches.

If we don't get that inter-WG cooperation right, we'll have to invent  
_another_ new thing to do the linking.  Unified-Content-Semantic- 
Vocabulary-UTypes anyone?  UCSVUTs?  Please save us from that!

All the best,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
Dept Physics and Astronomy, University of Leicester, UK


