
Bernard Vatant bernard.vatant at mondeca.com
Mon Jan 28 09:43:19 PST 2008


Hi all

Some thoughts below on various points (URI, versions, SKOS, RDF/XML ...)

Norman Gray wrote:
>> - suggest keeping version info in the name of the vocabulary (rather 
>> than in the pathname, which might change without the vocabulary 
>> actually changing) - eg. "http://myvocab.org/myvocab-v1.1#mytoken" 
>> rather than "http://myvocab.org/v1.1/myvocab/#mytoken".
>
I wonder whether it's a good idea to have a version number in the token 
URI at all (the URI identifies a concept, right?). Generally the 
vocabulary itself has a version number, but the concepts are better off 
identified by a stable URI, e.g. http://myvocab.org/myvocab#mytoken. 
Since the definition can change, and the authoritative one is in the 
current version, you can use something like:

<http://myvocab.org/myvocab#mytoken>  rdfs:isDefinedBy  <http://myvocab.org/myvocab/> .

where the latter URI serves the most recent version of the vocabulary.

A good example of this practice is the publication of the Dublin Core 
elements in RDF. The vocabulary lives permanently at 
http://purl.org/dc/terms/
which currently serves the latest version of the vocabulary, 
http://dublincore.org/2008/01/14/dcterms.rdf
in which elements have version-independent URIs, such as 
http://purl.org/dc/terms/creator

The stability of vocabulary URIs is important for the stability of 
applications.
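Concretely, the pattern can be sketched in Turtle (the label value here 
is illustrative, but dcterms:creator really does carry an 
rdfs:isDefinedBy pointing at the version-independent namespace URI):

```turtle
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# Version-independent term URI, declared as defined by the
# (versionless) vocabulary URI rather than by any dated release.
dcterms:creator
    a rdf:Property ;
    rdfs:label "Creator"@en ;
    rdfs:isDefinedBy <http://purl.org/dc/terms/> .
```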
> True, though this might be unnecessary inasmuch as it'd always be the 
> complete namespace URI which is quoted.  On one side, having the 
> version number in the `filename' is convenient for people downloading 
> the file; on the other, having the number further `up' the path name 
> might be convenient in bundling together a number of separate files in 
> a release.  I think this could probably be handled as part of the 
> minutiae of making the release.
See above
>
>> - suggest using either a "hash namespace" (e.g. 
>> "http://myvocab.org/myvocab-v1.1#mytoken"), a "slash namespace" (e.g. 
>> "http://yourvocab.net/yourvocab-v1.1/mytoken") or a 303-redirect 
>> service; all three proposals are out there and have their points.  I 
>> believe the "hash" variety is simpler to understand and configure 
>> (for small vocabularies, and we all agree we want many small 
>> vocabularies rather than a few gigantic ones).  If the cognoscenti 
>> think that all the GET requests for distinguishing between contents 
>> are far enough along, then the "slash" variety may make it easier to 
>> query individual entries (e.g. HTML docs rather than RDF), something 
>> which appears to be  harder using "hash".  The point is to make a 
>> single mechanism standard - at least at first, when there are enough 
>> other things to worry about.  When the semantic web finally chooses a 
>> standard, we can still adopt it then (if we haven't already).
Not sure what you mean by "choose a standard"? In this domain, I prefer 
to speak of recommended (best) practices. Serving one page per 
individual entry is more costly to set up, but has the clear advantage 
of making explicit what the individual description is, which can 
otherwise be split all over a single RDF file. OTOH, publishing the full 
vocabulary file is also convenient when it's not too big. See what I've 
done at http://lingvoj.org, where I serve both a complete RDF file and 
individual pages with 303 redirects for each concept.
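As a sketch of how such 303 redirects can be wired up (this is an 
illustrative Apache configuration along the lines of the W3C "Best 
Practice Recipes", not the actual lingvoj.org setup; the /id/, /data/ 
and /page/ paths are assumptions):

```apache
RewriteEngine On
# If the client asks for RDF, answer the concept URI with a
# 303 See Other pointing at the RDF document about the concept
RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
RewriteRule ^/id/(.+)$ /data/$1.rdf [R=303,L]
# Otherwise, 303-redirect to a human-readable HTML page
RewriteRule ^/id/(.+)$ /page/$1.html [R=303,L]
```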
>
> This is a rather subtle issue, and I agree, something like what you've 
> said.  I presume you've seen papers like
> ...
> and the httprange-14 resolution.  I think this is, like the first, 
> mostly a fine-tuning distribution issue.
Good references, and good conclusion too :))
>> - Looked at a few parsers (e.g. HP's Jena for Java) but still can't 
>> judge whether there are enough flexible parsers out there to be able 
>> to say it doesn't matter whether a publisher uses XML, N3 or Turtle 
>> (I think we can agree we're not going to accept plain ascii, given 
>> that only a trivial vocabulary - list of words - is interpretable and 
>> it is trivial to put plain ascii content into any of the official 
>> formats).  We need to make some statement about format, even if the 
>> statement is that any of the most common formats is OK.  If we need 
>> to choose just one, I still say XML is best, since it doesn't need 
>> any additional parser at all to get started.
>
> One can't be definitive here, but the list at 
> <http://planetrdf.com/guide/#sec-tools> includes a fair number of RDF 
> APIs buried amongst other applications.  The best known parsers are 
> Jena and Sesame, both in Java, and librdf.org, which is written in C 
> but which has bindings in a variety of scripting languages.  I think 
> there are `enough' parsers.
>
> The problem with RDF serialised in XML (aka RDF/XML) remains that, 
> although it is _lexically_ XML, one can't rely on being able to 
> process it sensibly using just XML tools.  Thus you _do_ need an RDF 
> parser to get started.  It's possible to hand-write RDF/XML which 
> looks like `normal' XML, but that's as far as one can go.
>
> So I still believe (as I know you know; I'm just repeating it here for 
> the list!) that our best tactic would be to require RDF in any format, 
> perhaps with an expectation that a published vocabulary would be 
> available in more than one of the legal formats.
Having struggled with this format issue for quite a while, I cannot 
stress enough the importance of the points made by Norman here. There is 
a heap of good RDF parsers, but they are not all equal with respect to 
the XML world. If you want to interface RDF storage/processing with a 
"normal" (non-RDF) XML environment, this has to be looked at closely.
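To make Norman's point concrete: the same single triple can be 
serialised in RDF/XML in several lexically different ways, so a plain 
XML query written against one form silently misses the other. A small 
sketch using only Python's standard library (the document URI and 
literal are made up for illustration):

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/terms/}"

# Two legal RDF/XML serialisations of the same single triple:
#   <http://example.org/doc> dcterms:creator "Jane" .

# Form 1: the property as a nested element
nested = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/doc">
    <dcterms:creator>Jane</dcterms:creator>
  </rdf:Description>
</rdf:RDF>"""

# Form 2: the same property as an XML attribute
attributed = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.org/doc"
                   dcterms:creator="Jane"/>
</rdf:RDF>"""

def creators_via_xml(doc):
    # The "naive XML" approach: look for a dcterms:creator *element*.
    root = ET.fromstring(doc)
    return [e.text for e in root.iter(DC + "creator")]

print(creators_via_xml(nested))      # ['Jane']
print(creators_via_xml(attributed))  # [] -- same triple, invisible to the query
```

An RDF parser normalises both forms to the same triple; the XML tooling 
sees two unrelated documents. That is why "RDF in any format, read with 
an RDF parser" is the safer requirement.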
>
>> - given that SKOS mappings appear to be a moving target, it looks 
>> like there's no point in mentioning anything about what we're 
>> actually going to DO with the vocabularies.  I was hoping we could at 
>> least support a few RDF containers like rdf:Bag, but it appears we're 
>> going to depend upon simple things like
>>
>>     <param name="event-is-not" rdf:resource="voe:GRB">
>
> Regrettably yes, the SKOS mappings work does seem to be in flux right 
> now (for those who're not following the relevant list slavishly, the 
> current situation seems to be that although the SKOS Core document 
> seemed almost finished apart from fine details, there now appears to 
> be an expectation that the still changing SKOS Mappings standard will 
> move from its own document into the SKOS Core document).  The SKOS 
> Core structures which aren't involved in inter-vocabulary mappings do 
> appear to be stable, though.  Alasdair has been keeping track of this 
> and will surely comment if I'm misrepresenting the SKOS list.
Just a reminder that SKOS Mapping has so far never been a "standard", 
but a side order of the first versions of the SKOS vocabulary, a product 
of the SWAD-Europe project, when it was not even on the W3C track. In a 
nutshell, what has been decided is that the few elements of SKOS mapping 
were not worth a separate namespace and specification, but were to be 
revisited and included in the "core" specification, effectively with 
less expressivity (or more simplicity).
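For those not following the SKOS list, the mapping properties being 
folded into the core look like this in Turtle (the concept URIs below 
are purely illustrative):

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

# Hypothetical mapping between concepts of two vocabularies.
<http://myvocab.org/myvocab#GRB>
    skos:exactMatch  <http://yourvocab.net/yourvocab#GammaRayBurst> ;
    skos:broadMatch  <http://yourvocab.net/yourvocab#Transient> .
```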

I take the opportunity to point at the current SKOS working draft, 
released last Friday at http://www.w3.org/TR/skos-reference/, which 
includes the state of discussion on those matters, particularly at 
http://www.w3.org/TR/skos-reference/#L1309

Feedback from IVOA use cases and requirements will of course be welcome, 
as usual.

Best

Bernard

-- 

Bernard Vatant
Knowledge Engineering
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    www.mondeca.com
----------------------------------------------------
Tel:       +33 (0) 871 488 459
Mail:     bernard.vatant at mondeca.com
Blog:    Leçons de Choses <http://mondeca.wordpress.com/>
