Format of tokens (was Re: Fwd: Re: IVOA Thesaurus)
Douglas Burke
dburke at cfa.harvard.edu
Thu Nov 1 13:49:22 PDT 2007
Brian Thomas wrote:
> On Thursday 01 November 2007 1:06:55 pm Frederic V. Hessman wrote:
>> At the time, there were lots of voices saying that, while you are
>> perfectly correct (and I'd prefer to have them as humanly readable as
>> possible), the realities of computer-based parsing mean that a
>> trivial token format costs less pain.
>>
>> How about an official show of hands?
>
> Could we have the arguments against human readable again first, before voting?
Brian,
Norman wrote the following in an email on Oct 10 - Versions and
namespaces (was: Vocab AND Ontology?) - where >> indicates a quote from
Rick.
HTH,
Doug
>> I personally find the revamped token list to be much more palatable
>> (which is obviously why I did it), being nearly human-usable (I don't
>> like to be shouted at by capitalized tokens) and with implicit
>> additional info (e.g. formal names of people and objects).
Doug brought up the issue of how to generate the concept names, as URI
fragments. This is a stylistic point, but I think an important one.
I'd like to suggest a rather drastic canonicalisation, so that "He+
ionization zone" would turn into #heionizationzone. This is a pragmatic
middle ground between having the concept name mirror the label, and
having it fully opaque (such as #concept12345).
Having it consist of only lowercase alpha means (a) we're guaranteed to
avoid any parsing troubles, with RDF parsers or with anything else; (b)
it's clear to anyone looking at this that they're not supposed to be
displaying the concept name, but using the concept's 'Label' and
declared relationships instead; while (c) it retains some mnemonic value.
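As a rough illustration of the canonicalisation suggested above, here is a minimal sketch. Python is an assumption (no language is named in this thread), and the function names concept_fragment and find_collisions are hypothetical. It lowercases the label, drops everything that is not a-z, and prefixes '#'. The collision check is not part of the proposal; it is included only because such a drastic reduction can map distinct labels onto the same fragment.

```python
import re
from collections import defaultdict


def concept_fragment(label: str) -> str:
    """Reduce a label to a lowercase-alpha URI fragment (hypothetical helper).

    Everything outside a-z is dropped after lowercasing, so digits,
    punctuation, whitespace and accented letters all disappear.
    """
    return "#" + re.sub(r"[^a-z]", "", label.lower())


def find_collisions(labels):
    """Return fragments that more than one label maps to (hypothetical helper)."""
    by_fragment = defaultdict(list)
    for label in labels:
        by_fragment[concept_fragment(label)].append(label)
    return {frag: ls for frag, ls in by_fragment.items() if len(ls) > 1}


# The example from the message above.
print(concept_fragment("He+ ionization zone"))  # #heionizationzone

# Two distinct labels that reduce to the same fragment.
print(find_collisions(["He+ ionization zone", "He ionization zone"]))
```

If collisions like the second case turn up in a real vocabulary, they would need resolving by hand; how to do that is not something this thread settles.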
There is a case which can be made for having fully opaque concept names
(this is what's done in the Gene Ontology, for example): it's point (b)
above, plus it removes any temptation to argue about relationships based
on the name alone. Despite that, I think there's value in making it at
least partly human-recognisable.