VOEvent References

Norman Gray norman at astro.gla.ac.uk
Mon Mar 21 11:32:04 PDT 2011


Rick (and all), hello.

So we have the problem of distinguishing between the data format and the data meaning.  The former is a syntactic matter -- which library can parse this stream of bytes? -- and the latter is a semantic one -- what scientific role does this data object have?

These two aspects seem orthogonal and complete, and that's nice and simple.  But your further examples complicate this somewhat.

On 2011 Mar 21, at 10:44, Frederic V. Hessman wrote:

>  If the "type" attribute is supposed to say what format the data are in and the semantic content of the reference is supposed to say what the reference "means", then may I suggest we consider the following brainstorming examples which will NOT work with MIME:
> 
> 	<Reference format="vo://net.ivoa/vospace/core#sextractor"  meaning="ivoat:catalogues" />
> 	<Reference format="http://gsfc.nasa.gov/rdf/swift#SwiftByteFormatStream"  meaning="ivoat:telemetry" />
> 	<Reference format="skypublishing:web2/skos#CelestiaObjectDescriptionFile"  meaning="ivoat:simulation" />

Also, from your reply to Steve,

> I understand, and this is all very good, but does IVOA then have to create VO standards for everyone who might need it?  My examples of SWIFT telemetry and Celestia simulation files weren't (entirely)  tongue-in-cheek.

I think the answer lies at the end of this train of thought:

The useful insight of REST (which claimed to be the insight of The Web) is that there are _things_, such as the weather in Oaxaca, or a picture of Santa, which can be _represented_ by serialisations which can be transmitted over the web.  A thing may have multiple representations, and you can get information about the representation you've actually received, by looking at the MIME type.
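
To make the thing/representation split concrete, here is a minimal sketch in Python of a server choosing which representation of one _thing_ to send, based on the client's HTTP Accept header.  The function name `negotiate`, the `WEATHER_IN_OAXACA` table and its contents are all invented for illustration; this is a simplified reading of content negotiation, not a full implementation of the HTTP rules.

```python
# One abstract "thing" with several serialisations, keyed by MIME type.
# The table and its contents are hypothetical illustration data.
WEATHER_IN_OAXACA = {
    "text/plain": b"Oaxaca: sunny, 27 C",
    "application/json": b'{"place": "Oaxaca", "sky": "sunny", "temp_c": 27}',
}


def negotiate(accept_header, available):
    """Return the MIME type from `available` that best matches the Accept
    header value, or None if nothing matches.  Simplified: handles q-values
    and '*' wildcards, ignores other media-type parameters."""
    prefs = []
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        mime, q = parts[0], 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((q, mime))

    # Highest q first; sorted() is stable, so header order breaks ties.
    for q, mime in sorted(prefs, key=lambda pref: -pref[0]):
        if q <= 0:
            continue
        for candidate in available:
            if mime in ("*/*", candidate):
                return candidate
            if mime.endswith("/*") and candidate.startswith(mime[:-1]):
                return candidate
    return None


chosen = negotiate("application/json, text/plain;q=0.5",
                   list(WEATHER_IN_OAXACA))
body = WEATHER_IN_OAXACA[chosen]
```

The client learns which representation it actually received from the MIME type (`chosen` here), while the thing being described stays the same whichever serialisation is picked.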

The _thing_ may be more or less abstract, but it's turned into something else -- a representation -- for delivery to the user/client/astronomer.  That suggests that the representation is to some extent a detail of the transport, and not important once you've moved back up the network stack to the real application logic.

That perhaps indicates where the boundary is between @mimetype and @sciencetype -- it's at some boundary between the details of the interaction, and the meat-and-potatoes of the application.  In your example of a CelestiaObjectDescriptionFile, the important thing for the application is that this is a CelestiaObjectDescriptionFile.  It's not a simulation which just happens to be encoded in that format, which can therefore be abstracted away in the way that GIF vs PNG can be abstracted away -- the fact it's in that format is of significance to the meat-and-potatoes level of the application.

The same thing can I think be said of the other examples you mention.

I'm aware this is not a hard boundary, and that there will be a point of view from which the important thing is that the file is a simulation, only incidentally in this Celestia format.  But in the first place, that relationship will probably be of significance only when you are, for example, _searching_ for simulations rather than processing a VOEvent packet, and in the second place, this seems to be suggesting that if you really want to have this network of semantic information written down somewhere, so that the application 'knows' that a Celestia file is a simulation, then... have I mentioned RDF at all before?

So perhaps the _real_ answer is to say "forget about this distinction between @sciencetype and @mimetype, since the boundary between them evaporates as you look at it; instead describe all the information you have in some suitably flexible packet of semantic description".  But I _know_ no-one's going to buy that (which is a pity, because I'm starting to think that's close to the right answer).

Instead, identify some pragmatic boundary between interaction and meat-and-potatoes, with the 'interaction layer' being characterised as "everything you can describe in a MIME type".

The MIME types don't _have_ to be standardised ones, of course.  It's not pretty, but you could say

    mimetype='text/x-sextractor-catalogue'
    sciencetype='ivoat:catalogue'

if you really think that the fact that the bytestream is in that format (and could be losslessly recoded in a different tabular format) is an interaction-layer detail, and say

    mimetype='application/octet-stream'
    sciencetype="vo://net.ivoa/vospace/core#sextractor"

if you think that being a SExtractor output file is the key thing.
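
As a rough sketch of how a client might act on that split, the Python below decodes by @mimetype first (the interaction layer) and only then dispatches on @sciencetype (the application layer).  The `Reference` class, the `DECODERS` and `HANDLERS` tables, and the toy catalogue parser are all assumptions made for illustration; nothing here is taken from the VOEvent schema or from any SExtractor library.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    """Hypothetical in-memory form of a <Reference> element."""
    uri: str
    mimetype: str
    sciencetype: str


def decode_table(raw):
    """Toy parser: whitespace-separated rows, '#' lines are comments."""
    lines = raw.decode("ascii").splitlines()
    return [line.split() for line in lines
            if line and not line.startswith("#")]


# Interaction layer: how to turn bytes into something usable.
DECODERS = {
    "text/x-sextractor-catalogue": decode_table,
    "application/octet-stream": lambda raw: raw,  # opaque bytes
}

# Application layer: what the object is for.
HANDLERS = {
    "ivoat:catalogue":
        lambda rows: f"catalogue with {len(rows)} rows",
    "vo://net.ivoa/vospace/core#sextractor":
        lambda raw: f"SExtractor output, {len(raw)} bytes",
}


def process(ref, raw):
    decoded = DECODERS[ref.mimetype](raw)
    return HANDLERS[ref.sciencetype](decoded)


raw = b"# 1 NUMBER\n1 10.5 20.1\n2 11.0 21.3\n"

# First choice: the format is an interaction-layer detail.
as_catalogue = process(
    Reference("http://example.org/cat", "text/x-sextractor-catalogue",
              "ivoat:catalogue"), raw)

# Second choice: being SExtractor output is the key thing.
as_sextractor = process(
    Reference("http://example.org/cat", "application/octet-stream",
              "vo://net.ivoa/vospace/core#sextractor"), raw)
```

In the first case the handler never sees the byte format at all; in the second the decoder does nothing and the format knowledge has to live in the application-level handler, which is the trade-off described above.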

This may be turning into a larger question.  Enough now!

Best wishes,

Norman


-- 
Norman Gray  :  http://nxg.me.uk
School of Physics and Astronomy, University of Glasgow, UK


