STC-S in DataLink
François Bonnarel
francois.bonnarel at astro.unistra.fr
Fri Jun 21 08:33:38 PDT 2013
Hi Markus, hi all,
Apparently my email from yesterday evening didn't appear on the list, so
I restate the little caveat I wrote.
I would say it's not "STC-S in Datalink" but something like "STC-S in
cutout services and access data methods".
Services and methods of this kind will indeed be among the resources
Datalink attaches to Dataproducts,
but according to the discussion at the Heidelberg interop last month,
the Datalink protocol itself only describes
the nature, format, type, and semantics or descriptions of the links and
will say nothing about the resource parameters themselves.
But anyway, "Datalink" is not the core of Markus' stuff.
You are right, Markus, that full management of all STC subtleties
from a string expression can be very hard and should be banished from
our interfaces.
But, as Pat said, we don't have a real STC-S standard yet, and, as Arnold
stated, we can limit the scope not only at the level of the STC-S definition
but also at the level of the service capabilities.
Using some kind of STC-S for defining regions of interest, cutout
shapes, and positions in a predefined coordinate system is probably
useful, because it will give direct model-based standardization of the
parameters.
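
To make the idea of a scope-limited STC-S concrete, here is a minimal
sketch of a region parser that accepts only a shape keyword, one
predefined frame, and plain numbers. The function name parse_region, the
choice of Circle and Box as the only shapes, and ICRS as the only frame
are assumptions made for illustration; they are not taken from any
STC-S draft.

```python
# Hypothetical restricted subset: shape keyword, fixed frame, plain numbers.
# Arities follow the examples quoted in this thread
# ("Circle ICRS 163 57.5 1" and "Box ICRS 1 2 3 4").
SHAPE_ARITY = {"Circle": 3, "Box": 4}
ALLOWED_FRAMES = {"ICRS"}


def parse_region(text):
    """Parse a restricted region string into (shape, frame, numbers).

    Anything outside the subset (other shapes, other frames, units,
    extra sub-phrases) is rejected with ValueError.
    """
    tokens = text.split()
    if len(tokens) < 2:
        raise ValueError("expected '<shape> <frame> <numbers...>'")
    shape, frame, *numbers = tokens
    if shape not in SHAPE_ARITY:
        raise ValueError(f"unsupported shape: {shape}")
    if frame not in ALLOWED_FRAMES:
        raise ValueError(f"unsupported frame: {frame}")
    if len(numbers) != SHAPE_ARITY[shape]:
        raise ValueError(
            f"{shape} takes {SHAPE_ARITY[shape]} numbers, got {len(numbers)}"
        )
    # float() raises ValueError for non-numeric tokens such as 'unit'.
    return shape, frame, tuple(float(n) for n in numbers)


if __name__ == "__main__":
    print(parse_region("Circle ICRS 163 57.5 1"))
```

A service restricted in this way could advertise its capabilities simply
by listing the shapes and frames it accepts.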
On the other hand (but this was not your point), I don't think
using STC-S for something more complex than footprints (Regions,
Intervals, AstroCoordArea) is a good idea in the query response. STC
Utypes could be better there (when I say more complex than footprints, I
am thinking of WCS, for example).
Best regards
François
On 20/06/2013 16:46, Markus Demleitner wrote:
> Dear DAL group,
>
> Since this is going to be a long mail, I feel obliged to start with
> an
>
> Abstract
>
> In the context of DataLink, there is renewed interest in STC-S on a
> protocol level, i.e., for passing shape descriptions into services.
> I believe we should not do this.
>
> In a first part of this (longish) mail, I try to make this point.
> There's a second part below in which I try to outline how STC-S *has*
> worked for me, and under which conditions; that part is basically
> something like "If you agree with me, encourage me, and I'll work on an
> STC-S working draft with actual EBNF."
>
> Sorry for this long soliloquy, but I am very sure this is an
> important point for the later interoperability of DataLink-based
> services.
>
>
> Part I: STC-S in Protocols
>
> I'm arguing against usages like "CUTOUT=Ellipse ICRS 33 45 4 5 unit mas
> SpectralInterval TOPOCENTER 55 65 unit MeV pixelSize 1" in protocol
> parameters, and actually against abusing STC-S in cases where you just
> want to define some shape in spherical geometry. Here are my reasons:
>
> (1) Mashing data and metadata is like denormalizing databases:
> You may get away with it and save some work, but if you've not
> understood exactly what you're doing, you'll almost certainly regret it.
>
> (2) STC-S covers an enormous wealth of features. Even suggesting that
> all services should be able to transform, e.g., wavelength into the rest
> system is not going to be a high incentive to make people take up a
> standard. Don't wave at "there's going to be a library" -- there's not.
> Many years after STC-S was published and the STC DM passed, all we have
> are some libraries that can -- more or less -- parse STC-S and spit the
> stuff out in some other form.
>
> Actually *doing* something with what's parsed is something completely
> different. In my STC library I'm allowing some "conforming" (making one
> STC spec use the reference frame, units, and such of another STC spec),
> but exactly because STC is a complex beast, that's no fun at all. That
> was part of the pain I alluded to above, and I'm still ignoring most
> things that actually are complicated (like transforming spatial
> coordinates from the EMBARYCENTER reference position to the PLUTO
> reference position).
>
> But worse (just for this example) -- if you want to transform positions
> for differing reference positions, you need to know the source's
> distance, and it's completely unclear how to do that for images.
> Transforming spectral coordinates to, say, the observer's frame, you
> need to know the source's redshift -- which is something I don't know
> for the majority of the spectra in my database. And what should happen
> for, say, the Lyman forests in quasars, where there are sources with lots
> of redshifts?
>
> And now start to imagine the wealth of decisions facing your code when
> people come in with CART3 coordinates, some of which are perfectly good
> to define regions in SPHER2 or SPHER3. Reject them? Process them?
> Only when you're dead sure you're not misunderstanding what people pass
> your service?
>
> This kind of thing goes on and on and on, just because there's so many
> features in STC, and, to make things worse, most of them are optional
> (for which there are good reasons, but again you have a combinatorial
> explosion of what data you actually have available for your transform).
>
> (3) After that, it's clear that no given service will support all of
> STC-S. To reliably operate such a service, a client would have to
> discover the extent of that support (can it do coordinate
> transformations? which frames? which reference positions? can it apply
> proper motions? will it include errors? those I specify or those in
> the data? does it care about timeframes? will it transform my spectral
> intervals? etc. pp). I've thought a bit about what such an "STC-S
> capabilities" record could look like, and I've come to the conclusion
> that drawing up such a thing requires a greater mind than mine if the
> result is supposed to work reasonably simply.
>
> So: To use STC-S we need an STC capabilities record, and defining such a
> record in a way that is at once comprehensive, useful, and usable
> appears, to me, hair-raisingly close to impossible.
>
>
> Part Ia: What I suggest instead
>
> This requires a short excursion: I strongly believe we should stop
> lying. We're currently lying when we, as in current SSAP, say something
> like <PARAM name="INPUT:BAND" datatype="double" unit="m"...> in the
> service metadata. That's a lie because if you actually pass in a double
> ("1e-7"), you'll likely get back an empty result.
>
> What clients are expected to pass in is (for most services) something
> like "1e-7/", which clearly is *not* a double literal. The SSAP spec
> even suggests something like "1e-7/2e-7,5e-7/6e-7;REST" could work --
> now feed that to your favourite programming system's float parser (of
> course, there aren't terribly many servers that actually support this
> kind of thing, either).
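
The mismatch between the declared datatype and the accepted syntax can
be checked directly. The following is a minimal sketch, using Python's
built-in float() as a stand-in for "your favourite programming system's
float parser"; the helper name parses_as_double is made up for this
illustration.

```python
def parses_as_double(literal):
    """Return True if Python's float parser accepts the literal."""
    try:
        float(literal)
        return True
    except ValueError:
        return False


# The three strings discussed above: a real double, an open-ended
# SSAP-style range, and a range list with a qualifier.
samples = ["1e-7", "1e-7/", "1e-7/2e-7,5e-7/6e-7;REST"]
results = {s: parses_as_double(s) for s in samples}

if __name__ == "__main__":
    for literal, ok in results.items():
        print(f"{literal!r}: {'double' if ok else 'not a double'}")
```

Only the first string is a double literal; a client that trusts the
declared datatype="double" has no way to know that the other two forms
are what the service actually expects.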
>
> There's the old saying: "If you lie to a computer, it will catch you".
> Case in point: An SSA client effectively has no idea what syntaxes and
> features a given service will support, which makes non-trivial all-VO SSA
> queries pretty much a gamble. This is even worse when it comes to
> custom parameters; check out LOG_G support in theoretical spectral
> services for a taste of why I am ranting here.
>
> It turns out that most implementors in the real VO (not me, though, so
> far, but I'll change that), when they had custom float parameters,
> chose to define pairs of LOG_G_MIN and LOG_G_MAX. Looks a bit evil at
> first glance, but it's actually close to perfect -- except you can
> only specify one range, but I claim that's a good deal for no longer
> having to lie, and whoever needs multiple intervals and similarly
> complex stuff should be using ObsTAP anyway. Future specifications, I
> maintain, should follow suit: There are only "atomic" parameters, using
> "structured" names (I'm open to discussion on whether machines should be
> allowed to parse the names to figure out that LOG_G_MIN and
> LOG_G_MAX have a certain relationship: I think yes, but I also think
> metadata responses should group them).
>
> For what we've seen as STC-S usages, I therefore suggest getting the
> cutout region into the service using parameters like POS_RA_MIN,
> POS_RA_MAX, POS_DEC_MIN, POS_DEC_MAX. If a service insists, it can have
> POS_FRAME and must then, in a metadata PARAM VALUES child (or equivalent
> if you insist not to use VOTable), let the client know which frames it
> understands (but ICRS always is a must outside of solar system studies).
> If people really insist on oddly-shaped regions (I don't think that's a
> good idea, incidentally), you could still say CIRCLE_CENTER_RA and
> friends, and by writing things out like that, you at least get a feeling
> for the amount of implementation work. Again, there's easy discovery of
> features supported for clients for free.
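
As a rough illustration of the atomic-parameter approach, here is a
minimal sketch of how a service might read a cutout box from POS_RA_MIN,
POS_RA_MAX, POS_DEC_MIN, and POS_DEC_MAX, with an optional POS_FRAME.
The names Box and parse_box, the default frame, the assumption that all
values are in degrees, and the decision not to enforce an ordering on
right ascension (so that boxes crossing RA 0 remain expressible) are
illustrative choices, not something defined by DataLink.

```python
from dataclasses import dataclass

BOX_PARAMS = ("POS_RA_MIN", "POS_RA_MAX", "POS_DEC_MIN", "POS_DEC_MAX")
# Hypothetical: the frames this service would declare in its metadata.
SUPPORTED_FRAMES = ("ICRS",)


@dataclass(frozen=True)
class Box:
    ra_min: float
    ra_max: float
    dec_min: float
    dec_max: float
    frame: str


def parse_box(params):
    """Build a Box from a dict of request parameters (string values).

    Every parameter is a plain float literal, so the declared datatype
    matches what the service really accepts.
    """
    missing = [name for name in BOX_PARAMS if name not in params]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    # float() raises ValueError for anything that is not a float literal.
    ra_min, ra_max, dec_min, dec_max = (float(params[n]) for n in BOX_PARAMS)
    frame = params.get("POS_FRAME", "ICRS")
    if frame not in SUPPORTED_FRAMES:
        raise ValueError(f"unsupported POS_FRAME: {frame}")
    if not (-90.0 <= dec_min <= dec_max <= 90.0):
        raise ValueError("need -90 <= POS_DEC_MIN <= POS_DEC_MAX <= 90")
    return Box(ra_min, ra_max, dec_min, dec_max, frame)


if __name__ == "__main__":
    print(parse_box({"POS_RA_MIN": "162", "POS_RA_MAX": "164",
                     "POS_DEC_MIN": "56.5", "POS_DEC_MAX": "58.5"}))
```

Each parameter can be validated on its own, and the list of accepted
names and frames doubles as the capability description a client needs.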
>
> Several such parameter names should probably be predefined in DataLink,
> to the extent of the subset of STC-S we'd be willing to support *in all
> services*. A funky service that can, say, apply proper motions, could
> still add POS_EPOCH and give a sensible description in its metadata
> response (or datalink document), and a client can at least validate user
> input against that (and maybe even make out what that is from its UCD).
>
> The data model of those input parameters is fairly simple, so UCDs (and
> possibly grouping) should do as metadata to allow clients semantically
> sane and helpful user interfaces. Or do the even righter thing and
> write VO-DML, which would let you mark up where all your parameters are
> in a data model (my take: overkill for this purpose, mainly because most
> of the stuff that's actually requiring proper descriptions will probably
> happen outside of the data model).
>
>
>
> Part II: What about STC-S then?
>
> There are two uses of STC-S in DaCHS (GAVO's data center software,
> http://soft.g-vo.org) I actually like:
>
> (1) STC coverage (resource profile) for registry purposes. A resource
> description could thus say something like:
>
> <meta name="coverage.profile">
> TimeInterval TT 1995-06-03T10:30:48 1998-01-12T01:41:56
> Circle ICRS 163 57.5 1
> SpectralInterval TOPOCENTER 1.318 1.446 unit MHz
> </meta>
>
> This stuff is then turned into STC-X when resource records are
> requested, which works fairly well. Even there, STC-S is, really, much
> too powerful, though, since the registries at the other end (would) have
> to do something with this metadata. Let's ignore for a second all the
> strife about spatial specifications: If you're a registry and you harvest
> the STC-X equivalent of "SpectralInterval TOPOCENTER 1.318 1.446 unit
> MHz" -- what do you do with it? To make this kind of thing useful,
> you'd need to put it into a table next to, maybe, "SpectralInterval
> PLUTO 1 2 unit m". Requiring the registries to perform the magic
> required to bring the two specifications to a common reference position
> (which, incidentally, is advanced divination in this case since the
> registry has no way of knowing what TOPOCENTER really refers to) is an
> invitation to continue the current state (almost all searchable
> registries have no STC support apart from waveband).
>
> Still, the registry could define a subset of "permitted features" of STC
> (only ICRS, only BARYCENTER refpos if people care about Refposses at
> all, only Union and PositionInterval allowed, etc), and STC-S would still
> be useful to input the data.
>
> (2) Defining STC metadata
>
> For this, I've made an extension to STC-S that allows column references.
> Then, in the metadata declaration, you say something like
>
> <stc>
> Time TT "Date"
> Position ICRS CART3 Epoch J2010 "alpha" "delta" "distance"
> Velocity "mualpha" "mudelta" "radialvelocity"
> Redshift OPTICAL "z"
> </stc>
>
> or (this is for SSAP):
>
> <stc>
> Time TT "ssa_dateObs" Size "ssa_timeExt"
> Position ICRS [ssa_location] Size "ssa_aperture" "ssa_aperture"
> SpectralInterval "ssa_specstart" "ssa_specend"
> Spectral "ssa_specmid" Size "ssa_specext"
> </stc>
>
>
> These then get translated into VOTable STC declarations
> (http://www.ivoa.net/Documents/Notes/VOTableSTC/) -- and here, I'd say
> we can be generous with the features. On the client side, it's much
> easier to communicate "I don't understand that particular feature of the
> metadata description" or just "Here's what STC metadata I have -- now,
> dear astronomer, make sense of that yourself". Indeed, I had to
> extend my "private" STC-S with the concepts of epoch and planetary
> ephemeris, and to allow automatic error estimates I'd yet need the
> concept of the mean epoch.
>
> So -- when all you want is a structured description that is basically
> directed at a scientist, STC's wealth of features is just fine (I'd even
> advocate some additions). But note again that the recipient here is not
> (really) a program, it's a human that can decide what to do and how much
> effort should go into bringing some data together.
>
>
> My conclusion: Whenever you actually deal with STC instances, and you're
> ready to do so (taking into account that nobody so far can do fancy
> computations with a significant subset of it), STC-S has a place as a
> convenient language to input them (as opposed to, e.g., STC-X or their
> VOTable serialization, both of which you *really* don't want to type).
> This -- and not the use in protocols, for which full STC is far too
> heavyweight and prescribing systems, units, and such makes much more
> sense -- is the niche I see for STC-S.
>
>
> While I'm speaking, I've not been too happy with the combination of
> positional and keyword+positional elements in current STC-S (Quick:
> which of the following two STC-S specifications is valid (only one
> answer possible):
>
> (1) Position ICRS unit m pixsize 1 2
> (2) Position ICRS pixsize 1 2 unit m
> )
>
> I'd therefore like to suggest that we should relax some of those
> constraints and probably move everything to keyword/value except what's
> already been used in actual protocols and clients; I'd expect that's
> only the reference system, so we'd be fine as long as stuff like
>
> Box ICRS 1 2 3 4
>
> would remain valid STC-S.
>
>
> And here's now my offer: I'd write up EBNF and accompanying prose for
> something that's "pretty much" like STC-S according to the current note,
> leaving existing usages of STC-S intact and simplifying the remaining
> rules to, e.g., allow both (1) and (2). I'd have it ready for Hawaii,
> complete with an implementation that at least can move the stuff to
> STC-X and VOTable utypes.
>
> Both encouragement and, erm, well, let's say discouragement are welcome
> (since this definitely would not be a standard I'd enjoy writing, I'd
> actually appreciate the latter a bit more...).
>
> Cheers,
>
> Markus
>