Time Series Cube DM - IVOA Note

Sun Mar 19 21:38:50 CET 2017

Hi Markus, all,

my reactions to your remarks are below.

>
> I'll take the liberty of illustrating what I'm proposing taking up an
> example from Jiri's Time Series serialisation proposal.  His basic
> annotation looks like this:
>
>   <GROUP id="timeseries" vodml="ndcube:TimeSeriesCube">
>                 <GROUP id="independent_axes" vodml="ndcube:CubeAxis">
>                         <GROUP name="dateTimeAxis" vodml="ndcube:CubeAxis">
>                                 <FIELDref ref="HJD" id="field"/>
>                                 <GROUPref ref="datestc" id="model"
> vodml="VODML Model"/>
>                         </GROUP>
>     ...
>                 <GROUP id="dependent_axes" vodml="ndcube:CubeAxis">
>                         <GROUP name="fluxAxis" vodml="CubeAxis">
>                                 <FIELDref ref="FLX" id="field"/>
>                                 <FIELDref ref="FLXERR" id="error"/>
>                                 <GROUPref id="model" vodml="VODML Model"/>
>                         </GROUP>
>                         <GROUP name="magnitudeAxis"
> vodml="ndcube:CubeAxis">
>                                 <FIELDref ref="MAG" id="field"/>
>                                 <FIELDref ref="MAGERR" id="error"/>
>                                 <GROUPref id="model" vodml="VODML Model"/>
>                         </GROUP>
>                 </GROUP>
>
> -- essentially, this says "there's a (sparse) data cube somewhere in
> this data set that has time as the independent axis and FLX and MAG
> as observables".  Plus, there are references to additional metadata,
> and the thing groups values and errors together.
>
> Leaving aside that this is invalid XML (you can't have multiple
> elements with the same @id; these should have been role annotations),
>

Yes, I agree, these are indeed the roles of the GROUPrefs, not their
identifiers.

> I'm convinced it's wrong to model something like "value with error"
> separately in each DM.  I also don't think it's helpful to have a
> reference to *the* DM annotation for an axis somewhere -- there can
> always be multiple annotations (e.g., photometry-1.0, photometry-2.1,
> and provenance) on a given thing in the VO-DML world.  If I had to
> name a single killer feature of VO-DML, that's what I'd name.
>

I completely agree that "value and error" should be modeled in a
separate model that any data representation can reuse. The reason
I didn't use Quantity is that even the Basic Quantity in the Quantity DM
<http://wiki.ivoa.net/internal/IVOA/IvoaDataModel/qty23.pdf> contains
frames and CoordSys, which obliges it to know about every kind of CoordSys
you can think of.

Why a simple quantity needs to know about the whole universe is a
different question, but that is why I didn't use it.

> Now, here's what I'd like such an annotation to look like.  I'm using
> vodml-type and -role as attributes rather than elements here for
> readability, and I'm interspersing the comments; please allow me to
> indulge in improvising class and attribute names for the time being.
> I hope if they don't match current models they should at least
> readily map to them:
>
>
> ================= Dataset annotation =======================
> <GROUP vodml-type="ds:Dataset">
>   <PARAM vodml-role="dataproductType" value="Timeseries"/>
>   <PARAM vodml-role="publisherDID" value="ivo://example.org/prod?ts0000"/>
>   <GROUP vodml-type="ds:BaseTarget" vodml-role="target">
>     <GROUPref vodml-role="position" ref="targetPosition"/>
>     <!-- this is an example of a reference to a complex entity;
>     I believe we should reduce these as much as possible,
>     because they introduce hard dependencies of DMs and will
>     lead to a combinatorial catastrophe if used too much.
>     As long as any stc2 annotation will work here, though,
>     we might still pull it off -->
>   </GROUP>
>   ...
> </GROUP>
> <!-- That's it - no embedding of this, no turning up of the
> attribute names somewhere else.  If it's a dataset, you have a
> GROUP[@vodml-type='ds:Dataset'], and if there's a
> *[@vodml-role='dataproductType'] in there, that's where you figure
> out where to get the dataproduct type (could be a PARAM, PARAMref or
> even FIELDref if you have a metadata table for lots of ds:Datasets)
> -->

> <GROUP vodml-type="stc:Position2D" id="targetPosition">
>   <!-- any STC client can now figure out there's a position here,
>   and it can be referenced from multiple annotations.  It just
>   *happens* that this group works for dataset's target position -->
>   <PARAM vodml-role="c1" value="54.3"/>
>   <PARAM vodml-role="c2" value="-12"/>
>   <GROUP vodml-type="SpaceFrame" vodml-role="Frame">
>     ...
>   </GROUP>
> </GROUP>
>

Hmm, no objections to this, but it should be part of the Dataset DM rather
than the Cube DM, right?

>
> ================== Cube annotation =========================
>
> <GROUP vodml-type="ndcube:Cube">
>    <!-- No reason to have an extra type for time series; that's
>     already defined in ds:Dataset.dataproductType and unlikely to
>     be of relevance to a cube-only client (e.g., a plot program)
>     anyway. -->
>

I must say I largely agree with this. I tried to make the Time Series Cube
DM as generic as possible and ended up with entities that are usable at the
level of the Cube DM, not only the Time Series Cube DM. The only
difference is that I require one independent axis of the data cube to be a
time-derived axis, and that's it.

>   <FIELDref vodml-role="independent-axis" ref="obs_date"/>
>   <!-- that's it; a client just counts
>   *[@vodml-role="independent-axis"] and knows the number of dimensions
>   in the cube.  All additional annotation is on the FIELD itself. -->
>
>   <FIELDref vodml-role="dependent-axis" ref="FLX"/>
>   <FIELDref vodml-role="dependent-axis" ref="MAG"/>
>   <!-- that's it; a single reference defines a "value" in this cube,
>   and any further annotation is on the field itself, where non-cube
>   clients can also use it. -->
> </GROUP>  <!-- I don't think much further metadata is needed here -->
>

I don't like this. Read further.

>
>
> ============== STC+Quantity annotation =====================
>
> <GROUP vodml-type="stc:Time">
>   <!-- the one place STC metadata is collected -->
>   <FIELDref vodml-role="value" ref="obs_date"/>
>   <!-- note how we "amend" metadata on obs_date here; by co-reference
>   with the ndcube:Cube annotation, obs_date is *both* an independent
>   axis *and* a time. -->
>
>   <PARAM vodml-role="timescale" value="TT"/>
>   <PARAM vodml-role="timeformat" value="MJD"/>
>   <!-- timeformat is an invention; STC 1.0 uses classes to
>   distinguish between JD, MJD, "ISO". -->
>   <PARAM vodml-role="referencePosition" value="BARYCENTER"/>
>   ...
> </GROUP>
>

I like this. Read further.

>
> <GROUP vodml-type="ivoa:Quantity">
>   <!-- all measurements (can) have errors, min/max vals, etc, so
>   there's no point separately modelling this in cube, stc,
>   photometry, etc.; let's have ivoa:Quantity for that. -->
>   <FIELDref vodml-role="value" ref="obs_date"/>
>   <FIELDref vodml-role="standard-deviation" ref="err_time"/>
>   <PARAM name="minimum" value="56493.339"/>
>   <PARAM name="maximum" value="56498.341"/>
>   <!-- which also does much of char:, without introducing 1000s of
>   utypes -->
> </GROUP>
>

I kind of like this. The analogy in the Time Series Cube DM as it is now
is:

   - ivoa:Quantity == Cube Axis
   - stc:Time == Axis Domain Model

The only difference is that you reference only the data from the cube,
losing the direct link between the cube and its metadata. The problem is
that we have a Cube dataset, meaning we store only cubes in there, but I
don't want to restrict the whole VOTable so that it mustn't contain a
single PARAM or FIELD element that is not actually referenced from the
Cube DM part.

Given that, when I read the Cube DM part, I have no clue where its
metadata lies. I can go to the *data*, but there I find only metadata about
the data itself, not about the cube that holds the data. To find it, I
still need to scan the whole VOTable, find all Quantity annotations, and
check whether they happen to reference the same FIELDs as my Cube does.
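
The scan described above can be sketched roughly as follows. This is only
an illustration: it assumes the improvised vodml-type/vodml-role attributes
from the quoted annotation, a namespace-free fragment, and a hypothetical
helper name (quantities_for_cube); none of this is an agreed serialisation.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; attribute and type names follow the improvised
# annotation quoted above and are not part of any published standard.
VOTABLE = """
<RESOURCE>
  <GROUP vodml-type="ndcube:Cube">
    <FIELDref vodml-role="independent-axis" ref="obs_date"/>
    <FIELDref vodml-role="dependent-axis" ref="FLX"/>
  </GROUP>
  <GROUP vodml-type="ivoa:Quantity">
    <FIELDref vodml-role="value" ref="obs_date"/>
    <FIELDref vodml-role="standard-deviation" ref="err_time"/>
  </GROUP>
  <GROUP vodml-type="ivoa:Quantity">
    <FIELDref vodml-role="value" ref="FLX"/>
    <FIELDref vodml-role="standard-deviation" ref="FLXERR"/>
  </GROUP>
  <GROUP vodml-type="ivoa:Quantity">
    <PARAM vodml-role="value" value="2.2e-6"/>
  </GROUP>
</RESOURCE>
"""


def quantities_for_cube(root):
    """Scan every GROUP and keep the Quantity groups whose value is a cube axis."""
    groups = list(root.iter("GROUP"))
    cube = next(g for g in groups if g.get("vodml-type") == "ndcube:Cube")
    axis_refs = {ref.get("ref") for ref in cube.findall("FIELDref")}

    matches = {}
    for group in groups:  # no direct link exists, so all groups are inspected
        if group.get("vodml-type") != "ivoa:Quantity":
            continue
        for child in group.findall("FIELDref"):
            if child.get("vodml-role") == "value" and child.get("ref") in axis_refs:
                matches[child.get("ref")] = group
    return matches


root = ET.fromstring(VOTABLE)
matches = quantities_for_cube(root)
print(sorted(matches))
```

The cube itself says nothing about where the Quantity groups are, so the
cost of the lookup grows with the number of GROUPs in the whole document,
including ones that have nothing to do with the cube.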

So I propose to reference the Quantity directly from the independent_axis,
instead of the FIELD. That way the relationship is more logical IMHO. I
read information about the Cube - I know how many independent axes (coords)
and dependent (values) there are. If I want to know the metadata about
those axes (their length, maximum and minimum values, etc.), I go to the
quantity metadata and from there I can go to the actual quantity itself
(the data) along with its own metadata.

This is actually what already is in the Time Series Cube DM if you apply
the substitutions mentioned above.
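
A minimal sketch of that direction of reference, under loudly stated
assumptions: the cube axes point at Quantity groups through a GROUPref, the
groups carry hypothetical IDs (q_obs_date, q_flx), and the attribute names
are again the improvised ones from the quoted annotation. The helper name
resolve_axes is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the cube references Quantity groups, not FIELDs.
VOTABLE = """
<RESOURCE>
  <GROUP vodml-type="ndcube:Cube">
    <GROUPref vodml-role="independent-axis" ref="q_obs_date"/>
    <GROUPref vodml-role="dependent-axis" ref="q_flx"/>
  </GROUP>
  <GROUP ID="q_obs_date" vodml-type="ivoa:Quantity">
    <FIELDref vodml-role="value" ref="obs_date"/>
    <PARAM vodml-role="minimum" value="56493.339"/>
    <PARAM vodml-role="maximum" value="56498.341"/>
  </GROUP>
  <GROUP ID="q_flx" vodml-type="ivoa:Quantity">
    <FIELDref vodml-role="value" ref="FLX"/>
    <FIELDref vodml-role="standard-deviation" ref="FLXERR"/>
  </GROUP>
  <GROUP ID="unrelated" vodml-type="ivoa:Quantity">
    <PARAM vodml-role="value" value="2.2e-6"/>
  </GROUP>
</RESOURCE>
"""


def resolve_axes(root):
    """Follow each axis reference from the cube to its Quantity, then to the data."""
    by_id = {g.get("ID"): g for g in root.iter("GROUP") if g.get("ID")}
    cube = next(g for g in root.iter("GROUP") if g.get("vodml-type") == "ndcube:Cube")

    axes = []
    for axis in cube.findall("GROUPref"):
        quantity = by_id[axis.get("ref")]  # axis metadata (min, max, error, ...)
        roles = {child.get("vodml-role"): child for child in quantity}
        field_ref = roles["value"].get("ref")  # the actual data column
        axes.append((axis.get("vodml-role"), field_ref, roles))
    return axes


axes = resolve_axes(ET.fromstring(VOTABLE))
for role, field_ref, roles in axes:
    print(role, field_ref, sorted(roles))
```

Here the path is cube, then axis metadata, then data, and Quantity groups
that the cube does not reference are never touched.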

> ============ Photometry+Quantity annotation ==================
>
> <GROUP vodml-type="phot:PhotometryPoint">
>   <FIELDref vodml-role="value" ref="FLX"/>
>   <GROUP vodml-type="phot:Filter">
>     <PARAM vodml-role="name" value="K_s"/>
>
>     <PARAMref vodml-role="spectralLocation" ref="spec_loc_K_s"/>
>     <!-- note how I'm referencing a param here that's part of an
>     annotation; this way, Photometry (in principle) still doesn't
>     have to know anything about Quantity, but we can still have full
>     Quantity info on the spectral location. -->
>    </GROUP>
>    <GROUP vodml-type="phot:PhotometricSystem">
>     <PARAM vodml-role="description" value="Sloan"/>
>    </GROUP>
> </GROUP>
>
> <GROUP vodml-type="ivoa:Quantity">
>   <!-- it's exactly the same thing as above for obs_date; if
>   a client understands Quantity, it'll need no extra code, no extra
>   utypes to interpret this as well as the obs_date annotation. -->
>   <FIELDref vodml-role="value" ref="FLX"/>
>
>   <FIELDref vodml-role="standard-deviation" ref="FLXERR"/>
>   <!-- here, and not in a custom thing within ndcube or somewhere
>   else, is the connection made between FLX and FLXERR; that way, a
>   Quantity-knowing client can figure out the error, not just one
>   for NDCube -->
>   <PARAM name="minimum"...
> </GROUP>
>
> <GROUP vodml-type="ivoa:Quantity">
>   <!-- By furnishing spec_loc_K_s with Quantity metadata, we can
>   communicate additional information if we have it, even for a PARAM.
>   -->
>   <PARAM id="spec_loc_K_s" vodml-role="value" value="2.2e-6"/>
>   <PARAM vodml-role="minimum" value="1.8e-6"/>
>   <PARAM vodml-role="maximum" value="2.5e-6"/>
> </GROUP>
>
> .... and the same for MAG ...
>
> ============= Field declarations
> <FIELD ID="obs_date" name="obs_date"/>
> <FIELD ID="FLX" name="FLX"/>
> <FIELD ID="FLXERR" name="FLXERR"/>
>
>
Again, I agree with the concept; I don't mind whether we call it Quantity
metadata, Data Axis metadata, or something else. We describe a cube. We describe its
data axes. We describe the physical meaning of the data in those axes.

> If the size of this puts you off: Well, compared to today's FITS
> headers, that's still compact and eminently readable, so I'd not
> worry about this here.
>
> What's really missing at this point as far as standards are concerned
> is, as far as I know, ivoa:Quantity (or perhaps we should have a DM
> of its own?  I suspect Quantity will go through several revisions as
> it starts getting used).
>
> Can anyone summarise the state of Quantity modelling?  I always get a
> bit lost in all the artefacts in volute's dm branch...
>

Agreed. As I mentioned earlier, the Data Model for Quantity
<http://wiki.ivoa.net/internal/IVOA/IvoaDataModel/qty23.pdf> is the only
one I managed to find, and I did not like it.

>
> Cheers (and with apologies if this indeed comes a bit late),
>

Not at all. After reading this, I got the feeling that we don't have
completely different opinions, we just use different terminology. We will
see after our discussions in Strasbourg whether this is true or not.

Cheers,

Jiri