Axes in Obscore

Marco Molinaro molinaro at oats.inaf.it
Thu Apr 23 15:50:05 CEST 2015


Hi Markus, hi all,
if this reply is confusing, it's probably because I'm not completely getting
the points.
So... if you get lost, just discard this reply from your inbox.

2015-04-23 11:57 GMT+02:00 Markus Demleitner <msdemlei at ari.uni-heidelberg.de>:

> Dear DAL, dear DM,
>
> On Thu, Apr 23, 2015 at 09:51:46AM +0200, Marco Molinaro wrote:
> > regarding this topic I have a small use case that comes from a (currently
> > custom) set of services whose aim is to allow velocity spectra analysis
> of
> > galactic FITS cubes.
>
> That's a perfect use case for obscore+datalink, I'd say.
>

sure, it could also fit SIAv2 plus AccessData, I think.
Honestly, I'm quite puzzled about which way to go; I'd vote for providing
both, at least in the long term, even if the currently related service is
already a TAP one, so obscore would be a direct extension.


> > a - a super-set of FITS cubes from non-homogeneous galactic surveys and
> > pointed archives in the radio band is deployed to allow velocity spectra
> > analysis
> > b - the first step for the user is to search in this set for available
> data
> > along a line-of-sight, with possible filtering on a cone around it, or a
> > box around it, limiting the velocity range, selecting explicitly one/more
> > survey(s) by name, species, transition, ...
>
> It seems to me most of the necessary metadata already is in obscore,
> no?
>

this is also right.

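Just to make it concrete, below is a rough sketch (nothing more) of what I
mean by step (b), done as a plain TAP sync query against ivoa.obscore. The
endpoint URL, the cone and the wavelength interval are placeholders, and the
velocity range is assumed to have been converted to em_min/em_max beforehand:

  # Sketch only: discover cubes around a line of sight via TAP/obscore.
  # The service URL, cone centre/radius and wavelength bounds are made up.
  import requests

  TAP_SYNC = "http://example.org/tap/sync"   # hypothetical TAP endpoint

  query = """
  SELECT obs_publisher_did, obs_collection, s_ra, s_dec,
         em_min, em_max, access_url
  FROM ivoa.obscore
  WHERE dataproduct_type = 'cube'
    AND 1 = INTERSECTS(s_region, CIRCLE('ICRS', 83.8, -5.4, 0.5))
    AND em_min < 0.0027 AND em_max > 0.0026
  """

  resp = requests.post(TAP_SYNC, data={
      "REQUEST": "doQuery", "LANG": "ADQL",
      "FORMAT": "votable", "QUERY": query,
  })
  resp.raise_for_status()
  # resp.text now holds the VOTable with the matching cubes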

> > c - the search output (which includes something along the lines of a
> > PublisherDID) is then used to explicitly cut the needed cube(s) to make
> > data transfer affordable (in the near future merging of adjacent
> > "same-survey" cubes will be also implemented)
>
> And here I'd argue that's a Datalink thing.  There's just too many
> sorts of processing one could do on data products to have any
> hopes of describing them in a single database table, and datalink
> lets you do exactly what you're asking for with minimal overhead on
> both the table and the client (it will typically have to request one
> small file per cutout, of course, but given the transfer volumes
> we're talking about here on the data side that's negligible).
>

one small file per cutout call is exactly what the current custom service does.
I'm not saying I won't use Datalink, but I'd like to see whether it's the
only solution.

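For the record, something like the following is roughly what the custom
service does today, rephrased as a Datalink-style interaction; the endpoints
and the cutout parameter names (POS/BAND) are just assumptions for the sake
of the example, not the actual interface:

  # Sketch only: from a discovered obs_publisher_did to a per-dataset cutout.
  # The {links} endpoint, cutout URL and parameter names are placeholders.
  import requests

  LINKS_ENDPOINT = "http://example.org/datalink/links"   # hypothetical
  did = "ivo://example.org/cubes?survey1/cube_0042"       # made-up DID

  # 1. ask the datalink service what can be done with this dataset
  links = requests.get(LINKS_ENDPOINT, params={"ID": did})
  links.raise_for_status()
  # links.text is a VOTable: one row per link plus a descriptor of the
  # cutout service; a real client would parse it and read the cutout
  # access URL and its input parameters from there.

  # 2. call the advertised cutout service (URL/params are placeholders)
  cutout = requests.get("http://example.org/cutout", params={
      "ID": did,
      "POS": "CIRCLE 83.8 -5.4 0.25",     # spatial cut
      "BAND": "0.0026 0.0027",            # spectral/velocity cut, in m
  }, stream=True)
  with open("cutout_0042.fits", "wb") as f:
      for chunk in cutout.iter_content(chunk_size=1 << 16):
          f.write(chunk)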

> Conversely, just having the pixel sizes of the cube (as in the +6
> columns proposal) won't really help you for your use case either, and
> even if that information could, you'd still have to have some
> descriptor of the access service somewhere, and so you'd have to use
> datalink either way.
>
> > The need for WCS information in the output of the search comes from the
> > idea of allowing the client side to build correct cutout queries to the
>
> Well, let me do a general plea here: "Keep data discovery and data
> access separate as much as you can."  Datalink is the model to
> efficiently perform that separation.
>

again, the current custom service does so, but it uses the same parameters
for both; it's only for logical reasons that the endpoints of the two
services will be kept separate, otherwise (in my case) simple parameter
discrimination between the two would do.
You're probably right that my use case is already covered from the DAL
point of view; maybe the only thing not discussed much is the velocity axis
of the cubes.

[cutting...] I think I follow you, but I'm not completely convinced... it
could be that my use case is simply not that complex in these terms.


> I give you, though, that there are open issues from a practical
> perspective.  Mine are:
>
> (1) certain types of queries (e.g., "give me all datasets that have a
> certain axis type in any position") aren't really too well suited
> for going through indices.
>
> (2) there might be major *discovery* use cases that require additional
> information on the axes
>
> On (1), I've already written something in
> http://mail.ivoa.net/pipermail/dm/2015-April/005150.html, which I
> think hasn't been disputed yet.
>

In my case it would be finding datasets that have a velocity axis somewhere
(usually it's the 3rd axis, but in general nothing should change if it sits
in another position).
What I still have to understand is what to put there to identify the
velocity axis.
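
To put it another way, what I would like to be able to write is something
like the sketch below; the per-axis table (ivoa.obscore_axes) and its
columns are completely made up here, it's only meant to show the kind of
constraint I have in mind:

  # Sketch only: find datasets with a velocity axis, whatever its position.
  # The table ivoa.obscore_axes and its columns are hypothetical.
  import requests

  TAP_SYNC = "http://example.org/tap/sync"   # hypothetical TAP endpoint

  query = """
  SELECT oc.obs_publisher_did, ax.axis_index, ax.axis_ucd, ax.axis_unit
  FROM ivoa.obscore AS oc
  JOIN ivoa.obscore_axes AS ax
    ON ax.obs_publisher_did = oc.obs_publisher_did
  WHERE ax.axis_ucd LIKE 'spect.dopplerVeloc%'
  """

  resp = requests.post(TAP_SYNC, data={
      "REQUEST": "doQuery", "LANG": "ADQL",
      "FORMAT": "votable", "QUERY": query,
  })
  # whatever the axis position (3rd or not), something like the UCD above
  # (or a FITS CTYPE such as VRAD/VOPT/VELO-*) is what would flag it as a
  # velocity axis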

> On (2), I can well imagine they exist, but I'd still hope we can avoid
> expanding obscore by 20% to satisfy those.  Let's identify them if
> they're there, shall we?
>
> Let me mount the soapbox once more (I'm done in a few lines): Think
> of our adopters.  Based on what I hear from DaCHS' users and even a
> sizable crowd on this list, I'm convinced that additional fields in
> DMs are being paid for in terms of takeup (not to mention that people
> tend to put junk in fields whose purpose they don't understand).
>
> For the sake of takeup, please don't add fields without a strong,
> validated use case that cannot be sanely satisfied in any other way.
>

I find it difficult to reply to this soapbox buildup, because I have found
myself many times in the dilemma of whether or not to fill in certain fields
(including some SHOULD ones) due to the lack of a description of what was
actually required.

I understand your concern (as I already said in my previous mail) about
widening the set of table fields. I agree we should have use cases, to be
sure we don't leave someone out and then find ourselves in trouble with new
revisions, when they require a major one because of backwards-compatibility
issues (oh, god, breaking backwards compatibility!).

I also think we need clear guidelines if we want takeup to increase while
implementers' resources are limited or already committed elsewhere.

ciao
     Marco