[Radioig] Obscore extension for radio: review, implementation

Mark Kettenis kettenis at jive.eu
Wed Dec 13 13:17:41 CET 2023


> Date: Mon, 11 Dec 2023 13:46:53 +0100
> From: Markus Demleitner <msdemlei at ari.uni-heidelberg.de>
> 
> Dear DM, dear Radio IG,

Hi Markus,

> I've put support for (most of) the 2023-09-29 edition of the obscore
> radio extension into DaCHS (see below) and I've done a test
> publication of such a thing in Heidelberg.  However, I'm still lacking
> good metadata of this sort (even for what radio data I have), so that
> table is empty (though discoverable) at this point.  If you have
> radio data you would like to publish in this way, by all means get in
> touch with me.

As we discussed in Tucson, I still fully intend to implement the bits
of the extension that make sense for VLBI in our TAP service.

> As part of this effort, I have written two sections on operations and
> discovery for the extension.  Please review at
> https://github.com/ivoa-std/ObsCoreExtensionForRadioData/pull/43
> 
> And then, here is some general feedback:
> 
> (1) The longer I think about them, the less I like f_min and f_max.  If
> you look at use case 1.3: "range inside the 1 to 1.5 GHz band" -- and
> then people have to write f_min > 1000 AND f_max < 1500 and thus do
> some conversion anyway, and to the relatively random unit MHz on top.
> Please let's reconsider this; I have sympathies for not wanting to
> write the λ-ν conversions manually, but if
> 
> 	1 = ivo_interval_overlaps(
> 		em_min, em_max,
> 		ivo_specconv(1.5, "GHz", "m"),
> 		ivo_specconv(1, "GHz", "m"))
> 
> doesn't work for you, let's think again and figure out something that's
> less verbose.  But let's not define something parallel to em_min and
> em_max with an even more random unit than m.

That is quite a mouthful... but it does bother me as well to provide
what is essentially the same information twice.  And I agree that the
arbitrary units are problematic (the current draft specifies f_min/max
as using "Mhz" but f_resolution as using "kHz").  We do need to retain
f_resolution though as em_res_power simply varies too much for
low-frequency observations that span a large fractional bandwidth.
Radio observations typically have a fixed frequency resolution (and
therefore varying resolving power) across the band.
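
To make both points concrete, here is a minimal Python sketch.  It is
not part of the draft or of any TAP service: the helper names are made
up for illustration, the overlap test only mimics what I understand
ivo_interval_overlaps to do, and the 30-80 MHz observation with 10 kHz
channels is a hypothetical example rather than real metadata.

```python
# Sketch only: helper names are illustrative, not part of ObsCore or ADQL.
C_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact, SI)


def freq_to_wavelength_m(freq_hz):
    """Convert a frequency in Hz to a vacuum wavelength in metres."""
    return C_M_PER_S / freq_hz


def intervals_overlap(lo1, hi1, lo2, hi2):
    """True if the closed intervals [lo1, hi1] and [lo2, hi2] intersect."""
    return lo1 <= hi2 and lo2 <= hi1


def resolving_power(freq_hz, channel_width_hz):
    """Resolving power R = nu / delta_nu at a given frequency."""
    return freq_hz / channel_width_hz


# The 1 to 1.5 GHz band as a wavelength interval; the higher frequency
# gives the shorter wavelength, so the order of the bounds flips.
band_min_m = freq_to_wavelength_m(1.5e9)
band_max_m = freq_to_wavelength_m(1.0e9)

# Hypothetical low-frequency observation: 30-80 MHz, fixed 10 kHz channels.
r_low = resolving_power(30e6, 10e3)
r_high = resolving_power(80e6, 10e3)
```

With these made-up numbers the resolving power changes by almost a
factor of three between the band edges while the channel width stays
the same, which is why a single em_res_power value describes such an
observation poorly.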

> (2) I have to say I find the use cases relatively unconvincing
> because at this point they take some constraints out of thin air and
> then repeat them three times with slightly different syntaxes.  That's
> not very helpful for working out why anything in the extension is the
> way it is.
> 
> I'd find it a lot more convincing if the use cases stated a scientifically
> meaningful discovery problem -- for instance, *why* would one look for
> a "dataset with a field of view larger than 0.5 degree"?

I agree.  I'm going to propose some improvements here for the subset
of use cases that make sense to me.

> Also, at this point I don't believe in *any* use case involving
> target_name -- we don't have reliable rules for how to write them
> (and likely will never have them for "ordinary" objects).  Don't fool
> people into believing they could do all-VO searches using
> target_name.  Convert all of them into positional queries, because
> that's the only interoperable technique for this kind of thing at
> this point (but of course it's fine and even welcome to mention
> object names in the use case formulation).
> 
> For solar system objects with their fast-changing positions, the
> whole problem might pose itself differently, but then that's more
> epn-tap's problem and arguably doesn't belong here.
> 
> Of course, all the queries will have to be re-done if PR #43 is
> merged, but that's minor, and I'd volunteer.  But before such an
> update, let's have (perhaps fewer but) more meaningful use cases,
> preferably giving, in sum, justification to each and every column in
> the extension.
> 
> (3) As usual, I have utype quibbles; in particular, it looks funny if
> there suddenly are underscore-separated words in there
> ("Provenance.Observation.tracking_mode") where I think all other utypes
> (including some here) are CamelCase.
> 
> Also as usual, I don't think the column utypes here have any
> discernable function -- can't we just drop them altogether?
> I'd be willing to bet that *nothing* negative happens if we do.
> (yeah, we're doing something with the *table* utype, so that's
> different)
> 
> (4) The column descriptions need more work.  Ideally, a non-domain
> expert should be able to figure out what something might roughly be by
> reading the description.  Something like "targeted, alt-azimuth, wobble,
> ...)" (currently on tracking_mode) won't do that for them.
> 
> (5) In general, if you have word lists, think about some way to maintain
> them.  I'm happy to advise from a vocabulary point of view.  If you
> put an ellipsis (...) into the spec, you'll almost certainly be creating
> something that's not validatable and that will likely quickly turn
> into a mess.
> 
> (6) Also in general: Please don't retain commented-out stuff in the
> source.  It makes it hard to read, makes the document history a lot
> harder to follow, and in general it's much preferred to keep obsolete
> items in the VCS history; it's a major reason to do version control
> in the first place.  If what you want is a placeholder for future
> discussions, use \todo (cf. ivoatexDoc).
> 
> (7) The short standard name (ObsCoreExtensionForRadioData) is a lot
> too long.  The Registry has a limit for a resource's shortName of 16
> characters.  I'm now proposing ObsRadio in the utype used for
> discovery, ivo://ivoa.net/std/obsradio#table-1.0.  If nobody is too
> abhorred by that: Can we rename the document source like that, too?
> 
> (8) The columns the DaCHS extension now offers are f_resolution,
> instrument_ant_diameter, instrument_ant_max_dist,
> instrument_ant_min_dist, instrument_ant_number, instrument_feed,
> obs_publisher_did, s_fov_max, s_fov_min, s_maximum_angular_scale,
> s_resolution_max, s_resolution_min, scan_mode, t_exp_max, t_exp_mean,
> t_exp_min, tracking_mode, uv_distance_max, uv_distance_min,
> uv_distribution_ecc, uv_distribution_fill
> 
> (select column_name from tap_schema.columns
> where table_name='ivoa.obs_radio'
> order by column_name)
> 
> This doesn't mean I'm convinced all of them should be in.  But it
> does mean that I'm pretty sure all others should go from table 1; in
> particular, the "via DataLink" things I think are just confusing in
> there.

I raised my concerns about the instrument_* columns during the
InterOp.  Those really are instrument-specific and therefore not very
useful in generic queries one would issue to multiple TAP services.
It may make more sense to provide this information in an
observatory-specific table (see for example the presentation by Greg
Sleap on how the MWA presents information like this).

> Don't get me wrong, to a radio layman, the extra "via DataLink" items
> do look reasonable, but the way to specify them is to have a
> section saying "here's a few extra artefacts that you should provide,
> too, and we suggest that you serve DataLink documents with your data,
> and in there you identify this artefact in that way" (where I suspect
> you'll want to involve <http://www.ivoa.net/datalink/core>).
> 
> Thanks,
> 
>               Markus
> 



