<div dir="ltr"><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">I did participate in the meeting but (due to a major s/w release last week) have </div><div class="gmail_default" style="font-size:small">not had time to follow up (check the notes or post some of my thoughts in more </div><div class="gmail_default" style="font-size:small">detail). So I'll clarify what I meant that was quoted above and add some others </div><div class="gmail_default" style="font-size:small">to the mix.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">> + PD: if your database is natively standards compliant the<br>
> join approach is more normalized and makes sense.</div><div class="gmail_default" style="font-size:small">I think what I meant here is that if someone was creating their database from </div><div class="gmail_default" style="font-size:small">scratch and chose to create an ObsCore table, then having a separate table for </div><div class="gmail_default" style="font-size:small">extensions and expecting to join them makes sense and is the more normalised </div><div class="gmail_default" style="font-size:small">approach. So "natively standards compliant" I think was trying to capture that <br></div><div class="gmail_default" style="font-size:small">one created an ObsCore table and populated it rather than created a "view" in some</div><div class="gmail_default" style="font-size:small">fashion (combination of database view and/or TAP and SIA code to manipulate or </div><div class="gmail_default" style="font-size:small">generate the query).<br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">My understanding is that many data centres would implement ObsCore as a view</div><div class="gmail_default" style="font-size:small">on existing underlying table(s). In the case of CADC (and others) where we use</div><div class="gmail_default" style="font-size:small">the CAOM data model, ObsCore is a well defined view that joins two tables (because</div><div class="gmail_default" style="font-size:small">CAOM is more normalised than ObsCore). 
For anyone who implements</div><div class="gmail_default" style="font-size:small">ObsCore as a view on another model, extension views may well be views on the same </div><div class="gmail_default" style="font-size:small">tables and "ObsCore natural join ObsRadio" could be quite complex underneath. In this </div><div class="gmail_default" style="font-size:small">kind of situation, complete "pre-cooked" views with no joins are more direct and simple,</div><div class="gmail_default" style="font-size:small">but I completely agree that the concept doesn't scale to N extensions and their </div><div class="gmail_default" style="font-size:small">combinations. <br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Just for completeness, the approach that we take with CAOM to make it suitable for </div><div class="gmail_default" style="font-size:small">describing all kinds of data is to have a single model that covers all scenarios. We try </div><div class="gmail_default" style="font-size:small">to minimise the "sparseness" by emphasizing the C(ommon) when designing the model</div><div class="gmail_default" style="font-size:small">and when it evolves (adding new fields) to describe something new. Emphasizing "common"</div><div class="gmail_default" style="font-size:small">means finding ways to satisfy the broadest range of use cases with the minimum number</div><div class="gmail_default" style="font-size:small">of sufficiently generic concepts. The analogy here would be to try to add things directly to</div><div class="gmail_default" style="font-size:small">ObsCore instead of having extensions. CAOM v1 had a smaller "core" and "archive specific</div><div class="gmail_default" style="font-size:small">metadata" (extensions) and that turned out to be a bad idea. 
Of course, a large and potentially</div><div class="gmail_default" style="font-size:small">sparse model (set of tables) means users have to make good use of the "select list" (not just</div><div class="gmail_default" style="font-size:small">"select * from") and it also helps a lot to use "is not null" effectively in queries. There are ways </div><div class="gmail_default" style="font-size:small">to make both of these easier (TBD).<br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">--<br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div>Patrick Dowler<br></div>Canadian Astronomy Data Centre<br></div>Victoria, BC, Canada<br></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 15 Apr 2024 at 00:35, Markus Demleitner via dm <<a href="mailto:dm@ivoa.net">dm@ivoa.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear Colleagues,<br>
<br>
On Fri, Apr 12, 2024 at 10:42:42AM +0200, BONNAREL FRANCOIS via dm wrote:<br>
> You can find the dedicated page for this meeting with presentations and<br>
> Notes here: <a href="https://wiki.ivoa.net/twiki/bin/view/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting" rel="noreferrer" target="_blank">https://wiki.ivoa.net/twiki/bin/view/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting</a><br>
<br>
Sorry I couldn't make it to the meeting, but I do care fairly deeply<br>
about the question of whether we want a pre-cooked obs_radio join or<br>
we want individual extension tables designed to be NATURAL-ly JOIN-ed<br>
together. And I am *very* convinced that the pre-cooked joins are<br>
trouble. On my pain level scale<br>
<<a href="https://blog.g-vo.org/building-consensus.html#scale" rel="noreferrer" target="_blank">https://blog.g-vo.org/building-consensus.html#scale</a>>, they would be<br>
an 8+.<br>
<br>
The main reason is that obs_radio hopefully will not be the last<br>
obscore extension we will design. Once the next extension comes in,<br>
what do we say? That the extensions cannot be used together? I<br>
think that would be a mistake, as the domains of the extensions are<br>
not necessarily orthogonal; this was obviously discussed in the<br>
meeting on the example of time and radio, but I want to emphasise<br>
this once more, because it is really the central point.<br>
<br>
You see, *if* we expect extensions to be used together, you are<br>
either back at the JOIN-s some of you seem to be skeptical about, or<br>
you will end up with 2^n-1 pre-cooked JOIN-s, each joining a subset of<br>
the set of extensions you implement. That would be a horrible<br>
nightmare. Let us not do that.<br>
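<br>
To make the growth concrete, here is a minimal Python sketch that<br>
lists one pre-cooked table per non-empty subset of extensions. The<br>
extension names are made up for illustration; only the radio<br>
extension is actually under discussion here.<br>
<pre>
```python
from itertools import combinations


def precooked_tables(extensions):
    """Return every non-empty subset of extensions (one pre-cooked table each)."""
    subsets = []
    for size in range(1, len(extensions) + 1):
        subsets.extend(combinations(extensions, size))
    return subsets


# Hypothetical extension names, for illustration only.
extensions = ["radio", "time", "polarisation"]
tables = precooked_tables(extensions)

for subset in tables:
    print("obscore + " + " + ".join(subset))
print(len(tables), "pre-cooked tables for", len(extensions), "extensions")
```
</pre>
With three extensions this already lists seven tables, and each<br>
further extension roughly doubles the count.<br>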
<br>
Let me also briefly comment on the points made in this connection on<br>
<a href="https://wiki.ivoa.net/internal/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting/_Radio_IG_7th_Running_meeting.txt" rel="noreferrer" target="_blank">https://wiki.ivoa.net/internal/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting/_Radio_IG_7th_Running_meeting.txt</a>:<br>
<br>
> - registration and how to expose the tables.<br>
> + Separate tables (Markus + ... ) vs. single table (François + ...).<br>
> + tables identified by standardID set via utype on the table in<br>
> the registry --> standardIF for the extenSION table or for the<br>
> extenDED table ?<br>
> + Combining multiple extensions (time+radio) in the same tables<br>
> becomes painful.<br>
<br>
That's my point above, so +1 (at least).<br>
<br>
> + on the other side querying only the extension table makes no<br>
> sense so why imposing a join to users ?<br>
<br>
Because we cannot predict *which* tables users will want to query<br>
together. And don't worry about the mental load imposed by JOIN-s;<br>
it's fairly low on NATURAL JOIN-s to begin with, and then just<br>
provide a few example queries in the document and in your VOSI<br>
examples. People will cope just fine. Most of them start from<br>
examples anyway and then tweak them. Also, whoever has passed rookie<br>
level in ADQL will be loosely familiar with joins: crossmatching<br>
means JOIN-s, and crossmatching, I claim, is the single most popular<br>
use case for doing ADQL in the first place.<br>
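<br>
As an illustration of the kind of example query meant here, a minimal<br>
sketch using Python's sqlite3 module as a stand-in for a TAP service.<br>
The table layouts, the s_resolution_min column and the use of<br>
obs_publisher_did as the shared column are assumptions made for this<br>
example, not the content of the draft.<br>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the core table (only a few columns shown).
    CREATE TABLE obscore (
        obs_publisher_did TEXT PRIMARY KEY, s_ra REAL, s_dec REAL);
    -- Stand-in for a radio extension table (hypothetical column).
    CREATE TABLE obs_radio (
        obs_publisher_did TEXT PRIMARY KEY, s_resolution_min REAL);
    INSERT INTO obscore VALUES
        ('ivo://example/a', 10.0, -5.0),
        ('ivo://example/b', 20.0, 15.0);
    INSERT INTO obs_radio VALUES ('ivo://example/a', 0.5);
""")

# NATURAL JOIN matches rows on the column name the two tables share,
# so the user does not have to spell out a join condition.
rows = conn.execute("""
    SELECT obs_publisher_did, s_ra, s_dec, s_resolution_min
    FROM obscore NATURAL JOIN obs_radio
    ORDER BY obs_publisher_did
""").fetchall()
print(rows)
```
</pre>
Only the dataset that has a row in the extension table comes back.<br>
The SELECT itself is what a user would copy from an example and<br>
tweak.<br>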
<br>
> + the issue has Two aspects; queries and discovery of the tables.<br>
<br>
I only marginally agree that these are two aspects. I claim it turns out<br>
explicit JOIN-s work fine for both queries and discovery, whereas<br>
pre-cooked tables are painful in both cases. I won't elaborate this<br>
here because it seems so obvious to me, but I'm happy to expand on it<br>
on request.<br>
<br>
> + WHICH utype for extended tables : one ad hoc utype<br>
> (standardID) per extended table or multiple utypes on the same<br>
> extended single table<br>
<br>
Please note that neither VODataService nor RegTAP can deal with<br>
multiple table utypes at this point, so if we went for multi-utype,<br>
someone would have to push through changes in both of these<br>
standards, and presumably breaking ones (what's to end up in the<br>
current table.table_utype column?) at that.<br>
<br>
> + PD: if your database is natively standards compliant the<br>
> join approach is more normalized and makes sense.<br>
<br>
I don't think that's strongly related to what your native tables look<br>
like, except perhaps if all you have is a single, "pre-extended"<br>
table. But then defining two simple views on top of that<br>
pre-extended table is a lot less hassle than forcing those with more<br>
complex setups to bother with pre-cooked joins.<br>
<br>
> + But other implementations implement obscore as a view on an<br>
> existing database and making a complete view would be better.<br>
> Joining views is going to be cumbersome. Joining two views<br>
> that are views on the same underlying table is strange and<br>
> would require implementers to remove the join.<br>
<br>
No, absolutely not. Nested joins are not uncommon at all, and the<br>
only people that *may* have to worry about them are the people writing<br>
the database engines. And do not worry: *That* part they have long<br>
worked out. It's no trouble at all.<br>
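<br>
A minimal sketch of that situation, again with sqlite3 as a stand-in<br>
engine and hypothetical table and column names: one pre-extended<br>
table, two plain views on top of it, and a NATURAL JOIN of the two<br>
views.<br>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A single "pre-extended" table holding core and radio columns.
    CREATE TABLE observations (
        obs_publisher_did TEXT PRIMARY KEY,
        s_ra REAL, s_dec REAL, s_resolution_min REAL);
    INSERT INTO observations VALUES
        ('ivo://example/a', 10.0, -5.0, 0.5),
        ('ivo://example/b', 20.0, 15.0, NULL);

    -- Two simple views on the same underlying table.
    CREATE VIEW obscore AS
        SELECT obs_publisher_did, s_ra, s_dec FROM observations;
    CREATE VIEW obs_radio AS
        SELECT obs_publisher_did, s_resolution_min FROM observations
        WHERE s_resolution_min IS NOT NULL;
""")

# Joining the two views needs no special handling by the implementer;
# the engine resolves both views against the underlying table.
joined = conn.execute("""
    SELECT obs_publisher_did, s_ra, s_dec, s_resolution_min
    FROM obscore NATURAL JOIN obs_radio
""").fetchall()
print(joined)
```
</pre>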
<br>
> + in the current draft there are the two versions (single table<br>
> appears in pdf, two tables is there but commented in latex)<br>
<br>
As a general point: Please don't comment out text in<br>
version-controlled material. Delete whatever you want to disappear<br>
(and fork if you want to explore alternatives). It is *much* simpler<br>
to follow the edit processes using VCS tools than to squint at<br>
comment characters. That capability is one of the major reasons to<br>
have standards in VCS in the first place.<br>
<br>
Finally, one last, unrelated point:<br>
<br>
> + FB: user defined functions only used for TAP (not for SIA/DAP interface)<br>
<br>
Even if we want multi-unit in DAP, that is unrelated; you can, of<br>
course, have independent parameters FREQ, WAVELENGTH, and ENERGY (or<br>
perhaps BAND and BAND_UNIT) in such an interface regardless of the<br>
underlying table structure, and if you *really* wanted that kind of<br>
thing, you could have SPECTRAL_OUTPUT_UNIT on top.<br>
<br>
Of course, I believe all of these are terrible ideas, and both the<br>
conversion of user-preferred units to service units (TOPCAT shows how<br>
to do it in a few places) and the presentation of spectral<br>
coordinates are responsibilities of the clients in a well-designed<br>
system. But let's not complicate the f_min/f_max debate with DAP,<br>
which simply is unrelated.<br>
<br>
Thanks,<br>
<br>
Markus<br>
<br>
</blockquote></div>