<div dir="ltr"><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">I did participate in the meeting but (due to a major s/w release last week) have </div><div class="gmail_default" style="font-size:small">not had time to follow up (check the notes or post some of my thoughts in more </div><div class="gmail_default" style="font-size:small">detail). So I'll clarify what I meant that was quoted above and add some others </div><div class="gmail_default" style="font-size:small">to the mix.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">>     + PD: if your database is natively  standards compliant the<br>

>     join approach is more normalized and makes sense.</div><div class="gmail_default" style="font-size:small">I think what I meant here is that if someone was creating their database from </div><div class="gmail_default" style="font-size:small">scratch and chose to create an ObsCore table, then having a separate table for </div><div class="gmail_default" style="font-size:small">extensions and expecting to join them makes sense and is the more normalised </div><div class="gmail_default" style="font-size:small">approach. So "natively standards compliant" I think was trying to capture that <br></div><div class="gmail_default" style="font-size:small">one created an ObsCore table and populated it rather than created a "view" in some</div><div class="gmail_default" style="font-size:small">fashion (combination of database view and/or TAP and SIA code to manipulate or </div><div class="gmail_default" style="font-size:small">generate the query).<br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">My understanding is that many data centres would implement ObsCore as a view</div><div class="gmail_default" style="font-size:small">on existing underlying table(s). In the case of CADC (and others) where we use</div><div class="gmail_default" style="font-size:small">the CAOM data model, ObsCore is a well defined view that joins two tables (because</div><div class="gmail_default" style="font-size:small">CAOM is more normalised than ObsCore). 
For anyone who implements</div><div class="gmail_default" style="font-size:small">ObsCore as a view on another model, extension views may well be views on the same </div><div class="gmail_default" style="font-size:small">tables and "ObsCore natural join ObsRadio" could be quite complex underneath. In this </div><div class="gmail_default" style="font-size:small">kind of situation, complete "pre-cooked" views with no joins are more direct and simple,</div><div class="gmail_default" style="font-size:small">but I completely agree that the concept doesn't scale to N extensions and their </div><div class="gmail_default" style="font-size:small">combinations. <br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Just for completeness, the approach that we take with CAOM to make it suitable for </div><div class="gmail_default" style="font-size:small">describing all kinds of data is to have a single model that covers all scenarios. We try </div><div class="gmail_default" style="font-size:small">to minimise the "sparseness" by emphasizing the C(ommon) when designing the model</div><div class="gmail_default" style="font-size:small">and when it evolves (adding new fields) to describe something new. Emphasizing "common"</div><div class="gmail_default" style="font-size:small">means finding ways to satisfy the broadest range of use cases with the minimum number</div><div class="gmail_default" style="font-size:small">of sufficiently generic concepts. The analogy here would be to try to add things directly to</div><div class="gmail_default" style="font-size:small">ObsCore instead of having extensions. CAOM v1 had a smaller "core" and "archive specific</div><div class="gmail_default" style="font-size:small">metadata" (extensions) and that turned out to be a bad idea. 
Of course, a large and potentially</div><div class="gmail_default" style="font-size:small">sparse model (set of tables) means users have to make good use of the "select list" (not just</div><div class="gmail_default" style="font-size:small">"select * from") and it also helps a lot to use "is not null" effectively in queries. There are ways </div><div class="gmail_default" style="font-size:small">to make both of these easier (TBD).<br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">--<br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div>Patrick Dowler<br></div>Canadian Astronomy Data Centre<br></div>Victoria, BC, Canada<br></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 15 Apr 2024 at 00:35, Markus Demleitner via dm <<a href="mailto:dm@ivoa.net">dm@ivoa.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear Colleagues,<br>

<br>

On Fri, Apr 12, 2024 at 10:42:42AM +0200, BONNAREL FRANCOIS via dm wrote:<br>

> You can find the dedicated page for this meeting with presentations and<br>

> Notes here: <a href="https://wiki.ivoa.net/twiki/bin/view/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting" rel="noreferrer" target="_blank">https://wiki.ivoa.net/twiki/bin/view/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting</a><br>

<br>

Sorry I couldn't make it to the meeting, but I do care fairly deeply<br>

about the question of whether we want a pre-cooked obs_radio join or<br>

we want individual extension tables designed to be NATURAL-ly JOIN-ed<br>

together.  And I am *very* convinced that the pre-cooked joins are<br>

trouble.  On my pain level scale<br>

<<a href="https://blog.g-vo.org/building-consensus.html#scale" rel="noreferrer" target="_blank">https://blog.g-vo.org/building-consensus.html#scale</a>>, they would be<br>

an 8+.<br>

<br>

The main reason is that obs_radio hopefully will not be the last<br>

obscore extension we will design.  Once the next extension comes in,<br>

what do we say?  That the extensions cannot be used together?  I<br>

think that would be a mistake, as the domains of the extensions are<br>

not necessarily orthogonal; this was obviously discussed in the<br>

meeting on the example of time and radio, but I want to emphasise<br>

this once more, because it is really the central point.<br>

<br>

You see, *if* we expect extensions to be used together, you are<br>

either back at the JOIN-s some of you seem to be skeptical about, or<br>

you will end up with 2^(n-1) pre-cooked JOIN-s, each joining a subset of<br>

the set of extensions you implement.  That would be a horrible<br>

nightmare.  Let us not do that.<br>

<br>
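To make the growth concrete, here is a minimal sketch that lists every<br>
pre-cooked table a service would need if each combination of extensions<br>
got its own table.  The extension names and the table-naming pattern are<br>
made up for illustration; with k extensions there are 2**k - 1 non-empty<br>
subsets.<br>
<br>
<pre>
```python
from itertools import combinations

# Hypothetical extension names, used only for illustration.
extensions = ["radio", "time", "polarisation"]


def precooked_tables(names):
    """Return every non-empty subset of the given extensions.

    Each subset would need its own pre-joined table next to plain obscore.
    """
    subsets = []
    for size in range(1, len(names) + 1):
        subsets.extend(combinations(names, size))
    return subsets


tables = precooked_tables(extensions)
for subset in tables:
    # The naming pattern is an assumption, not taken from any draft.
    print("obscore_" + "_".join(subset))
print(len(tables), "pre-cooked tables for", len(extensions), "extensions")
```
</pre>
<br>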

Let me also briefly comment on the points made in this connection on<br>

<a href="https://wiki.ivoa.net/internal/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting/_Radio_IG_7th_Running_meeting.txt" rel="noreferrer" target="_blank">https://wiki.ivoa.net/internal/IVOA/RadioastronomyInterestGroupSeventhVirtualMeeting/_Radio_IG_7th_Running_meeting.txt</a>:<br>

<br>

> - registration and how to expose the tables.<br>

>     + Separate tables (Markus + ... ) vs. single table (François + ...).<br>

>     + tables identified by standardID set via utype on the table in<br>

>     the registry --> standardID for the extenSION table or for the<br>

>     extenDED table ?<br>

>     + Combining multiple extensions (time+radio) in the same tables<br>

>     becomes painful.<br>

<br>

That's my point above, so +1 (at least).<br>

<br>

>     + on the other side querying only the extension table makes no<br>

>     sense so why imposing a join to users ?<br>

<br>

Because we cannot predict *which* tables users will want to query<br>

together.  And don't worry about the mental load imposed by JOIN-s;<br>

it's fairly low on NATURAL JOIN-s to begin with, and then just<br>

provide a few example queries in the document and in your VOSI<br>

examples.  People will cope just fine.  Most of them start from<br>

examples anyway and then tweak them.  Also, whoever has passed rookie<br>

level in ADQL will be loosely familiar with joins: crossmatching<br>

means JOIN-s, and crossmatching, I claim, is the single most popular<br>

use case for doing ADQL in the first place.<br>

<br>
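As a rough idea of what such an example query could look like, here is a<br>
small sketch using SQLite from Python as a stand-in for a TAP service.<br>
The table names obscore and obs_radio (without a schema prefix), the<br>
reduced column sets, and the sample rows are assumptions for<br>
illustration only; the two tables share just the obs_publisher_did<br>
column, which is what NATURAL JOIN matches on.<br>
<br>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two toy tables sharing only obs_publisher_did (assumed layout).
conn.executescript("""
CREATE TABLE obscore (
    obs_publisher_did TEXT PRIMARY KEY,
    dataproduct_type TEXT,
    target_name TEXT
);
CREATE TABLE obs_radio (
    obs_publisher_did TEXT PRIMARY KEY,
    f_min REAL,
    f_max REAL
);
INSERT INTO obscore VALUES ('ivo://example/a', 'cube', 'M31');
INSERT INTO obscore VALUES ('ivo://example/b', 'image', 'M33');
INSERT INTO obs_radio VALUES ('ivo://example/a', 1.40e9, 1.42e9);
""")

# The query a user would adapt: no join condition has to be spelled out.
rows = conn.execute("""
    SELECT obs_publisher_did, target_name, f_min, f_max
    FROM obscore NATURAL JOIN obs_radio
""").fetchall()

for row in rows:
    print(row)
```
</pre>
Only the dataset that has a row in the extension table comes back; the<br>
second obscore row has no radio metadata and drops out of the join.<br>
<br>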

>     + the issue has Two aspects; queries and discovery of the tables.<br>

<br>

Marginally on these being two aspects.   I claim it turns out<br>

explicit JOIN-s work fine for both queries and discovery, whereas<br>

pre-cooked tables are painful in both cases.  I won't elaborate this<br>

here because it seems so obvious to me, but I'm happy to expand on it<br>

on request.<br>

<br>

>     + WHICH utype for extended tables : one ad hoc utype<br>

>     (standardID) per extended table or multiple utypes on the same<br>

>     extended single table<br>

<br>

Please note that neither VODataService nor RegTAP can deal with<br>

multiple table utypes at this point, so if we went for multi-utype,<br>

someone would have to push through changes in both of these<br>

standards, and presumably breaking ones (what's to end up in the<br>

current table.table_utype column?) at that.<br>

<br>

>     + PD: if your database is natively  standards compliant the<br>

>     join approach is more normalized and makes sense.<br>

<br>

I don't think that's strongly related to what your native tables look<br>

like, except perhaps if all you have is a single, "pre-extended"<br>

table.  But then defining two simple views on top of that<br>

pre-extended table is a lot less hassle than forcing those with more<br>

complex setups to bother with pre-cooked joins.<br>

<br>
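A minimal sketch of that situation, again with SQLite from Python<br>
standing in for a real database: one wide "pre-extended" table, two<br>
simple views on top of it, and a NATURAL JOIN of the views.  The table<br>
name observations, its columns, and the sample rows are hypothetical.<br>
<br>
<pre>
```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Hypothetical wide table holding core and radio metadata together.
CREATE TABLE observations (
    did TEXT PRIMARY KEY,
    target TEXT,
    f_min REAL,
    f_max REAL
);
INSERT INTO observations VALUES ('ivo://example/a', 'M31', 1.40e9, 1.42e9);
INSERT INTO observations VALUES ('ivo://example/b', 'M33', NULL, NULL);

-- Two simple views on the same underlying table.
CREATE VIEW obscore AS
    SELECT did AS obs_publisher_did, target AS target_name
    FROM observations;
CREATE VIEW obs_radio AS
    SELECT did AS obs_publisher_did, f_min, f_max
    FROM observations
    WHERE f_min IS NOT NULL;
""")

# Joining the two views is left entirely to the database engine.
joined = conn.execute("""
    SELECT obs_publisher_did, target_name, f_min, f_max
    FROM obscore NATURAL JOIN obs_radio
    ORDER BY obs_publisher_did
""").fetchall()

# The same rows queried directly from the wide table, for comparison.
direct = conn.execute("""
    SELECT did, target, f_min, f_max
    FROM observations
    WHERE f_min IS NOT NULL
    ORDER BY did
""").fetchall()

print(joined == direct)
```
</pre>
<br>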

>     + But other implementations implement obscore as a view on an<br>

>     existing database and making a complete view would be better.<br>

>     Joining views is going to be cumbersome.  Joining two views<br>

>     that are views on the same underlying table is strange and<br>

>     would require implementers to remove the join.<br>

<br>

No, absolutely not.  Nested joins are not uncommon at all, and the<br>

only people that *may* have to worry about them are the people writing<br>

the database engines.  And do not worry: *That* part they have long<br>

worked out.  It's no trouble at all.<br>

<br>

>     + in the current draft there are the two versions (single table<br>

>     appears in pdf, two tables is there but commented in latex)<br>

<br>

As a general point: Please don't comment out text in<br>

version-controlled material.  Delete whatever you want to disappear<br>

(and fork if you want to explore alternatives).  It is *much* simpler<br>

to follow the edit processes using VCS tools than to squint at<br>

comment characters.  That capability is one of the major reasons to<br>

have standards in VCS in the first place.<br>

<br>

Finally, one last, unrelated point:<br>

<br>

>    + FB: user defined functions only used for TAP (not for SIA/DAP interface)<br>

<br>

Even if we want multi-unit in DAP, that is unrelated; you can, of<br>

course, have independent parameters FREQ, WAVELENGTH, and ENERGY (or<br>

perhaps BAND and BAND_UNIT) in such an interface regardless of the<br>

underlying table structure, and if you *really* wanted that kind of<br>

thing, you could have SPECTRAL_OUTPUT_UNIT on top.<br>

<br>

Of course, I believe all of these are terrible ideas, and both the<br>

conversion of user-preferred units to service units (TOPCAT shows how<br>

to do it in a few places) and the presentation of spectral<br>

coordinates are responsibilities of the clients in a well-designed<br>

system.  But let's not complicate the f_min/f_max debate with DAP,<br>

which simply is unrelated.<br>

<br>

Thanks,<br>

<br>

           Markus<br>

<br>

</blockquote></div>