ADQL 2.1: Preferred crossmatch syntax

Marco Molinaro molinaro at oats.inaf.it
Mon Nov 6 13:57:40 CET 2017


Hi Mark, hi all,
I second your proposal and Markus comment on arguments
order.

I may agree with François-Xavier, but would you use
CONTAINS or DISTANCE in that case?

I also agree that Laszlo's input is too much for ADQL-2.1,
especially at this stage. I suggest Laszlo to join the DAL list
and keep us alerted.

Cheers
    Marco


2017-11-05 11:40 GMT+01:00 Mark Taylor <m.b.taylor at bristol.ac.uk>:

> Laszlo,
>
> thank you for this input.  It might be a promising direction for
> future major revisions of ADQL, but since I'm no kind of database
> expert, I'm not qualified to say.
>
> However, it's addressing a somewhat different problem than I've
> raised in this thread, which is proposing one of the several
> existing equivalent options within the existing ADQL 2.1 syntax
> to use as a preferred form for a common type of crossmatch.
> Your suggestion would be a significant new addition to the ADQL
> syntax which I doubt (? unless the authors disagree) is
> appropriate for a minor ADQL update, especially near the end of the
> revision process.
>
> I'll leave it to others to consider whether this looks like
> something worth pursuing for ADQL v3.
>
> Best wishes,
>
> Mark
>
> On Sat, 4 Nov 2017, Dobos, Laszlo wrote:
>
> > Hi guys,
> >
> > I'm not on the DAL list but Tamas Budavari forwarded the mail to me so
> let
> > me chip in. In the new skyquery, we put the cross-match definition into
> the
> > FROM clause but not as a list of joined tables but rather something like
> a
> > table-valued function. Putting it into the FROM clause is much better
> than
> > in the WHERE clause, not because it's much easier to parse but because if
> > it's in the WHERE clause, it's quite hard to sort the other conditions
> out
> > if there's complex boolean algebra allowed in the clause. If xmatching
> has
> > some special, custom syntax, then the possibility of writing queries that
> > cannot be executed can be minimized. Also, putting it into the xmatch
> clause
> > makes query rewriting much easier because you simply replace the xmatch
> part
> > with the output of the xmatch operation and leave all other joins in
> place
> > to execute the final query.
> >
> > Here's an example:
> >
> > SELECT s.objid, s.ra, s.dec, g.objid, g.ra, g.dec, x.ra, x.dec
> > INTO twowayxmatch
> > FROM XMATCH
> >      (MUST EXIST IN SDSSDR7:PhotoObjAll AS s WITH(POINT(s.ra, s.dec),
> > ERROR(0.1, 0.1, 0.1)),
> >       MUST EXIST IN GALEXGR6:PhotoObjAll AS g WITH(POINT(g.ra, g.dec),
> > ERROR(0.2, 0.2, 0.2)),
> >       LIMIT BAYESFACTOR TO 1e3) AS x
> > WHERE s.ra BETWEEN 0 AND 5 AND s.dec BETWEEN 0 AND 5
> >     AND g.ra BETWEEN 0 AND 5 AND g.dec BETWEEN 0 AND 5
> >
> > The MUST EXIST will soon be extended to MAY EXIST and MUST NOT EXIST
> > (drop-outs, but for those, you need the footprints). This syntax is quite
> > extensible, because you can add more catalog-specific options to the
> WITH()
> > part. For instance, we could support extended source matching by
> replacing
> > POINT with ELLIPSE, etc. LIMIT BAYESFACTOR could be replaced with LIMIT
> > SEPARATION etc. to use a different algorithm.
> >
> > We also support region-based filtering with the REGION clause. This is
> > intentionally not on defined on a per-table basis to avoid query plan
> > complexities.
> >
> > SELECT s.objid, s.ra, s.dec, g.objid, g.ra, g.dec, x.ra, x.dec
> > INTO twowayxmatch
> > FROM XMATCH
> >      (MUST EXIST IN SDSSDR7:PhotoObjAll AS s WITH(POINT(s.ra, s.dec),
> > ERROR(0.1, 0.1, 0.1)),
> >       MUST EXIST IN GALEXGR6:PhotoObjAll AS g WITH(POINT(g.ra, g.dec),
> > ERROR(0.2, 0.2, 0.2)),
> >       LIMIT BAYESFACTOR TO 1e3) AS x
> > REGION 'CIRCLE J2000 0 0 5'
> >
> > You can try it here: http://test.voservices.net/skyquery . It doesn't
> > support adql yet but I'm working on that.
> >
> > -Laszlo
> >
> > -----Original Message-----
> > From: dal-bounces at ivoa.net [mailto:dal-bounces at ivoa.net] On Behalf Of
> Mark
> > Taylor
> > Sent: Friday, November 3, 2017 5:27 PM
> > To: Francois-Xavier PINEAU <francois-xavier.pineau at astro.unistra.fr>
> > Cc: dal at ivoa.net
> > Subject: Re: ADQL 2.1: Preferred crossmatch syntax
> >
> >
> > On Fri, 3 Nov 2017, Francois-Xavier PINEAU wrote:
> >
> > > Dear Mark and DAL,
> > >
> > > Although I agree with the approach, there is one point that bothers me.
> > >
> > > I know that not everyone agrees on the matter, but I personally prefer
> > > to declare the cross-match condition in the JOIN (like in your
> > > InteropOct2015DAL
> > > presentation) rather than in the WHERE.
> > >
> > > In fact, in relational algebra, I prefer to see the cross-match as a
> > > theta-join, not as a selection on the result of a cross-product.
> >
> > OK, I don't really have an opinion on that, I never really know about
> what
> > syntax is best for joins, or whether it makes a difference.
> > [And I'm afraid I wouldn't know a theta-join if one trod on my foot]
> >
> > Is it likely to make a difference in performance terms which of these two
> > ways it's done?  I.e. do we need to specify here what's the preferred
> way to
> > use a constraint like DISTANCE()<threshold, or is this something we can
> > harmlessly leave to the taste of the query author?
> >
> > Mark
> >
> > --
> > Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> > m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
> >
> >
> >
>
> --
> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
> m.b.taylor at bris.ac.uk +44-117-9288776  http://www.star.bris.ac.uk/~mbt/
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ivoa.net/pipermail/dal/attachments/20171106/ab8d1f09/attachment.html>


More information about the dal mailing list