Obscore: obs_id not null requirement
m.b.taylor at bristol.ac.uk
Fri Mar 4 12:43:29 CET 2022
Markus et DAL,
I'm not generally in favour of MUST rather than SHOULD constraints
for non-NULL column values, on the grounds that data producers
who don't have sensible values to put in there may end up giving
useless or nonsense values just to comply with the standards,
which doesn't provide value to anybody.
However, it may be that the best answer to Markus's problem is
for me to drop that taplint query, which could be reasonable.
The intention is that taplint doesn't place unreasonable demands
on the services that it validates, and I'm happy to take advice
in cases where that's not happening.
On Fri, 4 Mar 2022, Markus Demleitner wrote:
> Dear DAL, Dear DM,
> [followups suggested to DAL]
> On page 20 for version 1.1, Obscore requires the obs_id column to be
> NOT NULL.
> This has operational consequences, for instance because Mark Taylor's
> stilts taplint duly tries to verify this, issuing a query like
> SELECT TOP 1 obs_id FROM ivoa.ObsCore WHERE obs_id IS NULL
> Since obs_id is rather meaningless for almost all of the tables
> contributing to my obscore view, and there aren't any "recommended"
> queries using it, I don't have an index on that column. Since (I
> think) there are no obs_id NULLs, this means it has to sequentially
> scan all the tables, hitting close to 100 million rows.
> Of course, I could add such an index, but since ~50 tables
> contribute, that would be significant work for both me and the
> computer (at least postgres can't have indexes on views). I am hence
> wondering: What is the purpose of the non-null constraint on obs_id?
> Is there actually a use case that exploits that, any known software
> that would break with obs_id NULL?
> Or, really: Can we drop that constraint?
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bristol.ac.uk http://www.star.bristol.ac.uk/~mbt/
More information about the dal