RegTAP questions

Markus Demleitner msdemlei at ari.uni-heidelberg.de
Fri Nov 16 14:10:48 PST 2012


Hi Ray,

Thanks for your comments --

On Fri, Nov 16, 2012 at 03:24:55PM -0600, Ray Plante wrote:
> CREATE TABLE sql statements that implement (pretty closely, anyway) 
> the Registry TAP model you proposed.  Folks can find it attached to 
> RI2Discussion 
> (http://wiki.ivoa.net/twiki/bin/view/IVOA/RI2Discussion).  
> 
> 1.  The resource table, while including created and updated 
>     (attributes of the root resource element), does not apparently 
>     include status (whose values can be "active", "inactive", or 
>     "deleted").  Is this intended so to follow the RI-1 specification 
>     that searches only return active records?
Yes -- in the first version, I had a status column, but Theresa
convinced me that there's little point in exposing those to the user.
I tend to agree that they are basically an implementation detail of
OAI-PMH and shouldn't be part of RegTAP.

> 2.  I like the use of hashes as delimiters for string array values as 
>     well as the function ivo_hashlist_has().  I believe, though, your 
>     list of columns is incomplete (e.g. content_type).  Some columns, 
>     though, refer specifically to "comma-separated lists"; shouldn't 
>     we also make these hash-delimited as well?  
Grepping for "comma", I only saw this in the columns' flags.  A fix
for the column description will go in with the next commit; my
implementation already has hashes there.
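
For readers unfamiliar with the hash-delimited convention, here is a
rough Python sketch of what a membership test like ivo_hashlist_has()
amounts to.  It is an illustration only, not the spec's or the
Heidelberg implementation's code; the helper names and the
case-insensitive comparison are assumptions.

```python
def to_hashlist(values):
    """Join words into one hash-delimited string (hypothetical helper)."""
    for value in values:
        if "#" in value:
            # A hash inside a word would make the list ambiguous.
            raise ValueError("list items must not contain '#'")
    return "#".join(values)


def hashlist_has(hashlist, item):
    """Return True if item is one of the '#'-separated words in hashlist.

    Assumption: words are compared case-insensitively.
    """
    if "#" in item:
        raise ValueError("item must not contain '#'")
    words = hashlist.lower().split("#")
    return item.lower() in words


content_type = to_hashlist(["Catalog", "Survey"])
print(content_type)
print(hashlist_has(content_type, "survey"))
```

The point of matching whole words rather than substrings is that a
search for "cat" does not accidentally hit "Catalog".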

> 3.  In the intf_param table, the form column, by its definition, seems 
>     to correspond to interface/param/@use.  Was there a reason for 
>     using a different name?  Was there intention to include a 
>     column for the interface/param/@std? 
Right, form should be use.  Since the utype was bad, too, I wonder
what happened there -- good catch.  This stuff went in in February,
so I can blame neither Paul (too late) nor Theresa (too early).
Weird. "Use" may not be a great column name, but it's not ADQL
reserved or used in another table.  Thus, you're right, the column
should be named "use"; it's updated in the spec as of volute rev 1878.
The Heidelberg implementation will only be updated the week after
next week -- I'll be on vacation next week.  Ha!

The std column exists, albeit with a questionable utype in the
intf_param table.  Did you miss it or do I misunderstand your
question?

> 4.  I may have misunderstood your intent with creator_seq.  I see from 
>     the document, it is part of the resource table and from your 
>     implementation that it is a comma-delimited list of creator names.  
Basically right, except that I'm not prescribing commas (if we dare
prescribe punctuation, I'd strongly prefer a semicolon).  The
intention was to have this column mainly for presentation purposes.
Since I expect creator names to be displayed quite frequently, it
makes sense to not force clients to have to get them from a join, and
of course, order matters for those, so having a formatted version of
them adds information.
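
As a small illustration of such a display string, the following
Python sketch joins creator names in their original order.  The
function name, the "; " separator and the sample names are
assumptions made for the example; no separator is being prescribed
here.

```python
def make_creator_seq(creators, separator="; "):
    """Join creator names, in the order given, into one display string.

    Assumption: creators is already in the order of the resource record.
    Blank entries are dropped and surrounding whitespace is trimmed.
    """
    names = [name.strip() for name in creators if name.strip()]
    return separator.join(names)


# Made-up names, in the order they would appear in the record.
print(make_creator_seq(["Doe, J.", "Smith, A.", "  "]))
```

A semicolon stays unambiguous when names are written as "Last, F.",
which a comma separator would not.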

>     I thought that the intent was to provide a means for indicating 
>     the original order of the creators/authors.  When you mentioned 
>     creator_seq in S.P., I imagined it would be an integer value that 
>     appears in the res_role table which indicates the position in the 
>     subject in the creator list.  This would allow you, for instance, 
>     to search for records based on, say, the first author.  Did you 
>     have a different use case in mind?  
*If* the metadata were good enough to let us identify the first
author reliably, I'd say we should have a first author field in
resource if we wanted to support such searches.  Adding an index to
res_role would be possible as well, of course, though I don't think
anyone will ever look for second or fifth authors.

But alas, there's no telling what's in the author field in the
resource records that are out in the wild.  Therefore, I guess the
best we can do is let people search with some leniency in res_role
(such that it works with both authors properly marked up in
VOResource and the common name soup) and a display string that should
fairly faithfully preserve the RR author's intentions in both cases.
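
What "some leniency" could look like in practice is sketched below
using Python's sqlite3 module.  The table layout (ivoid, base_role,
role_name), the identifiers and the names are all invented for the
example, and case-insensitive substring matching is only one possible
reading of "lenient".

```python
import sqlite3

# In-memory toy table; column names are assumptions, rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE res_role (ivoid TEXT, base_role TEXT, role_name TEXT)")
conn.executemany(
    "INSERT INTO res_role VALUES (?, ?, ?)",
    [
        # Creator properly marked up, one name per row.
        ("ivo://example/a", "creator", "Doe, J."),
        # "Name soup": several authors crammed into one field.
        ("ivo://example/b", "creator",
         "J. Doe, A. Smith and the Example Collaboration"),
        ("ivo://example/c", "creator", "Smith, A."),
    ],
)


def find_by_creator(conn, name_fragment):
    """Return ivoids whose creator entries contain name_fragment.

    SQLite's LIKE is case-insensitive for ASCII, so this matches both
    the properly marked-up rows and the name soup.
    """
    pattern = "%" + name_fragment + "%"
    rows = conn.execute(
        "SELECT DISTINCT ivoid FROM res_role"
        " WHERE base_role = 'creator' AND role_name LIKE ?"
        " ORDER BY ivoid",
        (pattern,),
    )
    return [row[0] for row in rows]


print(find_by_creator(conn, "doe"))
```

Both the marked-up record and the name-soup record come back for
"doe", which is the kind of tolerance the search needs; the price is
that substring matching will also return false positives for short
fragments.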

I'd like that to be defined more strictly (I can only repeat that on
ADS, author searches are the most common), but as I said: With the
metadata we have, it's not going to work reliably.

Thanks,

          Markus

