VOResource v0.8.2

martin hill mch at roe.ac.uk
Fri Sep 26 02:26:45 PDT 2003


I shall take this opportunity to raise my own pet peeve: UCDs alone do not 
describe data well enough for us to compare and combine data across different 
datacenters.  

But the registry does need one (or several) ways of describing data using a 
common 'dictionary' of some sort, and UCDs are the best we've got right now!  It 
may be that not every column needs to be assigned a UCD (what do people think?) 
but the ones that astronomers are likely to query on (the WHERE clause of an SQL 
query) will need them.
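
To make the point about query columns concrete, here is a rough sketch of how a registry lookup by UCD might pick out the column to put in a WHERE clause when each datacenter names it differently. Everything in it is an assumption for illustration: the resource names, the column names, the dictionary layout and the helper functions are all made up, and the UCD strings are only meant to look like UCD1-style identifiers, not to quote the official list.

```python
# Hypothetical registry entries: each resource lists its columns with an
# optional UCD. Resource names, column names and UCD strings are
# illustrative only and are not taken from any real registry.
REGISTRY = {
    "catalog_a": [
        {"name": "RA", "ucd": "POS_EQ_RA_MAIN"},
        {"name": "pmRA", "ucd": "POS_EQ_PMRA"},
    ],
    "catalog_b": [
        {"name": "alpha", "ucd": "POS_EQ_RA_MAIN"},
        {"name": "dRA/dt", "ucd": "POS_EQ_PMRA"},
        {"name": "notes", "ucd": None},  # not every column carries a UCD
    ],
    "catalog_c": [
        {"name": "RA", "ucd": "POS_EQ_RA_MAIN"},
        {"name": "Vmag", "ucd": None},
    ],
}


def find_columns_by_ucd(registry, ucd):
    """Return {resource: [column names]} for columns tagged with `ucd`."""
    matches = {}
    for resource, columns in registry.items():
        names = [col["name"] for col in columns if col["ucd"] == ucd]
        if names:
            matches[resource] = names
    return matches


def where_clause(column, op, value):
    """Build a WHERE clause string using a resource's own column name."""
    return f"WHERE {column} {op} {value}"


if __name__ == "__main__":
    hits = find_columns_by_ucd(REGISTRY, "POS_EQ_PMRA")
    for resource, names in hits.items():
        print(resource, where_clause(names[0], ">", 0.1))
```

Only the queryable columns need a UCD for this kind of lookup to work; the untagged "notes" and "Vmag" columns are simply never matched.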

Quoting "Patricio F. Ortiz" <pfo at star.le.ac.uk>:

> On Fri, 26 Sep 2003, Anita Richards wrote:
> > > > UCDs
> > > > These should be in the Registry if they exist, possibly under
> > > > Documentation or in the section which includes Coverage.  Attempting
> > >
> > > I'm still not so keen on this idea.  For simple tables, perhaps, but for
> > > complex resources such as data archives, or large surveys (I think Alex
> > > needed something like 500 UCDs to describe the SDSS) this will not be very
> > > easy to do, or very useful.  If you want to find catalogs with proper
> > > motions, I bet you can usually do so from the Description text.
> > >
> > > I will take a crack at adding a UCD element into RM.  And then I suspect it
> > > will be one of the fields that no one fills in, at least not if they are
> > > doing manual entry.
> >
> > On the contrary, at least initially many of the entries will describe
> > resources which are already held by CDS in which case the UCDs can be
> > automatically extracted and inserted.  The reason to use Keywords and UCDs
> > is because the data provider's description and column headings can be e.g.
> > Proper motion, proper motions, dRA/dt etc etc. So you would have to have a
> > big translation tool - but, as CDS have already done it and are refining
> > it, we might as well use what exists.
> >
> > But, we both agree, the proof of the pudding will be in the eating!
> >
> > Awaiting the next RSM with interest,
> >
> > Anita
> 
> I fully agree with Anita regarding the usefulness of UCDs in the resource
> discovery arena and in the data federation/merging area.
> 
> The number of UCDs per resource is, of course, proportional to the number of
> different quantities that are listed.
> 
> Using descriptions only, as Anita said, has far more ambiguities than using
> UCDs (including different spellings) and can be quite frustrating. Adding
> one more element to a column element seems to be a small price to pay to
> make finding catalogs much easier, not to mention data merging.
> In an experiment I carried out in the last few months, UCDs proved extremely
> useful to recognize catalogs which would otherwise pass unnoticed by
> scanning their titles and keywords. I agree that assigning UCDs can be
> tedious, but so is writing meaningful column explanations which would
> uniquely lead you to a given quantity. I'm sure CDS has already tools to
> assign UCDs based on a description and perhaps units, and as long as that
> tool is widely available, there should be no excuse not to assign UCDs to new
> columns.
> 
> If we have to drop the UCDs, fine, but let's do it after we test them and
> prove that they are not a useful piece of information. As far as I've seen,
> that's not the case.
> 
> Cheers,
> 
> Patricio
> 
> P.S. in a pseudo registry I wrote for the experiment I mentioned, I
>      incorporated UCDs into it. Finding catalogs in that way is much more
>      accurate and complete than by other quantities.
> 
> ---
> Patricio F. Ortiz			pfo at star.le.ac.uk
> AstroGrid project
> Department of Physics and Astronomy
> University of Leicester			Tel: +44 (0)116 252 2015
> LE1 7RH, UK
> 
> 
> 


-- 
Software Engineer
Astrogrid, ROE (www.astrogrid.org)
Mob: +44 7901 55 24 66
Fax: +44 131 668 82 64


