Terms - Proposals

Mon Dec 13 09:45:11 PST 2004

Tony Linde wrote:

>>>>>database
>
>Maybe we should lose this term - a database is a computing term, a thing
>created under a DBMS which organises data in some regular way (in terms of
>tables in a RDBMS - which is what you buy from Oracle, not a database,
>usually). A physical database might store multiple collections of data and a
>collection might be spread across several databases. It is likely an
>implementation issue of little interest to the user (or to the querier of a
>registry).
>
>  
>
Losing the term is A-OK with me.

>>>>>mirror  copy
>....
>  
>
>>I think we need another term to mean something very similar but 
>>    
>>
>....
>  
>
>>Of these I prefer likeness and representation because they signify an 
>>inexact copy more than the others do.  Then we need to discuss 
>>attributes of inexactness.
>>    
>>
>
>How about 'mirror' with your definition Ed, 'snapshot' for a static copy
>taken at a given time and 'extract' for any copy which is partial or has had
>modifications to it.
>
>So a mirror will generally always return the same results as the original. A
>snapshot can only guarantee to return the same results up until the next
>update of the original. An extract can never guarantee to return the same
>results.
>  
>
Snapshot is pretty good.  It gives the sense of something fixed in 
time.  However, I now think we need a few more terms to express the 
different ways in which things evolve.

First, we have data that grow over time as more knowledge is gained.  
Tables can grow more rows.  Datasets can grow more tables.  
Collections can grow more datasets.  Usually this means the coverage 
is growing, perhaps in spatial coverage, definitely in time coverage.  
For data with expected "Growth", "Snapshot" is a good term for a copy 
of the data at a given time.

The other way things change is through errata.  Data that are meant to 
be static are often open to revision as mistakes are uncovered or as 
calibrations are improved.  These data are not meant to be "in 
motion", so I would prefer a different term to describe copies of data 
with only very occasional "Revising".  "Snapshot" would work for data 
with both "Growth" and "Revising", but it may be somewhat misleading 
for data with just "Revising".  The term here could be "Transcript", 
"Reproduction", or perhaps "Copy".

Finally, there are some documents which are "Non-revising" and 
"Non-growing".  The Abell, Zwicky, and SAO catalogs are examples.  If 
you went through the original raw data and found errors, then to add 
or fix entries you would necessarily give the new dataset a different 
name, because these are such standard reference points.  When you 
refer to them it is understood that you mean the publication in ApJ or 
the published catalog.  For this, a copy would be a "Duplicate" 
because you are guaranteed sameness.
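
As a rough sketch only, the rule of thumb above could be written down 
like this.  The names CopyTerm and suggest_copy_term are hypothetical 
(nothing in the registry defines them), and "transcript" is picked 
arbitrarily from the three candidates for revising-only data:

```python
from enum import Enum


class CopyTerm(Enum):
    """Candidate terms for a copy of a resource (illustrative only)."""
    SNAPSHOT = "snapshot"      # copy of data that is expected to grow
    TRANSCRIPT = "transcript"  # placeholder; could be "reproduction" or "copy"
    DUPLICATE = "duplicate"    # copy of data that never grows or revises


def suggest_copy_term(grows: bool, revises: bool) -> CopyTerm:
    """Map how the original evolves to a suggested term for its copy.

    Assumption: growth dominates, so data with both "Growth" and
    "Revising" is still described as a snapshot.
    """
    if grows:
        return CopyTerm.SNAPSHOT
    if revises:
        return CopyTerm.TRANSCRIPT
    return CopyTerm.DUPLICATE


if __name__ == "__main__":
    # A fixed reference catalog: neither growing nor revising.
    print(suggest_copy_term(grows=False, revises=False).value)
```

Whether growth should take precedence over revising is itself one of 
the points open for discussion.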

>Cheers,
>Tony. 
>  
>
Cheers,
Ed