sexagesimal
Tom McGlynn
Thomas.A.McGlynn at nasa.gov
Sat Sep 16 18:09:44 PDT 2006
There are lots of perfectly readable coordinates that would flunk and
are used in the literature.
I do not believe you want to use a system that simply throws out the
non-numeric pieces of the coordinates. They provide information
about the fields and also facilitate disambiguation with
target names. Anywhere we have coordinates
(especially coordinate pairs) we are very likely to see target
names as well -- by design or because of user error.
E.g.,
12h06m14d08m
has only 4 integer fields and no whitespace, but it's still clear which
is the RA and which is the Dec
or
12:06 14:08
is also pretty clear, or
12 06 +14 08
I suspect we can handle
12: 6 14: 8
too
though I confess I might be a little worried by
12 +14
or
12:06 14.3
though a program could give them a go.
These are all (except the penultimate) coordinates that I wouldn't be
surprised to see in a table (I see
a lot of X-ray and gamma-ray data).
Another issue that you want to be careful of is confusing coordinates
with names.
I can see pretty clearly that
4C+48.61
is not a coordinate, but a target name, but if I just ignore the
non-numerics I might
translate it improperly as 4h, 48.61d.
There are lots of catalogs that have names comprised of catalog ids that
include a number followed by coordinates. It would be very easy for
these to be misinterpreted as coordinates with the number in the
name interpreted as the first field, if we just throw out the separators
without checking what they are. There really aren't that many
possibilities.
I don't think it would be hard to extend Alberto's code to handle most
of the
common cases while excluding virtually all target names
(there aren't really too many valid separators).
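A minimal sketch of the idea of checking separators rather than discarding
them. It is not Alberto's code; the whitelist of separators and the names
tokenize, normalize, VALID_SEPARATORS and NUMBER are assumptions made
for illustration only.

```python
import re

# Assumed whitelist; the post does not enumerate the valid separators.
# "" marks end of string (or a sign directly following a field),
# " " marks a whitespace-only separator.
VALID_SEPARATORS = {"", " ", ":", ",", "h", "m", "s",
                    "d", "deg", "degree", "\u00b0", "'", '"'}

NUMBER = re.compile(r"[+-]?\d+(?:\.\d*)?")


def normalize(sep):
    """Lower-case a separator and reduce pure whitespace to a single space."""
    stripped = sep.strip().lower()
    if stripped:
        return stripped
    return " " if sep else ""


def tokenize(text):
    """Return [(number, separator), ...], or None if the string contains
    anything that is not a number or a whitelisted separator."""
    text = text.strip()
    pairs = []
    pos = 0
    while pos < len(text):
        match = NUMBER.match(text, pos)
        if match is None:
            return None  # e.g. a name such as "NGC 1234"
        # The separator is whatever lies between this number and the next.
        nxt = NUMBER.search(text, match.end())
        end = nxt.start() if nxt else len(text)
        sep = normalize(text[match.end():end])
        if sep not in VALID_SEPARATORS:
            return None  # e.g. the "C" in "4C+48.61"
        pairs.append((match.group(), sep))
        pos = end
    return pairs or None


print(tokenize("12h06m14d08m"))  # four fields with their separators
print(tokenize("4C+48.61"))      # None: "C" is not a valid separator
```

With this approach 4C+48.61 is rejected outright instead of being read as
4h, 48.61d, while the separators of accepted strings stay available for the
later decision about where the second coordinate starts.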
Other heuristics that could be used....
If a number is signed and it is not the first field, it must be the start
of the second coordinate.
A comma can be used to separate the lat/lon coordinates, but not the
fields within them.
If the separator after something other than the first field is 'd',
'deg', 'degree' or
the non-ASCII degree character, the preceding field is the first field in
the second coordinate (especially if there is a corresponding separator
after the first field).
If the total number of fields is >2 and < 6, and the first separator
is a legal, non-space separator (e.g., ':') and the second separator
is white-space, then the second coordinate begins in the
third field.
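The four heuristics above can be sketched roughly as follows. This is only
an illustration: the input is assumed to be a list of (number, separator)
pairs in which every separator has already been checked as legal, with " "
standing for whitespace and "" for no separator. The function name and the
order in which the rules are tried are assumptions, not part of the
proposal.

```python
DEGREE_SEPARATORS = {"d", "deg", "degree", "\u00b0"}


def second_coordinate_start(pairs):
    """Return the index of the field that begins the second coordinate,
    or None if none of the heuristics applies."""
    n = len(pairs)

    # A signed number that is not the first field starts the second coordinate.
    for i in range(1, n):
        if pairs[i][0][0] in "+-":
            return i

    # A comma separates the two coordinates, never fields within one.
    for i in range(n - 1):
        if pairs[i][1] == ",":
            return i + 1

    # A degree marker after a non-first field: that field leads the
    # second coordinate.
    for i in range(1, n):
        if pairs[i][1] in DEGREE_SEPARATORS:
            return i

    # 3 to 5 fields, first separator non-space, second separator whitespace:
    # the second coordinate begins in the third field.
    if 2 < n < 6 and pairs[0][1] not in ("", " ") and pairs[1][1] == " ":
        return 2

    return None


# 12h06m14d08m
print(second_coordinate_start([("12", "h"), ("06", "m"), ("14", "d"), ("08", "m")]))
# 12 06 +14 08
print(second_coordinate_start([("12", " "), ("06", " "), ("+14", " "), ("08", "")]))
```

Both calls report index 2, i.e. the split 12 06 | 14 08. A string such as
13 24 81 47 returns None here, since it has no signs and no distinguishing
separators.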
Might also consider the values themselves:
13 24 81 47
is not ambiguous -- though it might be dangerous to have tools smart
enough to handle this!
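A rough sketch of a value-based check, presumably the reasoning behind the
example: 81 cannot be a minutes or seconds field, so it has to begin the
second coordinate. The range limits used here (first field of the first
coordinate up to 360, first field of the second up to 90 in absolute value,
all later fields below 60, at most three fields per coordinate) and the
function names are assumptions for illustration.

```python
def _valid_group(fields, max_first):
    """True if fields could be one sexagesimal coordinate."""
    if not 1 <= len(fields) <= 3:
        return False
    if abs(fields[0]) > max_first:
        return False
    # Minutes and seconds must lie in [0, 60).
    return all(0 <= f < 60 for f in fields[1:])


def valid_splits(fields, max_lon=360.0, max_lat=90.0):
    """Return every index k such that fields[:k] and fields[k:] are both
    plausible coordinates. Exactly one result means the values alone
    settle where the second coordinate starts."""
    return [
        k
        for k in range(1, len(fields))
        if _valid_group(fields[:k], max_lon) and _valid_group(fields[k:], max_lat)
    ]


print(valid_splits([13, 24, 81, 47]))  # [2]: only 13 24 | 81 47 works
print(valid_splits([12, 6, 14, 8]))    # [1, 2, 3]: values alone do not decide
```

The second call shows why this can only be a fallback: for typical values
the ranges allow several splits, and the separators are still needed.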
In a few cases you might want to try resolving the string as a name
before splitting it up...
Regardless of the rules used I daresay there will still be a few problems.
Of course none of this matters if the concern is defining a standard for
coordinate
formats -- but that's going to be ignored anyway. The real issue is trying
to be able to read the formats that are out there and that people
are going to try to use within our services.
Tom
P.S., Just noting that Roy's suggestion for IRAF to fill in the coordinates
-47:30.021:00
is contrary to Alberto's suggested rule for where decimals
are allowed. Personally I would prefer the latter -- I have
never seen a coordinate like that above.