User-Agent in HTTP requests

Mon Mar 23 11:12:30 PDT 2009

Dear Apps'n'DAL,

I recently had a conversation with Alberto Micol about VO service
providers (e.g. the administrators of SSA servers) wanting to find 
out more about who is consuming their services, in particular 
which software tools are originating requests.  This information 
may be useful to service providers for gathering statistics, 
investigating usage patterns, tracking down illegal or problematic
service usage, or for other reasons.  It may be of interest to
application authors too if the information is made available to them.

We considered the possibility of adding an optional client application
name parameter to the relevant protocols, so for instance a cone search
request to the service http://cone.org/ngc from the client TOPCAT
might use the URL:

   http://cone.org/ngc?RA=56.20&DEC=24.29&SR=0.01&CLIENT=topcat

However, this would mean changes to all the DAL service standards
(and others?) and clutter up those standards with options which are
really orthogonal to their purpose.

I think that a better solution is to use the existing mechanism of
the HTTP "User-Agent" request header; see RFC2616 sec 14.43:

   http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.43

This is a sensible place to put information about the source of the
request, and since it is an existing part of HTTP it won't break
anything or require any changes to standards.  Some applications
may already be populating this field appropriately in any case.

This may not be possible for portal-type systems which work by
embedding service access URLs in HTML, though in those cases 
the HTTP "Referer" header (RFC2616 sec 14.36) may provide the 
relevant information.

How easy or difficult it is for clients to populate the User-Agent
field appropriately will vary according to the language used; in Java
the easiest way is for the client author to set the System Property 
"http.agent" to a suitable string near application startup time
(http://java.sun.com/javase/6/docs/technotes/guides/net/properties.html).
Experts in other languages may be able to provide similar tips
for those.
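
As a rough illustration of the Java route, here is a minimal sketch.
The class name, the helper method names and the "topcat/0.0" version
string are placeholders invented for this example, not taken from any
real application.  It shows the "http.agent" System Property approach
alongside a per-connection alternative using the standard
URLConnection.setRequestProperty call.  Two caveats, stated as far as
they are understood: the property should be set before the first HTTP
request is made, since the JDK appears to read it only once, and the
JDK's built-in HTTP handler appends its own " Java/<version>" token
to whatever value is supplied.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

class UserAgentExample {

    /** Builds a "product/version" token in the style of RFC 2616 sec 3.8. */
    public static String agentString(String product, String version) {
        return product + "/" + version;
    }

    /** JVM-wide default: call this near startup, before any HTTP request. */
    public static void installDefaultAgent(String agent) {
        System.setProperty("http.agent", agent);
    }

    /** Per-request alternative: sets the header on a single connection. */
    public static URLConnection openWithAgent(String url, String agent) {
        try {
            // openConnection() only creates the object; nothing is sent yet.
            URLConnection conn = URI.create(url).toURL().openConnection();
            conn.setRequestProperty("User-Agent", agent);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "0.0" is a placeholder, not a real release number.
        String agent = agentString("topcat", "0.0");
        installDefaultAgent(agent);

        // The cone search URL is the illustrative one used earlier.
        URLConnection conn = openWithAgent(
            "http://cone.org/ngc?RA=56.20&DEC=24.29&SR=0.01", agent);
        System.out.println("User-Agent: " + conn.getRequestProperty("User-Agent"));
    }
}
```

The property-based route has the advantage of covering every HTTP
request the application makes through java.net, including those made
by third-party libraries, while the per-connection route leaves other
code in the same JVM unaffected.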

Since populating the User-Agent header in this way may be useful to 
some service providers and should at worst be harmless, I plan to 
implement it forthwith in my applications.

The purpose of this message is:

   1. to encourage other application authors to do the same, for the
      benefit of service providers who may wish to make use of this kind
      of information (if you plan to do so, a "me too" follow-up to this
      message would be useful to gauge response)

   2. to advertise the fact to service providers that the User-Agent
      field is a good place to look for client-type information
      (or at least may become so after some application authors have
      followed up this idea)

   3. to enquire whether service providers agree that this is a good
      way to provide this information, or whether this information is
      of no interest, or whether there are other similar things that 
      application authors can do to help along these lines.
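
To make point 2 a little more concrete, the following is a hedged
sketch of how a service provider might tally requests by client once
User-Agent values have been collected (for instance from web server
logs).  The class and method names and the sample header values are
all hypothetical, and the parsing is deliberately crude: it simply
takes the first product name before any "/" or whitespace.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AgentTally {

    /** Returns the leading product name, e.g. "topcat" from "topcat/0.0 Java/1.6.0". */
    public static String productName(String userAgent) {
        if (userAgent == null || userAgent.trim().isEmpty()) {
            return "(none)";   // request carried no usable User-Agent
        }
        String first = userAgent.trim().split("\\s+")[0];
        int slash = first.indexOf('/');
        return slash > 0 ? first.substring(0, slash) : first;
    }

    /** Counts requests per product name, sorted by name. */
    public static Map<String, Integer> tally(List<String> userAgents) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String ua : userAgents) {
            counts.merge(productName(ua), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up header values, for illustration only.
        List<String> seen = Arrays.asList(
            "topcat/0.0 Java/1.6.0",
            "exampleclient/1.0",
            "topcat/0.0",
            "");
        System.out.println(tally(seen));
    }
}
```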

If this is generally accepted as a good idea, perhaps it should be
formalised somewhere as good practice for application authors
who may be consuming VO services.  However, I'm not sure what would
be the best way to go about this - it's a very small suggestion,
so an IVOA Note devoted to it would seem like overkill.  A broader
IVOA Note, "Guidelines for authors of VO applications", might be a
nice idea - but I'm not sure what else would go in it :-).

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/