TAP, automated site monitoring, and gzip encoding.

Adam Brazier abrazier at astro.cornell.edu
Tue Jul 5 08:19:51 PDT 2011


Just to bring people up to date...
>
> The problem was that a standard TAP metadata query/response looked 
> like a SQL injection attack and triggered flags in GSFC security 
> monitors.
>
> We changed to using the STREAM/BINARY encoding in the VOTables used in 
> our TAP interface (suggested by Mark) and so far this seems to be 
> satisfying our security types.  [I don't want to get into a discussion 
> of what should or should not trigger such flags.]
>
> There was some discussion that we should simply assume that we are 
> going to have to be responsible for our own security and tell our 
> security monitors to ignore everything from our TAP server.  Certainly 
> it's the case that we need to do our best to ensure that there are no 
> security holes in our TAP interface.  If we had to do so, we could 
> have gone this way.  However an independent layer of checking is 
> something I don't want to forgo if I don't have to.  There are lots 
> of hackers out there and I daresay many are smarter and certainly more 
> versed in the holes in our database's security than I.  So I'm hopeful 
> that our format change will enable our security scanners to continue 
> monitoring our services without burdening them with large numbers of 
> false intrusion detections.
>
I would stress that the alarms are going off because consuming raw SQL 
is generally a warning sign (although not a disastrous one, particularly 
if permissions are properly set; query-checking is built in to some 
frameworks, too, although I've never wanted to rely on it). Hiding the 
SQL doesn't change that: it only makes the warnings go away, so that the 
other benefits of monitoring can be sustained.

Ideally, it seems to me, the local monitoring should be fine-grained 
enough that we can tell it to stop protesting about SQL being ingested 
whilst maintaining its other monitoring activities. As SQL consumption is 
*required* by us, we can't really do anything about the initial risk 
that presents, so we have to ensure that we secure our databases; if 
local IT security need convincing that it's safe, then all to the good, 
I'd say, as additional eyes are helpful in assessing security. If it's 
just a case of getting all the monitoring or none, then I guess that 
work-arounds might be required, but it feels a bit icky to me.
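
To illustrate the point about permissions, here is a minimal sketch. It 
assumes a SQLite file as a stand-in for the real database; the file name, 
the table and the run_user_query helper are all hypothetical and not taken 
from any actual TAP service. The query service opens the database 
read-only, so a destructive statement fails at the database layer 
whatever the monitors make of it.

```python
import os
import sqlite3
import tempfile

# Hypothetical stand-in for a catalogue database behind a query service.
db_path = os.path.join(tempfile.mkdtemp(), "catalogue.db")

# An administrative, read-write connection creates and fills the table.
admin = sqlite3.connect(db_path)
admin.execute("CREATE TABLE sources (name TEXT, ra REAL, dec REAL)")
admin.execute("INSERT INTO sources VALUES ('demo', 10.5, -3.2)")
admin.commit()
admin.close()

# The query service opens the same file read-only (SQLite URI syntax).
service = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)


def run_user_query(sql):
    """Run caller-supplied SQL on the read-only connection."""
    try:
        return service.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        # Writes are refused by the database itself.
        return f"rejected: {exc}"


rows = run_user_query("SELECT name, ra, dec FROM sources")
attack = run_user_query("DROP TABLE sources")
print(rows)
print(attack)
```

The SELECT returns the stored row, while the DROP TABLE is refused and 
the table is still there afterwards. A production database would use its 
own account and grant mechanism instead, but the principle is the same.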

Cheers

Adam
>
> Douglas Tody wrote:
>> Right - we distinguished between compression of the dataset itself and
>> compression as used in the transport protocol.  HTTP already supports
>> the latter and ideally the client and server would both support stream
>> compression.  But of course it is optional (where we really need this is
>> to speed up feeding large text VOTables back to the client).  If
>> security is the main issue it might be better to require an
>> authenticated (HTTPS) connection.  Or just limit the TAP implementation
>> and client connection to data which could not be compromised by any
>> amount of SQL trickery.
>>
>>       - Doug
>>
>>
>> On Mon, 4 Jul 2011, Mark Taylor wrote:
>>
>>> On Thu, 30 Jun 2011, Tom McGlynn wrote:
>>>
>>>> One solution that I had hoped might work was to use a GZIP transfer
>>>> encoding (or content encoding) for the query results.  Unfortunately
>>>> it doesn't look like clients currently note the HTTP encoding headers.
>>>>
>>>> NASA is probably a bit more paranoid about this than some, but I
>>>> suspect that this will become a more common issue as time goes on.
>>>> Support for content or transfer encoding is an HTTP level issue so I
>>>> don't think it requires any change to the TAP standard, just clients
>>>> that look for the appropriate HTTP headers.  Would it be reasonable to
>>>> request that clients support gzip encoding?  In addition to addressing
>>>> this security issue I suspect this would generally substantially
>>>> decrease the size of downloaded data and make our queries more
>>>> responsive.
>>>>
>>>>     Tom McGlynn
>>>
>>> FWIW, although TAP does not address this, the SSA standard
>>> (PR-SSA-1.1-20110417) does discuss compression in section 7.3:
>>>
>>>    7.3 Data Compression
>>>
>>>    If the query parameter COMPRESS is present then the service may
>>>    return a compressed dataset, using some standard compression
>>>    technique such as gzip, in place of a normal dataset, without
>>>    indicating this in the query response. Basically the client is
>>>    indicating that it is prepared to receive either compressed or
>>>    uncompressed datasets and does not care which is delivered (the
>>>    service should pick whichever is more efficient). This should be
>>>    distinguished from protocol-level compression, which is transparent
>>>    to the client, and may occur at the level of the HTTP protocol if
>>>    both client and server support HTTP protocol compression.
>>>
>>>    In case of an HTTP GET the keyword Content-Encoding informs the
>>>    receiver about the encoding of the output data, and should have a
>>>    value such as gzip. Note that the encoding is distinct from the
>>>    MIME-type (Content-Type) of the returned data object.
>>>
>>> the tone seems to suggest that Content-Encoding is something that
>>> clients might (but not MUST) be expected to do as a matter of course.
>>>
>>> Probably DALI ought to say what the general assumption is for DAL
>>> services about content- and/or transfer-encoding.
>>>
>>> Mark
>>>
>>> -- 
>>> Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
>>> m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/
>>>
>
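
Tom's suggestion in the thread above, that clients look for the 
appropriate HTTP headers, might look roughly like this on the client 
side. This is a sketch only: decode_body is a hypothetical helper, the 
response headers are passed in as a plain dict, and the VOTable fragment 
is a made-up payload. A real client would also send 
"Accept-Encoding: gzip" in its request so that the server knows a 
compressed response is acceptable.

```python
import gzip


def decode_body(headers, body):
    """Return the payload bytes, undoing gzip if the header declares it."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    encoding = lowered.get("content-encoding", "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    # No (or unrecognised) encoding: hand the bytes back untouched.
    return body


# Made-up payload standing in for a TAP query response.
votable = b"<VOTABLE><RESOURCE type='results'/></VOTABLE>"
compressed = gzip.compress(votable)

# Uncompressed response: Content-Type only.
plain = decode_body({"Content-Type": "application/x-votable+xml"}, votable)

# Compressed response: Content-Encoding is separate from Content-Type.
unzipped = decode_body(
    {"Content-Type": "application/x-votable+xml", "Content-Encoding": "gzip"},
    compressed,
)
print(plain == unzipped == votable)
```

Both calls give back the same VOTable bytes, which is the sense in which 
protocol-level compression is transparent to the rest of the client.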


