Hub resource hosting service?

Laurent Bourgès bourges.laurent at gmail.com
Tue Oct 18 02:09:36 PDT 2011


Dear all,

I think this topic is closely related to my previous post on URL
handling: I was, and still am, convinced that the SAMP implementation
(hub) could provide an easier API for dealing with documents given by URL
location (file, http ...). My main concern was to spare developers from
implementing the download mechanism themselves, which is error prone
and can be tricky:
- timeouts / proxy / protocol errors
- resource management (http / ftp connection, https security ...)
- security i.e. privileges (tcp ports / file permissions ...)

I agree with Tom's proposal that the SAMP hub could handle document
retrieval / caching (in-memory or file) and provide an easy lookup
method:
- SampDocument GetDocument(URL) where the SampDocument class could
then provide either a string or a file representation ...
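
A minimal sketch of what such a lookup could look like, written in Python
for brevity. Only the names SampDocument and GetDocument come from the
proposal above; DocumentCache, as_string, as_file and the timeout
parameter are assumptions made for illustration, and nothing here is part
of any existing SAMP library. The cache is in-memory only and does no
eviction or proxy handling.

```python
import tempfile
import urllib.request
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SampDocument:
    """Hypothetical holder for a retrieved document."""
    url: str
    data: bytes

    def as_string(self, encoding="utf-8"):
        # String representation of the document content.
        return self.data.decode(encoding)

    def as_file(self):
        # File representation: write the content to a new temporary file
        # and return its path. The caller is responsible for deleting it.
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(self.data)
        return Path(tmp.name)


class DocumentCache:
    """Hypothetical hub-side helper: fetch once, then serve from memory."""

    def __init__(self, timeout=10.0):
        self._timeout = timeout  # seconds, applied to each download
        self._docs = {}          # url -> SampDocument

    def get_document(self, url):
        doc = self._docs.get(url)
        if doc is None:
            # urlopen handles file: and http(s): URLs; errors such as
            # timeouts propagate to the caller as exceptions.
            with urllib.request.urlopen(url, timeout=self._timeout) as resp:
                doc = SampDocument(url, resp.read())
            self._docs[url] = doc
        return doc
```

A client would then call cache.get_document(url).as_string() or
.as_file() without dealing with the download itself.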

Comments are always welcome as this topic seems more active than mine,

Laurent

2011/10/17 Tom McGlynn <Thomas.A.McGlynn at nasa.gov>:
> Hi Mark,
>
> I'm not particularly attached to any solution to the issue, but I do think
> it would be desirable to enable work in a sandbox environment (both
> JavaScript and untrusted applets) to use SAMP effectively.  Your proposed
> solution is to provide a standard caching mechanism through the WebSamp hub.
>  I've some comments on that...
>
>  - The simplicity of the SAMP interface is very nice.  Adding a whole new
> layer to the interface to enable this kind of caching seems a bit painful,
> but maybe that isn't necessary.  We could allow the hub to transform the
> message:  E.g., the VOTable generating client would just send the message:
>   table.load.votable: content=<VOTABLE>...
> to the hub and the hub would automatically save the content and send
>   table.load.votable: url=http://localhost:xxxx?...
> to the consuming clients.  The hub would send a response to the generating
> client giving the URL.
>
> The rule could be that a hub is free to cache any 'content' element and
> transform them into references to that content.  Given that we are
> implementing caching, I don't think this makes the hub any more complex,
> we're just using the mtype syntax for the caching messages. Presumably
> generated FITS data or images could be sent this way too.
>
> So there's no new protocol for clients to learn.  That would be a nice
> feature of a solution.
>
>  - I'd be worried about security with a hub that is going to write into the
> filesystem.  I'd be more comfortable with a hub that provided say 100 MB of
> cache space using only memory.
>
>
>  - In the simple, but probably very common, case of one generating client
> and one consuming client, using a cache may be more expensive than just
> sending the data, especially if we cache using a file. We have two
> transactions,
>  generator->hub->write-to-file  read-from-file->hub->consumer
> in the cached case versus
>  generator->hub->consumer
> in the direct case.  This would be mitigated by an in-memory cache.
>
>  - Similarly in the one generator to many consumers case, using URLs means
> that the hub may need to read the cache many times independently to respond
> to each of the clients as they request the reference URL. The direct
> transfer may be able to send data more efficiently in parallel.  However, a
> direct transfer doesn't give a consumer a chance to ignore a request, and
> thus will be less efficient if there are clients that are registered to
> receive certain events, but in practice ignore them.
>
>
> Hope you can work something out in Pune.
>
>        Tom
>
>
>
> Mark Taylor wrote:
>>
>> Doug,
>>
>> On Sun, 16 Oct 2011, Douglas Tody wrote:
>>
>>> One thing discussed way back in the original SAMP discussions was
>>> separating the message header from the message content, which would
>>> allow the content to be anything and defined by the messaging protocol.
>>> This would allow real data to be passed via the messaging system,
>>> including even binary data.  To keep things simple this was left out of
>>> SAMP 1.0 but it was always considered a possibility for future versions.
>>> Essentially what we have now is only the message header, with the
>>> message type specifying what type of message it is, and parameters
>>> providing a limited way to pass some data.
>>>
>>> Keeping messages small and lightweight is good, but what is best really
>>> depends upon the use case and should not be constrained by the messaging
>>> system.  In real time applications for example (including desktop
>>> applications such as displaying an image in real time) data can be
>>> streamed in chunks and there is no possibility to build files which are
>>> stored some place and then referenced by a URL or other pointer.  While
>>> limiting messages to pointing at bulk data is a good model in many cases
>>> it is not the only one, and ultimately it is the application which
>>> should decide what is the best model.
>>
>> Your analysis is quite right.  SAMP as it stands is not a good candidate
>> for a real time messaging system, and there are various other messaging
>> scenarios that it is not targeted at.  Certainly the application has
>> to decide what's the best model, and having decided, it may
>> decide that SAMP (as it stands) is not the system it should use to
>> deliver that.  Some future evolution of SAMP (AMP?) that is capable of
>> addressing those issues is possible, and as you say was something
>> we deliberately left open when we started to think about it.
>> On the whole I don't feel that we are currently in a position where we
>> have the need to take that further.  My reluctance is mainly that the
>> ways I can think of to do bulk data transfer within SAMP itself
>> would complicate implementations quite a bit, and SAMP seems to
>> be working reasonably well now for the kinds of things it's good at,
>> with good takeup because it keeps things simple.  But if you or
>> others think different, we can re-open that discussion
>> (though probably not in this Thursday's session, which is full
>> already).
>>
>>> I don't much like the idea of the hub being turned into a general file
>>> store - lots of complication there with persistence issues and such.
>>> This has nothing to do with messaging.  If such a capability is needed
>>> it ought to be a different service, with messaging limited to only the
>>> communication aspect.
>>
>> The Web Profile hub already provides some non-messaging services, in
>> particular dereferencing of URLs, since the Web Profile is in the
>> business of providing web applications with the things they need
>> to be useful SAMP clients and can't get in any other way.
>> However, I take your point.  I don't want to push the resource
>> hosting idea if other people are not keen.
>>
>> Mark
>>
>>> On Sun, 16 Oct 2011, Mark Taylor wrote:
>>>
>>>> Hi all.
>>>>
>>>> Here is an issue, and a possible solution, which was raised by some
>>>> Web Profile implementation work that Tom is doing.  We will discuss
>>>> it in the Apps SAMP session on Thursday, but I wanted to air it here
>>>> first to give people time to think about it, and also to solicit
>>>> comments from people who are not in Pune.
>>>>
>>>> Tom said:
>>>>
>>>> On Fri, 14 Oct 2011, Tom McGlynn wrote:
>>>>
>>>>> A final largely unrelated issue...  SAMP's reliance on passing VOTables
>>>>> only via URLs makes it difficult to adapt to JavaScript clients.  They
>>>>> have to create relay URLs, stage results, or duplicate functionality on
>>>>> the server side.  E.g., if we were to want to send a VOTable that had
>>>>> been filtered using VOView, there is no easy way to do this that I know
>>>>> of.  I'm sure this was considered early on, but it's possible to do
>>>>> quite extensive processing in the web browser now and we may wish to
>>>>> consider supporting that, especially with WebSAMP.  E.g., we might allow
>>>>> either 'url' or 'content' tags in the message (where the latter would be
>>>>> the actual VOTable content) or define equivalent mtypes for messages
>>>>> with table content.
>>>>
>>>> This hadn't occurred to me before.  I had assumed that if web apps
>>>> needed URLs to pass around they would come from the origin HTTP server.
>>>> While an app could pass locally generated/manipulated data back to its
>>>> origin server and send the resulting URL around in SAMP messages it's
>>>> clearly not an efficient solution, since the data starts and ends on
>>>> the client so doesn't want to have to go via a remote server.
>>>>
>>>> Tom's suggestion of adding content parameters as an alternative to url
>>>> parameters in MTypes is possible, but a bit unpalatable for a couple
>>>> of reasons.  First it falls foul of the SHOULD (though not MUST) in
>>>> SAMP section 3.3 that says:
>>>>
>>>>   "General purpose MTypes SHOULD therefore be specified so that bulk
>>>>    data is not sent within the message. In general it is preferred to
>>>>    define a message parameter as the URL or filename of a potentially
>>>>    large file rather than as the inline text of the file itself."
>>>>
>>>> largely to prevent clients having to deal with very large chunks of XML,
>>>> which is especially a problem if they DOMise them as a matter of routine.
>>>> Second it means that a lot of new messages (or at least new versions
>>>> of messages, using content rather than url parameters to specify
>>>> content) would be flying around and old clients wouldn't be able to
>>>> handle them, until they were upgraded accordingly.
>>>>
>>>> Another way to look at the problem is that Web Profile SAMP clients lack
>>>> an efficient resource hosting mechanism (since, unlike Standard Profile
>>>> clients, they are unable to write temporary files or host HTTP servers).
>>>> This could be addressed by allowing the hub to provide such a mechanism.
>>>> The most straightforward thing would be for the hub to provide a URL
>>>> to which documents could be POSTed in return for a (201 Created -
>>>> see RFC2616 sec 9.5) Location URL.  Web clients which want
>>>> to pass data to another client could therefore POST the document to
>>>> the hub, get a reference URL in return, and pass that reference URL
>>>> in messages.  The hub arranges to redeliver the content to anyone
>>>> who asks for it via that reference URL later.
>>>> This also isn't perfect, since it means the hub has to manage some
>>>> client resources rather than clients, but to me it seems cleaner.
>>>>
>>>> Mark
>>>>
>
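
For reference, the hosting mechanism Mark describes in the quoted thread
(POST a document to the hub, receive 201 Created with a Location URL,
fetch it back later by GET) could be sketched roughly as follows. This is
an illustration in Python under stated assumptions, not how any existing
hub works: ResourceStore, make_handler, start_hub_store and the 507
response on a full cache are all invented here. The memory-only 100 MB
budget follows Tom's suggestion. Browser cross-origin handling, expiry and
access control are left out.

```python
import threading
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

MAX_CACHE_BYTES = 100 * 1024 * 1024  # memory-only budget (assumed limit)


class ResourceStore:
    """Hypothetical in-memory store for documents POSTed by clients."""

    def __init__(self, max_bytes=MAX_CACHE_BYTES):
        self._max = max_bytes
        self._used = 0
        self._items = {}  # key -> (content type, body)
        self._lock = threading.Lock()

    def put(self, body, content_type):
        # Returns a new key, or None if the budget would be exceeded.
        with self._lock:
            if self._used + len(body) > self._max:
                return None
            key = uuid.uuid4().hex
            self._items[key] = (content_type, body)
            self._used += len(body)
            return key

    def get(self, key):
        with self._lock:
            return self._items.get(key)


def make_handler(store):
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            ctype = self.headers.get("Content-Type", "application/octet-stream")
            key = store.put(body, ctype)
            if key is None:
                self.send_error(507, "Cache full")
                return
            host, port = self.server.server_address[:2]
            self.send_response(201)  # Created; Location is the reference URL
            self.send_header("Location", f"http://{host}:{port}/{key}")
            self.send_header("Content-Length", "0")
            self.end_headers()

        def do_GET(self):
            item = store.get(self.path.lstrip("/"))
            if item is None:
                self.send_error(404)
                return
            ctype, body = item
            self.send_response(200)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return Handler


def start_hub_store(port=0):
    # Binds to localhost only; port 0 lets the OS pick a free port.
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(ResourceStore()))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A web client would POST its VOTable to the hub, read the Location header
from the response, and pass that URL in an ordinary table.load.votable
message, so existing consumers need no changes.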

