Message-id management revisited

Fri Jun 6 08:57:11 PDT 2008

On Thu, 5 Jun 2008, Douglas Tody wrote:

> I have been trying to stay out of this endless discussion but see this
> one is coming up again...
>
> On Wed, 4 Jun 2008, Mike Fitzpatrick wrote:
>
>>>  2. sender generates single msg-id with fixed form <client-id>-<msg-tag>
>>>        (this was Mike's original proposal)
>>>      - hub can avoid maintaining *essential* state, but if it wants to
>>>        keep track of other per-message info (e.g. timestamp, checksum,
>>>        ...) it will need to store it internally.  Also need to worry
>>>        about what happens if the sender does not follow the requirement
>>>        for how the id is generated - better to have an interface in
>>>        which it's impossible to do the wrong thing.
>>
>>     This is still my first choice (surprised?).  Because:
>
> This was my original suggestion as well (long ago).  Since the

I am arguing for something very similar to this: the only difference
is that instead of the msg-id form being fixed as <client-id>-<msg-tag>,
the hub can choose how it builds its msg-id from the <client-id> and
<msg-tag>.  The only additional complication is that *if* the sender
needs to know the msg-id that the hub has generated (and it rarely
will, since it is not required simply for matching messages with
responses) then it will have to get it from the hub, e.g. using a
translation method or waiting for a round trip at send time.  As in
the above scheme, the hub does not need to maintain any per-message
state (except for synchronous messages, which are not what we're
talking about here).
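
To illustrate, here is a minimal sketch of a hub deriving its msg-id
from the sender's <client-id> and <msg-tag> and recovering both when a
response arrives, without storing anything per message.  The function
names and the "_" separator are hypothetical, and the sketch assumes
the hub assigns client-ids itself and so can keep the separator out
of them.

```python
# Hypothetical sketch: names and separator are illustrative assumptions,
# not part of any agreed interface.
SEP = "_"


def make_msg_id(client_id, msg_tag):
    """Build the hub's msg-id from the sender's client-id and msg-tag."""
    if SEP in client_id:
        # Assumption: the hub assigns client-ids, so it can enforce this.
        raise ValueError("client-id must not contain the separator")
    return client_id + SEP + msg_tag


def split_msg_id(msg_id):
    """Recover (client_id, msg_tag) from a msg-id built by make_msg_id."""
    client_id, sep, msg_tag = msg_id.partition(SEP)
    if not sep:
        raise ValueError("not a msg-id generated by this hub")
    return client_id, msg_tag
```

Everything needed to route the response back is carried in the msg-id
itself, so the hub keeps no table of outstanding messages.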

As it happens I've been writing a hub implementation today, and I
find that I actually do want to store per-message information
in addition to the sender's <client-id> and the sender-generated
<msg-tag>.  There are one or two additional bits of information
I'd like to store:

    1. a flag saying whether the message was sent using the synchronous
       or asynchronous method, since the replies are handled differently

    2. (maybe) a sequence number for synchronous calls, so I can
       expire the oldest pending synchronous calls if a limit
       on the number of simultaneous ones is reached

If we use the above suggestion of a fixed form <client-id>-<msg-tag>
for the msg-id, then I can't encode these bits of information in
the msg-id because there's nowhere to put them.  So either I now have
to maintain per-message state in the hub, or we should redefine 
the fixed form as something like 
"<client-id>-<msg-tag>-<is-synchronous>-<seq-number>".
The latter looks to me like it's getting too baroque to write in 
a standard - and other hub implementors might want to store different
information in any case.  That illustrates why I'm in favour of 
allowing the hub to choose the msg-id format.
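
As a rough sketch of what a hub-chosen format could look like, the
following packs the two extra items listed above into the msg-id
alongside the <client-id> and <msg-tag>.  The JSON-plus-base64 encoding
and all names here are assumptions made for illustration; another hub
could pack different fields in a different way.

```python
import base64
import json


def encode_msg_id(client_id, msg_tag, is_sync, seq):
    """Pack the per-message fields into one opaque, URL-safe msg-id."""
    # Hypothetical field set: client-id, msg-tag, sync flag, sequence number.
    payload = json.dumps([client_id, msg_tag, is_sync, seq],
                         separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_msg_id(msg_id):
    """Unpack the fields from a msg-id produced by encode_msg_id."""
    payload = base64.urlsafe_b64decode(msg_id.encode("ascii")).decode("utf-8")
    client_id, msg_tag, is_sync, seq = json.loads(payload)
    return client_id, msg_tag, is_sync, seq
```

Because the sender treats the msg-id as opaque, the set of fields can
change from one hub implementation to another without the standard
having to spell out any of them.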

> On Thu, 5 Jun 2008, Alasdair Allan wrote:
>
>> [...] no state was necessary. Suddenly we're adding huge amounts of
>> overhead to the Hub. It has to keep track of which message arrived
>> from which sender, where it got dispatched to, it has to figure out
>> when these expire (so it can clean out its backend cache of such
>> things). Suddenly, there is all this overhead. I see absolutely no
>> advantages of adding all this extra book work.
>
> Agree completely.  All the "hub" should be doing is delivering

I would agree completely too *if* the suggestion were to add huge amounts
of overhead to the hub or to require it to keep track of which messages
arrive from where.  As I tried to convince Alasdair earlier in
this thread, it is not!

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/