MType vocabulary design principles

Wed Jun 11 05:11:09 PDT 2008

All,

the issue of what MType(s) to use for progress messages brings up 
a question with much wider applicability in designing the MType 
vocabulary.  I think that getting its design principles right 
will be important to maximise interoperability in practice. 
In my opinion, if the MType vocabulary becomes too large or too 
fine-grained, it will be hard to recover the kind of interoperability 
for PLASTIC-like applications which PLASTIC has achieved.  Because of the importance 
of this I would urge application authors to consider the points 
raised here (and perhaps subsequently on this thread) and to 
contribute their views to the debate.

On Tue, 10 Jun 2008, Mike Fitzpatrick wrote:

> On Mon, Jun 9, 2008 at 4:09 AM, Mark Taylor <m.b.taylor at bristol.ac.uk> wrote:
>
>>>  app.status.progress           msgid,str       progress string
>>>  app.status.progress.percent   msgid,float     percentage completed (float)
>>>  app.status.progress.timeLeft  msgid,int       est. time remaining (sec)
>>
>> As I've said, I think a progress MType is a good idea.  However, I would
>> rather see this as a single MType (app.status.progress) with required
>> arguments msgid and progress string, and optional arguments giving
>> percentage completed and time remaining.
>
>    Percent and timeLeft are just two examples, one might also consider
> 'bytesDownloaded', 'filesRemaining', 'messagesHandled', etc.  Some of
> these are likely to be application-specific, the above hierarchy provides
> a framework to hang these new mtypes on without later polluting the
> list of optional args, and simply reserves the mtype for the types of
> progress most apps would likely implement.  As with any mtype, at
> least two apps would have to understand app.status.progress.frogsEaten
> for it to be meaningful, the rest would ignore it.  I don't see where
> interoperability is an issue.

Having multiple different MTypes which do similar things leads to a 
tradeoff between interoperability and burden on the participants.
The basic problem is that if there is a (wide) choice of MTypes which 
might apply to the message you want to send, then there is a (serious) 
risk that the sender and the recipient will fail to agree on which 
one to use.  To cope robustly with this, clients at both ends have
to use all the variants, which is more work.

This is an example of a general question which is going to come up 
in relation to many other MType definitions, so it's worth some
discussion.  It's the question of whether it is better to "pollute"
the namespace of defined MTypes or the per-MType namespaces of 
(maybe optional) MType parameters.  More neutrally: for a related
set of functionality, is it better to handle semantic flexibility
by defining one MType with flexible optional parameters, or multiple
MTypes each with relatively constrained parameters?

Looking at these progress messages specifically, consider the 
situation from the sender's point of view.  If there is a set of 
related app.status.progress.* MTypes as you suggest, the sender 
could just do:

    notify( clientId, app.status.progress.percent( percent="50" ) );

however, if the recipient is only subscribed to 
app.status.progress.timeLeft then there is no information exchanged
and no interoperation.  In general there is the possibility, and 
indeed likelihood, that the MType the sender decides to send is 
not the same as the MType the potential receiver will decide 
to subscribe to.

So, to be safe, the sender really ought to do this:

    notify( clientId, app.status.progress( txt="50% done" ) );
    notify( clientId, app.status.progress.percent( percent="50" ) );
    notify( clientId, app.status.progress.timeLeft( timeLeft="96" ) );

or, better (because it reduces unnecessary traffic), this:

    if ( isSubscribed( clientId, app.status.progress ) )
        notify( clientId, app.status.progress( txt="50% done" ) );
    if ( isSubscribed( clientId, app.status.progress.percent ) )
        notify( clientId, app.status.progress.percent( percent="50" ) );
    if ( isSubscribed( clientId, app.status.progress.timeLeft ) )
        notify( clientId, app.status.progress.timeLeft( timeLeft="96" ) );

There are three problems with this:

    1. It's more coding effort for the sender
    2. The number of messages transmitted through the system may be
       higher than it needs to be
    3. It will still fail to transmit progress information if the
       recipient is only subscribed to, e.g.,
       app.status.progress.frogsEaten
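
To make problem 3 concrete, here is a rough sketch in Python.  The
ToyHub class, its methods and the client name "viewer" are all
hypothetical stand-ins invented for illustration, not part of any real
hub API, and it assumes that a subscription matches only the exact
MType named:

```python
from collections import defaultdict


class ToyHub:
    """Hypothetical in-memory stand-in for a hub (not a real messaging API)."""

    def __init__(self):
        self.subscriptions = defaultdict(set)  # client id -> subscribed MTypes
        self.delivered = []                    # (client id, mtype, params)

    def subscribe(self, client_id, mtype):
        self.subscriptions[client_id].add(mtype)

    def is_subscribed(self, client_id, mtype):
        # Assumption: exact-match subscriptions only, no wildcards.
        return mtype in self.subscriptions[client_id]

    def notify(self, client_id, mtype, params):
        # Deliver only if the recipient subscribed to exactly this MType.
        if self.is_subscribed(client_id, mtype):
            self.delivered.append((client_id, mtype, params))
            return True
        return False


def send_progress_multi(hub, client_id):
    """Send every progress variant the sender knows; return the number delivered."""
    variants = {
        "app.status.progress": {"txt": "50% done"},
        "app.status.progress.percent": {"percent": "50"},
        "app.status.progress.timeLeft": {"timeLeft": "96"},
    }
    return sum(hub.notify(client_id, m, p) for m, p in variants.items())


hub = ToyHub()
hub.subscribe("viewer", "app.status.progress.frogsEaten")
sent = send_progress_multi(hub, "viewer")
print(sent)  # number of progress messages that reached the recipient
```

The sender has covered all three variants it knows about, yet under
these assumptions nothing reaches a recipient which chose a fourth.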

In the alternative way of doing it, there is one MType named 
app.status.progress with a required parameter txt and documented
optional parameters percent and timeLeft.  Then I can write just

    notify( clientId, app.status.progress( txt="50% done",
                                           percent="50",
                                           timeLeft="96" ) );

or, better,

    if ( isSubscribed( clientId, app.status.progress ) )
       notify( clientId, app.status.progress( txt="50% done",
                                              percent="50",
                                              timeLeft="96" ) );

I can add on frogsEaten="23" and bytesDownloaded="9999" parameters 
if I want.

This is easier to code, it means there is only one message sent per 
progress event thus reducing load on the messaging infrastructure and
recipient, and it guarantees that if both sides are interested in
progress information at all they will be able to communicate about it 
to some extent, even if it's only a text string (which in practice
can always be generated by a sender, and can nearly always be made
some use of by the recipient).
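
A sender-side helper for the single-MType approach might look something
like the following Python sketch.  The function names and the idea of
passing the recipient's subscriptions as a plain set are assumptions
made purely for illustration; parameter values are converted to
strings to match the examples above:

```python
PROGRESS_MTYPE = "app.status.progress"


def make_progress(txt, **optional):
    """Build the parameters of one progress message.

    txt is the only required parameter; any keyword arguments become
    optional parameters, with None values left out.
    """
    params = {"txt": str(txt)}
    params.update({k: str(v) for k, v in optional.items() if v is not None})
    return params


def send_progress(subscribed_mtypes, txt, **optional):
    """Return the list of (mtype, params) messages to send: zero or one."""
    if PROGRESS_MTYPE not in subscribed_mtypes:
        return []
    return [(PROGRESS_MTYPE, make_progress(txt, **optional))]


messages = send_progress({"app.status.progress"}, "50% done",
                         percent=50, timeLeft=96, frogsEaten=23)
print(messages)
```

However many optional parameters the sender chooses to supply, at most
one message is generated per progress event.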

This specific case aside: when deciding how broadly defined 
specific MTypes should be (what goes in parameters and what warrants a
new MType) it will in general be necessary to look at the details
on a case-by-case basis.  But I would say that in most cases, 
because of the kinds of considerations I've outlined above,
it is better to keep to a small number of MTypes, and to handle 
semantic flexibility with (perhaps optional) parameters.  I suggest 
the following rules of thumb:

    1. If an application wants to perform some specific kind of
       interoperation with another, it should be clear which MType
       to use for that, rather than there being a selection of
       similar ones available which differ in detail.

    2. It's OK to define or use as many optional parameters as are
       useful for an MType as long as a recipient which ignores
       all the optional parameters can still stand a good chance
       of processing the message in a sensible fashion.
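
As an illustration of rule 2, a recipient of app.status.progress could
be written along the lines of this Python sketch.  The handler name
and the output format are hypothetical; the point is only that the
required txt parameter is always usable, and that optional parameters
which are absent, unknown or unparseable do no harm:

```python
def handle_progress(params):
    """Turn app.status.progress parameters into a display string."""
    text = params["txt"]  # required parameter
    extras = []

    percent = params.get("percent")
    if percent is not None:
        try:
            extras.append(f"{float(percent):.0f}%")
        except ValueError:
            pass  # unparseable optional value: ignore it

    time_left = params.get("timeLeft")
    if time_left is not None:
        try:
            extras.append(f"{int(time_left)} s left")
        except ValueError:
            pass  # unparseable optional value: ignore it

    # Any other parameters (e.g. frogsEaten) are simply not looked at.
    if not extras:
        return text
    return f"{text} [{', '.join(extras)}]"


print(handle_progress({"txt": "halfway", "percent": "50",
                       "timeLeft": "96", "frogsEaten": "23"}))
```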

These arguments are most important when considering a PLASTIC-like
scenario in which applications are communicating with other 
applications which they may never have met before in a relatively 
uncontrolled environment.  However, I don't believe that doing it the
way I am advocating will be disadvantageous even in more controlled
environments.

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/