Libero/2  Request for Comments

Location:  /cvs/internal/products/base2/libero/rfc.txt
Written:   2004/09/23
Revised:   2004/09/23
Author:    Pieter Hintjens


Please add your comments in the text using this form:

 [initials: comment]


GENERAL
-------
The goal of this document is to wrap up a running discussion on the FSM
syntax developed so far in SMT/3.  Some of the topics discussed here are
relevant to SMT/3, others address the wider issue of building multithreaded
state machines in a usable package.

RELATIONSHIP BETWEEN iCL, Libero, AND SMT/3
-------------------------------------------

iCL is a general class definition language that provides a framework for
writing, building, documenting, and testing packages of functionality.
We are using iCL for all new code that can be fit into this framework,
including the iMatix Portable Runtime (replaces SFL).

iCL defines a class with an API and an inheritance model that lets the
developer create new classes by deriving data and code from existing
ones.

iCL provides a basic grammar for classes.  It additionally allows for
arbitrary grammars to be added to a class definition.  These grammars
can be used to generate code for specific domains.  Rather than write
code generators as random stand-alone tools, they can thus be integrated
into classes, providing a single API to any type of generated code.

One of the principal plug-in grammars is Libero/2, which defines the
state-machine abstraction we know and love.

[JS: Does this allow for adding arbitrary grammars to the basic libero?]

The Libero/2 grammar does not assume any specific runtime, language, or
operating environment.  This is provided by a "schema".  The schema is
a property of the class, inherited by child classes.  The schema is
identical to the classic Libero concept: it is a code generator for one
language and one target domain.

[JS: not quite identical: we have dialog-calls and a couple of other things]

Libero/2 schemas are split into two parts.  The first is a generic parser
that creates a language-independent syntax tree that is very close to the
final code structure.  For instance, a classic parser would construct FSM
jump tables.  An optimising parser would construct a flattened "switch"
tree.  The second part of the schema is a code generator that walks the
tree and turns it into native code.  The goal is to make the code generator
as small and simple as possible, making it easy to write new code generators
for new languages.

[JS: This seems to imply that the same code generator could work from
jump tables or flattened switch tree interchangeably. This is not the
case, furthermore there is unlikely to be any advantage to flattened 
trees in a language which does not use an optimising compiler].
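
To make the two tree shapes concrete, here is a minimal sketch in C of one
two-state machine dispatched first through a jump table and then through a
flattened switch.  All toy_* names are invented for illustration; this is
not the output of lrschema_flat.gsl or of any existing schema.

```c
#include <assert.h>

enum toy_state { TOY_IDLE, TOY_BUSY, TOY_STATE_COUNT };
enum toy_event { TOY_START, TOY_DONE, TOY_EVENT_COUNT };

/* Classic form: the next state is looked up in a table indexed by
   [state][event].                                                  */
static const enum toy_state
toy_table [TOY_STATE_COUNT][TOY_EVENT_COUNT] = {
    /* TOY_IDLE */ { TOY_BUSY, TOY_IDLE },
    /* TOY_BUSY */ { TOY_BUSY, TOY_IDLE }
};

enum toy_state toy_next_by_table (enum toy_state state, enum toy_event event)
{
    assert (state < TOY_STATE_COUNT && event < TOY_EVENT_COUNT);
    return toy_table [state][event];
}

/* Flattened form: the same transitions as nested switch statements. */
enum toy_state toy_next_by_switch (enum toy_state state, enum toy_event event)
{
    switch (state) {
        case TOY_IDLE:
            switch (event) {
                case TOY_START: return TOY_BUSY;
                default:        return TOY_IDLE;
            }
        case TOY_BUSY:
            switch (event) {
                case TOY_DONE:  return TOY_IDLE;
                default:        return TOY_BUSY;
            }
        default:
            return state;       /* unknown state: stay where we are */
    }
}
```

Both functions give the same answer for every state and event pair; only
the shape of the generated code differs.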

SMT/3 is thus a Libero/2 schema, written as a parser (lrschema_flat.gsl) and
a code generator for the SMT-over-iPR-over-C runtime environment
(lrschema_smt_iprc.gsl).


AGENTS vs THREADS vs OBJECTS
----------------------------
We're defining agents, not threads. Threads are execution objects and
actually an implementation detail of the runtime, not of the FSM.  The
language will be compatible with iCL, namely:

<class>
  <state>
    ...
  </state>
</class>

The term "thread object" will be used to define an executable instance of
an agent.  Note that iCL/1.0 does not yet allow for FSM definitions; this
is being added now.

[JS: How does this incorporate the notion of different thread types in a
single agent?]

STANDARD HANDLERS
-----------------
The standard handlers will be those used in iCL, namely:

  initialise:
    invoked during global initialisation for the agent
  terminate:
    invoked during global termination for the agent
  new:
    invoked when a new thread object is created
  destroy:
    invoked when a thread object is destroyed
  test:
    invoked to test the agent

I'm happy to get suggestions for these names but we will use the same names
in iCL and Libero (which will be an iCL class unless anyone has objections).

The correct way to create a new thread object is therefore:

    agent_t myobjectref = agent_name_new ([arguments]);

And to destroy a thread object:

    agent_name_destroy (myobjectref);


METHODS
-------
Methods are how the outside world sends events to a thread object.  Methods
can have any name and arguments as necessary.  To invoke a method, you use a
call like this:

    agent_name_method (myobjectref, [arguments]);

The iCL grammar defines the syntax for methods.

The difference between methods and handlers is that methods always work on a
specific thread object and do not return any value.  Handlers work on either
a thread object or the entire class, and return values.  In a synchronous
API, the handlers and methods can overlap.  In an asynchronous API, they
serve distinct purposes.

For example, we can define a _destroy handler and a _destroy method.  The first
causes an immediate destruction of the thread.  The second sends the destroy
event to the thread, which it can handle as it needs to.

[JS: Is this really a useful distinction? I don't see it.]
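
As an illustration of the distinction, here is a minimal C sketch of a
destroy handler next to a destroy method.  Everything in it is an
assumption: the "demo" agent, the one-slot event queue, and the
_destroy_method suffix (C cannot give two functions the same name, so the
sketch has to rename one of them).

```c
#include <assert.h>
#include <stdlib.h>

enum { DEMO_NO_EVENT = 0, DEMO_DESTROY_EVENT = 1 };

typedef struct {
    int pending_event;      /* one-slot stand-in for the event queue */
} demo_t;

/* Handler "new": creates a thread object with an empty queue. */
demo_t *demo_new (void)
{
    return calloc (1, sizeof (demo_t));
}

/* Handler "destroy": acts at once and returns a value
   (0 = ok, -1 = nothing to destroy).                   */
int demo_destroy (demo_t **self_p)
{
    if (self_p == NULL || *self_p == NULL)
        return -1;
    free (*self_p);
    *self_p = NULL;
    return 0;
}

/* Method "destroy": returns nothing.  It only sends the destroy event,
   which the thread object handles when it next collects an event.     */
void demo_destroy_method (demo_t *self)
{
    assert (self != NULL);
    self->pending_event = DEMO_DESTROY_EVENT;
}
```

After the method call the object still exists and decides for itself how to
shut down; after the handler call the reference is gone.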

EVENTS AND EXCEPTIONS
---------------------
Historically we've used events for exception handling.  This is awkward
and leads to calls like: raise_exception (ok_event).  We should separate
the two and use the standard terminology for exception handling:

<agent>
  <state>
    <event name = "ok">
      <action/>...
    </event>
    <catch name = "error">
      <action/>...
    </catch>
  </state>
</agent>

The API is:

    set_next_event  (ok_event);
    raise_exception (error_catch);

Events and exceptions are defined so that it is illegal to mix them: probably
by using numeric ranges that do not overlap.

[JS: So maybe we need something else, like a 'break':raise_break (ok_event) ]
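
A minimal C sketch of the non-overlapping ranges idea follows.  The
function names come from the API above; the actual numbers, the range
limits, and the 0 / -1 return codes are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical numbering: events are 1..999, exceptions 1000..1999. */
enum {
    ok_event      = 1,
    timeout_event = 2,
    event_limit   = 1000,   /* first number that is not an event     */
    error_catch   = 1000,
    fatal_catch   = 1001,
    catch_limit   = 2000    /* first number that is not an exception */
};

int the_next_event = 0;     /* last event accepted     */
int the_exception  = 0;     /* last exception accepted */

static int is_event (int n) { return n > 0 && n < event_limit; }
static int is_catch (int n) { return n >= event_limit && n < catch_limit; }

/* Accepts only event numbers; returns 0 if accepted, -1 if rejected. */
int set_next_event (int event)
{
    if (!is_event (event))
        return -1;
    the_next_event = event;
    return 0;
}

/* Accepts only exception numbers; returns 0 if accepted, -1 if rejected. */
int raise_exception (int exception)
{
    if (!is_catch (exception))
        return -1;
    the_exception = exception;
    return 0;
}
```

With this numbering, raise_exception (ok_event) is rejected at run time
instead of being silently treated as an exception.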

DEFAULT HANDLING
----------------
Historically default states have satisfied most scenarios; when we mixed
multiple types of thread in a single agent, we used superstates to provide
an additional level of inheritance.

We do not need superstates and I suggest we eliminate this concept, which
is complex and rarely used.

For default handling we do not need a special state.  There is a simpler and
more obvious way:

<agent>
  <state>
    <event name = "ok">
      <action/>...
    </event>
    <catch name = "error">
      <action/>...
    </catch>
  </state>
  <event name = "timeout">
    ... default handling for timeouts
  </event>
  <catch name = "fatal">
    ... default handling for fatal exceptions
  </catch>
</agent>

[JS: Over my head. How does this accomplish default handling? In what
way is it either simpler or more obvious?]
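
One possible reading of the grammar above, sketched in C: the dialog looks
for a handler in the current state first and falls back to the handlers
declared directly under <agent>.  The tables and function names are
hypothetical, and the lookup order is an assumption, not something this
document specifies.

```c
#include <assert.h>
#include <stddef.h>

enum { OK_EVENT, TIMEOUT_EVENT, CLOSE_EVENT, EVENT_COUNT };

typedef const char *(*action_fn) (void);

static const char *do_ok (void)              { return "state: ok"; }
static const char *do_default_timeout (void) { return "agent: timeout"; }

/* Handlers declared inside the <state> element; NULL = not listed. */
static const action_fn state_handlers [EVENT_COUNT] = {
    do_ok,                  /* OK_EVENT      */
    NULL,                   /* TIMEOUT_EVENT */
    NULL                    /* CLOSE_EVENT   */
};

/* Handlers declared directly under <agent>: defaults for all states. */
static const action_fn agent_defaults [EVENT_COUNT] = {
    NULL,                   /* OK_EVENT      */
    do_default_timeout,     /* TIMEOUT_EVENT */
    NULL                    /* CLOSE_EVENT   */
};

/* Look in the current state first, then fall back to the agent level.
   Returns NULL if the event is handled at neither level.              */
action_fn find_handler (int event)
{
    assert (event >= 0 && event < EVENT_COUNT);
    if (state_handlers [event])
        return state_handlers [event];
    return agent_defaults [event];
}
```

Under this reading no special "defaults" state is needed: the agent-level
table plays that role for every state.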

THREAD SYNCHRONISATION
----------------------
I am assuming an SMT-style runtime now.  There are two main ways of doing
asynchronous i/o: the request can wait, or it can continue and get a signal
when it's ready.  Note that the wait blocks that thread, but not the agent,
of course.  These two methods should work as follows:

Wait on i/o:
    - the thread is blocked until the i/o is complete
    - if the i/o finishes without error, the thread is restarted normally
    - if there is an error, the thread is restarted with an exception

Note: if the action was the last one in an action list, the dialog can only
move to the next state if there was no exception.  Exceptions do not happen
in the next state!  (I know SMT/3 works differently, but it's wrong.)

[JS: Yes I think you're right conceptually about the state change thing.
 It's going to be nasty to implement though.]

Signal i/o:
    - the thread continues normally
    - when the i/o finishes, an event is sent to the thread's event queue
    - if there is an error, an error event is sent to the thread's event queue
    - the thread collects this event in the next state

Both i/o methods are similar when they are the last action in a list.  In both
cases the thread is blocked - either waiting on i/o, or waiting on an event.
The big difference is in error handling, which is done with exceptions in one
case, events in the other.

[JS: ...and the_next_event has not been set.]

How do we generate these events and exceptions?  A simple way is to define
handlers which the kernel calls when necessary:

  <handler name = "handle i/o exception">
    ... checks this-> iostatus and returns an exception event
  </handler>
  <handler name = "handle i/o completion">
    ... checks this-> iostatus and returns an event
  </handler>
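
The two handlers could look roughly like this in C.  This is a sketch
under assumptions: the thread_t layout, the event and exception numbers,
and the kernel_finish_* functions are all invented here to show when the
kernel would call each handler.

```c
#include <assert.h>
#include <stddef.h>

enum {
    NO_EVENT       = 0,
    IO_DONE_EVENT  = 1,
    IO_ERROR_EVENT = 2,
    IO_ERROR_CATCH = 1000
};

typedef struct {
    int iostatus;           /* 0 = i/o succeeded, non-zero = error code */
    int queued_event;       /* one-slot stand-in for the event queue    */
    int exception;          /* exception raised on restart, 0 = none    */
} thread_t;

/* "handle i/o exception": returns an exception, or 0 if there is none. */
static int handle_io_exception (const thread_t *thread)
{
    return thread->iostatus ? IO_ERROR_CATCH : 0;
}

/* "handle i/o completion": returns the event to put on the queue. */
static int handle_io_completion (const thread_t *thread)
{
    return thread->iostatus ? IO_ERROR_EVENT : IO_DONE_EVENT;
}

/* Wait on i/o: the blocked thread restarts, with an exception on error. */
void kernel_finish_wait (thread_t *thread, int iostatus)
{
    assert (thread != NULL);
    thread->iostatus  = iostatus;
    thread->exception = handle_io_exception (thread);
}

/* Signal i/o: the thread kept running; the result arrives as an event. */
void kernel_finish_signal (thread_t *thread, int iostatus)
{
    assert (thread != NULL);
    thread->iostatus     = iostatus;
    thread->queued_event = handle_io_completion (thread);
}
```

The same i/o error therefore surfaces as an exception in the waiting case
and as an ordinary event in the signalled case.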


IO MONITORING
-------------
When we want to mix asynchronous i/o with other events - such as alarms - the
above two methods are not enough.  We cannot wait on i/o and miss an alarm
event, and we cannot accept i/o signals when we are handling an alarm event.

[JS: Why not? I thought the problem was that it was too difficult to
remember that the 'wait on event' was still current.]

We need a way to monitor a set of conditions and block the thread until one
of these is true.  The possible conditions are:

  - input is ready on socket or device
  - socket or device is ready for output
  - an alarm event has arrived
  - an event (from another thread object) has arrived in event queue

[JS: Do you mean OS-alarm, or some SMT notion of alarm?]
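
A minimal C sketch of such a monitor, treating the four conditions as bit
flags.  The monitor_t type and the monitor_* functions are hypothetical;
a real kernel would fill in the "ready" bits from select() or similar and
from its own queues, which is left out here.

```c
#include <assert.h>
#include <stddef.h>

/* The four conditions listed above, as hypothetical bit flags. */
enum {
    MON_INPUT_READY  = 1 << 0,
    MON_OUTPUT_READY = 1 << 1,
    MON_ALARM        = 1 << 2,
    MON_QUEUED_EVENT = 1 << 3
};

typedef struct {
    unsigned wanted;        /* conditions the thread is monitoring   */
    unsigned ready;         /* conditions the kernel has seen so far */
} monitor_t;

/* Called by the kernel when a condition becomes true. */
void monitor_post (monitor_t *mon, unsigned condition)
{
    assert (mon != NULL);
    mon->ready |= condition;
}

/* Non-zero while none of the wanted conditions is true. */
int monitor_blocked (const monitor_t *mon)
{
    return (mon->wanted & mon->ready) == 0;
}

/* Returns one satisfied condition (lowest bit first) and clears it,
   or 0 if the thread must stay blocked.                             */
unsigned monitor_take (monitor_t *mon)
{
    unsigned hits = mon->wanted & mon->ready;
    unsigned one  = hits & (~hits + 1u);    /* isolate lowest set bit */
    mon->ready &= ~one;
    return one;
}
```

A thread that monitors both input and alarms stays blocked when only the
output side becomes ready, and wakes as soon as the alarm is posted.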

The simplest way to translate these conditions into events is to use handlers
that do the work:

  <handler name = "handle input ready">
  </handler>

  <handler name = "handle output ready">
  </handler>

For the alarm event, no translation is needed.  For the external events, ditto.

[JS: Isn't it nicer to handle all 4 occurrences in the same way?]


METHODS AND THE EVENT QUEUE
---------------------------
Methods may send an event to the thread object's event queue.  This happens
immediately.  Methods may also pass arguments to the thread object.  We do this
by passing a structure reference.  For each method with arguments, iCL generates
a structure with the name <classname>_<methodname>_t.  The method will (when
sending an event with arguments) allocate a new structure, populate it, and
send it to the event queue with a call like this:

    queue_the_event (xxx_event, (void *) &structure);

By convention the name of the event will be the same as the name of the
method, and we will probably generate this code automatically to eliminate
the possibility of errors.

[JS: You know my feelings about this. Some methods can be processed
synchronously; forcing events, states, replies in this case just 
introduces more room for error as well as reducing performance. Having 
said this, I don't mind generating the event where it is appropriate,
but let's allow 1. for immediate processing prior to sending the event 
and 2. the event is not obligatory.]

When the thread object receives the event it can access the arguments by
addressing the variable (with a type cast):

    (<classname>_<methodname>_t *) thread-> s-> arguments-> ...

The structure will be automatically freed by the SMT kernel at the next
state transition.
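
The life cycle of an argument structure can be sketched in C as follows.
The class "demo", the method "write", the flat thread_t layout, and the
extra thread parameter to queue_the_event () are assumptions made to keep
the example self-contained; the generated code may well differ.

```c
#include <assert.h>
#include <stdlib.h>

/* <classname>_<methodname>_t for hypothetical class "demo", method "write" */
typedef struct {
    int         handle;
    const char *buffer;     /* pointer only; the caller keeps it valid */
} demo_write_t;

enum { NO_EVENT = 0, WRITE_EVENT = 1 };

typedef struct {
    int   event;            /* one-slot stand-in for the event queue */
    void *arguments;        /* structure that came with the event    */
} thread_t;

/* Stand-in for queue_the_event (): stores the event and its arguments. */
static void queue_the_event (thread_t *thread, int event, void *arguments)
{
    thread->event     = event;
    thread->arguments = arguments;
}

/* The method: allocate, populate, send.  It returns nothing. */
void demo_write (thread_t *thread, int handle, const char *buffer)
{
    demo_write_t *args = malloc (sizeof (demo_write_t));
    assert (args != NULL);
    args->handle = handle;
    args->buffer = buffer;
    queue_the_event (thread, WRITE_EVENT, args);
}

/* What the kernel would do at the next state transition. */
void kernel_state_transition (thread_t *thread)
{
    free (thread->arguments);
    thread->arguments = NULL;
    thread->event     = NO_EVENT;
}
```

Between demo_write () and the state transition, the thread object reads the
arguments through a cast such as (demo_write_t *) thread->arguments.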

[JS: What if, as is typically the case, several different events use
identical arguments? SMT/2 and SMT/3 as it is have an additional layer 
('message') to accommodate this. OTOH we could define
<classname>_<methodname>_t for each method, even if some of them are 
identical. But that makes using common code for the different methods 
difficult. Furthermore, it is typically the case that several different 
methods are almost identical, and when hand-coded at least, use the same 
event. How about

<agent>
  <message name = ...>
   <method name = ... event = "event name">
   ....
   <method> - this method doesn't use an event
  </message>
  <method event = "event name"> - this method takes no arguments
  <method> - this method has no arguments & no event
  ...

Then, rather than <classname>_<methodname>_t we can use 
<classname>_<messagename>_t.

Although this is somewhat more complicated than your proposal, suitable 
default processing can keep this transparent, and allowing more 
flexibility from the start is less work than introducing it later.]

[JS: Why is this an issue?]


Method structures passed via the event queue should not contain large
data buffers, but should instead contain pointers to buffers that are
guaranteed to be valid until the thread object has replied with some event
of its own.  Small amounts of data may be passed via the event queue.

The event queue has no size limit (we use the iCL memory model to minimise
heap allocations).

