Using the Actor Programming Model
Introduction
Software developers are starting to wake up to the notion that concurrency and parallel processing are becoming more important as the industry matures. I recently studied the Actor Programming Model in a little bit of detail and am intrigued by its simplicity and robustness. The Actor Programming model is a concurrent programming technique which provides a powerful mechanism for encapsulating several concurrency features. Building code based on this model allows implementation of some advanced concurrent constructs with fairly simple code. A developer not well experienced in parallel coding should be able to write concurrent code following only a few rules to prevent the common problems of race conditions and non-deterministic behavior.
This article explores a few uses of Actors using Win32 in C++. The code examples provided were developed and tested with Visual Studio 2010, but I would expect them to work with older and newer versions of the compiler and in a 64-bit environment. The recent C++ 2011 language standard added threading support to the language and its standard library (implemented starting with Visual Studio 2012), which would allow an alternate implementation of some of the threading details. The code is Win32, but the underlying principles are universal and applicable to many languages and platforms.
Background
In general terms, the Actor concept was first developed around 1973(1) and later refined, originally on a platform of multiple independent processors in a network. Implementation on a multiprocessor machine provides several basic concurrency features, including encapsulation of parallel synchronization and serialized message processing, which allow higher-level concurrent features such as fork/join, async/await, pipeline processing and others. The actor code encapsulates the threading and synchronization management so that a class derived from it can use threading techniques without having to implement the low-level plumbing details.
What Is It?
In simplest terms, an actor is an object which has the following characteristics:
- It is an autonomous, interacting component of a parallel system comprising an execution control context (i.e., a process, thread or fiber), an externally accessible address, mutable local state, and APIs to manipulate and observe the state.(2) Each actor's state is unique and is not shared with other objects.
- Its processing states are created, running and stopped, with substates determined by the programmer. In every processing state, external code can query the actor for state information, as allowed or prohibited by the actor. The actor's lifetime proceeds from created, to running, to stopped. Once stopped, it does not restart.(3)
- It has APIs to start processing, and manage a synchronized message queue, from which it receives requests for action from the enclosing program (including itself or other actors). When the actor is created, the queue can accept messages, but they are not processed. When running, the actor processes the messages sequentially and atomically, one message at a time. Pending messages are queued in the message queue. When stopped, messages are ignored. Messages sent from multiple execution contexts to the actor are not guaranteed to arrive in temporal order, although multiple messages from the same source will arrive in chronological order.(4)
- The actor is created externally and started with an external API call. It is stopped by sending a 'stop' request through the message queue, which the actor responds to by cleaning up and terminating itself. When running, an actor can process a finite number of messages, send messages to itself or other actors, change local state and create/control/terminate a finite number of other actor objects. Besides creation and starting, the local state mutates only when processing a message.
- An actor is a passive and lazy object. It will not respond or execute unless a message is sent to it via the message queue.
- Parallelism can be observed with multiple actors processing messages concurrently.
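The synchronized message queue described in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the C++11 standard library (std::mutex and std::condition_variable) rather than the Win32 primitives used by the article's code, and the names Mailbox and RunMailboxDemo are hypothetical, not part of the sample code.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical mailbox: a thread-safe FIFO queue of int messages.
class Mailbox {
public:
    // Called from any thread; appends a message and wakes the consumer.
    void Push(int msg) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push_back(msg);
        }
        m_ready.notify_one();
    }

    // Called by the owning actor; blocks until a message is available.
    int Pop() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_ready.wait(lock, [this] { return !m_queue.empty(); });
        int msg = m_queue.front();
        m_queue.pop_front();
        return msg;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::deque<int> m_queue;
};

// Two senders each push `count` messages encoded as id * 1000 + sequence.
// Returns the messages in the order the single consumer received them.
std::vector<int> RunMailboxDemo(int count) {
    assert(count > 0 && count < 1000);
    Mailbox box;
    auto sender = [&box, count](int id) {
        for (int i = 0; i < count; ++i) box.Push(id * 1000 + i);
    };
    std::thread a(sender, 1), b(sender, 2);
    std::vector<int> received;
    for (int i = 0; i < 2 * count; ++i) received.push_back(box.Pop());
    a.join();
    b.join();
    return received;
}
```

With this arrangement the two senders' messages may interleave in any way, but each sender's own messages come out in the order they were pushed, which matches the ordering assumption in footnote (4).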
The examples created here will consider actors as a C++ "framework" base class containing basic functionality and one or more derived classes containing some required plumbing and desired behavior provided by the programmer. For the actor representation described above, see the diagram in Figure 1.
Once created, the actor base class has 2 provided public methods Start() and Send() which start the actor and send messages to the message queue, plus a protected method Process() to implement payload behavior and Exit() to terminate. The Process() method is pure virtual and must be implemented in the derived class. The base class encapsulates message handling and the creation, deletion and management of a thread. The derived class (this would be provided by you) must implement the main intended actor behavior and activation of termination handling in the Process() method. This is a minimal description; of course, other details can be implemented at the discretion of the programmer, for example to retrieve actor state or internal data. Note that the actor is not (gracefully) stopped directly from an external API call but asynchronously shuts itself down and exits when processing a prearranged shutdown message through the message queue. Ensuring this behavior is the responsibility of the derived class and calling code.
An extract of the actor base class is:
class HBActor
{
public:
    HBActor();
    virtual ~HBActor();
public:
    virtual void Send(BaseMessage* message);
    virtual void Start();
protected:
    virtual void Process(BaseMessage* /* msg */) = 0;
    // ...
};
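To make the division of labor between base and derived class concrete, here is a minimal sketch of how such a base class might be put together. It is not the article's HBActor: it assumes C++11 std::thread in place of Win32 threads, passes messages as std::unique_ptr instead of raw pointers, and every name in it (SketchActor, SketchMessage, NumberMsg, Recorder, RunRecorderDemo) is hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical message base type (stands in for the article's BaseMessage).
struct SketchMessage {
    virtual ~SketchMessage() {}
};

// Hypothetical actor base class built on C++11 threads.
class SketchActor {
public:
    SketchActor() : m_exit(false) {}
    // Callers must stop the actor and Join() before destroying it.
    virtual ~SketchActor() {}

    // Queue a message; safe from any thread, before or after Start().
    void Send(std::unique_ptr<SketchMessage> msg) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(msg));
        }
        m_ready.notify_one();
    }

    // Launch the worker thread; call once.
    void Start() {
        assert(!m_thread.joinable());
        m_thread = std::thread(&SketchActor::Run, this);
    }

    // Block until the worker thread has exited.
    void Join() {
        if (m_thread.joinable()) m_thread.join();
    }

protected:
    virtual void Process(SketchMessage& msg) = 0;
    // Called from Process(); ends the loop after the current message.
    void Exit() { m_exit = true; }

private:
    void Run() {
        while (!m_exit) {
            std::unique_ptr<SketchMessage> msg;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_ready.wait(lock, [this] { return !m_queue.empty(); });
                msg = std::move(m_queue.front());
                m_queue.pop();
            }
            Process(*msg);  // one message at a time, outside the lock
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<std::unique_ptr<SketchMessage>> m_queue;
    std::thread m_thread;
    bool m_exit;  // touched only by the worker thread
};

struct NumberMsg : SketchMessage {
    explicit NumberMsg(int v) : value(v) {}
    int value;
};

// Records each value; a negative value is the prearranged stop request.
class Recorder : public SketchActor {
public:
    std::vector<int> seen;  // read by the caller only after Join()
protected:
    void Process(SketchMessage& msg) override {
        int v = static_cast<NumberMsg&>(msg).value;
        if (v < 0) Exit();
        else seen.push_back(v);
    }
};

std::vector<int> RunRecorderDemo() {
    Recorder actor;
    actor.Send(std::unique_ptr<SketchMessage>(new NumberMsg(1)));  // queued before Start()
    actor.Start();
    actor.Send(std::unique_ptr<SketchMessage>(new NumberMsg(2)));
    actor.Send(std::unique_ptr<SketchMessage>(new NumberMsg(-1)));
    actor.Join();
    return actor.seen;
}
```

Note that the message sent before Start() is accepted but not processed until the actor is running, and that the only way the loop ends is the stop request arriving through the queue.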
By implementing some fairly simple classes, we can build on this to create a fairly complex framework with little effort. Actors can interact in a network, comprising well known architectures such as a fork/join structure, a pipeline or shared work queue. Let's look at a fork/join example.
Fork/Join
A fork/join solver(5) is briefly summarized as follows:
Result solve(Problem problem)
{
    if (problem is small)
        directly solve problem
    else {
        split problem into independent parts
        fork new subtasks to solve each part
        join all subtasks
        compose result from subresults
    }
}
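As a concrete instance of this pseudocode, the sketch below sums an array recursively. It deliberately uses std::async and std::future from C++11 rather than actors, simply to show the split/fork/join/compose shape in runnable form; the function name ParallelSum and its threshold parameter are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical problem: sum the elements in [first, last).
long long ParallelSum(const int* first, const int* last, std::size_t threshold) {
    assert(threshold > 0);
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n <= threshold)                        // problem is small: solve directly
        return std::accumulate(first, last, 0LL);
    const int* mid = first + n / 2;            // split into independent parts
    std::future<long long> left =              // fork one subtask on another thread
        std::async(std::launch::async, ParallelSum, first, mid, threshold);
    long long right = ParallelSum(mid, last, threshold);  // solve the other half here
    return left.get() + right;                 // join, then compose the result
}
```

The threshold plays the role of "problem is small": it should be large enough that each leaf does far more work than it costs to launch a thread.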
Suppose that we have some code with two orthogonal pieces that could execute in parallel. This can be implemented fairly simply with actors using a simplistic fork/join arrangement. The original serial code would look like:
void foo()
{
    LongProcess1();
    LongProcess2();
}
void MyCode()
{
    foo();
}
If we implement at least one of these as an actor object, method foo() can be rewritten as a trivial fork/join with the two pieces for some speedup. To do this, of course the two functions need to be truly orthogonal to each other with no shared data to avoid race conditions. LongProcess2() also must be void (or the return value ignored) since it operates autonomously and we can't get a return value back from it.
This would look like:
typedef enum { DOTASK, STOPACTOR } MsgType_t;
// Message class
class Msg : public BaseMessage
{
public:
    Msg(int iValue)
    { m_iValue = iValue; }
    virtual ~Msg(){}
    int GetValue() const
    { return m_iValue; }
private:
    int m_iValue;
};
// Simple actor class to implement fork/join of a function
class MyActor : public HBActor
{
public:
    MyActor() {}
    virtual ~MyActor() {}
protected:
    virtual void Process(BaseMessage* pBMsg)
    {
        Msg* pMsg = (Msg*) pBMsg;
        if(STOPACTOR == pMsg->GetValue()) // Handle termination request
            Exit();
        else // Handle execution request
            LongProcess2();
        delete pBMsg;
    }
};
void foo()
{
    MyActor actor;
    actor.Start();
    actor.Send(new Msg(DOTASK));
    actor.Send(new Msg(STOPACTOR));
    LongProcess1();
    actor.Join();
}
void MyCode()
{
    foo();
}
This is a simplistic configuration for the fork/join model. Ask Google for some more involved code examples.
The actor object activates a thread as part of its startup code, which can consume resources and take some time. Creation of a thread is not free or instantaneous. Be sure that the two APIs LongProcess1() and LongProcess2() are indeed "long" compared to thread creation, or you will be wasting your time with this implementation.
Another example calculating a list of primes using a pipeline of actors is included in the sample code.
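For readers without the sample code at hand, the following is a small independent sketch of the pipeline idea. It is not the article's prime example: it assumes C++11 threads, a fixed two-stage pipeline (filter, then collector) fed by the calling thread, and the value 0 as the prearranged stop request. The names Stage, IsPrime and PrimesUpTo are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pipeline stage: a thread that applies `handler` to each
// queued value in turn. A value of 0 is the prearranged stop request.
class Stage {
public:
    explicit Stage(std::function<void(int)> handler)
        : m_handler(std::move(handler)), m_thread(&Stage::Run, this) {}

    void Send(int value) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(value);
        }
        m_ready.notify_one();
    }

    void Join() {
        if (m_thread.joinable()) m_thread.join();
    }

private:
    void Run() {
        for (;;) {
            int value;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_ready.wait(lock, [this] { return !m_queue.empty(); });
                value = m_queue.front();
                m_queue.pop();
            }
            m_handler(value);        // the handler also sees the stop value
            if (value == 0) return;  // then this stage shuts itself down
        }
    }

    std::function<void(int)> m_handler;
    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<int> m_queue;
    std::thread m_thread;  // declared last so it starts after the other members
};

// Trial division; adequate for a small demonstration.
bool IsPrime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

std::vector<int> PrimesUpTo(int limit) {
    assert(limit >= 2);
    std::vector<int> primes;  // written only by the collector stage
    Stage collector([&primes](int v) {
        if (v != 0) primes.push_back(v);
    });
    Stage filter([&collector](int v) {
        // Forward primes, and pass the stop request down the pipeline.
        if (v == 0 || IsPrime(v)) collector.Send(v);
    });
    for (int n = 2; n <= limit; ++n) filter.Send(n);  // this thread is the generator
    filter.Send(0);
    filter.Join();
    collector.Join();
    return primes;
}
```

Because each stage handles its messages one at a time and forwards them in order, the collector receives the primes in ascending order, and the stop request travels through the same queues as the data.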
Limitations of the Actor Programming Model
To be complete, here are some of the realities of the Actor model:
- Use of actors reduces the opportunities for race conditions but does not eliminate them. Data race conditions are possible if the messages or underlying logic touched by the actor objects include mutable shared objects. Implementation of truly concurrent data structures is non-trivial. The actor model improves on some of these issues, but does not solve all of the problems.
- Deadlocks are possible in a number of situations, for example when two actors each block waiting for a reply from the other.
- The Actor model implements message passing in the direction of the actor, but does not facilitate sending a request and receiving a specific status or a reply to a request. Synchronous replies require some sort of blocking logic. For information on objects which can provide this behavior, look at "futures".
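One way to get a reply back from an actor is to put a promise inside the request message and hand the matching future to the caller. The sketch below shows that pattern with C++11 std::promise and std::future; it is an illustration only, not part of the article's code, and the names SquareRequest, SquareActor, Ask and AskForSquare are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical request: an input value plus a promise for the reply.
struct SquareRequest {
    int input;
    bool stop;
    std::promise<int> reply;
};

// Hypothetical actor that answers each request with input * input.
class SquareActor {
public:
    SquareActor() : m_thread(&SquareActor::Run, this) {}
    ~SquareActor() { Stop(); }

    // Queue a request and return a future; the caller blocks only in get().
    std::future<int> Ask(int input) {
        SquareRequest req;
        req.input = input;
        req.stop = false;
        std::future<int> result = req.reply.get_future();
        Enqueue(std::move(req));
        return result;
    }

    // Send the stop request through the queue and wait for the thread.
    void Stop() {
        if (!m_thread.joinable()) return;
        SquareRequest req;
        req.input = 0;
        req.stop = true;
        Enqueue(std::move(req));
        m_thread.join();
    }

private:
    void Enqueue(SquareRequest req) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(req));
        }
        m_ready.notify_one();
    }

    void Run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(m_mutex);
            m_ready.wait(lock, [this] { return !m_queue.empty(); });
            SquareRequest req = std::move(m_queue.front());
            m_queue.pop();
            lock.unlock();
            if (req.stop) return;
            req.reply.set_value(req.input * req.input);  // fulfils the caller's future
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<SquareRequest> m_queue;
    std::thread m_thread;  // declared last so it starts after the other members
};

int AskForSquare(int x) {
    SquareActor actor;
    std::future<int> answer = actor.Ask(x);
    assert(answer.valid());
    // Other work could be done here before the reply is needed.
    return answer.get();  // blocks until the actor has replied
}
```

The blocking is confined to get(), so the caller decides when, and whether, to wait. Calling get() from inside another actor's message handler reintroduces the deadlock risk mentioned above.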
Footnotes
- (1) Source: http://dl.acm.org/citation.cfm?id=1624804
- (2) Actors can actually execute in a computer network or as multiple processes in separate address spaces. In this article, I consider actors on a single machine in one address space with multiple threads.
- (3) There is also some work describing a "pause" and "resume" feature which is not considered here.
- (4) Some reference information does not guarantee this detail. For purposes of this article, it will be assumed to be the case.
- (5) Source: http://gee.cs.oswego.edu/dl/papers/fj.pdf
Points of Interest
I have seen various comments on the Actor Programming Model on the web, including some detractors. I have been very happy with the functionality this model presents. I hope you enjoy it!
In my coding travels, I have used the actor model as a logging class, a buffered I/O handler (both input and output) and as an iterative problem solver. I love it!
As stated above, this code was developed with VS2010 on Win32. I am interested in validation of the code with other compilers and other platforms. Perhaps leave comments below if you have used it with success on another platform.
History
- 2012/10/20 Initial version.
- 2012/10/27 Reinstate lost bullet points, add missing Join() call, fix footnote reference.