Using the Actor Programming Model

H.Brydon

Oct 23, 2012

CPOL

Download source - 7.3 KB

Introduction

Software developers are starting to wake up to the notion that concurrency and parallel processing are becoming more important as the industry matures. I recently studied the Actor Programming Model in some detail and am intrigued by its simplicity and robustness. The Actor Programming Model is a concurrent programming technique that encapsulates several concurrency features behind a simple interface. Building code on this model allows some advanced concurrent constructs to be implemented with fairly simple code. A developer without much experience in parallel coding should be able to write concurrent code by following only a few rules that prevent the common problems of race conditions and non-deterministic behavior.

This article explores a few uses of actors using Win32 in C++. The code examples provided were developed and tested with Visual Studio 2010, but I would expect them to work with older and newer versions of the compiler and in a 64-bit environment. The recent C++11 language standard added threading support such as std::thread and std::mutex (implemented starting with Visual Studio 2012), which would allow an alternate implementation of some of the threading details. The code is Win32, but the underlying principles are universal and apply to many languages and platforms.

Background

In general terms, the Actor concept was first developed and refined around 1973⁽¹⁾, originally with a platform of multiple independent processors in a network in mind. Implemented on a multiprocessor machine, it provides several basic concurrency features, including encapsulation of parallel synchronization and serialized message processing. These in turn allow higher-level concurrent features such as fork/join, async/await, pipeline processing and others. The actor code encapsulates the threading and synchronization management, so a class derived from it can use threading techniques without having to implement the low-level plumbing details.

What Is It?

In simplest terms, an actor is an object which has the following characteristics:

It is an autonomous, interacting component of a parallel system comprising an execution control context (i.e., a process, thread or fiber), an externally accessible address, mutable local state, and APIs to manipulate and observe the state.⁽²⁾ Each actor's state is unique and is not shared with other objects.

The processing states are created, running and stopped, with substates determined by the programmer. In every processing state, external code can retrieve state information about the actor, as allowed or prohibited by the actor. The actor's lifetime proceeds from created, to running, to stopped. Once stopped, it does not restart.⁽³⁾

It has APIs to start processing and to manage a synchronized message queue, from which it receives requests for action from the enclosing program (including itself or other actors). While the actor is in the created state, the queue accepts messages, but they are not processed. When running, the actor processes the messages sequentially and atomically, one message at a time. Pending messages wait in the message queue. When stopped, messages are ignored. Messages sent from multiple execution contexts to the actor are not guaranteed to arrive in temporal order, although multiple messages from the same source will arrive in chronological order.⁽⁴⁾

The actor is created externally and started with an external API call. It is stopped by sending a 'stop' request through the message queue, which the actor responds to by cleaning up and terminating itself. When running, an actor can process a finite number of messages, send messages to itself or other actors, change local state and create/control/terminate a finite number of other actor objects. Besides creation and starting, the local state mutates only when processing a message.

An actor is a passive and lazy object. It will not respond or execute unless a message is sent to it via the message queue.

Parallelism can be observed with multiple actors processing messages concurrently.

The examples created here will consider actors as a C++ "framework" base class containing basic functionality and one or more derived classes containing some required plumbing and desired behavior provided by the programmer. For the actor representation described above, see the diagram in Figure 1.

Figure 1. Representation of the base class in the actor programming model

Once created, the actor base class provides two public methods, Start() and Send(), which start the actor and send messages to the message queue. It also provides two protected methods: Process(), which implements the payload behavior, and Exit(), which terminates the actor. The Process() method is pure virtual and must be implemented in the derived class. The base class encapsulates message handling and the creation, deletion and management of a thread. The derived class (this would be provided by you) must implement the main intended actor behavior, and must trigger termination handling, in the Process() method. This is a minimal description; other details can be implemented at the discretion of the programmer, for example to retrieve actor state or internal data. Note that the actor is not (gracefully) stopped directly by an external API call. Instead, it shuts itself down asynchronously and exits when it processes a prearranged shutdown message from the message queue. Ensuring this behavior is the responsibility of the derived class and the calling code.

An extract of the actor base class is:

class HBActor
{
public:
  HBActor();
  virtual ~HBActor();

public:
  virtual void Send(BaseMessage* message);
  virtual void Start();

protected:
  virtual void Process(BaseMessage* /* msg */) = 0;

// ...
};
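The extract leaves out the plumbing. As a rough idea of what a base class of this shape has to do, here is a minimal sketch written against the C++11 standard library instead of Win32, so it needs Visual Studio 2012 or later, or another C++11 compiler. It is not the HBActor code from the download: SketchActor, Message, AddMessage, StopMessage, SummingActor and RunSummingDemo are hypothetical names, messages are passed as std::unique_ptr rather than raw pointers, and Exit() is assumed to simply set a flag that ends the processing loop.

```cpp
#include <cassert>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for the article's BaseMessage.
struct Message {
  virtual ~Message() {}
};

// Hypothetical base class; NOT the HBActor implementation from the download.
class SketchActor {
public:
  SketchActor() : exiting_(false) {}
  virtual ~SketchActor() { assert(!worker_.joinable()); }  // call Join() first

  // Queue a message. Allowed in any state; processed only while running.
  void Send(std::unique_ptr<Message> message) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(std::move(message));
    }
    ready_.notify_one();
  }

  // created -> running: launch the thread that drains the queue.
  void Start() {
    assert(!worker_.joinable());
    worker_ = std::thread(&SketchActor::Run, this);
  }

  // Block until the actor has processed a message that called Exit().
  void Join() { if (worker_.joinable()) worker_.join(); }

protected:
  virtual void Process(Message& message) = 0;
  void Exit() { exiting_ = true; }  // call only from inside Process()

private:
  void Run() {
    while (!exiting_) {
      std::unique_ptr<Message> message;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        message = std::move(queue_.front());
        queue_.pop();
      }
      Process(*message);  // one message at a time, outside the lock
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<std::unique_ptr<Message>> queue_;
  std::thread worker_;
  bool exiting_;  // touched only by the worker thread after Start()
};

// Hypothetical messages and a derived actor that keeps a running total.
struct AddMessage : Message {
  explicit AddMessage(int v) : value(v) {}
  int value;
};
struct StopMessage : Message {};

class SummingActor : public SketchActor {
public:
  SummingActor() : total_(0) {}
  int Total() const { return total_; }  // safe to read only after Join()

protected:
  void Process(Message& message) override {
    if (dynamic_cast<StopMessage*>(&message))
      Exit();
    else if (AddMessage* add = dynamic_cast<AddMessage*>(&message))
      total_ += add->value;
  }

private:
  int total_;
};

// Usage: one message is queued before Start(), the rest while running.
int RunSummingDemo() {
  SummingActor actor;
  actor.Send(std::unique_ptr<Message>(new AddMessage(2)));
  actor.Start();
  actor.Send(std::unique_ptr<Message>(new AddMessage(40)));
  actor.Send(std::unique_ptr<Message>(new StopMessage()));
  actor.Join();
  return actor.Total();  // 2 + 40
}
```

Process() runs outside the queue lock, so senders are never blocked by a long-running message. Messages that arrive after the stop message are never processed and are freed when the actor is destroyed.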

By implementing some fairly simple classes, we can build on this to create a fairly complex framework with little effort. Actors can interact in a network, forming well-known architectures such as a fork/join structure, a pipeline or a shared work queue. Let's look at a fork/join example.

Fork/Join

A fork/join solver⁽⁵⁾ is briefly summarized as follows:

Result solve(Problem problem)
{
  if (problem is small)
    directly solve problem
  else {
    split problem into independent parts
    fork new subtasks to solve each part
    join all subtasks
    compose result from subresults
  }
}
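To make the pseudocode concrete, here is a small sketch that applies it to an example problem of my choosing: summing a range of integers. It uses std::async from C++11 rather than actors, purely to show the split, fork, join and compose steps. ForkJoinSum and its threshold parameter are hypothetical names, not part of the article's download.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical example: sum data[begin, end) with a recursive fork/join.
long long ForkJoinSum(const std::vector<int>& data, std::size_t begin,
                      std::size_t end, std::size_t threshold) {
  assert(begin <= end && end <= data.size() && threshold > 0);

  if (end - begin <= threshold) {
    // Problem is small: solve it directly.
    return std::accumulate(data.data() + begin, data.data() + end, 0LL);
  }

  // Split the problem into two independent halves.
  const std::size_t middle = begin + (end - begin) / 2;

  // Fork: the left half is solved on another thread.
  std::future<long long> left = std::async(
      std::launch::async, [&data, begin, middle, threshold] {
        return ForkJoinSum(data, begin, middle, threshold);
      });

  // The right half is solved on the current thread in the meantime.
  const long long right = ForkJoinSum(data, middle, end, threshold);

  // Join the subtask, then compose the result from the subresults.
  return left.get() + right;
}
```

The threshold decides when a part is "small". Set it too low and the cost of launching threads outweighs the work done in each one.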

Suppose that we have some code with two orthogonal pieces that could execute in parallel. This can be parallelized fairly simply with actors using a simplistic fork/join arrangement. The sequential starting point would look like:

void foo()
{
  LongProcess1();
  LongProcess2();
}

void MyCode()
{
  foo();
}

If we implement at least one of these as an actor object, method foo() can be rewritten as a trivial fork/join with the two pieces for some speedup. To do this, of course the two functions need to be truly orthogonal to each other with no shared data to avoid race conditions. LongProcess2() also must be void (or the return value ignored) since it operates autonomously and we can't get a return value back from it.

This would look like:

typedef enum { DOTASK, STOPACTOR } MsgType_t;

// Message class
class Msg : public BaseMessage
{
public:
  Msg(int iValue)
  { m_iValue = iValue; }
  virtual ~Msg(){}

  int GetValue() const
  { return m_iValue; }

private:
  int m_iValue;
};


// Simple actor class to implement fork/join of a function
class MyActor : public HBActor
{
public:
  MyActor() {}
  virtual ~MyActor() {}
protected:
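  virtual void Process(BaseMessage* pBMsg)
  {
    Msg* pMsg = static_cast<Msg*>(pBMsg);
    // Read the request and free the message first, in case Exit() does not return
    const int request = pMsg->GetValue();
    delete pBMsg;
    if(STOPACTOR == request) // Handle termination request
      Exit();
    else // Handle execution request
      LongProcess2();
  }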
};

void foo()
{
  MyActor actor;
  actor.Start();
  actor.Send(new Msg(DOTASK));
  actor.Send(new Msg(STOPACTOR));
  LongProcess1();
  actor.Join();
}

void MyCode()
{
  foo();
}

This is a simplistic configuration of the fork/join model. More involved code examples are easy to find with a web search.

The actor object activates a thread as part of its startup code, which consumes resources and takes some time. Creation of a thread is neither free nor instantaneous. Be sure that the two functions LongProcess1() and LongProcess2() are indeed "long" compared to thread creation, or you will be wasting your time with this implementation.
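One way to judge what "long" means on a given machine is to measure thread startup directly. The following is a rough sketch using std::thread and std::chrono from C++11, not the Win32 calls used by the article's code, so the numbers are only indicative. AverageThreadStartupMicros is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Average cost, in microseconds, of creating and joining one idle thread.
double AverageThreadStartupMicros(int samples) {
  assert(samples > 0);
  typedef std::chrono::steady_clock Clock;

  const Clock::time_point begin = Clock::now();
  for (int i = 0; i < samples; ++i) {
    std::thread worker([] {});  // the thread does no work at all
    worker.join();
  }
  const std::chrono::duration<double, std::micro> elapsed =
      Clock::now() - begin;

  return elapsed.count() / samples;
}
```

If the work handed to the actor takes about as long as the figure this returns, the fork/join arrangement is unlikely to pay off.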

Another example calculating a list of primes using a pipeline of actors is included in the sample code.
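The sample code's prime pipeline is not reproduced here. As a rough idea of the shape of such a pipeline, the sketch below connects two stages through thread-safe mailboxes, with each stage processing its messages one at a time. It uses C++11 threads rather than the HBActor class, and Mailbox, kStop, IsPrime and PrimesUpTo are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical thread-safe mailbox connecting two pipeline stages.
class Mailbox {
public:
  void Send(int value) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(value);
    }
    ready_.notify_one();
  }

  int Receive() {  // blocks until a message is available
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return !queue_.empty(); });
    const int value = queue_.front();
    queue_.pop();
    return value;
  }

private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<int> queue_;
};

const int kStop = 0;  // assumed sentinel: marks the end of the stream

bool IsPrime(int n) {
  if (n < 2) return false;
  for (int d = 2; d * d <= n; ++d)
    if (n % d == 0) return false;
  return true;
}

// Stage 1 generates candidates, stage 2 filters them, the caller collects.
std::vector<int> PrimesUpTo(int limit) {
  assert(limit >= 0);
  Mailbox candidates;
  Mailbox primes;

  std::thread generator([&candidates, limit] {
    for (int n = 2; n <= limit; ++n) candidates.Send(n);
    candidates.Send(kStop);  // shut down the next stage
  });

  std::thread filter([&candidates, &primes] {
    for (int n = candidates.Receive(); n != kStop; n = candidates.Receive())
      if (IsPrime(n)) primes.Send(n);
    primes.Send(kStop);  // pass the shutdown request along
  });

  std::vector<int> result;
  for (int p = primes.Receive(); p != kStop; p = primes.Receive())
    result.push_back(p);

  generator.join();
  filter.join();
  return result;
}
```

Each stage stops when it receives the sentinel and forwards it downstream, which mirrors the article's rule that an actor is stopped by a message rather than by an external call.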

Limitations of the Actor Programming Model

To be complete, here are some of the realities of the Actor model:

Use of actors reduces the opportunities for race conditions but does not eliminate them. Data races are still possible if the messages or the underlying logic touched by the actor objects include mutable shared objects. Implementation of truly concurrent data structures is non-trivial. The actor model improves on some of these issues, but does not solve all of the problems.

Deadlocks are possible in a number of situations, for example when two actors each block while waiting for a message from the other.

The Actor model implements message passing in the direction of the actor, but does not facilitate sending a request and receiving a specific status or a reply to a request. Synchronous replies require some sort of blocking logic. For information on objects which can provide this behavior, look at "futures".
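As a sketch of how a future can carry a reply back to the sender, the example below puts a std::promise inside each request message and hands the matching std::future to the caller. It uses C++11 facilities, not the HBActor class. SquareRequest and SquaringActor are hypothetical names, and for brevity this actor starts its thread in the constructor and sends itself the stop message from the destructor, which differs from the Start() and Join() pattern used earlier.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical request message: the input plus a promise for the reply.
struct SquareRequest {
  int input;
  bool stop;
  std::promise<int> reply;
};

// Hypothetical actor that answers each request with the square of its input.
class SquaringActor {
public:
  SquaringActor() : worker_(&SquaringActor::Run, this) {}

  ~SquaringActor() {
    assert(worker_.joinable());  // the worker lives as long as the object
    Post(0, true);               // queue the stop message
    worker_.join();
  }

  // Send a request; the returned future will hold the reply.
  std::future<int> Ask(int input) { return Post(input, false); }

private:
  std::future<int> Post(int input, bool stop) {
    SquareRequest request;
    request.input = input;
    request.stop = stop;
    std::future<int> result = request.reply.get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(std::move(request));
    }
    ready_.notify_one();
    return result;
  }

  void Run() {
    for (;;) {
      std::unique_lock<std::mutex> lock(mutex_);
      ready_.wait(lock, [this] { return !queue_.empty(); });
      SquareRequest request = std::move(queue_.front());
      queue_.pop();
      lock.unlock();

      if (request.stop) return;
      request.reply.set_value(request.input * request.input);
    }
  }

  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<SquareRequest> queue_;
  std::thread worker_;  // declared last so the queue exists before Run() starts
};
```

The caller blocks only when it calls get() on the future, so it can send several requests and do other work before collecting the replies.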

Footnotes

⁽¹⁾ Source: http://dl.acm.org/citation.cfm?id=1624804
⁽²⁾ Actors can actually execute in a computer network or as multiple processes in separate address spaces. In this article, I consider actors on a single machine in one address space with multiple threads.
⁽³⁾ There is also some work describing a "pause" and "resume" feature which is not considered here.
⁽⁴⁾ Some reference information does not guarantee this detail. For purposes of this article, it will be assumed to be the case.
⁽⁵⁾ Source: http://gee.cs.oswego.edu/dl/papers/fj.pdf

Points of Interest

I have seen various comments on the Actor Programming Model on the web, including some detractors. I have been very happy with the functionality this model presents. I hope you enjoy it!

In my coding travels, I have used the actor model as a logging class, a buffered I/O handler (both input and output) and as an iterative problem solver. I love it!

As stated above, this code was developed with VS2010 on Win32. I am interested in validation of the code with other compilers and other platforms. Perhaps leave comments below if you have used it with success on another platform.

History

2012/10/20 Initial version.
2012/10/27 Reinstate lost bullet points, add missing Join() call, fix footnote reference.