Sunday, September 13, 2009

Independent Study: A Universal Modular ACTOR Formalism for Artificial Intelligence

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format of this and subsequent papers will be similar to that of a review of a paper submitted to a conference. These are already-published papers, so I'll be writing with the obvious benefit of hindsight.)

Submission: A Universal Modular ACTOR Formalism for Artificial Intelligence [PDF]

Reviewer: Nicholas Dronen

Please give a brief, 2-3 sentence summary of the main ideas of this paper:
The authors present their notion of an actor, a formalism which unifies control flow and data flow. Actors allow programs to achieve a high degree of parallelism without the explicit use of semaphores. Actors accomplish this by being free of side effects and communicating with other actors only by passing messages.
What is the strength of this paper (1-3 sentences):
The paper's greatest impact is the scalability (i.e. degree of parallelism) and reliability (discussed below) of actors.
What is the weakness of this paper (1-3 sentences):
While the ambition of the paper is admirable, it nonetheless casts too wide a net, covering topics from hardware design to formal representations of knowledge and belief. Historically, the value of actors as a way to reason about the properties of programs isn't clear, considering that other approaches, such as the pi calculus, have seen wider adoption.
Evaluation
In retrospect, a great deal of this paper could be removed without harm. Specifically, the idea of actors as an appropriate abstraction for hardware (e.g. removing interrupts) and the discussion of using actors to represent knowledge could be completely removed. The remaining material could then be expanded, ideally covering in more depth the concurrency-related aspects of actors.
Novelty
Some aspects of actors -- such as message passing and perhaps pattern matching -- are not original. What is novel in this paper is the combination of these techniques into a scheme that provides a high degree of concurrency.
Convincing
Quite.
Worth Solving
Yes.
Confidence
I am very confident in my evaluation of this paper. I've been thinking about actors for a year or so.
Detailed Comments
To understand actors, consider a thread. Its stack is finite, but it's undecidable how much stack space a function running in a thread requires, so system designers allocate a constant amount of stack space for each thread -- enough for well-behaved programs, but still too much to allow even tens of thousands of threads on a reasonably-equipped machine. It's conventional for 2 MB to be allocated to a thread on a 32-bit machine running Linux. If a process is limited to 2 GB of memory, and assuming for the sake of discussion that no other implementation issues (such as the placement of the heap) interfere, a process can have at most ~1000 threads. Actors solve this problem, which is admittedly not an enormous problem, but the solution to this problem solves another huge problem -- the high degree of concurrency that programs will require in order to exploit the large number of cores of future processors.

An actor is a user space thread, the stack of which doesn't grow -- essentially a tail-recursive function. So the system designer only has to allocate an extremely small amount of stack space -- basically enough for an activation record and local variables -- which means that a single process can contain many thousands of actors. For example, a benchmark has run with 20 million Erlang processes (the Erlang term for an actor).

An actor contains only local state, so it has no side effects. Since there's no shared state, the programmer doesn't have to concern herself with error-prone synchronization primitives, such as semaphores. Actors communicate with each other by sending messages. The incoming messages for an actor are placed in an actor's mailbox (this term doesn't appear in the paper but is in common use today). When an actor receives a message, it uses pattern matching (in the style of Haskell and ML) to determine the type of the message.

One would be wrong to infer that synchronization primitives are not necessary for a runtime system that supports actor-based programs in a single operating system process on an SMP or multicore machine. Imagine a program running on a multicore machine with 10,000 actors, and 9,999 are sending messages to the remaining one (who we'll call A). The actor A's mailbox will be filling up with messages, and it's possible that two actors executing concurrently can send a message to A at the same time, which implies that the scheduler for A has to be capable of handling two requests at once without corrupting A's mailbox.

The paper's authors found inspiration for actors in packet-switched networks, which in part explains why a program written in actor-based language seems so much like a distributed system. Where the program has actors sending messages to each other, the distributed system has nodes exchanging packets. In the domain of software architecture, there is an interesting analogue between REpresentational State Transfer (REST) and the actor model, which is mentioned during the question and answer period of a recent presentation by REST advocate Steve Vinoski.

Additional benefits of actors: since they contain only local state, they can be terminated if an error occurs. This effectively isolates errors. A small amount of data may be lost, but the program continues to run. Since actors consume so little memory, each actor can be monitored by another actor. The monitoring actor can be notified when the "child" actor dies, and it can restart the child. This increases the reliability of programs. Another consequence of the lack of global state in actor-based programs is the ease with which code can be reloaded into a running program without consequences. This increases availability, which is especially important in systems that are supposed to be fault tolerant.





No comments: