Communicating Sequential Elements


The Communicating Sequential Elements pattern is a domain parallelism pattern where each component performs the same operations on different pieces of ordered data [Fos94, CT92]. Operations in each component depend on partial results in neighbour components. Usually, this pattern is presented as a logical structure, conceived from the ordered data.

Examples

A Non-programming Example

Consider the case of honey bees. The workers act as constructors, harvesters, soldiers, and nursemaids. Each individual is capable to eventually perform any of these activities in cooperation with the others during its short lifetime. Minute-to-minute control of the hive’s behaviour is dispersed. Changes, like many shifts in foraging patterns of honey bees come from signals among the workers, whose success or failure at various tasks can rise hive-permeating hormone levels that then brings about changes in the whole hive.

A Programming Example

Consider the case of data mining problems, in which government and businesses organisations require information processing to acquire and analyse data about customers information, products, taxes, etc., devoting a lot of computational effort to automatically extract useful information or "knowledge" from these data. A particular type of data mining is mining for associations [CSG97, AS96], in which it is important to discover relations or associations among the information to generate rules for the inference of behaviour. For example, given a database in which records correspond to customer purchase transactions, the goal in mining for associations is to determine which sets of a number of items occur together in more than a given threshold fraction of these transactions.

A proposed solution uses an algorithm of several passes over the database. The first pass simply counts item occurrences to determine sets of items with frequency 1. A subsequent pass, say k, consists of two parts or phases: first the sets of items with frequency k-1 are used to generate a set of candidates, say Ck,, using a certain generation procedure. Next, the database is scanned and the support of candidates in Ck is counted. Fast counting is required to efficiently determine the candidates in Ck contained in a given transaction. Data distribution is used for this. Each processing element counts mutually exclusive local candidates. With a large number of elements, a large number of candidates can be counted in a pass. However, as part of the procedure, every element must broadcast its local data to all other elements at the end of every pass. Finally, after k passes, the results are available for analysis [CSG97, AS96].

Problem

A computation is required that can be performed as a set of quasi-independent operations on ordered data. Results cannot be constrained to a one-way flow among processing stages, but each component executes a relatively autonomous computation. Communications between components follow fixed and predictable paths. Consider, for example, a dynamics problem simulation: the data represents a model of a real system, where any change or modification in one region influences areas above and below it, and perhaps to a different extent, those on either side. Over time, the effects propagate to other areas, extending in all directions; even the source area may experience reverberations or other changes from neighbouring regions. If this simulation was executed serially, it would require that computations be performed across all the data to obtain some intermediate state, and then, a new iteration should begin. Generally, performance as execution time is the feature of interest.

Forces

Considering the problem description and granularity and load balance, as other elements of parallel design [Fos94, CT92], the following forces should be considered:

Solution

Parallelism is introduced as multiple participating concurrent components, each one applying the same set of operations on a data subset. Components communicate partial results by exchanging data, usually through communication channels. No data objects are directly shared among components; each one may access its own private data subset only. A component communicates by sending data objects from its local space to another. This communication may have different variants: synchronous or asynchronous, exchange of a single data object or a stream of data objects, and one to one, one to many, many to one or many or many communications. Often the data of the problem can be conceived in terms of an ordered logical structure. The solution is presented as a network that may reflect this logical structure in a transparent and natural form [CG88, Shaw95, Pan96].

Structure

In this pattern, the same operation is almost simultaneously applied to different pieces of data. However, operations in each element depend on the partial results of operations in other components. The structure of the solution involves an ordered logical structure, conceived from the data structure of the problem. Therefore, the solution is presented as a network of elements that in general follows the shape imposed by this structure. Identical components simultaneously exist and process during the execution time (Figure 7).

Figure 7. Communicating Sequential Elements.

Participants

Figure 8. Scenario of Communicating Sequential Elements.

Dynamics

A typical scenario to describe the basic run-time behaviour of this pattern is presented, where all the Sequential Elements are active at the same time. Every Sequential Element performs the same operations, as a piece of a processing network. In the most simple case, each one communicates only with a previous and next others (figure 8). The processing and communicating scenario is as follows:

Implementation

The implementation process is based on the four stages mentioned above in the Context in general and Implementation in general sections.

Consequences

Benefits

Liabilities

Known uses

Related patterns

The Communication Sequential Elements pattern is based on the original concept of Communicating Sequential Processes (CSP) [Hoare84]. Patterns that can be considered related to this processing approach are the Ubiquitous Agent Design Pattern [JP96] and the Visibility and Communication between Agents pattern [ABM96].


Contact Information

Jorge Luis Ortega Arjona.

E-mail jortega-arjona@acm.org