Rationale for Algorithms Topics

Parallel/Distributed Models and Complexity: It is essential to provide students with a firm background in the conceptual underpinnings of parallel and distributed computing (PDC). Not only is parallelism becoming pervasive in computing, but technology in this dynamic field is also changing at a rapid pace. Students whose education is tied too closely to existing, hence obsolescent, technology will be at a disadvantage.

Paradigm shifts in the computing field are not new. To recapitulate just a few instances from the Introduction: (1) The VLSI revolution of the late 1970s and early 1980s allowed the development of computers with unheard-of levels of parallelism. New problems related to interprocessor communication arose, exposing an aspect of parallel computing that could safely be ignored when levels of parallelism were very modest. (2) As clusters of modest processors (or workstations) joined “multiprocessors-in-a-box” as parallel computing platforms in the early 1990s, two sea changes occurred: (a) computing platforms became heterogeneous for the first time, and (b) communication delays became impossible to mask via clever latency-hiding strategies. (3) As computational grids became “parallel” computing platforms around the turn of the millennium, the distinction between parallel and distributed computing became fuzzier than ever before.

Fortunately, in spite of the “leaps and bounds” evolution of parallel computing technology, there exists a core of fundamental algorithmic principles. Many of these principles are largely independent of the details of the underlying platform architecture, and they provide the basis for developing applications on current and foreseeable future parallel platforms. Students should be taught how to identify and synthesize fundamental ideas and generally applicable algorithmic principles out of the mass of parallel algorithm expertise and practical implementations developed over the last few decades.

What, however, should be taught under the rubric “conceptual underpinnings of parallel and distributed computing”? Our choices reflect a combination of “persistent truths” and “conceptual flexibility.” In the former camp, one strives to impart: (1) an understanding of how one reasons rigorously about the expenditure of computational resources; (2) an appreciation of fundamental computational limitations that transcend the details of particular models. In the latter camp, one needs to expose the student to a variety of “situation-specific” models, in the hope of endowing the student with the ability to generate new models in response to new technologies.
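As a concrete illustration of the first camp, consider the standard work-span analysis of a parallel mergesort whose merge step runs sequentially. The recurrences and Brent's bound below are a textbook sketch of the kind of rigorous resource reasoning intended here, not a mandated part of the topic list.

```latex
% Work-span analysis of parallel mergesort with a sequential merge.
% W(n) = total operations (work); S(n) = critical-path length (span).
\[
  W(n) = 2\,W(n/2) + \Theta(n) = \Theta(n \log n),
  \qquad
  S(n) = S(n/2) + \Theta(n) = \Theta(n).
\]
% Brent's bound brackets the running time T_p(n) on p processors:
\[
  \max\!\left( \frac{W(n)}{p},\, S(n) \right)
  \;\le\; T_p(n) \;\le\;
  \frac{W(n)}{p} + S(n).
\]
```

Since speedup is limited by W(n)/S(n) = Θ(log n), the linear span of the sequential merge caps the usable parallelism no matter what the machine looks like; this is precisely the sort of fundamental limitation, independent of platform details, that item (2) asks students to appreciate.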

Algorithmic Paradigms: This section acknowledges the folk wisdom that contrasts giving a person a fish with teaching that person how to fish. Algorithmic paradigms lie in the latter camp. We have attempted to enumerate a variety of paradigms whose algorithmic utility has been demonstrated over the course of decades. In some sense, one can view these paradigms as algorithmic analogues of high-level programming languages. The paradigms in our list can be viewed as generating control structures that are readily ported onto a broad range of parallel and distributed computing platforms. Included here are: the well-known divide-and-conquer paradigm that is as useful in the world of sequential computing as in the world of parallel and distributed computing (where it is the basis for expansive-reductive, or “MapReduce,” computations); the series-parallel paradigm that one encounters, e.g., in multi-threaded computing environments; the scan, or parallel-prefix, operator, which simplifies the specification of algorithms for myriad computational problems; and many others.
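To make the scan paradigm concrete, here is a minimal Python sketch of the classic two-phase (up-sweep/down-sweep) exclusive scan, simulated sequentially; the inner loops stand in for parallel steps, and the function name and power-of-two restriction are simplifications of ours, not part of the topic list.

```python
# A minimal sketch of the two-phase (Blelloch-style) exclusive scan.
# Every iteration of each inner loop is independent, so each level can
# run in parallel: O(n) work in O(log n) levels for n elements.

def exclusive_scan(values, op=lambda a, b: a + b, identity=0):
    """Exclusive prefix scan of `values` under an associative `op`."""
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes power-of-two length"
    tree = list(values)

    # Up-sweep: combine partial results at doubling strides.
    stride = 1
    while stride < n:
        for i in range(2 * stride - 1, n, 2 * stride):  # parallel step
            tree[i] = op(tree[i - stride], tree[i])
        stride *= 2

    # Down-sweep: push prefixes back down the implicit tree.
    tree[n - 1] = identity
    stride = n // 2
    while stride >= 1:
        for i in range(2 * stride - 1, n, 2 * stride):  # parallel step
            left = tree[i - stride]
            tree[i - stride] = tree[i]
            tree[i] = op(left, tree[i])
        stride //= 2
    return tree

print(exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# [0, 3, 4, 11, 11, 15, 16, 22]
```

Because the scan is parameterized by an arbitrary associative operator, the same control structure serves problems as different as prefix sums, carry propagation, and polynomial evaluation, which is exactly the portability across problems and platforms claimed above.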

Algorithmic problems: Two genres of specific algorithmic problems play such important roles in a variety of computational situations that we view them as essential to an education in parallel and distributed computing. (1) Several of our entries are auxiliary computations that are useful in a variety of settings. Collective communication primitives are fundamental to the distribution of data and the collection of results in myriad applications. Certain basic functions perform important control tasks, especially as parallel computing takes on many of the characteristics of distributed computing. The process of leader election endows processors with computationally meaningful names that are useful for initiating and coordinating activities at possibly remote sites. The essential process of termination detection is easy when parallelism is modest and processors are physically proximate, but it is a significant challenge in more general environments. (2) Several of our entries are independent computations that usually occur as sub-procedures of a broad range of large-scale computations. Sorting and selection are always near the top of everyone’s list of basic computations, and algorithms that operate on graphs and matrices occur in almost every application area.
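To give one concrete instance of these control tasks, the following Python sketch simulates leader election on a unidirectional ring in synchronous rounds, in the spirit of the classic Le Lann-Chang-Roberts algorithm: every process forwards the largest identifier it has seen, and the process whose own identifier completes a full circuit declares itself leader. The function name and lockstep simulation are illustrative conveniences of ours.

```python
# Leader election on a unidirectional ring (synchronous simulation).
# Each process repeatedly sends the largest id it has seen to its
# clockwise neighbor; only the globally maximal id can survive a full
# trip around the ring, so its owner is elected within n rounds.

def elect_leader(ids):
    """Return the elected leader of a ring of processes with unique ids."""
    n = len(ids)
    seen = list(ids)  # largest id each process has heard so far
    while True:
        # In lockstep, process j receives what process j-1 just sent.
        received = [seen[(j - 1) % n] for j in range(n)]
        for j in range(n):
            if received[j] == ids[j]:  # own id went full circle: elected
                return ids[j]
            seen[j] = max(seen[j], received[j])

print(elect_leader([4, 9, 2, 7, 5]))  # 9
```

Note that the loop’s exit condition is itself a tiny instance of termination detection: in this synchronous, failure-free setting it is trivial, whereas in asynchronous or failure-prone environments detecting that a distributed computation has finished becomes, as noted above, a significant challenge.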