Appendix III: Sample Elective Course: Introduction to Parallel and Distributed Computing

 

This single elective course is designed to cover most of the proposed core topics on parallel and distributed computing. The preferred adoption model, however, is integration of the proposed topics into core-level courses; sample integrations will be collected and posted at the curriculum site.

3 semester credits or 4 quarter credits.
Total number of hours of in-class instruction = 45.

Prerequisites: Introductory courses in Computing (CS1 and CS2)

Syllabus:

  • High-level themes: Why and what is parallel/distributed computing? History, Power, Parallel vs. Distributed, Fault tolerance, Concurrency, non-determinism, locality (2 hours)
  • Crosscutting and Broader topics: power, locality; cluster, grid, cloud, p2p, web services (2 hours)
  • Architectures (4.5 hours total)
    • Classes (3 hours)
      • Taxonomy
      • Data versus control parallelism: SIMD/Vector, Pipelines, MIMD, Multi-core, Heterogeneous
      • Shared versus distributed memory: SMP (buses), NUMA (Shared Memory), Message passing (no shared memory): Topologies
    • Memory hierarchy, caches (1 hour)
    • Power Issues (1/2 hour)
  • Algorithms (17.5 hours total)
    • Parallel and distributed models and complexity (6.5 hours)
      • Cost of computation and Scalability: Asymptotics, time, cost, work, cost optimality, speedup, efficiency, space, power - (4 hours)
      • Model-based notions: PRAM model, BSP - (1 hour)
      • Notions from scheduling: Dependencies, task graphs, work, makespan – (1.5 hours)
    • Algorithmic Paradigms (3 hours)
      • Divide and conquer, Recursion
      • Series-parallel composition
    • Algorithmic Problems – (8 hours)
      • Communication: broadcast, multicast, reduction, parallel prefix, scatter/gather (2 hours)
      • Synchronization: atomic operations, mutual exclusion, barrier synchronization; race condition (1 hour)
      • Specialized computations: representative sample from among matrix product, transposition, convolution, and linear systems (3 hours)
      • Sorting, selection (2 hours)
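
As one illustration of the communication primitives above, the parallel-prefix (scan) operation can be sketched as a minimal sequential simulation of the logarithmic-depth Hillis–Steele algorithm; the function name and structure here are illustrative classroom material, not part of the syllabus:

```python
def inclusive_scan(xs, op=lambda a, b: a + b):
    """Hillis-Steele inclusive scan: O(log n) parallel steps, O(n log n) work.
    Each pass simulates one step that all processors would execute at once."""
    result = list(xs)
    step = 1
    while step < len(result):
        # In a real parallel implementation this comprehension runs concurrently:
        # element i combines with the element 'step' positions to its left.
        result = [result[i] if i < step else op(result[i - step], result[i])
                  for i in range(len(result))]
        step *= 2
    return result

print(inclusive_scan([1, 2, 3, 4, 5]))  # [1, 3, 6, 10, 15]
```

A work-optimal O(n)-work scan also exists (Blelloch's two-phase algorithm); the version above trades extra work for simplicity, which makes the cost/work-optimality distinction from the complexity unit concrete.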
  • Programming (19 hours total)
    • Parallel Programming paradigms – (3 hours)
      • By the target machine model: Shared memory, Distributed Memory, Client-Server, Hybrid - (1.5 hours)
      • By the control statements: Task/thread spawning, SPMD, Data parallel, Parallel loop – (1.5 hours)
    • Parallel programming notations – (8 hours: 6 + 2)
      • Shared memory notations: language extensions, compiler directives/pragma, libraries
      • SPMD notations: MPI, CUDA, etc.
    • Semantics and correctness issues (4 hours)
      • Synchronization: shared memory programs with critical regions, producer-consumer; mechanisms for concurrency (monitors, semaphores, etc.)
      • Concurrency defects: deadlock (detection, prevention), race conditions (definition), determinacy/indeterminacy in parallel programs
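
The synchronization and race-condition topics above can be demonstrated with a short threaded counter; the function below is a hypothetical classroom sketch (names are illustrative). Note that CPython's global interpreter lock often masks the race for such simple operations, so the unlocked variant may still return the correct count in practice:

```python
import threading

def parallel_count(n_threads, n_incr, use_lock):
    """Increment a shared counter from several threads.
    The read-modify-write in '+=' is not atomic, so without the lock
    two threads can interleave and lose updates (a race condition)."""
    counter = {'value': 0}
    lock = threading.Lock()

    def worker():
        for _ in range(n_incr):
            if use_lock:
                with lock:  # critical region: mutual exclusion
                    counter['value'] += 1
            else:
                counter['value'] += 1  # unprotected read-modify-write

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter['value']

print(parallel_count(4, 10_000, use_lock=True))  # always 40000
```

The locked version is deterministic; the unlocked version is where students can observe (or fail to observe) indeterminacy, which motivates the determinacy discussion above.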
    • Performance issues (3 hours)
      • Computation: static and dynamic scheduling, mapping and impact of load balancing on performance
      • Data: Distribution, Layout, and Locality, False sharing, Data transfer (1 hour)
      • Performance metrics: speedup, efficiency, work, cost; Amdahl's law; scalability
    • Tools (1 hour)
      • Debuggers (1/2 hour)
      • Performance monitoring (1/2 hour)
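
The performance metrics listed under Programming can be made concrete with Amdahl's law, S(p) = 1 / (f + (1 - f)/p), where f is the inherently serial fraction of the work and p the processor count. The helper below is an illustrative sketch, not prescribed course code:

```python
def amdahl_speedup(f, p):
    """Amdahl's law: speedup on p processors when fraction f of the
    work is inherently serial. As p grows, speedup is bounded by 1/f."""
    return 1.0 / (f + (1.0 - f) / p)

# With 5% serial work, 16 processors give only ~9.1x speedup,
# and no processor count can push past 20x (= 1/0.05).
print(round(amdahl_speedup(0.05, 16), 2))
```

Dividing the result by p gives efficiency, tying the speedup, efficiency, and scalability terms in the syllabus to one formula.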