Parallel Programming Short Course
Summer 2001, an informal lecture series
- instructor: Erik Deumens, deumens at qtp.ufl.edu
- schedule: two lectures per week, Tuesday and Thursday, 5th
period (11:45 am - 12:35 pm). Homework exercises are an
essential part of the course.
- location: NPB 1213
- requirements: familiarity with at least one of the following
programming languages: Fortran 77/90/95, C, or C++.
- notes: lecture notes provided in the form of
PowerPoint notes pages.
- e-mail list adv-prog at qtp.ufl.edu
Synopsis
This course teaches the practical programming details needed to write
programs that exploit multiple processors within a compute node using
the OpenMP and POSIX Threads standards, and that exploit
multiple compute nodes in a cluster using the MPI standard.
Syllabus
- MPI basics
message passing;
standard;
260 calls defined, only 10 calls really needed;
simple example program
- Threads basics
threads and processes;
POSIX standard;
60 calls, only 8 calls really needed;
simple example program
- Parallel computing
fine grained, coarse grained;
start from global view;
choice of parallel objects;
distribute actions and data;
analysis of matrix multiply
- MPI programming 1
Different forms of matrix multiply
- MPI programming 2
advanced MPI;
communication worlds, groups, collective operations
- OpenMP programming 1
concepts;
directives;
scope of variables
- OpenMP programming 2
internals;
scalable matrix multiply;
locks
- Threads programming 1
matrix multiply;
client/server threads
- Threads programming 2
advanced pthreads;
synchronization by initialization, locks and mutexes
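As an illustration of the OpenMP topics listed above (directives, scope of variables, matrix multiply), here is a minimal sketch in C. It is not taken from the lecture notes; the function name `matmul_omp` and the row-major storage layout are assumptions made for this example.

```c
/* Sketch: C = A * B for n x n matrices stored in row-major order.
 * The name matmul_omp and the storage layout are illustrative
 * assumptions, not part of the course material. */
void matmul_omp(int n, const double *a, const double *b, double *c)
{
    int i;

    /* One directive distributes the rows of C among the threads.
     * Scope of variables: n, a, b and c are shared by all threads;
     * the loop index i is private to each thread, and j, k and sum
     * are private because they are declared inside the loop body. */
    #pragma omp parallel for default(none) shared(n, a, b, c)
    for (i = 0; i < n; ++i) {
        int j, k;
        for (j = 0; j < n; ++j) {
            double sum = 0.0;
            for (k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

Each thread writes only its own rows of `c`, so no locks are needed here. Compiled without OpenMP support, the directive is ignored and the same code runs serially with the same result.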
Advanced Programming
Spring 2000 PHY 6905 Section 3615
- instructor: Erik Deumens, deumens at qtp.ufl.edu
- schedule: Two lectures per week plus homework
exercises in NPB
- Last taught Spring 2000 as PHY 6905 Section 3615
- requirements: familiarity with at least one of the following
programming languages: Fortran 77/90/95, C, or C++.
- grading: students will be graded on the homework
assignments which are in the form of programs to be written
and submitted.
- notes: lecture notes provided in the form of
PowerPoint notes pages.
- e-mail list adv-prog at qtp.ufl.edu
Synopsis
This course teaches the high-level design principles and
practical programming details needed to create
high-performance, scalable parallel programs that run well on
distributed memory machines such as the Cray T3E and IBM RS/6000
SP, on shared memory machines such as the Sun Enterprise, SGI
Origin, and IBM RS/6000 SMP series, and on clusters of computers
connected by a high-performance switching network.
Concepts to be explained and used in the course are object
oriented design, software engineering, multi-threading and
message passing. Most examples will use Fortran 95, but C and
C++ will be used as comparisons. The POSIX thread library and
the Message Passing Interface standard will be used.
Syllabus
Each lecture of about 1 hour (45 minutes plus questions) treats
one topic. Problems will be assigned for work to be completed
individually. Each lecture can stand on its own, but some topics
provide background essential for understanding other topics.
- Modern Processors
computer architectures;
CISC, RISC, EPIC;
vector, superscalar;
memory, RAM, cache;
virtual memory, disk;
switches, busses;
networks
- Modern Programming: Objects
object oriented analysis (OOA);
object oriented design (OOD);
object oriented programming (OOP);
characteristics: encapsulation, information hiding, message
passing, late binding, delegation, class/instance/object,
generalization/realization without polymorphism and with
polymorphism, relationships;
Unified Modeling Language (UML);
object based language (OBL): Fortran 95;
object oriented language (OOL): C++
- Fortran 95 features
module;
type;
interface;
pointers;
array operations
- Professional tools
editor: emacs;
builder: make;
version control: cvs;
debugger;
performance monitor
- Parallel computers
SMP, MPP;
SIMD, MIMD;
dataparallel;
clusters and switched networks;
transputers;
NUMA, COMA, messages;
threads
- RS/6000 SP and O2000
MPP + SMP + switch;
hardware;
software;
user environment;
example program
- MPI basics
message passing;
standard;
260 calls defined, only 10 calls really needed;
simple example program
- Threads basics
threads and processes;
POSIX standard;
60 calls, only 8 calls really needed;
simple example program
- Parallel computing
fine grained, coarse grained;
start from global view;
choice of parallel objects;
distribute actions and data;
analysis of matrix multiply
- Debugging
interactive debugging; dbx, xldb, idebug, TotalView, pedb
- MPI programming 1
Different forms of matrix multiply
- MPI programming 2
advanced MPI;
communication worlds, groups, collective operations
- OpenMP programming 1
concepts;
directives;
scope of variables
- OpenMP programming 2
internals;
scalable matrix multiply;
locks
- Threads programming 1
matrix multiply;
client/server threads
- Threads programming 2
advanced pthreads;
synchronization by initialization, locks and mutexes
- Monitoring and tuning
prof, gprof, xprofiler;
vt;
program marker array
- Production Runs
resource sharing, scheduling;
LoadLeveler;
job command keywords;
job classes, priorities;
nodes and CPUs;
time, data and stack limits;
example job
- Languages and libraries
overview of languages and libraries supporting parallel
programming
- Case Study: Crystal 98
A case study of parallel programming with MPI in a production
program used in Computational Chemistry.
- Case Study: Design with F95
A case study of designing a library with the new features of
Fortran 95 on QTIP, a library for computing integrals for
Computational Quantum Chemistry.
- Script interface
An introduction to extending a scripting language with
your software and embedding the script interpreter into
your software. Both Tcl and Python are discussed.
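To make the "distribute actions and data" item of the Parallel computing lecture above concrete, the following sketch in C computes which contiguous block of matrix rows each worker (an MPI process or a thread) would own. The type `row_range` and the function `block_range` are hypothetical names introduced for this example, and the block distribution is only one of several possible choices.

```c
/* Half-open range [first, last) of rows owned by one worker.
 * row_range and block_range are hypothetical names for this sketch. */
typedef struct {
    int first;
    int last;
} row_range;

/* Split n rows among nworkers as evenly as possible.
 * The first (n % nworkers) workers each receive one extra row.
 * rank is the worker index, 0 <= rank < nworkers. */
row_range block_range(int n, int nworkers, int rank)
{
    int base = n / nworkers;    /* rows every worker gets */
    int extra = n % nworkers;   /* leftover rows to hand out */
    row_range r;

    r.first = rank * base + (rank < extra ? rank : extra);
    r.last = r.first + base + (rank < extra ? 1 : 0);
    return r;
}
```

With 10 rows and 4 workers, for example, worker 0 owns rows 0 to 2 and worker 3 owns rows 8 and 9. In a matrix multiply, each worker would then compute only the rows of the result in its own range.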
Textbook
No single book covers the material; however, the combination of
the following works is a good start:
- High Performance Computing (2nd Edition), Kevin Dowd and
Charles Severance, O'Reilly and Associates, 1998.
- Using MPI: Portable Parallel Programming with the
Message-Passing Interface, William Gropp, Ewing Lusk, Anthony
Skjellum, MIT Press, 1997.
- Parallel Programming in OpenMP, Rohit Chandra, Leonardo Dagum,
Dave Kohr, Dror Maydan, Jeff McDonald, Ramesh Menon, Academic
Press, 2001.
Other important references are:
- Efficient C++: Performance Programming Techniques, Dov Bulka,
David Mayhew, Addison-Wesley, 2000.
- UML and C++: A Practical Guide to Object-Oriented
Development, Richard C. Lee and William M. Tepfenhart,
Prentice Hall, 2001.
- MPI - The Complete Reference, Volume 1: The MPI Core (2nd
edition), Marc Snir, Steve Otto, Steven Huss-Lederman, David
Walker, Jack Dongarra, MIT Press, 1998.
- MPI - The Complete Reference, Volume 2: The MPI Extensions,
William Gropp, Steven Huss-Lederman, Andrew Lumsdaine, Ewing
Lusk, Bill Nitzberg, William Saphir, Marc Snir, MIT Press,
1998.
- Multithreaded Programming with Pthreads, Bill Lewis and
Daniel J. Berg, Sun Microsystems Press (Prentice Hall),
1998.
- Pthreads Programming, Bradford Nichols, Dick Buttlar and
Jacqueline Proulx Farrell, O'Reilly and Associates, 1996.
Useful references are:
- Computing for Scientists: Principles of Programming with
Fortran 90 and C++, R. J. Barlow and A. R. Barnett, John Wiley
and Sons, 1998.
- Fortran 95 Handbook: Complete ISO/ANSI Reference,
J. C. Adams, W. S. Brainerd, J. T. Martin, B. T. Smith,
J. L. Wagener, MIT Press, 1997.
- The C++ Programming Language (3rd edition), Bjarne
Stroustrup, Addison-Wesley, 1997.
- Modern C++ Design: Generic Programming and Design Patterns
Applied, Andrei Alexandrescu, Addison-Wesley, 2001.
- Using the STL: The C++ Standard Template Library, Robert
Robson, Springer, 1997.
- Parallel Programming Using C++, Gregory V. Wilson and Paul Lu
(editors), MIT Press, 1996.
- Multithreaded Programming with Windows NT, Thuan Q. Pham
and Pankaj K. Garg, Prentice Hall, 1995.
- Practical Parallel Programming analysed, Gregory V. Wilson,
MIT Press, 1995.
- Designing and Building Parallel Programs: Concepts and Tools
for Parallel Software Engineering, Ian Foster, Addison-Wesley,
1995.
Format
The course can be taught as a regular class or as a condensed seminar
with the sequence of 19 lectures (numbers as above) and 11
practice sessions or homework exercise sessions.
- One semester of 15 weeks at 2 hours per week, or
- One week of 5 days at 6 hours per day.
|       | Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
| 9:00  | 1     | 5     | 9     | 13    | 17    |
| 10:00 | 2     | 6     | 10    | 14    | 18    |
| 11:00 | *     | *     | *     | *     | *     |
| 14:00 | 3     | 7     | 11    | 15    | 19    |
| 15:00 | 4     | 8     | 12    | 16    | 20/21 |
| 16:00 | *     | *     | *     | *     | *     |
(c) Erik Deumens Last modified: 14 Aug 2001