Dana Vrajitoru
B424 Parallel and Distributed Programming
Shared Memory Models
Programs and libraries designed for computers with several CPUs that have
access to the same memory - multiprocessor machines.
The RAM is shared, while each processor has its own cache.
Such programs are called multi-threaded (they contain multiple execution threads).
Two types of machines
- Uniform memory access (UMA) for a small number of CPUs (2 to about 30),
for which the interconnection network is a memory bus or a crossbar switch.
The access time to any memory location is about the same for each processor.
- Non-uniform memory access (NUMA) for a large number of CPUs (tens or
hundreds), for which the interconnection network is a tree-structured
collection of switches and memories. For each processor, some parts of the
memory are closer (faster access) and others are farther (slower access).
Specific problems
- Cache consistency problem - when a processor modifies the value of a
variable in its own cache, the caches of the other CPUs no longer hold the
correct value.
- Memory consistency problem - when is the primary memory actually updated?
- Sequential consistency - memory updates occur in a sequential order and
every processor sees them in the same order.
- False sharing - suppose that processor 1 uses variable x, processor 2
uses variable y, and the two variables fall on the same cache line.
Whenever one processor modifies its variable, the cache line of the other
processor is invalidated, even though the two share no data.
- These problems are often taken care of by the libraries.
Applications for multi-threaded programs
-
Interfaces where some widgets have to stay active even while the program
is performing some computation (media player).
-
Agent applications - artificial intelligence, games.
Background_play() {
  while (data) {
    Play_data(data);
    if (stop_signal) break;
    Get_data(data);
  }
}

class Stop_button {
  ...
  onClick: Generate(stop_signal);
  ...
};
Variables that are shared
- Global variables among threads (processes) generated in the same module
or declared as extern.
- Static variables in functions called by several threads.
- Reference parameters.
- A pointer containing the same address in several threads / processes
refers to the same physical memory.
Variables that are not shared
- Local auto variables of any function.
- Value parameters.
Threads Programming
A thread is an independent stream of instructions that can be
scheduled to run as such by the operating system.
In the Unix/Linux environments a thread:
- Exists within a process and uses the process resources.
- Has its own independent flow of control as long as its parent
process exists and the OS supports it.
- May share the process resources with other threads that act
equally independently (and dependently).
- Dies if the parent process dies (or is otherwise terminated).
- Creating a new thread within a process requires a lot less time
and resources than creating a new process.
Resources of a process (shared by all of its threads):
Process ID, process group ID, user ID, and group ID
Environment
Working directory
Program instructions
Registers
Stack
Heap
File descriptors
Signal actions
Shared libraries
Inter-process communication tools (such as message queues, pipes,
semaphores, or shared memory)

Resources private to each thread:
Stack pointer
Registers
Scheduling properties (such as policy or priority)
Set of pending and blocked signals
Thread-specific data
Types of Threads
- Kernel-level threads: the kernel is responsible for scheduling them.
They can take advantage of multiple CPUs.
- User-level threads: live without any support from the kernel; they
maintain all of their state in user space. Since the kernel does not know
about them, they cannot be scheduled to run on multiple processors in
parallel.
Protection Boundary
A protection boundary protects one software subsystem on a
computer from another, in such a way that only data that is explicitly
shared across such a boundary is accessible to the entities on both
sides.
All code within a protection boundary will have access to all data
within that boundary.
Examples:
- between processes and the kernel. The kernel is protected from
processes; they can only examine or change its internal state in
strictly defined ways.
- between individual processes. This prevents one buggy or malicious
process from wreaking havoc on others.
Transferring control across protection boundaries is expensive.
Critical Section Problem
Critical section (region): Part of the code and/or memory that must
not be accessed in read or write mode by more than one process at a time.
process Critical_section() {
  while (true) {
    entry protocol;
    critical section;
    exit protocol;
    noncritical section;
  }
}
The entry and exit protocols for the critical region must satisfy the
following requirements:
- Mutual exclusion - at most one process at a time is executing the CS.
- Absence of deadlock - if more than one process is trying to enter the CS,
at least one will succeed.
- Absence of delay - if only one process is trying to access the CS, it is
not prevented from entering it.
- Entry guarantee - a process that is attempting to enter the CS will
eventually succeed, provided that no process stays in the CS forever.
Atomic operation: a sequence of one or more instructions that
appears to be executed as an indivisible action. Example: a statement that
translates into a single machine language instruction.
For the protocol below to be correct, the initialization, test, increment,
and decrement operations must be atomic.
Example of entry and exit protocols. The while loop in the entry
protocol is also known as a Busy Waiting procedure.
short control = 0;   /* 0: CS free; -1: CS occupied */

Entry() {
  while (control < 0);   /* busy wait until the CS is free */
  control--;             /* mark the CS as occupied */
}

Exit() {
  control++;             /* mark the CS as free again */
}