pthread_mutex_lock(&mutex);
operation(shared_variable);
pthread_mutex_unlock(&mutex);
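A minimal complete version of this pattern (a sketch; the variable and thread names are only illustrative): two threads increment a shared counter, and the mutex guarantees the final value.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static long shared_variable = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&mutex);
        shared_variable++;              /* the protected operation */
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%ld\n", shared_variable);   /* always prints 200000 */
    return 0;
}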
An operation to be executed only once a particular condition becomes true, where that condition may be set by another thread:
Thread1() {
    pthread_mutex_lock(&mutex);   // blocks here until Thread2 unlocks the mutex
    operation();
}

Thread2() {
    if (condition)
        pthread_mutex_unlock(&mutex);
}
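Note that POSIX requires the thread that locked a mutex to be the one that unlocks it (with the default mutex type, unlocking from another thread is undefined behavior), so a portable version of this wait-for-a-condition pattern normally uses a condition variable. A minimal sketch (not from the notes), assuming operation() and the flag are defined elsewhere:

#include <pthread.h>
#include <stdbool.h>

void operation(void);                  /* assumed to exist, as above */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static bool condition = false;

void *Thread1(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mutex);
    while (!condition)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&cond, &mutex);
    operation();                       /* runs only once the condition holds */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

void *Thread2(void *arg) {
    (void)arg;
    pthread_mutex_lock(&mutex);
    condition = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);
    return NULL;
}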
Hungry Bird Problem
int refill; // global

// pmutex and rmutex act as signals between Parent and Baby.
Parent() {
    while (refill > 0) {
        lock(&pmutex);          // blocks until the baby reports the dish is empty
        Refill();
        refill--;
        unlock(&rmutex);        // let the baby fetch again
    }
}

Baby() {
    while (refill || !Data_is_empty()) {
        lock(&rmutex);
        food = Fetch();
        if (!food)
            unlock(&pmutex);    // dish is empty: wake the parent
        else {
            Feed(food);
            unlock(&rmutex);
        }
    }
}
Note: in the code above, refill and Data_is_empty() are global shared resources and should be protected in an appropriate way.
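One possible way to apply that note (a sketch, not the course's solution): guard every access to refill behind a small helper API protected by its own mutex, and give the data inspected by Data_is_empty() the same treatment.

#include <pthread.h>

static pthread_mutex_t refill_mutex = PTHREAD_MUTEX_INITIALIZER;
static int refill;                       /* shared between Parent and Baby */

static int refills_left(void) {          /* read refill under the lock */
    pthread_mutex_lock(&refill_mutex);
    int r = refill;
    pthread_mutex_unlock(&refill_mutex);
    return r;
}

static void refill_decrement(void) {     /* write refill under the lock */
    pthread_mutex_lock(&refill_mutex);
    refill--;
    pthread_mutex_unlock(&refill_mutex);
}

Parent() and Baby() would then call refills_left() and refill_decrement() instead of touching refill directly.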
Global Loop Condition
int go_on = 1; // global

Thread() {
    ...
    while (go_on == 1)
        do_something(); // might change the value of go_on
    ...
}
Solution: keep a local variable that you update from the global one and test the local variable in the loop.
Thread() {
    int local_go_on = 1;
    Global_to_local(&local_go_on);
    while (local_go_on) {
        do_something(); // do_something() still changes the value of the global go_on
        Global_to_local(&local_go_on);
    }
}

Global_to_local(int *local_go_on) {
    lock(&mutex);
    *local_go_on = go_on;
    unlock(&mutex);
}
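The notes only mention in a comment that something else changes the global flag; a sketch of that "stop" side, in the same pseudocode style (lock()/unlock() standing for the pthread mutex calls on the same mutex), could look like:

Stop_all() {
    lock(&mutex);     // same mutex that Global_to_local() uses
    go_on = 0;
    unlock(&mutex);
}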
Set Identity
Assigning each thread an identity from 0 to number of threads - 1 without receiving it from the parent thread.
int global_id = 0;

Thread() {
    int id;
    lock(&mutex);
    id = global_id;
    global_id++;
    unlock(&mutex);
    ...
}
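A complete, runnable sketch of the same idea (the thread count and names are illustrative): each thread takes the next value of global_id under the mutex, so the ids 0..NTHREADS-1 are handed out exactly once each.

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int global_id = 0;

static void *worker(void *arg) {
    (void)arg;
    int id;
    pthread_mutex_lock(&mutex);
    id = global_id;
    global_id++;
    pthread_mutex_unlock(&mutex);
    printf("I am thread %d\n", id);    /* order of the lines may vary */
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}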
Multithreading makes programs non-linear: we don't know which statement will execute next in time.
It explodes the state space of a program combinatorially. The number of possible states in the system is much higher, so it is more difficult to test them all.
It makes the program non-deterministic, which adds to the testing problems.
Poor synchronization strategies can seriously slow down a program.
Code Locking: locks (mutexes) are associated with the algorithm rather than with data objects. The result stays close to the serial code, but the speedup is more modest.
Partitioning: locking instances of data structures. Programs that have many instances of independent data structures can achieve better speedup.
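A hypothetical sketch contrasting the two strategies above (struct list and list_insert() are placeholders, not a real API): with code locking, one global mutex serializes every list; with partitioning, each list instance carries its own lock, so operations on different lists proceed in parallel.

#include <pthread.h>

struct list;                                   /* placeholder data structure */
void list_insert(struct list *l, int value);   /* placeholder operation */

/* Code locking: one lock tied to the algorithm covers every instance. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

void insert_code_locked(struct list *l, int value) {
    pthread_mutex_lock(&list_lock);    /* serializes ALL lists */
    list_insert(l, value);
    pthread_mutex_unlock(&list_lock);
}

/* Partitioning: each instance carries its own lock. */
struct locked_list {
    pthread_mutex_t lock;
    struct list *l;
};

void insert_partitioned(struct locked_list *ll, int value) {
    pthread_mutex_lock(&ll->lock);     /* serializes only this instance */
    list_insert(ll->l, value);
    pthread_mutex_unlock(&ll->lock);
}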
Data Ownership: independent instances of data structures can be assigned to particular CPUs, so that each CPU or process accesses or modifies only its own data. This eliminates the need for locking.
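A minimal sketch of data ownership (names and sizes are illustrative): each worker thread writes only to its own slot of a results array, so no locking is needed while the workers run; the only synchronization is pthread_join().

#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4

static long partial_sum[NTHREADS];     /* slot i is owned by thread i */

struct work { int id; const int *data; size_t lo, hi; };

static void *worker(void *p) {
    struct work *w = p;
    long s = 0;
    for (size_t i = w->lo; i < w->hi; i++)
        s += w->data[i];
    partial_sum[w->id] = s;            /* only this thread touches this slot */
    return NULL;
}

/* After pthread_join()-ing all workers, the main thread adds up
 * partial_sum[0..NTHREADS-1] without taking any lock. */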
Giant Locks: consider the serial program to already be a parallel program that uses a single global lock at the start of execution, and unlocks it just before terminating. The idea is to move toward finer-grained locking from there.
Coarse Grain Locking: lock large portions of code (entire classes or functions) in one big chunk to guarantee that only one path of execution at a time interacts with a complex part of the system.
Fine Grain Locking: lock small regions or individual data items; this can more easily lead to deadlocks, which are hard to trace, as sketched below.
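A classic way fine-grained locking deadlocks (a sketch): two threads take the same two locks in opposite orders. The usual fix is to impose a single global acquisition order on all locks.

#include <pthread.h>

static pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

void *thread1(void *arg) {
    (void)arg;
    pthread_mutex_lock(&a);
    pthread_mutex_lock(&b);    /* blocks forever if thread2 already holds b */
    /* ... work on both structures ... */
    pthread_mutex_unlock(&b);
    pthread_mutex_unlock(&a);
    return NULL;
}

void *thread2(void *arg) {
    (void)arg;
    pthread_mutex_lock(&b);
    pthread_mutex_lock(&a);    /* blocks forever if thread1 already holds a */
    /* ... */
    pthread_mutex_unlock(&a);
    pthread_mutex_unlock(&b);
    return NULL;
}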
Critical Section Fusing: when the synchronization overhead is large compared to the work inside the critical sections, it may be beneficial to fuse adjacent critical sections to reduce the number of synchronization primitives that must be executed.
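A small sketch of fusing (the variables are illustrative): two adjacent critical sections over the same mutex become one, halving the number of lock/unlock pairs executed.

/* Before fusing: two lock/unlock pairs. */
pthread_mutex_lock(&mutex);
a = a + 1;
pthread_mutex_unlock(&mutex);
pthread_mutex_lock(&mutex);
b = b + 1;
pthread_mutex_unlock(&mutex);

/* After fusing: one slightly longer critical section. */
pthread_mutex_lock(&mutex);
a = a + 1;
b = b + 1;
pthread_mutex_unlock(&mutex);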
Delegation: have subsystems or classes (such as monitors) be in charge of the synchronization.
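A minimal sketch of delegation in C (a monitor-like counter; the API is hypothetical): callers never touch the mutex, because the counter's own functions do all the locking.

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;   /* private to the counter implementation */
    long value;
} counter_t;

void counter_init(counter_t *c) {
    pthread_mutex_init(&c->lock, NULL);
    c->value = 0;
}

void counter_add(counter_t *c, long n) {
    pthread_mutex_lock(&c->lock);
    c->value += n;
    pthread_mutex_unlock(&c->lock);
}

long counter_get(counter_t *c) {
    pthread_mutex_lock(&c->lock);
    long v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

Any thread can call counter_add() or counter_get() without knowing that a lock exists, which is the essence of delegating synchronization to the subsystem.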