Parallel prefix computation.
Given an array a of n elements, compute
a0 º
a1
º
... º an-1 where º is
any operation that is commutative and associative.
Example: finding the minimum in an array.
Sequential algorithm
result = a[0];
for (i=1; i<
Data division:
// Shared memory mode
Divide_receive_data()
{
if (id == 0)
read size, limit;
Barrier();
}
// Distributed mode
Divide_the_data_among_processes(id)
{
if (id == 0)
read size, limit;
Broadcast(size, 0);
Broadcast(limit, 0);
}
Actual computation example for the problem of finding the minimum.
Computation()
{
Generate_array(array, size, limit);
my_min = Find_min(array, size);
}
Process_data(prev_min,
my_min)
{
if (recv_min < my_min)
my_min = recv_min;
}
Process_final_result(my_min)
{
Barrier(); // for shared memory models
output "The global minimum is ", my_min;
}
Linear communication. Each process receives the data from the next one (save for the last process). Then each process sends the data to the previous one (save for the first process). The process number 0 (master) processes the final data.
Master_Slave()
{
if (id != no_proc-1)
{
Recv(prev_data,
id+1);
Process_data(prev_data,
my_data);
}
if (id != 0)
Send(my_data,
id-1);
else
Process_final_result(my_data);
}
Ring communication. Each process receives the data from the next process, processes it, then sends it to the previous one. The last process receives the data from the first one (0). The first process sends the data first, then receives it, then processes the final result. The last process sends the data to the first one.
Master()
{
Send(my_data, no_proc-1);
Recv(prev_data, 1);
Process_data(prev_data,
my_data);
Process_final_result(my_data);
}
Slave()
{
if (id == no_proc-1)
Recv(prev_data, 0);
else
Recv(prev_data,
id+1);
// Recv(prev_data,
(id+1) % no_proc);
Process_data(prev_data,
my_data);
Send(my_data, id-1);
}
Hierarchical Broadcast
Master_Slave()
{
if (id > 0)
Recv(data,
low((id-1)/2));
child = 2*id+1;
if (child < no_proc)
Send(data, child);
if (child+1 < no_proc)
Send(data, child+1);
}
Hierarchical Data Collecting
Master_Slave()
{
child = 2*id+1;
if (child < no_proc) {
Recv(recv_data, child);
Process_data(recv_data,
my_data);
}
if (child+1 < no_proc) {
Recv(recv_data,
child+1);
Process_data(recv_data,
my_data);
}
if (id > 0)
Send(my_data,
low((id-1)/2));
else
Process_final_result(my_data);
}
Master_Slave()
{
if (id != no_proc-1)
Receive(prev_data,
id+1);
Process_data(prev_data,
my_data);
if (id != 0)
Send(my_data,
id-1);
else
Process_final_result(my_data);
}
Variation on the ring communication. Each process sends the data to the next one. Each process receives the data first, processes it, then sends it to the next. The first process sends the data first, then receives it from the last process, then processes the final result. The last process sends the data to the first one.
Master_Slave()
{
if (id != 0)
Receive(prev_data,
id-1);
Process_data(prev_data,
my_data);
if (id != no_proc-1)
Send(my_data,
id+1);
else
Send(my_data, 0);
if (id == 0)
Receive(prev_data,
no_proc-1);
Process_data(prev_data,
my_data);
Process_final_result(my_data);
}