Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 167aef1, commit c658a06
Showing 8 changed files with 431 additions and 11 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
--- | ||
layout: default | ||
title: Parallel patterns | ||
parent: Data | ||
has_children: true | ||
permalink: /Data/Patterns | ||
--- | ||
|
||
# Parallel Patterns |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
--- | ||
layout: default | ||
title: Do-All | ||
has_children: true | ||
parent: Parallel patterns | ||
grand_parent: Data | ||
nav_order: 1 | ||
--- | ||
|
||
|
||
# Do-All Loop | ||
|
||
## Reporting | ||
Do-All Loops are reported in the following format: | ||
``` | ||
Do-all at: 1:2 | ||
Start line: 1:7 | ||
End line: 1:9 | ||
pragma: "#pragma omp parallel for" | ||
private: [] | ||
shared: [] | ||
first private: [] | ||
reduction: [] | ||
last private: [] | ||
``` | ||
|
||
## Interpretation | ||
The reported values shall be interpreted as follows: | ||
* `Do-all at: <file_id>:<cu_id>`, where `file_id` identifies the parent file in `FileMapping.txt` and `cu_id` can be used for a lookup in `Data.xml`. | ||
* `Start line: <file_id>:<line_num>`, where `line_num` refers to the first source code line of the parallelizable loop. | ||
* `End line: <file_id>:<line_num>`, where `line_num` refers to the last line of the parallelizable loop. | ||
<!-- | ||
Note: Disabled, since these values are not determined correctly at the moment. Values will be added to the result once their implementations are fixed. | ||
* `iterations: <num>` specifies the counted amount of iterations the loop has executed during the profiling. | ||
* `instructions: <num>` specifies the summed number of instructions executed within one iteration of the loop body | ||
* `TODO: workload: <num>` provides an arbitrary value which represents the computational weight of one iteration of the loop. | ||
--> | ||
* `pragma:` shows which type of OpenMP pragma should be inserted before the target loop in order to parallelize it. | ||
* `private: [<vars>]` lists the variables which have been identified as thread-`private`. | ||
* The same interpretation applies to the following values as well: | ||
* `shared` | ||
* `first_private` | ||
* `last_private` | ||
* `reduction: [<operation>:<var>]` specifies a set of identified reduction operations and variables. For `Do-All` suggestions, this list is always empty. | ||
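
The location fields above all share the `<file_id>:<number>` shape. As a minimal sketch of how such a value could be split by a consumer of the report, the following C helper parses one location string. The `Location` type and the `parse_location` function are hypothetical names introduced for illustration and are not part of the tool.

```c
#include <assert.h>
#include <stdio.h>

/* A location as it appears in a report, e.g. the "1:7" in "Start line: 1:7". */
typedef struct {
    int file_id; /* key for the lookup in FileMapping.txt */
    int number;  /* cu_id or line_num, depending on the field */
} Location;

/* Parses "<file_id>:<number>"; returns 1 on success, 0 on malformed input. */
static int parse_location(const char *text, Location *out) {
    assert(text != NULL && out != NULL);
    char trailing;
    /* Exactly two integers separated by ':' and nothing after them. */
    return sscanf(text, "%d:%d%c", &out->file_id, &out->number, &trailing) == 2;
}
```

Calling `parse_location("1:7", &loc)` fills `loc.file_id` with 1 and `loc.number` with 7, while input such as `"1:"` is rejected.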
|
||
## Implementation | ||
In order to implement a suggestion, first open the source code file corresponding to `file_id` and navigate to line `Start line -> <line_num>`. | ||
Insert `pragma` before the loop begins. | ||
In order to ensure a valid parallelization, you need to add the following clauses to the OpenMP pragma, if the respective lists are not empty: | ||
* `private` -> clause: `private(<vars>)` | ||
* `shared` -> clause: `shared(<vars>)` | ||
* `first_private` -> clause: `firstprivate(<vars>)` | ||
* `last_private` -> clause: `lastprivate(<vars>)` | ||
* `reduction` -> clause: `reduction(<operation>:<vars>)` | ||
|
||
### Example | ||
As an example, we will analyze the following code snippet for parallelization potential. All location and meta data will be ignored for the sake of simplicity. | ||
|
||
for (int i = 0; i < 10; ++i) { | ||
local_array[i] += 1; | ||
} | ||
|
||
Analyzing this code snippet results in the following parallelization suggestion: | ||
|
||
pragma: "#pragma omp parallel for" | ||
private: ["i"] | ||
shared: ["local_array"] | ||
first private: [] | ||
reduction: [] | ||
last private: [] | ||
|
||
|
||
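After interpreting and implementing the suggestion, the resulting, now parallel, source code could look as follows. Since a variable has to be declared before it can be named in a data-sharing clause, the declaration of `i` is moved in front of the pragma: | ||
|
||
int i; | ||
#pragma omp parallel for private(i) shared(local_array) | ||
for (i = 0; i < 10; ++i) { | ||
local_array[i] += 1; | ||
} |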
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
--- | ||
layout: default | ||
title: Geometric Decomposition | ||
has_children: true | ||
parent: Parallel patterns | ||
grand_parent: Data | ||
nav_order: 2 | ||
--- | ||
|
||
|
||
# Geometric Decomposition | ||
|
||
## Reporting | ||
Possible geometric decompositions are reported in the following format: | ||
``` | ||
Geometric decomposition at: 1:9 | ||
Start line: 1:26 | ||
End line: 1:36 | ||
Do-All loops: ['1:11'] | ||
Reduction loops: [] | ||
Number of tasks: 24 | ||
Chunk limits: 1000 | ||
pragma: for (i = 0; i < num-tasks; i++) #pragma omp task] | ||
private: [] | ||
shared: [] | ||
first private: ['i'] | ||
reduction: [] | ||
last private: [] | ||
``` | ||
|
||
## Interpretation | ||
The reported values shall be interpreted as follows: | ||
* `Geometric decomposition at: <file_id>:<cu_id>`, where `file_id` identifies the parent file in `FileMapping.txt` and `cu_id` can be used for a lookup in `Data.xml`. | ||
* `Start line: <file_id>:<line_num>`, where `line_num` refers to the first source code line of the potential geometrically decomposable code. | ||
* `End line: <file_id>:<line_num>`, where `line_num` refers to the last line of the suggested pattern. | ||
* `Do-All loops: [<file_id>:<cu_id>]` specifies which [Do-all loops](Do-All.md) can be part of the geometric decomposition. | ||
* `Reduction loops: [<file_id>:<cu_id>]` specifies which [Reduction loops](Reduction.md) can be part of the geometric decomposition. | ||
* `Number of tasks: <int>` specifies the number of tasks which should or can be spawned in order to process the geometric decomposition. | ||
* `Chunk limits: <int>` determines the size of a workload package (number of iterations) for each individually spawned task. | ||
* `private`, `shared`, `first_private` and `last_private` indicate variables which should be listed in the respective OpenMP data-sharing clauses. | ||
* `reduction: [<operation>:<var>]` specifies a set of identified reduction operations and variables. | ||
|
||
|
||
## Implementation | ||
In order to implement a geometric decomposition, first open the source code file corresponding to `file_id` and navigate to line `Start line -> <line_num>`. | ||
Insert `pragma` before each of the loops mentioned in `Do-All loops` and `Reduction loops`. Make sure to replace `num-tasks` with the specified `Number of tasks`, or insert a respective variable into the source code. | ||
Modify the loop conditions of the original source code in order to allow a geometric decomposition. Each task should be responsible for processing a chunk of the size `Chunk limits`. | ||
In order to ensure a valid parallelization, you need to add the following clauses to the OpenMP pragma, if the respective lists are not empty: | ||
* `private` -> clause: `private(<vars>)` | ||
* `shared` -> clause: `shared(<vars>)` | ||
* `first_private` -> clause: `firstprivate(<vars>)` | ||
* `last_private` -> clause: `lastprivate(<vars>)` | ||
* `reduction` -> clause: `reduction(<operation>:<vars>)` | ||
|
||
### Example | ||
As an example, we will analyze the following code snippet for parallelization potential. Some location and meta data will be ignored for the sake of simplicity. | ||
|
||
int main( void) | ||
{ | ||
int i; | ||
int d=20,a=22, b=44,c=90; | ||
for (i=0; i<100; i++) { | ||
a = foo(i, d); | ||
b = bar(a, d); | ||
c = delta(b, d); | ||
} | ||
a = b; | ||
return 0; | ||
} | ||
|
||
Analyzing this code snippet results in the following geometric decomposition suggestion: | ||
``` | ||
Geometric decomposition at: 1:1 | ||
Start line: 1:2 | ||
End line: 1:12 | ||
Type: Geometric Decomposition Pattern | ||
Do-All loops: ['1:3'] // line 5 | ||
Reduction loops: [] | ||
Number of tasks: 10 | ||
Chunk limits: 10 | ||
pragma: for (i = 0; i < num-tasks; i++) #pragma omp task] | ||
private: [] | ||
shared: [] | ||
first private: ['i'] | ||
reduction: [] | ||
last private: [] | ||
``` | ||
|
||
After interpreting and implementing the suggestion, the resulting, now parallel, source code could look as follows. | ||
Since `i` has been used in the original source code already, the inserted `pragma` uses `x` instead. | ||
As a last modification, the loop conditions in the original source code need to be modified slightly in order to allow the decomposition. | ||
For a simpler interpretation of the example we have added the `chunk_size` and `tid` (task index) variables. | ||
Note: Tasks are only executed in parallel inside a `parallel region`, so the outermost `for` loop should be located inside one. Each task identifies its chunk by the index `x` of the outermost loop rather than by the thread number, since the OpenMP runtime decides which thread executes a task. The variables `a`, `b` and `c` remain shared between the tasks here, so whether additional data-sharing clauses are required depends on the analyzed code. Depending on the specific analyzed source code, a surrounding `parallel region` might already exist or a different location for the surrounding `parallel region` may be more beneficial. | ||
|
||
int main( void) | ||
{ | ||
int i; | ||
int d=20,a=22, b=44,c=90; | ||
|
||
#pragma omp parallel | ||
#pragma omp single | ||
for (int x = 0; x < 10; x++ ) { | ||
#pragma omp task firstprivate(x) private(i) | ||
{ | ||
int tid = x; // index of the chunk processed by this task | ||
int chunk_size = 10; // value of Chunk limits | ||
|
||
for (i = tid*chunk_size; i < tid*chunk_size + chunk_size; i++) { | ||
a = foo(i, d); | ||
b = bar(a, d); | ||
c = delta(b, d); | ||
} | ||
} | ||
} | ||
|
||
a = b; | ||
return 0; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,143 @@ | ||
--- | ||
layout: default | ||
title: Pipeline | ||
has_children: true | ||
parent: Parallel patterns | ||
grand_parent: Data | ||
nav_order: 3 | ||
--- | ||
|
||
|
||
# Pipeline | ||
|
||
## Reporting | ||
|
||
### Pipelines | ||
Pipelines are reported in the following format: | ||
``` | ||
Pipeline at: 1:11 | ||
Start line: 1:30 | ||
End line: 1:34 | ||
Stages: | ||
<stage_1> | ||
<stage_2> | ||
... | ||
``` | ||
The reported values shall be interpreted as follows: | ||
* `Pipeline at: <file_id>:<cu_id>`, where `file_id` identifies the parent file in `FileMapping.txt` and `cu_id` can be used for a lookup in `Data.xml`. | ||
* `Start line: <file_id>:<line_num>`, where `line_num` refers to the first source code line of the identified pipeline. | ||
* `End line: <file_id>:<line_num>`, where `line_num` refers to the last line of the pipeline loop. | ||
* `Stages` defines a list of stages contained in the identified pipeline. The specific format of the stages is described in the following. | ||
|
||
### Pipeline Stages | ||
Individual stages of a pipeline are reported in the following format: | ||
``` | ||
Node: 1:13 | ||
Start line: 1:31 | ||
End line: 1:31 | ||
pragma: "#pragma omp task" | ||
first private: ['i'] | ||
private: [] | ||
shared: ['d', 'in'] | ||
reduction: [] | ||
InDeps: [] | ||
OutDeps: ['a'] | ||
InOutDeps: [] | ||
``` | ||
|
||
The reported values shall be interpreted as follows: | ||
* `Node: <file_id>:<cu_id>`, where `file_id` identifies the parent file in `FileMapping.txt` and `cu_id` can be used for a lookup in `Data.xml`. | ||
* `Start line: <file_id>:<line_num>`, where `line_num` refers to the first source code line of the identified pipeline stage. | ||
* `End line: <file_id>:<line_num>`, where `line_num` refers to the last line of the stage. | ||
* `pragma:` shows which type of OpenMP pragma should be inserted before the `start line`. | ||
* `private: [<vars>]` lists the variables which have been identified as thread-`private`. | ||
* The same interpretation applies to the following values as well: | ||
* `shared` | ||
* `first_private` | ||
* `reduction: [<operation>:<var>]` specifies a set of identified reduction operations and variables. | ||
* `InDeps: [<vars>]` specifies `in`-dependencies according to the [OpenMP depend clause](https://www.openmp.org/spec-html/5.0/openmpsu99.html). | ||
* `OutDeps: [<vars>]` specifies `out`-dependencies according to the [OpenMP depend clause](https://www.openmp.org/spec-html/5.0/openmpsu99.html). | ||
* `InOutDeps: [<vars>]` specifies `inout`-dependencies according to the [OpenMP depend clause](https://www.openmp.org/spec-html/5.0/openmpsu99.html). | ||
|
||
|
||
## Implementation | ||
In order to implement a suggested pipeline, first navigate to the source code location specified by `Pipeline at:`. | ||
For each individual stage the following OpenMP pragmas and clauses need to be added to the source code, if the respective lists are not empty: | ||
* Insert `pragma` prior to the `start line` mentioned by the stage. | ||
* If `private` is not empty, add the clause `private(<vars>)` to the pragma, where the variables are separated by commas. | ||
* Do the same for: | ||
* `shared` -> clause: `shared(<vars>)` | ||
* `first_private` -> clause: `firstprivate(<vars>)` | ||
* `reduction` -> clause: `reduction(<operation>:<vars>)` | ||
* `InDeps` -> clause: `depend(in:<vars>)` | ||
* `OutDeps` -> clause: `depend(out:<vars>)` | ||
* `InOutDeps` -> clause: `depend(inout:<vars>)` | ||
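
The mapping from the reported lists to a complete pragma line can also be expressed as a small helper. The following sketch assumes that each list has already been converted into a comma-separated string such as `"d, in"`; the functions `add_clause` and `build_stage_pragma` are hypothetical and not part of the tool. The `reduction` clause is left out for brevity, since it additionally requires the operation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Appends " <prefix><vars>)" to `pragma` unless `vars` is empty. */
static void add_clause(char *pragma, size_t size, const char *prefix, const char *vars) {
    assert(pragma != NULL && prefix != NULL && vars != NULL);
    if (vars[0] == '\0') return; /* empty list: emit no clause */
    size_t used = strlen(pragma);
    snprintf(pragma + used, size - used, " %s%s)", prefix, vars);
}

/* Assembles the pragma line for one pipeline stage from its lists. */
static void build_stage_pragma(char *pragma, size_t size,
                               const char *first_private, const char *private_vars,
                               const char *shared, const char *in_deps,
                               const char *out_deps, const char *inout_deps) {
    snprintf(pragma, size, "#pragma omp task");
    add_clause(pragma, size, "firstprivate(", first_private);
    add_clause(pragma, size, "private(", private_vars);
    add_clause(pragma, size, "shared(", shared);
    add_clause(pragma, size, "depend(in:", in_deps);
    add_clause(pragma, size, "depend(out:", out_deps);
    add_clause(pragma, size, "depend(inout:", inout_deps);
}
```

For a stage with `first private: ['i']`, `shared: ['d', 'in']` and `OutDeps: ['a']`, the helper produces `#pragma omp task firstprivate(i) shared(d, in) depend(out:a)`.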
|
||
|
||
### Example | ||
As an example, we will analyze the following code snippet for parallelization potential. Some location and meta data will be ignored for the sake of simplicity. | ||
|
||
int i; | ||
int d=20,a=22, b=44,c=90; | ||
for (i=0; i<100; i++) { | ||
a = foo(i, d); | ||
b = bar(a, d); | ||
c = delta(b, d); | ||
} | ||
a = b; | ||
|
||
Analyzing this code snippet results in the following parallelization suggestion: | ||
``` | ||
Pipeline at: | ||
Start line: 1:3 | ||
End line: 1:7 | ||
Stages: | ||
Node: 1:13 | ||
Start line: 1:4 | ||
End line: 1:4 | ||
pragma: "#pragma omp task" | ||
first private: ['i'] | ||
private: [] | ||
shared: ['d', 'in'] | ||
reduction: [] | ||
InDeps: [] | ||
OutDeps: ['a'] | ||
InOutDeps: [] | ||
Start line: 1:5 | ||
End line: 1:5 | ||
pragma: "#pragma omp task" | ||
first private: [] | ||
private: [] | ||
shared: ['d', 'in'] | ||
reduction: [] | ||
InDeps: ['a'] | ||
OutDeps: ['b'] | ||
InOutDeps: [] | ||
Start line: 1:6 | ||
End line: 1:7 | ||
pragma: "#pragma omp task" | ||
first private: [] | ||
private: ['c'] | ||
shared: ['d', 'in'] | ||
reduction: [] | ||
InDeps: ['b'] | ||
OutDeps: [] | ||
InOutDeps: [] | ||
``` | ||
|
||
After interpreting and implementing the suggestion, the resulting, now parallel, source code could look as follows: | ||
|
||
int i; | ||
int d=20,a=22, b=44,c=90; | ||
for (i=0; i<100; i++) { | ||
#pragma omp task firstprivate(i) shared(d, in) depend(out:a) | ||
a = foo(i, d); | ||
#pragma omp task shared(d, in) depend(in:a) depend(out:b) | ||
b = bar(a, d); | ||
#pragma omp task private(c) shared(d, in) depend(in:b) | ||
c = delta(b, d); | ||
} | ||
a = b; | ||
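
Since OpenMP tasks only run concurrently inside a `parallel` region, a self-contained variant of the pipeline could look like the following sketch. It rests on several assumptions: the bodies of `foo`, `bar` and `delta` are invented placeholders, the wrapper `run_pipeline` is a hypothetical name, and the variable `in` from the `shared` lists is omitted because the analyzed snippet does not declare it.

```c
#include <assert.h>

/* Placeholder stage functions; the analyzed snippet does not define them. */
static int foo(int i, int d)   { return i + d; }
static int bar(int a, int d)   { return a * 2 - d; }
static int delta(int b, int d) { return b - d; }

/* Runs the three-stage pipeline for n iterations and returns the final a. */
static int run_pipeline(int n, int d) {
    assert(n > 0);
    int i;
    int a = 22, b = 44, c = 90;
    #pragma omp parallel
    #pragma omp single
    for (i = 0; i < n; i++) {
        /* Stage 1 produces a. */
        #pragma omp task firstprivate(i) shared(a, d) depend(out:a)
        a = foo(i, d);
        /* Stage 2 consumes a and produces b. */
        #pragma omp task shared(a, b, d) depend(in:a) depend(out:b)
        b = bar(a, d);
        /* Stage 3 consumes b; c is private as reported by the suggestion. */
        #pragma omp task private(c) shared(b, d) depend(in:b)
        c = delta(b, d);
    }
    a = b; /* executed after the region, when all tasks have finished */
    (void)c;
    return a;
}
```

One thread creates all tasks inside the `single` construct, while the `depend` clauses keep the stages of one iteration in order and prevent a later iteration from overwriting `a` or `b` before they have been consumed.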
|