------------------- Released version 2.6 -----------------------------
* Build system improvements:
- Auto-detect Cray XC platforms with ARM CPUs, supporting Cray,
ARM, and GCC compilers
- Added support for Clang and AMD AOCC compilers
- Updated support for Spectrum MPI
* Automatic trace analyzer changes & improvements:
- Revised "Early Reduce" wait state definition.
- Added calculation of "Early Reduce" delay costs.
- Fixed various delay cost calculation and propagation issues.
- Fixed various inconsistencies between wait-state and root-cause
analysis.
- Made POSIX threads analysis consistent with Score-P by avoiding
thread function stub call paths underneath 'pthread_create'.
This also fixes a deadlock when analyzing traces containing
"orphaned threads".
* Measurement nexus (scan) changes:
- Added preset mode for multi-run measurements with a preset for
POP analysis requirements as a use case.
- Added support for multiple file systems in SCAN_TRACE_FILESYS
by using a colon-separated list of paths.
* Analysis report postprocessing changes:
- Added metric hierarchies for CUDA, OpenCL, and OpenACC.
(NOTE: The trace analysis still only supports host-side events!)
- Renamed '-c' command-line option of 'square' to '-C' for running
sanity checks on newly created reports.
- Added new '-c' command-line option to 'square' to allow specifying
the number of counters considered during report scoring (for
consistency with 'scorep-score').
- Added new '-x' command-line option to 'square' to allow passing
options directly through to 'scorep-score'.
- Avoided unnecessary aggregation/postprocessing of reports with
multi-run experiments.
* Substantial code cleanup.
------------------- Released version 2.5 -----------------------------
* Support for
- Score-P v5.0, incl. virtual process/thread topologies
* Automatic trace analyzer changes & improvements:
- Various fixes and improvements in timestamp correction algorithm.
- Fixed 'Late Receiver' instance tracking.
- Slightly improved analysis report data collation.
* Added support for multi-run experiments.
* Code refactoring and various bug fixes.
* Improved user documentation:
- Revised User Guide including command reference.
- Added man pages.
------------------- Released version 2.4 -----------------------------
* Support for
- Cube v4.4
* Build system improvements:
- Fixed build issues with compilers defaulting to C++11 or higher
(e.g., Intel 2017, PGI 17).
- Fixed build issues with PGI 16+ compilers (pgCC no longer available).
- Fixed build issues on Cray systems, now also properly taking the
CRAYPE_LINK_TYPE setting into account.
* Automatic trace analyzer changes & improvements:
- Fixed rare crash/deadlock in critical-path/delay analysis while
analyzing MPI persistent communication.
- Improved memory management.
- Improved handling of OTF2 traces in SIONlib containers.
- Improved trace reading times, especially at scale.
- Fixed detection of wait states in active-target synchronization
based on EPIK traces
* Code refactoring and various bug fixes.
------------------ Released version 2.3.1 ----------------------------
* Build system improvements:
- Fixed build issue with GCC 6.1.
- Fixed build issue on the Intel Xeon Phi platform.
------------------- Released version 2.3 -----------------------------
* Support for
- Score-P v2.0
- OTF2 v2.0
* Automatic trace analyzer changes & improvements:
- Experimental support for Score-P traces collected using
sampling (see OPEN_ISSUES for limitations).
* Improved analysis report postprocessing:
- Revised metric hierarchies (organization, metric naming, etc.).
- Suppress calculation of performance properties that are
only relevant for unused parallel programming models.
* Performance property documentation fixes & improvements.
* Build system improvements.
* Code refactoring and various bug fixes.
------------------- Released version 2.2.2 ---------------------------
* Platform support:
- Fixed a build issue on the Intel Xeon Phi platform.
- Improved support for the 'ibrun' launcher.
* Automatic trace analyzer changes & improvements:
- Worked around rare run-time issue with MVAPICH2.
------------------- Released version 2.2.1 ---------------------------
* Platform support:
- Added build system support for Power8/Linux.
- Added build system support for 64-bit ARM/Linux (AArch64).
- Prefer linking static over dynamic Cube/OTF2 libraries on
Fujitsu K/FX10/FX100.
* Automatic trace analyzer changes & improvements:
- Fixed delay-cost propagation through OpenMP barrier wait states.
- Various algorithmic optimizations reducing overall analysis
time for traces of multi-threaded applications:
~ Improved memory management.
~ Improved trace preprocessing.
~ Improved timestamp correction.
* Code refactoring and various bug fixes.
------------------- Released version 2.2 -----------------------------
* Support for
- Score-P v1.4
- OTF2 v1.5, incl. full SIONlib support (if configured)
- Cube v4.3
* Platform support:
- Added support for Intel Xeon Phi, native mode only.
- Added support for Fujitsu FX100 (thanks to T. Nakamura,
Fujitsu Ltd.).
* Automatic trace analyzer changes & improvements:
- Added basic support for POSIX threads.
- Added basic support for OpenMP tasking.
- Added lock contention analysis (OpenMP & POSIX threads).
- Added root-cause/delay analysis (MPI & OpenMP).
- New command-line options '--[no-]rootcause'.
* Code refactoring and various bug fixes.
------------------- Released version 2.1 -----------------------------
* Support for
- Score-P v1.3
- OTF2 v1.4
* Platform support:
- Added support for Fujitsu FX10 & K computer.
- Improved support for Cray systems.
* Automatic trace analyzer changes & improvements:
- Added critical-path analysis.
- Improved Late Receiver detection.
- New command-line options '--[no-]critical-path' and '--single-pass'.
- Fixed crash in data collation when number of OpenMP threads varied
among MPI processes.
* Code refactoring and various small bug fixes.
* Initial version of updated User Guide (still work in progress).
------------------- Released version 2.0 -----------------------------
* Support for
- Score-P v1.2
- OTF2 v1.2
- Cube v4.2
* New build system based on GNU autotools.
* Significant amount of code refactoring.
* Automatic trace analyzer changes & improvements:
- Support for arbitrarily deep system trees.
- Improved performance of timestamp correction.
- Pattern instance tracking and statistics are now enabled by
default.
- New command-line options '--verbose', '--[no-]time-correct',
and '--[no-]statistics'.
- Limited backward-compatibility support for handling existing
traces in EPILOG format generated by Scalasca v1.