Add the option to measure separate timers per thread #3378

Open
wants to merge 14 commits into
base: master
4 changes: 4 additions & 0 deletions CMakeLists.txt
@@ -68,6 +68,8 @@ set( with-models OFF CACHE STRING "The models to include as a semicolon-separate
set( tics_per_ms "1000.0" CACHE STRING "Specify elementary unit of time [default=1000 tics per ms]." )
set( tics_per_step "100" CACHE STRING "Specify resolution [default=100 tics per step]." )
set( with-detailed-timers OFF CACHE STRING "Build with detailed internal time measurements [default=OFF]. Detailed timers can affect the performance." )
set( with-mpi-sync-timer OFF CACHE STRING "Build with MPI synchronization barrier and timer [default=OFF]. Can affect the performance." )
set( with-threaded-timers ON CACHE STRING "Build with one internal timer per thread [default=ON]. Multi-threaded timers can affect the performance." )
set( target-bits-split "standard" CACHE STRING "Split of the 64-bit target neuron identifier type [default='standard']. 'standard' is recommended for most users. If running on more than 262144 MPI processes or more than 512 threads, change to 'hpc'." )

# generic build configuration
@@ -143,6 +145,8 @@ nest_process_with_gsl()
nest_process_with_openmp()
nest_process_with_mpi()
nest_process_with_detailed_timers()
nest_process_with_threaded_timers()
nest_process_with_mpi_sync_timer()
nest_process_with_libneurosim()
nest_process_with_music()
nest_process_with_sionlib()
14 changes: 14 additions & 0 deletions cmake/ConfigureSummary.cmake
@@ -150,6 +150,20 @@ function( NEST_PRINT_CONFIG_SUMMARY )
message( "Detailed timers : No" )
endif ()

message( "" )
if ( THREADED_TIMERS )
message( "Threaded timers : Yes" )
else ()
message( "Threaded timers : No" )
endif ()

message( "" )
if ( MPI_SYNC_TIMER )
message( "MPI sync timer : Yes" )
else ()
message( "MPI sync timer : No" )
endif ()

message( "" )
if ( HAVE_MUSIC )
message( "Use MUSIC : Yes (MUSIC ${MUSIC_VERSION})" )
14 changes: 14 additions & 0 deletions cmake/ProcessOptions.cmake
@@ -462,6 +462,20 @@ function( NEST_PROCESS_WITH_DETAILED_TIMERS )
endif ()
endfunction()

function( NEST_PROCESS_WITH_THREADED_TIMERS )
set( THREADED_TIMERS OFF PARENT_SCOPE )
if ( ${with-threaded-timers} STREQUAL "ON" )
set( THREADED_TIMERS ON PARENT_SCOPE )
endif ()
endfunction()

function( NEST_PROCESS_WITH_MPI_SYNC_TIMER )
set( MPI_SYNC_TIMER OFF PARENT_SCOPE )
if ( ${with-mpi-sync-timer} STREQUAL "ON" )
set( MPI_SYNC_TIMER ON PARENT_SCOPE )
endif ()
endfunction()

function( NEST_PROCESS_WITH_LIBNEUROSIM )
# Find libneurosim
set( HAVE_LIBNEUROSIM OFF PARENT_SCOPE )
@@ -556,7 +556,7 @@ For example, the ``stopwatch.h`` file could look like:
}

inline nest::Stopwatch::timestamp_t
nest::Stopwatch::elapsed_timestamp() const
nest::Stopwatch::elapsed_us() const
{
#ifndef DISABLE_TIMING
if ( isRunning() )
@@ -622,7 +622,7 @@ For example, the ``stopwatch.h`` file could look like:
}

inline nest::Stopwatch::timestamp_t
nest::Stopwatch::get_timestamp()
nest::Stopwatch::get_current_time()
{
// works with:
// * hambach (Linux 2.6.32 x86_64)
@@ -637,12 +637,12 @@ For example, the ``stopwatch.h`` file could look like:
} /* namespace timer */
#endif /* STOPWATCH_H */

And the corresponding ``stopwatch.cpp``:
And the corresponding ``stopwatch_impl.h``:

.. code:: cpp

/*
* stopwatch.cpp
* stopwatch_impl.h
*
* This file is part of NEST.
*
10 changes: 8 additions & 2 deletions doc/htmldoc/installation/cmake_options.rst
@@ -102,7 +102,7 @@ For more details, see the :ref:`Python binding <compile_with_python>` section be
.. _performance_cmake:

Maximize performance, reduce energy consumption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following options help to optimize NEST for maximal performance and thus reduced energy consumption.

@@ -126,7 +126,7 @@ The following options help to optimize NEST for maximal performance and thus red
in place.
* Using ``-march=native`` requires that you build NEST on the same CPU architecture as you will use to run it.
* For the technically minded: Even just using ``-O3`` removes some ``assert()`` statements from NEST since we
have wrapped some of them in functions, which get eliminated due to interprocedural optimization.
have wrapped some of them in functions, which get eliminated due to interprocedural optimization.



@@ -197,8 +197,14 @@ NEST properties
+-----------------------------------------------+----------------------------------------------------------------+
| ``-Dtics_per_step=[number]`` | Specify resolution [default=100 tics per step]. |
+-----------------------------------------------+----------------------------------------------------------------+
| ``-Dwith-threaded-timers=[OFF|ON]`` | Build with one internal timer per thread [default=ON]. |
| | Multi-threaded timers can affect the performance. |
+-----------------------------------------------+----------------------------------------------------------------+
| ``-Dwith-detailed-timers=[OFF|ON]`` | Build with detailed internal time measurements [default=OFF]. |
| | Detailed timers can affect the performance. |
+-----------------------------------------------+----------------------------------------------------------------+
| ``-Dwith-mpi-sync-timer=[OFF|ON]`` | Build with MPI synchronization barrier and timer [default=OFF].|
| | Can affect the performance. |
+-----------------------------------------------+----------------------------------------------------------------+
| ``-Dtarget-bits-split=['standard'|'hpc']`` | Split of the 64-bit target neuron identifier type |
| | [default='standard']. 'standard' is recommended for most users.|
83 changes: 52 additions & 31 deletions doc/htmldoc/nest_behavior/built-in_timers.rst
@@ -6,11 +6,9 @@ Built-in timers
Basic timers
------------

Basic built-in timers keep track of the time NEST spent for network
construction and actual simulation (propagation of the network
state). These timers are active in all simulations with NEST, and the
measured times can be checked by querying the corresponding kernel
attributes. For example:
Basic built-in timers keep track of the time NEST spent for network construction and actual simulation (propagation of
the network state). These timers are active in all simulations with NEST, and the measured times can be checked by
querying the corresponding kernel attributes. For example:

::

@@ -22,7 +20,7 @@ The following basic time measurements are available:
|Name |Explanation |
+=============================+==================================+
|``time_construction_create`` |Cumulative time NEST spent |
| |creating neurons and devices |
| |creating neurons and devices |
+-----------------------------+----------------------------------+
|``time_construction_connect``|Cumulative time NEST spent |
| |creating connections |
@@ -33,19 +31,14 @@ The following basic time measurements are available:

.. note ::

While preparing the actual simulation after network construction,
NEST needs to build the pre-synaptic part of the connection
infrastructure, which requires MPI communication (`Jordan et
al. 2018 <https://doi.org/10.3389/fninf.2018.00002>`__). This
happens only for the first call to ``Simulate()`` unless
connectivity changed in the meantime, and it may cause significant
overhead by adding to ``time_simulate``. Therefore, the cumulative
time NEST spent for building the pre-synaptic connection
infrastructure is also tracked by a basic timer and available as
kernel attribute ``time_communicate_prepare``.
While preparing the actual simulation after network construction, NEST needs to build the pre-synaptic part of the
connection infrastructure, which requires MPI communication (`Jordan et al. 2018
<https://doi.org/10.3389/fninf.2018.00002>`__). This happens only for the first call to ``Simulate()`` unless
connectivity changed in the meantime, and it may cause significant overhead by adding to ``time_simulate``.
Therefore, the cumulative time NEST spent for building the pre-synaptic connection infrastructure is also tracked by
a basic timer and available as kernel attribute ``time_communicate_prepare``.

In the context of NEST performance monitoring, other useful kernel
attributes are:
In the context of NEST performance monitoring, other useful kernel attributes are:

+-----------------------+----------------------------------+
|Name |Explanation |
@@ -54,39 +47,34 @@ attributes are:
+-----------------------+----------------------------------+
|``local_spike_counter``|Number of spikes emitted by the |
| |neurons represented on this MPI |
| |rank during the last |
| |rank during the last |
| |``Simulate()`` |
+-----------------------+----------------------------------+

.. note ::

``nest.ResetKernel()`` resets all time measurements as well as
``biological_time`` and ``local_spike_counter``.
``nest.ResetKernel()`` resets all time measurements as well as ``biological_time`` and ``local_spike_counter``.


Detailed timers
---------------

Detailed built-in timers can be activated (and again deactivated)
prior to compilation through the CMake flag
``-Dwith-detailed-timers=ON``. They provide further insights into the
time NEST spends in different phases of the simulation cycle, but they
can impact the runtime. Therefore, detailed timers are by default
inactive.
Detailed built-in timers can be activated (and again deactivated) prior to compilation through the CMake flag
``-Dwith-detailed-timers=ON``. They provide further insights into the time NEST spends in different phases of the
simulation cycle, but they can impact the runtime. Therefore, detailed timers are by default inactive.

If detailed timers are active, the following time measurements are
available as kernel attributes:
If detailed timers are active, the following time measurements are available as kernel attributes:

+--------------------------------+----------------------------------+----------------------------------+
|Name |Explanation |Part of |
+================================+==================================+==================================+
|``time_gather_target_data`` |Cumulative time for communicating |``time_communicate_prepare`` |
| |connection information from | |
| |postsynaptic to presynaptic side | |
| |postsynaptic to presynaptic side | |
+--------------------------------+----------------------------------+----------------------------------+
|``time_communicate_target_data``|Cumulative time for core MPI |``time_gather_target_data`` |
| |communication when gathering | |
| |target data | |
| |target data | |
+--------------------------------+----------------------------------+----------------------------------+
|``time_update`` |Time for neuron update |``time_simulate`` |
+--------------------------------+----------------------------------+----------------------------------+
@@ -107,3 +95,36 @@ available as kernel attributes:
| |buffers of the corresponding | |
| |postsynaptic neurons | |
+--------------------------------+----------------------------------+----------------------------------+
|``time_omp_synchronization_construction`` |Synchronization time of threads during network construction. |``time_construction_create``, ``time_construction_connect``, ``time_communicate_prepare`` |
+--------------------------------+----------------------------------+----------------------------------+
|``time_omp_synchronization_simulation`` |Synchronization time of threads during simulation. |``time_simulate`` |
+--------------------------------+----------------------------------+----------------------------------+

MPI synchronization timer
-------------------------
To measure the synchronization time between MPI processes, an additional timer can be activated via the
``-Dwith-mpi-sync-timer=ON`` CMake flag. It measures the time between the end of a process's update phase (i.e., neuron
state propagation) and the start of the collective communication of spikes among all MPI processes. Enabling this timer
inserts an additional MPI barrier right before communication starts, which might affect performance.

+-----------------------------+---------------------------------------+
|Name |Explanation |
+=============================+=======================================+
|``time_mpi_synchronization`` |Time spent waiting for other processes.|
+-----------------------------+---------------------------------------+
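The idea behind this measurement can be sketched without MPI. The following is a minimal illustration only, not NEST's implementation: it models processes as Python threads and the MPI barrier as a ``threading.Barrier``. The names ``NUM_RANKS``, ``rank`` and ``sync_time`` are hypothetical and the sleep durations are arbitrary stand-ins for the update phase.

```python
import threading
import time

NUM_RANKS = 3  # hypothetical number of processes, modelled here as threads
barrier = threading.Barrier(NUM_RANKS)
sync_time = [0.0] * NUM_RANKS  # one slot per rank, so no locking is needed


def rank(rank_id):
    # Stand-in for the update phase; higher ranks take longer.
    time.sleep(0.03 * (rank_id + 1))
    start = time.perf_counter()  # update finished, waiting begins
    barrier.wait()               # extra barrier placed before communication
    sync_time[rank_id] = time.perf_counter() - start
    # Collective spike communication would start here.


threads = [threading.Thread(target=rank, args=(r,)) for r in range(NUM_RANKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

for r, seconds in enumerate(sync_time):
    print(f"rank {r}: waited {seconds:.3f}s")
```

The rank that finishes its update first waits longest at the barrier, which is the load imbalance such a timer is meant to expose.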

Multi-threaded timers
---------------------
In previous NEST versions, timers were measured only by the master thread. Since NEST 3.9, every timer that runs in a
parallel (multi-threaded) context is recorded by each thread individually.

The legacy timer behavior can be restored via the ``-Dwith-threaded-timers=OFF`` CMake flag.
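The difference between the two modes can be illustrated with a small, self-contained sketch. It is a conceptual stand-in written with Python threads, not NEST's OpenMP-based implementation; ``NUM_THREADS``, ``update`` and ``elapsed`` are hypothetical names and the sleep durations only simulate uneven load.

```python
import threading
import time

NUM_THREADS = 4  # hypothetical thread count for illustration


def update(thread_id, elapsed):
    """Stand-in for a parallel section; each thread times its own share."""
    start = time.perf_counter()
    time.sleep(0.02 * (thread_id + 1))  # uneven load: higher ids work longer
    elapsed[thread_id] = time.perf_counter() - start


elapsed = [0.0] * NUM_THREADS  # one slot per thread, so no locking is needed
threads = [
    threading.Thread(target=update, args=(tid, elapsed))
    for tid in range(NUM_THREADS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

master_only = elapsed[0]  # what a master-thread-only timer would report
for tid, seconds in enumerate(elapsed):
    print(f"thread {tid}: {seconds:.3f}s")
print(f"master only: {master_only:.3f}s, slowest thread: {max(elapsed):.3f}s")
```

With uneven load, the master thread's value alone hides how long the slowest thread took, which is the information per-thread timers add.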

Wall-time vs. CPU-time
-------------------------
All timers in NEST measure the actual wall-time that elapses between starting and stopping the timer. To measure only
the time spent on computation, each of the timers above has an additional variant with the suffix ``_cpu``, which is
accessed in exactly the same way. For example:
::

nest.time_simulate_cpu
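The distinction between the two kinds of time can be reproduced outside NEST. The sketch below uses only the Python standard library and makes no claim about how NEST obtains CPU time internally; ``measure``, ``waiting`` and ``computing`` are hypothetical helper names.

```python
import time


def measure(work):
    """Run work() and return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()  # wall clock
    cpu_start = time.process_time()   # CPU time of this process, excludes sleep
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def waiting():
    time.sleep(0.2)  # idle: consumes wall time but almost no CPU time


def computing():
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


wait_wall, wait_cpu = measure(waiting)
calc_wall, calc_cpu = measure(computing)
print(f"waiting:   wall={wait_wall:.3f}s cpu={wait_cpu:.3f}s")
print(f"computing: wall={calc_wall:.3f}s cpu={calc_cpu:.3f}s")
```

A large gap between the wall-time and CPU-time variant of a timer therefore points to waiting (for example on synchronization or communication) rather than calculation.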
1 change: 0 additions & 1 deletion libnestutil/CMakeLists.txt
@@ -31,7 +31,6 @@ set( nestutil_sources
numerics.h numerics.cpp
regula_falsi.h
sort.h
stopwatch.h stopwatch.cpp
string_utils.h
vector_util.h
)
6 changes: 6 additions & 0 deletions libnestutil/config.h.in
@@ -182,6 +182,12 @@
/* Whether to enable detailed NEST internal timers */
#cmakedefine TIMER_DETAILED 1

/* Whether to use one NEST internal timer per thread */
#cmakedefine THREADED_TIMERS 1

/* Whether to use the MPI synchronization timer (including an additional barrier) */
#cmakedefine MPI_SYNC_TIMER 1

/* Whether to do full logging */
#cmakedefine ENABLE_FULL_LOGGING 1

33 changes: 0 additions & 33 deletions libnestutil/stopwatch.cpp

This file was deleted.
