Using CoreNeuron in Practice
- If your simulation doesn't fit in memory
- If you want to run your simulation faster
The parameter target_simulator in the simulation config file specifies which simulator to run, "NEURON" or "CORENEURON". For example:
{
...
"target_simulator": "CORENEURON",
...
}
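As a small illustration, the simulator choice could be switched programmatically before launching a run. This is only a sketch: it assumes the config is plain JSON with a top-level "target_simulator" key, as in the example above, and the helper name set_target_simulator is hypothetical (it is not part of neurodamus).

```python
import json

# The two simulator names mentioned in this guide.
VALID_SIMULATORS = ("NEURON", "CORENEURON")


def set_target_simulator(config_text, simulator):
    """Return config JSON text with "target_simulator" set to `simulator`.

    Hypothetical helper: assumes a top-level "target_simulator" key.
    """
    if simulator not in VALID_SIMULATORS:
        raise ValueError(f"simulator must be one of {VALID_SIMULATORS}")
    config = json.loads(config_text)
    config["target_simulator"] = simulator
    return json.dumps(config, indent=2)


print(set_target_simulator('{"target_simulator": "NEURON"}', "CORENEURON"))
```

Validating the value up front avoids launching a job with a misspelled simulator name.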
CoreNEURON uses 7x less memory than NEURON. This means we can simulate 7x larger networks on the same compute nodes if CoreNEURON is used instead of NEURON. If you want to simulate such large networks and are getting "Out of memory" errors, we suggest using multi-step execution, which runs the model building phase in X steps and then continues the simulation with CoreNEURON.
You can choose X based on circuit size. For example, let's say you can simulate 100k cells on 16 nodes. If you want to simulate 200k cells on the same 16 nodes, you should use X = 2. The command line option --modelbuilding-steps is used to set the number of steps, with a default value of 1.
srun dplace special -mpi -python $NEURODAMUS_PYTHON/init.py --configFile=simulation_config.json --modelbuilding-steps=2
Note: --modelbuilding-steps works only with CoreNEURON. If the NEURON simulator is chosen, this option is ignored and you will see a warning in the log:
[WARNING] IGNORING ModelBuildingSteps since simulator is not CORENEURON
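The rule of thumb for choosing X can be written as a short calculation. The sketch below assumes that memory use during model building grows roughly linearly with the number of cells, which is what the 100k/200k example implies; the function name estimate_modelbuilding_steps is hypothetical, and the result should be treated as a starting point rather than a guarantee.

```python
import math


def estimate_modelbuilding_steps(target_cells, cells_that_fit):
    """Estimate X for --modelbuilding-steps.

    Assumption: memory use during model building grows roughly linearly
    with the number of cells, so X is the size ratio rounded up.
    """
    if target_cells <= 0 or cells_that_fit <= 0:
        raise ValueError("cell counts must be positive")
    return max(1, math.ceil(target_cells / cells_that_fit))


# Example from the text: 100k cells fit on 16 nodes, the target is 200k cells.
steps = estimate_modelbuilding_steps(200_000, 100_000)
print(f"--modelbuilding-steps={steps}")  # prints --modelbuilding-steps=2
```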
If you would like to run very long simulations, or would like to re-use part of a simulation to bootstrap other simulations, you can use the save-restore feature of NEURON/CoreNEURON with the help of the following command line options:
>neurodamus --help
neurodamus
Usage:
neurodamus <ConfigFile> [options]
neurodamus --help
Options:
...
--save=<PATH> Path to create a save point to enable resume.
--save-time=<TIME> The simulation time [ms] to save the state. (Default: At the end)
--restore=<PATH> Restore and resume simulation from a save point on disk
...
Without Save-Restore, the workflow of a full simulation looks like the following:
In the above workflow we are using CoreNEURON to run the full workflow for a simulation of 3000 msec.
Let's assume that we want to break this simulation into three parts of 1000 msec each. In this case, the overall workflow of the simulation is shown in the figure below:
- Part I : We specify tstop as 1000 msec in the run section of the simulation config file and add the command line option --save=<path-x>/output/checkpoint. The save option tells the simulator to save the state of the simulation at the end of the specified tstop in the provided directory (in this case the checkpoint directory, which must be under the output_dir directory, i.e. output here).
srun <srun params> special -mpi -python $NEURODAMUS_PYTHON/init.py --save=<path-x>/output/checkpoint
- Part II : We specify tstop as 2000 msec and add the command line option --restore=<path-x>/output/checkpoint. The restore option tells the simulator to restore the state of the simulation from the specified directory (<path-x>/output/checkpoint in this case). As we are restoring from the previous simulation state, the simulation will start from 1000 msec and continue until 2000 msec. As we have also specified --save=<path-y>/output/checkpoint on the command line, the simulator will save the state of the simulation at 2000 msec in the <path-y>/output/checkpoint directory.
srun <srun params> special -mpi -python $NEURODAMUS_PYTHON/init.py --restore=<path-x>/output/checkpoint --save=<path-y>/output/checkpoint
- Part III : This is the last step of the simulation. We now restore the simulation from the <path-y>/output/checkpoint directory (i.e. from 2000 msec) and continue until 3000 msec. As there won't be further simulations, we do not save the state of the simulation at 3000 msec.
srun <srun params> special -mpi -python $NEURODAMUS_PYTHON/init.py --restore=<path-y>/output/checkpoint
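The three parts above follow a regular pattern: every part after the first restores from the previous checkpoint, and every part except the last saves a new one. A minimal sketch of that pattern is shown below. The function name plan_save_restore is hypothetical; it only computes the tstop values and the --restore/--save options described above, and does not launch anything or edit the config file.

```python
def plan_save_restore(total_ms, parts, checkpoint_dirs):
    """Split a run of `total_ms` into `parts` equal segments.

    Returns one dict per segment with the tstop to put in the "run" section
    of the config and the --restore/--save options for the command line.
    `checkpoint_dirs` holds one save directory per segment except the last.
    """
    if parts < 1 or len(checkpoint_dirs) != parts - 1:
        raise ValueError("need exactly parts - 1 checkpoint directories")
    segment = total_ms / parts
    plan = []
    for i in range(parts):
        options = []
        if i > 0:  # every part after the first resumes from the previous save
            options.append(f"--restore={checkpoint_dirs[i - 1]}")
        if i < parts - 1:  # the last part does not save
            options.append(f"--save={checkpoint_dirs[i]}")
        plan.append({"tstop": segment * (i + 1), "options": options})
    return plan


# Placeholder paths, mirroring the three-part example.
dirs = ["<path-x>/output/checkpoint", "<path-y>/output/checkpoint"]
for step in plan_save_restore(3000, 3, dirs):
    print(step["tstop"], " ".join(step["options"]))
```

Each printed line gives the tstop for that part followed by the options to append to the srun command.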
Important Note:
- The save directory must be an immediate sub-directory of the output directory.
- The coreneuron_input data directories are necessary for subsequent simulations, so do not remove them until you finish all simulations.
- The coreneuron_input folder will be automatically deleted at the end of the last restore.
- Simulation is terminating with "Out of Memory" error on KNL
Skylake nodes have 384GB of DRAM per node, whereas KNL nodes have only 96GB. Hence, your simulation may not run on KNL if you use the same number of nodes as on the Skylake partition. You can increase the number of nodes on KNL with the extra Slurm job script option "#SBATCH --qos=bigjob", which raises the limit to 96 nodes.
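The memory figures above give a rough way to size a KNL job from a working Skylake job. The sketch below assumes that total DRAM is the only constraint and that the simulation's memory needs are the same on both partitions; the function name knl_nodes_needed is hypothetical.

```python
import math

SKYLAKE_GB_PER_NODE = 384  # from the text
KNL_GB_PER_NODE = 96       # from the text
KNL_MAX_NODES = 96         # limit quoted for --qos=bigjob


def knl_nodes_needed(skylake_nodes):
    """KNL node count with at least the same total DRAM as the Skylake run.

    Assumption: total memory is the only constraint.
    """
    total_gb = skylake_nodes * SKYLAKE_GB_PER_NODE
    nodes = math.ceil(total_gb / KNL_GB_PER_NODE)
    if nodes > KNL_MAX_NODES:
        raise ValueError(f"{nodes} KNL nodes exceeds the {KNL_MAX_NODES}-node limit")
    return nodes


# A job that fits on 16 Skylake nodes needs 4x as many KNL nodes.
print(knl_nodes_needed(16))  # prints 64
```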
- What are the expected performance improvements?
This depends on the model, but here are some examples with execution time comparisons:
- Simulation of 400 cells for 200 msec (from András Ecker):
  - On single Skylake node : SpeedUp 4.5x
  - On single KNL node : SpeedUp 5.9x
- Simulation of 400 cells for 400 msec (from Oren Amsalem):
  - On single Skylake node : SpeedUp 4.8x
  - On single KNL node : SpeedUp 9.7x