diff --git a/docs/coreneuron/how-to/bbcorepointer.md b/docs/coreneuron/how-to/bbcorepointer.md new file mode 100644 index 0000000000..423396d8a3 --- /dev/null +++ b/docs/coreneuron/how-to/bbcorepointer.md @@ -0,0 +1,93 @@ +
## Transferring dynamically allocated data between NEURON and CoreNEURON


User-allocated data can be managed in NMODL using the `POINTER` type. It allows the
programmer to reference data that has been allocated in HOC or in VERBATIM blocks. This
allows for more advanced data structures that are not natively supported in NMODL.

Since NEURON itself has no knowledge of the layout and size of this data, it cannot
transfer `POINTER` data automatically to CoreNEURON. Furthermore, in many cases there
is no need to transfer the data between NEURON and CoreNEURON. In some cases, however, the
programmer would like to transfer certain user-defined data into CoreNEURON. The most
prominent example is Random123 RNG stream parameters used in synapse mechanisms. To
support this use case the `BBCOREPOINTER` type was introduced. Variables that are declared as
`BBCOREPOINTER` behave exactly the same as `POINTER` but are additionally taken into account
when NEURON is serializing mechanism data (for file writing or direct-memory transfer).
For NEURON to be able to write (and indeed CoreNEURON to be able to read) `BBCOREPOINTER`
data, the programmer has to additionally provide two C functions that are called as part
of the serialization/deserialization:

```
static void bbcore_write(double* x, int* d, int* x_offset, int* d_offset, _threadargsproto_);

static void bbcore_read(double* x, int* d, int* x_offset, int* d_offset, _threadargsproto_);
```

The implementation of `bbcore_write` and `bbcore_read` determines the serialization and
deserialization of the per-instance mechanism data referenced through the various
`BBCOREPOINTER`s.

NEURON will call `bbcore_write` twice per mechanism instance.
In a first sweep, the call is used to
determine the amount of memory to be allocated in the serialization arrays. In the second sweep the
call is used to fill in the data for each mechanism instance.

The functions take the following arguments:

* `x`: A `double` type array that will be allocated by NEURON to fill with real-valued data. In the
  first call, `x` is NULL as it has not been allocated yet.
* `d`: An `int` type array that will be allocated by NEURON to fill with integer-valued data. In the
  first call, `d` is NULL as it has not been allocated yet.
* `x_offset`: The offset in `x` at which the mechanism instance should write its real-valued
  `BBCOREPOINTER` data. In the first call this is an output argument that is expected to be
  incremented by the per-instance size to be allocated.
* `d_offset`: The offset in `d` at which the mechanism instance should write its integer-valued
  `BBCOREPOINTER` data. In the first call this is an output argument that is expected to be
  incremented by the per-instance size to be allocated.
* `_threadargsproto_`: a macro placeholder for NEURON/CoreNEURON data-structure parameters. They
  are typically only used through generated defines and not by the programmer. The macro is defined
  as follows:

```
#define _threadargsproto_ \
    int _iml, int _cntml_padded, double *_p, Datum *_ppvar, ThreadDatum *_thread, NrnThread *_nt, \
    double _v
```

Putting all of this together, the following is a minimal MOD file using `BBCOREPOINTER`:

```
TITLE A BBCOREPOINTER Example

NEURON {
    BBCOREPOINTER my_data
}

ASSIGNED {
    my_data
}

: Do something interesting with my_data ...
+
VERBATIM
static void bbcore_write(double* x, int* d, int* x_offset, int* d_offset, _threadargsproto_) {
    if (x) {
        double* x_i = x + *x_offset;
        x_i[0] = _p_my_data[0];
        x_i[1] = _p_my_data[1];
    }
    *x_offset += 2; // reserve 2 doubles on serialization buffer x
}

static void bbcore_read(double* x, int* d, int* x_offset, int* d_offset, _threadargsproto_) {
    assert(!_p_my_data);
    double* x_i = x + *x_offset;
    // my_data needs to be allocated somehow
    _p_my_data = (double*)malloc(sizeof(double)*2);
    _p_my_data[0] = x_i[0];
    _p_my_data[1] = x_i[1];
    *x_offset += 2;
}
ENDVERBATIM
```

diff --git a/docs/coreneuron/how-to/coreneuron.md b/docs/coreneuron/how-to/coreneuron.md index b07f0936e9..b313dd7b12 100644 --- a/docs/coreneuron/how-to/coreneuron.md +++ b/docs/coreneuron/how-to/coreneuron.md @@ -1,91 +1,237 @@ -# NEURON - CoreNEURON Integration +# Using CoreNEURON with NEURON

-**CoreNEURON** is a compute engine for the **NEURON** simulator optimised for both memory usage and computational speed using modern CPU/GPU architetcures. Its goal is to simulate large cell networks with minimal memory footprint and optimal performance.

[CoreNEURON](https://github.com/BlueBrain/CoreNeuron) is a compute engine for the NEURON simulator optimised for both memory usage and computational speed on modern CPU/GPU architectures. The goals of CoreNEURON are:

-If you are a new user and would like to use **CoreNEURON**, this tutorial will be a good starting point to understand the complete workflow of using **CoreNEURON** with **NEURON**. We would be grateful for any feedback about your use cases or issues you encounter using the CoreNEURON. Please [report any issue here](https://github.com/neuronsimulator/nrn/issues) and we will be happy to help.

* Simulate large network models
* Reduce memory usage
* Support GPU execution
* Apply optimisations like vectorisation and memory layout (e.g.
Structure-of-Arrays)

CoreNEURON is designed as a library within the NEURON simulator and can transparently handle all spiking network simulations including gap junction coupling with the **fixed time step method**. In order to run a NEURON model with CoreNEURON:

-# How to install CoreNEURON

* MOD files shall be [THREADSAFE](https://neuron.yale.edu/neuron/docs/multithread-parallelization)
* Random123 shall be used if a random generator is needed (instead of MCellRan4)
* POINTER variables need to be converted to BBCOREPOINTER ([details here](bbcorepointer.md))

-**CoreNEURON** is a submodule of the **NEURON** repository and can be installed with **NEURON** by enabling the flag **NRN_ENABLE_CORENEURON** in the CMake command used to install **NEURON**.
-For example:

## Build Dependencies

* Bison
* Flex
* CMake >=3.8
* Python >=2.7
* MPI Library [Optional, for MPI support]
* [PGI Compiler / NVIDIA HPC SDK](https://developer.nvidia.com/hpc-sdk) [Optional, for GPU support]
* [CUDA Toolkit >=9.0](https://developer.nvidia.com/cuda-downloads) [Optional, for GPU support]

- cmake .. \
- -DNRN_ENABLE_INTERVIEWS=OFF \
- -DNRN_ENABLE_MPI=OFF \
- -DNRN_ENABLE_RX3D=OFF \
- -DNRN_ENABLE_CORENEURON=ON

#### Choosing Compiler

-## Notes

CoreNEURON relies on compiler [auto-vectorisation](https://en.wikipedia.org/wiki/Automatic_vectorization) to achieve better performance on modern CPUs. With this release we recommend compilers like **Intel / PGI / Cray Compiler**. These compilers are able to vectorize the code better than **GCC** or **Clang**, achieving the best possible performance gains. If you are using a cluster platform, the Intel or Cray compiler should be available as a module. You can also install the Intel compiler by downloading [oneAPI HPC Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/hpc-toolkit.html). CoreNEURON also supports GPU execution based on an [OpenACC](https://en.wikipedia.org/wiki/OpenACC) backend.
Currently, the best supported compiler for the OpenACC backend is PGI, available as part of [NVIDIA-HPC-SDK](https://developer.nvidia.com/hpc-sdk). You need to use this compiler for NVIDIA GPUs. Note that AMD GPU support is not tested. -### Performance -**CoreNEURON** is by itself an optimized compute engine of **NEURON**, however to unlock better CPU performance it's recommended to use compilers like **Intel / PGI / Cray Compiler** or the **ISPC Backend** using the new [NMODL Compiler Framework](https://github.com/BlueBrain/nmodl) (described described below). These compilers are able to vectorize the code better than **GCC** or **Clang**, achieving the best possible performance gains. Note that Intel compiler can be installed by downloading [oneAPI HPC Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/hpc-toolkit.html). -### GPU execution -**CoreNEURON** supports also GPU execution based on an **OpenACC** backend. To be able to use it it's needed to install **NEURON** and **CoreNEURON** with a compiler that supports **OpenACC**. Currently the best supported compiler for the **OpenACC** backend is **PGI** (available via [NVIDIA-HPC-SDK](https://developer.nvidia.com/hpc-sdk)) and this is the recommended one for compilation. -To enable the GPU backend specify the following *CMake* variables: +## Installation - -DNRN_ENABLE_CORENEURON=ON - -DCORENRN_ENABLE_GPU=ON +CoreNEURON is a submodule of the NEURON git repository. If you are a NEURON user, the preferred way to install CoreNEURON is to enable extra build options during NEURON installation as follows: -Make sure to set the compilers via CMake flags `-DCMAKE_C_COMPILER=` and `-DCMAKE_CXX_COMPILER=`. +1. Clone the latest version of NEURON: -### NMODL + ``` + git clone https://github.com/neuronsimulator/nrn + cd nrn + ``` -The **NMODL** Framework is a code generation engine for the **N**EURON **MOD**eling **L**anguage. 
-**NMODL** is a part of **CoreNEURON** and can be used to generate optimized code for modern compute architectures including CPUs and GPUs.
-**NMODL** can also generate code using a backend for the **ISPC** compiler, which can vectorize greatly the generated code running on CPU and accelerate further the simulation.

2. Create a build directory:

-**NOTE**: To use the ISPC compiler backend ISPC compiler must be preinstalled in your system. More information on how to install ISPC can be found [here](https://ispc.github.io/downloads.html).

   ```
   mkdir build
   cd build
   ```

-#### How to use NMODL
-To enable the **NMODL** code generation in **CoreNEURON** specify the following *CMake* variables:

3. Load software dependencies

- -DNRN_ENABLE_CORENEURON=ON
- -DCORENRN_ENABLE_NMODL=ON

   If the compilers and necessary dependencies are already available in the default paths, then you do not need to do anything. In a cluster or HPC environment, a module system is often used to select software. For example, you can load the compiler, cmake, and python dependencies using `module` as follows:

-To use the **ISPC** backend add the following flags:

   ```
   module load intel openmpi python cmake
   ```

- -DCORENRN_ENABLE_ISPC=ON
- -DCMAKE_ISPC_COMPILER= # Only if the ispc executable is not in the PATH environment variable

   If you want to enable GPU support, you have to load the PGI/NVIDIA HPC SDK and CUDA modules:

-**NOTE** : NMODL is currently under active development and some mod files could be unsupported. Please feel free to try using NMODL with your mod files and report any issues on the [NMODL repository](https://github.com/BlueBrain/nmodl)

   ```
   module load cuda nvidia-hpc-sdk
   ```

-# How to use CoreNEURON

   Make sure to change module names based on your system.
Also, if you are building on a Cray system with the GNU toolchain, you have to set the following environment variable:

   ```
   export CRAYPE_LINK_TYPE=dynamic
   ```

4. Run CMake with the appropriate [options](https://github.com/neuronsimulator/nrn#build-using-cmake) and additionally enable CoreNEURON with `-DNRN_ENABLE_CORENEURON=ON`:

   ```
   cmake .. \
     -DNRN_ENABLE_CORENEURON=ON \
     -DNRN_ENABLE_INTERVIEWS=OFF \
     -DNRN_ENABLE_RX3D=OFF \
     -DCMAKE_INSTALL_PREFIX=$HOME/install \
     -DCMAKE_C_COMPILER=icc \
     -DCMAKE_CXX_COMPILER=icpc
   ```

   Make sure to replace `icc` and `icpc` with the C/C++ compilers you are using. Also change `$HOME/install` to the desired installation directory. CMake tries to find MPI libraries automatically, but if needed you can set the MPI compiler options `-DMPI_C_COMPILER=` and `-DMPI_CXX_COMPILER=`.

   If you would like to enable GPU support with OpenACC, make sure to use the `-DCORENRN_ENABLE_GPU=ON` option and use the PGI/NVIDIA HPC SDK compilers with CUDA. For example:

   ```
   cmake ..
\
     -DNRN_ENABLE_CORENEURON=ON \
     -DCORENRN_ENABLE_GPU=ON \
     -DNRN_ENABLE_INTERVIEWS=OFF \
     -DNRN_ENABLE_RX3D=OFF \
     -DCMAKE_INSTALL_PREFIX=$HOME/install \
     -DCMAKE_C_COMPILER=nvc \
     -DCMAKE_CXX_COMPILER=nvc++
   ```

   If you want to change the default C/C++ optimization flags, you can pass `-DCMAKE_CXX_FLAGS` and `-DCMAKE_C_FLAGS` options to the CMake command. In that case you also have to add the following CMake options:

   ```bash
   -DCMAKE_CXX_FLAGS="-O3 -g" \
   -DCMAKE_C_FLAGS="-O3 -g" \
   -DCMAKE_BUILD_TYPE=CUSTOM \
   ```

   NOTE: If the CMake command fails, please make sure to delete temporary CMake cache files (`CMakeCache.txt` or the build directory) before re-running CMake.

5. Once the configure step is done, you can build and install the project as:

   ```bash
   make -j
   make install
   ```

6.
Set PATH and PYTHONPATH environment variables to use the installation:

   ```bash
   export PATH=$HOME/install/bin:$PATH
   export PYTHONPATH=$HOME/install/lib/python:$PYTHONPATH
   ```

Now you should be able to import the neuron module as:

```
python -c "from neuron import h; from neuron import coreneuron"
```

If you get an `ImportError` then make sure `PYTHONPATH` is set up correctly and the `python` version is the same as the one used for the NEURON installation.

## Building MOD files

As in a typical NEURON workflow, you can now use `nrnivmodl` to translate MOD files. In order to enable CoreNEURON support, you must set the `-coreneuron` flag. Make sure the necessary modules (compilers, CUDA, MPI, etc.) are loaded before using `nrnivmodl`:

```bash
nrnivmodl -coreneuron
```

If you don't have additional mod files and are using only the built-in mod files from NEURON, then **you still need to use `nrnivmodl -coreneuron` to generate the CoreNEURON library**. For example, you can run:

```bash
nrnivmodl -coreneuron .
```

With the above commands, NEURON will create the `x86_64/special` binary linked to CoreNEURON (here `x86_64` is the architecture name of your system).

If you see a compilation error then one of the mod files might be incompatible with CoreNEURON. Please [open an issue](https://github.com/BlueBrain/CoreNeuron/issues) with an example mod file.

## Running Simulations

With CoreNEURON, existing NEURON models can be run with minimal changes. For a given NEURON model, we typically need to do the following steps:

1. Enable cache efficiency:

   ```python
   from neuron import h
   h.cvode.cache_efficient(1)
   ```

2. Enable CoreNEURON:

   ```python
   from neuron import coreneuron
   coreneuron.enable = True
   ```

3. If GPU support was enabled during the build, enable GPU execution using:

   ```python
   coreneuron.gpu = True
   ```

4.
Use `psolve` to run the simulation after initialization:

   ```python
   h.stdinit()
   pc.psolve(h.tstop)  # pc is a h.ParallelContext() instance
   ```

With the above steps, NEURON will build the model and transfer it to CoreNEURON for simulation. At the end of the simulation CoreNEURON transfers by default: spikes, voltages, state variables, NetCon weights, all Vector.record recordings, and most GUI trajectories to NEURON. These variables can be recorded using the regular NEURON API (e.g. [Vector.record](https://www.neuron.yale.edu/neuron/static/py_doc/programming/math/vector.html#Vector.record) or [spike_record](https://www.neuron.yale.edu/neuron/static/new_doc/modelspec/programmatic/network/parcon.html#ParallelContext.spike_record)).

If you are primarily using HOC then before calling `psolve` you can enable CoreNEURON as:

```
// make sure NEURON is compiled with Python
if (!nrnpython("from neuron import coreneuron")) {
    printf("NEURON not compiled with Python support\n")
    return
}

// access coreneuron module via Python object
py_obj = new PythonObject()
py_obj.coreneuron.enable = 1
```

Once you have adapted your model with the changes described above, you can execute it like a normal NEURON simulation. For example:

```bash
mpiexec -n <num_processes> nrniv -mpi -python your_script.py # python
mpiexec -n <num_processes> nrniv -mpi your_script.hoc # hoc
```

Alternatively, instead of `nrniv` you can use the `special` binary generated by the `nrnivmodl` command. Note that for GPU execution you must use the `special` binary to launch the simulation:

```bash
mpiexec -n <num_processes> x86_64/special -mpi -python your_script.py # python
mpiexec -n <num_processes> x86_64/special -mpi your_script.hoc # hoc
```

As CoreNEURON is used as a library under NEURON, it will use the same number of MPI ranks as NEURON. Also, if you enable threads using [ParallelContext.nthread()](https://www.neuron.yale.edu/neuron/static/py_doc/modelspec/programmatic/network/parcon.html#ParallelContext.nthread) then CoreNEURON will internally use the same number of OpenMP threads.
> NOTE: Replace mpiexec with the MPI launcher supported on your system (e.g. srun or mpirun)

## Examples

Here are some test examples to illustrate the usage of the CoreNEURON API with NEURON:

1. [test_direct.py](https://github.com/neuronsimulator/nrn/blob/master/test/coreneuron/test_direct.py) : This is a simple, single-cell, serial Python example demonstrating the use of CoreNEURON. We first run the simulation with NEURON and record the voltage and membrane current. Then, the same model is executed with CoreNEURON, and we make sure the same results are achieved. Note that in order to run this example you have to compile [these mod files](https://github.com/neuronsimulator/nrn/tree/master/test/coreneuron/mod) with `nrnivmodl -coreneuron`. You can run this example as:

   ```bash
   nrnivmodl -coreneuron mod # first compile mod files
   nrniv -python test_direct.py # run via nrniv
   x86_64/special -python test_direct.py # run via special
   python test_direct.py # run via python
   ```

2. [test_direct.hoc](https://github.com/neuronsimulator/nrn/blob/master/test/coreneuron/test_direct.hoc) : This is the same example as test_direct.py above but written in HOC.

3. [test_spikes.py](https://github.com/neuronsimulator/nrn/blob/master/test/coreneuron/test_spikes.py) : This is similar to the above-mentioned test_direct.py but can be run with MPI, where each MPI process creates a single cell and connects it to a cell on another rank. Each rank records spikes and compares them between the NEURON execution and the CoreNEURON execution. It also demonstrates usage of the [mpi4py](https://github.com/mpi4py/mpi4py) Python module and NEURON's native MPI API.

   You can run this MPI example in different ways:

   ```bash
   mpiexec -n <num_processes> python test_spikes.py mpi4py # using mpi4py
   mpiexec -n <num_processes> x86_64/special -mpi -python test_spikes.py # neuron internal MPI
   ```

4.
[Ring network test](https://github.com/neuronsimulator/ringtest): This is a ring network model of Ball-and-Stick neurons which can be scaled arbitrarily for testing and benchmarking purposes. You can use this as a reference for porting your model; see the [README](https://github.com/neuronsimulator/ringtest/blob/master/README.md) file for detailed instructions.

5. [3D Olfactory Bulb Model](https://github.com/HumanBrainProject/olfactory-bulb-3d): The [Migliore et al. (2014)](https://www.frontiersin.org/articles/10.3389/fncom.2014.00050/) model of the olfactory bulb ported to CoreNEURON on GPU. See the [README](https://github.com/HumanBrainProject/olfactory-bulb-3d/blob/master/README.md) for detailed instructions.