This repository has been archived by the owner on Mar 20, 2023. It is now read-only.
Overview
In #713 we added support for GPU offload using OpenMP. This is a good first step, but there are several areas where we hope to improve the implementation. This issue tracks the planned improvements.
Asynchronous execution
In #713 we did not include any asynchronous execution clauses for OpenMP-based accelerator offload (nowait, depend, taskwait). This was partly for simplicity, and partly because support for those clauses in the compiler we were using at the time (NVHPC 21.9) is rather limited.
Work has already started on this, see:
Initially we should aim to recover the performance attained with (asynchronous) OpenACC.
After that, we could look at launching more mechanism kernels in parallel within a single NrnThread.
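As a rough illustration of the clauses mentioned above, the sketch below launches two independent kernels with nowait, orders a third kernel after both using depend, and synchronises with taskwait. The function name, the arrays, and the arithmetic are hypothetical stand-ins for mechanism kernels and are not taken from the CoreNEURON code base. It assumes a compiler that implements these OpenMP 4.5 clauses on target constructs; without OpenMP enabled the pragmas are ignored and the loops run serially on the host with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for three mechanism kernels; not CoreNEURON code.
// Returns the sum of `a` after all kernels have completed.
double run_kernels_async(std::vector<double>& a, std::vector<double>& b) {
    assert(a.size() == b.size());
    double* pa = a.data();
    double* pb = b.data();
    const std::size_t n = a.size();

    // Kernel 1: deferred target task, the host thread does not wait here.
    #pragma omp target teams distribute parallel for \
        map(tofrom: pa[0:n]) nowait depend(out: pa[0])
    for (std::size_t i = 0; i < n; ++i) {
        pa[i] *= 2.0;
    }

    // Kernel 2: touches only `b`, so it may overlap with kernel 1.
    #pragma omp target teams distribute parallel for \
        map(tofrom: pb[0:n]) nowait depend(out: pb[0])
    for (std::size_t i = 0; i < n; ++i) {
        pb[i] += 1.0;
    }

    // Kernel 3: reads `b` and updates `a`, so it is ordered after kernels 1 and 2.
    #pragma omp target teams distribute parallel for \
        map(to: pb[0:n]) map(tofrom: pa[0:n]) nowait \
        depend(in: pb[0]) depend(inout: pa[0])
    for (std::size_t i = 0; i < n; ++i) {
        pa[i] += pb[i];
    }

    // Wait for all deferred target tasks before using the results on the host.
    #pragma omp taskwait

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += pa[i];
    }
    return sum;
}
```

The map clauses here copy data on every launch only to keep the sketch self-contained; in an implementation where the data already live on the device, those transfers would not be needed.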
Present clauses
With OpenACC we had present(...) clauses that allowed us to assert that data were already present on the device and should not be copied. The current OpenMP implementation has no such equivalent, but we basically preserve the same data transfer pattern as OpenACC because we ensure that the data are already present.
In principle a bug in the model transfer code (leading to some relevant data not being transferred to the device during initialisation) would cause a runtime error with OpenACC (✅) and implicit data transfers with OpenMP (⛔). Given that we already know how to generate present() clauses, it seems desirable to add the OpenMP equivalent (map(present, alloc: ...)) once it is widely supported.
(original issue: neuronsimulator/gpuhackathon#5)
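A minimal sketch of what the map(present, alloc: ...) form could look like is given below. The names scale_on_device and scaled_sum are hypothetical and only stand in for a generated kernel and the model transfer code. Because the present modifier was introduced in OpenMP 5.1, the sketch assumes it is safe to use only when the compiler reports _OPENMP >= 202011, and otherwise falls back to a plain map(tofrom: ...), which silently transfers data that are missing from the device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel, not CoreNEURON code: scales data expected to be on the device.
void scale_on_device(double* x, std::size_t n, double factor) {
    assert(x != nullptr || n == 0);
#if defined(_OPENMP) && _OPENMP >= 202011
    // OpenMP 5.1: missing device data is a runtime error instead of an implicit copy.
    #pragma omp target teams distribute parallel for map(present, alloc: x[0:n])
#else
    // Fallback: no copy if x is already mapped, but a silent transfer if it is not.
    #pragma omp target teams distribute parallel for map(tofrom: x[0:n])
#endif
    for (std::size_t i = 0; i < n; ++i) {
        x[i] *= factor;
    }
}

// Hypothetical driver standing in for the model transfer code.
double scaled_sum(std::vector<double>& v, double factor) {
    double* x = v.data();
    const std::size_t n = v.size();

    // Transfer once during "initialisation" so the kernel finds the data present.
    #pragma omp target enter data map(to: x[0:n])
    scale_on_device(x, n, factor);
    // Copy the results back and release the device copy.
    #pragma omp target exit data map(from: x[0:n])

    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += x[i];
    }
    return sum;
}
```

If the enter data line were dropped, the 5.1 branch would be expected to fail at run time on a real device, while the fallback branch would still run and hide the missing transfer.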