OpenACC GPU porting #149

Open
sangallidavide opened this issue Nov 8, 2024 · 3 comments

@sangallidavide
Member

I am opening a second issue for discussion of the OpenACC GPU porting.
The other one is here: #79

Sometimes the code calls a "memory free" before doing any allocation.
This is done via the macro YAMBO_FREE, where the deallocate/devxlib_unmap calls are protected by an if (allocated(...)) check.

Example here in https://github.com/yambo-code/yambo/blob/tech-gpu/src/pol_function/X_irredux.F#L250
This call specifically leads to the following line:
https://github.com/yambo-code/yambo/blob/tech-gpu/src/Ymodules/mod_collision_el.F#L102
!DEV_ACC exit data delete(ggw)

With gfortran and OpenACC this leads to an error.
From what I understand, there is no data to delete if ggw was never allocated, although the logic here is not entirely clear to me.

What is the opposite of !DEV_ACC exit data delete(ggw)? Is it !DEV_ACC enter data create(ggw)?
Why is this not handled via devxlib? Probably because ggw is a derived type?

sangallidavide added a commit that referenced this issue Nov 8, 2024
MODIFIED *  configure include/version/version.m4 Ymodules/mod_collision_el.F

Bugs:
- [OpenACC] Tentative to address issue #149

Patch sent by:  Davide Sangalli <[email protected]>
@bellenlau

I don't know if this helps, but YAMBO_FREE also gives an error on Leonardo with nvhpc/23.11 (not with older compilers): it complains about deallocating on the host some data that is still present on the GPU. I have no idea yet why this arises only with the latest compiler versions. Which error do you observe with gfortran?

The opposite of exit data is enter data, which seems to be used for ggw. Regarding ggw, if this is a derived datatype, copying/deleting it with enter data/exit data copies/deletes any statically allocated components of the type but not its dynamically allocated components, if any; these can be copied/deleted with an additional explicit data clause (e.g. enter data copyin(X%array)). The runtime should then automatically perform an "attach" operation between the dynamically allocated component and the parent datatype.
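
To make the shallow-copy behaviour concrete, here is a minimal sketch of the pattern. It is not Yambo code: the type collision_t, its component buf and the variable coll are hypothetical names, and how well this pattern is supported may vary between compilers and versions.

```fortran
! Minimal sketch, not Yambo code: all names here are hypothetical.
program derived_type_device_data
  implicit none

  type :: collision_t
    integer :: n = 0                 ! static component, moved with the parent
    real, allocatable :: buf(:)      ! dynamic component, needs its own clause
  end type collision_t

  type(collision_t) :: coll
  integer :: i

  coll%n = 8
  allocate(coll%buf(coll%n))
  coll%buf = 1.0

  ! Parent first (shallow copy), then the allocatable component, which the
  ! runtime is expected to attach to the device copy of the parent.
  !$acc enter data copyin(coll)
  !$acc enter data copyin(coll%buf)

  !$acc parallel loop default(present)
  do i = 1, coll%n
    coll%buf(i) = 2.0 * coll%buf(i)
  end do

  ! Tear down in reverse order: component first, then the parent.
  !$acc exit data copyout(coll%buf)
  !$acc exit data delete(coll)

  print *, sum(coll%buf)
  deallocate(coll%buf)
end program derived_type_device_data
```

Without OpenACC enabled the directives are plain comments and the loop runs on the host, so the same source can be used to compare the two builds.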

Regarding YAMBO_FREE_GPU, which is called before YAMBO_FREE: shouldn't the code check whether the data is mapped with devxlib before unmapping it with devxlib, instead of checking whether the data is allocated (on the host, for OpenACC)?
At least, that is if I understood correctly that devxlib_mapped checks whether the data is mapped on the GPU.

diff --git a/include/headers/common/y_memory.h b/include/headers/common/y_memory.h
index 160b36c97..c01619d9b 100644
--- a/include/headers/common/y_memory.h
+++ b/include/headers/common/y_memory.h
@@ -197,7 +197,7 @@
 #define YAMBO_FREE_GPU(x) \
   if (.not.allocated(x)) &NEWLINE& call MEM_free(QUOTES x QUOTES,int(-1,KIND=IPL))NEWLINE \
   if (     allocated(x)) &NEWLINE& call MEM_free(QUOTES x QUOTES,size(x,KIND=IPL))NEWLINE \
-  if (     allocated(x)) &NEWLINE& call devxlib_unmap(x,MEM_err)
+  if (devxlib_mapped(x)) &NEWLINE& call devxlib_unmap(x,MEM_err)

 #else
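
The same idea can be sketched in plain OpenACC, outside of devxlib. This is only an illustration under assumptions: free_real_1d is a hypothetical helper, it uses the standard acc_is_present routine from the openacc module rather than devxlib_mapped, and it needs a preprocessed source file (e.g. a .F90 extension) because of the _OPENACC guards.

```fortran
! Hypothetical helper, not part of Yambo or devxlib.
module safe_device_free
#ifdef _OPENACC
  use openacc, only: acc_is_present
#endif
  implicit none
contains
  subroutine free_real_1d(x)
    real, allocatable, intent(inout) :: x(:)
    ! "Free before alloc": nothing on the host, so nothing to do at all.
    if (.not. allocated(x)) return
#ifdef _OPENACC
    ! Ask the runtime whether x is on the device, instead of inferring it
    ! from the host allocation status.
    if (acc_is_present(x)) then
      !$acc exit data delete(x)
    end if
#endif
    deallocate(x)
  end subroutine free_real_1d
end module safe_device_free

program demo_safe_free
  use safe_device_free
  implicit none
  real, allocatable :: a(:)

  call free_real_1d(a)        ! a was never allocated: returns immediately
  allocate(a(4))
  !$acc enter data create(a)
  call free_real_1d(a)        ! removes the device copy, then the host array
  print *, allocated(a)
end program demo_safe_free
```

The device copy is released before the host deallocate, which is the ordering the nvhpc message above seems to ask for.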

@andrea-ferretti
Member

Hi Laura,

thanks for raising this point. Apparently the issue is somehow recurrent...

  • On the one hand, FREE_GPU should always be called before FREE on the CPU (and this should now be the case throughout the code, modulo leftover bugs).
  • On the other hand, I've observed that the devxlib_mapped function is not 100% reliable, mostly, I think, because of issues in handling the ACC memory table (issues with slicing or the like?).

Basically, when the devxlib_mapped function fails for some reason, FREE_GPU does not unmap the data (since it thinks the GPU memory is not allocated), and the final deallocate on the host complains.

@sangallidavide
Member Author

sangallidavide commented Nov 15, 2024

The opposite of exit data is enter data, which seems to be used for ggw.

It is used if elemental collision alloc is called before elemental collision free.

However this does not always happen. See my comment here

Sometimes the code calls a "memory free" before doing allocations
Example here in https://github.com/yambo-code/yambo/blob/tech-gpu/src/pol_function/X_irredux.F#L250
