How to import self_ref_v0.cu? #2

Open

Rogerspy opened this issue Jun 14, 2022 · 2 comments
@Rogerspy

I extracted `self_ref_v0.cu` and the `SRWMlayer` from the source code and incorporated them into my own model. I get:

    ImportError: cannot import name 'self_ref' from 'xx.layers'

My package structure is as follows:

    layers
        __init__.py
        self_ref_v0.cu
        self_ref_layer.py

* `self_ref_v0.cu` is the file extracted from this repo
* `self_ref_layer.py` is the `SRWMlayer` package:

      from . import self_ref_v0

      class SRWMlayer(...):
          ...
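For reference, a `.cu` file is CUDA source, not a Python module, so `from . import self_ref_v0` cannot work directly. Below is a minimal sketch of one common way to JIT-compile such a kernel at import time (it assumes the `.cu` file, or an extra binding file, exposes its functions via pybind11; this is not necessarily how this repo wires it up):

```
# self_ref_layer.py -- hedged sketch only: JIT-compile the CUDA source at
# import time rather than importing the .cu file as a Python module.
# The extension name and the assumption that self_ref_v0.cu contains
# pybind11 bindings are placeholders, not the repo's actual setup.
import os
from torch.utils.cpp_extension import load

_this_dir = os.path.dirname(os.path.abspath(__file__))

# Builds an extension module from self_ref_v0.cu on first import
# (the build result is cached afterwards).
self_ref = load(
    name="self_ref_v0",
    sources=[os.path.join(_this_dir, "self_ref_v0.cu")],
    verbose=True,
)
```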
@Rogerspy (Author) commented Jun 14, 2022


I solved this problem, but I get another error:

        ...
        out = out.reshape(bsz, slen, self.num_head * self.dim_head)
    RuntimeError: CUDA error: too many resources requested for launch
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

* `pytorch == 1.11.0`
* `bsz=20, slen=512, self.num_head * self.dim_head = 768`

I tried different parameters (slen, num_head, dim_head), but the same error always occurs.

@kazuki-irie (Member)

I'm not sure exactly which parameter is causing the error in your case.
You should first try a much smaller setting to see whether that works, and then increase the parameters to find the problematic one.
If that still does not pinpoint the issue, I guess you should follow the error message, i.e., run the code with `CUDA_LAUNCH_BLOCKING=1`...
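For example (a minimal sketch; you can equally set the variable in the shell when launching the script):

```
# Force synchronous kernel launches so the reported stack trace points at the
# failing call. Set this before torch is imported / any CUDA work happens.
import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the environment variable
```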

One thing to note is that, in the current implementation, the head dimension (`dim_head`) cannot be too big (due to the shared-memory limit).
To train large models, the number of heads (`num_head`) has to be increased while keeping `dim_head` small.
The rule of thumb I use (on 2080, P100, and V100 GPUs) is to keep `dim_head` < 64.
For example, to get a total hidden layer size of 768, I'd set `dim_head = 48` and `num_head = 16`.
This works for me on a 2080, while (`dim_head = 96`, `num_head = 8`) does not.
The exact limit depends on the type of GPU you are using.
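To illustrate the rule of thumb, here is a hypothetical helper (not part of this repo; the cap of 64 is an assumption that depends on your GPU's shared memory):

```
# Hypothetical helper: pick (num_head, dim_head) for a target hidden size,
# keeping dim_head below a cap (64 assumed here; the real limit is GPU-dependent).
def pick_heads(total_dim, max_dim_head=64):
    # Use the largest dim_head below the cap that divides total_dim evenly.
    for dim_head in range(max_dim_head - 1, 0, -1):
        if total_dim % dim_head == 0:
            return total_dim // dim_head, dim_head
    return total_dim, 1  # degenerate fallback


print(pick_heads(768))  # -> (16, 48), matching the example above
```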
