diff --git a/docs/how-to/hip_runtime_api/call_stack.rst b/docs/how-to/hip_runtime_api/call_stack.rst
index 0e3bdd5477..f0b2c9dabb 100644
--- a/docs/how-to/hip_runtime_api/call_stack.rst
+++ b/docs/how-to/hip_runtime_api/call_stack.rst
@@ -66,6 +66,11 @@ overflow errors by ensuring sufficient stack memory is allocated.
     return 0;
   }
 
+Depending on the GPU model, the call stack can consume a significant amount of
+memory at full occupancy. For instance, an MI300X with 304 compute units (CUs)
+and up to 2048 threads per CU could use 304 · 2048 · 1024 bytes = 608 MiB for
+the call stack by default.
+
 Handling recursion and deep function calls
 -------------------------------------------------------------------------------
 
@@ -73,10 +78,8 @@ Similar to CPU programming, recursive functions and deeply nested function calls
 are supported. However, developers must ensure that these functions do not
 exceed the available stack memory, considering the huge amount of memory needed
 for the call stack due to the GPUs inherent parallelism. This can be
-achieved by increasing stack size, implementing error handling to catch stack
-overflow, optimizing code to reduce stack usage, and utilizing profiling tools
-to monitor stack memory. Proper kernel design and memory management strategies
-are essential to maintain efficient and stable application performance.
+achieved by increasing the stack size or optimizing code to reduce stack usage.
+To detect stack overflow, add error handling or use debugging tools.
 
 .. code-block:: cpp
 