Commit e33ccab

Merge branch 'ershi/clear_cache' into 'main'

Add the wp.clear_kernel_cache() to the public API

See merge request omniverse/warp!450

mmacklin committed Jul 3, 2024
2 parents fa3cef4 + 2b73052 commit e33ccab
Showing 98 changed files with 136 additions and 117 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -11,9 +11,11 @@
compiled `(compiled)`, loaded from the cache `(cached)`, or was unable to be
loaded `(error)`.
- `wp.config.verbose = True` now also prints out a message upon the entry to a `wp.ScopedTimer`.
- - Add additional documentation and examples demonstrating `wp.copy()`, `wp.clone()`, and `array.assign()` differentiability.
+ - Add additional documentation and examples demonstrating `wp.copy()`, `wp.clone()`, and `array.assign()` differentiability
- Fix adding `__new__()` methods for all class `__del__()` methods to
anticipate when a class instance is created but not instantiated before garbage collection.
- - Add code-completion support for wp.config variables.
+ - Add `wp.clear_kernel_cache()` to the public API. This is equivalent to `wp.build.clear_kernel_cache()`.
+ - Add code-completion support for `wp.config` variables.
- Remove usage of a static task (thread) index for CPU kernels to address multithreading concerns.
- The `mask` argument to `wp.sim.eval_fk` now accepts both integer and bool arrays
1 change: 1 addition & 0 deletions docs/basics.rst
@@ -205,6 +205,7 @@ please see :ref:`Structs Reference <Structs>` for more details.

As with kernel parameters, all attributes of a struct must have valid type hints at class definition time.

+ .. _Compilation Model:

Compilation Model
-----------------
23 changes: 12 additions & 11 deletions docs/faq.rst
@@ -35,15 +35,15 @@ Does Warp support all of the Python language?
No, Warp supports a subset of Python that maps well to the GPU. Our goal
is to not have any performance cliffs so that users can expect
consistently good behavior from kernels that is close to native code.
- Examples of unsupported concepts that don’t map well to the GPU are
+ Examples of unsupported concepts that don't map well to the GPU are
dynamic types, list comprehensions, exceptions, garbage collection, etc.

When should I call ``wp.synchronize()``?
----------------------------------------

One of the common sources of confusion for new users is when calls to
``wp.synchronize()`` are necessary. The answer is “almost never”!
- Synchronization is quite expensive, and should generally be avoided
+ Synchronization is quite expensive and should generally be avoided
unless necessary. Warp naturally takes care of synchronization between
operations (e.g.: kernel launches, device memory copies).
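
As an illustrative sketch (not part of this diff; the kernel and array here
are hypothetical), host reads such as ``array.numpy()`` already provide the
necessary ordering::

    import warp as wp

    @wp.kernel
    def scale(a: wp.array(dtype=float), s: float):
        tid = wp.tid()
        a[tid] = a[tid] * s

    a = wp.ones(1024, dtype=float, device="cuda")
    wp.launch(scale, dim=1024, inputs=[a, 2.0], device="cuda")

    # No explicit wp.synchronize() is needed here: copying back to NumPy
    # blocks until the launch above has completed.
    print(a.numpy()[:4])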

@@ -83,14 +83,15 @@ and :ref:`synchronization guidance <synchronization_guidance>`.
What happens when you differentiate a function like ``wp.abs(x)``?
------------------------------------------------------------------

- Non-smooth functions such as ``y=|x|`` do not have a single unique
- gradient at ``x=0``, rather they have what is known as a
- ``subgradient``, which is formally the convex hull of directional
+ Non-smooth functions such as :math:`y=|x|` do not have a single unique
+ gradient at :math:`x=0`, rather they have what is known as a
+ *subgradient*, which is formally the convex hull of directional
derivatives at that point. The way that Warp (and most
auto-differentiation frameworks) handles these points is to pick an
arbitrary gradient from this set, e.g.: for ``wp.abs()``, it will
arbitrarily choose the gradient to be 1.0 at the origin. You can find
- the implementation for these functions in ``warp/native/builtin.h``.
+ the implementation for these functions in
+ `warp/native/builtin.h <https://github.com/NVIDIA/warp/blob/main/warp/native/builtin.h>`_.
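
A short sketch of how this convention can be observed with ``wp.Tape``
(illustrative only, not part of this diff)::

    import warp as wp

    @wp.kernel
    def abs_kernel(x: wp.array(dtype=float), y: wp.array(dtype=float)):
        tid = wp.tid()
        y[tid] = wp.abs(x[tid])

    x = wp.array([0.0], dtype=float, requires_grad=True)
    y = wp.zeros(1, dtype=float, requires_grad=True)

    tape = wp.Tape()
    with tape:
        wp.launch(abs_kernel, dim=1, inputs=[x], outputs=[y])

    tape.backward(grads={y: wp.ones(1, dtype=float)})
    print(x.grad.numpy())  # prints [1.] per the arbitrary choice described above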

Most optimizers (particularly ones that exploit stochasticity), are not
sensitive to the choice of which gradient to use from the subgradient,
@@ -107,25 +108,25 @@ conventions of PyTorch and use aliases such as ``cuda:0``, ``cuda:1``,
Should I switch to Warp over IsaacGym/PhysX?
----------------------------------------------

- Warp is not a replacement for IsaacGym, IsaacSim, or PhysX - while Warp
- does offer some physical simulation capabilities this is primarily aimed
+ Warp is not a replacement for IsaacGym, IsaacSim, or PhysX—while Warp
+ does offer some physical simulation capabilities, this is primarily aimed
at developers who need differentiable physics, rather than a fully
featured physics engine. Warp is also integrated with IsaacGym and is
great for performing auxiliary tasks such as reward and observation
computations for reinforcement learning.

Why aren't assignments to Warp arrays supported outside of kernels?
- -------------------------------------------------------------------
+ ------------------------------------------------------------------------

For best performance, reading and writing data that is living on the GPU can
only be performed inside Warp CUDA kernels. Otherwise individual element accesses
such as ``array[i] = 1.0`` in Python scope would require prohibitively slow device
synchronization and copies.

We recommend to either initialize Warp arrays from other native arrays
- (e.g.: Python list, NumPy array, ...) or by launching a kernel to set its values.
+ (Python lists, NumPy arrays, etc.) or by launching a kernel to set its values.

- For the common use case of wanting to fill an array with a given value, we
+ For the common use case of filling an array with a given value, we
also support the following forms:

- ``wp.full(8, 1.23, dtype=float)``: initializes a new array of 8 float values set
6 changes: 3 additions & 3 deletions docs/limitations.rst
@@ -40,9 +40,9 @@ This is not always possible for kernels launched with multi-dimensional grid bou
`hardware limitations <https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications-technical-specifications-per-compute-capability>`_
on CUDA block dimensions.

- Warp will automatically fallback to using
+ Warp will automatically fall back to using
`grid-stride loops <https://developer.nvidia.com/blog/cuda-pro-tip-write-flexible-kernels-grid-stride-loops/>`_ when
- it is not possible for a CUDA thread to process only one element from the Warp grid
+ it is not possible for a CUDA thread to process only one element from the Warp grid.
When this happens, some CUDA threads may process more than one element from the Warp grid.
Users can also set the ``max_blocks`` parameter to fine-tune the grid-striding behavior of kernels, even for kernels that are otherwise
able to process one Warp-grid element per CUDA thread.
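
For instance, an illustrative launch that caps the grid size and thereby
forces grid-stride processing (the kernel, array, and values here are
hypothetical)::

    wp.launch(my_kernel, dim=(1024, 1024, 8), inputs=[a], device="cuda", max_blocks=256)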
@@ -104,7 +104,7 @@ The behavior of the modulus operator in a Warp kernel follows that of C++11: The
Power Operator
""""""""""""""

- The power operator (``**``) in Warp kernels only works on floating-point numbers (also see :func:`wp.pow <pow>`).
+ The power operator (``**``) in Warp kernels only works on floating-point numbers (also see :func:`wp.pow() <pow>`).
In Python, the power operator can also be used on integers.
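
A minimal sketch of the floating-point requirement (hypothetical kernel, not
part of this diff)::

    import warp as wp

    @wp.kernel
    def square(xs: wp.array(dtype=float), out: wp.array(dtype=float)):
        tid = wp.tid()
        out[tid] = xs[tid] ** 2.0  # float base and exponent: supported
        # out[tid] = 2 ** 2        # an integer power would fail to compile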

Inverse Sine and Cosine
17 changes: 12 additions & 5 deletions docs/modules/runtime.rst
@@ -15,23 +15,30 @@ Kernels are launched with the :func:`wp.launch() <launch>` function on a specifi

    wp.launch(simple_kernel, dim=1024, inputs=[a, b, c], device="cuda")

- Kernels may be launched with multi-dimensional grid bounds. In this case threads are not assigned a single index,
+ Note that all the kernel inputs must live on the target device or a runtime exception will be raised.
+ Kernels may be launched with multi-dimensional grid bounds. In this case, threads are not assigned a single index,
but a coordinate in an n-dimensional grid, e.g.::

    wp.launch(complex_kernel, dim=(128, 128, 3), ...)

- Launches a 3D grid of threads with dimension 128 x 128 x 3. To retrieve the 3D index for each thread use the following syntax::
+ Launches a 3D grid of threads with dimension 128 x 128 x 3. To retrieve the 3D index for each thread, use the following syntax::

    i,j,k = wp.tid()
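
A complete sketch of such a 3D launch (the kernel and array are illustrative,
not part of this diff)::

    import warp as wp

    @wp.kernel
    def complex_kernel(volume: wp.array3d(dtype=float)):
        i, j, k = wp.tid()  # one 3D coordinate per thread
        volume[i, j, k] = float(i + j + k)

    volume = wp.zeros((128, 128, 3), dtype=float, device="cuda")
    wp.launch(complex_kernel, dim=(128, 128, 3), inputs=[volume], device="cuda")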

.. note::
-    Currently kernels launched on CPU devices will be executed in serial.
+    Currently, kernels launched on CPU devices will be executed in serial.
    Kernels launched on CUDA devices will be launched in parallel with a fixed block-size.

- .. note::
-     Note that all the kernel inputs must live on the target device, or a runtime exception will be raised.
+ In the Warp :ref:`Compilation Model`, kernels are just-in-time compiled into dynamic libraries and PTX using
+ C++/CUDA as an intermediate representation.
+ To avoid excessive runtime recompilation of kernel code, these files are stored in a cache directory
+ named with a module-dependent hash to allow for the reuse of previously compiled modules.
+ The location of the kernel cache is printed when Warp is initialized.
+ :func:`wp.clear_kernel_cache() <clear_kernel_cache>` can be used to clear the kernel cache of previously
+ generated compilation artifacts as Warp does not automatically try to keep the cache below a certain size.

.. autofunction:: launch
+ .. autofunction:: clear_kernel_cache
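
For reference, a minimal usage sketch of the new public entry point
(equivalent to the previous ``wp.build.clear_kernel_cache()``; illustrative,
not part of this diff)::

    import warp as wp

    # Remove previously generated source files and compiler artifacts; the
    # next kernel launch will recompile its module into the cache.
    wp.clear_kernel_cache()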

.. _Runtime Kernel Creation:

8 changes: 5 additions & 3 deletions warp/__init__.py
@@ -10,7 +10,7 @@
# for autocomplete on builtins
# from warp.stubs import *

- from warp.types import array, array1d, array2d, array3d, array4d, constant
+ from warp.types import array, array1d, array2d, array3d, array4d, constant, from_ptr
from warp.types import indexedarray, indexedarray1d, indexedarray2d, indexedarray3d, indexedarray4d
from warp.fabric import fabricarray, fabricarrayarray, indexedfabricarray, indexedfabricarrayarray

@@ -31,9 +31,9 @@
from warp.types import bvh_query_t, hash_grid_query_t, mesh_query_aabb_t, mesh_query_point_t, mesh_query_ray_t

# device-wide gemms
- from warp.types import matmul, adj_matmul, batched_matmul, adj_batched_matmul, from_ptr
+ from warp.types import matmul, adj_matmul, batched_matmul, adj_batched_matmul

- # deprecated
+ # discouraged, users should use wp.types.vector, wp.types.matrix
from warp.types import vector as vec
from warp.types import matrix as mat

@@ -99,6 +99,8 @@

from warp.dlpack import from_dlpack, to_dlpack

+ from warp.build import clear_kernel_cache
+
from warp.constants import *

from . import builtins
8 changes: 6 additions & 2 deletions warp/build.py
@@ -71,8 +71,12 @@ def init_kernel_cache(path=None):
    os.makedirs(warp.config.kernel_cache_dir, exist_ok=True)


- def clear_kernel_cache():
-     """Clear the kernel cache."""
+ def clear_kernel_cache() -> None:
+     """Clear the kernel cache directory of previously generated source code and compiler artifacts.
+
+     Only directories beginning with ``wp_`` will be deleted.
+     This function only clears the cache for the current Warp version.
+     """

    warp.context.init()

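For illustration, a minimal sketch of how such a cache-clearing routine could
be implemented, assuming the ``wp_``-prefixed directory layout described in the
docstring above (an assumption for illustration only; the actual function body
is truncated in this diff)::

    import os
    import shutil

    import warp


    def clear_kernel_cache_sketch() -> None:
        # Ensure Warp is initialized so the cache directory is configured.
        warp.context.init()

        cache_dir = warp.config.kernel_cache_dir
        if cache_dir is None or not os.path.isdir(cache_dir):
            return

        for entry in os.listdir(cache_dir):
            path = os.path.join(cache_dir, entry)
            # Per the docstring, only directories beginning with "wp_" are removed.
            if entry.startswith("wp_") and os.path.isdir(path):
                shutil.rmtree(path, ignore_errors=True)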
2 changes: 1 addition & 1 deletion warp/examples/benchmarks/benchmark_cloth_warp.py
@@ -7,7 +7,7 @@

import warp as wp

- wp.build.clear_kernel_cache()
+ wp.clear_kernel_cache()


@wp.kernel
2 changes: 1 addition & 1 deletion warp/examples/benchmarks/benchmark_launches.py
@@ -128,7 +128,7 @@ def ksz(s: Sz):
    tid = wp.tid()  # noqa: F841


- wp.build.clear_kernel_cache()
+ wp.clear_kernel_cache()

devices = wp.get_devices()
num_launches = 100000
6 changes: 4 additions & 2 deletions warp/stubs.py
@@ -23,7 +23,7 @@
IndexedFabricArray = Generic[DType]


- from warp.types import array, array1d, array2d, array3d, array4d, constant
+ from warp.types import array, array1d, array2d, array3d, array4d, constant, from_ptr
from warp.types import indexedarray, indexedarray1d, indexedarray2d, indexedarray3d, indexedarray4d
from warp.fabric import fabricarray, fabricarrayarray, indexedfabricarray, indexedfabricarrayarray

@@ -42,7 +42,7 @@
from warp.types import Bvh, Mesh, HashGrid, Volume, MarchingCubes
from warp.types import bvh_query_t, hash_grid_query_t, mesh_query_aabb_t, mesh_query_point_t, mesh_query_ray_t

- from warp.types import matmul, adj_matmul, batched_matmul, adj_batched_matmul, from_ptr
+ from warp.types import matmul, adj_matmul, batched_matmul, adj_batched_matmul

from warp.types import vector as vec
from warp.types import matrix as mat
@@ -108,6 +108,8 @@

from warp.dlpack import from_dlpack, to_dlpack

+ from warp.build import clear_kernel_cache
+
from warp.constants import *

from . import builtins
2 changes: 1 addition & 1 deletion warp/tests/disabled_kinematics.py
@@ -233,5 +233,5 @@ class TestKinematics(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2, failfast=False)
2 changes: 1 addition & 1 deletion warp/tests/test_adam.py
@@ -151,5 +151,5 @@ class TestAdam(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_arithmetic.py
@@ -1084,5 +1084,5 @@ class TestArithmetic(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2, failfast=False)
2 changes: 1 addition & 1 deletion warp/tests/test_array.py
@@ -2411,5 +2411,5 @@ def test_array_new_del(self):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_array_reduce.py
@@ -144,5 +144,5 @@ class TestArrayReduce(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_async.py
@@ -662,5 +662,5 @@ def test_func(
# value_offset=0))

if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_atomic.py
@@ -135,5 +135,5 @@ class TestAtomic(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_bool.py
@@ -208,5 +208,5 @@ class TestBool(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_builtins_resolution.py
@@ -1286,5 +1286,5 @@ def test_vec4_float_args_precision(self):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_bvh.py
@@ -163,5 +163,5 @@ def test_bvh_new_del(self):
add_function_test(TestBvh, "test_bvh_ray", test_bvh_query_ray, devices=devices)

if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_closest_point_edge_edge.py
@@ -223,5 +223,5 @@ class TestClosestPointEdgeEdgeMethods(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_codegen.py
@@ -558,5 +558,5 @@ class TestCodeGen(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2, failfast=True)
2 changes: 1 addition & 1 deletion warp/tests/test_compile_consts.py
@@ -213,5 +213,5 @@ def test_constant_math(self):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_conditional.py
@@ -240,5 +240,5 @@ class TestConditional(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_copy.py
@@ -226,5 +226,5 @@ class TestCopy(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_ctypes.py
@@ -626,5 +626,5 @@ class TestCTypes(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_dense.py
@@ -61,5 +61,5 @@ class TestDense(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_devices.py
@@ -85,5 +85,5 @@ def test_devices_unmap_imaginary_device(self):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_dlpack.py
@@ -524,5 +524,5 @@ class TestDLPack(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_examples.py
@@ -398,6 +398,6 @@ class TestFemDiffusionExamples(unittest.TestCase):

if __name__ == "__main__":
    # force rebuild of all kernels
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()

    unittest.main(verbosity=2, failfast=True)
2 changes: 1 addition & 1 deletion warp/tests/test_fabricarray.py
@@ -951,5 +951,5 @@ def test_fabricarray_new_del(self):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_fast_math.py
@@ -56,5 +56,5 @@ def test_fast_math_cpu(self):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)
2 changes: 1 addition & 1 deletion warp/tests/test_fem.py
@@ -1294,5 +1294,5 @@ class TestFemShapeFunctions(unittest.TestCase):


if __name__ == "__main__":
-    wp.build.clear_kernel_cache()
+    wp.clear_kernel_cache()
    unittest.main(verbosity=2)