
Allow specification for GPU device index #96

Merged Mar 28, 2024 (43 commits). Changes shown from 21 commits.
Commits:

83ff00a  Have get_device use torch::Device (jwallwork23, Mar 19, 2024)
a392900  Add device_number arg for get_device (jwallwork23, Mar 19, 2024)
2552c91  Throw error if device_number used in CPU-only case (jwallwork23, Mar 19, 2024)
9b0b7dd  Disallow negative device number (jwallwork23, Mar 19, 2024)
e44e3e6  Actually use the device number (jwallwork23, Mar 19, 2024)
cf39472  Use device number for torch_zeros (jwallwork23, Mar 19, 2024)
01b8063  Use device number for torch_ones (jwallwork23, Mar 19, 2024)
530fa19  Use device number for torch_empty (jwallwork23, Mar 19, 2024)
af7a8af  Use device number for torch_from_blob (jwallwork23, Mar 19, 2024)
e2fe070  Device and device number args for torch_module_load (jwallwork23, Mar 19, 2024)
fd729a3  Pass device and device number to torch_jit_load by value (jwallwork23, Mar 19, 2024)
3b3e62c  Make device number argument to torch_module_load optional (jwallwork23, Mar 19, 2024)
5fe34b0  Make device number argument to torch_tensor_from_array optional (jwallwork23, Mar 19, 2024)
3fe5258  Make device number argument to other subroutines optional (jwallwork23, Mar 19, 2024)
9ed2452  Make device argument to torch_module_load optional (jwallwork23, Mar 19, 2024)
fbc6a12  Add function for determining device_index (jwallwork23, Mar 20, 2024)
58d28ed  Rename device number as index (jwallwork23, Mar 20, 2024)
682d887  Rename device as device type (jwallwork23, Mar 20, 2024)
2d9698c  Device index defaults to -1 on CPU and 0 on GPU (jwallwork23, Mar 20, 2024)
ca40777  Make device type and index optional on C++ side (jwallwork23, Mar 20, 2024)
e37f743  Fix typo in torch_model_load (jwallwork23, Mar 20, 2024)
8b63dfe  Fix typos in example 1 (jwallwork23, Mar 22, 2024)
8982129  Initial draft of example 3_MultiGPU (jwallwork23, Mar 22, 2024)
1eec646  Differentiate between errors and warnings in C++ code (jwallwork23, Mar 25, 2024)
2739c16  Formatting (jwallwork23, Mar 25, 2024)
fc18b52  Add mpi4py to requirements for example 3 (jwallwork23, Mar 25, 2024)
2b0086a  Use mpi4py to differ inputs in simplenet_infer_python (jwallwork23, Mar 25, 2024)
fced4c1  Raise ValueError for Python inference with invalid device (jwallwork23, Mar 25, 2024)
188b305  Print rank in Python case; updates to README (jwallwork23, Mar 25, 2024)
dcfb153  Setup MPI for simplenet_infer_fortran, too (jwallwork23, Mar 25, 2024)
392afb9  Write formatting for example 3 (jwallwork23, Mar 25, 2024)
9fd3040  Add note on building with Make (jwallwork23, Mar 25, 2024)
24d5b6a  Print before and after; mpi_finalise; output on CPU; comments (jwallwork23, Mar 27, 2024)
a44e262  Merge branch 'main' into 85_gpu_device_number (jwallwork23, Mar 27, 2024)
5ebe845  Docs: device->device_type for consistency (jwallwork23, Mar 27, 2024)
18fca7b  Add docs on MultiGPU (jwallwork23, Mar 27, 2024)
475a859  Update warning text for defaulting to 0 (jwallwork23, Mar 28, 2024)
3f26457  Mention MPI in requirements (jwallwork23, Mar 28, 2024)
3dba29a  Update outputs for example 3 (jwallwork23, Mar 28, 2024)
0e3272e  Use NP rather than 4 GPUs (jwallwork23, Mar 28, 2024)
99d3b5b  Implement SimpleNet in example 3 but with a twist (jwallwork23, Mar 28, 2024)
99002d5  Add code snippets for multi-GPU doc section (jwallwork23, Mar 28, 2024)
e2b68bd  Add note about multiple GPU support to README.md. (jatkinson1000, Mar 28, 2024)
49 changes: 34 additions & 15 deletions src/ctorch.cpp

@@ -29,30 +29,41 @@ constexpr auto get_dtype(torch_data_t dtype)
   }
 }
 
-constexpr auto get_device(torch_device_t device)
+const auto get_device(torch_device_t device_type, int device_index)
 {
-  switch (device) {
+  switch (device_type) {
   case torch_kCPU:
-    return torch::kCPU;
+    if (device_index != -1) {
+      std::cerr << "[ERROR]: device index unsupported for CPU-only runs"
+                << std::endl;
+    }
+    return torch::Device(torch::kCPU);
   case torch_kCUDA:
-    return torch::kCUDA;
+    if (device_index >= 0 && device_index < torch::cuda::device_count()) {
+      return torch::Device(torch::kCUDA, device_index);
+    } else {
+      std::cerr << "[ERROR]: invalid device index " << device_index
+                << " for device count " << torch::cuda::device_count()
+                << ", using zero instead" << std::endl;
+      return torch::Device(torch::kCUDA);
+    }
   default:
     std::cerr << "[ERROR]: unknown device type, setting to torch_kCPU"
               << std::endl;
-    return torch::kCPU;
+    return torch::Device(torch::kCPU);
   }
 }
 
 torch_tensor_t torch_zeros(int ndim, const int64_t* shape, torch_data_t dtype,
-                           torch_device_t device)
+                           torch_device_t device_type, int device_index = -1)
 {
   torch::Tensor* tensor = nullptr;
   try {
     // This doesn't throw if shape and dimensions are incompatible
     c10::IntArrayRef vshape(shape, ndim);
     tensor = new torch::Tensor;
     *tensor = torch::zeros(
-        vshape, torch::dtype(get_dtype(dtype))).to(get_device(device));
+        vshape, torch::dtype(get_dtype(dtype))).to(get_device(device_type, device_index));
   } catch (const torch::Error& e) {
     std::cerr << "[ERROR]: " << e.msg() << std::endl;
     delete tensor;
@@ -66,15 +77,15 @@ torch_tensor_t torch_zeros(int ndim, const int64_t* shape, torch_data_t dtype,
 }
 
 torch_tensor_t torch_ones(int ndim, const int64_t* shape, torch_data_t dtype,
-                          torch_device_t device)
+                          torch_device_t device_type, int device_index = -1)
 {
   torch::Tensor* tensor = nullptr;
   try {
     // This doesn't throw if shape and dimensions are incompatible
     c10::IntArrayRef vshape(shape, ndim);
     tensor = new torch::Tensor;
     *tensor = torch::ones(
-        vshape, torch::dtype(get_dtype(dtype))).to(get_device(device));
+        vshape, torch::dtype(get_dtype(dtype))).to(get_device(device_type, device_index));
   } catch (const torch::Error& e) {
     std::cerr << "[ERROR]: " << e.msg() << std::endl;
     delete tensor;
@@ -88,15 +99,15 @@ torch_tensor_t torch_ones(int ndim, const int64_t* shape, torch_data_t dtype,
 }
 
 torch_tensor_t torch_empty(int ndim, const int64_t* shape, torch_data_t dtype,
-                           torch_device_t device)
+                           torch_device_t device_type, int device_index = -1)
 {
   torch::Tensor* tensor = nullptr;
   try {
     // This doesn't throw if shape and dimensions are incompatible
     c10::IntArrayRef vshape(shape, ndim);
     tensor = new torch::Tensor;
     *tensor = torch::empty(
-        vshape, torch::dtype(get_dtype(dtype))).to(get_device(device));
+        vshape, torch::dtype(get_dtype(dtype))).to(get_device(device_type, device_index));
   } catch (const torch::Error& e) {
     std::cerr << "[ERROR]: " << e.msg() << std::endl;
     delete tensor;
@@ -113,7 +124,7 @@ torch_tensor_t torch_empty(int ndim, const int64_t* shape, torch_data_t dtype,
 // data
 torch_tensor_t torch_from_blob(void* data, int ndim, const int64_t* shape,
                                const int64_t* strides, torch_data_t dtype,
-                               torch_device_t device)
+                               torch_device_t device_type, int device_index = -1)
 {
   torch::Tensor* tensor = nullptr;
 
@@ -124,7 +135,7 @@ torch_tensor_t torch_from_blob(void* data, int ndim, const int64_t* shape,
     tensor = new torch::Tensor;
     *tensor = torch::from_blob(
         data, vshape, vstrides,
-        torch::dtype(get_dtype(dtype))).to(get_device(device));
+        torch::dtype(get_dtype(dtype))).to(get_device(device_type, device_index));
 
   } catch (const torch::Error& e) {
     std::cerr << "[ERROR]: " << e.msg() << std::endl;
@@ -144,18 +155,26 @@ void torch_tensor_print(const torch_tensor_t tensor)
   std::cout << *t << std::endl;
 }
 
+int torch_tensor_get_device_index(const torch_tensor_t tensor)
+{
+  auto t = reinterpret_cast<torch::Tensor*>(tensor);
+  return t->device().index();
+}
+
 void torch_tensor_delete(torch_tensor_t tensor)
 {
   auto t = reinterpret_cast<torch::Tensor*>(tensor);
   delete t;
 }
 
-torch_jit_script_module_t torch_jit_load(const char* filename)
+torch_jit_script_module_t torch_jit_load(const char* filename,
+                                         const torch_device_t device_type = torch_kCPU,
+                                         const int device_index = -1)
 {
   torch::jit::script::Module* module = nullptr;
   try {
     module = new torch::jit::script::Module;
-    *module = torch::jit::load(filename);
+    *module = torch::jit::load(filename, get_device(device_type, device_index));
   } catch (const torch::Error& e) {
     std::cerr << "[ERROR]: " << e.msg() << std::endl;
     delete module;
37 changes: 28 additions & 9 deletions src/ctorch.h

@@ -37,30 +37,36 @@ typedef enum { torch_kCPU, torch_kCUDA } torch_device_t;
  * @param number of dimensions of the Tensor
  * @param shape of the Tensor
  * @param data type of the elements of the Tensor
- * @param device used (cpu, CUDA, etc.)
+ * @param device type used (cpu, CUDA, etc.)
+ * @param device index for the CUDA case
  */
 EXPORT_C torch_tensor_t torch_zeros(int ndim, const int64_t* shape,
-                                    torch_data_t dtype, torch_device_t device);
+                                    torch_data_t dtype, torch_device_t device_type,
+                                    int device_index);
 
 /**
  * Function to generate a Torch Tensor of ones
  * @param number of dimensions of the Tensor
  * @param shape of the Tensor
  * @param data type of the elements of the Tensor
- * @param device used (cpu, CUDA, etc.)
+ * @param device type used (cpu, CUDA, etc.)
+ * @param device index for the CUDA case
  */
 EXPORT_C torch_tensor_t torch_ones(int ndim, const int64_t* shape,
-                                   torch_data_t dtype, torch_device_t device);
+                                   torch_data_t dtype, torch_device_t device_type,
+                                   int device_index);
 
 /**
  * Function to generate an empty Torch Tensor
  * @param number of dimensions of the Tensor
  * @param shape of the Tensor
  * @param data type of the elements of the Tensor
- * @param device used (cpu, CUDA, etc.)
+ * @param device type used (cpu, CUDA, etc.)
+ * @param device index for the CUDA case
  */
 EXPORT_C torch_tensor_t torch_empty(int ndim, const int64_t* shape,
-                                    torch_data_t dtype, torch_device_t device);
+                                    torch_data_t dtype, torch_device_t device_type,
+                                    int device_index);
 
 /**
  * Function to create a Torch Tensor from memory location given extra information
@@ -69,21 +75,30 @@ EXPORT_C torch_tensor_t torch_empty(int ndim, const int64_t* shape,
  * @param shape of the Tensor
  * @param strides to take through data
  * @param data type of the elements of the Tensor
- * @param device used (cpu, CUDA, etc.)
+ * @param device type used (cpu, CUDA, etc.)
+ * @param device index for the CUDA case
  * @return Torch Tensor interpretation of the data pointed at
  */
 EXPORT_C torch_tensor_t torch_from_blob(void* data, int ndim,
                                         const int64_t* shape,
                                         const int64_t* strides,
                                         torch_data_t dtype,
-                                        torch_device_t device);
+                                        torch_device_t device_type,
+                                        int device_index);
 
 /**
  * Function to print out a Torch Tensor
  * @param Torch Tensor to print
  */
 EXPORT_C void torch_tensor_print(const torch_tensor_t tensor);
 
+/**
+ * Function to determine the device index of a Torch Tensor
+ * @param Torch Tensor to determine the device index of
+ * @return device index of the Torch Tensor
+ */
+EXPORT_C int torch_tensor_get_device_index(const torch_tensor_t tensor);
+
 /**
  * Function to delete a Torch Tensor to clean up
  * @param Torch Tensor to delete
@@ -97,9 +112,13 @@ EXPORT_C void torch_tensor_delete(torch_tensor_t tensor);
 /**
  * Function to load in a Torch model from a TorchScript file and store in a Torch Module
  * @param filename where TorchScript description of model is stored
+ * @param device type used (cpu, CUDA, etc.)
+ * @param device index for the CUDA case
  * @return Torch Module loaded in from file
  */
-EXPORT_C torch_jit_script_module_t torch_jit_load(const char* filename);
+EXPORT_C torch_jit_script_module_t torch_jit_load(const char* filename,
+                                                  const torch_device_t device_type,
+                                                  const int device_index);
 
 /**
  * Function to run the `forward` method of a Torch Module