The project is an open-source GPGPU implementation of the Lattice-Boltzmann method (LBM), a computational fluid dynamics (CFD) approach to fluid simulation, that solves the lid-driven cavity problem on a D3Q19 lattice grid.
Because all computations performed during a lattice grid update are local, the method is highly parallelizable and can be scaled to virtually as many compute units as there are cells in the domain. Since modern GPUs have thousands of execution cores, and the core count keeps growing, they are a perfect platform for parallelized LBM code. This project uses the CUDA platform because, compared to its competitor OpenCL, it offers wider functionality and higher data transfer rates at virtually the same computational throughput. CUDA does come at the cost of locking developers into NVidia GPUs, but that is irrelevant for the purposes of this project.
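Since each cell needs only its own distributions and those of its direct neighbours, the natural GPU mapping is one thread per lattice cell. Below is a minimal sketch of that mapping, not the project's actual kernel; the grid size, names, and memory layout are hypothetical:

```c
/* Minimal one-thread-per-cell sketch (hypothetical names and layout,
 * not the project's actual kernel). */
#include <cuda_runtime.h>

#define NX 64
#define NY 64
#define NZ 64
#define Q  19   /* D3Q19: 19 distribution functions per cell */

__global__ void update_cells(const float *f_src, float *f_dst)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;  /* 3D grids need CC 2.0+ */
    if (x >= NX || y >= NY || z >= NZ) return;

    int cell = (z * NY + y) * NX + x;
    /* Each thread updates only its own cell's 19 distributions (reading
     * direct neighbours during streaming), so no global synchronization
     * is needed within a time step. */
    for (int i = 0; i < Q; ++i)
        f_dst[cell * Q + i] = f_src[cell * Q + i];  /* collision/streaming would go here */
}

int main(void)
{
    size_t bytes = (size_t)NX * NY * NZ * Q * sizeof(float);
    float *f_src, *f_dst;
    cudaMalloc((void **)&f_src, bytes);
    cudaMalloc((void **)&f_dst, bytes);

    dim3 block(8, 8, 8);                /* 512 threads per block */
    dim3 grid(NX / 8, NY / 8, NZ / 8);  /* one thread per cell   */
    update_cells<<<grid, block>>>(f_src, f_dst);
    cudaDeviceSynchronize();

    cudaFree(f_src);
    cudaFree(f_dst);
    return 0;
}
```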
The implementation was guided by the following goals:
- high performance and efficiency of the LBM solver
- high scalability of code to various NVidia GPU architectures
- maintainability and clarity of the code
If the reader is not familiar with GPGPU programming models or the inner workings of GPU hardware, it is highly recommended to skim through the NVidia programming guide and an overview of NVidia GPU architectures. A general understanding of LBM solver principles is also recommended.
The project was implemented in C using the CUDA 5.5 Toolkit and consists of two aligned implementations of the LBM solver: CPU and GPU. The GPU functionality is decoupled from the CPU code and is enclosed in files with `_gpu.cu` or `_gpu.cuh` endings (a short sketch of this decoupling follows the structure list below). The general project structure is as follows:
- `main.c` - main function that triggers simulation routines
- `lbm_model.h` - problem/LBM specific constants and validation methods
- GPU:
  - `initialization_gpu.h` - GPU memory initialization and freeing
  - `lbm_solver_gpu.h` - LBM solver that encompasses streaming, collision and boundary treatment
  - `cell_computation_gpu.cuh` - decoupled local cell computations
  - `lbm_model_gpu.cuh` - GPU-specific problem/LBM definitions
- CPU:
  - `initialization.h` - CLI and config file parsing
  - `streaming.h` - streaming computations
  - `collision.h` - collision computations
  - `boundary.h` - boundary treatment
  - `cell_computation.h` - local cell computations
- `visualization.h` - visualization of fields
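This split means the host code can stay free of CUDA constructs. Here is a hypothetical sketch of what the decoupling looks like from the `main.c` side; the function names below are illustrative stubs, not the project's actual API:

```c
/* Hypothetical sketch of the CPU/GPU decoupling; in the project the real
 * entry points live behind the CPU headers and lbm_solver_gpu.h. */
#include <stdio.h>
#include <string.h>

/* Illustrative stubs; the GPU one would be compiled from a _gpu.cu file. */
static void run_simulation_cpu(const char *cfg) { printf("CPU solver: %s\n", cfg); }
static void run_simulation_gpu(const char *cfg) { printf("GPU solver: %s\n", cfg); }

int main(int argc, char *argv[])
{
    const char *cfg = (argc > 1) ? argv[1] : "data/lbm.dat";
    int use_gpu = (argc > 2) && (strcmp(argv[2], "-gpu") == 0);

    /* Host code never touches CUDA directly; everything device-side stays
     * behind the _gpu.cu / _gpu.cuh boundary, keeping the two solver
     * implementations aligned but independent. */
    if (use_gpu)
        run_simulation_gpu(cfg);
    else
        run_simulation_cpu(cfg);
    return 0;
}
```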
The code is compatible with GPUs of compute capability 2.0 and higher and with NVidia CUDA Toolkits of version 4.0 and higher.
The project was tested on an NVidia GeForce GTX 660M (CC 3.0) and a GeForce GTX 460 (CC 2.1). Development was performed solely on Linux; however, there should be no problems running it on Windows.
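Since `nvcc` can embed code for several architectures in a single binary, one build can cover both tested cards by passing multiple `-gencode` clauses. The flags below are an illustration only, not necessarily what the project's Makefile does, and `<source-files>` is a placeholder:

```
nvcc -O3 -gencode arch=compute_20,code=sm_21 \
         -gencode arch=compute_30,code=sm_30 \
         -o build/lbm-sim <source-files>
```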
These instructions are aimed at Linux users who have CUDA-enabled GPUs with compute capability 2.0+ and who have already installed and enabled GPU device drivers. It is also expected that the reader has gone through the NVidia getting started guide and installed CUDA Toolkit 4.0 or newer.
Other dependencies:
- Clone the project from the GitHub repository:
  `git clone https://github.com/nyxcalamity/lbm-gpu.git <project-dir>`
- Navigate to the `<project-dir>` directory and run `make`.
- Adjust the grid size or the physical properties of the problem in the configuration file located at `<project-dir>/data/lbm.dat`.
- Run the project using the following command:
  `<project-dir>/build/lbm-sim -help`
- Read the help message and run the actual simulation as follows (a full session is shown after this list):
  `<project-dir>/build/lbm-sim <project-dir>/data/lbm.dat -gpu`
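Put together, and using `lbm-gpu` as `<project-dir>`, a complete session looks like this:

```
git clone https://github.com/nyxcalamity/lbm-gpu.git lbm-gpu
cd lbm-gpu
make
# optionally adjust data/lbm.dat first
./build/lbm-sim -help
./build/lbm-sim ./data/lbm.dat -gpu
```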
There are several known issues with the project, none of which affect its performance or the resulting simulation:
- due to an optimization of the boundary treatment code, the number of checking branches was reduced from 57 to just 22, at the cost of exchanging probability distribution functions between boundary cells at the edges
- an unknown rounding error occurs during visualization, which may change a small number of values by no more than 0.000001
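The 0.000001 bound is the same order as the spacing between adjacent single-precision values at magnitudes around 8-16, so the discrepancy is consistent with an extra `float` rounding step somewhere in the visualization path (an assumption on our part, not something established in the project). A small self-contained C demonstration:

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* The gap between adjacent floats near 8.0 is about 1e-6, so a single
     * rounding to float can move a value by up to half that gap. */
    float v = 8.0f;
    printf("float spacing near 8.0: %.9f\n",
           (double)(nextafterf(v, 16.0f) - v));

    double exact = 8.123456789;
    float  rounded = (float)exact;   /* one rounding to single precision */
    printf("exact   = %.9f\n", exact);
    printf("rounded = %.9f\n", (double)rounded);
    printf("diff    = %.9f\n", exact - (double)rounded);
    return 0;
}
```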