Engine Configuration

This document covers the available configurations for DJL and its engines.

DJL settings

DJLServing is built on top of Deep Java Library (DJL). Here is a list of settings for DJL:

| Key | Type | Description |
|---|---|---|
| DJL_DEFAULT_ENGINE | env var/system prop | The preferred engine for DJL if there are multiple engines, default: MXNet |
| ai.djl.default_engine | system prop | The preferred engine for DJL if there are multiple engines, default: MXNet |
| DJL_CACHE_DIR | env var/system prop | The cache directory for DJL, default: $HOME/.djl.ai/ |
| ENGINE_CACHE_DIR | env var/system prop | The cache directory for engine native libraries, default: $DJL_CACHE_DIR |
| ai.djl.dataiterator.autoclose | system prop | Automatically close the data set iterator, default: true |
| ai.djl.repository.zoo.location | system prop | Global model zoo search locations, not recommended |
| DJL_OFFLINE | env var | Don't access the network to download the engine's native library and model zoo metadata |
| ai.djl.offline | system prop | Don't access the network to download the engine's native library and model zoo metadata |
| collect-memory | system prop | Enable memory metric collection, default: false |
| disableProgressBar | system prop | Disable the progress bar, default: false |
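
Several keys above can be supplied either as an environment variable or as a Java system property, and ENGINE_CACHE_DIR falls back to DJL_CACHE_DIR. The following is a minimal sketch of that lookup chain. The class `DjlSettingsSketch` and its methods are hypothetical and not part of the DJL API, and the lookup order (environment variable first, then system property) is an assumption made for illustration.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of the DJL API. */
final class DjlSettingsSketch {

    private DjlSettingsSketch() {}

    /**
     * Looks up a key that may be given as an env var or a system property.
     * Assumption: the env var is checked first, then the system property.
     */
    static String envOrProp(
            String key, Map<String, String> env, Properties props, String fallback) {
        String value = env.get(key);
        if (value == null) {
            value = props.getProperty(key);
        }
        return value != null ? value : fallback;
    }

    /** DJL_CACHE_DIR, defaulting to $HOME/.djl.ai as listed in the table. */
    static String cacheDir(Map<String, String> env, Properties props, String home) {
        return envOrProp("DJL_CACHE_DIR", env, props, home + "/.djl.ai");
    }

    /** ENGINE_CACHE_DIR, defaulting to the resolved DJL_CACHE_DIR. */
    static String engineCacheDir(Map<String, String> env, Properties props, String home) {
        return envOrProp("ENGINE_CACHE_DIR", env, props, cacheDir(env, props, home));
    }

    public static void main(String[] args) {
        String home = System.getProperty("user.home");
        System.out.println(cacheDir(System.getenv(), System.getProperties(), home));
        System.out.println(engineCacheDir(System.getenv(), System.getProperties(), home));
    }
}
```

Passing the environment and properties in as arguments keeps the sketch easy to exercise with test values instead of the real process environment.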

PyTorch

| Key | Type | Description |
|---|---|---|
| PYTORCH_LIBRARY_PATH | env var/system prop | User-provided custom PyTorch native library |
| PYTORCH_VERSION | env var/system prop | PyTorch version to load |
| PYTORCH_EXTRA_LIBRARY_PATH | env var/system prop | Custom PyTorch libraries to load (e.g. torchneuron/torchvision/torchtext) |
| PYTORCH_PRECXX11 | env var/system prop | Load precxx11 libtorch |
| PYTORCH_FLAVOR | env var/system prop | Force an override of auto-detection (e.g. cpu/cpu-precxx11/cu102/cu116-precxx11) |
| PYTORCH_JIT_LOG_LEVEL | env var | Enable JIT logging |
| ai.djl.pytorch.native_helper | system prop | A user-provided custom loader class to help locate PyTorch native resources |
| ai.djl.pytorch.num_threads | system prop | Override the OMP_NUM_THREADS environment variable |
| ai.djl.pytorch.num_interop_threads | system prop | Set PyTorch interop threads |
| ai.djl.pytorch.graph_optimizer | system prop | Enable/disable JIT execution optimization, default: true. See: https://github.com/deepjavalibrary/djl/blob/master/docs/development/inference_performance_optimization.md#graph-optimizer |
| ai.djl.pytorch.cudnn_benchmark | system prop | Speed up ConvNN-related model loading, default: false |
| ai.djl.pytorch.use_mkldnn | system prop | Enable MKLDNN, default: false, not recommended, use at your own risk |
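
The `ai.djl.pytorch.*` keys are Java system properties, so they can be passed on the JVM command line (for example `-Dai.djl.pytorch.num_threads=4`) or set programmatically. Below is a minimal sketch of the programmatic route. The class `PyTorchPropsSketch` is hypothetical, the chosen values are placeholders, and it is an assumption that the properties need to be set before the PyTorch engine is first used.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of the DJL API. */
final class PyTorchPropsSketch {

    private PyTorchPropsSketch() {}

    /** Writes the thread and graph optimizer settings from the table into the given properties. */
    static void configure(
            Properties props, int numThreads, int interopThreads, boolean graphOptimizer) {
        props.setProperty("ai.djl.pytorch.num_threads", Integer.toString(numThreads));
        props.setProperty("ai.djl.pytorch.num_interop_threads", Integer.toString(interopThreads));
        props.setProperty("ai.djl.pytorch.graph_optimizer", Boolean.toString(graphOptimizer));
    }

    public static void main(String[] args) {
        // Assumption: this runs before any code touches the PyTorch engine.
        configure(System.getProperties(), 4, 1, true);
        System.out.println(System.getProperty("ai.djl.pytorch.num_threads"));
    }
}
```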

TensorFlow

| Key | Type | Description |
|---|---|---|
| TENSORFLOW_LIBRARY_PATH | env var/system prop | User-provided custom TensorFlow native library |
| TENSORFLOW_EXTRA_LIBRARY_PATH | env var/system prop | Extra TensorFlow custom operators library to load |
| TF_CPP_MIN_LOG_LEVEL | env var | TensorFlow log level |
| ai.djl.tensorflow.debug | system prop | Enable devicePlacement logging, default: false |

MXNet

| Key | Type | Description |
|---|---|---|
| MXNET_LIBRARY_PATH | env var/system prop | User-provided custom MXNet native library |
| MXNET_VERSION | env var/system prop | The version of the custom MXNet build |
| MXNET_EXTRA_LIBRARY_PATH | env var/system prop | Load extra MXNet custom libraries, e.g. Elastic Inference |
| MXNET_EXTRA_LIBRARY_VERBOSE | env var/system prop | Set verbosity for the MXNet custom library |
| ai.djl.mxnet.static_alloc | system prop | CachedOp options, default: true |
| ai.djl.mxnet.static_shape | system prop | CachedOp options, default: true |
| ai.djl.use_local_parameter_server | system prop | Use the Java parameter server instead of the MXNet native implementation, default: false |

Huggingface tokenizers

| Key | Type | Description |
|---|---|---|
| TOKENIZERS_CACHE | env var | User-provided custom Huggingface tokenizer native library |

Python

| Key | Type | Description |
|---|---|---|
| PYTHON_EXECUTABLE | env var | The location of the Python executable, default: python |
| DJL_ENTRY_POINT | env var | The entry point Python file or module, default: model.py |
| MODEL_LOADING_TIMEOUT | env var | Python worker model loading timeout, default: 240 seconds |
| PREDICT_TIMEOUT | env var | Python predict call timeout, default: 120 seconds |
| MAX_NETTY_BUFFER_SIZE | env var/system prop | Max response size in bytes, default: 20 * 1024 * 1024 (20M) |
| DJL_VENV_DIR | env var/system prop | The venv directory, default: $DJL_CACHE_DIR/venv |
| ai.djl.python.disable_alternative | system prop | Disable alternative engine |
| TENSOR_PARALLEL_DEGREE | env var | Set the tensor parallel degree. For mpi mode, the default is the number of accelerators. Use "max" for non-mpi mode to use all GPUs for tensor parallel. |
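
The numeric defaults in this table can be summarized in a short sketch. The class `PythonEngineSettingsSketch` and its method are hypothetical and only illustrate how an unset variable resolves to the documented default; how DJLServing itself parses these values is not shown here.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of the DJL API. */
final class PythonEngineSettingsSketch {

    // Defaults as documented in the table above.
    static final int DEFAULT_MODEL_LOADING_TIMEOUT = 240; // seconds
    static final int DEFAULT_PREDICT_TIMEOUT = 120; // seconds
    static final int DEFAULT_MAX_NETTY_BUFFER_SIZE = 20 * 1024 * 1024; // bytes

    private PythonEngineSettingsSketch() {}

    /**
     * Returns the integer value of an env var, or the fallback when it is unset or blank.
     * A non-numeric value raises NumberFormatException.
     */
    static int intSetting(Map<String, String> env, String key, int fallback) {
        String raw = env.get(key);
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println(intSetting(env, "MODEL_LOADING_TIMEOUT", DEFAULT_MODEL_LOADING_TIMEOUT));
        System.out.println(intSetting(env, "PREDICT_TIMEOUT", DEFAULT_PREDICT_TIMEOUT));
        System.out.println(intSetting(env, "MAX_NETTY_BUFFER_SIZE", DEFAULT_MAX_NETTY_BUFFER_SIZE));
    }
}
```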

Engine specific settings

DJL supports 12 deep learning frameworks, and each framework has its own settings. Please refer to each framework’s documentation for details.

A common setting for most of the engines is OMP_NUM_THREADS. For the best throughput, DJLServing sets it to 1 by default. For some engines (e.g. MXNet), this value must be 1. Since this is a global environment variable, setting it affects all other engines.

The following table shows some engine-specific environment variables that DJLServing overrides by default:

| Key | Engine | Description |
|---|---|---|
| TF_NUM_INTEROP_THREADS | TensorFlow | Default 1, OMP_NUM_THREADS will override this value |
| TF_NUM_INTRAOP_THREADS | TensorFlow | Default 1 |
| TF_CPP_MIN_LOG_LEVEL | TensorFlow | Default 1 |
| MXNET_ENGINE_TYPE | MXNet | This value must be NaiveEngine |
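
As a rough illustration of the defaults listed above, the sketch below builds an environment map in which the documented values are filled in. The class `EngineEnvDefaultsSketch` is hypothetical and is not how DJLServing is implemented. It assumes that values already supplied by the user are kept, except for MXNET_ENGINE_TYPE, which the table says must be NaiveEngine. It does not model the interaction between OMP_NUM_THREADS and TF_NUM_INTEROP_THREADS.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of DJLServing. */
final class EngineEnvDefaultsSketch {

    private EngineEnvDefaultsSketch() {}

    /** Returns a copy of the user environment with the documented defaults applied. */
    static Map<String, String> withDefaults(Map<String, String> userEnv) {
        Map<String, String> env = new LinkedHashMap<>(userEnv);
        // Assumption: a value set by the user takes precedence over the default.
        env.putIfAbsent("OMP_NUM_THREADS", "1");
        env.putIfAbsent("TF_NUM_INTEROP_THREADS", "1");
        env.putIfAbsent("TF_NUM_INTRAOP_THREADS", "1");
        env.putIfAbsent("TF_CPP_MIN_LOG_LEVEL", "1");
        // The table states this value must be NaiveEngine, so it is always set.
        env.put("MXNET_ENGINE_TYPE", "NaiveEngine");
        return env;
    }

    public static void main(String[] args) {
        withDefaults(new LinkedHashMap<>())
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

A map like this could be handed to `ProcessBuilder.environment().putAll(...)` when launching a worker process, which keeps the global OMP_NUM_THREADS setting visible in one place.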