
DLRM v1 Inference

DLRM v1 Inference best known configurations with Intel® Extension for PyTorch.

Model Information

Use Case  | Framework | Model Repo                               | Branch/Commit/Tag | Optional Patch
----------|-----------|------------------------------------------|-------------------|---------------
Inference | PyTorch   | https://github.com/facebookresearch/dlrm | -                 | -

Pre-Requisite

Prepare Dataset

The code interfaces with the Criteo Kaggle Display Advertising Challenge Dataset.

  • To prepare the dataset for use with the DLRM code:
    • First, download the raw data file (train.txt).
    • The raw data is then pre-processed (categorized, concatenated across days, ...) so that it can be used with the DLRM code.
    • The processed data is stored as a *.npz file. The dataset directory must contain both train.txt and kaggleAdDisplayChallenge_processed.npz.

You can generate the checkpoints by running the command ./bench/dlrm_s_criteo_kaggle.sh [--test-freq=1024] from the DLRM model repository.
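Before moving on, you can quickly check that the prepared dataset directory is complete. This is a minimal sketch, not part of the official instructions; it assumes DATASET_DIR points at the directory prepared above and checks only the two file names mentioned there:

    # Sanity check: the dataset directory must contain both files listed above
    export DATASET_DIR=<path/to/criteo_kaggle_dataset>   # placeholder path, replace with yours
    for f in train.txt kaggleAdDisplayChallenge_processed.npz; do
        [ -f "${DATASET_DIR}/${f}" ] || echo "Missing ${DATASET_DIR}/${f}" >&2
    done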

Inference

  1. git clone https://github.com/IntelAI/models.git
  2. cd models/models_v2/pytorch/dlrm/inference/gpu
  3. Create a virtual environment venv and activate it:
    python3 -m venv venv
    . ./venv/bin/activate
    
  4. Run setup.sh
    ./setup.sh
    
  5. Install the latest GPU versions of torch, torchvision and intel_extension_for_pytorch:
    python -m pip install torch==<torch_version> torchvision==<torchvision_version> intel-extension-for-pytorch==<ipex_version> --extra-index-url https://pytorch-extension.intel.com/release-whl-aitools/
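Optionally, you can confirm that the GPU builds are importable before continuing. This is a hedged sketch, not part of the official setup; it assumes the XPU backend shipped with intel_extension_for_pytorch is installed correctly:

    # Optional check: print versions and the number of visible XPU devices
    python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__, ipex.__version__, torch.xpu.device_count())"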
  6. Set environment variables for Intel® oneAPI Base Toolkit. The default installation location {ONEAPI_ROOT} is /opt/intel/oneapi for the root account and ${HOME}/intel/oneapi for other accounts:
    source {ONEAPI_ROOT}/compiler/latest/env/vars.sh
    source {ONEAPI_ROOT}/mkl/latest/env/vars.sh
    source {ONEAPI_ROOT}/tbb/latest/env/vars.sh
    source {ONEAPI_ROOT}/mpi/latest/env/vars.sh
    source {ONEAPI_ROOT}/ccl/latest/env/vars.sh
  7. Set the required environment parameters:

Parameter                 | export command
--------------------------|--------------------------------------------------------
MULTI_TILE                | export MULTI_TILE=False (False)
PLATFORM                  | export PLATFORM=Flex (Flex)
DATASET_DIR               | export DATASET_DIR=<path to the prepared dataset directory>
CKPT_DIR                  | export CKPT_DIR=<path to the checkpoint directory>
BATCH_SIZE (optional)     | export BATCH_SIZE=32768
PRECISION (optional)      | export PRECISION=fp16 (fp16 for Flex)
OUTPUT_DIR (optional)     | export OUTPUT_DIR=$PWD
NUM_ITERATIONS (optional) | export NUM_ITERATIONS=20
  8. Run run_model.sh (a consolidated example follows below).
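For illustration only, a complete single-tile FP16 run on Flex could look like the following. This is a sketch under the assumptions above; the dataset and checkpoint paths are placeholders you must replace with your own:

    # Example values only; adjust paths and optional parameters for your setup
    export MULTI_TILE=False
    export PLATFORM=Flex
    export DATASET_DIR=<path/to/criteo_kaggle_dataset>   # contains train.txt and kaggleAdDisplayChallenge_processed.npz
    export CKPT_DIR=<path/to/checkpoint_dir>             # produced by ./bench/dlrm_s_criteo_kaggle.sh
    export PRECISION=fp16
    export BATCH_SIZE=32768
    export NUM_ITERATIONS=20
    export OUTPUT_DIR=$PWD
    ./run_model.sh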

Note

Refer to CONTAINER.md for DLRM-v1 inference instructions using docker containers.

Output

Single-tile output will typically look like:

accuracy 76.215 %, best 76.215 %
dlrm_inf latency:  0.11193203926086426  s
dlrm_inf avg time:  0.007462135950724284  s, ant the time count is : 15
dlrm_inf throughput:  4391235.996821996  samples/s

Final results of the inference run can be found in the results.yaml file.

results:
 - key: throughput
   value: 4391236.0
   unit: inst/s
 - key: latency
   value: 0.007462135950724283
   unit: s
 - key: accuracy
   value: 76.215
   unit: accuracy
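If you need to pull a single metric out of results.yaml in a script, a minimal sketch such as the following works for the layout shown above, where each value line immediately follows its key line:

    # Print the throughput value from results.yaml (assumes the key/value layout shown above)
    awk '/key: throughput/ {getline; print $2}' results.yaml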