CIA

Code for training, evaluating and using a cross-lingual Auto Evaluator

Installation

We require separate environments for training and inference due to incompatible PyTorch versions.

Training Environment Setup

  1. Create and activate the training environment:

    conda create -n training python=3.10 && conda activate training
  2. Install numpy (ensure compatibility by avoiding numpy 2.x):

    pip install numpy==1.26.4
  3. Install PyTorch:

    conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia
    • Use torch v2.1.2 only.
    • Select the CUDA build (pytorch-cuda) that matches your system.
    • For further instructions, refer to the official PyTorch installation guide.
  4. Clone and install the Alignment Handbook:

    git clone https://github.com/huggingface/alignment-handbook.git
    cd ./alignment-handbook/
    python -m pip install .
  5. Install Flash Attention 2:

    python -m pip install flash-attn --no-build-isolation
  6. Login to Hugging Face CLI:

    huggingface-cli login
  7. Install other useful libraries:

    pip install wandb huggingface-hub==0.24.7
  8. Install Git LFS to push models to the Hugging Face Hub:

    sudo apt-get install git-lfs
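
Once these steps complete, a quick sanity check from Python confirms the pinned versions and GPU visibility. This is a minimal sketch, not part of the repository; run it inside the training environment.

    # sanity_check_training_env.py -- minimal sketch, not part of the repo
    import numpy
    import torch

    # numpy 2.x is intentionally avoided (see step 2)
    assert numpy.__version__.startswith("1."), "expected numpy 1.x"
    # torch v2.1.2 is the only supported version (see step 3)
    assert torch.__version__.startswith("2.1.2"), "expected torch 2.1.2"
    print("CUDA available:", torch.cuda.is_available())

    try:
        import flash_attn  # installed in step 5
        print("flash-attn:", flash_attn.__version__)
    except ImportError:
        print("flash-attn is not installed")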

Inference Environment Setup

  1. Create and activate the inference environment:

    conda create -n inference python=3.10 && conda activate inference
  2. Install vLLM:

    pip install vllm
  3. Install datasets and transformers libraries:

    pip install datasets transformers
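
With the inference environment ready, the evaluator can be served through vLLM. The snippet below is a minimal sketch; the model identifier and prompt are placeholders, so substitute the actual CIA evaluator checkpoint and the evaluation prompt format used by the repository.

    # minimal vLLM inference sketch -- model id and prompt are placeholders
    from vllm import LLM, SamplingParams

    llm = LLM(model="path/or/hf-id/of-evaluator")  # placeholder checkpoint
    params = SamplingParams(temperature=0.0, max_tokens=512)

    # build the evaluation prompt (instruction, response, reference) per the repo's format
    prompt = "..."
    outputs = llm.generate([prompt], params)
    print(outputs[0].outputs[0].text)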

Citation

If you find our work helpful, please consider citing our paper!

BibTeX:

@article{doddapaneni2024crosslingual,
  title   = {Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs},
  author  = {Sumanth Doddapaneni and Mohammed Safi Ur Rahman Khan and Dilip Venkatesh and Raj Dabre and Anoop Kunchukuttan and Mitesh M. Khapra},
  year    = {2024},
  journal = {arXiv preprint arXiv:2410.13394}
}
