The Specialized Immigration Assistant LLM is an innovative application that leverages the power of LLAMA2, a Large Language Model, to provide specialized legal assistance in the field of immigration law. Our project aims to make legal information more accessible and comprehensible to users seeking guidance in US immigration matters.
Training with Legal Articles: We utilized a comprehensive collection of legal articles sourced from a prominent Law group to train and fine-tune the LLAMA2 Large Language Model.
Customized Immigration Assistant Model: We tailored LLAMA2's capabilities to create a specialized Immigration Assistant model that understands and generates accurate legal language for immigration-related queries.
Improved Accuracy: By fine-tuning the model on a rich dataset of legal nuances and terminology, we achieved improved accuracy and context-specific responses.
Real-time Immigration Information: We are actively working on enabling users to access reliable and up-to-date legal information through an AI interface, contributing to more informed decision-making in US immigration matters.
- Python
- Numpy (Use matrix math operations)
- PyTorch (Build Deep Learning models)
- Datasets (Access datasets from huggingface hub)
- Huggingface_hub (access huggingface data & models)
- Transformers (Access models from HuggingFace hub)
- Trl (Transformer Reinforcement Learning. And fine-tuning.)
- Bitsandbytes (makes models smaller, aka 'quantization')
- Sentencepiece (Byte Pair Encoding scheme aka 'tokenization')
- OpenAI (Create synthetic fine-tuning and reward model data)
- Peft (Parameter Efficient Fine Tuning, use low rank adaption (LoRa) to fine-tune)
- Jupyter Notebook
- The dataset provided by private Legal firm and is not included in the GitHub repository for data privacy reasons.
- Install all dependencies using pip
pip install numpy pandas torch datasets huggingface_hub transformers trl bitsandbytes sentencepiece openai peft evaluate rouge_score
To train the Immigration Assistant model, you have the option to execute the llam2_training.ipynb notebook either locally on your machine or remotely through a cloud service such as Google Colab Pro. It's important to note that the training process requires the availability of a GPU for optimal performance.
In case you don't have access to a GPU, a convenient and cost-effective alternative is to utilize Google Colab Pro, which is available at a monthly cost of $10.
For those interested in gaining deeper insights into the training process and the nuances of our specialized Immigration Assistant model, you can explore detailed information within the llam2_training.ipynb notebook. This notebook provides a comprehensive overview of the training methodology and the underlying mechanisms that make the Immigration Assistant so effective in providing accurate legal guidance for immigration-related queries.
Cloud TrainingLocal Training
git clone https://github.com/kedir/Specialized-Immigration-Assistant-LLM-Chatbot.git
cd notebooks
jupyter llam2_training.ipynb
Contributions, issues, and feature requests are welcome!
Give a ⭐️ if you like this project!
Distributed under the MIT License. See LICENSE.txt
for more information.
- Kedir Ahmed - @linkedin - [email protected]
Meta HuggingFace OpenAI