This repository contains Python scripts used for the Wild Diacritics project, which analyzes the sporadic diacritical marks found in ordinary Arabic writing and attempts to use them to improve the automatic Arabic diacritization models in CAMeL Tools.
Install the required pip packages using the following command:
pip install -r requirements.txt
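After the install finishes, it can be useful to confirm that the packages are visible to the interpreter you will run the scripts with. The following is a minimal sketch using the standard library's importlib.metadata. The function names `installed_version` and `report` are hypothetical, and the distribution names you pass should come from your own requirements.txt.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def report(distributions):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in distributions}
```

For example, `report(["camel-tools"])` returns a dictionary whose value is None when that distribution is not installed in the active environment. The name `camel-tools` is only an illustration here.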
You also need to update the relative paths to diac_handler.py and token_handler.py in the lines that load these files:
exec(open('<RELATIVE_PATH_TO_HANDLER>').read())
Do this in every script you use, so that each path actually points to diac_handler.py or token_handler.py.
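Because a relative path is resolved against the current working directory, the exec line can fail when a script is launched from a different folder. The following is a minimal sketch of one way to make that failure easier to diagnose. `load_handler` is a hypothetical helper that is not part of this repository, and it assumes the handler files are plain Python source.

```python
from pathlib import Path


def load_handler(path, namespace):
    """Execute a handler file inside `namespace`, like exec(open(path).read())."""
    handler_path = Path(path).resolve()
    if not handler_path.is_file():
        # Report the fully resolved path so a wrong relative path is easy to spot.
        raise FileNotFoundError(f"Handler not found: {handler_path}")
    source = handler_path.read_text(encoding="utf-8")
    # Compiling with the file name keeps tracebacks pointing at the handler file.
    exec(compile(source, str(handler_path), "exec"), namespace)
    return namespace
```

Inside a script, a call such as `load_handler(Path(__file__).parent / "diac_handler.py", globals())` would resolve the path relative to the script itself rather than the working directory. That call assumes the handler sits next to the script, which may not match your layout.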
This repository contains modules and scripts. Modules are Python files that collect helper functions so they can be reused across different scripts.
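To illustrate the split, here is a sketch of the kind of helper a module might hold and how a script would reuse it. The functions `strip_diacritics` and `count_diacritics` are hypothetical and are not taken from diac_handler.py or token_handler.py. The only fact relied on is that the common Arabic diacritics occupy the Unicode range U+064B to U+0652.

```python
import re

# --- module part: reusable helpers (hypothetical, for illustration only) ---

# Tanween, short vowels, shadda and sukun: U+064B through U+0652.
_DIACRITICS = re.compile("[\u064B-\u0652]")


def strip_diacritics(text):
    """Return `text` with the Arabic diacritics removed."""
    return _DIACRITICS.sub("", text)


def count_diacritics(text):
    """Return how many diacritic marks appear in `text`."""
    return len(_DIACRITICS.findall(text))


# --- script part: a script only calls the helpers ---

def summarize(lines):
    """Pair each undiacritized line with the number of marks it carried."""
    return [(strip_diacritics(line), count_diacritics(line)) for line in lines]
```

Keeping the regular expression and the two helpers in one place means that every script counting or removing marks uses the same definition of a diacritic.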
To set up a Python environment with pyenv and venv:
- to list the Python versions available for installation
pyenv install --list
- to download one version
pyenv install 3.10.13
- to change the python version of the current shell use
pyenv shell 3.10.13
- to create a new virtual environment use
python -m venv wild_diac_env
- in order to activate the python environment use
source wild_diac_env/bin/activate
- and to deactivate use
deactivate
- to reinstall camel_tools in editable mode, run the following from the root of its source directory
pip install -e .
pyenv rehash
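Once the environment is set up with the commands above, a quick check from Python can confirm that the expected interpreter is active. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the default of 3.10 simply mirrors the version used in the commands above.

```python
import sys


def in_virtualenv():
    """Return True when running inside an environment created by venv."""
    # venv points sys.prefix at the environment directory, while
    # sys.base_prefix keeps pointing at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


def python_matches(major, minor):
    """Return True when the running interpreter is version major.minor."""
    return sys.version_info[:2] == (major, minor)


def describe_environment(major=3, minor=10):
    """Collect the checks in one dictionary for printing or logging."""
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "expected_version": python_matches(major, minor),
        "virtualenv": in_virtualenv(),
    }
```

Running `print(describe_environment())` after `source wild_diac_env/bin/activate` should report `virtualenv` as True. If it reports False, the scripts are most likely running against a different interpreter than the one the packages were installed into.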