Wild Diacritics Utils

Description

This repository contains Python scripts used in the Wild Diacritics project. The project analyses the sporadic diacritical marks that appear in ordinary Arabic writing and attempts to use them to improve the automatic Arabic diacritization models in CAMeL Tools.
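To make the idea of sporadic diacritics concrete, here is a minimal sketch that is not taken from the repository's scripts. It assumes diacritics can be approximated by the Unicode range U+064B to U+0652 plus the superscript alef U+0670; the names `DIACRITICS`, `strip_diacritics`, and `diacritized_ratio` are hypothetical and chosen only for illustration.

```python
import re

# Assumption: treat the Arabic tashkeel range U+064B..U+0652 and the
# superscript alef U+0670 as "diacritics". The project's own handlers
# may use a different inventory.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text):
    """Return `text` with all matched diacritical marks removed."""
    return DIACRITICS.sub("", text)


def diacritized_ratio(text):
    """Return the fraction of whitespace-separated tokens that carry
    at least one diacritical mark (0.0 for empty input)."""
    tokens = text.split()
    if not tokens:
        return 0.0
    marked = sum(1 for token in tokens if DIACRITICS.search(token))
    return marked / len(tokens)
```

A low but non-zero ratio is what one might expect from ordinary text in which writers add only an occasional mark.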

Setup

Install the required pip packages with the following command:

pip install -r requirements.txt

You also need to update the relative paths to diac_handler.py and token_handler.py in the lines that load these files, which look like this:

exec(open('<RELATIVE_PATH_TO_HANDLER>').read())

Make this change in every script you use, so that each path actually points to diac_handler.py or token_handler.py.
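If editing the path by hand in every script becomes tedious, one possible alternative is to resolve it relative to the script's own location. The sketch below is an assumption, not how the repository's scripts are written; `load_handler` and `resolve_relative` are hypothetical helper names, and the directory layout in the usage comment is made up.

```python
from pathlib import Path


def resolve_relative(base_file, relative_path):
    """Resolve `relative_path` against the directory containing `base_file`,
    so the result does not depend on the current working directory."""
    return (Path(base_file).resolve().parent / relative_path).resolve()


def load_handler(handler_path, namespace):
    """Execute a handler file inside `namespace`, mirroring
    exec(open(path).read()) but with an explicit encoding and a
    filename attached to tracebacks."""
    source = Path(handler_path).read_text(encoding="utf-8")
    exec(compile(source, str(handler_path), "exec"), namespace)
    return namespace


# Hypothetical usage inside a script (the "../modules" layout is an assumption):
# load_handler(resolve_relative(__file__, "../modules/diac_handler.py"), globals())
```

Passing `globals()` keeps the behaviour of the original `exec` line, where the handler's functions become available directly in the calling script.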

Overview

This repository contains modules and scripts. Modules are Python files that collect helper functions so they can be reused across different scripts.

Modules

Scripts

Dependencies

DEV NOTES

  • to get a list of python versions
pyenv install --list
  • to download one version
pyenv install 3.10.13
  • to change the python version of the current shell use
pyenv shell 3.10.13
  • to create a new virtual environment use
python -m venv wild_diac_env
  • in order to activate the python environment use
source wild_diac_env/bin/activate
  • and to deactivate use
deactivate
  • to reinstall camel_tools in editable mode, run the following from the root of its source directory
pip install -e .
pyenv rehash