ner-asr

This repository contains a named entity recognition (NER) system for the Finnish language.

Requirements

pytorch
pytorch-crf
gensim
morfessor
fasttext

Download resources

The pretrained fastText word embeddings for Finnish can be downloaded from the following link: https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.fi.300.bin.gz

You need to place the embeddings in the data/embeddings directory.
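As a rough sketch, the download step could be scripted as follows. It assumes that the code expects the decompressed cc.fi.300.bin file rather than the .gz archive, which this README does not state; the function names fetch_embeddings and decompress_gzip are illustrative and not part of this repository. Note that the archive is a large download.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

EMBEDDINGS_URL = "https://dl.fbaipublicfiles.com/fasttext/vectors-crawl/cc.fi.300.bin.gz"
EMBEDDINGS_DIR = Path("data/embeddings")


def decompress_gzip(archive: Path, target: Path) -> Path:
    """Stream-decompress `archive` into `target` without loading it into memory."""
    target.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def fetch_embeddings(url: str = EMBEDDINGS_URL, directory: Path = EMBEDDINGS_DIR) -> Path:
    """Download the archive (if missing) and unpack it next to itself."""
    directory.mkdir(parents=True, exist_ok=True)
    archive = directory / url.rsplit("/", 1)[-1]   # cc.fi.300.bin.gz
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    # Assumption: the model code reads the uncompressed .bin file.
    return decompress_gzip(archive, archive.with_suffix(""))
```

Calling fetch_embeddings() from the repository root would leave data/embeddings/cc.fi.300.bin in place.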

Usage

Two trained models are provided: model_lower and model_upper. The first is trained on lower-case data without punctuation. The second is trained on data that contains both lower- and upper-case letters together with punctuation.
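To give an idea of what lower-case input without punctuation looks like, here is a minimal normalization sketch. It is only an illustration: whether evaluate_document.py already performs this step, and exactly which characters were stripped from the training data, is not stated here. The function name normalize_for_lowercase_model is hypothetical.

```python
import string


def normalize_for_lowercase_model(text: str) -> str:
    """Lower-case the text, drop ASCII punctuation and collapse whitespace."""
    # Assumption: "punctuation" means the ASCII set in string.punctuation.
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


print(normalize_for_lowercase_model("Sauli Niinistö vieraili Helsingissä, Suomessa."))
# sauli niinistö vieraili helsingissä suomessa
```

Note that string.punctuation includes the hyphen, so hyphenated compounds would be joined into one token by this sketch.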

To switch between the models, change the lowercase_model flag in the config/params.py file.

Use evaluate_document.py to annotate a new document. The script takes two arguments:

--input - input file to be evaluated

--output - path where the output will be stored

The format of the input document is described in the script itself.
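The script can also be driven from Python. The sketch below only assumes the two arguments listed above and that it is run from the repository root; the file names and the helper names build_command and annotate are placeholders, not part of this repository.

```python
import subprocess
import sys
from typing import List


def build_command(input_path: str, output_path: str) -> List[str]:
    """Assemble the evaluate_document.py call with its two documented arguments."""
    return [
        sys.executable, "evaluate_document.py",
        "--input", input_path,
        "--output", output_path,
    ]


def annotate(input_path: str, output_path: str) -> None:
    """Run the annotation script and raise if it exits with an error."""
    subprocess.run(build_command(input_path, output_path), check=True)
```

For example, annotate("document.txt", "document_annotated.txt") would be equivalent to running python evaluate_document.py --input document.txt --output document_annotated.txt.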
