KNLP | BI-LSTM Word Segmentation and POS Tagging tool

The bilstm_tokenizer module tokenizes either a text file or a list of input sentences. Use tokenize_file_bilstm to tokenize a text file and tokenize_sentences_bilstm to tokenize input sentences.

Installation

git clone https://github.com/nakanyseth-vuth/git
cd segmentation
pip install -r requirements.txt

Usage 😮 🔑

Import the functions into your code:

from bilstm_tokenizer import tokenize_sentences_bilstm, tokenize_file_bilstm

input_sents = ["ខ្ញុំទៅសាលា", "សាលារៀនខ្ញុំនៅព្រែកលាប។"]
res = tokenize_sentences_bilstm(input_sents)

print(res)

To include POS Tagging in the results: 😎

Change the following code from:

seq,pos = decode(pred_sent, lines[i])
result = [s for s in seq ]

to this:

seq,pos = decode(pred_sent, lines[i])
result = [s+"/"+p for s,p in zip(seq,pos) ]
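To illustrate what the modified line produces, here is a minimal standalone sketch. It does not call the model: the helper names attach_tags and split_tags are hypothetical and are not part of bilstm_tokenizer, and the tag names in the sample data are invented placeholders rather than the tagger's actual tag set.

```python
from typing import List, Tuple


def attach_tags(seq: List[str], pos: List[str]) -> List[str]:
    """Join each word with its tag as "word/TAG", like the list comprehension above."""
    if len(seq) != len(pos):
        raise ValueError("seq and pos must have the same length")
    return [s + "/" + p for s, p in zip(seq, pos)]


def split_tags(tagged: List[str]) -> List[Tuple[str, str]]:
    """Recover (word, tag) pairs from "word/TAG" strings."""
    pairs = []
    for item in tagged:
        # Split on the last "/" so a word that itself contains "/" stays intact.
        word, tag = item.rsplit("/", 1)
        pairs.append((word, tag))
    return pairs


# Hypothetical sample data: the tags here are placeholders for illustration only.
seq = ["ខ្ញុំ", "ទៅ", "សាលា"]
pos = ["PRO", "VB", "NN"]

tagged = attach_tags(seq, pos)
print(" ".join(tagged))
print(split_tags(tagged))
```

Splitting on the last slash is a design choice for downstream code that needs the words and tags separately again. It assumes the tags themselves never contain a slash.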

@Created by Nakanyseth VUTH

