PASS2ACT

Author : Daniel Nohimovich & Zhekai Jin (Scott)

Course : ECE 467 Natural Language Processing

Instructor : Professor Carl Sable

Description

A passive-to-active voice transformer built on an existing dependency parser. The data pipeline processes the parser output to detect whether a sentence is passive. If the original sentence contains an agent, transformations are then performed on the parse tree to change the sentence to active voice. The result is rendered both as a parse-tree visualization and as text.

Dependency

Usage

pip3 install -U spacy
python3 -m spacy download en
python3 demo.py

Then follow the instructions as prompted.

Assumptions

The whole data pipeline relies on the parse tree produced by the parser, which is assumed to be correct.
Input is assumed to be a declarative statement, not a question.

Workflow

Given:

a declarative sentence in English
a dependency parse tree labeled with POS tags, produced by an existing parser

Decision Making:

the dependency parser we use distinguishes passive subjects from normal subjects in its grammar
the existence of a passive subject or a passive auxiliary verb implies that a sentence is passive
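
The decision rule above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: it assumes the parser emits the labels `nsubjpass` and `auxpass` (the labels used by spaCy's English models for passive subjects and passive auxiliaries), and the function name `is_passive` and the hand-written label lists are hypothetical.

```python
# Assumption: these are the dependency labels the parser uses for
# passive subjects and passive auxiliary verbs.
PASSIVE_DEPS = {"nsubjpass", "auxpass"}


def is_passive(dep_labels):
    """Return True if any dependency label marks a passive subject or auxiliary."""
    return any(label in PASSIVE_DEPS for label in dep_labels)


# Hand-written labels for "The ball was kicked by the boy ."
passive_example = ["det", "nsubjpass", "auxpass", "ROOT", "agent", "det", "pobj", "punct"]

# Hand-written labels for "The boy kicked the ball ."
active_example = ["det", "nsubj", "ROOT", "det", "dobj", "punct"]

print(is_passive(passive_example))
print(is_passive(active_example))
```

With spaCy, the label list would typically come from `[token.dep_ for token in doc]` for a parsed `doc`.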

Transform:

the subject and object are inverted according to a lookup table
the root verb and its auxiliaries are conjugated based on a couple of naive rules
finally, the sentence is built up by joining the individual phrases in active order, with an attempt to accommodate miscellaneous clauses
if a sentence contains an independent clause that is also passive, the algorithm recursively transforms that clause as well
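
As a rough illustration of the inversion and joining steps, the sketch below swaps pronoun case with a small lookup table and joins the phrases in active order. The tables, the helper names `swap_pronoun` and `build_active`, and the assumption that the verb has already been conjugated are all hypothetical; the project's own lookup table and conjugation rules may differ.

```python
# Hypothetical lookup tables for pronoun case inversion.
TO_SUBJECT = {"me": "I", "him": "he", "her": "she", "us": "we", "them": "they"}
TO_OBJECT = {"i": "me", "he": "him", "she": "her", "we": "us", "they": "them"}


def swap_pronoun(phrase, table):
    """Swap the case of a single-word pronoun; leave any other phrase unchanged."""
    return table.get(phrase.strip().lower(), phrase)


def build_active(agent, verb, patient):
    """Join agent, verb, and patient in active order.

    Assumes `verb` is already conjugated for the active sentence.
    """
    subject = swap_pronoun(agent, TO_SUBJECT)   # "by him" -> "he"
    obj = swap_pronoun(patient, TO_OBJECT)      # "I was seen" -> "me"
    sentence = " ".join([subject, verb, obj])
    return sentence[0].upper() + sentence[1:] + "."


# "The ball was kicked by him." -> active
print(build_active("him", "kicked", "the ball"))
# "I was seen by them." -> active
print(build_active("them", "saw", "I"))
```

The swap is limited to single-word phrases so that a possessive such as "her dog" is not rewritten by mistake.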

Running Time

Apart from the initial parsing, the algorithm that transforms the sentence to active voice runs in approximately linear time.

Robustness:

The algorithm takes edge cases into consideration and resolves recursive passive voice, but a wrong or ambiguous parser result will lead to erroneous output.

Performance Testing

Testing was performed on a limited dataset, and only detection was tested, since a transformed sentence has multiple valid forms. Detection testing gave 97% recall and 97% precision, and the error case was in fact due to an incorrect parser result. Without an existing baseline method to compare against, we concluded that the algorithm provides acceptable passive voice detection.
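
For reference, precision and recall for a binary passive detector can be computed as sketched below. The function name and the toy labels are hypothetical and are not the project's test data or results.

```python
def precision_recall(gold, predicted):
    """Compute precision and recall for boolean labels (True = passive)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # passive, detected
    fp = sum(1 for g, p in pairs if not g and p)    # active, flagged as passive
    fn = sum(1 for g, p in pairs if g and not p)    # passive, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels for illustration only.
gold = [True, True, True, False, False]
predicted = [True, True, False, True, False]

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```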

Future Improvement

Question form:
- Sentences in question form could be handled better.
Parse tree result correction:
- If the sentence has clear features that can be checked against the parse tree to verify its validity, we could add error detection and correction on the parser result to improve robustness.
Feature selection:
- More features or edge cases could be tested and considered.
Multi-language support:
- More languages could be included with different head parameters.

