Merge pull request #14 from letuananh/coolisf-0.2.3

Coolisf 0.2.3
letuananh · May 12, 2021 · 847af8d
2 parents 5a1fe79 + 8761539
commit 847af8d
Showing 51 changed files with 215 additions and 175 deletions.
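
Most of the Python changes in this commit are the same mechanical edit: imports from the top-level `chirptext` and `puchikarui` packages now go through the `texttaglib` namespace. A rewrite of this kind could be scripted roughly as follows. This is a minimal sketch, not the tool used for this commit; `rewrite_imports` and `MOVED_PACKAGES` are hypothetical names that do not exist in coolisf or texttaglib, and only single-line `from ... import ...` statements are handled.

```python
import re

# Assumption: these are the only two packages that moved under `texttaglib`.
MOVED_PACKAGES = ("chirptext", "puchikarui")

# Matches lines such as `from chirptext.leutile import Counter`,
# keeping any leading indentation (e.g. inside a try block).
_IMPORT_RE = re.compile(
    r"^(?P<indent>\s*)from\s+(?P<pkg>" + "|".join(MOVED_PACKAGES) + r")"
    r"(?P<rest>(?:\.\w+)*\s+import\s+.+)$"
)


def rewrite_imports(source: str) -> str:
    """Prefix moved packages with `texttaglib.` in `from X import Y` lines."""
    out = []
    for line in source.splitlines():
        m = _IMPORT_RE.match(line)
        if m:
            line = "{indent}from texttaglib.{pkg}{rest}".format(**m.groupdict())
        out.append(line)
    result = "\n".join(out)
    # splitlines() drops the final newline, so restore it if it was there.
    if source.endswith("\n"):
        result += "\n"
    return result


if __name__ == "__main__":
    print(rewrite_imports("from chirptext import FileHelper"))
```

Lines that already start with `from texttaglib.` are left alone, so running the sketch twice is harmless. It would not reproduce every hunk in this commit: in `coolisf/dao/corpus.py`, for example, `from chirptext import texttaglib as ttl` becomes `from texttaglib.chirptext import ttl`, which is a manual change rather than a plain prefix.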
diff --git a/MANIFEST.in b/MANIFEST.in
@@ -1,3 +1,5 @@
 include README.md
 include LICENSE
 include requirements*.txt
+include coolisf/dao/scripts/*.sql
+include coolisf/data/*.gz
diff --git a/README.md b/README.md
@@ -1,77 +1,80 @@
-Integrated Semantic Framework (intsem.fx)
-=========
+# Integrated Semantic Framework (intsem.fx)
 
-# Prerequisite
+A Python 3 implementation of the [Integrated Semantic Framework](https://osf.io/9udjk/) that provides computational deep semantic analysis by combining structural semantics from construction grammars and lexical semantics from ontologies in a single representation.
 
-* Python >= 3.5
-* Required packages (see requirements.txt)
-* English Resource Grammar (rev >= 26135)
-* NLTK data
-* LeLESK data (see https://github.com/letuananh/lelesk)
+# Install
 
-# Installation
+`coolisf` only works on Linux distributions at the moment (built and tested on Fedora and Ubuntu Linux). 
+
+- Install `coolisf` package from [PyPI](https://pypi.org/project/coolisf/) using pip
 
-* Download and install ACE >= 0.9.26 from: http://sweaglesw.org/linguistics/ace/
-* Download ERG trunk from SVN `svn checkout http://svn.delph-in.net/erg/trunk`
-* Build erg.dat `ace -g ace/config.tdl -G erg.dat`
-* Download the latest release from: https://github.com/letuananh/intsem.fx/releases, unzip it to a folder and run the `isf` command
-* Download NLTK data
 ```
-import nltk
-nltk.download("book")
+pip install coolisf
 ```
 
+- Create coolisf data folder at `/home/user/local/isf/data`
+- Download ace-0.9.26 binary from https://osf.io/x52fy/ to `/home/user/bin/ace`. Make sure that you can run ace by
 
-Tips:
-`pip` is recommended for installing required packages
-```
-python -m pip install -r requirements.txt
+```bash
+[isf]$ ~/bin/ace -V
+ACE version 0.9.26
+compiled at 18:48:50 on Sep 14 2017
 ```
 
+- Install [lelesk](https://pypi.org/project/lelesk/) and yawlib with data
+- Download coolisf lexical rules database from https://osf.io/qn4wz/ and extract it to `/home/user/local/isf/data/lexrules.db`
+- Download grammar files (erg.dat, jacy.dat, virgo.dat, etc.) and copy them to `/home/user/local/isf/data/grammars/`
 
-# Using ISF
+The final data folder should look something like this
 
 ```
-cd ~/workspace/intsem.fx
-./isf parse data/sample.txt data/sample.out
+/home/user/local/isf/data
+├── grammars
+│   ├── erg.dat
+│   └── jacy.dat
+├── lexrules.db
 ```
 
-# Development
+# Using ISF
 
-WARNING: These are meant for developers who want to contribute to the codebase. If all you need is to run the ISF to process your data, please see the Installation section above instead.
+To parse a sentence, use coolisf `text` command
 
-2 - Check out the code of intsem.fx to ~/workspace with:
+```bash
+python -m coolisf text "I drink green tea." -f dmrs
 
+:`I drink green tea.` (len=5)
+------------------------------------------------------------
+dmrs {
+  10000 [pron<0:1> x ind=+ num=sg pers=1 pt=std];
+  10001 [pronoun_q<0:1> x ind=+ num=sg pers=1 pt=std];
+  10002 [_drink_v_1_rel<2:7> e mood=indicative perf=- prog=- sf=prop tense=pres];
+  10003 [udef_q<8:18> x num=sg pers=3];
+  10004 [_green+tea_n_1_rel<8:18> x num=sg pers=3];
+  0:/H -> 10002;
+  10001:RSTR/H -> 10000;
+  10002:ARG1/NEQ -> 10000;
+  10002:ARG2/NEQ -> 10004;
+  10003:RSTR/H -> 10004;
+}
+# 10002 -> 01170052-v[drink/lelesk]
+# 10004 -> 07935152-n[green tea/lelesk]
+...
 ```
-git clone --recursive https://github.com/letuananh/intsem.fx.git
-```
-
-3 - Check out ERG to your workspace folder and compile the grammar
 
-This is complicated, read more here: http://moin.delph-in.net/TuanAnhLe/GramEng4Dummies
+For batch processing, create a text file with each sentence on a separate line.
+For example here is the content of the file `sample.txt`
 
-Basically, I need the grammar file (erg.dat) to be located at ~/workspace/cldata/erg.dat
-
-4 - Configure the application
-```
-cd ~/workspace/intsem.fx
-./config.sh
-```
-To use ISF, please try
 ```
-cd ~/workspace/intsem.fx
-./isf --help
+I drink green tea.
+Sherlock Holmes has three guard dogs.
+A soul is not a living thing.
+Do you have any green tea chest?
 ```
 
-Notes:
+After that, run the following command and the output will be written to the file `demo_out.xml`
 
-Use virtualenv to install required packages
-```
-python3 -m venv ~/isf_py3
-. ~/isf_py3/bin/activate
+```bash
+python -m coolisf parse demo.txt -o demo_out.xml
 ```
 
-Install these packages if you are using Fedora Linux:
-```
-sudo dnf install -y redhat-rpm-config gcc-c++
-```
+If you encounter any issue, please submit an issue at: https://github.com/letuananh/intsem.fx/issues
diff --git a/config.sh b/config.sh
@@ -41,9 +41,7 @@ git submodule init && git submodule update
 # prerequisite packages
 pip install -r requirements.txt -qq
 
-link_folder `readlink -f ./modules/lelesk/lelesk` lelesk
 link_folder `readlink -f ./modules/demophin` demophin
-link_folder `readlink -f ./modules/yawlib/yawlib` yawlib
 
 link_file `readlink -f ${WORKSPACE_FOLDER}/cldata/erg.dat` data/erg.dat
 link_file `readlink -f ${WORKSPACE_FOLDER}/cldata/jacy.dat` data/jacy.dat

diff --git a/coolisf/__main__.py b/coolisf/__main__.py
@@ -1,2 +1,6 @@
+# This code is a part of coolisf library: https://github.com/letuananh/intsem.fx
+# :copyright: (c) 2014 Le Tuan Anh <[email protected]>
+# :license: MIT, see LICENSE for more details.
+
 from . import main
 main.main()
diff --git a/coolisf/__version__.py b/coolisf/__version__.py
@@ -9,11 +9,11 @@
 __copyright__ = "Copyright (c) 2014, Le Tuan Anh <[email protected]>"
 __credits__ = []
 __license__ = "MIT License"
-__description__ = "a Python package for providing computational deep semantic analysis by combining structural semantics from construction grammars and ontology-based lexical semantics in a single representation"
+__description__ = "A Python 3 implementation of the Integrated Semantic Framework that provides computational deep semantic analysis by combining structural semantics from construction grammars and lexical semantics from ontologies in a single representation."
 __url__ = "https://github.com/letuananh/intsem.fx/"
 __issue__ = "https://github.com/letuananh/intsem.fx/"
 __maintainer__ = "Le Tuan Anh"
 __version_major__ = "0.2.3"  # follow PEP-0440
-__version__ = "{}beta".format(__version_major__)
-__version_long__ = "{} - Beta".format(__version_major__)
+__version__ = "{}b2".format(__version_major__)
+__version_long__ = "{} - Beta 2".format(__version_major__)
 __status__ = "4 - Beta"
diff --git a/coolisf/common.py b/coolisf/common.py
@@ -39,7 +39,7 @@
 import gzip
 import logging
 
-from chirptext import FileHelper
+from texttaglib.chirptext import FileHelper
 from lelesk.util import ptpos_to_wn
 
 

diff --git a/coolisf/config.py b/coolisf/config.py
@@ -42,7 +42,7 @@
 import os
 import logging
 
-from chirptext import FileHelper, AppConfig
+from texttaglib.chirptext import FileHelper, AppConfig
 from coolisf.common import write_file
 from coolisf.data import read_config_template
 

diff --git a/coolisf/dao/cache.py b/coolisf/dao/cache.py
@@ -36,7 +36,7 @@
 import logging
 import json
 
-from puchikarui import Schema, with_ctx
+from texttaglib.puchikarui import Schema, with_ctx
 
 from coolisf.model import Sentence
 

diff --git a/coolisf/dao/corpus.py b/coolisf/dao/corpus.py
@@ -35,8 +35,8 @@
 import os.path
 import logging
 
-from puchikarui import Schema, with_ctx
-from chirptext import texttaglib as ttl
+from texttaglib.puchikarui import Schema, with_ctx
+from texttaglib.chirptext import ttl
 
 from coolisf.util import is_valid_name
 from coolisf.model import Corpus, Document, Sentence, Reading

diff --git a/coolisf/dao/ruledb.py b/coolisf/dao/ruledb.py
@@ -32,7 +32,7 @@
 import os.path
 import logging
 
-from puchikarui import with_ctx
+from texttaglib.puchikarui import with_ctx
 
 from coolisf.dao import CorpusDAOSQLite
 from coolisf.model import LexUnit, RuleInfo, PredInfo, RulePred, Reading

diff --git a/coolisf/dao/textcorpus.py b/coolisf/dao/textcorpus.py
@@ -31,8 +31,8 @@
 import os
 import json
 
-from chirptext import FileHelper
-from chirptext.chio import CSV
+from texttaglib.chirptext import FileHelper
+from texttaglib.chirptext.chio import CSV
 
 
 # ----------------------------------------------------------------------

diff --git a/coolisf/ergex.py b/coolisf/ergex.py
@@ -38,12 +38,12 @@
 from collections import defaultdict
 import csv
 
-from chirptext.leutile import Counter
-from chirptext.leutile import FileHelper
-from chirptext.leutile import TextReport
-from chirptext.leutile import FileHub
-from chirptext.leutile import Timer
-from chirptext.leutile import header
+from texttaglib.chirptext.leutile import Counter
+from texttaglib.chirptext.leutile import FileHelper
+from texttaglib.chirptext.leutile import TextReport
+from texttaglib.chirptext.leutile import FileHub
+from texttaglib.chirptext.leutile import Timer
+from texttaglib.chirptext.leutile import header
 
 from yawlib import YLConfig
 from yawlib import WordnetSQL as WNSQL

diff --git a/coolisf/ghub.py b/coolisf/ghub.py
@@ -42,7 +42,7 @@
 import logging
 from delphin.interfaces import ace
 
-from chirptext import FileHelper
+from texttaglib.chirptext import FileHelper
 
 from coolisf.config import read_config
 from coolisf.dao.cache import AceCache, ISFCache

diff --git a/coolisf/gold_extract.py b/coolisf/gold_extract.py
@@ -45,9 +45,9 @@
 from lxml import etree
 
 
-from chirptext.leutile import FileHelper
-from chirptext import texttaglib as ttl
-from chirptext import chio
+from texttaglib.chirptext.leutile import FileHelper
+from texttaglib.chirptext import texttaglib as ttl
+from texttaglib.chirptext import chio
 from lelesk import LeLeskWSD
 from lelesk import LeskCache  # WSDResources
 

diff --git a/coolisf/lexsem.py b/coolisf/lexsem.py
@@ -35,7 +35,7 @@
 
 from delphin.mrs.components import Pred
 
-from chirptext import texttaglib as ttl
+from texttaglib.chirptext import texttaglib as ttl
 from yawlib import SynsetID
 
 

diff --git a/coolisf/main.py b/coolisf/main.py
@@ -46,10 +46,10 @@
 import logging
 import collections
 
-from chirptext.cli import CLIApp, setup_logging
-from chirptext import header, confirm, TextReport, FileHelper, Counter, Timer
-from chirptext.leutile import is_number
-from chirptext import texttaglib as ttl
+from texttaglib.chirptext.cli import CLIApp, setup_logging
+from texttaglib.chirptext import header, confirm, TextReport, FileHelper, Counter, Timer
+from texttaglib.chirptext.leutile import is_number
+from texttaglib.chirptext import texttaglib as ttl
 from lelesk import LeLeskWSD
 from lelesk import LeskCache  # WSDResources
 

diff --git a/coolisf/model.py b/coolisf/model.py
@@ -51,9 +51,9 @@
 from delphin.mrs.components import Pred
 from delphin.mrs.components import normalize_pred_string
 
-from chirptext.anhxa import update_obj
-from chirptext.leutile import StringTool, header
-from chirptext import texttaglib as ttl
+from texttaglib.chirptext.anhxa import update_obj
+from texttaglib.chirptext.leutile import StringTool, header
+from texttaglib.chirptext import texttaglib as ttl
 from yawlib import Synset
 from lelesk import LeLeskWSD
 from lelesk import LeskCache  # WSDResources

diff --git a/coolisf/morph.py b/coolisf/morph.py
@@ -34,7 +34,7 @@
 import logging
 from collections import defaultdict as dd
 
-from chirptext import FileHelper
+from texttaglib.chirptext import FileHelper
 
 from coolisf.dao.ruledb import LexRuleDB
 from coolisf.config import read_config

diff --git a/coolisf/processors/jp_adv.py b/coolisf/processors/jp_adv.py
@@ -33,13 +33,13 @@
 
 import logging
 
-from chirptext import TextReport
+from texttaglib.chirptext import TextReport
 
 from coolisf.model import Sentence
 from .base import Processor
 try:
 
-    from chirptext.deko import wakati
+    from texttaglib.chirptext.deko import wakati
     from coolisf.shallow import JapaneseAnalyser
     from jamdict import Jamdict
     from jamdict.tools import dump_result

diff --git a/coolisf/processors/jp_basic.py b/coolisf/processors/jp_basic.py
@@ -37,7 +37,7 @@
 from .base import Processor
 
 try:
-    from chirptext.deko import wakati
+    from texttaglib.chirptext.deko import wakati
     from coolisf.shallow import JapaneseAnalyser
 except:
     logging.warning('chirptext.deko cannot be imported. JNLP mode is disabled')

diff --git a/djangoisf/__init__.py b/coolisf/rest/__init__.py
diff --git a/djangoisf/admin.py b/coolisf/rest/admin.py
diff --git a/djangoisf/apps.py b/coolisf/rest/apps.py
diff --git a/djangoisf/migrations/__init__.py b/coolisf/rest/migrations/__init__.py
diff --git a/djangoisf/models.py b/coolisf/rest/models.py
diff --git a/djangoisf/tests.py b/coolisf/rest/tests.py
diff --git a/djangoisf/urls.py b/coolisf/rest/urls.py