diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
new file mode 100644
index 0000000..b963be4
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,29 @@
+---
+name: Bug report
+about: Create a report to help us improve
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Go to '...'
+2. Click on '....'
+3. Scroll down to '....'
+4. See error
+
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+
+**Screenshots**
+If applicable, add screenshots to help explain your problem.
+
+**Please complete the following information:**
+ - Version: [see bottom of the browser screen]
+ - OS: [e.g. Windows, Linux, OS X]
+ - Browser: [e.g. chrome, safari, firefox]
+
+**Additional context**
+Add any other context about the problem here.
diff --git a/.github/ISSUE_TEMPLATE/documentation_request.md b/.github/ISSUE_TEMPLATE/documentation_request.md
new file mode 100644
index 0000000..975cf27
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/documentation_request.md
@@ -0,0 +1,9 @@
+---
+name: Documentation request
+about: Suggest an idea to enhance documentation
+
+---
+
+**What kind of information do you search? Please describe.**
+
+**Describe the solution you'd like**
\ No newline at end of file
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
new file mode 100644
index 0000000..f957ba6
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,17 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is.
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
\ No newline at end of file
diff --git a/.github/ISSUE_TEMPLATE/refactoring.md b/.github/ISSUE_TEMPLATE/refactoring.md
new file mode 100644
index 0000000..86dd724
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/refactoring.md
@@ -0,0 +1,11 @@
+---
+name: Refactoring (for developers)
+about: Refactor the application
+
+---
+
+**Describe the refactoring action**
+A clear and concise description of what the action is.
+
+**Expected benefit**
+A clear and concise description of what you expect to improve by the refactoring.
\ No newline at end of file
diff --git a/CITATION.CFF b/CITATION.CFF
new file mode 100644
index 0000000..0440b88
--- /dev/null
+++ b/CITATION.CFF
@@ -0,0 +1,10 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+- family-names: "Terriel"
+  given-names: "Lucas"
+  orcid: "https://orcid.org/0000-0002-9189-258X"
+title: "Semantic@"
+version: 0.0.1
+date-released: 2022-05-5
+url: "https://github.com/Lucaterre/semanticat"
diff --git a/README.md b/README.md
index 6aea5f6..a27383f 100644
--- a/README.md
+++ b/README.md
@@ -1,24 +1,31 @@
 ![Python Version](https://img.shields.io/badge/python-3.8-blue)
 [![MIT License](https://img.shields.io/apm/l/atomic-design-ui.svg?)](https://github.com/tterb/atomic-design-ui/blob/master/LICENSEs)
 [![Semantic@ CI build](https://github.com/Lucaterre/semanticat/actions/workflows/CI.yml/badge.svg)](https://github.com/Lucaterre/semanticat/actions/workflows/CI.yml)
-
+
+
 # Semanti🐱
+----
+
 **WORK-IN-PROGRESS**
-Semantic@ is a platform for enriching XML documents in TEI or EAD format with named entities.
+Semantic@ is a semantic annotation platform for enriching XML documents in [TEI](https://tei-c.org/) or [EAD](https://www.loc.gov/ead/) schemas with semantic annotations.
-After importing the document(s), apply the NER model and correct prediction or annotate manually from-zero and finally export and/or publish your XML with annotations inside.
+Follow a simple workflow: After importing the document(s), apply the NER model and correct prediction or annotate manually from-zero and finally export and/or publish your XML with annotations directly inside.
+This platform is also designed to adapt generically to the diversity of publishing projects and a base for adding custom components.
-## :battery: Installation
+## :movie_camera: Demo
+![semanticat_demo](./documentation/semanticat_demo.gif)
+
+## :battery: Installation
 1. Clone the Github repository
 ```bash
-git clone
+git clone https://github.com/Lucaterre/semanticat.git
 ```
 2. Move inside the directory
@@ -48,6 +55,8 @@ pip install -r requirements.txt
 ## :rocket: Run Locally
+:fire: This application is intended to be simple and local for the moment. **Please note that the application is currently optimized for the Firefox browser.**
+
 Use the semantic@ CLI; inside the `semanticat/` directory, launch the command :
 ```bash
@@ -56,52 +65,85 @@ python run.py
 Others arguments :
-| **Type** | **Details** |
-|-----------------------|----------------------------------------------|
-| `--dev_mode` | Launch app in development mode |
+| **Type** | **Details** |
+|-----------------------|------------------------------------------|
+| `--dev_mode` | Launch application in development mode |
 | `--erase_recreate_db` | Clean and Restore all database :warning: |
-## :arrow_forward: Quick Start
-(TODO : créer un wiki ? avec détails sur les exports, sur la manière d'annoter, sur comment enregistrer sur modèle NER etc.)
+## :arrow_forward: Getting started
+
+
+- Start by creating a project with the button `Create a new project` and open your project;
+- Go to `Menu` > `Manage documents` and import your XML, now you can see your documents in
+`Project workflow` view (You can mix EAD and TEI);
+- In `Project workflow` view: Apply `parse` feature on document one by one or apply `Parse All` on all documents;
+
+- Go to `Menu` > `configuration`, two use cases :
+
+1. You don't want to apply a NER model, and you want to manually annotate your data :
+- First, define Annotation mapping (see the "Mapping details" section);
+- Add labels with `Add new pair to mapping scope`;
+- Then, go to `Project workflow` > `correct named entities` and start annotation.
+
+2. You want to use an NER (recommend) model to predict named entities and correct afterwards (see the "NER configuration details" section):
+- First, select checkbox `NER Recommenders`;
+- Then, Choose the correct language that corresponding to your model;
+- Then, Select the model and save;
+- Wait, the pre-mapping appears, you can then adapt it (see the "Mapping details" section);
+- Go to `Project workflow` > `Launch Ner` (or `Launch Ner on all`);
+- When the process is complete,s, go to `correct named entities` and correct the predictions or add annotations.
+
+- Whatever the chosen scenario, once the correction is finished, you can export your document (see the "Export details" section) !
+
+## Detail sections
+
+### Mapping
-- Start by "create a new project"
-- Go to "Menu" > "Manage documents" and import your XML, You can see your documents in
-"Project workflow" view
-- Apply "parse" on document one by one or apply "parse" on all documents
-- Go to "Menu" > "configuration", two use cases :
+The mapping is a table that references the labels you use for annotation with:
-1. You don't apply NER model, and use semantic@ to create manually annotated data :
-- First, define Annotation mapping ("NER Label" is display on annotation view, "Prefered index label" is display in export, define color) and save everytime
-- Then, go to "Project workflow" > "correct named entities" and start annotate
+- *Ner Label*: The default label use to annotate or use by your model;
+- *Prefered Index label*: The label that will appear in the output;
+- *Color*: label color in annotation view.
-2. You want to use NER recommenders to predict named entities (see the "NER configuration details section"):
-- First, select checkbox 'NER Recommenders'
-- then, Choose the language that corresponding to your model
-- then, Select the model and save
-- Adjust the mapping
-- Go to "Project workflow" > "Launch Ner"
-- After the process, go to "correct named entities" to correct the predictions
+You can add new labels to your existing schema via `Add new pair to mapping scope`.
-- Now you can export your document !
+Be careful if you remove a label from table, if your model has already made predictions or if you have started to correct document, all annotations will be destroyed.
-## NER configuration details section
+### NER configuration
-The NER framework that semantic@ use is SpaCy.
+Currently, Semantic@ uses the NER SpaCy framework, in the future other frameworks may be integrated.
-By default, the platform provides two small pre-trained model for French and English.
+When installing the Semantic@, two pre-trained models for French (fr_core_news_sm) and English (en_core_web_sm) are already available
-For add new [SpaCy pre-trained model](https://spacy.io/usage/models) :
+For add a new available [SpaCy pre-trained model](https://spacy.io/usage/models), before starting Semantic@, launch in terminal :
 ```bash
 python -m spacy download
 ```
-The SpaCy pre-trained language are sometimes slow and too generic for your data, you
-can use your own trained model, place your NER model folder under `/instance_config/my_features/my_models/`
+The new pre-trained model will be directly available in model list from `configuration`.
+
+Sometimes, SpaCy’s default in-built pre-trained NER model are too slow and too generic for your data (the model is far from perfect so it doesn't necessarily detect your labels).
+If you have training a better statistical NER model with SpaCy, you can place your NER model folder under `/instance_config/my_features/my_models/`
+
+Your model will be directly available in model list from `configuration`.
+
+### Export
+
+There are different XML export solutions :
+
+- `annotations inline (based on characters offsets)` (TEI specific): This export uses standoff converter and uses the positions of annotations in the text to produce output. It is precise but sometimes it takes time.
+- `annotations to controlaccess level` (EAD specific): This export tags annotations in a level of type .
+- `annotations inline (based on surface form matching)` (TEI & EAD): This export uses the surface shape of annotated mentions to tag the output. It is fast but sometimes less precise.
+
+
+- `annotations in JSON`: This export allows you to keep track of your annotations in a JSON format and import it directly into the annotation view.
 ## :crying_cat_face: Bug reports
+Feel free to create a new issue (new features, bug reports, documentation etc.).
+
 ## :computer: Stack
 ### Interface
@@ -112,8 +154,26 @@ can use your own trained model, place your NER model folder under `/instance_con
 ### Main Components
-## :bust_in_silhouette: Authors
+- [![Spacy](https://img.shields.io/badge/NLP%20with-SpaCy-blue)](https://spacy.io/)
+
+- [![RecogitoJS](https://img.shields.io/badge/Text%20annotation%20with-RecogitoJS-9cf)](https://github.com/recogito/recogito-js)
+
+- [![Standoffconverter](https://img.shields.io/badge/Annotations%20in%20TEI%20with-StandoffConverter-red)](https://github.com/standoff-nlp/standoffconverter)
+
+## :bust_in_silhouette: Mainteners
 - [@Lucaterre](https://github.com/Lucaterre)
+
+## How to cite
+
+Please use the following citation:
+
+    @misc{terriel-2022-semanticat,
+      title = "Semantic@ : a semantic annotation platform for enriching XML documents in TEI or EAD schemas with semantic annotations.",
+      author = "Terriel, Lucas",
+      year = "2022",
+      url = "https://github.com/Lucaterre/semanticat",
+    }
+
 [![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)
diff --git a/app/views/projectHandler.py b/app/views/projectHandler.py
index 6f7b6f2..4ad7877 100644
--- a/app/views/projectHandler.py
+++ b/app/views/projectHandler.py
@@ -6,7 +6,8 @@
     redirect,
     render_template,
     flash,
-    abort)
+    abort,
+    url_for)
 from app.config import (app, db)
@@ -36,8 +37,6 @@ def index():
             db.session.commit()
             flash(f'New project {project_name} created now.', category='success')
-            return render_template('main/project.edition.html',
-                                   projects=Project.return_all_projects())
         else:
             flash('Project always exist ! Please change the name.', category='warning')
     return render_template('main/project.edition.html',
@@ -53,6 +52,6 @@ def remove_project(project_id):
         db.session.delete(project)
         db.session.commit()
         flash(f'Project : {project.project_name} completely removed.', category='warning')
-        return redirect("/"), 200
+        return redirect(url_for('index'))
     else:
         abort(404, description="Project not found")
diff --git a/documentation/semanticat_demo.gif b/documentation/semanticat_demo.gif
new file mode 100644
index 0000000..17f0137
Binary files /dev/null and b/documentation/semanticat_demo.gif differ
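
Review note on the `CITATION.CFF` hunk in this patch: `date-released` is written as `2022-05-5`, while the Citation File Format 1.2.0 schema, as far as I can tell, expects a zero-padded `YYYY-MM-DD` date. The sketch below is not part of the commit. It is a hypothetical standard-library check (the helper name `is_strict_iso_date` is made up for illustration) that shows the committed value failing a strict ISO-date test while the zero-padded form passes.

```python
import re
from datetime import date

# Strict YYYY-MM-DD shape: four-digit year, two-digit month, two-digit day.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_strict_iso_date(value: str) -> bool:
    """Return True only for a real calendar date written as YYYY-MM-DD.

    Hypothetical helper for illustration; it is not taken from the repository.
    """
    # Reject anything that is not exactly zero-padded YYYY-MM-DD.
    if not ISO_DATE.fullmatch(value):
        return False
    # Reject well-shaped strings that are not real dates (e.g. month 13).
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


print(is_strict_iso_date("2022-05-5"))   # value as written in the diff: False
print(is_strict_iso_date("2022-05-05"))  # zero-padded form: True
```

If 5 May 2022 is the intended release date, `2022-05-05` would satisfy this check. A dedicated CFF validator would be the more thorough option before merging.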