📝 Update documentation
Lucaterre committed May 5, 2022
1 parent 7c331db commit 0bbb5b1
Showing 8 changed files with 172 additions and 37 deletions.
29 changes: 29 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,29 @@
---
name: Bug report
about: Create a report to help us improve

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Please complete the following information:**
- Version: [see bottom of the browser screen]
- OS: [e.g. Windows, Linux, OS X]
- Browser: [e.g. chrome, safari, firefox]

**Additional context**
Add any other context about the problem here.
9 changes: 9 additions & 0 deletions .github/ISSUE_TEMPLATE/documentation_request.md
@@ -0,0 +1,9 @@
---
name: Documentation request
about: Suggest an idea to enhance documentation

---

**What kind of information are you looking for? Please describe.**

**Describe the solution you'd like**
17 changes: 17 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,17 @@
---
name: Feature request
about: Suggest an idea for this project

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is.

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.
11 changes: 11 additions & 0 deletions .github/ISSUE_TEMPLATE/refactoring.md
@@ -0,0 +1,11 @@
---
name: Refactoring (for developers)
about: Refactor the application

---

**Describe the refactoring action**
A clear and concise description of what the action is.

**Expected benefit**
A clear and concise description of what you expect the refactoring to improve.
10 changes: 10 additions & 0 deletions CITATION.CFF
@@ -0,0 +1,10 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Terriel"
  given-names: "Lucas"
  orcid: "https://orcid.org/0000-0002-9189-258X"
title: "Semantic@"
version: 0.0.1
date-released: 2022-05-05
url: "https://github.com/Lucaterre/semanticat"
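
As a quick sanity check for a file like this one: the CFF 1.2.0 schema expects `date-released` as a zero-padded `YYYY-MM-DD` date. The sketch below is a minimal, stdlib-only check. The helper name `is_cff_date` is hypothetical and not part of any CFF tooling.

```python
import re
from datetime import date


def is_cff_date(value: str) -> bool:
    """Return True if value is a zero-padded, real calendar date (YYYY-MM-DD)."""
    # Shape check first: four digits, two digits, two digits.
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    year, month, day = (int(part) for part in value.split("-"))
    try:
        date(year, month, day)  # rejects impossible dates such as 2022-02-30
    except ValueError:
        return False
    return True


print(is_cff_date("2022-05-05"))  # True
print(is_cff_date("2022-05-5"))   # False: the day is not zero-padded
```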
126 changes: 93 additions & 33 deletions README.md
@@ -1,24 +1,31 @@
<!--<img src="" width=300 align=right>-->

![Python Version](https://img.shields.io/badge/python-3.8-blue) [![MIT License](https://img.shields.io/apm/l/atomic-design-ui.svg?)](https://github.com/tterb/atomic-design-ui/blob/master/LICENSEs) [![Semantic@ CI build](https://github.com/Lucaterre/semanticat/actions/workflows/CI.yml/badge.svg)](https://github.com/Lucaterre/semanticat/actions/workflows/CI.yml)
<!-- CI badge -->


# Semanti🐱

----

**WORK-IN-PROGRESS**

Semantic@ is a platform for enriching XML documents in TEI or EAD format with named entities.
Semantic@ is a semantic annotation platform for enriching XML documents in [TEI](https://tei-c.org/) or [EAD](https://www.loc.gov/ead/) schemas with semantic annotations.

After importing the document(s), apply the NER model and correct prediction or annotate manually from-zero and finally export and/or publish your XML with annotations inside.
Follow a simple workflow: after importing your document(s), apply an NER model and correct its predictions, or annotate manually from scratch, then export and/or publish your XML with the annotations embedded directly inside.

This platform is also designed to adapt to the diversity of publishing projects and to serve as a base for adding custom components.

## :battery: Installation

## :movie_camera: Demo

![semanticat_demo](./documentation/semanticat_demo.gif)

## :battery: Installation

1. Clone the GitHub repository

```bash
git clone
git clone https://github.com/Lucaterre/semanticat.git
```

2. Move inside the directory
@@ -48,6 +55,8 @@ pip install -r requirements.txt

## :rocket: Run Locally

:fire: This application is intended to be simple and local for the moment. **Please note that the application is currently optimized for the Firefox browser.**

Use the semantic@ CLI; inside the `semanticat/` directory, launch the command :

```bash
@@ -56,52 +65,85 @@ python run.py

Other arguments:

| **Type** | **Details** |
|-----------------------|----------------------------------------------|
| `--dev_mode` | Launch app in development mode |
| **Type** | **Details** |
|-----------------------|------------------------------------------|
| `--dev_mode` | Launch application in development mode |
| `--erase_recreate_db` | Clean and Restore all database :warning: |

## :arrow_forward: Quick Start

(TODO: create a wiki? With details on the exports, on how to annotate, on how to register an NER model, etc.)
## :arrow_forward: Getting started


- Start by creating a project with the `Create a new project` button, then open it;
- Go to `Menu` > `Manage documents` and import your XML files; your documents then appear in the
`Project workflow` view (you can mix EAD and TEI);
- In the `Project workflow` view, apply the `parse` feature to documents one by one, or apply `Parse All` to all documents at once;

- Go to `Menu` > `configuration`, two use cases :

1. You don't want to apply a NER model, and you want to manually annotate your data :
- First, define Annotation mapping (see the "Mapping details" section);
- Add labels with `Add new pair to mapping scope`;
- Then, go to `Project workflow` > `correct named entities` and start annotation.

2. You want to use an NER (recommender) model to predict named entities and correct them afterwards (see the "NER configuration details" section):
- First, select the `NER Recommenders` checkbox;
- Then, choose the language that corresponds to your model;
- Then, select the model and save;
- Wait for the pre-mapping to appear; you can then adapt it (see the "Mapping details" section);
- Go to `Project workflow` > `Launch Ner` (or `Launch Ner on all`);
- When the process is complete, go to `correct named entities` and correct the predictions or add annotations.

- Whatever the chosen scenario, once the correction is finished, you can export your document (see the "Export details" section)!

## Detail sections

### Mapping

- Start by "create a new project"
- Go to "Menu" > "Manage documents" and import your XML, You can see your documents in
"Project workflow" view
- Apply "parse" on document one by one or apply "parse" on all documents
- Go to "Menu" > "configuration", two use cases :
The mapping is a table that describes each label you use for annotation with:

1. You don't apply NER model, and use semantic@ to create manually annotated data :
- First, define Annotation mapping ("NER Label" is display on annotation view, "Prefered index label" is display in export, define color) and save everytime
- Then, go to "Project workflow" > "correct named entities" and start annotate
- *Ner Label*: the default label used for annotation or by your model;
- *Prefered Index label*: the label that will appear in the output;
- *Color*: the label's color in the annotation view.

2. You want to use NER recommenders to predict named entities (see the "NER configuration details section"):
- First, select checkbox 'NER Recommenders'
- then, Choose the language that corresponding to your model
- then, Select the model and save
- Adjust the mapping
- Go to "Project workflow" > "Launch Ner"
- After the process, go to "correct named entities" to correct the predictions
You can add new labels to your existing schema via `Add new pair to mapping scope`.

- Now you can export your document !
Be careful when removing a label from the table: if your model has already made predictions, or if you have started correcting a document, all annotations will be destroyed.
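
One way to picture the mapping table is as a list of records, one per label. The sketch below is purely illustrative: the `LabelMapping` class, its field names, and the sample labels and colors are assumptions, not Semantic@'s actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelMapping:
    ner_label: str    # label used while annotating, or emitted by the model
    index_label: str  # label written to the exported XML
    color: str        # display color in the annotation view


# Hypothetical mapping for a small label set.
MAPPING = [
    LabelMapping("PER", "persName", "#ffb74d"),
    LabelMapping("LOC", "placeName", "#81c784"),
]


def export_label(ner_label: str) -> str:
    """Look up the output label for a given annotation label."""
    for row in MAPPING:
        if row.ner_label == ner_label:
            return row.index_label
    raise KeyError(f"label {ner_label!r} is not in the mapping")


print(export_label("PER"))  # persName
```

In this picture, removing a row leaves existing annotations with that label without an output label, which is one way to understand why removing a label is flagged as risky.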

## NER configuration details section
### NER configuration

The NER framework that semantic@ use is SpaCy.
Currently, Semantic@ uses SpaCy as its NER framework; other frameworks may be integrated in the future.

By default, the platform provides two small pre-trained model for French and English.
When Semantic@ is installed, two pre-trained models for French (`fr_core_news_sm`) and English (`en_core_web_sm`) are already available.

For add new [SpaCy pre-trained model](https://spacy.io/usage/models) :
To add another available [SpaCy pre-trained model](https://spacy.io/usage/models), run the following in a terminal before starting Semantic@:

```bash
python -m spacy download <name-pretrained-model>
```

The SpaCy pre-trained language are sometimes slow and too generic for your data, you
can use your own trained model, place your NER model folder under `/instance_config/my_features/my_models/`
The new pre-trained model will then be directly available in the model list under `configuration`.

Sometimes, SpaCy's built-in pre-trained NER models are too slow or too generic for your data (they are far from perfect, so they will not necessarily detect your labels).
If you have trained a better statistical NER model with SpaCy, you can place your NER model folder under `/instance_config/my_features/my_models/`.

Your model will then be directly available in the model list under `configuration`.

### Export

There are several XML export options:

- `annotations inline (based on characters offsets)` (TEI specific): This export uses Standoffconverter and the character positions of the annotations in the text to produce the output. It is precise but can take some time.
- `annotations to controlaccess level` (EAD specific): This export tags annotations in a `<controlaccess>` level.
- `annotations inline (based on surface form matching)` (TEI & EAD): This export uses the surface form of the annotated mentions to tag the output. It is fast but sometimes less precise.


- `annotations in JSON`: This export allows you to keep track of your annotations in a JSON format and import it directly into the annotation view.
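
The trade-off between the two inline exports can be sketched on plain text. This is not Semantic@'s exporter; the function names, the `(start, end, tag)` annotation tuples, and the sample sentence are assumptions made for illustration, and over-tagging is only one plausible way the surface-form approach can lose precision.

```python
def tag_by_offsets(text, annotations):
    """Wrap each (start, end, tag) span, targeting exact character positions."""
    out = text
    # Insert from the end of the text so earlier offsets stay valid.
    for start, end, tag in sorted(annotations, reverse=True):
        out = f"{out[:start]}<{tag}>{out[start:end]}</{tag}>{out[end:]}"
    return out


def tag_by_surface_form(text, mentions):
    """Wrap every occurrence of each (surface, tag) mention in the text."""
    out = text
    for surface, tag in mentions:
        out = out.replace(surface, f"<{tag}>{surface}</{tag}>")
    return out


text = "Paris wrote to Paris."
# Only the second "Paris" (characters 15 to 20) was annotated as a place.
print(tag_by_offsets(text, [(15, 20, "placeName")]))
print(tag_by_surface_form(text, [("Paris", "placeName")]))
```

The offset-based call tags only the annotated occurrence, while the surface-form call tags both occurrences of "Paris", including the one that was never annotated.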

## :crying_cat_face: Bug reports

Feel free to create a new issue (new features, bug reports, documentation etc.).

## :computer: Stack

### Interface
@@ -112,8 +154,26 @@ can use your own trained model, place your NER model folder under `/instance_con

### Main Components

## :bust_in_silhouette: Authors
- [![Spacy](https://img.shields.io/badge/NLP%20with-SpaCy-blue)](https://spacy.io/)

- [![RecogitoJS](https://img.shields.io/badge/Text%20annotation%20with-RecogitoJS-9cf)](https://github.com/recogito/recogito-js)

- [![Standoffconverter](https://img.shields.io/badge/Annotations%20in%20TEI%20with-StandoffConverter-red)](https://github.com/standoff-nlp/standoffconverter)

## :bust_in_silhouette: Maintainers

- [@Lucaterre](https://github.com/Lucaterre)


## How to cite

Please use the following citation:

@misc{terriel-2022-semanticat,
title = "Semantic@ : a semantic annotation platform for enriching XML documents in TEI or EAD schemas with semantic annotations.",
author = "Terriel, Lucas",
year = "2022",
url = "https://github.com/Lucaterre/semanticat",
}

[![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)
7 changes: 3 additions & 4 deletions app/views/projectHandler.py
@@ -6,7 +6,8 @@
                    redirect,
                    render_template,
                    flash,
-                   abort)
+                   abort,
+                   url_for)

 from app.config import (app,
                         db)
@@ -36,8 +37,6 @@ def index():
             db.session.commit()
             flash(f'New project {project_name} created now.',
                   category='success')
-            return render_template('main/project.edition.html',
-                                   projects=Project.return_all_projects())
         else:
             flash('Project always exist ! Please change the name.', category='warning')
     return render_template('main/project.edition.html',
@@ -53,6 +52,6 @@ def remove_project(project_id):
         db.session.delete(project)
         db.session.commit()
         flash(f'Project : {project.project_name} completely removed.', category='warning')
-        return redirect("/"), 200
+        return redirect(url_for('index'))
     else:
         abort(404, description="Project not found")
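
The last hunk replaces `return redirect("/"), 200` with `return redirect(url_for('index'))`. A plausible motivation, sketched below with a toy app rather than the project's code: when a Flask view returns a `(response, status)` tuple, the explicit status overrides the one set by `redirect`, so the client receives a 200 and has no reason to follow the `Location` header. The `/old` and `/new` routes are hypothetical.

```python
from flask import Flask, redirect, url_for

app = Flask(__name__)


@app.route("/")
def index():
    return "home"


@app.route("/old")
def old_style():
    # The explicit 200 overrides the 302 set by redirect().
    return redirect("/"), 200


@app.route("/new")
def new_style():
    # The status stays 302 and the target is resolved from the endpoint name.
    return redirect(url_for("index"))


client = app.test_client()
old_response = client.get("/old")
new_response = client.get("/new")
print(old_response.status_code, new_response.status_code)  # 200 302
```

Using `url_for('index')` instead of a hard-coded `"/"` also keeps the redirect correct if the route's URL rule changes later.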
Binary file added documentation/semanticat_demo.gif