# TextSLAM: Visual SLAM with Semantic Planar Text Features

**Authors**: [Boying Li](https://leeby68.github.io/), [Danping Zou](https://drone.sjtu.edu.cn/dpzou/), Yuan Huang, Xinghan Niu, Ling Pei and Wenxian Yu.

🏠 [[Project]](https://leeby68.github.io/TextSLAM/)
 
📝 [[Paper]](https://ieeexplore.ieee.org/abstract/document/10285400)
 
➡️ [[Dataset]](https://github.com/SJTU-ViSYS/TextSLAM-Dataset/)
 
🔧 [[Extra Evaluation Tool]](https://github.com/SJTU-ViSYS/SLAM_Evaluation/)


**Motivation:**

:star: TextSLAM is a novel visual Simultaneous Localization and Mapping (SLAM) system tightly coupled with semantic text objects.

:bulb: Humans can read text and use scene text, such as road markings and room names, to navigate complex environments. Why not robots?

:star: TextSLAM explores scene text as a basic feature, both geometrically and semantically. It achieves superior performance even in challenging conditions such as image blur, large viewpoint changes, and significant illumination variations (day and night).

This repository provides the **C++ implementation of the TextSLAM system**.

<img src="./pic/Image_TextSLAM_dataset.png" width ="800" align = "center" />
<div align='center'><em>Overview of TextSLAM dataset</em></div>
&emsp;
<img src="./pic/Image_TextSLAM.png" width ="800" align = "center" />
<div align='center'><em>Overview of TextSLAM</em></div>
&emsp;
&emsp;

The papers of this project:
- [TextSLAM: Visual SLAM with Semantic Planar Text Features](https://arxiv.org/abs/2305.10029)
Boying Li, Danping Zou, Yuan Huang, Xinghan Niu, Ling Pei, Wenxian Yu
- [TextSLAM: Visual SLAM with Planar Text Features](https://arxiv.org/abs/1912.05002)
Boying Li, Danping Zou, Daniele Sartori, Ling Pei, Wenxian Yu

Our **accompanying videos** are available on YouTube (click the images below to open) and Bilibili<sup>[1-outdoor](https://www.bilibili.com/video/BV1pe411B7kx/?spm_id_from=333.999.0.0&vd_source=404d99588f2e4c0ce1cca75ed492e620), [2-night](https://www.bilibili.com/video/BV1kC4y1M7tk/?spm_id_from=333.999.0.0&vd_source=404d99588f2e4c0ce1cca75ed492e620), [3-rapid](https://www.bilibili.com/video/BV1Au4y1T7DE/?spm_id_from=333.999.0.0&vd_source=404d99588f2e4c0ce1cca75ed492e620)</sup>.
<div align="center">
<a href="https://youtu.be/ug-FvJKTXJY" target="_blank"><img src="https://github.com/SJTU-ViSYS/TextSLAM/blob/main/pic/TextSLAM-frontPage.png" alt="video" width="32%" /></a>
<a href="https://youtu.be/PYrZ5kiIC0Q" target="_blank"><img src="https://github.com/SJTU-ViSYS/TextSLAM/blob/main/pic/TextSLAM-frontPage.png" alt="video" width="32%" /></a>
<a href="https://youtu.be/3Ml6070Hgd8" target="_blank"><img src="https://github.com/SJTU-ViSYS/TextSLAM/blob/main/pic/TextSLAM-frontPage.png" alt="video" width="32%" /></a>
</div>

:star: Please consider citing the following papers in your publications if the project helps your work.
```
@article{li2023textslam,
title={TextSLAM: Visual SLAM with Semantic Planar Text Features},
author={Li, Boying and Zou, Danping and Huang, Yuan and Niu, Xinghan and Pei, Ling and Yu, Wenxian},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},
year={2023}
}
@inproceedings{li2020textslam,
title={TextSLAM: Visual SLAM with Planar Text Features},
author={Li, Boying and Zou, Danping and Sartori, Daniele and Pei, Ling and Yu, Wenxian},
booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
year={2020}
}
```

# Getting Started

## Dataset Download

Download the dataset from [**TextSLAM Dataset**](https://github.com/SJTU-ViSYS/TextSLAM-Dataset/).

## 1. Prerequisites

TextSLAM has been tested on Ubuntu 16.04. It should be straightforward to compile on other Linux versions.

**1.1 Ceres & Eigen3**

Refer to the [**Ceres**](http://ceres-solver.org/installation.html#linux) installation guide to install it on Linux.

**Eigen3** is installed as a dependency during this process.
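
For convenience, a condensed sketch of the usual Linux steps from the Ceres installation guide (package names may vary with your Ubuntu version):
```
# Dependencies (from the Ceres installation guide)
sudo apt-get install cmake libgoogle-glog-dev libgflags-dev libatlas-base-dev libeigen3-dev libsuitesparse-dev
# Build and install Ceres from source
git clone https://github.com/ceres-solver/ceres-solver.git
cd ceres-solver && mkdir build && cd build
cmake .. && make -j && sudo make install
```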

**1.2 OpenCV**

We use **OpenCV** 3.3.1 for image processing.

You can use the **OpenCV library** provided by ROS. Remember to set `OpenCV_DIR` in CMakeLists.txt using `set(OpenCV_DIR [ros_directory]/share/OpenCV-3.3.1-dev)`.

You can also refer to [**OpenCV**](http://opencv.org) to download and install the library.
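
For example, the relevant lines in CMakeLists.txt might look like the following (the path is illustrative, taken from a typical ROS Kinetic install on Ubuntu 16.04; adjust it to your system):
```
# Point CMake at the ROS-provided OpenCV build before find_package
# (illustrative path; depends on your ROS installation)
set(OpenCV_DIR /opt/ros/kinetic/share/OpenCV-3.3.1-dev)
find_package(OpenCV 3.3.1 REQUIRED)
```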

**1.3 EVO (Evaluation)**

EVO is used to evaluate the SLAM results. Refer to [**EVO**](https://github.com/MichaelGrupp/evo) to install this evaluation tool.
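
EVO can typically be installed via pip (see the EVO repository for the authoritative instructions):
```
pip install evo --upgrade --no-binary evo
```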

## 2. Build and Run

**2.1 Clone the repository and build the project:**
```
git clone https://github.com/SJTU-ViSYS/TextSLAM.git
cd TextSLAM
mkdir build
cd build
cmake ..
make -j
```
The above procedure creates an executable named `TextSLAM`. For better runtime performance, you can request an optimized build with `cmake .. -DCMAKE_BUILD_TYPE=Release`.

**2.2 Run TextSLAM with:**
```
./TextSLAM [yaml_path]/[yaml_name].yaml
```
We provide yaml files (`GeneralMotion.yaml`, `AIndoorLoop.yaml`, `LIndoorLoop.yaml`, `Outdoor.yaml`) for our four kinds of experiments.
Set the save path of your sequence in the `Exp read path:` entry of the yaml file.

Refer to [TextSLAM Dataset](https://github.com/SJTU-ViSYS/TextSLAM-Dataset/) for the detailed yaml file structure, sketched below.
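
As a minimal illustrative sketch (the dataset repository documents the full set of entries; the path below is a placeholder):
```
# Set this to where you saved the downloaded sequence (placeholder path):
Exp read path: /path/to/TextSLAM-Dataset/sequence01/
```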

**2.3 Output:**
`keyframe_latest.txt` records the pose estimates of all keyframes at the current state and is updated while the system runs.
`keyframe.txt` is written once a sequence finishes.
Both `keyframe_latest.txt` and `keyframe.txt` are in TUM format: `timestamp tx ty tz qx qy qz qw`.
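
For illustration, a line of these files would look like this (values are made up):
```
# timestamp tx ty tz qx qy qz qw   (illustrative values)
1317354879.1 0.2021 -0.0513 1.0348 0.0021 -0.0154 0.7069 0.7071
```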

## 3. Evaluation

We use EVO to evaluate the SLAM performance.

**For APE evaluation:**
```
evo_ape tum gt.txt text.txt -va -s
```

**For RPE evaluation with a delta unit of 1.0 m:**
```
evo_rpe tum gt.txt text.txt -va -s --pose_relation trans_part -d 1.0 -u m
```
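
In both commands, `-v` enables verbose output, `-a` aligns the estimated trajectory to the ground truth, and `-s` additionally corrects the scale, which matters for monocular SLAM; `--pose_relation trans_part` evaluates the translation part, and `-d 1.0 -u m` sets a relative delta of 1.0 meter.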

For the loop tests in the large indoor scene, add `--n_to_align XX` to align only the first XX poses of the trajectory.
Because ground truth for this sequence is available only at the beginning and the end, aligning on the first poses yields more reliable results.
```
evo_ape tum gt.txt text.txt -va -s --n_to_align XX
evo_rpe tum gt.txt text.txt -va -s --pose_relation trans_part -d 1.0 -u m --n_to_align XX
```

:heavy_exclamation_mark: **ATTENTION for RPE evaluation:** :heavy_exclamation_mark:

EVO does not automatically rectify the misalignment between the SLAM body frame and the ground-truth body frame, which influences the RPE results.

To solve this problem, we provide an [extra Evaluation tool](https://github.com/SJTU-ViSYS/SLAM_Evaluation) for the TextSLAM dataset, which also serves as **a supplement to EVO**.

Follow the instructions of the [extra Evaluation tool](https://github.com/SJTU-ViSYS/SLAM_Evaluation) to first obtain the **updated pose ground truth** file, then use the updated GT file to evaluate RPE.
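
Schematically, the RPE workflow then becomes (file names are placeholders; follow the tool's instructions for the actual ones):
```
# 1. Generate the updated ground truth with the SLAM_Evaluation tool (see its README)
# 2. Evaluate RPE against the updated file:
evo_rpe tum gt_updated.txt text.txt -va -s --pose_relation trans_part -d 1.0 -u m
```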

This step is necessary for **all data except the outdoor sequences**.
The ground truth of the outdoor sequences is generated with COLMAP, which expresses it in the same body frame as the SLAM estimates.

# Acknowledgement

The authors thank [ORB-SLAM](https://github.com/raulmur/ORB_SLAM2), [DSO](https://github.com/JakobEngel/dso), and [AttentionOCR](https://github.com/zhang0jhon/AttentionOCR) for their excellent work.
The authors thank [EVO](https://github.com/MichaelGrupp/evo) for providing this convenient evaluation tool.
The authors thank [Ceres](http://ceres-solver.org) for providing this powerful optimization library.
