Merge pull request #305 from UHHRobotics22-23/Flova-patch-1
Update vision README.md
ateRstones authored Oct 17, 2023
2 parents 8c61a67 + 80782fb commit e494af2
Showing 2 changed files with 12 additions and 3 deletions.
7 changes: 5 additions & 2 deletions marimbabot_vision/README.md
@@ -1,5 +1,8 @@
# TAMS Master Project 2022/2023 - Vision

This part of the repository stores all the code to run and train the transformer-based end-to-end vision pipeline.
The latest trained model is available on [Huggingface](https://huggingface.co/Flova/omr_transformer), which also provides a simple browser-based demo.

## Scripts
For more information on the usage of the scripts, please refer to the [README](../marimbabot_vision/scripts/README.md).

@@ -10,8 +13,8 @@ This will create a dataset of N samples. The dataset is saved in the `data`, `da
## Src
### [vision_node.py](src/vision_node.py)

-This ROS node is responsible for processing images from a camera source and recognizing notes in the images using a pre-trained model. It converts the image data into a textual representation of recognized musical notes and publishes them as ROS messages.
+This ROS node is responsible for processing images from a camera source and recognizing notes in the images using a pre-trained model. It converts the image data into a textual LilyPond representation and publishes it as ROS messages.
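The textual LilyPond representation mentioned above can be sketched with a small helper. This is a hypothetical illustration, not the node's actual code; the function name and token format are assumptions:

```python
# Hypothetical sketch: join recognized note tokens into a LilyPond
# snippet, the kind of string such a vision node could publish as a
# ROS String message.
def notes_to_lilypond(notes):
    """Wrap space-separated LilyPond note tokens in a relative block."""
    body = " ".join(notes)
    return "\\relative c' { " + body + " }"

print(notes_to_lilypond(["c4", "d4", "e4", "f4"]))
```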


### [visualization_node.py](src/visualization_node.py)
This ROS node receives recognized notes from the vision_node and generates visual representations of the musical notation. It uses the LilyPond library to create musical staff notation and publishes the resulting images as ROS messages for visualization.
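Rendering staff notation from a LilyPond string typically means writing the source to a `.ly` file and invoking the `lilypond` CLI. The helper below only builds the command list; the file layout and output base name are assumptions, not the node's actual implementation:

```python
# Hypothetical sketch: build the command one could pass to
# subprocess.run() to render a LilyPond source file to a PNG image.
# Requires the `lilypond` binary to be installed to actually run it.
def lilypond_cmd(ly_path, out_base):
    """Return the lilypond invocation that renders ly_path to out_base.png."""
    return ["lilypond", "--png", "-o", out_base, ly_path]

print(lilypond_cmd("score.ly", "score"))
```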
8 changes: 7 additions & 1 deletion marimbabot_vision/scripts/README.md
@@ -51,4 +51,10 @@ Trains a model on a set of given `train_data_paths`.
Trains the tokenizer on all text files defined by a glob expression.

### `detect.py`
-This script is used for live detection of notes. A trained model can be used to initialize. The current model is stored at HuggingFace and its path/name is set by the `MODEL_PATH` parameter inside `config/vision_node.yaml`. The detected notes are shown in a window.
+This script is used for detection of notes in a given image file. The current model stored on HuggingFace is downloaded and used by default. The detected notes are printed to the terminal.

### `attention_viz.py`
This script is similar to `detect.py`, but additionally shows the cross-attention to the image encoder as a heatmap.

### `eval.py`
This script can be used to evaluate a given model on a given dataset.
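Evaluating an OMR model usually boils down to comparing predicted token sequences against ground truth. The metric below is a hypothetical illustration (eval.py may compute something different, e.g. an edit-distance-based score):

```python
# Hypothetical sketch: positional token accuracy between a predicted
# and a reference LilyPond string, normalized by the longer sequence.
def token_accuracy(pred, target):
    """Return the share of matching tokens between pred and target."""
    p, t = pred.split(), target.split()
    if not p and not t:
        return 1.0
    matches = sum(a == b for a, b in zip(p, t))
    return matches / max(len(p), len(t))

print(token_accuracy("c4 d4 e4", "c4 d4 f4"))
```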
