diff --git a/README.md b/README.md
index aa36d3e..22afef5 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,246 @@
-# tello-gesture-control
+# DJI Tello Hand Gesture control
+
+The main goal of this project is to control the drone using hand gestures, without gloves or any additional equipment.
+Just the camera on the drone (or, soon, your smartphone), a laptop, and a human hand.
+
+demo_gif
+
+
+## Index
+1. [Introduction](#Introduction)
+2. [Setup](#Setup)
+    1. [Install pip packages](#1.-Installing-pip-packages)
+    2. [Connect and test Tello](#2.-Connect-Tello)
+3. [Usage](#Usage)
+    * [Keyboard control](#Keyboard-control)
+    * [Gesture control](#Gesture-control)
+4. [Adding new gestures](#Adding-new-gestures)
+    * [Technical description](#Technical-details-of-gesture-detector)
+    * [Creating dataset](#Creating-dataset-with-new-gestures)
+    * [Retrain model](#Notebook-for-retraining-model)
+5. [Repository structure](#Repository-structure)
+
+## Introduction
+This project relies on two main parts: the DJI Tello drone and MediaPipe's fast hand keypoint recognition.
+
+DJI Tello is a perfect drone for any kind of programming experiment. It has a rich Python API (a Swift API is also available) that gives almost full control of the drone, supports drone swarms, and exposes the camera for computer vision.
+
+MediaPipe is an amazing ML platform with many robust solutions, such as Face Mesh, Hand Keypoint detection and Objectron. Moreover, its models can run on mobile platforms with on-device acceleration.
+
+Here is the starter pack you need:
+
+starter_pack
+
+## Setup
+### 1. Installing pip packages
+First, we need to install the Python dependencies. Make sure that you are using `python3.7`.
+
+List of packages:
+```sh
+ConfigArgParse == 1.2.3
+djitellopy == 1.5
+numpy == 1.19.3
+opencv_python == 4.5.1.48
+tensorflow == 2.4.1
+mediapipe == 0.8.2
+```
+
+Install:
+```sh
+pip3 install -r requirements.txt
+```
+### 2. Connect Tello
+Turn on the drone and connect your computer to its WiFi:
+
+wifi_connection
+
+
+Next, run the following script to verify connectivity:
+
+```sh
+python3 tests/connection_test.py
+```
+
+On a successful connection you will see:
+
+```
+1. Connection test:
+Send command: command
+Response: b'ok'
+
+
+2. Video stream test:
+Send command: streamon
+Response: b'ok'
+```
+
+If you get the following output instead, check your connection with the drone:
+
+```
+1. Connection test:
+Send command: command
+Timeout exceed on command command
+Command command was unsuccessful. Message: False
+
+
+2. Video stream test:
+Send command: streamon
+Timeout exceed on command streamon
+Command streamon was unsuccessful. Message: False
+```
+
+## Usage
+The most interesting part is the demo. There are two types of control: keyboard and gesture. You can switch between control types during the flight. Below is a complete description of both.
+
+Run the following command to start Tello control:
+
+```sh
+python3 main.py
+```
+
+This script opens a window with a visualization like this:
+
+window
+
+
+### Keyboard control
+(To control the drone with your keyboard, first press the `k` key to toggle keyboard controls.)
+
+The following is a list of keys and their actions:
+
+* `k` -> Toggle Keyboard controls
+* `g` -> Toggle Gesture controls
+* `Left Shift` -> Take off drone #TODO
+* `Space` -> Land drone
+* `w` -> Move forward
+* `s` -> Move back
+* `a` -> Move left
+* `d` -> Move right
+* `e` -> Rotate clockwise
+* `q` -> Rotate counter-clockwise
+* `r` -> Move up
+* `f` -> Move down
+* `Esc` -> End program and land the drone
+
+
+### Gesture control
+
+Pressing `g` activates gesture control mode. Here is the full list of gestures that are currently available:
+
+gestures_list
+
+## Adding new gestures
+The hand gesture detector lets you add and change training data so the model can be retrained on your own gestures. Before doing this, it is worth understanding the technical details of the detector: how it works and how it can be improved.
+### Technical details of gesture detector
+MediaPipe hand keypoint recognition returns the 3D coordinates of 21 hand landmarks. For our
+model we use only the 2D coordinates:
+
+gestures_list
+
+
+Then these points are preprocessed for training the model in the following way:
+
+preprocessing
+
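+
+To make this concrete, below is an illustrative, self-contained version of that preprocessing (the real implementation lives in `gestures/gesture_recognition.py`; the function name here is ours):
+
+```python
+import numpy as np
+
+
+def preprocess_landmarks(landmark_points):
+    """landmark_points: list of 21 (x, y) keypoints from MediaPipe."""
+    points = np.array(landmark_points, dtype=np.float32)
+    # 1. Make all coordinates relative to the wrist (landmark 0)
+    points = points - points[0]
+    # 2. Flatten to a single 42-value feature vector
+    features = points.flatten()
+    # 3. Scale by the largest absolute value so features lie in [-1, 1]
+    max_abs = np.abs(features).max()
+    if max_abs > 0:
+        features /= max_abs
+    return features
+
+
+# Toy example with a fake hand
+print(preprocess_landmarks([(10 * i, 5 * i) for i in range(21)])[:4])
+```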
+
+After that, we can use the data to train our model. The keypoint classifier is a simple neural network with the following
+structure:
+
+model_structure
+
+
+
+_Check [here](#Grid-Search) to understand how the architecture was selected._
+### Creating dataset with new gestures
+First, pull the datasets from Git LFS. [Here](https://github.com/git-lfs/git-lfs/wiki/Installation) are the instructions on how
+to install LFS. Then run the following commands to pull the default CSV files:
+```sh
+git lfs install
+git lfs pull
+```
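+
+If you want to double-check that LFS fetched the real data rather than pointer files, a quick, optional sanity check is to count the rows per class ID (first column) in the default dataset — a small illustrative snippet:
+
+```python
+import csv
+from collections import Counter
+
+# Counts training examples per class ID in keypoint.csv.
+# A leftover Git LFS pointer file would produce strange keys here.
+with open('model/keypoint_classifier/keypoint.csv') as f:
+    counts = Counter(row[0] for row in csv.reader(f) if row)
+print(counts)
+```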
+
+After that, run `main.py` and press "n" to enter the mode to save key points
+(displayed as **MODE:Logging Key Point**):
+
+writing_mode
+
+
+If you press "0" to "9", the key points will be added to [model/keypoint_classifier/keypoint.csv](model/keypoint_classifier/keypoint.csv) as shown below.
+1st column: pressed number (class ID); 2nd and subsequent columns: keypoint coordinates.
+
+keypoints_table
+
+In the initial state, 7 types of learning data are included, as shown [here](#Gesture-control). If necessary, add new class IDs or delete existing rows from the CSV to prepare your own training data.
+### Notebook for retraining model
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kinivi/tello-gesture-control/blob/main/Keypoint_model_training.ipynb)
+
+Open [Keypoint_model_training.ipynb](Keypoint_model_training.ipynb) in Jupyter Notebook or Google Colab.
+Change the number of training data classes (the value of **NUM_CLASSES = 3**) and the path to the dataset. Then execute all cells
+and download the `.tflite` model.
+
+notebook_gif
+
+
+Do not forget to modify or add labels in `"model/keypoint_classifier/keypoint_classifier_label.csv"`.
+
+#### Grid Search
+❗️ Important ❗️ The last part of the notebook is experimental: its main purpose is to test hyperparameters of the model structure. In a nutshell, it is a grid search using TensorBoard visualization. Feel free to use it for your own experiments.
+
+
+grid_search
+
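+
+For reference, here is a minimal sketch of the kind of compact classifier the notebook trains. It is illustrative only: the exact architecture is defined in [Keypoint_model_training.ipynb](Keypoint_model_training.ipynb) and was selected with the grid search described above.
+
+```python
+import tensorflow as tf
+
+NUM_CLASSES = 7  # set to the number of rows in keypoint_classifier_label.csv
+
+model = tf.keras.Sequential([
+    tf.keras.layers.InputLayer(input_shape=(21 * 2,)),  # 21 keypoints x (x, y)
+    tf.keras.layers.Dropout(0.2),
+    tf.keras.layers.Dense(20, activation='relu'),
+    tf.keras.layers.Dropout(0.4),
+    tf.keras.layers.Dense(10, activation='relu'),
+    tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
+])
+model.compile(optimizer='adam',
+              loss='sparse_categorical_crossentropy',
+              metrics=['accuracy'])
+model.summary()
+
+# The notebook then converts the trained model to .tflite, roughly like this:
+converter = tf.lite.TFLiteConverter.from_keras_model(model)
+open('keypoint_classifier.tflite', 'wb').write(converter.convert())
+```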
+## Repository structure
+<pre>
+│  main.py
+│  Keypoint_model_training.ipynb
+│  config.txt
+│  requirements.txt
+│  
+├─model
+│  └─keypoint_classifier
+│      │  keypoint.csv
+│      │  keypoint_classifier.hdf5
+│      │  keypoint_classifier.py
+│      │  keypoint_classifier.tflite
+│      └─ keypoint_classifier_label.csv
+│ 
+├─gestures
+│   │  gesture_recognition.py
+│   │  tello_gesture_controller.py
+│   └─ tello_keyboard_controller.py
+│          
+├─tests
+│   └─connection_test.py
+│ 
+└─utils
+    └─cvfpscalc.py
+</pre>
+
+### main.py
+The main application, which runs gesture recognition and controls the drone.
+It also includes a mode to collect training data for adding new gestures.
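+
+For orientation, here is a condensed sketch of the djitellopy calls that `main.py` builds on (each of them appears in this repository); it is not the app itself, and it will actually fly a connected drone:
+
+```python
+from djitellopy import Tello
+
+# Run only with a Tello connected and enough room to fly!
+tello = Tello()
+tello.connect()
+tello.takeoff()
+
+# send_rc_control(left_right, forward_back, up_down, yaw), each in -100..100.
+# The gesture and keyboard controllers map user input onto these velocities.
+tello.send_rc_control(0, 30, 0, 0)  # drift forward
+tello.send_rc_control(0, 0, 0, 0)   # hover
+
+tello.land()
+tello.end()
+```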
+
+### Keypoint_model_training.ipynb
+This is the model training notebook for hand gesture recognition.
+
+### model/keypoint_classifier
+This directory stores files related to gesture recognition:
+
+* Training data (keypoint.csv)
+* Trained model (keypoint_classifier.tflite)
+* Label data (keypoint_classifier_label.csv)
+* Inference module (keypoint_classifier.py)
+
+### gestures/
+This directory stores files related to drone controllers and gesture modules:
+
+* Keyboard controller (tello_keyboard_controller.py)
+* Gesture controller (tello_gesture_controller.py)
+* Gesture recognition module (gesture_recognition.py)
+
+### utils/cvfpscalc.py
+Module for FPS measurement.
+
+# TODO
+- [ ] Motion gesture support (LSTM)
diff --git a/config.txt b/config.txt
index cc6f06f..267a3f1 100644
--- a/config.txt
+++ b/config.txt
@@ -1,6 +1,7 @@
 device = 0
 width = 960
 height = 540
-min_detection_confidence = 0.5
+min_detection_confidence = 0.7
 min_tracking_confidence = 0.5
+buffer_len = 5
 is_keyboard = True
\ No newline at end of file
diff --git a/gestures/gesture_recognition.py b/gestures/gesture_recognition.py
index 3660159..5e95abe 100644
--- a/gestures/gesture_recognition.py
+++ b/gestures/gesture_recognition.py
@@ -458,12 +458,12 @@ def _draw_info_text(self, image, brect, handedness, hand_sign_text,
         cv.putText(image, info_text, (brect[0] + 5, brect[1] - 4),
                    cv.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 1, cv.LINE_AA)
-        if finger_gesture_text != "":
-            cv.putText(image, "Finger Gesture:" + finger_gesture_text, (10, 60),
-                       cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 0), 4, cv.LINE_AA)
-            cv.putText(image, "Finger Gesture:" + finger_gesture_text, (10, 60),
-                       cv.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), 2,
-                       cv.LINE_AA)
+        # if finger_gesture_text != "":
+        #     cv.putText(image, "Finger Gesture:" + finger_gesture_text, (10, 60),
+        #                cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 0), 4, cv.LINE_AA)
+        #     cv.putText(image, "Finger Gesture:" + finger_gesture_text, (10, 60),
+        #                cv.FONT_HERSHEY_SIMPLEX, 1.0, (255, 255, 255), 2,
+        #                cv.LINE_AA)
 
         return image
diff --git a/gestures/tello_gesture_controller.py b/gestures/tello_gesture_controller.py
index c31cabb..da3824e 100644
--- a/gestures/tello_gesture_controller.py
+++ b/gestures/tello_gesture_controller.py
@@ -1,5 +1,6 @@
 from djitellopy import Tello
 
+
 class TelloGestureController:
     def __init__(self, tello: Tello):
         self.tello = tello
@@ -16,23 +17,33 @@ def gesture_control(self, gesture_buffer):
         print("GESTURE", gesture_id)
 
         if not self._is_landing:
-            if gesture_id == 0:
+            if gesture_id == 0:  # Forward
                 self.forw_back_velocity = 30
-            elif gesture_id == 1:
+            elif gesture_id == 1:  # STOP
+                self.forw_back_velocity = self.up_down_velocity = \
+                    self.left_right_velocity = self.yaw_velocity = 0
+            elif gesture_id == 5:  # Back
                 self.forw_back_velocity = -30
-            elif gesture_id == 2:
-                self.forw_back_velocity = 0
-            elif gesture_id == 3:
+
+            elif gesture_id == 2:  # UP
+                self.up_down_velocity = 25
+            elif gesture_id == 4:  # DOWN
+                self.up_down_velocity = -25
+
+            elif gesture_id == 3:  # LAND
                 self._is_landing = True
                 self.forw_back_velocity = self.up_down_velocity = \
                     self.left_right_velocity = self.yaw_velocity = 0
                 self.tello.land()
+
+            elif gesture_id == 6:  # LEFT
+                self.left_right_velocity = 20
+            elif gesture_id == 7:  # RIGHT
+                self.left_right_velocity = -20
+
             elif gesture_id == -1:
                 self.forw_back_velocity = self.up_down_velocity = \
                     self.left_right_velocity = self.yaw_velocity = 0
 
         self.tello.send_rc_control(self.left_right_velocity, self.forw_back_velocity,
                                    self.up_down_velocity, self.yaw_velocity)
-
-
-
diff --git a/main.py b/main.py
index efe6024..f729a19 100644
--- a/main.py
+++ b/main.py
@@ -29,6 +29,9 @@ def get_args():
     parser.add("--min_tracking_confidence",
                help='min_tracking_confidence',
                type=float)
+    parser.add("--buffer_len",
+               help='Length of gesture buffer',
+               type=int)
 
     args = parser.parse_args()
 
@@ -64,7 +67,7 @@ def main():
     # Take-off drone
-    # tello.takeoff()
+    tello.takeoff()
 
     cap = tello.get_frame_read()
 
@@ -74,7 +77,7 @@ def main():
     gesture_detector = GestureRecognition(args.use_static_image_mode,
                                           args.min_detection_confidence,
                                           args.min_tracking_confidence)
-    gesture_buffer = GestureBuffer(buffer_len=5)
+    gesture_buffer = GestureBuffer(buffer_len=args.buffer_len)
 
     def tello_control(key, keyboard_controller, gesture_controller):
         global gesture_buffer
 
@@ -137,7 +140,7 @@ def tello_battery(tello):
         # Battery status and image rendering
         cv.putText(debug_image, "Battery: {}".format(battery_status), (5, 720 - 5),
                    cv.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
-        cv.imshow('Hand Gesture Recognition', debug_image)
+        cv.imshow('Tello Gesture Recognition', debug_image)
 
     tello.land()
     tello.end()
diff --git a/model/keypoint_classifier/keypoint_classifier_label.csv b/model/keypoint_classifier/keypoint_classifier_label.csv
index 3198d4b..0969a59 100644
--- a/model/keypoint_classifier/keypoint_classifier_label.csv
+++ b/model/keypoint_classifier/keypoint_classifier_label.csv
@@ -1,7 +1,7 @@
 Forward
 Stop
 Up
-OK
+Land
 Down
 Back
 Left
diff --git a/tests/connection_test.py b/tests/connection_test.py
new file mode 100644
index 0000000..d9be227
--- /dev/null
+++ b/tests/connection_test.py
@@ -0,0 +1,14 @@
+from djitellopy import Tello
+
+if __name__ == '__main__':
+
+    print('1. Connection test:')
+    tello = Tello()
+    tello.connect()
+    print('\n')
+
+    print('2. Video stream test:')
+    tello.streamon()
+    print('\n')
+
+    tello.end()