3D Object Recognition from 2D Projection

Overview

In this project, we aim to recognize the shape of a 3D object based on an image of its shadow. The idea was inspired by the video "The average area of a shadow" by 3Blue1Brown.

This project generates shadows of several synthetic 3D shapes and classifies the objects from those shadows using machine learning models, including multinomial logistic regression, deep neural networks (DNNs), and convolutional neural networks (CNNs).

Contributors

Liam Farhan (Formerly Mohamad Al Farhan)
Emil Reiter
Mahmoud Sharaf

Project Structure

Data Generation and Preprocessing

We created a fully synthetic dataset to train the models.

3D Shapes: We generated 3D meshes for several shapes, including:
- Cube
- Pyramid
- Tetrahedron
- Diamond
- Dodecahedron
- Icosahedron
- Torus
- Cylinder
- Cow
- Human

Projection: We used perspective projection to create shadows. In this method, the light source is replaced by a viewer, and the objects are projected onto a screen.
For each object, we generated hundreds of shadow images by randomly orienting the object and projecting it onto the screen.
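
The projection step can be sketched as follows. This is a minimal, dependency-free illustration and not the code from the project notebooks: the names `random_rotation`, `rotate`, `project`, and `random_shadow_points`, the viewer and screen distances, and the use of three random Euler angles are all assumptions made for the example. It only projects mesh vertices; rasterizing the filled shadow is left out.

```python
import math
import random


def random_rotation(rng):
    """Build a 3x3 rotation matrix R = Rz(c) @ Ry(b) @ Rx(a) from random angles.

    Note: random Euler angles are a simple choice and do not sample
    orientations perfectly uniformly.
    """
    a, b, c = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(3))
    ca, sa = math.cos(a), math.sin(a)
    cb, sb = math.cos(b), math.sin(b)
    cc, sc = math.cos(c), math.sin(c)
    return [
        [cc * cb, cc * sb * sa - sc * ca, cc * sb * ca + sc * sa],
        [sc * cb, sc * sb * sa + cc * ca, sc * sb * ca - cc * sa],
        [-sb, cb * sa, cb * ca],
    ]


def rotate(vertex, rot):
    """Apply the rotation matrix to a single (x, y, z) vertex."""
    x, y, z = vertex
    return tuple(r[0] * x + r[1] * y + r[2] * z for r in rot)


def project(vertex, viewer_distance=5.0, screen_distance=1.0):
    """Perspective-project a vertex onto a screen in front of the viewer.

    Assumed setup: the viewer looks along +z, the object is centered
    `viewer_distance` away, and the screen sits `screen_distance` away.
    """
    x, y, z = vertex
    depth = z + viewer_distance  # distance from the viewer along the view axis
    scale = screen_distance / depth
    return (x * scale, y * scale)


def random_shadow_points(vertices, rng):
    """Randomly orient the object, then project every vertex onto the screen."""
    rot = random_rotation(rng)
    return [project(rotate(v, rot)) for v in vertices]


# Eight corners of a cube as a stand-in for a real mesh.
CUBE = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]

rng = random.Random(0)
points = random_shadow_points(CUBE, rng)
```

Calling `random_shadow_points` repeatedly with the same `rng` gives a different orientation each time, which is the mechanism behind generating many shadows per object.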

Preprocessing:
- Edge detection: object boundaries were identified by comparing pixel values, producing images that contain only the edges.
- Data was compressed and cast to boolean values. To cope with memory limitations, all images were resized to 200×200 pixels.
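
The preprocessing steps above can be sketched in plain Python. The helper names (`to_boolean`, `resize_nearest`, `edges`), the 0.5 threshold, the nearest-neighbor resizing, and the 4-neighbor comparison rule are assumptions chosen for illustration; the notebooks may do this differently.

```python
def to_boolean(image, threshold=0.5):
    """Cast a grayscale image (rows of floats) to a boolean shadow mask."""
    return [[value > threshold for value in row] for row in image]


def resize_nearest(image, size):
    """Resize to size x size by nearest-neighbor sampling (e.g. size=200)."""
    height, width = len(image), len(image[0])
    return [
        [image[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]


def edges(mask):
    """Keep shadow pixels that touch at least one non-shadow 4-neighbor.

    Pixels outside the image count as background, so shadow pixels on the
    image border are treated as edges.
    """
    height, width = len(mask), len(mask[0])

    def filled(r, c):
        return 0 <= r < height and 0 <= c < width and mask[r][c]

    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [
        [
            mask[r][c]
            and not all(filled(r + dr, c + dc) for dr, dc in neighbors)
            for c in range(width)
        ]
        for r in range(height)
    ]


# Toy input: a 6x6 image containing a filled 4x4 square.
image = [
    [1.0 if 1 <= r <= 4 and 1 <= c <= 4 else 0.0 for c in range(6)]
    for r in range(6)
]
mask = to_boolean(image)
outline = edges(mask)          # only the boundary ring of the square stays True
small = resize_nearest(mask, 3)
```

Storing the masks as booleans rather than floats is what keeps the memory footprint small once there are hundreds of images per shape.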

Machine Learning Models

1. Multinomial Logistic Regression

Training: The model was trained with L2 regularization.
Accuracy: Achieved a classification accuracy of approximately 60%, which improved slightly (to 62%) when using edge-detected data.
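
For readers unfamiliar with the model, here is a small from-scratch sketch of multinomial (softmax) logistic regression with an L2 penalty, trained by batch gradient descent. It is not the project's training code: the function names, the hyperparameters (`l2`, `lr`, `epochs`), and the toy 2D data are all assumptions. In the project, each sample would be a flattened shadow image instead of a 2D point.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, bias, x):
    """Class probabilities for one feature vector x."""
    scores = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(scores)


def train(samples, labels, n_classes, l2=0.01, lr=0.5, epochs=500):
    """Minimize mean cross-entropy + (l2 / 2) * ||W||^2 by gradient descent."""
    n_features = len(samples[0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    bias = [0.0] * n_classes
    n = len(samples)
    for _ in range(epochs):
        # Start from the gradient of the L2 penalty (the bias is not penalized).
        grad_w = [[l2 * w for w in row] for row in weights]
        grad_b = [0.0] * n_classes
        for x, y in zip(samples, labels):
            probs = predict_proba(weights, bias, x)
            for k in range(n_classes):
                err = (probs[k] - (1.0 if k == y else 0.0)) / n
                grad_b[k] += err
                for j in range(n_features):
                    grad_w[k][j] += err * x[j]
        for k in range(n_classes):
            bias[k] -= lr * grad_b[k]
            for j in range(n_features):
                weights[k][j] -= lr * grad_w[k][j]
    return weights, bias


# Toy data: three well-separated clusters standing in for three shape classes.
samples = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9), (-1.0, -1.0), (-0.9, -1.0)]
labels = [0, 0, 1, 1, 2, 2]

weights, bias = train(samples, labels, n_classes=3)
predictions = [
    max(range(3), key=lambda k: predict_proba(weights, bias, x)[k])
    for x in samples
]
```

The `l2` term shrinks the weights toward zero, which matters when the feature vector is a 200×200 image with far more pixels than there are training samples per class.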

2. Deep Neural Networks (DNNs)

Performance: The model performed better on full shapes than on edge-detected data. With dropout regularization, the Adam optimizer, and a learning rate of 0.00001, the best results were achieved with a batch size of 40.
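
The three training ingredients named above (dropout, Adam, and mini-batches) can be illustrated in isolation. This is a sketch under stated assumptions rather than the project's network: the dropout rate of 0.5 in the usage lines, the Adam defaults `beta1=0.9`, `beta2=0.999`, `eps=1e-8`, and the helper names are not taken from the source. Only the learning rate of 0.00001 and the batch size of 40 come from the text.

```python
import math
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


class Adam:
    """Adam update rule for a flat list of parameters."""

    def __init__(self, n_params, lr=1e-5, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * n_params  # running mean of gradients
        self.v = [0.0] * n_params  # running mean of squared gradients
        self.t = 0                 # step counter for bias correction

    def step(self, params, grads):
        self.t += 1
        updated = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            updated.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated


def batches(n_samples, batch_size=40):
    """Yield index ranges that split the dataset into mini-batches."""
    for start in range(0, n_samples, batch_size):
        yield range(start, min(start + batch_size, n_samples))


rng = random.Random(0)
hidden = dropout([0.3, 1.2, 0.7, 0.0], rate=0.5, rng=rng)  # assumed rate
optimizer = Adam(n_params=2)
params = optimizer.step([0.5, -0.5], grads=[0.1, -0.2])
```

Because Adam normalizes each gradient by its running magnitude, every parameter moves by roughly the learning rate per step early in training, so a value as small as 0.00001 implies many epochs before the weights change appreciably.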

3. Convolutional Neural Networks (CNNs)

CNN architectures, such as VGG and LeNet, were used with varying configurations.
Results:
- The highest accuracy of 97.58% was achieved using a Conv-Conv-Pool architecture.
- CNNs proved to be superior to DNNs for this task.
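
To make the Conv-Conv-Pool pattern concrete, here is a single-channel forward pass written in plain Python. It is only an illustration of the block's ordering: the 3×3 kernels, the absence of padding, the ReLU activation, the 2×2 max pooling, and the fixed averaging kernels are assumptions. A real CNN learns its kernels and uses many channels, typically through a framework such as Keras or PyTorch.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            max(0.0, sum(image[r + i][c + j] * kernel[i][j]
                         for i in range(kh) for j in range(kw)))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(image, size=2):
    """Non-overlapping max pooling with a size x size window."""
    out_h, out_w = len(image) // size, len(image[0]) // size
    return [
        [
            max(image[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def conv_conv_pool(image, kernel_a, kernel_b):
    """Two stacked convolutions, then one pooling step."""
    return max_pool(conv2d(conv2d(image, kernel_a), kernel_b))


def output_size(n, kernel=3, pool=2):
    """Side length after one Conv-Conv-Pool block on an n x n input."""
    return (n - 2 * (kernel - 1)) // pool


# Toy input: an 8x8 shadow mask with a filled 4x4 square in the middle.
image = [
    [1.0 if 2 <= r < 6 and 2 <= c < 6 else 0.0 for c in range(8)]
    for r in range(8)
]
box = [[1.0 / 9.0] * 3 for _ in range(3)]  # averaging kernel as a placeholder
features = conv_conv_pool(image, box, box)
```

Under these assumptions, `output_size(200)` shows how one block shrinks a 200×200 input: each unpadded 3×3 convolution trims two pixels per side length, and pooling then halves it.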
