tulane-cmps6730/project-vqa

CMPS 6730 Project: Visual Question Answering

Goal, methods, results:

This project explores methods for visual question answering (VQA). The model takes an image and a question as input and selects an answer from an answer space that contains all possible answers. Two baseline models and one more complex model are evaluated on a VQA dataset with about 9,974 training samples and 2,494 test samples.

Key components:

  • Dataset: loading and preprocessing the dataset
  • Input: tokenization and word embedding of the question text
  • Model: combining features from the language and vision models for joint image-text understanding
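
The three components above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the project's actual models: the question is encoded as a bag-of-words vector (standing in for learned word embeddings), the image features are assumed to be precomputed by some vision model, fusion is plain concatenation, and the answer is chosen by a linear scorer over the answer space. All function names and the weight matrix are hypothetical.

```python
import re
from typing import Dict, List, Sequence


def tokenize(question: str) -> List[str]:
    """Lowercase the question and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", question.lower())


def build_vocab(questions: Sequence[str]) -> Dict[str, int]:
    """Map each distinct token to an integer index, in first-seen order."""
    vocab: Dict[str, int] = {}
    for question in questions:
        for token in tokenize(question):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode_question(question: str, vocab: Dict[str, int]) -> List[float]:
    """Bag-of-words count vector; tokens outside the vocabulary are ignored.

    Assumption: this replaces the word-embedding step for illustration only.
    """
    vector = [0.0] * len(vocab)
    for token in tokenize(question):
        if token in vocab:
            vector[vocab[token]] += 1.0
    return vector


def fuse(image_features: Sequence[float],
         question_features: Sequence[float]) -> List[float]:
    """Combine the two modalities by concatenating their feature vectors."""
    return list(image_features) + list(question_features)


def predict(fused: Sequence[float],
            weights: Sequence[Sequence[float]],
            answer_space: Sequence[str]) -> str:
    """Score each answer with one weight row and return the top-scoring answer.

    `weights` has one row per answer; in a real model it would be learned.
    """
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    best = max(range(len(scores)), key=scores.__getitem__)
    return answer_space[best]
```

Treating VQA as classification over a fixed answer space, as described above, means the model can only return answers that appear in that space; the output layer has one score per candidate answer.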

This repository contains starter code for the final project in CMPS 4730/6730: Natural Language Processing at Tulane University.

The code in this repository will be copied into your team's project repository at the start of class to provide a starting point for your project.

You should edit this file to include a summary of the goals, methods, and conclusions of your project.

The structure of the code supports the following:

  • A simple web UI using Flask to support a demo of the project
  • A command-line interface to support running different stages of the project's pipeline
  • The ability to easily reproduce your work on another machine by using virtualenv and providing access to external data sources.
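
As a rough illustration of a command-line interface with one subcommand per pipeline stage, here is a sketch using `argparse` from the Python standard library. The stage names (`train`, `evaluate`), their options, and the program name `nlp` are assumptions chosen for the example; the starter code may use a different CLI library and different commands, so see GettingStarted.md for the real interface.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Create a parser with one subcommand per (hypothetical) pipeline stage."""
    parser = argparse.ArgumentParser(
        prog="nlp", description="Run one stage of the project pipeline."
    )
    subparsers = parser.add_subparsers(dest="stage", required=True)

    # Hypothetical training stage with an epoch count.
    train = subparsers.add_parser("train", help="train a model")
    train.add_argument("--epochs", type=int, default=5)

    # Hypothetical evaluation stage that picks a dataset split.
    evaluate = subparsers.add_parser("evaluate", help="evaluate a model")
    evaluate.add_argument("--split", choices=["train", "test"], default="test")

    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Parse the arguments and dispatch to the selected stage.

    Returns a status message instead of doing real work, to keep the
    sketch self-contained.
    """
    args = build_parser().parse_args(argv)
    if args.stage == "train":
        return f"training for {args.epochs} epochs"
    return f"evaluating on the {args.split} split"
```

For example, `main(["train", "--epochs", "3"])` returns `"training for 3 epochs"`. Calling `main()` with no argument list reads the arguments from the command line instead.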

Using this repository

  • At the start of the course, students will be divided into project teams. Each team will receive a copy of this starter code in a new repository. E.g.: https://github.com/tulane-cmps6730/project-alpha
  • Each team member will then clone their team repository to their personal computer to work on their project. E.g.: git clone https://github.com/tulane-cmps6730/project-alpha
  • See GettingStarted.md for instructions on using the starter code.

Contents

  • docs: template to create slides for project presentations
  • nlp: Python project code
  • notebooks: Jupyter notebooks for project development and experimentation
  • report: LaTeX report
  • tests: unit tests for project code

Background Resources

The following will give you some technical background on the technologies used here:

  1. Refresh your Python by completing this online tutorial: https://www.learnpython.org/ (3 hours)
  2. Create a GitHub account at https://github.com/
  3. Set up git by following https://help.github.com/en/articles/set-up-git (30 minutes)
  4. Learn git by completing the Introduction to GitHub tutorial, reading the git handbook, then completing the Managing merge conflicts tutorial (1 hour).
  5. Install the Python data science stack from https://www.anaconda.com/distribution/ (30 minutes). We will use Python 3.
  6. Complete the scikit-learn tutorial from https://www.datacamp.com/community/tutorials/machine-learning-python (2 hours)
  7. Understand how Python packages work by going through the Python Packaging User Guide (you can skip the "Creating Documentation" section). (1 hour)
  8. Complete Part 1 of the tutorial for Flask, the library we will use to build a web demo of your project.
