This project explores methods for solving the visual question answering (VQA) problem. The model takes an image and a question text as input and selects an answer from an answer space that contains all possible answers. Two baseline models and one more complex model are evaluated on a VQA dataset with around 9,974 training samples and 2,494 test samples.
- Dataset: loading and preprocessing the dataset
- Input: tokenization and word embedding of the question text
- Model: integrating features from the language and vision models to obtain a joint image-text representation
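The three stages above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual models: the toy questions and answers, the random embedding table, the random stand-in for image features, and every function name (`tokenize`, `build_vocab`, `build_answer_space`, `embed_question`, `fuse`, `predict`) are assumptions made for the example. Fusion is done here by simple concatenation, which is only one of several possible choices.

```python
import random
import re


def tokenize(question):
    """Lowercase a question and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", question.lower())


def build_vocab(questions):
    """Map each token to an integer id; id 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for q in questions:
        for tok in tokenize(q):
            vocab.setdefault(tok, len(vocab))
    return vocab


def build_answer_space(answers):
    """Sorted list of unique answers; the model predicts an index into it."""
    return sorted(set(answers))


def embed_question(question, vocab, table):
    """Average the embedding vectors of the question's tokens."""
    ids = [vocab.get(tok, 0) for tok in tokenize(question)]
    dim = len(table[0])
    if not ids:
        return [0.0] * dim
    return [sum(table[i][d] for i in ids) / len(ids) for d in range(dim)]


def fuse(text_vec, image_vec):
    """Fuse the two modalities by concatenation (an assumed, simple choice)."""
    return text_vec + image_vec


def predict(fused, weights, answer_space):
    """Score every answer with a linear layer and return the best one."""
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    best = max(range(len(scores)), key=scores.__getitem__)
    return answer_space[best]


# Toy data invented for illustration; a real run would load the VQA dataset.
questions = ["What color is the chair?", "How many lamps are there?"]
answers = ["brown", "2"]

rng = random.Random(0)
vocab = build_vocab(questions)
answer_space = build_answer_space(answers)

text_dim, image_dim = 4, 3
# Random (untrained) parameters: embedding table and classifier weights.
table = [[rng.uniform(-1, 1) for _ in range(text_dim)] for _ in vocab]
weights = [
    [rng.uniform(-1, 1) for _ in range(text_dim + image_dim)]
    for _ in answer_space
]
# Stand-in for features that a vision model would extract from the image.
image_vec = [rng.uniform(-1, 1) for _ in range(image_dim)]

fused = fuse(embed_question(questions[0], vocab, table), image_vec)
prediction = predict(fused, weights, answer_space)
```

Because the weights are random, `prediction` is an arbitrary element of `answer_space`; training would adjust the embedding table and the classifier weights so that the highest-scoring index matches the correct answer.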
This repository contains starter code for the final project in CMPS 4730/6730: Natural Language Processing at Tulane University.
The code in this repository will be copied into your team's project repository at the start of class to provide a starting point for your project.
You should edit this file to include a summary of the goals, methods, and conclusions of your project.
The structure of the code supports the following:
- A simple web UI using Flask to support a demo of the project
- A command-line interface to support running different stages of the project's pipeline
- The ability to easily reproduce your work on another machine by using virtualenv and providing access to external data sources.
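As a rough illustration of a command-line interface that runs one pipeline stage at a time, the sketch below uses `argparse` from the standard library. The stage names (`download`, `train`, `web`), the `--epochs` option, and the `run` helper are hypothetical; the starter code may use a different library and different commands, so see GettingStarted.md for the real interface.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per (hypothetical) pipeline stage."""
    parser = argparse.ArgumentParser(prog="nlp")
    stages = parser.add_subparsers(dest="stage", required=True)

    stages.add_parser("download", help="fetch external data sources")

    train = stages.add_parser("train", help="train a model")
    train.add_argument("--epochs", type=int, default=5)

    stages.add_parser("web", help="start the demo web UI")
    return parser


def run(argv):
    """Parse the arguments and dispatch to the requested stage."""
    args = build_parser().parse_args(argv)
    if args.stage == "train":
        # A real implementation would call the training code here.
        return f"training for {args.epochs} epochs"
    return f"running stage: {args.stage}"
```

Calling `run(["train", "--epochs", "3"])` selects the training stage with three epochs. Keeping each stage behind its own subcommand lets a teammate rerun a single step without repeating the whole pipeline.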
- At the start of the course, students will be divided into project teams. Each team will receive a copy of this starter code in a new repository. E.g.: https://github.com/tulane-cmps6730/project-alpha
- Each team member will then clone their team repository to their personal computer to work on their project. E.g.:
git clone https://github.com/tulane-cmps6730/project-alpha
- See GettingStarted.md for instructions on using the starter code.
- docs: template to create slides for project presentations
- nlp: Python project code
- notebooks: Jupyter notebooks for project development and experimentation
- report: LaTeX report
- tests: unit tests for project code
The following will give you some technical background on the technologies used here:
- Refresh your Python by completing this online tutorial: https://www.learnpython.org/ (3 hours)
- Create a GitHub account at https://github.com/
- Set up git by following https://help.github.com/en/articles/set-up-git (30 minutes)
- Learn git by completing the Introduction to GitHub tutorial, reading the git handbook, then completing the Managing merge conflicts tutorial (1 hour).
- Install the Python data science stack from https://www.anaconda.com/distribution/. We will use Python 3. (30 minutes)
- Complete the scikit-learn tutorial from https://www.datacamp.com/community/tutorials/machine-learning-python (2 hours)
- Understand how python packages work by going through the Python Packaging User Guide (you can skip the "Creating Documentation" section). (1 hour)
- Complete Part 1 of the tutorial for Flask, the library we will use to build a web demo for your project.