
nr-vqa-vmaf

The code in this repository was part of a Bachelor's thesis project at KTH. Some of the code in this repository may be deprecated, and some test cases were not used to attain the final results. For questions, open an issue at https://github.com/Kajlid/nr-vqa-vmaf/issues.

Main dependencies:

  • pip3 install torch torchvision torchview
  • pip3 install matplotlib
  • pip3 install ffmpeg_quality_metrics
  • pip3 install scikit-image

ffmpeg_quality_metrics was used to calculate VMAF.
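For reference, a VMAF score can be computed from the command line with the ffmpeg-quality-metrics CLI. The file names below are placeholders, and the exact flags may differ between versions of the tool:

```shell
# Compare a distorted (compressed) clip against its reference and
# write per-frame VMAF scores to a CSV file.
# "distorted.mp4" and "reference.mp4" are placeholder names.
ffmpeg-quality-metrics distorted.mp4 reference.mp4 \
    --metrics vmaf \
    --output-format csv > vmaf_values.csv
```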

Structure:

  • main.ipynb [File] Main working file; all code for the project can be found here.

  • data [Folder] The 87 videos in the dataset are movie trailers from SF Anytime: https://www.sfanytime.com/sv.

    • trailers_train [Folder] The 70 trailers reserved for training the neural network.

    • trailers_test [Folder] The 17 trailers reserved for testing the neural network.

  • compressed_data2: [Folder] The 70 training videos, each compressed at a randomly chosen compression level.

  • compressed_data_ALL: [Folder] All 70 videos in the training set, each compressed into three versions with the constant rate factor (CRF) set to 23, 37 and 51.

  • compressed_TEST_videos: [Folder] The 17 test videos, each compressed at a randomly chosen compression level.

  • compressed_TEST_videos_ALL: [Folder] All 17 videos in the test set, each compressed into the same three CRF versions as compressed_data_ALL above.

  • images_train: [Folder] Reference images (train set): every 250th frame from trailers_train.

  • images_TEST: [Folder] Reference images (test set): every 250th frame from trailers_test.

  • images_train_compressed2: [Folder] Distorted images (train set): every 250th frame from compressed_data2.

  • images_TEST_compressed: [Folder] Distorted images (test set): every 250th frame from compressed_TEST_videos.

  • images_train_crop_png: [Folder] Cropped patches of images from images_train; each file name encodes which part of the image the patch belongs to, for analysis. Used for VMAF calculation (reference images).

  • images_train_comp_crop_png: [Folder] Cropped patches of images from images_train_compressed2; each file name encodes which part of the image the patch belongs to, for analysis. Used for VMAF calculation (distorted images) and for creating the dataset used to train the neural network.

  • images_TEST_crop_png: [Folder] Cropped patches from images_TEST. Only used for VMAF calculation (reference), which serves as the ground-truth quality labels.

  • images_TEST_comp_crop_png: [Folder] Cropped patches from images_TEST_compressed. Used for VMAF calculation (distorted images) and for creating the dataset used to test the neural network.

  • vmaf_values_train.csv [File] Calculated VMAF values for the train set and the corresponding compressed image patches.

  • vmaf_values_TEST.csv [File] Calculated VMAF values for the test set and the corresponding compressed image patches.

  • vmaf_on_full_frames_train.csv [File] Only used for analysis.
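The folder layout above corresponds to a data-preparation pipeline that can be sketched with ffmpeg. The file names, output patterns, and the 224x224 patch size below are illustrative assumptions; the actual processing is implemented in main.ipynb:

```shell
# 1. Compress a trailer at the three constant rate factors used
#    for compressed_data_ALL ("trailer.mp4" is a placeholder name).
for crf in 23 37 51; do
    ffmpeg -i data/trailers_train/trailer.mp4 \
        -c:v libx264 -crf "$crf" \
        "compressed_data_ALL/trailer_crf${crf}.mp4"
done

# 2. Extract every 250th frame as a PNG reference image.
ffmpeg -i data/trailers_train/trailer.mp4 \
    -vf "select='not(mod(n\,250))'" -vsync vfr \
    "images_train/trailer_frame_%04d.png"

# 3. Crop a patch at a known position and encode that position in
#    the file name for later analysis (size and naming scheme are
#    assumptions, not the thesis code).
ffmpeg -i images_train/trailer_frame_0001.png \
    -vf "crop=224:224:0:0" \
    "images_train_crop_png/trailer_frame_0001_x0_y0.png"
```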

Other files and folders found in the repository were either created by running the code or used for tests, and are not needed to reproduce the results. For example, inference.csv was created as a result of testing the DenseNet model and contains information on the model's inference performance.