This project tries to detect the intros of TV series by comparing two episodes of the same series and finding the longest common sequence of frames, with a bit of fuzziness.
- Install dependencies from `requirements.txt`
- Modify the `file_paths` variable in `main.py` and add the absolute paths to at least two episodes of the same season
This only works if the intro is similar or identical from episode to episode.
The intro must also be the longest common sequence in the first quarter of both episodes (this should be true in all cases).
Each frame from the first quarter of each episode is extracted and hashed with ImageHash (https://pypi.org/project/ImageHash/). The frame hashes are appended one after another to form one long video hash.
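The hashing step could look roughly like the sketch below. It is not taken from `main.py`: reading frames with OpenCV (`cv2`), the choice of `imagehash.average_hash`, and the helper names `first_quarter_frame_count` and `video_hash` are all assumptions made for illustration.

```python
def first_quarter_frame_count(total_frames, fraction=0.25):
    """Number of frames to fingerprint, counted from the start of the episode."""
    return int(total_frames * fraction)


def video_hash(path, fraction=0.25):
    """Return the per-frame hash strings for the first part of a video file."""
    # Third-party imports are local so the helper above works without them.
    # Assumption: OpenCV is used for decoding; the project may use something else.
    import cv2
    import imagehash
    from PIL import Image

    capture = cv2.VideoCapture(path)
    if not capture.isOpened():
        raise IOError(f"cannot open {path}")
    try:
        total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        hashes = []
        # Read sequentially from the start; no seeking is involved.
        for _ in range(first_quarter_frame_count(total, fraction)):
            ok, frame = capture.read()
            if not ok:
                break
            # OpenCV returns BGR arrays, ImageHash expects a PIL image.
            image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            # Assumption: average_hash; ImageHash also offers phash, dhash, etc.
            hashes.append(str(imagehash.average_hash(image)))
        return hashes
    finally:
        capture.release()
```

Joining the returned list with `"".join(hashes)` gives one long video hash string. Because a perceptual hash maps visually similar frames to the same or nearly the same value, it is a plausible source of the fuzziness mentioned above.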
The video hashes are then compared in pairs, searching for the longest identical substring.
The assumption is that this longest common substring is the intro.
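As a minimal sketch of the comparison step, the longest identical run can be found with `difflib.SequenceMatcher` from the standard library. This is an illustration under assumptions, not necessarily how the project does it: the hashes are kept as a list with one entry per frame (so a match always starts on a frame boundary), and the names `longest_common_run` and `to_seconds` are hypothetical.

```python
from difflib import SequenceMatcher


def longest_common_run(hashes_a, hashes_b):
    """Return (start_a, start_b, length) of the longest identical run of frame hashes."""
    # autojunk=False disables difflib's heuristic that ignores very frequent items,
    # which would otherwise skip repeated frames (e.g. black frames).
    matcher = SequenceMatcher(None, hashes_a, hashes_b, autojunk=False)
    match = matcher.find_longest_match(0, len(hashes_a), 0, len(hashes_b))
    return match.a, match.b, match.size


def to_seconds(start_frame, length, fps):
    """Convert a frame range to (start, end) timestamps in seconds."""
    return start_frame / fps, (start_frame + length) / fps


# Toy data: each string stands in for one frame hash.
episode_1 = ["a1", "b2", "c3", "d4", "e5", "f6", "zz"]
episode_2 = ["q9", "c3", "d4", "e5", "f6", "r8"]

start_1, start_2, length = longest_common_run(episode_1, episode_2)
intro_1 = to_seconds(start_1, length, fps=2.0)
```

Here the shared run `c3 d4 e5 f6` starts at frame 2 of the first episode and frame 1 of the second, so the intro may begin at a different time in each episode.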
- Don't extract every frame from the video. So far this does not speed up fingerprinting, because seeking in a file is slow.
- Make educated guesses about which parts to fingerprint. At the moment the first quarter of an episode is fingerprinted, which might be too much for longer episodes.
- Create a fingerprint that works for the whole season instead of finding the same fingerprint for every file.