A CNN-powered non-cat frame remover for videos

harrisonstark/only-cats

only-cats

A CNN-powered non-cat frame remover for videos

Purpose

Our program, "Only Cats", takes a video as input and returns an edited version that includes only the frames in which a cat is visible.

Function

A convolutional neural network (CNN) is used to detect cats in the video. The CNN is built on EfficientNetB0, pre-trained on the ImageNet dataset, with extra layers added on top that are trained on a separate dataset called "natural images". Both ImageNet and the natural images dataset include cats, so instead of classifying images into the categories of either dataset, our CNN simply detects whether or not an image contains a cat. Once the network was validated as reasonably reliable, it was applied in the detection algorithm.

The video input is represented as a NumPy array of RGB images, one per frame. To allow input of any size, each frame is split into 224x224 square subimages (the input size that EfficientNetB0 was trained on). If the frame does not divide evenly into such squares, the right-most and/or bottom-most square is taken. After the network classifies every subimage of a frame, the program checks whether any subimage was classified as containing a cat with confidence above a threshold (tuned to 99% in our final test). If so, the entire frame is classified as containing a cat and is included in the output video.

The final step of the program smooths out spurious output from the network. For example, if a cat is not detected for a short interval of frames that sits between frames where a cat is detected, those frames are still included in the final video; it does not make sense for the cat to be gone for just a few frames (a few 30ths of a second in a 30 frames per second video). After these adjustments are made, the program outputs and saves the trimmed video.
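The subimage layout and the per-frame decision described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the sketch reads "the right-most and/or bottom-most square is taken" as adding one extra tile aligned to the far edge, and it assumes the threshold comparison is inclusive.

```python
TILE = 224  # EfficientNetB0 input size, as stated in the text


def tile_origins(length, tile=TILE):
    """Return the start offsets of the tiles along one axis of a frame.

    Tiles are laid edge to edge. If `length` is not a multiple of `tile`,
    one extra tile is aligned to the far edge (assumed reading of the text),
    so it overlaps its neighbour.
    """
    if length < tile:
        raise ValueError("frame is smaller than one tile")
    origins = list(range(0, length - tile + 1, tile))
    if origins[-1] + tile < length:
        origins.append(length - tile)
    return origins


def tile_boxes(height, width, tile=TILE):
    """Return the (top, left) corner of every tile in a height x width frame."""
    return [(top, left)
            for top in tile_origins(height, tile)
            for left in tile_origins(width, tile)]


def frame_has_cat(tile_scores, threshold=0.99):
    """Treat a frame as a cat frame if any tile's cat score reaches the threshold.

    `tile_scores` stands in for the network's per-subimage cat probabilities;
    using >= rather than > is an assumption.
    """
    return any(score >= threshold for score in tile_scores)
```

With a NumPy frame of shape (height, width, 3), each subimage would be `frame[top:top + 224, left:left + 224]`. Under these assumptions a 720x1280 frame yields 24 subimages, with the last row and last column overlapping their neighbours.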

Challenges and Solutions

The biggest challenge for this project was unit testing. The goal of the program and the steps needed to reach it were quite clear; what proved difficult was testing the small functions created along the way. For example, the buffering system for the few frames that did not contain a cat was hard to test because it took approximately 1 minute to process each second of video into a NumPy array of “cat-containing” frames, and only then could the function be tested. It was also immensely hard to test the function that broke images into subimages. To solve this, single images were used in place of videos for unit testing, cutting the time-to-test by nearly 99% in some cases. Another challenge was working with new libraries such as skvideo for mp4 I/O and conversion to and from NumPy arrays. We overcame these challenges by searching Google and StackOverflow to see how others had used the same library functions in similar ways. The same strategy helped us learn how to use Google Drive in Colab to facilitate the I/O of all of our data, including the model weights.
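One further way to keep tests of the buffering step fast, offered here only as a sketch and not as what the project did, is to run it on a plain list of per-frame booleans so that no video or network is involved. The function name `fill_short_gaps` and the `max_gap` value are hypothetical; the source does not state how many missing frames are bridged.

```python
def fill_short_gaps(has_cat, max_gap=5):
    """Return a copy of `has_cat` with short non-cat runs between cat frames set to True.

    `has_cat` is a list of per-frame booleans. `max_gap` (assumed value) is the
    longest run of non-cat frames that is still bridged. Runs at the very start
    or end of the video are left untouched because they are not enclosed.
    """
    filled = list(has_cat)
    last_cat = None  # index of the most recent frame with a cat
    for i, flag in enumerate(has_cat):
        if not flag:
            continue
        gap = i - last_cat - 1 if last_cat is not None else 0
        if 0 < gap <= max_gap:
            for j in range(last_cat + 1, i):
                filled[j] = True
        last_cat = i
    return filled


# Synthetic checks that need no video and no network.
assert fill_short_gaps([True, False, False, True], max_gap=2) == [True] * 4
assert fill_short_gaps([True, False, False, True], max_gap=1) == [True, False, False, True]
```

Because the input is just a list of booleans, checks like these run almost instantly, independent of how long frame classification takes.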

Applications and Further Research

Applications for this program would likely be in environmental research: researchers could capture hours of footage in an environment and extract only the parts of the video where a certain animal appears, allowing them to observe animal behavior in a more time-efficient manner. A similar application would be searching for endangered or officially extinct species, although it may be difficult to train the model to detect species that are not well documented in photographs.

Final Videos
