JordanSearch is my final project for CS 499. It parses and indexes the audio and visual content of videos so that users can locate specific moments within a large corpus of videos.

JordanSearch uses ffmpeg, Vosk, ElasticSearch, and ImageAI.
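The core idea is that each parsed segment of a video becomes a small searchable document carrying its file name and timestamp. Below is a minimal, pure-Python sketch of that idea, not the project's actual code: the field names (`file`, `timestamp`, `text`) and the word-overlap scoring are assumptions for illustration, since the real project delegates storage and ranking to ElasticSearch.

```python
# Illustrative sketch of the indexing/search idea (not the project's code).
# Each parsed segment becomes a small document; a query returns the
# best-matching segments along with their file name and timestamp.
# Field names ("file", "timestamp", "text") are illustrative assumptions;
# the real project stores these documents in ElasticSearch.

index = [
    {"file": "lecture1.mp4", "timestamp": 12.5, "text": "welcome to the course"},
    {"file": "lecture1.mp4", "timestamp": 73.0, "text": "a dog appears on screen"},
    {"file": "lecture2.mp4", "timestamp": 5.0,  "text": "today we discuss search"},
]

def search(query, docs):
    """Rank documents by how many query words appear in their text."""
    words = query.lower().split()
    scored = []
    for doc in docs:
        score = sum(w in doc["text"] for w in words)
        if score:
            scored.append((score, doc))
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored]

hits = search("dog", index)
print(hits[0]["file"], hits[0]["timestamp"])  # lecture1.mp4 73.0
```

In the real system, ElasticSearch's full-text scoring replaces the toy `search` function, but the shape of the result (file name plus timestamp) is the same.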
- Download your desired Vosk model and ImageAI model and place them in the `models` directory. I used `vosk-model-en-us-0.42-gigaspeech` and `YOLOv3` in my development. If you use different models, you will need to change `vosk_model_path`, `imageai_model_path`, and the ImageAI model type in `audio_parser.py` and `image_parser.py` respectively.
- Place all video files in the `input` folder. They must be `.mp4` files.
- Run ElasticSearch with `docker-compose up -d`.
- Run `main.py` with the `-p` flag to parse all video files.
- Now you can enter search queries, and the top results will be returned, including the file name and timestamp.
- As long as you are using the same ElasticSearch instance, re-running `main.py` without the `-p` flag will skip parsing and begin query entry.
- If you use the `-f` flag, you will search for full videos rather than specific timestamps within those videos.
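The flag handling described above might look like the following in `main.py`. This is a hedged sketch: only the `-p` and `-f` flags themselves come from this README; the destination names and help strings are illustrative assumptions.

```python
import argparse

# Sketch of the command-line interface described above.
# Only the -p and -f flags are documented; destination names and
# help text here are illustrative assumptions.
def build_parser():
    parser = argparse.ArgumentParser(description="JordanSearch")
    parser.add_argument("-p", action="store_true", dest="parse",
                        help="parse all video files in the input folder before querying")
    parser.add_argument("-f", action="store_true", dest="full",
                        help="return whole videos instead of specific timestamps")
    return parser

args = build_parser().parse_args(["-p"])
print(args.parse, args.full)  # True False
```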
Future work:

- Implement opening the source files at the chosen timestamps
- Implement a GUI
- Use an LLM to generate keywords summarizing what the audio is about, so users don't have to search by exact match of content
- Use an audio model that can also parse sound effects, rather than just dialogue
- Use an image model that can identify more than just 80 items
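The LLM-keyword idea above could work roughly like this: at index time, each transcript segment is expanded with generated keywords, so a query no longer needs to match the dialogue word-for-word. The sketch below stands in for the LLM with a fixed synonym table; every name in it is a placeholder, not part of the project.

```python
# Sketch of the keyword-expansion idea: at index time each transcript
# segment would be augmented with LLM-generated keywords, so a search
# for "pet" can match a segment that only says "dog".
# A real implementation would call an LLM here; this stand-in uses a
# fixed synonym table purely for illustration.

FAKE_LLM_KEYWORDS = {
    "dog": ["pet", "animal"],
    "car": ["vehicle", "driving"],
}

def expand(text):
    """Append generated keywords to a transcript segment before indexing."""
    extras = []
    for word in text.lower().split():
        extras.extend(FAKE_LLM_KEYWORDS.get(word, []))
    return text + " " + " ".join(extras) if extras else text

segment = expand("a dog runs past a car")
print("pet" in segment, "vehicle" in segment)  # True True
```

The expanded text would then be indexed into ElasticSearch in place of the raw transcript, making queries like "pet" or "vehicle" findable without an exact dialogue match.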