Author: Xuetao Ma
This project implements a lightweight human action recognition pipeline using Motion History Images (MHI) and Hu moments as features, followed by classification using Random Forest. The goal is to recognize six human actions from the KTH dataset: boxing, handclapping, handwaving, jogging, running, and walking.
Unlike deep learning approaches, this project uses interpretable, handcrafted features and traditional machine learning, achieving decent performance while remaining easy to understand and fast to compute.
The dataset is the KTH action dataset, available at: https://web.archive.org/web/20190901190223/http://www.nada.kth.se/cvap/actions/
```
.
├── Visualization.ipynb
├── best_params.txt
├── config.py
├── config_optimal.py
├── final_predict.py
├── final_prediction_set
├── mhi_utils.py
├── models
├── predict_sample.py
├── readme.md
├── requirements.txt
├── run.py
├── sample_set
└── train_set
    ├── boxing
    ├── handclapping
    ├── handwaving
    ├── jogging
    ├── running
    └── walking
```
- Motion History Image (MHI) is computed from frame differencing with a binary threshold θ and decay constant τ.
- From each MHI, 7 Hu moments are computed to extract translation, scale, and rotation-invariant motion features.
- A Random Forest classifier is trained on these features for classification.
- Models are trained across a grid of θ and τ values.
- A separate prediction set (`final_prediction_set/`) is used to evaluate model accuracy.
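The MHI update and Hu-moment extraction described above can be sketched as follows. This is an illustrative, pure-NumPy version (the project's `mhi_utils.py` presumably works on OpenCV frames; the function names and defaults here are assumptions, not the actual code):

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, theta=30, tau=20):
    """Update a Motion History Image with one new grayscale frame.
    Pixels whose intensity changed by at least theta are stamped with tau;
    all other pixels decay by 1 toward zero, so recent motion stays bright."""
    motion = np.abs(frame.astype(float) - prev_frame.astype(float)) >= theta
    return np.where(motion, float(tau), np.maximum(mhi - 1.0, 0.0))

def hu_moments(img):
    """Compute the 7 Hu moment invariants of an image (e.g. an MHI).
    These are invariant to translation, scale, and rotation."""
    y, x = np.mgrid[:img.shape[0], :img.shape[1]].astype(float)
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00
    dx, dy = x - xc, y - yc

    def eta(p, q):
        # Normalized central moment eta_pq = mu_pq / m00^(1 + (p+q)/2)
        return (dx**p * dy**q * img).sum() / m00**(1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    h1 = e20 + e02
    h2 = (e20 - e02)**2 + 4 * e11**2
    h3 = (e30 - 3 * e12)**2 + (3 * e21 - e03)**2
    h4 = (e30 + e12)**2 + (e21 + e03)**2
    h5 = ((e30 - 3 * e12) * (e30 + e12) * ((e30 + e12)**2 - 3 * (e21 + e03)**2)
          + (3 * e21 - e03) * (e21 + e03) * (3 * (e30 + e12)**2 - (e21 + e03)**2))
    h6 = ((e20 - e02) * ((e30 + e12)**2 - (e21 + e03)**2)
          + 4 * e11 * (e30 + e12) * (e21 + e03))
    h7 = ((3 * e21 - e03) * (e30 + e12) * ((e30 + e12)**2 - 3 * (e21 + e03)**2)
          - (e30 - 3 * e12) * (e21 + e03) * (3 * (e30 + e12)**2 - (e21 + e03)**2))
    return np.array([h1, h2, h3, h4, h5, h6, h7])
```

Running `update_mhi` over every frame of a clip and then taking `hu_moments` of the final MHI yields a 7-dimensional feature vector per video.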
- Install dependencies: `pip install -r requirements.txt`
- Place 6 labeled videos under `sample_set/` to test prediction output.
- Place ~10 or more videos under `final_prediction_set/` for accuracy evaluation.
- Open `run.py` and set your desired range of `Tau` and `Threshold`.
- Then run: `python run.py`. This will train and save models in `models/` and log results to `best_params.txt`.
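The grid training step can be sketched as below. This is a hypothetical outline, not the actual `run.py`: the `feature_fn` callback stands in for the project's MHI/Hu feature extraction, and the split and forest parameters are illustrative assumptions.

```python
import itertools
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def grid_search(feature_fn, labels, thetas, taus):
    """Train one Random Forest per (theta, tau) pair and keep the best.
    feature_fn(theta, tau) must return the feature matrix (one row per
    clip) extracted with that MHI setting."""
    best = None
    for theta, tau in itertools.product(thetas, taus):
        X = feature_fn(theta, tau)
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, labels, test_size=0.25, random_state=0, stratify=labels)
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X_tr, y_tr)
        acc = clf.score(X_te, y_te)  # held-out accuracy for this grid cell
        if best is None or acc > best[0]:
            best = (acc, theta, tau, clf)
    return best  # (accuracy, theta, tau, fitted model)
```

Each fitted model would then be pickled into `models/` and its accuracy logged, mirroring what `best_params.txt` records.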
- Open `Visualization.ipynb` in Jupyter.
- Run each cell to visualize accuracy heatmaps and identify the best-performing model configurations.
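The accuracy heatmap plotted by the notebook plausibly resembles this minimal matplotlib sketch; the axis labels and grid layout are illustrative assumptions, not copied from `Visualization.ipynb`:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np

def accuracy_heatmap(acc, thetas, taus):
    """Plot a theta-vs-tau accuracy grid.
    `acc` is a 2-D array with one row per theta and one column per tau."""
    fig, ax = plt.subplots()
    im = ax.imshow(acc, cmap="viridis", vmin=0.0, vmax=1.0)
    ax.set_xticks(range(len(taus)))
    ax.set_xticklabels(taus)
    ax.set_yticks(range(len(thetas)))
    ax.set_yticklabels(thetas)
    ax.set_xlabel("tau (decay)")
    ax.set_ylabel("theta (threshold)")
    fig.colorbar(im, ax=ax, label="accuracy")
    return fig
```

The brightest cell marks the (θ, τ) pair worth copying into `config_optimal.py`.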
If you would like to check the performance of a single model, set its parameters in `config_optimal.py`, then run:
`python final_predict.py`