This project uses the EfficientNetB0 CNN architecture for image classification of the American Sign Language (ASL) alphabet and 0-9 digits, with real-time detection support.
Install all required libraries from the `requirements.txt` file:
`pip install -r requirements.txt`
Download and extract the ASL dataset using the following commands:
`gdown https://drive.google.com/uc\?id\=1b0-MLad_AcVvbocCk7RUB2XH5Xbr7L3x`
`sudo apt install rar`
`unrar x asl.rar`
Before training, set `COMET_API_KEY` in a `.env` file inside the `neuralnet` directory so that metrics can be logged.
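A `.env` file is typically a plain list of `KEY=VALUE` lines. As a minimal sketch of how such a file can be read before logging starts, the snippet below uses only the standard library. `load_env_file` is a hypothetical helper, not the project's actual code, which may rely on a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Read KEY=VALUE lines from a .env file into os.environ (hypothetical helper)."""
    loaded = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip("'\"")
        loaded[key] = value
        # Do not override variables that are already set in the shell.
        os.environ.setdefault(key, value)
    return loaded


if __name__ == "__main__":
    env_path = Path("neuralnet") / ".env"  # location named in this README
    if env_path.exists():
        load_env_file(env_path)
    print("COMET_API_KEY set:", "COMET_API_KEY" in os.environ)
```

Running it from the repository root is a quick way to confirm that the key is visible before starting a long training run.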
To train the model, run:
`python3 ASL-Alphabet-Detection/neuralnet/train.py`
Hyperparameter configurations are available in `train.py`.
To run the real-time detection demo:
`python3 detect.py`
Note: If you don't have a webcam, you can use the DroidCam app to turn your mobile phone into a webcam. Logs are saved to the `action_handler.log` file.
You can use the pre-trained model `best_model.pth`, located in the `assets/` directory, to perform inference.
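Inference ends with mapping the network's raw output scores back to a character. The sketch below shows that last step in plain Python. It assumes exactly 36 classes ordered as digits 0-9 followed by letters A-Z; the real ordering depends on how the dataset folders were indexed during training and should be checked against `train.py`. `CLASS_NAMES`, `softmax`, and `decode_prediction` are hypothetical names introduced for illustration.

```python
import math
import string

# Assumed label order: digits 0-9 followed by letters A-Z (36 classes).
CLASS_NAMES = list(string.digits) + list(string.ascii_uppercase)


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits, class_names=CLASS_NAMES):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(class_names):
        raise ValueError("expected one score per class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


if __name__ == "__main__":
    # Made-up scores standing in for one row of model output.
    scores = [0.0] * 36
    scores[10] = 5.0  # index 10 is "A" under the assumed ordering
    label, prob = decode_prediction(scores)
    print(label, round(prob, 3))
```

The returned probability can also serve as a confidence threshold in a live demo, so that low-confidence frames are ignored rather than shown as a guess.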
| Loss Curves | Accuracy Curves |
|---|---|
The model was trained for 25 epochs, and the checkpoint with the highest test accuracy was kept as the best model. The best results were reached at epoch 22:
| Train Loss | Test Loss | Train Accuracy | Test Accuracy |
|---|---|---|---|
| 0.052 | 0.028 | 0.984 | 0.990 |
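Selecting the best model by test accuracy comes down to tracking a running maximum across epochs. The following is a minimal sketch of that rule, not the logic in `train.py`; `select_best_epoch` is a hypothetical helper and the accuracy values are illustrative.

```python
def select_best_epoch(test_accuracies):
    """Return (epoch, accuracy) for the highest test accuracy; epochs are 1-based."""
    if not test_accuracies:
        raise ValueError("no epochs recorded")
    best_epoch, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(test_accuracies, start=1):
        # Strict '>' keeps the earliest epoch when accuracies tie.
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
    return best_epoch, best_acc


if __name__ == "__main__":
    history = [0.91, 0.95, 0.97, 0.96]  # illustrative values, not the project's logs
    print(select_best_epoch(history))
```

In a training loop, the same comparison would usually decide when to overwrite the saved checkpoint, so that the file on disk always corresponds to the best epoch seen so far.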
Note: Because the number of class labels is large and the test set was randomly sampled, not every label appears in the evaluation set. As a result, some labels may be missing from the confusion matrix.
| Confusion Matrix |
|---|
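To illustrate why labels can drop out, the sketch below builds a confusion matrix over only the labels that actually occur in a sample, then lists the classes that were never drawn. It rests on the assumption that the matrix is built from observed labels only; the project's plotting code may work differently. `confusion_matrix` and `missing_labels` are hypothetical helpers.

```python
def confusion_matrix(y_true, y_pred):
    """Build a confusion matrix over only the labels that occur in the data."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1  # rows: true, columns: predicted
    return labels, matrix


def missing_labels(all_labels, y_true):
    """List the classes that never appear as a true label in the sample."""
    present = set(y_true)
    return [label for label in all_labels if label not in present]


if __name__ == "__main__":
    # Toy sample: class "C" exists in the dataset but was not drawn.
    y_true = ["A", "B", "A", "3"]
    y_pred = ["A", "B", "B", "3"]
    labels, matrix = confusion_matrix(y_true, y_pred)
    print(labels)
    for row in matrix:
        print(row)
    print("missing:", missing_labels(["3", "A", "B", "C"], y_true))
```

A stratified split, where each class contributes a fixed share of the test set, is one common way to make sure every label is represented.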
Feel free to report any issues you encounter.
Don't forget to ⭐ the repo!