Advanced Emotional Recognition and Observation Network
AERONET is an AI-powered, deep learning-based system designed to detect signs of mental distress in real time using multimodal inputs such as voice, facial expressions, and behavior patterns. It aids mental health professionals by providing remote psychological assessments without relying on wearable devices.
The increasing prevalence of mental health disorders has placed immense pressure on healthcare systems. Major challenges include:
- Shortage of trained mental health professionals
- Social stigma preventing people from seeking help
- Lack of real-time monitoring tools
- Inefficient early detection of conditions like depression, anxiety, or emotional distress
AERONET addresses these challenges by offering:
- Real-time detection of emotional distress using audio-visual input
- No wearables required — ideal for scalable public health screening
- Visual display of emotional trends and severity levels
- Easy integration for remote access and analysis by professionals
Key research questions include:
- How can signs of mental distress be detected in real time?
- How accurately can emotional states be classified?
- Are speech and facial cues sufficient to assess mental health conditions?
- To what extent can vocal and facial distress signals be detected effectively?
- How fast and reliably can the system perform across environments?
Key features include:
- Real-time facial expression and hand gesture tracking via MediaPipe
- Custom-trained LSTM deep learning model for classification
- Visual feedback showing predicted emotional/gesture class and probability
- Simple and interactive GUI using OpenCV
Results and observations:
- The model achieves 95% training accuracy and 87% validation accuracy, showing promising results for real-time emotional classification.
- It successfully distinguishes between 10 emotional/gesture-based labels with high precision.
- Misclassifications mainly occur between similar expressions such as "Neutral" and "Sad".
- The system is robust across varying lighting conditions and webcam resolutions.
- Audio-visual fusion helps improve classification reliability compared to using a single modality.
- On real-world test data, it performs with over 80% consistency in emotional prediction across multiple users.
- Works well without requiring any external wearable sensors, enhancing usability and accessibility.
```bash
AERONET/
├── DataPreProcessed/
│   ├── A/
│   ├── B/
│   └── C/
├── RawData/
│   ├── A/
│   ├── B/
│   ├── C/
│   ├── Q/
│   └── t/
├── TrainandValidation/
│   ├── train/
│   ├── validation/
│   └── __pycache__/
├── application.py
├── datacollection.py
├── function.py
├── model.h5
├── model.json
├── predata.py
├── tempCodeRunnerFile.py
├── trainingmodel.py
├── LICENSE
└── README.md
```
Script: datacollection.py
- Uses OpenCV to record hand signs for each alphabet letter (A-Z).
- Captures images from the webcam and saves them in the `RawData/` folder under corresponding alphabet subfolders (e.g., A, B, C...).
- Only the Region of Interest (ROI) from (0, 40) to (300, 400) is captured to improve accuracy and focus on the hand region (see the capture sketch below).
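As a point of reference, here is a minimal sketch of such a capture loop, assuming a hypothetical "press 's' to save, 'q' to quit" key scheme and simple numeric file names; the exact logic in `datacollection.py` may differ.

```python
import os
import cv2

LETTER = "A"                                   # target alphabet subfolder (illustrative)
SAVE_DIR = os.path.join("RawData", LETTER)
os.makedirs(SAVE_DIR, exist_ok=True)

cap = cv2.VideoCapture(0)
count = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Region of Interest from (0, 40) to (300, 400), as described above
    roi = frame[40:400, 0:300]
    cv2.rectangle(frame, (0, 40), (300, 400), (0, 255, 0), 2)
    cv2.imshow("Data Collection", frame)

    key = cv2.waitKey(1) & 0xFF
    if key == ord("s"):                        # save the current ROI crop
        cv2.imwrite(os.path.join(SAVE_DIR, f"{count}.jpg"), roi)
        count += 1
    elif key == ord("q"):                      # quit
        break

cap.release()
cv2.destroyAllWindows()
```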
Script: predata.py
- Loads the recorded hand sign images.
- Uses MediaPipe to extract 21 hand landmarks (keypoints) per hand.
- Converts each 30-frame gesture sequence into a `.npy` file containing the keypoints.
- Each sequence represents one gesture corresponding to a specific alphabet letter.
- Processed data is saved in the `DataPreProcessed/` directory, mirroring the subfolder structure of the raw data (see the extraction sketch below).
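A rough sketch of this extraction step is shown below, assuming 21 landmarks with (x, y, z) coordinates per frame (63 values per frame); the file paths and sequence naming are illustrative, not taken from `predata.py`.

```python
import os
import cv2
import numpy as np
import mediapipe as mp

mp_hands = mp.solutions.hands

def extract_keypoints(image_bgr, hands):
    """Return a flat array of 21 (x, y, z) hand landmarks, or zeros if no hand is found."""
    results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0]
        return np.array([[p.x, p.y, p.z] for p in lm.landmark]).flatten()  # 63 values
    return np.zeros(21 * 3)

# Process one 30-frame sequence for letter "A" (paths are illustrative)
with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    sequence = []
    for frame_no in range(30):
        img = cv2.imread(os.path.join("RawData", "A", f"{frame_no}.jpg"))
        sequence.append(extract_keypoints(img, hands))
    os.makedirs(os.path.join("DataPreProcessed", "A"), exist_ok=True)
    np.save(os.path.join("DataPreProcessed", "A", "0.npy"), np.array(sequence))  # shape (30, 63)
```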
Script: trainingmodel.py
- A deep LSTM (Long Short-Term Memory) neural network is used for sequence classification.
- Model architecture (sketched below):
  - 3 LSTM layers with ReLU activation and `return_sequences=True`
  - 2 Dense layers with ReLU activation
  - 1 output Dense layer with Softmax activation for multi-class classification
- Training is done using `model.fit()` for 200 epochs
- TensorBoard is used for logging training metrics
- After training:
  - Model weights are saved to `model.h5`
  - Model architecture is saved to `model.json`
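A minimal sketch of such a training setup is shown below. The layer sizes, class count, and the `X_train.npy`/`y_train.npy` input files are assumptions for illustration; note that the final LSTM layer here uses `return_sequences=False` so its output can feed the Dense layers.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import TensorBoard

NUM_CLASSES = 10             # assumed number of gesture/emotion labels
SEQ_LEN, FEATURES = 30, 63   # 30 frames x 21 landmarks x (x, y, z)

model = Sequential([
    # Layer sizes are illustrative, not taken from trainingmodel.py
    LSTM(64, return_sequences=True, activation="relu", input_shape=(SEQ_LEN, FEATURES)),
    LSTM(128, return_sequences=True, activation="relu"),
    LSTM(64, return_sequences=False, activation="relu"),
    Dense(64, activation="relu"),
    Dense(32, activation="relu"),
    Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="Adam", loss="categorical_crossentropy",
              metrics=["categorical_accuracy"])

# X: (num_sequences, 30, 63) keypoint sequences, y: one-hot labels (hypothetical files)
X = np.load("X_train.npy")
y = np.load("y_train.npy")
model.fit(X, y, epochs=200, callbacks=[TensorBoard(log_dir="Logs")])

# Persist weights and architecture separately, as described above
model.save_weights("model.h5")
with open("model.json", "w") as f:
    f.write(model.to_json())
```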
Script: application.py
- Loads the trained model from `model.json` and `model.h5`.
- Opens a live webcam feed using OpenCV.
- Uses MediaPipe to extract hand landmarks in real time.
- Maintains a buffer of the last 30 frames to form a sequence for prediction.
- Displays:
- Predicted gesture/class
- Prediction confidence score
- UI overlays using OpenCV for visual feedback
- Real-time probability bars show the model's confidence for each class.
- Sentence history (recognized gestures over time) is displayed at the top of the webcam frame.
- Prediction confidence is updated live for user transparency.
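A simplified sketch of this prediction loop is shown below, omitting the probability bars and sentence-history overlay; the `LABELS` list and overlay text are placeholders, not the exact output of `application.py`.

```python
import cv2
import numpy as np
import mediapipe as mp
from tensorflow.keras.models import model_from_json

# Load architecture and weights as described above
with open("model.json") as f:
    model = model_from_json(f.read())
model.load_weights("model.h5")

LABELS = ["A", "B", "C"]          # illustrative label list
mp_hands = mp.solutions.hands

sequence = []                     # rolling buffer of the last 30 frames
cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0]
            keypoints = np.array([[p.x, p.y, p.z] for p in lm.landmark]).flatten()
        else:
            keypoints = np.zeros(21 * 3)

        sequence.append(keypoints)
        sequence = sequence[-30:]  # keep only the last 30 frames

        if len(sequence) == 30:
            probs = model.predict(np.expand_dims(sequence, axis=0), verbose=0)[0]
            label = LABELS[int(np.argmax(probs))]
            cv2.putText(frame, f"{label} ({probs.max():.2f})", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

        cv2.imshow("AERONET", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

cap.release()
cv2.destroyAllWindows()
```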
Tech stack:
- TensorFlow / Keras – for deep learning model building and training
- MediaPipe – for hand tracking and keypoint extraction
- OpenCV – for webcam streaming and UI rendering
- NumPy – for numerical data handling
- scikit-learn – for dataset splitting and basic preprocessing