A comprehensive simulation system that combines a React frontend with a Python reinforcement learning backend to solve the drone routing problem with battery constraints and recharging stations.
- Interactive 2D grid visualization
- Click-to-place customer nodes and recharging stations
- Real-time drone position and path tracking
- Live battery level indicator
- Step-by-step decision-making logs
- Simulation statistics dashboard
- 2D grid graph environment representation
- Battery consumption (1 unit per second of travel)
- Constraint handling:
  - Customers must be visited exactly once
  - No direct travel between recharge stations
  - Episode fails if battery depletes before reaching a recharge station
- Q-learning algorithm implementation (see the sketch after this list)
- State space: current position, remaining battery, visited customers
- Action space: move to adjacent valid nodes
- Reward function:
  - +100 for visiting all customers
  - +50 for visiting a new customer
  - -100 for battery depletion
  - -1 per move (efficiency incentive)
  - Distance-based penalties
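A minimal sketch of that tabular Q-learning loop, assuming a dictionary-backed Q-table keyed by (state, action) pairs; the names and structure here are illustrative rather than copied from the backend:

```python
import random
from collections import defaultdict

# Illustrative hyperparameters matching the defaults documented below.
ALPHA = 0.1    # learning rate
GAMMA = 0.95   # discount factor

q_table = defaultdict(float)  # maps (state, action) -> estimated return

def choose_action(state, valid_actions, epsilon):
    """Epsilon-greedy selection over the actions that are legal in this state."""
    if random.random() < epsilon:
        return random.choice(valid_actions)                            # explore
    return max(valid_actions, key=lambda a: q_table[(state, a)])       # exploit

def update(state, action, reward, next_state, next_valid_actions):
    """One-step Q-learning (Bellman) update."""
    best_next = max((q_table[(next_state, a)] for a in next_valid_actions), default=0.0)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
```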
- Frontend: React.js with HTML5 Canvas for grid visualization
- Backend: Python Flask with CORS support
- RL Algorithm: Q-learning with epsilon-greedy exploration
- Communication: REST API between frontend and backend
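The backend exposes its endpoints as a small Flask application with CORS enabled so the React dev server on port 3000 can call it across origins. A minimal sketch of that wiring, assuming the flask-cors package; the route name is illustrative, since exact routes are not listed in this README:

```python
from flask import Flask, jsonify
from flask_cors import CORS

app = Flask(__name__)
CORS(app)  # allow the React frontend (http://localhost:3000) to call this API

@app.route("/health", methods=["GET"])
def health():
    # Simple liveness probe the frontend can hit before starting a simulation.
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    app.run(port=5000, debug=True)
```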
- Node.js (v14 or higher)
- Python 3.8 or higher
- npm or yarn
- Install dependencies:
  ```bash
  npm install
  ```
- Start the React development server:
  ```bash
  npm start
  ```

The frontend will be available at http://localhost:3000.
- Navigate to the backend directory:
  ```bash
  cd backend
  ```
- Create a virtual environment (recommended):
  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install Python dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Start the Flask backend:
  ```bash
  python app.py
  ```

The backend will be available at http://localhost:5000.
```bash
npm run dev
```

This command starts both the React frontend and the Python backend concurrently.
- Configure Parameters: Set grid size, number of customers, recharge stations, and battery capacity
- Place Customer Nodes: Click "Place Customers" and click on grid cells to place customer locations
- Place Recharge Stations: Click "Place Recharge Stations" and click on grid cells to place charging stations
- Set Start Position: Click "Set Start Position" and click on a recharge station to set the drone's starting point
- Click "Start Simulation" to begin the RL-powered routing
- Watch the drone navigate the grid in real-time
- Monitor battery levels, visited customers, and decision logs
- View simulation statistics including steps taken, efficiency, and completion rate
- Red Circles: Unvisited customers (C1, C2, etc.)
- Green Circles: Visited customers
- Yellow Squares: Recharge stations (R1, R2, etc.)
- Green Square Border: Start position
- Blue Circle: Current drone position
- Blue Line: Drone's path history
Start a new simulation with RL agent training and pathfinding.
Request Body:
```json
{
  "grid_size": 10,
  "customers": [[2, 3], [7, 8], [1, 9]],
  "recharge_stations": [[0, 0], [9, 9]],
  "start_position": [0, 0],
  "battery_capacity": 20
}
```

Response:
```json
{
  "success": true,
  "path": [[0, 0], [2, 3], [7, 8], [1, 9]],
  "total_reward": 245.5,
  "steps_taken": 15,
  "training_complete": true
}
```
Train the RL agent for additional episodes.
Retrieve the current Q-table for analysis.
Health check endpoint.
- Learning Rate: 0.1 (how much new information overrides old)
- Discount Factor: 0.95 (importance of future rewards)
- Epsilon: 1.0 → 0.01 (exploration vs exploitation balance)
- Epsilon Decay: 0.995 (gradual shift from exploration to exploitation)
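With these defaults, exploration fades gradually: multiplying by 0.995 each episode takes epsilon from 1.0 down to the 0.01 floor after roughly 920 episodes. A quick sketch of that schedule, using the default values listed above:

```python
# Replay the epsilon schedule implied by the defaults above.
epsilon, epsilon_min, epsilon_decay = 1.0, 0.01, 0.995

for episode in range(1, 1001):
    epsilon = max(epsilon_min, epsilon * epsilon_decay)
    if episode % 200 == 0:
        print(f"episode {episode}: epsilon = {epsilon:.3f}")
# epsilon hits the 0.01 floor after roughly 920 episodes
```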
States are represented as tuples containing:
- Current X coordinate
- Current Y coordinate
- Remaining battery level
- Visited customers bitmask
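A minimal sketch of how such a state key could be assembled, with the visited set packed into a bitmask; the helper name is illustrative, not taken from the repository:

```python
def encode_state(x, y, battery, visited_indices):
    """Pack the drone's situation into a hashable Q-table key.

    visited_indices: indices of customers already served, e.g. {0, 2}.
    """
    visited_mask = 0
    for i in visited_indices:
        visited_mask |= 1 << i       # set bit i once customer i is visited
    return (x, y, battery, visited_mask)

# Example: drone at (2, 3) with 14 battery left, customers 0 and 2 already visited
state = encode_state(2, 3, 14, {0, 2})   # -> (2, 3, 14, 5), since 0b101 == 5
```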
Actions represent moving to any valid position (customer or recharge station) that:
- Is reachable with current battery
- Doesn't violate movement constraints
- Follows the "no recharge-to-recharge" rule
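In code, this amounts to enumerating candidate target nodes and discarding any that break a rule. A hedged sketch, assuming battery cost proportional to Manhattan distance; the function and parameter names are illustrative:

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def valid_actions(position, battery, customers, stations, visited_mask, at_station):
    """Return the target nodes the drone may legally fly to next."""
    candidates = [("customer", i, pos) for i, pos in enumerate(customers)]
    candidates += [("station", i, pos) for i, pos in enumerate(stations)]

    actions = []
    for kind, i, pos in candidates:
        if kind == "customer" and visited_mask & (1 << i):
            continue                       # each customer is visited exactly once
        if kind == "station" and at_station:
            continue                       # no direct recharge-to-recharge travel
        if manhattan(position, pos) > battery:
            continue                       # not reachable on the remaining charge
        actions.append((kind, i, pos))
    return actions
```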
The reward function balances multiple objectives:
- Task Completion: Large positive reward for visiting all customers
- Progress: Medium reward for visiting new customers
- Efficiency: Small penalty per move to encourage shorter paths
- Safety: Large penalty for battery depletion
- Distance: Penalty proportional to travel distance
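Put together, the shaping could look like the sketch below. The magnitudes mirror the list above, but the distance coefficient is assumed; the actual logic lives in the calculate_reward method of backend/environment.py:

```python
def calculate_reward(moved_distance, visited_new_customer, all_customers_visited,
                     battery_depleted):
    """Illustrative reward shaping using the magnitudes documented above."""
    reward = -1.0                     # per-move cost: prefer shorter routes
    reward -= 0.5 * moved_distance    # distance-based penalty (coefficient assumed)
    if battery_depleted:
        return reward - 100.0         # episode failure
    if visited_new_customer:
        reward += 50.0                # progress toward the goal
    if all_customers_visited:
        reward += 100.0               # task complete
    return reward
```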
Change the GRID_SIZE constant in src/App.js or use the UI controls.
Modify hyperparameters in backend/drone_rl_agent.py:
- learning_rate: How quickly the agent learns
- discount_factor: How much future rewards matter
- epsilon_decay: How quickly exploration decreases
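A hedged illustration of what tuning these might look like, assuming the hyperparameters are grouped into a simple config object; the real attribute layout in backend/drone_rl_agent.py may differ:

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    # Hypothetical config; the actual hyperparameters live as attributes or
    # constructor arguments in backend/drone_rl_agent.py.
    learning_rate: float = 0.1
    discount_factor: float = 0.95
    epsilon: float = 1.0
    epsilon_min: float = 0.01
    epsilon_decay: float = 0.995

# Example tweak: explore for longer and weigh distant customers more heavily.
config = AgentConfig(discount_factor=0.99, epsilon_decay=0.999)
```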
Edit the calculate_reward method in backend/environment.py to implement different reward strategies.
- Ensure Python backend is running on port 5000
- Check the CORS configuration in backend/app.py
- Verify all Python dependencies are installed
- Clear browser cache and reload
- Check browser console for JavaScript errors
- Ensure all npm dependencies are installed
- Verify start position is set at a recharge station
- Ensure at least one customer is placed
- Check that battery capacity is sufficient for basic moves
- Deep Q-Network (DQN) implementation
- Policy Gradient methods (PPO, A3C)
- Multi-drone coordination
- Dynamic obstacles and weather conditions
- 3D visualization
- Export simulation data and trained models
- Performance benchmarking tools
- Advanced pathfinding algorithms comparison
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is open source and available under the MIT License.