A machine learning project to predict police response times for the New Orleans Police Department using incident data from 2025. This project analyzes emergency response patterns to help optimize resource allocation and improve public safety services.
The attached data file, ds.csv, is a real-world dataset from https://data.gov.
This project aims to predict the overall response time for police incidents in New Orleans, calculated as the difference between TimeClosed and TimeCreate. By analyzing historical incident data from 2025, we develop predictive models that can help the New Orleans Police Department better understand response patterns and optimize their operations.
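As a minimal sketch of how the target could be derived, the snippet below subtracts TimeCreate from TimeClosed with pandas. The two-row sample frame, the `add_response_minutes` helper, the `response_minutes` column name, and the choice of minutes as the unit are illustrative assumptions, not the project's actual preprocessing code.

```python
import pandas as pd

# Hypothetical two-row sample; the real ds.csv has 21 columns.
sample = pd.DataFrame({
    "TimeCreate": ["2025-01-01 10:00:00", "2025-01-01 23:50:00"],
    "TimeClosed": ["2025-01-01 10:45:30", "2025-01-02 00:20:00"],
})


def add_response_minutes(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a response_minutes column (TimeClosed - TimeCreate)."""
    out = df.copy()
    # Unparseable timestamps become NaT instead of raising.
    created = pd.to_datetime(out["TimeCreate"], errors="coerce")
    closed = pd.to_datetime(out["TimeClosed"], errors="coerce")
    out["response_minutes"] = (closed - created).dt.total_seconds() / 60.0
    # Keep only rows with a valid, non-negative duration (NaN fails the comparison).
    return out[out["response_minutes"] >= 0]


result = add_response_minutes(sample)
print(result["response_minutes"].tolist())
```

Coercing bad timestamps to NaT and filtering them out keeps malformed rows from leaking into the target; whether the project drops or imputes such rows is not stated in the source.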
Data Source: Orleans Parish Communication District (OPCD) - the administrative office of 9-1-1 for the City of New Orleans
Dataset Details:
- Time Period: 2025 incident reports
- Target Variable: Response time (TimeClosed - TimeCreate)
- Total Records: 29,753 incidents
- Features: 21 columns including incident types, priorities, locations, and timestamps
| Column | Type | Description |
|---|---|---|
| NOPD_Item | String | Unique incident identifier |
| Type | String | Incident type code |
| TypeText | String | Human-readable incident type |
| Priority | String | Incident priority level |
| InitialType | String | Initial incident type code |
| InitialTypeText | String | Initial incident type description |
| InitialPriority | String | Initial priority assignment |
| MapX | Numeric | X-coordinate location |
| MapY | Numeric | Y-coordinate location |
| TimeCreate | DateTime | Incident creation timestamp |
| TimeDispatch | DateTime | Dispatch timestamp |
| TimeArrive | DateTime | Officer arrival timestamp |
| TimeClosed | DateTime | Incident closure timestamp |
| Disposition | String | Incident resolution code |
| DispositionText | String | Resolution description |
| SelfInitiated | String | Whether incident was self-initiated |
| Beat | String | Police beat designation |
| BLOCK_ADDRESS | String | Incident location |
| Zip | Numeric | ZIP code |
| PoliceDistrict | Numeric | Police district number |
| Location | String | Geographic coordinates |
- Python 3.8+
- Jupyter Notebook or JupyterLab
- Required libraries (see requirements section)
- Clone the repository:
git clone https://github.com/ysandansing/responseTimePrediction.git
cd responseTimePrediction
- Create a virtual environment (optional):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
or use conda:
conda create -n myEnv python=3.8
conda activate myEnv
- Install required dependencies:
pip install -r requirements.txt
Baseline Models
Three regression models were trained to establish performance benchmarks:
- Linear Regression: Provided foundational insights into linear relationships between features and response times
- Lasso Regression (L1): Automated feature selection through coefficient zeroing, handling high-dimensional data
- Ridge Regression (L2): Maintained all features while preventing overfitting through regularization
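The three baselines can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the project's training code: the feature matrix, the `alpha` values, and the 80/20 split are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: only the first 3 of 10 features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
true_coef = np.array([2.0, -1.0, 0.5] + [0.0] * 7)
y = X @ true_coef + rng.normal(scale=0.1, size=500)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# alpha values are illustrative, not tuned.
models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=1.0),
}

val_mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_mae[name] = mean_absolute_error(y_val, model.predict(X_val))

# Lasso drives some coefficients exactly to zero; Ridge only shrinks them.
n_zeroed_lasso = int(np.sum(models["lasso"].coef_ == 0.0))
n_zeroed_ridge = int(np.sum(models["ridge"].coef_ == 0.0))
print(val_mae, n_zeroed_lasso, n_zeroed_ridge)
```

Counting exact zeros in `coef_` is a quick way to see the feature-selection effect of the L1 penalty next to the L2 penalty, which keeps every feature in the model.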
Neural Network Architecture
A 3-layer feedforward network implemented in PyTorch:
- Input (64) → Hidden 1 (128, ReLU, Dropout 0.2) → Hidden 2 (64, ReLU) → Output (1)
- Trained with AdamW optimizer (cyclical LR: 0.001-0.0001)
- Incorporated batch normalization and early stopping (patience=3)
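A rough PyTorch sketch of this setup is shown below. The layer sizes, dropout rate, learning-rate bounds, and patience come from the description above; the class names `ResponseTimeNet` and `EarlyStopping`, the placement of batch normalization after the first linear layer, the use of `CyclicLR` as the cyclical schedule, and `step_size_up=100` are assumptions made for illustration.

```python
import torch
from torch import nn


class ResponseTimeNet(nn.Module):
    """64 -> 128 -> 64 -> 1 feedforward regressor (hypothetical name)."""

    def __init__(self, n_features: int = 64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, 128),
            nn.BatchNorm1d(128),  # placement is an assumption
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x).squeeze(-1)


class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience: int = 3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


model = ResponseTimeNet()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# cycle_momentum=False because only the learning rate is cycled here.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=1e-3, step_size_up=100, cycle_momentum=False
)
loss_fn = nn.MSELoss()

# One training step on random stand-in data.
torch.manual_seed(0)
features, targets = torch.randn(32, 64), torch.randn(32)
model.train()
optimizer.zero_grad()
loss = loss_fn(model(features), targets)
loss.backward()
optimizer.step()
scheduler.step()
```

In a full loop, `should_stop` would be called once per epoch with the validation loss, and training would break as soon as it returns `True`.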
| Model | Training Time | Validation MSE | Validation MAE | R² Score |
|---|---|---|---|---|
| Linear Regression | 0.09s | 0.872 | 0.621 | 0.412 |
| Lasso Regression | 1.25s | 0.763 | 0.587 | 0.486 |
| Ridge Regression | 0.03s | 0.758 | 0.584 | 0.491 |
| Neural Network | 22m7s | 0.682 | 0.512 | 0.573 |
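For reference, the three validation metrics in the table follow the standard definitions sketched below. The `regression_metrics` helper and the four-point example are illustrative only; the project may well use library implementations instead.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R^2 for two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    # R^2 compares the residual error to the variance around the mean.
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "r2": r2}


metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(metrics)
```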
Key Findings:
- The neural network achieved about 12% lower MAE than the best linear model, Ridge (0.512 vs 0.584 minutes)
- Lasso/Ridge showed comparable performance despite different regularization approaches
- Training time grew roughly 15,000x from linear regression (0.09s) to the neural network (22m7s)
- Log-transforming the target variable reduced its right skew (reported σ: 4.2 → 0.8)
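The effect of a log transform on a right-skewed target can be illustrated as follows. This is a sketch on synthetic log-normal data, so the numbers will not match the figures above; the use of `np.log1p` (rather than a plain log) and the `skewness` helper are assumptions.

```python
import numpy as np


def skewness(values) -> float:
    """Sample skewness: third central moment over the cubed standard deviation."""
    v = np.asarray(values, dtype=float)
    centered = v - v.mean()
    return float(np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5)


# Synthetic right-skewed stand-in for response times.
rng = np.random.default_rng(42)
response_minutes = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)

log_target = np.log1p(response_minutes)  # log(1 + x), safe at zero
skew_before = skewness(response_minutes)
skew_after = skewness(log_target)

# Predictions made on the log scale are mapped back with the inverse transform.
recovered = np.expm1(log_target)
print(round(skew_before, 2), round(skew_after, 2))
```

Errors measured on the log scale are not in the original units, so predictions need to be passed through `np.expm1` before being compared with raw response times.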