This project develops robust time series forecasting models to predict "Total Cloud Cover [%]" 15, 25, and 30 minutes ahead using historical weather data. The dataset spans 11 months and includes 17 feature columns; the last 10% of the data is reserved as the test set for validation.
The R² score is the primary evaluation metric used to assess prediction quality.
The methodology is divided into two primary stages: data preparation and model development.

The first stage covers data preparation and feature engineering (minimal code sketches of several of these steps follow the list):

- **Sorting the Data:** Created a `Datetime` column in the `YYYY-MM-DD HH:MM` format and sorted by it to ensure chronological order in the dataset.
- **Handling Missing Values:** Identified and interpolated 1,400 missing values in the "Total Cloud Cover" column using cubic interpolation.
- **Feature Transformation:**
  - Converted the Azimuth Angle (measured in degrees) into sine and cosine components to handle its cyclical behavior.
  - Transformed Wind Speed and Wind Direction into vector components:
    - Before transformation: `Wind Speed`, `Wind Direction`
    - After transformation: `Wind_U`, `Wind_V` (vector components)
- **Cyclic Features:** Added daily and weekly cyclic features to capture periodic patterns in the weather conditions.
- **Outlier Handling:** Replaced outliers in the "Snow Depth" column with the maximum observed value within a reasonable range.
- **Lagged Features:** Added lagged feature columns to capture temporal dependencies; however, this degraded performance, and the lags were excluded from the final model.
- **Visualization:** Performed exploratory data analysis to validate the processed features and understand the dataset.
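As a rough illustration of the first two steps, the sketch below assumes the raw CSV exposes separate `Year`, `Month`, `Day`, `Hour`, and `Minute` columns (the actual column names may differ) and builds the `Datetime` column before interpolating the cloud-cover gaps:

```python
import pandas as pd

df = pd.read_csv("train.csv")

# Assemble a single Datetime column (YYYY-MM-DD HH:MM) and sort chronologically.
# The date-part column names here are assumptions about the raw file.
date_parts = df[["Year", "Month", "Day", "Hour", "Minute"]].rename(columns=str.lower)
df["Datetime"] = pd.to_datetime(date_parts)
df = df.sort_values("Datetime").reset_index(drop=True)

# Fill the missing "Total Cloud Cover [%]" values with cubic interpolation
# (pandas delegates the "cubic" method to SciPy).
df["Total Cloud Cover [%]"] = df["Total Cloud Cover [%]"].interpolate(method="cubic")
```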
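A similar sketch of the feature transformations, again with assumed column names (`Azimuth Angle`, `Wind Speed`, `Wind Direction`) and one common U/V convention:

```python
import numpy as np

# Encode the azimuth angle as sine/cosine so that 0° and 360° map to the same point.
azimuth_rad = np.deg2rad(df["Azimuth Angle"])
df["Azimuth_sin"] = np.sin(azimuth_rad)
df["Azimuth_cos"] = np.cos(azimuth_rad)

# Decompose wind speed and direction into Wind_U / Wind_V vector components.
wind_dir_rad = np.deg2rad(df["Wind Direction"])
df["Wind_U"] = df["Wind Speed"] * np.sin(wind_dir_rad)
df["Wind_V"] = df["Wind Speed"] * np.cos(wind_dir_rad)
df = df.drop(columns=["Wind Speed", "Wind Direction"])

# Daily and weekly cyclic features derived from the Datetime column.
minute_of_day = df["Datetime"].dt.hour * 60 + df["Datetime"].dt.minute
df["day_sin"] = np.sin(2 * np.pi * minute_of_day / 1440)
df["day_cos"] = np.cos(2 * np.pi * minute_of_day / 1440)
day_of_week = df["Datetime"].dt.dayofweek
df["week_sin"] = np.sin(2 * np.pi * day_of_week / 7)
df["week_cos"] = np.cos(2 * np.pi * day_of_week / 7)
```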
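And a sketch of the outlier and lag handling; the snow-depth cap and the lag offsets below are illustrative values, not the ones used in the notebooks:

```python
# Cap extreme "Snow Depth" readings at a plausible maximum (illustrative 99th percentile).
snow_cap = df["Snow Depth"].quantile(0.99)
df["Snow Depth"] = df["Snow Depth"].clip(upper=snow_cap)

# Lagged copies of the target to expose temporal dependencies. These were tried
# but ultimately excluded from the final model because they hurt performance.
for lag in (1, 2, 3):
    df[f"cloud_cover_lag_{lag}"] = df["Total Cloud Cover [%]"].shift(lag)
```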
The second stage covers model development and evaluation (illustrative sketches of several approaches follow the list):

- **Residual Modeling:** Implemented an LSTM with residual modeling, incorporating additional models such as Linear Regression, XGBoost, and ARIMA.
- **Advanced Architectures:** Experimented with BiLSTM (bidirectional LSTM) and ConvLSTM (convolutional LSTM) to enhance temporal pattern detection.
- **Baseline Models:** Compared performance using traditional recurrent models such as LSTM, GRU, and RNN.
- **Grid Search Optimization:** Conducted extensive hyperparameter tuning through grid search to optimize the number of LSTM layers, dense layers, and other parameters.
- **Custom Loss Function:** Designed and implemented a custom loss function to optimize the weighted accuracy metric.
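To make the residual-modeling idea concrete, here is a minimal sketch in which a simple base learner captures the broad trend and a second model is trained on what it missed. `X_train`, `y_train`, and `X_test` are assumed to be prepared features and a single-horizon target; the repository's notebook uses an LSTM rather than XGBoost as the residual learner:

```python
from sklearn.linear_model import LinearRegression
from xgboost import XGBRegressor

# Base model learns the overall relationship.
base = LinearRegression().fit(X_train, y_train)

# Residual model learns whatever the base model failed to explain.
residuals = y_train - base.predict(X_train)
residual_model = XGBRegressor(n_estimators=300, max_depth=4).fit(X_train, residuals)

# Final prediction = base prediction + predicted residual.
y_pred = base.predict(X_test) + residual_model.predict(X_test)
```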
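A hedged sketch of the advanced architectures follows. `window` and `n_features` are placeholder values for the lookback length and feature count; the Conv1D-plus-LSTM stack is one common reading of "ConvLSTM", and the notebook may instead use Keras' `ConvLSTM2D` layer:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Bidirectional, Conv1D, LSTM, Dense

window, n_features = 24, 17  # illustrative lookback length and feature count

# BiLSTM: the wrapper runs the LSTM over the window in both directions.
bilstm = Sequential([
    Bidirectional(LSTM(64), input_shape=(window, n_features)),
    Dense(32, activation="relu"),
    Dense(3),  # +15, +25, +30 minute forecasts
])

# Conv1D + LSTM: the convolution extracts short local patterns before the LSTM
# models the longer-range temporal structure.
convlstm = Sequential([
    Conv1D(32, kernel_size=3, activation="relu", input_shape=(window, n_features)),
    LSTM(64),
    Dense(3),
])
```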
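The baseline recurrent models can share a single builder, as in this sketch; the layer sizes are illustrative rather than the tuned values from `LSTM GRU RNN Model.ipynb`:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, GRU, SimpleRNN, Dense

def build_recurrent_model(cell, window=24, n_features=17, horizons=3):
    """cell is one of LSTM, GRU, or SimpleRNN; horizons covers +15/+25/+30 min."""
    model = Sequential([
        cell(64, input_shape=(window, n_features)),
        Dense(32, activation="relu"),
        Dense(horizons),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

baselines = {name: build_recurrent_model(cell)
             for name, cell in [("LSTM", LSTM), ("GRU", GRU), ("RNN", SimpleRNN)]}
```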
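The grid search can be read as a loop over candidate architectures scored by validation R². The parameter grids below are illustrative, and `X_train`, `y_train`, `X_val`, `y_val` are assumed to be pre-windowed arrays:

```python
from itertools import product
from sklearn.metrics import r2_score
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam

window, n_features = 24, 17  # illustrative values

best_cfg, best_r2 = None, float("-inf")
for lstm_units, dense_units, lr in product([32, 64, 128], [16, 32], [1e-3, 1e-4]):
    model = Sequential([
        LSTM(lstm_units, input_shape=(window, n_features)),
        Dense(dense_units, activation="relu"),
        Dense(3),
    ])
    model.compile(optimizer=Adam(learning_rate=lr), loss="mse")
    model.fit(X_train, y_train, epochs=10, batch_size=64, verbose=0)
    r2 = r2_score(y_val, model.predict(X_val, verbose=0))
    if r2 > best_r2:
        best_cfg, best_r2 = (lstm_units, dense_units, lr), r2
```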
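Finally, a minimal sketch of what a custom weighted loss can look like in Keras. The per-horizon weights here are placeholders; the actual weighting behind the weighted accuracy metric is defined in the notebooks:

```python
import tensorflow as tf

HORIZON_WEIGHTS = tf.constant([0.5, 0.3, 0.2])  # placeholder weights per horizon

def weighted_mse(y_true, y_pred):
    # Squared error per horizon, scaled by its weight, then averaged over the batch.
    squared_error = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_error * HORIZON_WEIGHTS)

# model.compile(optimizer="adam", loss=weighted_mse)
```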
The repository is organized as follows:
```
.
├── preprocessed.csv              # Preprocessed dataset
├── train.csv                     # Raw training dataset
├── testing.csv                   # Raw testing dataset
├── Bilstm Convlstm Model.ipynb   # Notebook for BiLSTM and ConvLSTM models
├── Grid Search Model.ipynb       # Notebook for grid search optimization
├── LSTM GRU RNN Model.ipynb      # Notebook for LSTM, GRU, and RNN models
├── Processing.ipynb              # Notebook for all preprocessing steps
└── Residual Modelling.ipynb      # Notebook for residual modeling
```
- `preprocessed.csv`: Preprocessed dataset used for training and evaluation.
- `train.csv`: Raw training dataset containing unprocessed weather data.
- `testing.csv`: Raw testing dataset for model validation.
- `Processing.ipynb`: Notebook covering all preprocessing steps, including handling missing values, feature engineering, and outlier detection.
- `Bilstm Convlstm Model.ipynb`: Notebook implementing BiLSTM and ConvLSTM models.
- `Grid Search Model.ipynb`: Notebook for hyperparameter tuning using grid search.
- `LSTM GRU RNN Model.ipynb`: Notebook exploring baseline models (LSTM, GRU, and RNN).
- `Residual Modelling.ipynb`: Notebook demonstrating residual modeling approaches using hybrid models.
Detailed model comparisons, evaluations, and visualizations are available in the respective notebooks. The project demonstrates the strengths and weaknesses of different modeling approaches for short-term weather forecasting.