
This presentation outlines a study on improving precipitation prediction in Lucknow using machine learning techniques compared to traditional General Circulation Models (GCM). The Random Forest model demonstrated superior accuracy, achieving the lowest RMSE and highest R² values among various tested models, including Linear Regression and Decision Trees. Future work will focus on refining the Random Forest model, exploring additional machine learning techniques, and expanding the geographical scope of the study.

Uploaded by bsujay424
OUTLINE OF PRESENTATION

❑Introduction
❑Motivations
❑Literature Review
❑Objectives
❑Methodology
❑Results & Discussions
❑Conclusions
❑Future Scope
❑References
Introduction
▪ Precipitation prediction plays a vital role in various sectors, including agriculture, water
resource management, transportation, and disaster management.
▪ Accurate rainfall predictions enable early warnings, better resource management, and disaster
preparedness.
▪ In this study, machine learning was employed to refine precipitation forecasting in Lucknow,
with the goal of building a model that delivers greater precision and efficiency than
conventional meteorological methods.
▪ The results obtained from a General Circulation Model (GCM) were compared with the results of machine learning (ML) algorithms.
▪ With the rapid advancement of ML and its ability to handle large datasets and identify complex
patterns, there has been growing interest in applying ML techniques to improve precipitation
prediction.

Motivation
▪ Bridging the gap in climate prediction accuracy: GCMs usually work with large-scale data
over large areas and long time periods, which makes it hard for them to predict rainfall
accurately in specific small regions, especially in places with unique local weather patterns.
Machine learning algorithms, in contrast, are well suited to handling large volumes of diverse
data, such as historical precipitation records, temperature, humidity, zonal wind (u-wind speed),
meridional wind (v-wind speed), and atmospheric pressure.

▪ This study compares linear regression, multiple ML algorithms (e.g., Random Forest,
Decision Tree) and GCM to identify the best-performing models for predicting rainfall.

▪ By benchmarking ML predictions against GCM outputs, the project contributes to the growing
body of research on the intersection of AI and climate science.
DATASET → MODEL → RAINFALL PREDICTION

INPUT VARIABLES:
1. Maximum Temperature
2. Minimum Temperature
3. Maximum Relative Humidity
4. Minimum Relative Humidity
5. Geopotential Height
6. Longwave Radiation
7. Sea level pressure
8. U-wind Speed
9. V-wind Speed

TARGET VARIABLE: Rainfall
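
The nine inputs and the single target listed above can be sketched as a simple feature/target layout. This is only an illustration: the column names below are hypothetical (the slides list the variables but not the dataset's actual headers), and the example record is made up purely to show the shapes involved.

```python
# Hypothetical column names for the nine inputs and the target.
# The slides name the variables but not the real dataset headers.
FEATURES = [
    "max_temperature",
    "min_temperature",
    "max_relative_humidity",
    "min_relative_humidity",
    "geopotential_height",
    "longwave_radiation",
    "sea_level_pressure",
    "u_wind_speed",
    "v_wind_speed",
]
TARGET = "rainfall"


def to_xy(records):
    """Split a list of dict records into feature rows X and targets y."""
    X = [[rec[name] for name in FEATURES] for rec in records]
    y = [rec[TARGET] for rec in records]
    return X, y


# One made-up record, used only to show the shapes (not real data).
example = {name: 0.0 for name in FEATURES}
example[TARGET] = 1.5
X, y = to_xy([example])
```

Each row of `X` holds the nine predictors in a fixed order, and `y` holds the matching rainfall value.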

Ref: www.google.com (Images)


Literature Review
▪ Khan et al. (2024)
   Title: Comparative analysis of different rainfall prediction models: A case study of Aligarh City, India
   Model(s) used: Logistic Regression, Decision Tree, MLP, Random Forest
   Key findings: Enhanced forecasting accuracy through machine learning model comparison and feature selection.
   Data source / region: Aligarh City, India
   Research gaps: Limited geographic focus; model generalization for diverse climates not addressed.

▪ Barrera-Animas et al. (2022)
   Title: LSTM-based rainfall forecasting models for UK cities
   Model(s) used: LSTM, Stacked-LSTM
   Key findings: LSTM-based models effectively forecasted hourly rainfall in urban areas.
   Data source / region: Time-series data from five UK cities
   Research gaps: Limited to urban areas; further testing needed for rural or diverse topographic regions.

▪ Ganapathy et al. (2022)
   Title: Rainfall Forecasting Using Machine Learning Algorithms for Localized Events
   Model(s) used: ARIMA, Holt-Winters, LSTM, SVR, Linear Regression, XGBoost
   Key findings: XGBoost Regression was the most accurate for localized rainfall forecasts.
   Data source / region: Vellore region, India
   Research gaps: Limited adaptability to non-localized or larger-scale weather events.

▪ Liyew et al. (2021)
   Title: Machine learning techniques to predict daily rainfall amount
   Model(s) used: MLR, RF, XGBoost
   Key findings: XGBoost performed best among models in predicting daily rainfall.
   Data source / region: Bahir Dar City, Ethiopia
   Research gaps: Limited variety in machine learning models tested; need for more advanced deep learning models.

▪ Li et al. (2020)
   Title: Precipitation prediction using geopotential height, sea-level pressure, and longwave radiation
   Model(s) used: Random Forest, Gradient Boosting
   Key findings: High accuracy in short-term rainfall prediction using diverse atmospheric features.
   Data source / region: General study
   Research gaps: Lack of exploration with additional features like humidity and wind speed.

▪ Khan et al. (2020)
   Title: Hybrid DL approach for multi-step daily rainfall prediction
   Model(s) used: Conv1D-MLP
   Key findings: The hybrid Conv1D-MLP model was effective for multi-step-ahead rainfall predictions.
   Data source / region: General study
   Research gaps: Model suitability for diverse geographic locations and longer prediction horizons not examined.

Objectives of the Study

Objective 1: To determine and compare the accuracy of rainfall predictions in Lucknow city
using Machine Learning (ML) algorithms, focusing on forecasting rainfall patterns and assessing
their performance against forecasts derived from General Circulation Models (GCMs).

Objective 2: To evaluate the accuracy, error metrics, and predictive capabilities of GCM
rainfall forecasts and compare them with trained Machine Learning models, including decision
trees and random forest.

Methodology
1. Study Area & Data Collection: Lucknow, 1950 – 2014 rainfall data (9 input variables)
2. Data Preprocessing: handling missing values, outlier detection and removal, normalization/scaling, data splitting
3. Model Development: a statistical model and machine learning models such as Random Forest and Decision Tree; model training (70% training & 30% testing)
4. Model Evaluation: evaluation metrics (R², RMSE, bias)
5. Comparison & Analysis: model comparison and visualization
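
The preprocessing and splitting steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: dropping incomplete rows, min-max scaling, and a shuffled 70/30 split with a fixed seed are all assumed choices, and the helper names are hypothetical. Outlier removal is left out because the slides do not state which rule was used.

```python
import random


def drop_missing(rows):
    """Keep only rows in which no value is None (assumed missing-value rule)."""
    return [row for row in rows if all(value is not None for value in row)]


def min_max_scale(column):
    """Rescale one column of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Toy rows (feature, target), with one incomplete row, to exercise the helpers.
rows = [(float(i), float(i) * 2.0) for i in range(10)]
rows.append((None, 1.0))

clean = drop_missing(rows)
scaled_feature = min_max_scale([row[0] for row in clean])
train, test = train_test_split(clean)
```

Whether the original split was random or chronological is not stated; for time-ordered rainfall data, a chronological cut is a common alternative to the shuffled split assumed here.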

Results & Discussions
▪ The focus of this study was to predict daily rainfall using a statistical model (Linear Regression) and machine
learning algorithms (Random Forest and Decision Tree).

▪ The rainfall predicted by the machine learning models was compared with the GCM prediction.

▪ Three key metrics, R², RMSE, and bias, were used to judge the performance of the different models.

▪ The Random Forest model outperformed the other models, achieving the lowest Root Mean Squared Error
(RMSE) and the highest R² value, making it the most accurate and robust model for this task.

▪ For detailed analysis, tabulated results, scatter plots, and time series plots are provided in the
following slides.
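
The three metrics can be sketched as follows. The slides do not give the formulas, so this assumes the usual definitions: R² as the coefficient of determination (1 minus residual sum of squares over total sum of squares), RMSE as the root of the mean squared error, and bias as the mean of predicted minus observed. The sign convention for bias in particular is an assumption.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the rainfall values."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def bias(observed, predicted):
    """Mean error (predicted minus observed); positive means over-prediction."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


# Toy values: every prediction is exactly 1 unit too high.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]

toy_r2 = r_squared(observed, predicted)
toy_rmse = rmse(observed, predicted)
toy_bias = bias(observed, predicted)
```

In the toy case the constant offset shows up directly as the bias and the RMSE, while R² drops well below 1 even though the predictions follow the observations perfectly in shape.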

[Figure: bar charts of R² values and RMSE values (training and testing) for Linear Regression, Decision Tree, and Random Forest]
R² (training / testing): Linear Regression 0.36786 / 0.3542; Decision Tree 0.1923 / 0.423; Random Forest 0.5794 / 0.6955
RMSE (training / testing): Linear Regression 2.686 / 2.607; Decision Tree 3.689 / 2.4628; Random Forest 2.191 / 1.7903

▪ The Random Forest model explains 69.55% of the variance in rainfall on unseen (testing) data, the strongest
result among the models tested. Its training R² (0.5794) is lower than its testing R², so there is no sign of severe overfitting.
▪ Random Forest has the lowest RMSE values on both training and testing data, suggesting that its predictions are
the most accurate among the models.

[Figure: scatter plot of GCM-predicted vs. observed rainfall (mm); R² = 0.001347]

The R-squared value (R² = 0.001347) indicates a very weak correlation between the observed and
predicted values. This suggests that the model (GCM) used for prediction does not accurately capture the
patterns in the observed rainfall data.
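
To make "very weak correlation" concrete, the sketch below converts the reported R² into an implied correlation coefficient. It assumes the R² on the scatter plot comes from a straight-line trendline, in which case R² equals the square of the Pearson correlation r; the slides do not confirm how the value was computed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Reported GCM value from the slide; |r| is implied only under the
# straight-line-trendline assumption stated above.
gcm_r2 = 0.001347
implied_abs_r = math.sqrt(gcm_r2)
```

Under that assumption the implied |r| is roughly 0.037, so observed and GCM-predicted rainfall are close to uncorrelated.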

Linear Regression (LR)

[Figure: scatter plots for Linear Regression; training R² = 0.36786, testing R² = 0.3542]

▪ The plots suggest that the linear regression model has a moderate fit on both training and testing datasets.
▪ However, the R-squared values are relatively low, indicating that the model is not capturing a significant amount
of the variation in the data. This could suggest that a linear regression model may not be the best fit for this
dataset or that additional features or transformations are needed to improve the model's performance.

[Figure: time series plots for Linear Regression, training and testing]

▪ The image presents two time series plots, illustrating the performance of a Linear Regression (LR) model on
a training and testing dataset.
▪ The model performs well on the training data but struggles to predict larger fluctuations in the data,
especially in the testing dataset. This indicates a need for further model improvement.

Decision Tree (DT)

[Figure: scatter plots for Decision Tree; training R² = 0.1923, testing R² = 0.423]

▪ The Decision Tree model shows signs of unstable generalization: its training R² (0.1923) and testing R²
(0.423) differ substantially. Overfitting occurs when a model learns the training data too well, including
its noise, making it less effective at generalizing to new data.
▪ Both R² values are low, which indicates that the model explains little of the variance in the data.
▪ To improve the model, the decision tree can be pruned, and other machine learning algorithms can be tested.
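
One way to prune a decision tree is cost-complexity pruning, sketched below with scikit-learn's `DecisionTreeRegressor` and its `ccp_alpha` parameter. The slides do not say which library or pruning method the study would use, so the library choice, the `ccp_alpha` value, and the synthetic data are all assumptions for illustration only.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data standing in for the nine predictors (not the study's data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 9))
y = 5.0 * X[:, 0] + rng.normal(0.0, 1.0, size=300)

# An unconstrained tree grows until it fits the training noise.
full_tree = DecisionTreeRegressor(random_state=0).fit(X, y)

# A positive ccp_alpha removes branches whose impurity reduction does not
# justify their added complexity. The value 0.05 is an arbitrary example.
pruned_tree = DecisionTreeRegressor(random_state=0, ccp_alpha=0.05).fit(X, y)

full_leaves = full_tree.get_n_leaves()
pruned_leaves = pruned_tree.get_n_leaves()
```

In practice the pruning strength would be chosen by validation rather than fixed in advance.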

[Figure: time series plots for Decision Tree, training and testing]

The model might be overfitting to the training data; it might require further tuning, or a different
algorithm, to improve its predictive performance on unseen data.

Random Forest (RF)

[Figure: scatter plots for Random Forest; training R² = 0.5794, testing R² = 0.6955]

▪ The model seems to perform reasonably well on both the training and testing dataset.
▪ The higher R-squared value on the testing dataset (0.6955) compared to the training dataset (0.5794)
suggests that the model does not suffer from significant overfitting. This is a positive sign, as it indicates
that the model is likely to generalize well to new data.

[Figure: time series plots for Random Forest, training and testing]

▪ The model seems to have learned the training data well, but its performance on the testing data suggests it
might be slightly overfitting.
▪ To improve generalization, consider using regularization techniques, gathering more diverse training data,
or adjusting the model's hyperparameters.

Evaluation Metrics

MODEL                  R²                     RMSE                   BIAS
                       Training   Testing     Training   Testing     Training   Testing
Linear Regression      0.368      0.354       2.686      2.607       -0.336     0.331
Decision Tree          0.192      0.423       3.689      2.462       -0.118     -0.002
Random Forest          0.580      0.696       2.191      1.790       -0.075     -0.021
GCM (on whole data)    0.001347               9.946
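
The comparison in the table can be reproduced with a few lines of Python. The numbers below are copied from the table; the variable names are hypothetical. The comparison against the GCM is indicative only, because the GCM figures are reported on the whole dataset while the ML figures are for the testing split.

```python
# Values copied from the Evaluation Metrics table.
metrics = {
    "Linear Regression": {"r2_train": 0.368, "r2_test": 0.354,
                          "rmse_train": 2.686, "rmse_test": 2.607},
    "Decision Tree": {"r2_train": 0.192, "r2_test": 0.423,
                      "rmse_train": 3.689, "rmse_test": 2.462},
    "Random Forest": {"r2_train": 0.580, "r2_test": 0.696,
                      "rmse_train": 2.191, "rmse_test": 1.790},
}
gcm = {"r2": 0.001347, "rmse": 9.946}  # reported on the whole dataset

# Rank the ML models on the testing split.
best_by_rmse = min(metrics, key=lambda name: metrics[name]["rmse_test"])
best_by_r2 = max(metrics, key=lambda name: metrics[name]["r2_test"])

# Training minus testing R²: a large positive gap would suggest overfitting.
r2_gap = {name: round(m["r2_train"] - m["r2_test"], 3)
          for name, m in metrics.items()}

# Fractional RMSE reduction of the best ML model relative to the GCM.
rmse_reduction_vs_gcm = 1.0 - metrics[best_by_rmse]["rmse_test"] / gcm["rmse"]
```

Both rankings select Random Forest, and none of the three models shows a large positive training-minus-testing R² gap.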

Conclusions
▪ Machine learning techniques were successfully applied to predict precipitation in Lucknow using
historical meteorological data.
▪ Various models, including Linear Regression, Decision Trees, and Random Forest, were tested for
accuracy and robustness in predicting rainfall.
▪ The Random Forest model outperformed other models, achieving the lowest Root Mean Squared
Error (RMSE) and the highest R² value, making it the most accurate and robust model for this task.
▪ The Random Forest model provided better prediction accuracy than General Circulation Model (GCM)
predictions, underscoring the value of machine learning in enhancing precipitation forecasts.
▪ Random Forest effectively captured interactions among key features like temperature, humidity, wind
patterns, and pressure, essential for accurate precipitation prediction.
▪ The study suggests exploring other machine learning algorithms to further improve precipitation
prediction accuracy.

Future scope

• Future work can focus on refining the Random Forest model (for example, through hyperparameter tuning)
for better accuracy and efficiency in precipitation forecasting.

• Additional machine learning techniques, including recent developments such as deep learning and
physics-informed machine learning, can be tested to improve prediction capabilities.

• More input variables affecting rainfall can be considered in further studies.

• Expanding the study to cover broader geographical areas can improve the robustness and
scalability of the forecasting models.
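
Hyperparameter tuning of a Random Forest could look like the sketch below, which uses scikit-learn's `GridSearchCV` with cross-validated RMSE. The library, the parameter grid, the number of folds, and the synthetic data are all assumptions chosen for illustration; the slides do not specify any of them.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the nine predictors (not the study's data).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 9))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 0.5, size=200)

# A deliberately small, hypothetical grid to keep the example fast.
param_grid = {"n_estimators": [20, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",  # scikit-learn maximizes scores
)
search.fit(X, y)

best_params = search.best_params_
best_cv_rmse = -search.best_score_  # flip the sign back to a plain RMSE
```

A real study would use a wider grid or a randomized search, and would keep a held-out test set separate from the cross-validation folds.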

References
• Barrera-Animas, Ari; Oyedele, Lukumon; Bilal, Muhammad; Akinosho, Taofeek; Davila Delgado, Manuel; Akanbi,
Lukman (2021). Rainfall prediction: A comparative analysis of modern machine learning algorithms for time-series
forecasting. Machine Learning with Applications, 7, 100204. https://doi.org/10.1016/j.mlwa.2021.100204

• Liyew, Chalachew; Melese, Haileyesus (2021). Machine learning techniques to predict daily rainfall amount. Journal of Big
Data, 8. https://doi.org/10.1186/s40537-021-00545-4

• Chattopadhyay, Ashesh; Subel, Adam; Hassanzadeh, Pedram (2020). Data-Driven Super-Parameterization Using Deep
Learning: Experimentation with Multiscale Lorenz 96 Systems and Transfer Learning.
https://doi.org/10.1029/2020MS002084

• Ganapathy, Ganapathy; Srinivasan, Kathiravan; Datta, Debajit; Chang, Chuan-Yu; Purohit, Om; Zaalishvili, Vladislav;
Burdzieva, Olga (2022). Rainfall Forecasting Using Machine Learning Algorithms for Localized Events. Computers,
Materials & Continua, 71, 6333-6350. https://doi.org/10.32604/cmc.2022.023254

• Khan, Mohd; Maity, Rajib (2020). Hybrid Deep Learning Approach for Multi-Step-Ahead Daily Rainfall Prediction Using
GCM Simulations. IEEE Access, PP, 1-1. https://doi.org/10.1109/ACCESS.2020.2980977

• Das, Riyanka; Chatterjee, Durjoy; Maji, Deblina; Roy, Deepsubhra; Datta, Piyali (2024). Modeling Rainfall Prediction
Framework Using Machine Learning Approach: A Comparative Study. https://doi.org/10.1007/978-3-031-71125-1_14
Thank You