Integration of ML with GIS for Land Use/Land Cover and Deforestation Prediction
Anushka Tagade1, Pradyumna Rokade2, Aastha Chandankar3, Pranay Bandichode4
First to Fifth Computer Department, DYPIEMR, DY Patil Educational Complex, Akurdi 411044, Pune, Maharashtra
ABSTRACT

Understanding land use and land cover (LULC) changes and their impacts on the environment is crucial for sustainable land management. This paper explores the integration of machine learning (ML) with Geographic Information System (GIS) techniques for predicting LULC changes and deforestation trends in arid regions. By utilizing remote sensing data from Landsat satellites, the study employs the CA-Markov hybrid model to simulate and forecast the spatial-temporal changes in LULC categories. The paper reviews various methodologies, including supervised classification and machine learning algorithms, to model and predict LULC dynamics. The results suggest that machine learning models, in combination with GIS tools, provide accurate and reliable predictions for deforestation and LULC changes, thereby supporting effective land-use planning and policy-making.

Index Terms: Machine Learning, Geographic Information Systems (GIS), Land Use/Land Cover, Deforestation, Remote Sensing, CA-Markov Model, Land-use Planning, Prediction Models

1. INTRODUCTION

The study of Land Use and Land Cover (LULC) changes and their environmental impact has become increasingly important, especially as human activities, such as urbanization and agriculture, continue to alter landscapes at an unprecedented rate. These changes have significant ecological, economic, and social consequences, particularly in arid and semi-arid regions where resources like water and arable land are already scarce. LULC changes, coupled with deforestation, contribute to biodiversity loss, altered water cycles, soil degradation, and climate change, all of which threaten environmental sustainability [1].

Historically, LULC monitoring relied on ground surveys and manual classification of aerial photographs. However, with advancements in satellite remote sensing, vast amounts of spatial data are now available, enabling more comprehensive and accurate tracking of these changes over time. Machine learning (ML) methods have revolutionized this field by providing powerful tools to process and analyze complex datasets. When combined with Geographic Information Systems (GIS), ML techniques can analyze spatial data to detect patterns, model changes, and make predictions about future land cover scenarios.

This paper explores the integration of ML with GIS to improve the detection, analysis, and prediction of LULC and deforestation trends. By combining remote sensing data from satellites like Landsat with predictive models such as the CA-Markov hybrid model, we can simulate spatial-temporal dynamics and project LULC patterns into the future. Such integrated approaches are invaluable for policymakers and environmental managers, as they allow for data-driven decision-making aimed at sustainable land-use management.

2. STUDY AREA AND DATA

2.1. Study area

The study area can be any arid region characterized by its topography, climate, and major land cover types. The region has experienced significant changes in land use due to urbanization, agricultural expansion, and desertification processes. The area’s climate is primarily dry with very low annual rainfall, making it highly susceptible to land degradation. The key land use types in the study area include urban settlements, agricultural lands, desert areas, and water bodies.

Fig. 1. Location of the study area

2.2. Data Acquisition

For this study, Landsat satellite images were used as the primary data source. The remote sensing data includes images from different Landsat sensors, including:
• Landsat 5 (TM) for 1984
• Landsat 8 (OLI) for 2013 and 2022

Historical data were included and used as the baseline for the model that predicts future land use and land cover changes.

(a) 2002 (b) 2013 (c) 2022
Fig. 2. Land use/cover map

These images provide the temporal coverage needed to track land cover changes over the past several decades. In addition to satellite data, ancillary data such as Digital Elevation Models (DEMs), socioeconomic data (population density, GDP), and vector data (roads, rivers, urban areas) were also
used to understand the driving factors behind LULC changes.

3. METHODOLOGY

To analyze and predict LULC changes and deforestation, this study integrates Machine Learning (ML) algorithms with Geographic Information Systems (GIS) and remote sensing data. The methodology involves a multi-step process, starting with the acquisition and preprocessing of satellite imagery and ancillary data. This is followed by LULC classification using supervised classification techniques and machine learning models. A CA-Markov model is then applied to simulate and project future land-cover scenarios based on past trends and transition probabilities.

The approach combines the predictive strengths of ML with the spatial analysis capabilities of GIS. Machine learning models, such as decision tree classifiers, random forests, and artificial neural networks, enhance the accuracy of LULC classification by learning from historical data and identifying patterns across multiple spectral indices. The CA-Markov model, a hybrid approach that combines Cellular Automata and the Markov Chain model, is used to simulate the spatial distribution of land-cover categories over time. This model takes into account both the probability of LULC transitions and the spatial relationships between cells, allowing for more realistic projections.

Each component of the methodology, from data preprocessing to classification, modeling, and validation, works in conjunction to deliver a comprehensive analysis of LULC trends. The methodology is designed to provide reliable, scalable, and accurate predictions that can support sustainable land management and policy development.

Fig. 3. Flow chart of the MARKOV_CA model

3.1. Data Preprocessing

The raw satellite images were preprocessed to ensure that they were geometrically and radiometrically aligned for accurate analysis. The preprocessing steps include:
• Geometric Correction: Aligning the satellite images to a common coordinate system (WGS 84), ensuring that all images have a consistent spatial reference.
• Radiometric Calibration: Adjusting the images to correct for atmospheric effects and to normalize brightness variations between different sensor types.
• Mosaicking: Combining multiple images from different scenes to cover the entire study area.
• Band Selection and Indices Calculation: Extracting relevant bands from the multispectral images and calculating spectral indices such as NDVI (Normalized Difference Vegetation Index), MNDWI (Modified Normalized Difference Water Index), and NDBI (Normalized Difference Built-Up Index) to differentiate between land cover types.
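The three spectral indices named in the preprocessing step (NDVI, MNDWI, NDBI) are all normalized differences of two bands. The sketch below shows their standard definitions for a single pixel. The reflectance values are hypothetical and serve only as an illustration; the function names are not taken from the study. On Landsat 5 TM the green, red, NIR, and SWIR1 bands are bands 2, 3, 4, and 5, and on Landsat 8 OLI they are bands 3, 4, 5, and 6.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir1):
    """Compute NDVI, MNDWI and NDBI for one pixel from band reflectances."""
    return {
        "NDVI": normalized_difference(nir, red),       # high over vegetation
        "MNDWI": normalized_difference(green, swir1),  # high over open water
        "NDBI": normalized_difference(swir1, nir),     # high over built-up surfaces
    }


# Hypothetical reflectances for a vegetated pixel (illustration only).
pixel = spectral_indices(green=0.08, red=0.05, nir=0.45, swir1=0.20)
```

For this example pixel NDVI is strongly positive while MNDWI and NDBI are negative, which is the pattern expected for cultivated land.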
3.2. LULC Classification
Landsat images were classified into different LULC
categories using a supervised classification technique. The
decision tree classifier (DTC) was applied, where:
• Training Data: Labeled samples from known land use
categories were used to train the classifier.
• Classification: The DTC algorithm recursively splits
the data into decision nodes based on spectral
properties, assigning each pixel to the most likely
LULC class.
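The recursive splitting described above can be sketched with a small CART-style tree that picks, at each node, the band or index threshold giving the lowest weighted Gini impurity. This is a generic illustration and not necessarily the exact DTC variant used in the study. The feature vectors (NDVI, MNDWI, NDBI) and their labels are hypothetical training samples.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)  # a split must beat the unsplit node
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=4):
    """Split recursively until a node is pure, unsplittable, or too deep."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical training samples: (NDVI, MNDWI, NDBI) per labeled pixel.
samples = [(0.10, -0.20, 0.15), (0.05, -0.15, 0.20),
           (0.60, -0.40, -0.30), (0.70, -0.45, -0.35),
           (0.08, -0.30, 0.05), (0.06, -0.35, 0.02),
           (-0.20, 0.50, -0.40), (-0.10, 0.40, -0.30)]
labels = ["urban", "urban", "cultivated", "cultivated",
          "desert", "desert", "water", "water"]
tree = build_tree(samples, labels)
```

Calling `predict(tree, (0.65, -0.42, -0.32))` then assigns an unseen pixel to one of the four classes. In practice a library implementation would normally be used in place of this hand-written tree.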
The LULC categories include urban areas, cultivated
land, desert land, and water bodies.
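Once pixels are assigned to these categories, agreement with reference labels is usually summarized with overall accuracy, producer's accuracy, and the Kappa coefficient, the metrics this paper reports. The sketch below derives all three from a confusion matrix. The matrix values are hypothetical and are not results from the study.

```python
def accuracy_metrics(confusion):
    """confusion[i][j] = samples of reference class i that were mapped as class j.

    Assumes every reference class has at least one sample.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]                              # reference totals
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]  # mapped totals
    observed = sum(confusion[i][i] for i in range(k)) / total                 # overall accuracy
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)   # agreement beyond chance
    producers = [confusion[i][i] / row_totals[i] for i in range(k)]
    return observed, producers, kappa


# Hypothetical confusion matrix; rows and columns ordered as
# urban, cultivated, desert, water.
confusion = [
    [45, 3, 2, 0],
    [2, 46, 2, 0],
    [3, 2, 45, 0],
    [0, 1, 0, 49],
]
overall, producers, kappa = accuracy_metrics(confusion)
```

With this matrix the overall accuracy is 0.925 and Kappa is 0.90. Kappa is lower than overall accuracy because it discounts the agreement expected by chance (0.25 here).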
3.3. CA-Markov Model for LULC Prediction
The CA-Markov model was used to predict future
LULC trends based on historical data. The model consists of
two main components:
• Markov Chain: Used to calculate the transition
probability matrix, which represents the likelihood of a
pixel changing from one land use category to another
over a given time period.
• Cellular Automata (CA): Used to simulate the spatial
distribution of LULC classes, incorporating the spatial
dependencies between adjacent cells in the grid.
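The Markov component can be illustrated by cross-tabulating two classified maps of the same area. The sketch assumes both maps are flattened to equal-length lists of class labels and that transition probabilities stay constant over time. The tiny ten-cell maps are hypothetical.

```python
def transition_matrix(earlier, later, classes):
    """Estimate P[i][j]: share of cells in class i at t1 that are class j at t2."""
    index = {c: n for n, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for a, b in zip(earlier, later):
        counts[index[a]][index[b]] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Class absent at t1: assume it would persist (identity row).
            matrix.append([1.0 if j == i else 0.0 for j in range(len(classes))])
        else:
            matrix.append([c / total for c in row])
    return matrix


def project_shares(shares, matrix, steps=1):
    """Advance the class shares by `steps` intervals using s(t+1) = s(t) P."""
    for _ in range(steps):
        shares = [sum(shares[i] * matrix[i][j] for i in range(len(shares)))
                  for j in range(len(shares))]
    return shares


# Hypothetical flattened maps for two dates (ten cells each).
classes = ["urban", "cultivated", "desert", "water"]
earlier = ["desert"] * 4 + ["cultivated"] * 4 + ["urban", "water"]
later = (["desert", "desert", "cultivated", "cultivated"]
         + ["cultivated", "cultivated", "cultivated", "urban"]
         + ["urban", "water"])

P = transition_matrix(earlier, later, classes)
current = [later.count(c) / len(later) for c in classes]
projected = project_shares(current, P, steps=1)
```

Each row of `P` sums to 1, so the projected shares also sum to 1. Each projection step covers the same time interval as the gap between the two input maps.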
The model was trained using historical LULC data from
1984 to 2022 and was used to project LULC scenarios for
the years 2030, 2040, and 2050. The spatial configuration
of LULC categories was predicted based on the transition
probabilities calculated from the Markov chain.
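The spatial half of the model can be sketched as a single cellular automaton update. The rule here is a deliberate simplification: each cell moves to the class that maximizes its Markov transition probability multiplied by that class's share of the 3x3 neighbourhood. Production CA-Markov tools also use suitability maps and allocate the quantities given by the Markov step, which this sketch omits. The grid and matrix values are hypothetical.

```python
def ca_step(grid, matrix, classes):
    """Apply one synchronous CA update to a 2-D grid of class labels."""
    index = {c: n for n, c in enumerate(classes)}
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # 3x3 window around the cell (cell included), clipped at the edges.
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            current = grid[r][c]
            scores = {name: matrix[index[current]][index[name]]
                      * window.count(name) / len(window)
                      for name in classes}
            best = max(classes, key=lambda name: scores[name])
            # Keep the current class unless another scores strictly higher.
            if scores[best] > scores[current]:
                new_grid[r][c] = best
    return new_grid


# Hypothetical two-class example: urban persists, desert may urbanize.
classes = ["urban", "desert"]
matrix = [[1.0, 0.0],   # urban  -> urban, desert
          [0.6, 0.4]]   # desert -> urban, desert
grid = [["urban", "urban", "urban"],
        ["urban", "desert", "desert"],
        ["desert", "desert", "desert"]]
new_grid = ca_step(grid, matrix, classes)
```

In this example only the central desert cell, which has the most urban neighbours, converts to urban. That is the contiguity effect the CA component adds to the purely statistical Markov projection.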
3.4. Machine Learning Models

Various machine learning models were evaluated for their ability to classify and predict LULC changes:
• Random Forest (RF): A robust ensemble method used for classification and regression. RF was applied to handle the large number of features from the remote sensing data and to improve classification accuracy.

The performance of these ML models was compared using accuracy metrics such as overall accuracy, producer’s accuracy, and Kappa coefficient.

4. RESULTS AND DISCUSSIONS

4.1. LULC Change Analysis

The classification results revealed significant changes in LULC over the study period. From 1984 to 2022:
• Urban Expansion: Urban areas increased from 5.5% of the total study area in 1984 to 12.3% in 2022. This expansion primarily occurred at the expense of agricultural land and desert areas.
• Cultivated Land Growth: Cultivated areas grew from 45.5% in 1984 to 60.7% in 2022, largely due to land reclamation projects in desert areas.
• Desert Area Decline: Desert lands, which constituted 46.8% of the area in 1984, decreased to 23.6% in 2022 as more land was converted to cultivation.
• Water Bodies: Water bodies showed a slight increase, from 2.2% in 1984 to 3.4% in 2022, due to the creation of new irrigation channels and reservoirs.
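The class shares reported for 1984 and 2022 in Section 4.1 can be checked for consistency and turned into net changes in percentage points. The figures below are the ones stated in the text; only the variable names are introduced here.

```python
# Percent of the study area per class: (1984, 2022), as reported in Section 4.1.
shares = {
    "urban": (5.5, 12.3),
    "cultivated": (45.5, 60.7),
    "desert": (46.8, 23.6),
    "water": (2.2, 3.4),
}

# Each year's shares should account for the whole study area.
total_1984 = round(sum(start for start, _ in shares.values()), 1)
total_2022 = round(sum(end for _, end in shares.values()), 1)

# Net change per class in percentage points.
change = {name: round(end - start, 1) for name, (start, end) in shares.items()}
```

Both years sum to 100.0%. The net changes are +6.8 (urban), +15.2 (cultivated), -23.2 (desert), and +1.2 (water) percentage points, so the desert loss equals the combined gain of the other three classes.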
4.2. Model Validation

The accuracy of the CA-Markov model was validated by comparing the predicted LULC maps with the actual classified maps from 2022. The Kappa coefficients ranged from 0.84 to 0.93, indicating a high level of agreement between the predicted and actual maps. This suggests that the model can reliably predict future LULC trends.

4.3. Projected Future LULC Trends

Based on the CA-Markov model’s projections for 2030, 2040, and 2050:
• Urban Expansion: Urban areas are expected to continue expanding, particularly around major cities and transportation corridors.
• Cultivated Lands: Agricultural lands will likely continue to grow, driven by ongoing reclamation projects.
• Desert Area Shrinking: Desert lands are expected to decline further, with more areas being converted to agriculture or urban development.

5. CONCLUSION

This paper highlights the integration of machine learning and GIS techniques for analyzing and predicting LULC changes and deforestation trends. The results demonstrate that the CA-Markov hybrid model, in combination with ML classifiers such as decision trees and random forests, provides a powerful tool for modeling and forecasting LULC changes. The ability to predict future LULC scenarios is crucial for land-use planning, urban development, and environmental management. Future work could focus on improving model accuracy by incorporating additional data sources such as climate data and improving spatial resolution.

6. REFERENCES

[1] Selmy, S.A.H.; Kucher, D.E.; Mozgeris, G.; Moursy, A.R.A.; et al. Detecting, Analyzing, and Predicting Land Use/Land Cover (LULC) Changes in Arid Regions Using Landsat Images, CA-Markov Hybrid Model, and GIS Techniques. Remote Sensing 2023, 15, 5522.

[2] Zhan, Q.; Tian, J.; Tian, S. Prediction Model of Land Use and Land Cover Changes in Beijing Based on ANN and Markov_CA Model. IEEE IGARSS 2019.

[3] Veldkamp, A.; Lambin, E.F. Predicting land-use change. Agric. Ecosyst. Environ. 2001, 85, 1–6.

[4] Dai, Y.; Jiang, G. Analysis of Land Use Change and Driving Factors in Datong County from 1995 to 2014. Journal of Henan Polytechnic University (Social Science Edition) 2016, 17(4), 438–444.

[5] Jiang, M.; Tian, S.; Zheng, Z.; et al. Human Activity Influences on Vegetation Cover Changes in Beijing, China, from 2000 to 2015. Remote Sensing 2017, 9(3), 271.

[6] Biro, K.; Pradhan, B.; Buchroithner, M.; Makeschin, F. Land use/land cover change analysis and its impact on soil properties in the northern part of Gadarif region, Sudan. Land Degrad. Dev. 2013, 24, 90–102.

[7] Calicioglu, O.; Flammini, A.; Bracco, S.; Bellù, L.; Sims, R. The Future Challenges of Food and Agriculture: An Integrated Analysis of Trends and Solutions. Sustainability 2019, 11, 222.

[8] Liu, J.; Zhang, Z.; Xu, X.; Kuang, W.; Zhou, W.; Zhang, S.; Li, R.; Yan, C.; Yu, D.; Wu, S.; et al. Spatial patterns and driving forces of land use change in China during the early 21st century. J. Geogr. Sci. 2010, 20, 483–494.

[9] Wang, S.Q.; Zheng, X.; Zang, X. Accuracy assessments of land use change simulation based on Markov-cellular automata model. Procedia Environ. Sci. 2012, 13, 1238–1245.

[10] Ning, J.; Liu, J.; Kuang, W.; Xu, X.; Zhang, S.; Yan, C.; Li, R.; Wu, S.; Hu, Y.; Du, G.; et al. Spatiotemporal patterns and characteristics of land-use change in China during 2010–2015. J. Geogr. Sci. 2018, 28, 547–562.

[11] Berihun, M.L.; Tsunekawa, A.; Haregeweyn, N.; Meshesha, D.T.; Adgo, E.; Tsubo, M.; Masunaga, T.; Fenta, A.A.; Sultan, D.; Yibeltal, M. Exploring land use/land cover changes, drivers and their implications in contrasting agro-ecological environments of Ethiopia. Land Use Policy 2019, 87, 104052.

[12] You, H. Agricultural landscape dynamics in response to economic transition: Comparisons between different spatial planning zones in Ningbo region, China. Land Use Policy 2017, 61, 316–328.

[13] Mezger, G.; De Stefano, L.; Gonz, M. Analysis of the Evolution of Climatic and Hydrological Variables in the Tagus River Basin, Spain. Water 2022, 14, 818.

[14] Sibanda, S.; Ahmed, F. Modelling historic and future land use/land cover changes and their impact on wetland area in Shashe sub-catchment, Zimbabwe. Model. Earth Syst. Environ. 2021, 7, 57–70.

[15] Rahman, M.T.U.; Tabassum, F.; Rasheduzzaman, M.; Saba, H.; Sarkar, L.; Ferdous, J.; Uddin, S.Z.; Islam, A.Z. Temporal dynamics of land use/land cover change and its prediction using CA-ANN model for southwestern coastal Bangladesh. Environ. Monit. Assess. 2017, 189, 565.

[16] Hua, A.K. Land use land cover changes in detection of water quality: A study based on remote sensing and multivariate statistics. J. Environ. Public Health 2017, 2017, 7515130.

[17] Bashir, O.; Bangroo, S.A.; Guo, W.; Meraj, G.; Ayele, G.T.; Naikoo, N.B.; Shafai, S.; Singh, P.; Muslim, M.; Taddese, H.; et al. Simulating Spatiotemporal Changes in Land Use and Land Cover of the North-Western Himalayan Region Using Markov Chain Analysis. Land 2022, 11, 2276.

[18] Al-shalabi, M.; Billa, L.; Pradhan, B.; Mansor, S.; Al-Sharif, A.A.A. Modelling urban growth evolution and land-use changes using GIS based cellular automata and SLEUTH models: The case of Sana’a metropolitan city, Yemen. Environ. Earth Sci. 2013, 70, 425–437.

[19] Elagouz, M.; Abou-Shleel, S.; Belal, A.; El-Mohandes, M. Detection of land use/cover change in Egyptian Nile Delta using remote sensing. Egypt. J. Remote Sens. Space Sci. 2019, 23, 57–62.

[20] Said, M.E.S.; Ali, A.M.; Borin, M.; Abd-Elmabod, S.K.; Aldosari, A.A.; Khalil, M.M.N.; Abdel-Fattah, M.K. On the Use of Multivariate Analysis and Land Evaluation for Potential Agricultural Development of the Northwestern Coast of Egypt. Agronomy 2020, 10, 1318.