Crop Recommendation System
Group Name : Digital Farmers
Project Members
Dhrupad Dutta
Sagnik Biswas
November 23, 2024
Abstract
The Crop Recommendation System is an intelligent application powered by machine learn-
ing algorithms designed to assist farmers in making informed decisions about crop selection
based on environmental factors. By analyzing key parameters such as soil nutrient levels,
temperature, humidity, pH, and rainfall, the system predicts the most suitable crop for
cultivation on a given piece of land. This project aims to alleviate the challenges faced by
farmers in selecting appropriate crops, ultimately increasing agricultural productivity and
profitability. By harnessing the power of data-driven insights, the Crop Recommendation
System empowers farmers to make efficient and sustainable agricultural decisions, leading
to improved yields and economic outcomes.
1 Introduction
1.1 What is this project:
The Crop Recommendation System is a pioneering project designed to assist farmers in
making informed decisions about crop selection by harnessing the power of machine learn-
ing algorithms. It analyzes various environmental factors such as soil nutrient levels, tem-
perature, humidity, pH, and rainfall to provide tailored recommendations for optimal crop
choices.
1.2 Why this project:
With the increasing complexity of agricultural practices and the growing need for
sustainable farming solutions, there is a pressing demand for advanced technologies to
streamline crop selection processes. This project addresses the challenges faced by farmers
in selecting suitable crops by offering a data-driven approach that optimizes productivity
and resource utilization while minimizing environmental impact.
1.3 How this is solved:
The Crop Recommendation System employs sophisticated machine learning techniques to
analyze large datasets of environmental parameters. By training predictive models on this
data, the system can accurately predict the most suitable crops for cultivation on a given
piece of land. This approach revolutionizes traditional crop selection methods, empowering
farmers with the tools and insights needed to achieve sustainable and profitable farming
outcomes.
2 Literature review
2.1 FARMING ASSISTANCE FOR SOIL FERTILITY IMPROVEMENT AND CROP
PREDICTION USING XGBOOST - MANGESH DESHMUKH, AMITKUMAR JAISWAR,
OMKAR JOSHI, AND RAJASHREE SHEDGE - DEPARTMENT OF COMPUTER
ENGINEERING, RAMRAO ADIK INSTITUTE OF TECHNOLOGY, NERUL, NAVI MUMBAI,
INDIA
This proposed work is a recommendation system in which Machine Learning techniques
are used to recommend the best three crops based on soil and weather parameters. The top
three crops are recommended because farmers may not have access to a particular crop
if only one crop is recommended. Previous studies in this field have used different
Machine Learning algorithms such as Random Forest, KNN, Naive Bayes, etc. The proposed
system uses the XGBoost Machine Learning algorithm, which is reported to give better
results than the other algorithms. In addition, the system provides information about how
to improve the soil for growing the desired crop and gives the weather forecast for the next
five days. The system is intended to reduce financial losses while also increasing crop
productivity. Its reported accuracy is 90%.
2.2 CROP RECOMMENDATION SYSTEM USING MACHINE LEARNING ALGORITHM -
LAKSHMAN KUMAR SERU, SAI MAANAS GANDHAM, DEPARTMENT OF COMPUTER
SCIENCE AND ENGINEERING, SCHOOL OF COMPUTING, SATHYABAMA INSTITUTE OF
SCIENCE AND TECHNOLOGY
This Crop Recommendation System for agriculture is based on various input parameters.
The work proposes a hybrid model for recommending crops to south Indian states by
considering attributes such as soil type, rainfall, groundwater level, temperature,
fertilizers, pesticides, and season. The recommender model is built as a hybrid model using
classifier machine learning algorithms. Based on the appropriate parameters, the system
recommends the crop. A technology-based crop recommendation system for agriculture
helps farmers increase crop yield by recommending a suitable crop for their land
with the help of geographic and climatic parameters. Its reported accuracy is 91%.
2.3 INTELLIGENT CROP RECOMMENDATION SYSTEM USING ML - AYUSH KUMAR,
OMEN RAJENDRA POONIWALA, SWAPNEEL CHAKRABORTY, VISVESVARAYA
TECHNOLOGICAL UNIVERSITY
In this project, the authors build an intelligent system intended to assist Indian
farmers in making an informed decision about which crop to grow depending on the sowing
season, the farm's geographical location, and soil characteristics. The system also
provides a yield prediction for the recommended crop. It considers environmental
parameters (temperature, rainfall, geographical location in terms of state) and soil
characteristics (pH value, soil type, and nutrient concentration) before recommending the
most suitable crop to the user.
2.4 AGRO CONSULTANT- INTELLIGENT CROP RECOMMENDATION SYSTEM
USING MACHINE LEARNING ALGORITHMS- ZEEL DOSHI, SUBHASH NADKARNI,
RASHI AGRAWAL, PROF. NEEPA SHAH
This paper proposes and implements an intelligent crop recommendation system that
can be easily used by farmers all over India. The system assists farmers in making
an informed decision about which crop to grow depending on a variety of environmental
and geographical factors. The authors also implement a secondary system, called Rainfall
Predictor, which predicts the rainfall of the next 12 months.
2.5 CROP RECOMMENDATION SYSTEM FOR PRECISION AGRICULTURE- S.PUDUMALAR,
E.RAMANUJAM, R.HARINE RAJASHREE, C.KAVYA, T.KIRUTHIKA, J.NISHA
This paper proposes a recommendation system built as an ensemble model with a majority
voting technique, using Random tree, CHAID, K-Nearest Neighbor, and Naive Bayes
as learners, to recommend a crop for site-specific parameters with high accuracy and
efficiency.
3 Proposed methodology
3.1 Data Collection:
The first step in implementing the Crop Recommendation System is to collect relevant
data on soil nutrient levels (Nitrogen, Phosphorus, Potassium), temperature, humidity,
pH, and rainfall. This data serves as the foundation for training the machine learning
models. Our dataset has 8 columns: 7 are the input parameters and the last one
is the crop type, which is classified based on these input factors.
Screenshot of the raw CSV file of our dataset
Statistical view of our dataset
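As a rough illustration of the 8-column layout described above, the following sketch builds a tiny in-memory stand-in for the CSV file and inspects it with pandas. The column names (N, P, K, temperature, humidity, ph, rainfall, label) and the sample rows are assumptions made for illustration only; they are not copied from our file.

```python
import io

import pandas as pd

# Hypothetical rows in the assumed layout: 7 input columns plus the crop label.
SAMPLE_CSV = """N,P,K,temperature,humidity,ph,rainfall,label
90,42,43,20.9,82.0,6.5,202.9,rice
85,58,41,21.8,80.3,7.0,226.7,rice
71,54,16,22.6,63.7,5.7,87.8,maize
61,44,17,26.1,71.6,6.9,102.3,maize
40,72,77,17.0,16.9,7.5,88.6,chickpea
"""

# Read the CSV text exactly as a file on disk would be read.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Every column except the target is an input parameter.
feature_columns = [column for column in df.columns if column != "label"]

print(df.shape)                          # (rows, columns)
print(df[feature_columns].describe())    # statistical view of the inputs
```

With the real file, the `io.StringIO(...)` argument would simply be replaced by the path to the downloaded CSV.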
3.2 Data Preprocessing:
Once the data is collected, it undergoes preprocessing to clean and prepare it for analysis.
This includes handling missing values, encoding categorical variables, and scaling
numerical features to ensure uniformity and consistency across the dataset. Our dataset
contained no null values and no duplicate rows, so no correction was needed for these. In
the next stage, we encoded the categorical column of our dataset (the 'label' field)
into numerical values: the 1st crop name is encoded as '1', the 2nd as '2', and so on.
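The checks and the encoding described above could be sketched as follows. The mini-dataset is hypothetical, and numbering the crops in order of first appearance is an assumption; the text only states that the codes start at 1.

```python
import pandas as pd

# Hypothetical mini-dataset; only the 'label' column name comes from the text.
df = pd.DataFrame({
    "N": [90, 85, 71, 61, 40],
    "ph": [6.5, 7.0, 5.7, 6.9, 7.5],
    "label": ["rice", "rice", "maize", "maize", "chickpea"],
})

# Count missing cells and fully duplicated rows.
null_count = int(df.isnull().sum().sum())
duplicate_count = int(df.duplicated().sum())

# Assumed ordering: crops are numbered 1, 2, 3, ... by first appearance.
crop_names = list(dict.fromkeys(df["label"]))
crop_to_code = {name: code for code, name in enumerate(crop_names, start=1)}
df["label_encoded"] = df["label"].map(crop_to_code)

print(null_count, duplicate_count)
print(crop_to_code)
```

Keeping the `crop_to_code` dictionary makes it possible to translate a numeric prediction back into a crop name later.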
3.3 Exploratory Data Analysis (EDA):
EDA is performed to gain insights into the relationships between different environmental
factors and crop yields. Visualization techniques such as histograms, scatter plots,
and correlation matrices are used to identify patterns and trends in the data. First, the
correlations between the different environmental input factors are determined and plotted
as a correlation matrix using the Seaborn heatmap function.
Figure 1: Correlation matrix of the environmental input factors
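A minimal sketch of the correlation step follows, using hypothetical values for four of the input factors. The helper `plot_correlation_heatmap` is an illustrative name, not part of our notebook; its plotting imports are kept local so the numeric part runs even where Seaborn is not installed.

```python
import pandas as pd

# Hypothetical values for a few input factors (illustration only).
df = pd.DataFrame({
    "N": [90, 85, 71, 61, 40, 23],
    "temperature": [20.9, 21.8, 22.6, 26.1, 17.0, 24.3],
    "humidity": [82.0, 80.3, 63.7, 71.6, 16.9, 55.2],
    "rainfall": [202.9, 226.7, 87.8, 102.3, 88.6, 64.1],
})

# Pairwise Pearson correlation between the numeric columns.
corr = df.corr()
print(corr.round(2))


def plot_correlation_heatmap(corr_matrix):
    """Draw the correlation matrix as an annotated Seaborn heatmap."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.heatmap(corr_matrix, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_title("Correlation matrix of the input factors")
    plt.tight_layout()
    return ax
```

Calling `plot_correlation_heatmap(corr)` in a notebook would render a figure of the same kind as Figure 1.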
Plots of input counts provide insights into the distribution of values within the
dataset, and examining them can reveal data pre-processing steps that may be necessary
before model training. We therefore plotted histograms of the environmental input
parameters. There are 7 input parameters: the ratios of Nitrogen (N), Phosphorus (P),
and Potassium (K) in the soil, Temperature, Humidity, the pH value of the soil, and the
Rainfall of the geographical location.
Histogram plots for Nitrogen (N), Phosphorus (P) and Potassium (K) counts
Histogram plots for Temperature, Humidity, pH and Rainfall count
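The counts behind such a histogram can be computed directly, as in the short sketch below. The temperature values and the choice of 5 bins are hypothetical and serve only to show the mechanics.

```python
import numpy as np

# Hypothetical temperature readings (illustration only).
temperature = np.array([20.9, 21.8, 22.6, 26.1, 17.0, 24.3, 25.5, 18.7, 23.0, 21.1])

# Split the value range into 5 equal-width bins and count the readings in each.
counts, bin_edges = np.histogram(temperature, bins=5)

# Print a simple text histogram, one row per bin.
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {'#' * int(count)}")
```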
3.4 Model Selection:
After pre-processing the data, various machine learning models are evaluated to determine
which one performs best for the task of crop recommendation. We considered Logistic
Regression, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting
Classifier, Support Vector Machine, and K-Nearest Neighbors. We trained each of these
models on our dataset and computed its accuracy score; the scores are given in Figure 2.
Figure 2: Accuracy Scores of Various Models
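A comparison of this kind could be sketched as below. The data is a synthetic stand-in generated with scikit-learn's `make_classification` (4 classes, 7 features), so the printed scores are not those of Figure 2; the hyperparameters shown are assumptions rather than the settings used in our notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the crop data: 7 numeric inputs, 4 classes.
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=7, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The six model families named in the text, with assumed settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Support Vector Machine": SVC(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
}

# Fit each model on the training part and score it on the held-out part.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<24} {score:.3f}")
```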
Confusion matrices provide a comprehensive summary of the performance of a
classification model by tabulating the actual and predicted class labels. They allow a
detailed analysis of the types of errors made by the model. By breaking down the
predictions into true positives, true negatives, false positives, and false negatives, one
can identify which classes are often confused with each other and understand the specific
nature of the model's mistakes. For model selection, we therefore used confusion matrices
as a standardized framework for comparing the performance of multiple models.
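To make the row and column convention concrete, the sketch below builds a confusion matrix for six hypothetical predictions over three crops with scikit-learn's `confusion_matrix`. The labels and predictions are invented for illustration and are not results from our models.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted crops for six test records.
labels = ["rice", "maize", "chickpea"]
y_true = ["rice", "rice", "maize", "maize", "chickpea", "chickpea"]
y_pred = ["rice", "maize", "maize", "maize", "chickpea", "rice"]

# Rows are actual classes, columns are predicted classes, in the order of `labels`.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Correct predictions sit on the diagonal, so accuracy is trace / total.
accuracy = cm.trace() / cm.sum()
print(f"accuracy = {accuracy:.3f}")
```

Here the off-diagonal entry in the first row shows one rice record predicted as maize, which is the kind of class confusion the matrices are used to spot.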
3.5 Model Training:
We selected the Random Forest Classifier as the best model for our task; trained on the
pre-processed dataset, it gives an accuracy of 99.3%. The model learns to map the input
environmental factors to the corresponding crop recommendations based on the historical
data. After splitting our dataset 80-20 into training and testing sets, we imported the
model from the Scikit-learn Python package and trained it by calling the fit() function
on the Random Forest Classifier.
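The training step could look roughly like the sketch below. It uses synthetic data of 2200 rows and 7 features as a stand-in for our dataset, so that the 80-20 split yields 1760 training and 440 testing records; the number of classes, the random seeds, and `n_estimators=100` are assumptions, and the resulting accuracy is not the 99.3% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 2200 rows, 7 numeric features, 5 classes (assumed).
X, y = make_classification(
    n_samples=2200, n_features=7, n_informative=7, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=42,
)

# 80-20 split; stratify keeps the class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train the forest with fit() and measure accuracy on the held-out records.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print(len(X_train), len(X_test), round(test_accuracy, 3))
```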
3.6 Model Evaluation:
Once trained, the model is evaluated using a separate test dataset to assess its
performance. Metrics such as the accuracy score and the confusion matrix are used to
measure the model's effectiveness in predicting crop recommendations. Our training dataset
contains 1760 records and our testing dataset contains 440 records.
4 Experimental result
The experimental results demonstrate the effectiveness of the proposed method in
accurately predicting crop recommendations based on environmental factors. With a test
accuracy of 99.3%, the system compares favourably with the 90% and 91% accuracies reported
for the systems reviewed in Sections 2.1 and 2.2, while maintaining reasonable time
complexity for real-time application.
4.1 Used Datasets:
The Crop Recommendation System was evaluated using a standard dataset obtained from
the site: https://www.kaggle.com/datasets/atharvaingle/crop-recommendation-dataset?resource=download
This dataset comprises records of soil nutrient levels (Nitrogen, Phosphorus, Potassium),
temperature, humidity, pH, and rainfall, together with the name of the crop that is
suitable for cultivation under these environmental conditions.
4.2 Experimental Settings:
The machine learning models were trained and evaluated using an 80-20 train-test split. The
dataset was preprocessed to handle missing values, encode categorical variables, and scale
numerical features. Various machine learning algorithms, including Logistic Regression,
Decision Tree Classifier, Random Forest Classifier, Gradient Boosting Classifier, Support
Vector Machine, and K-Nearest Neighbors, were implemented and compared using the
confusion matrices and performance metrics (precision, recall, F1 score) shown earlier.
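For reference, precision, recall, and F1 score can be derived per class from a confusion matrix as in the sketch below. The function name `per_class_metrics` and the 3-crop matrix are hypothetical; the matrix is not one of our results.

```python
def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall, f1)} from a square confusion matrix
    whose rows are actual classes and whose columns are predicted classes."""
    metrics = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        false_pos = sum(row[i] for row in matrix) - true_pos   # rest of column i
        false_neg = sum(matrix[i]) - true_pos                  # rest of row i
        precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
        recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical confusion matrix for three crops (illustration only).
labels = ["rice", "maize", "chickpea"]
matrix = [
    [1, 1, 0],   # actual rice
    [0, 2, 0],   # actual maize
    [1, 0, 1],   # actual chickpea
]

metrics = per_class_metrics(matrix, labels)
for label, (precision, recall, f1) in metrics.items():
    print(f"{label:<10} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```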
4.3 Experimental Results:
After deployment, the whole system requires extensive testing and experimentation to
understand how effective and how reliable the model's performance is. During this testing
and experimental stage, we used values from our test dataset as the input variables.
4.3.1 In Python Notebook:
The screenshot below shows the model tested for prediction with the input
values N = 20, P = 21, K = 15, Temperature = 10, Humidity = 48, pH = 10, and
Rainfall = 80.
Figure 3: Predicting crop type in numerical encoded format
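A prediction of this form, together with the step of decoding the numeric output back to a crop name, might be sketched as follows. The training rows, the column names, and the crop-to-code mapping are hypothetical stand-ins; only the query values come from the test described above, and the predicted crop printed here says nothing about the output in Figure 3.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]

# Hypothetical training rows (illustration only).
rows = [
    (90, 42, 43, 20.9, 82.0, 6.5, 202.9, "rice"),
    (85, 58, 41, 21.8, 80.3, 7.0, 226.7, "rice"),
    (71, 54, 16, 22.6, 63.7, 5.7, 87.8, "maize"),
    (61, 44, 17, 26.1, 71.6, 6.9, 102.3, "maize"),
    (40, 72, 77, 17.0, 16.9, 7.5, 88.6, "chickpea"),
    (23, 72, 84, 19.0, 17.1, 6.9, 79.9, "chickpea"),
]
train = pd.DataFrame(rows, columns=FEATURES + ["label"])

# Encode crop names as 1, 2, 3, ... and keep the reverse mapping for decoding.
crop_to_code = {name: code for code, name in enumerate(dict.fromkeys(train["label"]), start=1)}
code_to_crop = {code: name for name, code in crop_to_code.items()}

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(train[FEATURES], train["label"].map(crop_to_code))

# Query values taken from the test described in the text.
query = pd.DataFrame([[20, 21, 15, 10, 48, 10, 80]], columns=FEATURES)
predicted_code = int(model.predict(query)[0])
predicted_crop = code_to_crop[predicted_code]

print(predicted_code, predicted_crop)
```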
4.4 Time Complexity:
The time complexity of the Crop Recommendation System project can vary depending
on several factors, including:
4.4.1 Data Pre-processing:
Time complexity of data pre-processing steps such as handling missing values, encoding
categorical variables, and splitting the dataset into training and testing sets depends on
the size of the dataset and the complexity of pre-processing operations. Generally, these
operations have a linear time complexity, but certain operations like one-hot encoding of
categorical variables may have higher time complexity.
4.4.2 Model Training:
The time complexity of training the Random Forest Classifier model depends on the
size of the training dataset (n) and the number of features (m). For each decision tree in
the random forest, the time complexity of training is approximately O(n · m · log n).
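Extending the per-tree estimate to the whole forest gives the rough bounds below. The symbols T (number of trees) and k (features examined at each split) are introduced here for illustration, and the estimate assumes roughly balanced trees.

```latex
% n = training records, m = features (from the text)
% T = number of trees, k = features tried per split (assumed symbols)
\begin{align*}
\text{one tree:}\quad & O(n \, m \log n) \\
\text{forest of } T \text{ trees:}\quad & O(T \, n \, m \log n) \\
\text{with } k \le m \text{ features per split:}\quad & O(T \, n \, k \log n)
\end{align*}
```

For our training set, n = 1760 and m = 7, so n · m · log2(n) is on the order of 1.3 × 10^5 elementary steps per tree, which is consistent with the short training times observed on a dataset of this size.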
4.5 Drawbacks of this project
4.5.1 Dependency on Input Parameters:
The accuracy and reliability of the recommendations provided by the system heav-
ily rely on the accuracy and completeness of the input parameters such as nitrogen (N),
phosphorus (P), potassium (K), temperature, humidity, pH, and rainfall. Inaccurate or
incomplete data may lead to suboptimal recommendations.
4.5.2 Limited Scope:
The system's recommendations are based solely on environmental parameters and do
not take into account other important factors such as soil type, land availability, market
demand, and farmer preferences. As a result, the recommendations may not always align
perfectly with real-world conditions and constraints.
4.5.3 Model Limitations:
While the Random Forest Classifier model chosen for this project performs well in
many scenarios, it has its limitations. For instance, it may struggle with capturing
complex nonlinear relationships in the data, especially if the dataset is highly
imbalanced or contains noisy features.
4.5.4 Maintenance and Updates:
Like any machine learning system, the Crop Recommendation System requires regular
maintenance and updates to ensure its continued effectiveness. This includes updating the
model with new data, refining the feature selection process, and addressing any issues or
biases that may arise over time.
5 Summary
In conclusion, the Crop Recommendation System project represents a significant step
towards leveraging machine learning technology to address agricultural challenges. By
integrating the Machine Learning model with a user-friendly HTML/CSS web interface, we
have created a versatile tool capable of providing tailored crop recommendations based on
key environmental parameters. This system has the potential to revolutionize traditional
farming practices by offering data-driven insights to farmers and agriculture professionals.
Through extensive data exploration, pre-processing, and model selection processes, we
have identified the Random Forest Classifier as the most suitable model for predicting crop
recommendations with high accuracy. Its ability to handle complex relationships within
the dataset and provide robust predictions makes it an ideal choice for our application.
The user interface design emphasizes simplicity, intuitiveness, and accessibility, allowing
users to easily input their environmental parameters and receive instant recommendations.
Input validation mechanisms ensure that users provide valid input values, enhancing the
reliability of the system's output. Furthermore, the system's output display seamlessly
integrates the predicted crop recommendation into the same webpage, providing users with
immediate feedback and eliminating the need for navigating to separate pages. Overall,
the Crop Recommendation System holds immense potential to empower farmers with
actionable insights, optimize crop selection processes, increase agricultural productivity,
and ultimately contribute to sustainable food production. As we continue to refine and
improve the system, it will play a pivotal role in shaping the future of precision agriculture
and fostering innovation in the farming industry.
6 References
[1] Ayush Kumar, Omen Rajendra Pooniwala, and Swapneel Chakraborty. Intelligent crop
recommendation system using ML. Technical report, Visvesvaraya Technological University,
2019.
[2] BDA. Big data analytics.
http://cs.rkmvu.ac.in/academics-msc-in-big-data-analytics-data-science/, 2016.
[3] Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[4] Bernhard E. Boser, Isabelle M. Guyon, and Vladimir Naumovich Vapnik. A training
algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on
Computational Learning Theory, pages 144-152. ACM, 1992.
[5] Leo Breiman. Random forests. Machine Learning, 45(1):5-32, 2001.
[6] Google. Crops images. http://www.google.com/images.
[7] Lakshman Kumar Seru and Sai Manas Gandham. Crop recommendation system using
machine learning algorithm. Technical report, Department of Computer Science and
Engineering, School of Computing, Sathyabama Institute of Science and Technology, 2022.
[8] Mangesh Deshmukh, Amit Kumar Jaiswar, Omkar Joshi, and Rajashree Shedge. Farming
assistance for soil fertility improvement and crop prediction using XGBoost. Technical
report, Department of Computer Engineering, Ramrao Adik Institute of Technology, Nerul,
Navi Mumbai, 2022.
[9] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine
Learning. The MIT Press, 2nd edition, 2018.
[10] S. Pudumalar, E. Ramanujam, R. Harine Rajashree, C. Kavya, T. Kiruthika, and
J. Nisha. Crop recommendation system for precision agriculture. Technical report, 2017.
[11] Zeel Doshi, Subhash Nadkarni, Rashi Agrawal, and Prof. Neepa Shah. Agro consultant:
intelligent crop recommendation system using machine learning algorithms. Technical
report, 2018.