International Journal of Computer Science and Information Security (IJCSIS),

Vol. 22, No. 4, July-August 2024

Water Parameter Estimation Application Using Statistical Learning
Reuben Moyo 1, Stanley Ndebvu 2, Peter Phiri 3, Emmanuel Ngalande 4, Chimango Nyasulu 5, Bright Chunga 6

1,4,5 ICT Department, Mzuzu University, Private Bag 201, Mzuzu, Malawi
1 [email protected]
[email protected]
5 [email protected]

2 Department of Computer Science, University of Botswana, Private Bag UB 0022, Gaborone, Botswana
2 [email protected]

3 Swift Limited Company, P.O. Box 1170, Blantyre, Malawi
3 [email protected]

6 Water & Sanitation Department, Mzuzu University, Private Bag 201, Mzuzu, Malawi
6 [email protected]

Abstract—About 61.7% of Malawi’s population relies on groundwater as a drinking water source. Particularly in rural settings, groundwater provides almost all water needs. In many developing countries, however, there is a need for more routine assessments of groundwater quality to ensure safe drinking water. For example, in Malawi, groundwater and surface waters often exceed threshold parametric values set by the Malawi Bureau of Standards (MBS) and the World Health Organization (WHO). The current process of determining water parameters is tedious and expensive, requiring specialised equipment and repeated on-site visits for data collection. This paper reports on developing a mobile application that estimates water parametric values by deducing the relationships between the parameters. The study used a dataset of 64 samples with eight features. We computed correlation matrices and feature importance to identify the relationships among the variables and built models to predict the parameters. The usability evaluation results show that the application is useful, practical, easy to use, learnable, and satisfactory.

I. INTRODUCTION
Globally, groundwater serves a significant percentage of the world’s population. Approximately 97 per cent of the world’s freshwater exists as groundwater. As such, most regions regard groundwater as one of the most reliable sources of safe water, free from biological contaminants. Most countries, especially developing ones, opt for groundwater as an economically viable water source because it does not require expensive treatment compared to surface water. This is because groundwater does not contain biological contaminants [1]. Malawi faces many water-related challenges that require special consideration if the country is to mitigate or completely eradicate them. Malawi’s poor water sanitation and hygiene cost the government US$57 million annually, or 1.1% of its GDP, due to health costs and productivity losses [2]. To ensure that people access safe drinking water that aligns with the requirements established by the Malawi Bureau of Standards (MBS) and the World Health Organization (WHO), water-supplying organisations proactively employ water monitoring techniques and record significant water parameters such as total dissolved solids (TDS), total hardness in the form of calcium carbonate, manganese, and calcium ions.

The current process of monitoring and determining water parametric values in Malawi is tedious and expensive; it requires specialised equipment and repeated on-site visits for data collection [3], [4]. In the likely event that funds or some equipment are unavailable, this problem results in numerous missing values in the water quality database. Even if the necessary instruments for such analysis are present, data organisation is a problem. For example, some data are lost when relayed from one level to another or when cross-sectoral sharing is done among stakeholders. Because Malawi and other low- and middle-income countries (LMICs) operate within a minimal budget, gaps often appear in the database because most data is not collected [4]. This inevitably affects the overall management of the available water resources because decisions are not informed by accurate data but rather by guesswork.

The convergence of advanced technology and environmental science provides opportunities for addressing these challenges [1], [5], [6]. This study presents a technique for estimating water parametric values using statistical data. By leveraging the power of statistical learning, this research endeavours to improve the understanding of water quality dynamics in Malawi and contribute to formulating data-driven strategies for sustainable water management. By meticulously examining statistical tools and methodologies and their application to Malawi’s

https://google.academia.edu/JournalofComputerScience 1 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
unique water needs and environmental context, this study illuminates a path towards a cheaper, more effective and time-saving water parameter estimation, with implications for broader global contexts facing similar challenges.

II. RELATED WORK
High water quality is essential for human health, environmental sustainability, and economic growth. Monitoring and predicting water quality parameters is crucial for protecting aquatic ecosystems, ensuring safe drinking water, and managing water resources effectively [4], [7]. However, water quality is influenced by various natural and human factors. Parameters such as temperature, pH, electrical conductivity, dissolved oxygen, alkalinity, calcium, sodium, potassium, magnesium, chloride, nitrite, and phosphate are critical. Traditional water quality monitoring methods rely on physical sensors and lab analysis, which can be costly, time-consuming, and limited in scope [1], [8], [9], [10]. In other cases, satellite remote sensing-based techniques measure active WQI, such as chlorophyll-a (Chl-a), suspended solids (SS), coloured dissolved organic matter (CDOM), and turbidity, over broad areas and at regular intervals. Nonetheless, this approach has limitations and may not work well in all environments. Above all, the use of satellites is an expensive undertaking. Therefore, this study employs predictive statistical modelling to overcome some of the highlighted challenges.

Statistical learning (SL) methods, including support vector machine and decision tree algorithms, have been well developed for nonlinear regression analysis in recent years [18]. These methods have proven effective in both small and large-scale cases, attracting the attention of the environmental and geophysical modelling community. [11], [12], [13], [14] demonstrated that water parametric values can be calculated from the relationships among the features in a dataset. For example, [13] observed that the concentration of TDS can be estimated easily from the EC value. This study used correlation and feature importance to identify relationships and select features for prediction from a custom dataset. In addition to these applications, some studies have utilised both SL and data assimilation (DA) methods to achieve accurate and efficient modelling. [15] used random forest regression to build a relationship between remote sensing data and field observation data to estimate dryland surface indicators. [16] used an artificial neural network to imitate the local ensemble transform Kalman filter and enhance prediction efficiency but did not improve the DA algorithm.

[17] used the extended Kalman filter as a substitute for the backpropagation training of deep belief networks in infrastructure sustainability analysis. The findings of these studies imply that incorporating SL methods into sequential DA is a promising means of achieving more accurate time-varying parameter tracking. In their 2020 study, [18] comprehensively assessed various statistical modelling methods for both ground- and surface-level prediction scenarios. Additionally, they explored practical applications of data-driven modelling, including feature generation, feature selection, heterogeneous data fusion, hyperparameter tuning, and model evaluation. [5] assessed groundwater quality parameters in Majmaah City, Saudi Arabia, using multivariate statistical approaches such as Principal Component Analysis (PCA) and Factor Analysis (FA). Geostatistical techniques and GPS data were employed to characterise water quality parameters in 2D, presented as contour maps and 3D representations [19].

To improve the explainability, presentation of results, and deployability of the model, [20] and [21] argue that most developed models are left unused at the conceptual stage due to a lack of consideration for deployment. Their study proposes deploying models using explainable, user-friendly, and affordable technologies. Guided by the Technology Acceptance Model, an Android mobile application was developed to facilitate the model’s smooth implementation and adoption.

III. METHODOLOGY AND RESULTS
A. Dataset and feature selection
The dataset, which contains 64 samples and nine features (Ca, TDS, EC, Total Hardness, Turbidity, Total coliform, SO4, pH, and Streptococcus), was compiled by the Central Region Water Board (CRWB) from 64 boreholes in Dowa district, Malawi. The CRWB is a parastatal organisation founded to provide Malawi’s central region with potable water for residential, commercial, and industrial use. We chose the features that correlate with each feature required for estimation. The standard practice for corroboration or rejection of a theory based on correlation is to use more than one method of determining the correlation. We used two methods to identify and ascertain correlations among the dataset’s features. The correlation matrices showed strong correlations between TDS and EC and between total hardness and Ca. Therefore, the study aimed to predict these four features. Figure 1 below illustrates the correlation matrix of the dataset, calculated using the Pearson correlation coefficient method. The Pearson correlation coefficient determines the strength and direction of the linear relationship between two variables. The correlation values range from −1 to +1, where +1 indicates a perfect positive linear relationship, −1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship. One limitation of the Pearson correlation coefficient is its inability to capture non-linear relationships among variables. The distance correlation method was employed to detect non-linear relationships in the data. The method is versatile because it can be applied to both linear and non-linear data sets. One key advantage of this metric is that it does not make any assumptions regarding the normality of the input vectors, thereby eliminating any potential biases in the results. The results of this metric range from 0 to 2, with 0 indicating a perfect positive correlation and 2 indicating a perfect negative correlation. Figure 2 presents a correlation matrix calculated using the distance correlation. Additionally, we employed feature importance to identify essential features in predicting a feature of concern. Understanding the importance of each feature is crucial for gaining insights into the model’s decision-making process and identifying key factors

influencing its predictions. The feature importance score is a metric to assess a feature’s contribution to a model’s predictive performance. A higher score indicates a more substantial predictive power, whereas a lower score suggests relatively less influence on predictions. We used random-forest regression for each feature to identify the features most important for its prediction. Table I displays the feature importance scores for each variable.

Fig 1 Pearson correlation coefficient

Fig 2 Distance correlation for the dataset

B. Handling Missing Data
Handling missing values in a dataset is essential in the data preprocessing phase. Using datasets with missing values can lead to inaccuracies or biases in the results [22]. Data can be missing for various reasons, such as corruption, measurement errors, or incomplete data collection. One of the easiest ways to address missing values is to delete the records containing them; however, for use cases with a small dataset like the one used in this study, it is recommended to use methods that do not further reduce the dataset [23]. In this study, we employed the imputation method, which involved filling in missing values with the mean of the corresponding column. We then removed outliers falling outside the 5th to 95th percentile range of the dataset.

C. Data Normalisation
Statistical models may produce misleading results when a dataset contains features that vary significantly in magnitude (scale). This is because features with larger magnitudes dominate the learning process [24]. Normalisation aims to change the dataset’s feature values to a standard scale without distorting differences in the ranges of values or losing information. We employed the Min-Max Scaling method to scale the features to a range between 0 and 1. For a given feature $x$, the scaled value $x'$ in the range $[\min_{\text{new}}, \max_{\text{new}}]$ is computed as follows:

$$x' = \frac{x - \min_{\text{old}}}{\max_{\text{old}} - \min_{\text{old}}}\,(\max_{\text{new}} - \min_{\text{new}}) + \min_{\text{new}}$$

Where:
• $x$ is the original value of the feature.
• $\min_{\text{old}}$ and $\max_{\text{old}}$ are the minimum and maximum values of the feature in the original dataset.
• $\min_{\text{new}}$ and $\max_{\text{new}}$ are the scaled range’s desired minimum and maximum values.

D. Design and Model Selection
The correlation matrix reveals that significant relationships occur between only two pairs of features. We used these relationships to build univariate models to predict the features of concern. Since this is a regression problem, we used linear regression for the prediction. The prime reason for using linear regression is that the relationship between the independent and dependent features in this data can be represented by a linear equation, which makes the model interpretable and suitable for conveying results efficiently.

1) Linear Regression: The linear regression model assumes that changes in the independent variables are directly proportional to changes in the dependent variable. It aims to find the best-fitting linear equation that predicts the dependent variable based on the independent variables. The following (A. Khalil et al., 2005) equation describes it:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$$

Where:
• $y$ is the dependent feature.
• $x_1, x_2, \dots, x_p$ are the independent features.
• $\beta_0, \beta_1, \beta_2, \dots, \beta_p$ are the coefficients that represent the relationship between the independent features and the dependent feature.
• $p$ stands for the number of independent features.

E. Evaluation Metrics

The following metrics were used to evaluate the performance of regression models:

1) Mean Absolute Error: The Mean Absolute Error (MAE) is a metric used to measure the average absolute difference between the actual and predicted values. A lower MAE suggests that the
Table I Feature Importance

Predicted feature   Ca     TDS    EC     Total Hardness   Turbidity   Total coliform   SO4     pH     Streptococcus
Ca                  -      0.00   0.01   0.81             0.21        0.02             0.10    0.06   0.07
TDS                 0.01   -      0.93   0.03             0.00        0.05             0.057   0.08   0.06
EC                  0.02   0.96   -      0.03             0.03        0.02             0.04    0.09   0.04
Total Hardness      0.80   0.00   0.00   -                0.00        0.05             0.05    0.07   0.05
Turbidity           0.05   0.00   0.01   0.01             -           0.84             0.03    0.08   0.01
Total coliform      0.01   0.00   0.00   0.00             0.36        -                0.03    0.03   0.38
SO4                 0.05   0.00   0.01   0.03             0.22        0.01             -       0.54   0.04
pH                  0.03   0.00   0.00   0.05             0.01        0.02             0.61    -      0.17
Streptococcus       0.00   0.00   0.00   0.00             0.00        0.00             0.00    0.02   -
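
A feature importance matrix of this kind could be produced along the following lines. This is only a minimal sketch and not the authors' code: it assumes scikit-learn's `RandomForestRegressor` and its `feature_importances_` attribute, and it runs on synthetic stand-in data, because the CRWB dataset is not reproduced here. The helper name `importance_matrix`, the column names, the coefficients and the noise levels are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def importance_matrix(data, names, n_estimators=100, seed=0):
    """For each feature in turn, fit a random forest that predicts it from
    all remaining features and collect the importance of every predictor.

    Returns a nested dict: {target: {predictor: importance}}.
    """
    result = {}
    for j, target in enumerate(names):
        predictors = [k for k in range(len(names)) if k != j]
        model = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        model.fit(data[:, predictors], data[:, j])
        result[target] = {
            names[k]: float(score)
            for k, score in zip(predictors, model.feature_importances_)
        }
    return result


# Synthetic, hypothetical stand-in for 64 borehole samples.
rng = np.random.default_rng(0)
ec = rng.uniform(100, 1500, 64)               # electrical conductivity
tds = 0.65 * ec + rng.normal(0, 5, 64)        # assumed near-linear link to EC
ca = rng.uniform(5, 120, 64)                  # calcium
hardness = 2.5 * ca + rng.normal(0, 5, 64)    # assumed near-linear link to Ca
ph = rng.uniform(6.0, 8.5, 64)                # unrelated to the others

names = ["EC", "TDS", "Ca", "Total Hardness", "pH"]
data = np.column_stack([ec, tds, ca, hardness, ph])

matrix = importance_matrix(data, names)
for target, row in matrix.items():
    print(target, {name: round(score, 2) for name, score in row.items()})
```

Each row of the result corresponds to one predicted feature, so the cell for a feature against itself is left out, and the scores in each row sum to one because scikit-learn normalises impurity-based importances.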

model performs better. The formula to calculate MAE is as follows:

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

Here $n$ is the number of observations.

2) Mean Squared Error (MSE): The mean squared error (MSE) is a measure that calculates the average of the squares of the errors, that is, the differences between predicted and actual values. This metric is always non-negative. The formula to compute the MSE is as follows:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

3) Root Mean Squared Error (RMSE): The Root Mean Squared Error (RMSE) is a metric used to quantify the differences between values predicted by a model and the actual observed values. It is calculated as the square root of the Mean Squared Error (MSE) and is expressed in the same units as the target variable. The RMSE provides a measure of the model’s prediction accuracy by considering the average magnitude of the prediction errors. The following formula gives RMSE:

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$

4) R-squared (R2) Score: The coefficient of determination R2 shows the proportion of the variance in the dependent variable that the independent variables can explain in a regression model. It varies from 0 to 1, with larger values indicating that the independent variables explain the dependent variable better.

5) Adjusted R-squared: The Adjusted R-squared is a statistical measure that takes into account the number of predictors in a model. It offers a more accurate assessment when comparing models with varying numbers of variables.

6) Mean Absolute Percentage Error (MAPE): The mean absolute percentage error (MAPE), also referred to here as the mean percentage error (MPE), is a measure that shows the average absolute percentage difference between actual and predicted values. It helps to understand how accurately the model predicts the values relative to the actual data. The following formula gives it:

$$\text{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

7) Residual Sum of Squares (RSS): RSS, or Residual Sum of Squares, is an essential measure in regression analysis as it quantifies the unexplained variance in a regression model. To calculate RSS, you sum the squared differences between the observed and the predicted values from the regression model using the following formula:

$$\text{RSS} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
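
The metrics in this subsection can be sketched in a few lines of plain Python, together with a one-predictor least-squares fit of the kind Section D describes. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names `fit_line` and `regression_metrics` are hypothetical, the EC and TDS readings are made-up values, the R2 and Adjusted R2 expressions are the standard textbook definitions (the paper does not print them), and MAPE is expressed in percent.

```python
import math


def fit_line(x, y):
    """Ordinary least squares with one predictor.

    Returns (intercept, slope) of the line y = intercept + slope * x.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_metrics(y_true, y_pred, n_predictors=1):
    """Compute MAE, MSE, RMSE, MAPE, R2, Adjusted R2 and RSS.

    Assumes no actual value is zero (MAPE divides by y_true) and that
    n > n_predictors + 1 (needed for Adjusted R2).
    """
    n = len(y_true)
    residuals = [a - p for a, p in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)
    mean_y = sum(y_true) / n
    tss = sum((a - mean_y) ** 2 for a in y_true)  # total sum of squares
    r2 = 1.0 - rss / tss
    return {
        "mae": sum(abs(r) for r in residuals) / n,
        "mse": rss / n,
        "rmse": math.sqrt(rss / n),
        "mape": 100.0 / n * sum(abs(r / a) for r, a in zip(residuals, y_true)),
        "r2": r2,
        "adjusted_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1),
        "rss": rss,
    }


# Hypothetical EC readings and TDS values, for illustration only.
ec = [210.0, 480.0, 750.0, 1020.0, 1300.0]
tds = [140.0, 310.0, 495.0, 660.0, 850.0]

intercept, slope = fit_line(ec, tds)
predicted_tds = [intercept + slope * value for value in ec]
for name, value in regression_metrics(tds, predicted_tds).items():
    print(f"{name}: {value:.4f}")
```

Keeping every metric in one function makes it easy to check their relationships, for example that RMSE is the square root of MSE and that RSS equals MSE multiplied by the number of observations.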

In these metrics, $y_i$ is the actual target value for the $i$-th observation and $\hat{y}_i$ is the predicted value for the $i$-th observation.

F. Results: Estimating EC, TDS, Hardness and Ca
As shown in Table I, EC is the only feature that strongly influences the predictability of TDS and vice versa, with a strong correlation between the two features. Hence, we ran regression models using one feature to predict the other. Another strong relationship is between hardness and Ca. According to the feature importance matrix, these variables strongly influence each other’s predictions. Predictions of these features were made based on each other using linear regression. Table II below shows the results of estimating the features using a linear regression model.

Table II Results of Estimating Features Using Linear Regression

Predicted Feature   MAE     MSE      RMSE   MAPE   R2      Adjusted R2
EC                  1.08    5.1      2.27   2.74   1.0     1.0
TDS                 19.09   1.129    2.69   2.69   1.0     1.0
Hardness            14.2    298.37   3.3    7.9    0.90    0.87
Ca                  3.0     11.2     3.35   0.19   0.873   0.83

1) Mobile Application Development
We incorporated the developed models into a mobile application for easy usage and distribution. We used the Agile software development methodology to guide the development process. The methodology was chosen for its emphasis on collaboration, flexibility, and customer feedback throughout iterative development. The need for stakeholder input throughout the development process influenced this choice. We used the Dart programming language and the Flutter framework to develop the application. The Flutter framework was chosen because it allows for the development of cross-platform applications using a single code base. Hence, the application works on both Android and iOS platforms.

2) Joint Application Design and Planning
The development process started with a Joint Application Design and planning meeting where the researchers, developers, and Water Board officials gathered to prioritise the application activities based on requirements and business value.

3) Development Sprints
The development team worked on implementing the agreed features. The application coding was done in two weeks, during which daily stand-up meetings were held to discuss progress and any impediments and to plan the day’s work. Continuous integration and automated testing were employed to ensure code quality and detect issues early in the development cycle. The Water Board officials tested and evaluated the application’s features to ensure quality before they were added to the final repository. Feedback from the testers was used to improve the features of the application. The process was considered complete once all requirements were met, and an iterative approach was taken until all experiments met the stakeholders’ expectations. The final phase was entered after the testers approved all proposed system features. After the system was pulled from the local repository, a signing key was required to create the final Android binary. Once the key was obtained, the final release APK file was generated. The screenshots of the application are shown in Figure 3.

Fig 3 Screenshots of the application

G. Application Evaluation and Results
A quantitative methodology was used to evaluate the application’s performance. We employed a purposive sampling technique to identify participants and assess the system. Specifically, expert sampling was used to select individuals for the evaluation and testing exercise. This sampling technique is ideal since participants should be knowledgeable in water monitoring. We gave the application and a questionnaire to thirty-one water monitoring officers from various water-supplying bodies. We gathered twenty-one completed questionnaires, indicating a response rate of about 68%. Participants were tasked with evaluating the application's usability using a set of 20 items on a scale from 1 to 5. In this scale, 1 denoted "Strongly Disagree," 5 denoted "Strongly Agree," and the intermediary values represented varying

Table III Results of the Application Evaluation

Effectiveness 1 2 3 4 5 Mean STDEV
I needed much help to use the application 71%(15) 14%(3) 9%(2) 4%(1) 1.23 1.29
The application is difficult to use despite a comprehensive guide 76%(16) 14%(3) 4%(1) 4%(1) 1.38 1.06
The application produces accurate results. 14%(3) 9%(2) 76%(16) 4.61 4.15

Usefulness 1 2 3 4 5
The application is useful. 4%(1) 14%(3) 80%(17) 4.76 4.26
The application makes things easier to accomplish 4%(1) 19%(4) 4%(1) 71%(15) 4.42 4.01
The application does everything I would expect it to do 9%(2) 9%(2) 19%(4) 61%(13) 4.3 4.80
The application saves me time when I use it. 9%(2) 4%(1) 9%(2) 76%(16) 4.52 4.10

Ease of use 1 2 3 4 5
The application is easy to use 4%(1) 4%(1) 4%(1) 85%(18) 4.71 4.25
The application is simple to use 4%(1) 4%(1) 4%(1) 85%(18) 4.71 4.25
The system requires a few steps to accomplish what I want to do. 4%(1) 4%(1) 90%(19) 4.61 3.85

Learnability 1 2 3 4 5
The application is easy to remember how to use 4%(1) 4%(1) 9%(2) 80%(17) 4.66 4.20
The application is easy to learn. 14%(3) 4%(1) 80%(17) 4.52 4.12
I learnt to use the application quickly. 9%(2) 4%(1) 4%(1) 80%(17) 4.57 4.15

Satisfaction 1 2 3 4 5
I am satisfied with the application. 4%(1) 4%(1) 4%(1) 85%(18) 4.71 4.25
The application works the way I expected 14%(3) 85%(18) 4.85 4.34
I would recommend the application to other users. 9%(2) 4%(1) 85%(18) 4.76 4.27
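
The summary figures for a questionnaire like this one follow from simple arithmetic on the response counts, sketched below in plain Python. The function names are hypothetical, and the placement of the counts on the 1 to 5 scale is an assumption: the extracted table does not show which score column each count belongs to, so the example assumes that, for "The application works the way I expected", 3 respondents chose 4 and 18 chose 5.

```python
def response_rate(returned, distributed):
    """Percentage of distributed questionnaires that were returned."""
    return 100.0 * returned / distributed


def likert_mean(counts):
    """Mean score, where counts maps a score (1-5) to its number of respondents."""
    total = sum(counts.values())
    return sum(score * number for score, number in counts.items()) / total


def likert_shares(counts):
    """Percentage of respondents who chose each score."""
    total = sum(counts.values())
    return {score: 100.0 * number / total for score, number in counts.items()}


# 21 completed questionnaires out of 31 distributed.
rate = response_rate(21, 31)

# Assumed column placement for one item: 3 answers of "4", 18 answers of "5".
works_as_expected = {4: 3, 5: 18}
mean_score = likert_mean(works_as_expected)
shares = likert_shares(works_as_expected)

print(round(rate, 1), round(mean_score, 2), shares)
```

Under that assumption the mean comes to 102/21, about 4.86, which agrees with the 4.85 listed for this item, and the response rate comes to about 67.7%.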

degrees of agreement. The usability evaluation results in Table III suggest that most participants responded positively to statements affirming the application’s usability. The results indicate the application is a fitting solution for water parameter estimation.

IV. DISCUSSION
This study represents a significant advancement in the field of water quality monitoring. We aimed to provide a convenient and accessible tool for professionals and lay users to assess water parameters with reasonable accuracy and cost. We developed and optimised regression models to predict water parameters using a dataset encompassing various water sources and conditions. The performance of the models was evaluated using the MAE, RMSE, MSE, R2 Score, Adjusted R2, and MAPE. The low values of MAE, MSE, and RMSE indicate that the models predict values that closely align with the actual measurements, signifying the accuracy of the estimation. The high R2 Score and Adjusted R2 values suggest a strong correlation between the predicted and observed values, further affirming the reliability of the models.

Furthermore, the evaluation of the mobile application by twenty-one participants yielded promising results regarding its usability and effectiveness in real-world scenarios. Participants reported ease of use and satisfaction with the application’s interface and functionality. Moreover, the application provided rapid and convenient access to water parameter estimates, which could prove invaluable for fieldwork, environmental monitoring, and decision-making processes. However, it is essential to acknowledge certain

limitations and areas for future improvement. While the developed models demonstrate robust performance overall, there may be specific water conditions or parameters for which they are less accurate. Further refinement and validation of the models using diverse datasets could enhance their generalizability and reliability across different contexts.

Additionally, the mobile application’s evaluation sample was relatively small, consisting of twenty-one participants. While this sample provided valuable insights into initial user perceptions and experiences, a more extensive and diverse participant pool would offer a more comprehensive assessment of the application’s usability and effectiveness.

V. CONCLUSION
In conclusion, this study presents the development of a water parameter estimation application using statistical learning methods tailored to address the specific challenges of water quality monitoring in Malawi. The application, designed for environments where data is scarce and traditional equipment is costly, successfully predicts water parameters with high accuracy, as demonstrated by the low error rates and high correlation scores achieved in model evaluations. The positive usability feedback from participants further validates the application's practicality and effectiveness in real-world scenarios. However, there remains potential for future improvements, including expanding the model's adaptability to diverse water conditions and broadening the evaluation sample size for more comprehensive feedback. Overall, this research contributes significantly to the field of water quality management by providing a cost-effective, efficient, and user-friendly solution with broader implications for similar contexts globally.

REFERENCES
[1] M. Al-Mukhtar and F. Al-Yaseen, “Modeling Water Quality Parameters Using Data-Driven Models, a Case Study Abu-Ziriq Marsh in South of Iraq,” Hydrology, vol. 6, no. 1, p. 24, Mar. 2019, doi: 10.3390/hydrology6010024.
[2] UNICEF, “Malawi Annual Country Report,” Ctry. Strateg. Plan 2019-2023, 2021.
[3] C. L. Chidammodzi and V. S. Muhandiki, “Water resources management and Integrated Water Resources Management implementation in Malawi: Status and implications for lake basin management,” Lakes Reserv. Sci. Policy Manag. Sustain. Use, vol. 22, no. 2, pp. 101–114, 2017, doi: 10.1111/lre.12170.
[4] C. Mussa and J. F. Kamoto, “Groundwater Quality Assessment in Urban Areas of Malawi: A Case of Area 25 in Lilongwe,” J. Environ. Public Health, vol. 2023, no. 1, p. 6974966, 2023, doi: 10.1155/2023/6974966.
[5] S. S. Ahmed, “Assessment of Groundwater Quality Parameters Using Multivariate Statistics- A Case Study of Majmaah, KSA,” Int. J. Environ. Monit. Anal., vol. 5, no. 2, Art. no. 2, Mar. 2017, doi: 10.11648/j.ijema.20170502.13.
[6] M. Azrour, J. Mabrouki, G. Fattah, A. Guezzaz, and F. Aziz, “Machine learning algorithms for efficient water quality prediction,” Model. Earth Syst. Environ., vol. 8, no. 2, pp. 2793–2801, 2022.
[7] UNICEF Malawi, “Water, sanitation and hygiene | UNICEF Malawi.” Accessed: Jun. 11, 2024. [Online]. Available: https://www.unicef.org/malawi/water-sanitation-and-hygiene
[8] R. Barzegar, A. A. Moghaddam, J. Adamowski, and B. Ozga-Zielinski, “Multi-step water quality forecasting using a boosting ensemble multi-wavelet extreme learning machine model,” Stoch. Environ. Res. Risk Assess., vol. 32, no. 3, pp. 1–15, Mar. 2018, doi: 10.1007/s00477-017-1394-z.
[9] M. H. Gholizadeh, A. M. Melesse, and L. N. Reddi, “A Comprehensive Review on Water Quality Parameters Estimation Using Remote Sensing Techniques.,” Sensors, vol. 16, no. 8, p. 1298, Aug. 2016, doi: 10.3390/s16081298.
[10] Godson Ebenezer Adjovu, H. Stephen, and Sajjad Ahmad, “A Machine Learning Approach for the Estimation of Total Dissolved Solids Concentration in Lake Mead Using Electrical Conductivity and Temperature,” Water, 2023, doi: 10.3390/w15132439.
[11] E. Brown, M. Skougstad, and M. Fishman, “Methods for collection and analysis of water samples USGS Water-Supply Pap. 1454,” 1960.
[12] J. D. Hem, Study and interpretation of the chemical characteristics of natural water, vol. 2254. Department of the Interior, US Geological Survey, 1985.
[13] A. Rusydi, “Correlation between conductivity and total dissolved solid in various type of water: a review, IOP conference series: Earth and environmental science,” IOP Publ. Doi, vol. 10, pp. 1755–1315, 2018.
[14] N. Walton, “Electrical conductivity and total dissolved solids—what is their precise relationship?,” Desalination, vol. 72, no. 3, pp. 275–292, 1989.
[15] Lingcheng Li et al., “A machine learning approach targeting parameter estimation for plant functional type coexistence modeling using ELM-FATES (v2.0),” Geosci. Model Dev., 2023, doi: 10.5194/gmd-16-4017-2023.
[16] D. Carvajal-Patiño and R. Ramos-Pollán, “Synthetic data generation with deep generative models to enhance predictive tasks in trading strategies,” Res. Int. Bus. Finance, vol. 62, p. 101747, 2022, doi: https://doi.org/10.1016/j.ribaf.2022.101747.
[17] J. Wei, J. Zhao, X. Lei, Z. Zhang, and H. Wang, “Statistical-Learning-Based Ensemble Data Assimilation Methods for Parameter Estimation in Hydrodynamic Models,” Mar. 29, 2022, Rochester, NY: 4069683. doi: 10.2139/ssrn.4069683.
[18] K. Kenda, J. Peternelj, N. Mellios, D. Kofinas, M. Čerin, and J. Rožanec, “Usage of statistical modeling techniques in surface and groundwater level prediction,” J. Water Supply Res. Technol.-Aqua, vol. 69, no. 3, pp. 248–265, Apr. 2020, doi: 10.2166/aqua.2020.143.
[19] B. Khalil, T. B. M. J. Ouarda, and A. St-Hilaire, “Estimation of water quality characteristics at ungauged sites using artificial neural networks and canonical correlation analysis,” J. Hydrol., vol. 405, no. 3, pp. 277–287, Aug. 2011, doi: 10.1016/j.jhydrol.2011.05.024.
[20] N. Feldkamp, “Data Farming Output Analysis Using Explainable AI,” in Proceedings of the Winter Simulation Conference, in WSC ’21. Phoenix, Arizona: IEEE Press, 2022.
[21] C. Nyasulu and W. Dominic Chawinga, “Using the decomposed theory of planned behaviour to understand university students’ adoption of WhatsApp in learning,” E-Learn. Digit. Media, vol. 16, no. 5, pp. 413–429, Sep. 2019, doi: 10.1177/2042753019835906.
[22] T. Emmanuel, T. Maupong, D. Mpoeleng, T. Semong, B. Mphago, and O. Tabona, “A survey on missing data in machine learning,” J. Big Data, vol. 8, no. 1, pp. 1–37, 2021.
[23] R. Ahn, S. Supakkul, L. Zhao, K. Kolluri, T. Hill, and L. Chung, “A Goal-Oriented Approach for Preparing a Machine-Learning Dataset to Support Business Problem Validation,” in 2021 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), 2021, pp. 282–289. doi: 10.1109/DASC-PICom-CBDCom-CyberSciTech52372.2021.00057.
[24] M. Khadr and M. Elshemy, “Data-driven modeling for water quality prediction case study: The drains system associated with Manzala Lake, Egypt,” Ain Shams Eng. J., vol. 8, no. 4, pp. 549–557, Dec. 2017, doi: 10.1016/j.asej.2016.08.004.
