Dengue Cases Forecasting in The Philippines Using ARIMA

The research project titled 'Dengue Cases Forecasting in the Philippines Using ARIMA' aims to utilize ARIMA models to predict future dengue cases in the Philippines, focusing on historical data from 2008 to 2016. The study seeks to identify contributing factors to dengue outbreaks, forecast case trends for the next five years, and provide effective solutions for prevention and control. It emphasizes the importance of early forecasting for healthcare resource allocation and highlights the need for improved data collection and predictive modeling to enhance public health responses.


UNIVERSITY OF NEGROS OCCIDENTAL - RECOLETOS, INC

COLLEGE OF INFORMATION TECHNOLOGY


#51 Lizares Avenue, Bacolod City, 6100, Negros Occidental

Dengue Cases Forecasting in the Philippines Using ARIMA
______________________________________________________________________________

A Research Project

Operation Research / Management Science

Presented to the Faculty of the

College of Information Technology

University of Negros Occidental – Recoletos, Incorporated

______________________________________________________________________________

In Partial Fulfillment of the

Requirements for the Courses

CIT11123X

______________________________________________________________________________

By:

Casipe, Elaine

Ginete, Felicity

Piodena, Adrian

May, 2025

Bachelor of Science in Computer Science



TABLE OF CONTENTS

Cover Page
TABLE OF CONTENTS
APPROVAL SHEET
TITLE PAGE
ABSTRACT
I. INTRODUCTION
    A. OBJECTIVES
        1. Identify the factors that contribute to the rise of dengue disease
        2. Forecast on the number of cases of dengue in the Philippines 5 years from now
        3. Provide solutions to alleviate the rise of dengue cases in the Philippines
        4. Identify the impact of forecasting dengue cases to the healthcare industry
        5. Utilizing historical data for dengue cases forecasting
    B. STATEMENT OF THE PROBLEM
    C. SCOPE AND LIMITATIONS
    D. SIGNIFICANCE OF THE STUDY
        1. Local Government
        2. Parents
        3. Public Health Officials
        4. Future Researchers


II. REVIEW OF RELATED LITERATURE
    A. OVERVIEW OF THE FORECASTING TECHNIQUE
    B. HISTORICAL DEVELOPMENT OF FORECASTING TECHNIQUES
    C. APPLICATIONS OF FORECASTING IN VARIOUS STUDIES
        1. Public Health and Healthcare
        2. Education
        3. Agriculture and Rural Development
        4. Tourism and Hospitality Industry
        5. Researcher
    D. GAPS AND CHALLENGES
III. METHODOLOGY
    A. DATA COLLECTION
        1. Sources of Data
        2. Data Preprocessing and Cleaning
    B. FORECASTING TECHNIQUES
        1. Time Series Analysis
        2. Machine Learning Approaches
        3. Comparative Methods
    C. PLATFORMS AND TOOLS FOR ANALYSIS
        1. Anaconda
        2. Python

        3. Jupyter Notebook
IV. RESULTS AND ANALYSIS
    A. MODEL IMPLEMENTATION
    B. PERFORMANCE EVALUATION METRICS
    C. VISUALIZATION OF FORECAST AND TRENDS
V. RESULTS AND DISCUSSIONS
    A. FINDINGS/ANALYSIS
        1. What are the factors that contribute to the rise of dengue cases in the Philippines?
        2. What will be the forecast of dengue cases for the years 2025–2030?
        3. How can historical data be useful in forecasting dengue cases in the Philippines?
        4. What is the impact of early dengue forecasting on healthcare resource allocation?
        5. What are the best preventive measures to mitigate the dengue cases in the Philippines?
    B. CONCLUSION
    C. RECOMMENDATION
APPENDICES
    A. CODE
    B. SNIPPET OF THE DATA SET
    C. ADF TEST ON LOG TRANSFORM SERIES
    D. ADF TEST ON FIRST DIFFERENCING
    E. EVALUATION TEST SET


    F. FORECASTED MONTHLY DENGUE CASES
    G. FORECASTED DATASET IN .CSV FILE
REFERENCES
CURRICULUM VITAE


APPROVAL SHEET

The research paper hereto attached, entitled “Dengue Cases Forecasting in the Philippines Using ARIMA,” prepared and submitted by Elaine Casipe, Felicity Ginete, and Adrian Piodena in partial fulfillment of the requirements of Operation Research / Management Science, is hereby accepted.

William F. Vidal, MSCS, Ongoing


____________________________________

Panelist

​ ​ ​ Date signed:

​ ​ Noted By:

Maxil S. Urocay, MSCS, Ongoing


____________________________________

College Faculty / QAMSOR Instructor

Date signed:

Dengue Cases Forecasting in the Philippines Using ARIMA

Elaine Casipe, Felicity Ginete, Adrian Piodena
College of Information Technology
University of Negros Occidental - Recoletos, Incorporated
Bacolod City, Philippines

___________________________________________________________________________________________
ABSTRACT

This study investigates the application of Autoregressive Integrated Moving Average (ARIMA) models in forecasting dengue cases in the Philippines. Dengue fever poses a growing public health threat due to climate variability, urbanization, and poor sanitation. With the Philippines experiencing regular outbreaks, forecasting future cases is essential for effective prevention and resource planning. The study utilizes historical dengue data from 2008 to 2016 obtained from Kaggle, focusing on time-based and regional trends. Data preprocessing ensured stationarity through differencing, and model parameters were selected via ACF and PACF analyses. The ARIMA model was trained to predict future dengue trends, with forecast results validated using metrics such as RMSE, MAE, and MAPE. Results show that ARIMA effectively captures general trends and provides a reasonable level of accuracy with a MAPE of 18.41%. However, the model struggles with sudden case surges due to the absence of exogenous variables. The study suggests incorporating climate data and hybrid models to improve prediction accuracy. Forecasts indicate consistent yearly peaks in dengue cases, highlighting the disease’s cyclical nature. These insights can guide public health officials in proactive planning, resource allocation, and targeted interventions.

I. INTRODUCTION

Dengue fever is a widespread viral disease transmitted to people through the bites of infected Aedes mosquitoes, primarily Aedes aegypti. These mosquitoes thrive in urban tropical environments, breeding in stagnant water found in uncovered containers such as tires, buckets, and flower pots [3]. Over the past five decades, the global incidence of dengue has increased dramatically, with a 30-fold rise in cases [1]. Dengue is now endemic in the Philippines, and it causes an estimated 50 to 390 million infections worldwide each year [5], making it the most significant arthropod-borne viral disease affecting humans. A large share of the Philippine population is at risk of infection, underscoring the urgent need for effective prevention and control strategies [1].

The clinical manifestations of dengue range from asymptomatic infections to severe and potentially fatal forms, such as dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS). Classic dengue fever (DF) is characterized by sudden high fever, severe muscle and joint pain, headache, nausea, and vomiting, often followed by prolonged fatigue and depression [8]. Severe cases, particularly DHF, occur more frequently in individuals experiencing a secondary infection with a different dengue virus serotype. DHF is marked by increased vascular permeability, leading to plasma leakage, organ dysfunction, and shock, with mortality rates as high as 20% in untreated cases [7]. Currently, prevention efforts rely heavily on vector control, as no specific antiviral treatment or widely licensed vaccine is available [1]. Consequently, there is a growing emphasis on developing predictive models to forecast dengue outbreaks, enabling public health agencies to implement timely and targeted interventions [9].

Despite extensive research on dengue, most studies have focused on retrospective detection and surveillance rather than proactive prediction. Environmental factors, such as temperature, rainfall, and mosquito population dynamics, have been widely studied for their influence on dengue transmission [4]. However, few models have successfully predicted outbreaks in advance. Recent advances in
Autoregressive Integrated Moving Average (ARIMA) techniques offer promising opportunities to improve dengue forecasting by modeling temporal patterns in historical case data. When combined with environmental, climatic, and epidemiological variables, such methods can provide more accurate and timely predictions. This study aims to build on these advancements, focusing on the development and application of predictive models to forecast monthly dengue incidence in specific regions, with the goal of enhancing outbreak preparedness and resource allocation in high-risk areas.

A. OBJECTIVES
This section of the study focuses on the primary objectives that the project aims to accomplish. One of the key goals is to identify the factors contributing to the rise of dengue cases, as this is crucial for developing effective solutions. Additionally, the study seeks to forecast potential dengue cases in the coming years using Autoregressive Integrated Moving Average (ARIMA) models, providing valuable insights for future outbreak preparedness.

By analyzing current trends and preventive measures, this research also aims to determine the most effective solutions for controlling the spread of the disease. The findings of this study will serve as a foundation for improving people's safety, enabling health authorities to implement targeted and timely interventions based on accurate predictions.

1. Identify the factors that contribute to the rise of dengue disease.
The objective of this study is to identify the key factors contributing to the rise of dengue disease and understand their impact on its spread. By analyzing environmental, social, and climatic influences, this research aims to highlight the conditions that favor mosquito breeding and virus transmission.

To enhance predictive capabilities, the study will utilize Autoregressive Integrated Moving Average (ARIMA) models to forecast dengue trends based on these influencing factors. Understanding these elements will help develop more effective prevention and control strategies to reduce dengue outbreaks. Ultimately, the findings can support public health initiatives and policy-making efforts to mitigate the disease's impact on communities.

2. Forecast on the number of cases of dengue in the Philippines 5 years from now
This study aims to project the number of dengue cases in the Philippines over the next five years by assessing historical data. By analyzing key variables such as the month, year, region, and reported dengue cases, the research seeks to identify patterns and potential outbreak risks that could emerge in the future.

The insights gained from this research will assist health officials and the government in making informed decisions based on the study’s forecast. Ultimately, the study's findings can help improve resource distribution, strengthen public health strategies, and raise community awareness about dengue prevention and control.

3. Provide solutions to alleviate the rise of dengue cases in the Philippines.
The objective of this study is to explore and propose effective solutions to reduce the increasing number of dengue cases in the Philippines. By examining the factors contributing to the spread of the disease, such as poor sanitation, stagnant water sources, and limited public awareness, the research aims to provide a comprehensive understanding of the challenges involved.

Based on these findings, the study will recommend strategies including improved sanitation practices, mosquito control programs, and targeted public awareness campaigns. Implementing these solutions can help minimize dengue outbreaks, protect public health, and strengthen community involvement in long-term disease prevention.

4. Identify the impact of forecasting dengue cases to the healthcare industry.
Early dengue forecasting plays a crucial role in optimizing healthcare resource allocation by enabling hospitals and clinics to prepare in advance for potential outbreaks. By predicting case surges, healthcare facilities can ensure adequate staffing, medical supplies, and hospital bed availability, thereby reducing strain on the healthcare system and preventing service disruptions.

This proactive approach minimizes overcrowding and improves patient care, ultimately leading to better health outcomes. Additionally, early forecasting allows government agencies to distribute resources more efficiently, ensuring that high-risk areas receive the necessary support to combat outbreaks effectively and contain the spread of the disease.

5. Utilizing historical data for dengue cases forecasting
Historical data provides valuable insights into dengue trends, seasonal patterns, and outbreak cycles, making it a critical tool for forecasting future cases in the Philippines. This information helps in developing predictive models that enable public health officials to implement timely interventions. Additionally,
historical data allows for the identification of high-risk regions, improving targeted prevention and control strategies. Utilizing past data ensures a more evidence-based approach to dengue management, ultimately reducing cases and improving public health outcomes.

In conclusion, addressing the rise of dengue cases in the Philippines requires a comprehensive approach. By implementing solutions, communities can reduce mosquito breeding grounds and lower the risk of disease transmission. Continuous efforts and collaboration among government agencies, health organizations, and the public are essential to effectively controlling and preventing future dengue outbreaks.

B. STATEMENT OF THE PROBLEM
Dengue is a growing public health threat, especially in tropical and subtropical regions, due to the increasing spread of Aedes mosquitoes. Despite existing prevention efforts, cases continue to rise, leading to severe health complications and economic burdens. Ineffective mosquito control and inconsistent policy implementation further hinder efforts to reduce outbreaks. Addressing these challenges is essential to developing better prevention strategies and improving healthcare responses to dengue.

● What are the factors that contribute to the rise of dengue cases in the Philippines?
● What will be the forecast of dengue cases for the years 2025–2030?
● How can historical data be useful in forecasting dengue cases in the Philippines?
● What is the impact of early dengue forecasting on healthcare resource allocation?
● What are the best preventive measures to mitigate the dengue cases in the Philippines?

Understanding the root causes of the increasing dengue cases is crucial in addressing the growing public health threat. Identifying the factors that contribute to dengue outbreaks, such as environmental conditions and human-related factors, serves as a foundation for developing targeted solutions. Additionally, forecasting future trends in dengue cases can provide valuable insights for authorities, enabling them to implement proactive measures and allocate resources more efficiently. By taking a data-driven approach to prevention and control, public health officials can reduce the impact of dengue and protect communities from further crises.

C. SCOPE AND LIMITATIONS
One of the primary limitations of this study is the restricted timeframe of the dataset, which only includes dengue cases from 2008 to 2016. The absence of data from earlier years may have limited the depth of analysis and prevented the identification of long-term trends. Additionally, the dataset is constrained in terms of available variables, as it only includes information on the month, year, region, and number of cases. Key factors such as weather conditions, seasonal variations, and other environmental or socio-economic influences were not considered; including them could have enhanced the accuracy and robustness of the predictions.

Moreover, the study focuses exclusively on the Philippines, limiting its applicability to other countries with different climates, population densities, and public health infrastructures. Even within the Philippines, the data is only categorized at the regional level, making it too broad to capture localized trends at the city, town, or barangay level. A more granular dataset would have allowed for more precise, community-specific insights. The lack of detailed geographical data further restricts the study’s ability to analyze spatial patterns and pinpoint high-risk areas with greater accuracy.

Due to these limitations, the generalizability of the findings is constrained, particularly for regions outside the study’s scope. Future research should address these gaps by incorporating a more extensive dataset that covers a longer timeframe, includes additional variables, and offers a finer geographical breakdown to improve the precision and applicability of dengue forecasting models.

D. SIGNIFICANCE OF THE STUDY

1. Local Government
Local government officials can use this study to develop more effective dengue prevention and control programs. By raising public awareness through educational campaigns, social media, and community outreach, the study can help reduce infection rates and promote proactive measures at the community level. These efforts are crucial for fostering a deeper understanding of the risks and preventive actions related to dengue.

The findings can also support localized interventions such as fumigation, waste management, and clean-up drives aimed at eliminating mosquito breeding sites. By implementing data-driven policies, health officials can improve public health strategies, minimize outbreaks, and allocate resources more effectively. Active community engagement will be key
to strengthening dengue prevention efforts and ensuring that the public plays an integral role in safeguarding public health.

2. Parents
This study will help parents better understand dengue and take preventive measures to protect their families. By staying informed about how the disease spreads and recognizing early symptoms, parents can create safer home environments and significantly reduce the risk of infection. Simple actions such as eliminating stagnant water, using mosquito repellents, and ensuring proper sanitation can make a big difference in preventing dengue.

Fewer severe cases will result in less strain on healthcare facilities and lower hospitalization rates, easing the burden on the medical system. Additionally, preventive actions can help families avoid the financial strain associated with medical treatment. Ultimately, informed and proactive parents play a vital role in strengthening community health and supporting broader dengue prevention efforts.

3. Public Health Officials
Public health officials can use the study’s findings to improve dengue outbreak preparedness and response. By utilizing accurate forecasting, hospitals and clinics can ensure the availability of essential medical supplies, adequate staffing, and necessary infrastructure for early and efficient treatment of patients. This level of preparedness is crucial in managing case surges and preventing healthcare systems from becoming overwhelmed.

Improved forecasting and resource allocation can help reduce mortality rates and enhance the overall quality of patient care. In addition, the study can serve as a valuable tool in guiding the development of stronger, data-driven public health policies. By strengthening preventive measures and community-based interventions, the findings will contribute to greater resilience against dengue outbreaks and support long-term public health improvements.

4. Future Researchers
This study provides a valuable reference for researchers exploring dengue trends, risk factors, and prevention strategies. Access to well-documented data can enhance the accuracy, reliability, and depth of future research efforts. By offering insights into patterns of infection and contributing factors, the study lays a strong foundation for continued exploration in the field of dengue epidemiology.

Researchers can build upon these findings to develop new models for predicting and controlling dengue outbreaks. Additionally, the study highlights key areas that warrant further investigation and innovation in public health practices. Ultimately, it contributes to the broader scientific effort to combat dengue and supports the advancement of evidence-based strategies for disease prevention and control.

II. REVIEW OF RELATED LITERATURE

A. OVERVIEW OF THE FORECASTING TECHNIQUE
Autoregressive Integrated Moving Average (ARIMA) is a traditional statistical model that combines autoregressive (AR), differencing (I, for "integrated"), and moving average (MA) components to analyze and forecast time series data [1]. It is particularly effective for linear relationships and for series that are stationary, or that can be made stationary through differencing, so that the statistical properties of the series remain constant over time. ARIMA models have been widely used in dengue forecasting due to their simplicity and interpretability. For example, studies in the Philippines have demonstrated ARIMA's ability to capture trends and short-term fluctuations in dengue cases, making it a reliable tool for near-term predictions [9]. However, ARIMA's reliance on linear assumptions limits its ability to model complex, nonlinear interactions between variables such as climate, mosquito populations, and human behavior, which are critical for accurate dengue forecasting.

The effectiveness of ARIMA lies in its simplicity and capacity to handle time series data with consistent patterns. It remains a useful tool for short-term predictions and scenarios with linear trends. Evidence from studies in the Philippines supports the continued use of ARIMA for dengue forecasting, particularly for identifying and responding to periodic variations in case numbers. By utilizing ARIMA, public health officials in the Philippines can develop more reliable forecasting systems to anticipate outbreaks, allocate resources efficiently, and implement targeted interventions to mitigate the impact of dengue [1].

B. HISTORICAL DEVELOPMENT OF FORECASTING TECHNIQUES
Forecasting techniques for dengue cases in the Philippines have evolved significantly over the years, with traditional statistical methods giving way to more advanced
approaches. Initially, methods like Autoregressive Integrated Moving Average (ARIMA) were widely used due to their simplicity and effectiveness in capturing linear and seasonal patterns in time series data [9]. ARIMA models rely on historical data to predict future dengue cases by identifying trends, seasonality, and residuals. However, ARIMA has limitations in handling non-linear relationships and complex dependencies in data, which are often present in dengue case time series due to factors like climate variability, population density, and public health interventions.

In recent years, advancements in ARIMA techniques have continued to improve dengue forecasting accuracy. ARIMA models have been enhanced with hybrid approaches that combine them with machine learning algorithms to address their limitations [10]. These developments have enabled more timely and accurate forecasts, helping public health officials in the Philippines implement targeted interventions and allocate resources. The ongoing integration of big data and artificial intelligence into forecasting systems promises even greater precision in predicting dengue outbreaks, ultimately contributing to better disease control and prevention strategies.

C. APPLICATIONS OF FORECASTING IN VARIOUS STUDIES
Forecasting plays a vital role across industries in mitigating the effects of dengue, especially in the areas of public health, resource allocation, and strategic interventions to reduce morbidity and mortality rates. In the public health sector, accurate forecasting enables authorities to anticipate outbreaks, deploy medical personnel and supplies efficiently, and implement targeted response strategies. Schools benefit from forecasting by planning vector control measures, launching health awareness campaigns, and making informed decisions about temporary closures to ensure the safety of students and staff.

Beyond healthcare and education, other industries also rely on dengue forecasts. In agriculture, forecasting helps guide timely vector control interventions, protecting farmworkers and ensuring the stability of food production in rural communities. The tourism industry uses dengue forecasts to implement enhanced sanitation and mosquito control efforts, protecting tourists and minimizing economic disruptions. Additionally, forecasting supports scientific research by offering valuable data for understanding dengue epidemiology, testing hypotheses, and refining public health strategies. These applications highlight the critical role of reliable forecasting in reducing the broader impact of dengue across multiple sectors.

1. Public Health and Healthcare
Dengue forecasting models, particularly advanced methods like Random Forest, are extensively applied in the public health sector to predict outbreaks and allocate resources efficiently. These models analyze large datasets, including climate variables, population density, and historical case counts, to identify patterns and anticipate future dengue incidence. In the Philippines, where dengue is a persistent threat, accurate forecasting empowers healthcare providers to take proactive measures, such as stockpiling medications, expanding hospital bed availability, and deploying medical personnel to vulnerable regions.

For instance, the study in [10] demonstrated how Random Forest models could forecast dengue cases several weeks in advance, enabling public health authorities to implement timely and targeted interventions. Such predictive accuracy plays a crucial role in minimizing the severity of outbreaks, ultimately reducing both morbidity and mortality. This application reinforces the relevance of our study by underscoring the significance of reliable forecasting tools in enhancing public health preparedness and response strategies against dengue.

2. Education
Schools and institutions of higher learning in the Philippines utilize dengue forecasts to safeguard students and staff from potential outbreaks. Predictive models help these institutions schedule timely vector control measures, such as fumigation and clean-up drives, ahead of peak transmission periods. This proactive approach reduces the presence of mosquito breeding sites on campuses and highlights the importance of forecasting in protecting public health within educational environments [7].

In addition to vector control, forecasts allow schools to organize health education campaigns that raise awareness among students and parents about dengue prevention practices. These initiatives empower communities with knowledge on eliminating mosquito habitats and recognizing early symptoms. In severe cases, forecasts may also inform temporary school closures in high-risk areas to minimize exposure. By integrating forecasting into their health and safety planning, educational institutions can create a safer learning environment and help curb the spread of dengue among their populations.
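As a rough illustration of how a forecast could support scheduling vector control ahead of peak transmission periods, the Python sketch below takes a monthly forecast, finds the peak month in each year, and steps back a fixed number of months to suggest when control measures could begin. The forecast values, the function name `schedule_vector_control`, and the `lead_months` parameter are hypothetical and are not results or methods from this study.

```python
from collections import defaultdict


def schedule_vector_control(forecast, lead_months=2):
    """Return {year: (peak_month, start_month)} from a {(year, month): cases} forecast.

    start_month is the peak month minus lead_months, clamped to January
    so the suggested start stays within the same year.
    """
    by_year = defaultdict(dict)
    for (year, month), cases in forecast.items():
        by_year[year][month] = cases

    plan = {}
    for year, months in sorted(by_year.items()):
        # Month with the highest forecast case count in this year.
        peak_month = max(months, key=months.get)
        start_month = max(1, peak_month - lead_months)
        plan[year] = (peak_month, start_month)
    return plan


# Hypothetical forecast values for illustration only (not results from this study).
values = [900, 800, 700, 750, 1100, 1800, 2600, 3100, 2900, 2000, 1400, 1000]
forecast = {(2025, m): v for m, v in enumerate(values, start=1)}

plan = schedule_vector_control(forecast, lead_months=2)
print(plan)  # {2025: (8, 6)}
```

With these made-up numbers the forecast peaks in August, so a two-month lead time would place the start of fumigation and clean-up drives in June. A real schedule would need to account for forecast uncertainty and local conditions.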
3. Agriculture and Rural Development
In rural communities, dengue forecasting plays a crucial role in protecting farming populations from potential outbreaks. Forecasting models help determine the optimal timing for vector control measures in agricultural settings, such as draining stagnant water from rice paddies, irrigation canals, and other water-holding areas that serve as mosquito breeding grounds [1]. This application is closely aligned with our research, as it emphasizes the importance of forecasting in preserving agricultural productivity and minimizing health-related disruptions in rural economies.

By reducing the incidence of dengue, these interventions help prevent illness among farmers and their families, thereby avoiding work interruptions and income loss. In addition, forecasting can inform the design of community-based strategies that engage residents in proactive prevention efforts, such as coordinated clean-up drives and education campaigns. These localized responses, guided by accurate predictions, ensure that rural communities are better prepared to manage outbreaks and maintain both public health and economic stability.

4. Tourism and Hospitality Industry
The tourism industry in the Philippines benefits significantly from dengue forecasting by enabling the implementation of timely preventive measures in high-risk areas. Accurate predictions allow hotels, resorts, and tourist destinations to enhance their sanitation protocols and mosquito control efforts, such as fumigation and eliminating breeding grounds, thereby reducing the risk of dengue transmission to visitors [6]. This proactive approach helps safeguard tourist health and underscores the value of forecasting in preserving the operational continuity of tourism establishments.

This application is particularly relevant to our study as it highlights how forecasting can mitigate the broader economic effects of dengue outbreaks on tourism, a major contributor to the Philippine economy. Proactive communication about dengue risks and ongoing preventive efforts also fosters trust and reassurance among travelers. By integrating forecasting models into their health and safety strategies, tourism stakeholders can maintain tourist confidence, minimize disruptions, and uphold the country's reputation as a premier travel destination.

5. Researcher
Dengue forecasting models also make contributions to scientific research by supplying data for the study of the epidemiology and transmission dynamics of the disease. Researchers in the Philippines utilize these models to test hypotheses and assess intervention strategies [2]. This use is consistent with our study's emphasis on the advancement of scientific knowledge and the informing of public health practice [3]. In addition, forecasting models can be utilized to formulate new methods and enhance current methods, thereby adding to the general body of knowledge in dengue transmission and control. Researchers can better understand dengue dynamics and formulate more effective interventions by using forecasting models.

By using dengue forecasting in these industries, the Philippines can better prepare and respond to dengue outbreaks, finally minimizing the effect of the disease on public health, the economy, and society.

D. GAPS AND CHALLENGES
Despite significant advancements in dengue forecasting, several gaps and challenges persist in the Philippines. One major issue is the lack of high-quality, real-time data. Accurate forecasting relies on comprehensive and timely data on dengue cases, mosquito populations, and environmental factors such as rainfall, temperature, and humidity. However, in many regions of the Philippines, surveillance systems are underdeveloped, leading to delays in case reporting and incomplete data. Additionally, there is often limited integration of data from different sources, such as healthcare facilities, meteorological agencies, and environmental monitoring systems. This fragmentation hinders the ability to build robust and reliable forecasting models.

Another challenge is the complexity of dengue transmission dynamics, which are influenced by a wide range of interacting factors. These include climatic conditions, urbanization, population mobility, and socio-economic factors, all of which vary significantly across different regions of the Philippines [1]. Traditional statistical models, such as ARIMA, often struggle to capture these nonlinear and multifaceted relationships. Furthermore, the lack of expertise in advanced data science among public health professionals can limit the adoption and implementation of sophisticated forecasting models.

Finally, there is a gap in translating forecasting results into actionable public health interventions. Even when accurate predictions are made, the response is often hindered by logistical, financial, and political barriers. For example, vector control measures, such as fogging and clean-up drives, require significant coordination and resources, which
may not be available in remote or underserved areas. Additionally, public awareness and community engagement are critical for the success of preventive measures, but these efforts are often underfunded or poorly executed. Addressing these gaps and challenges requires a multi-faceted approach, including investments in data infrastructure, capacity building, and cross-sector collaboration, to ensure that dengue forecasting can effectively contribute to reducing the disease burden in the Philippines.

III. METHODOLOGY

A. DATA COLLECTION
The study utilizes historical dengue case data to develop an accurate forecasting model for dengue outbreaks in the Philippines. The dataset contains monthly dengue case records categorized by year and region, allowing researchers to analyze trends and seasonal variations in dengue incidence. The ability to track dengue cases over time makes this dataset a valuable resource for predictive modeling and outbreak management [5].

ARIMA is used to predict future dengue cases. ARIMA is a well-established statistical model that identifies time-dependent and seasonal patterns [8]. By applying this model, the study aims to enhance prediction accuracy and provide a robust framework for dengue forecasting [6].

Data preprocessing and cleaning techniques are applied before training the models. These steps ensure that the dataset is high-quality, reliable, and ready for analysis, which is crucial for generating precise and actionable dengue case predictions [4].

1. Sources of Data
The source for this study is Kaggle, which hosts publicly available datasets on dengue cases and climate-related factors. The dataset used contains key information such as dengue case counts per region, recorded timeframes, year, and month levels. These factors enable the study to explore the relationship between environmental variables and the seasonal trends of dengue outbreaks in the Philippines [2]. By using Kaggle's structured dataset, the study ensures accessibility to high-quality data that supports the implementation of ARIMA models.

The dengue cases dataset serves as the primary data source for this study. It contains dengue case records from 2008 to 2016, covering multiple regions in the Philippines. The dataset consists of four key variables:
● Month – The specific month in which dengue cases were recorded.
● Year – The corresponding year of the recorded cases.
● Region – The geographical region where the cases occurred.
● Dengue_Cases – The number of reported dengue cases per region and month.

The dataset is essential for understanding seasonal dengue patterns and regional variations in dengue incidence. By analyzing this information, researchers can identify high-risk months and areas where dengue outbreaks are more likely to occur [10].

One limitation of using Kaggle is that the data may not always be updated in real time, which can affect the forecasting model's ability to predict sudden dengue outbreaks. To improve forecasting accuracy, the study suggests incorporating real-time surveillance data from public health agencies in future research [1]. This would help refine the model's predictive capabilities and allow public health officials to implement timely intervention strategies based on emerging dengue trends.

While the dataset provides comprehensive coverage, it also contains gaps in recorded dengue cases for certain months and regions. These missing values can introduce bias in forecasting models, which is why data cleaning and preprocessing techniques are crucial before model implementation [2].

2. Data Preprocessing and Cleaning
The original dataset was not suitable for training in its initial state because it was not stationary. To assess stationarity, the team first applied the Augmented Dickey-Fuller (ADF) test. The results confirmed that the series was not stationary, so differencing was applied twice to achieve stationarity. ACF and PACF plots were then examined to determine the appropriate orders for training the model. The dataset was also checked for missing values, and none were found. It therefore required no further cleaning, as it was already well-structured and complete.

B. FORECASTING TECHNIQUES

1. Time Series Analysis
ARIMA (AutoRegressive Integrated Moving Average) was used to forecast dengue cases in the Philippines, considering both trend and seasonality in the data. The model was selected because dengue outbreaks often follow seasonal patterns influenced by factors such as climate and rainfall. The first step
involved checking the time series data for stationarity using tests such as the Augmented Dickey-Fuller (ADF) test. Once non-stationarity was confirmed, differencing was applied to make the data stationary. To account for recurring seasonal patterns, seasonal components were also considered. ACF and PACF plots were analyzed to identify the appropriate parameters for the ARIMA model, including autoregressive, differencing, and moving average terms.

After selecting the best-fit ARIMA model, it was trained using historical dengue data. The model was then used to forecast future dengue cases, producing predictions with confidence intervals. These forecasts assist public health officials in preparing and allocating resources effectively to manage potential outbreaks. By providing data-driven insights, ARIMA helps anticipate dengue trends, enabling better decision-making for dengue control efforts in the Philippines.

2. Machine Learning Approaches
To predict dengue cases, the dataset must first be prepared by ensuring it includes relevant features and a target variable. The date column should be converted to datetime format, and additional time-based features such as month, year, or week can be extracted to enhance model performance. Missing values should be handled appropriately, and outliers may require removal or transformation. Since this method, unlike ARIMA, is not designed for sequential data, the time series can be reframed as a supervised learning problem by shifting previous observations into lag features. For example, creating new columns that represent dengue cases from one or more previous months can help the model learn patterns. The dataset is then split into training and testing sets to evaluate model performance.

The model is trained on the data, leveraging its structure to reduce overfitting and improve prediction accuracy. Key hyperparameters should be tuned to optimize performance. After training, the model is used to predict dengue cases for the next five years by iteratively forecasting future values using predicted cases as new inputs. Feature importance analysis can reveal which factors, such as seasonal trends or environmental conditions, have the greatest impact on dengue outbreaks. Visualizing the predicted values alongside actual data helps assess the model's effectiveness. Finally, performance metrics such as R-squared, RMSE, MAPE, MSE, or MAE should be used to evaluate prediction accuracy. This approach provides a powerful method for predicting dengue outbreaks.

3. Comparative Methods
ARIMA (AutoRegressive Integrated Moving Average) and SARIMA (Seasonal ARIMA) are both widely used for time series forecasting, including predicting dengue outbreaks. However, they differ in their approach to handling data patterns, particularly seasonality.

ARIMA is designed for time series data that shows overall trends but lacks strong, repeating seasonal patterns. It works by combining three key components: autoregressive terms (AR), which use past values in the series; differencing (I), which makes the data stationary (i.e., removing trends and ensuring a constant mean); and moving average terms (MA), which model the relationship between an observation and past forecast errors. In dengue prediction, ARIMA can capture general trends, such as long-term increases or decreases in case numbers, but it may struggle to capture recurring seasonal outbreaks due to weather conditions, as is the case with dengue in tropical regions like the Philippines.

SARIMA, on the other hand, extends ARIMA by including seasonal components. This model is specifically designed for data with strong seasonal behavior, such as dengue cases that often peak during specific months due to environmental factors like rainfall or temperature. SARIMA models both short-term trends and long-term seasonality, which makes it more effective for dengue forecasting, as it can account for seasonal cycles. SARIMA's additional seasonal parameters help model the repeating patterns, providing more accurate forecasts for diseases like dengue that are influenced by seasonal variations. In contrast, ARIMA can offer reasonable short-term predictions but may miss cyclical spikes in dengue cases.

While ARIMA is faster to implement and simpler when seasonality is weak or absent, SARIMA is a more robust and reliable tool when seasonal patterns are clearly evident, like the predictable spikes in dengue cases during certain times of the year. Both models require the data to be stationary, but SARIMA also requires the identification of seasonal lags. In terms of model selection, both ARIMA and SARIMA involve analyzing autocorrelation (ACF) and partial autocorrelation (PACF) plots, and evaluating model fit using criteria like the AIC (Akaike Information Criterion) or BIC (Bayesian Information Criterion). In conclusion, SARIMA is often the better choice for dengue forecasting when there is clear seasonality, while ARIMA can be used for simpler, short-term predictions or when seasonality is not as pronounced.
C. PLATFORMS AND TOOLS FOR ANALYSIS

1. Anaconda
To efficiently analyze dengue case data and implement forecasting models, this study utilizes Anaconda as the primary platform for analysis. Anaconda is a widely used distribution that simplifies the installation and management of data science libraries, making it ideal for handling large datasets, performing computations, and executing complex models. With Anaconda, researchers can access a centralized environment that supports a variety of essential libraries for data preprocessing, statistical analysis, and machine learning applications. It comes pre-installed with crucial packages such as NumPy, Pandas, Matplotlib, and Scikit-learn, all of which play a key role in this study.

NumPy and Pandas are used for data manipulation and transformation, ensuring that the dengue case dataset is well-structured and ready for analysis. Matplotlib and Seaborn assist in visualizing data trends and relationships, allowing researchers to detect seasonal patterns and variations in dengue incidence. Meanwhile, Scikit-learn provides advanced functionalities for training predictive models. By using Anaconda, the study benefits from an organized, stable, and scalable environment that supports efficient dengue case forecasting, making it a crucial tool in the analysis and interpretation of the data.

2. Python
Python serves as the core programming language for dengue forecasting due to its flexibility, efficiency, and powerful ecosystem of libraries for scientific computing. It is a popular choice in data science and machine learning because of its extensive support for numerical analysis, data visualization, and automation. The language's libraries, such as pandas for data manipulation, NumPy for numerical computation, and Matplotlib for visualization, are essential in processing and interpreting dengue case data. Python also offers seamless integration with machine learning frameworks, making it an ideal choice for implementing forecasting models.

In this study, Python is used to implement AutoRegressive Integrated Moving Average (ARIMA) models for predicting future dengue cases. ARIMA is a time-series forecasting model that analyzes historical dengue data to capture trends and patterns. By applying ARIMA, the study leverages its ability to model linear relationships in data and generate accurate forecasts based on past trends. Python's robust libraries, such as statsmodels, provide the tools needed to efficiently implement and test ARIMA models, offering a solid foundation for predicting and managing dengue outbreaks.

3. Jupyter Notebook
An interactive computational environment is employed in this study for data preprocessing, visualization, and model implementation. Jupyter Notebook is widely used in the field of data science because it allows researchers to write and execute code in a step-by-step manner, making it easier to document findings and track progress. This tool is particularly useful for exploratory data analysis, where researchers can generate visualizations, analyze trends, and test different forecasting models before finalizing their approach.

Within the Jupyter Notebook, researchers can combine executable code, markdown text, and visual output in a single document, making it ideal for documenting the entire forecasting workflow. The platform's real-time execution capabilities enable researchers to fine-tune their models, compare results, and make adjustments as needed to improve prediction accuracy. This structured approach not only enhances the reliability of predictions but also provides a transparent and scalable framework that can be applied in future public health research and outbreak management initiatives.

IV. RESULTS AND ANALYSIS

A. MODEL IMPLEMENTATION
These libraries are essential tools for conducting time series analysis, forecasting, and evaluating model performance. pandas is used for data manipulation, especially in handling time-indexed data through DataFrames, making it easy to preprocess and analyze time series datasets. matplotlib.pyplot and seaborn are powerful visualization libraries that help create informative and aesthetically pleasing plots to explore data trends, seasonality, and anomalies. statsmodels offers advanced statistical modeling capabilities, including ARIMA and SARIMAX models, which are commonly used for forecasting, along with tools for testing stationarity (like the ADFuller test) and plotting autocorrelation through ACF and PACF. The sklearn (scikit-learn) library provides machine learning models such as RandomForestRegressor and performance evaluation metrics like mean squared error (MSE) and mean absolute error (MAE), which help in assessing the accuracy of predictions. numpy supports efficient numerical operations and array manipulations, which are fundamental in mathematical computations and preparing data for modeling. Additionally, datetime.timedelta assists in managing and manipulating date ranges, which is
useful in generating or adjusting forecasted time series.

The program begins by grouping the dataset by the 'Date' column and summing 'Dengue_Cases' to create a time series. The data is then split into training and testing sets, with 80% allocated for training and 20% for testing. To model the time series, a SARIMA model is initialized with an order of (8, 2, 1) and a seasonal order of (2, 1, 4, 12). The model is then fitted to the training data to estimate its parameters, and the fitted model generates a forecast for the length of the test data.

To assess the model's performance, the Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) are calculated between the actual test data and the forecast. For visualization, the plot size is set to (14, 16) for better readability. The training data is plotted to show historical patterns and trends, and the actual test data is plotted to allow a visual comparison with the forecast. Finally, the forecasted values are plotted as a green line, with grid lines and axis labels added for clarity.

B. PERFORMANCE EVALUATION METRICS

MSE: 1749514402.68
RMSE: 41827.20
MAE: 29380.83
MAPE: 18.41%

Mean Squared Error (MSE: 1749514402.68)
The Mean Squared Error (MSE) measures the average of the squares of the errors between actual and predicted values. A high MSE indicates that the model's predictions deviate significantly from the actual observations. In this case, the MSE value is over 1.7 billion, which seems large due to the scale of the dengue case numbers. Since MSE squares the errors, it gives more weight to large errors than to smaller ones, penalizing the model more harshly when predictions are far off. This can be useful in time series forecasting where large errors might have serious consequences. However, MSE alone doesn't provide an intuitive sense of the magnitude of the error due to its squared units. That's why it's usually interpreted in conjunction with other metrics like RMSE and MAE. Despite the large value, it reflects the data's natural scale and variance and should be compared against baseline or alternative models for context.

Root Mean Squared Error (RMSE: 41827.20)
RMSE is the square root of the MSE and brings the error measure back to the same unit as the original data, making it more interpretable. An RMSE of about 41,827 means that the forecasted dengue case numbers typically deviate from actual values by roughly this amount. Considering that dengue case counts can reach into the hundreds of thousands, this level of error might be acceptable or high depending on the application. RMSE is sensitive to large errors due to the squaring process before taking the root. This makes it useful for detecting models that perform poorly on extreme values. In public health forecasting, where planning resources based on projected cases is critical, even an RMSE of this size should be carefully evaluated. RMSE also allows for easier comparison across different models and datasets. It's often considered a standard metric for evaluating time series models like ARIMA.

Mean Absolute Error (MAE: 29380.83)
The Mean Absolute Error (MAE) gives the average of the absolute differences between predicted and actual values. An MAE of approximately 29,381 indicates that the model's predictions are off by that amount on average. Unlike MSE and RMSE, MAE does not square the errors, so it treats all errors equally regardless of size. This makes it less sensitive to outliers and gives a more balanced view of overall prediction accuracy. In the context of dengue forecasting, this error might be reasonable if case numbers often vary widely from month to month. MAE is easier to interpret and gives a straightforward sense of how much the model is typically wrong. It's especially useful when you want to minimize overall error rather than focus on extreme deviations. Comparing MAE and RMSE together helps determine whether large errors are skewing the results.

Mean Absolute Percentage Error (MAPE: 18.41%)
MAPE expresses prediction accuracy as a percentage, making it easier to understand across different contexts. A MAPE of 18.41% means that, on average, the model's predictions are about 18.41% off from the actual values. This level of error suggests moderate forecasting accuracy, better than random guessing but with room for improvement. MAPE is particularly useful when comparing forecast performance across different datasets or models with varying scales. However, MAPE can be misleading when actual values are close to zero, as it can inflate the percentage error. In this dataset, where dengue cases are typically large, MAPE remains a reliable indicator. It's a helpful metric for stakeholders who need a percentage-based summary of forecast
reliability. Overall, 18.41% is a tolerable error margin in many public health applications, though lower values would be more desirable for critical planning purposes.

C. VISUALIZATION OF FORECAST AND TRENDS

This graph shows the performance of an ARIMA (AutoRegressive Integrated Moving Average) model in forecasting monthly dengue cases. The x-axis represents time, spanning from 2006 to early 2017, while the y-axis indicates the number of dengue cases. Three types of data are plotted: the training data in blue, actual test data in orange, and the ARIMA model's forecast in green. The training data covers the period from 2006 to around 2014. During this time, the number of dengue cases shows a generally increasing trend with noticeable seasonal spikes. Around 2013, there was a significant peak in cases, followed by fluctuations. The orange line begins around late 2014, representing unseen data used to test the model's predictive power. The green line shows the ARIMA forecast, which attempts to match the orange line of actual values. Visually, the forecast closely follows the actual values, suggesting that the model performs reasonably well. Some discrepancies can be seen, especially at certain peaks, which may indicate limitations in the model's ability to fully capture extreme variations. Overall, the chart demonstrates how ARIMA can be applied to real-world health data for time series forecasting.

This graph presents a time series forecast of monthly dengue cases from 2011 to 2027, generated using an ARIMA (AutoRegressive Integrated Moving Average) model. The x-axis shows the timeline in years, while the y-axis represents the number of dengue cases per 100,000 population. The blue line corresponds to the observed historical data from 2011 until around the end of 2022. During this historical period, there is a noticeable surge in dengue cases, particularly between 2014 and 2017, where several sharp peaks are evident. After 2017, the dengue case counts generally decline and become less erratic, indicating a possible reduction in outbreak severity or improved control measures.

The red line illustrates the ARIMA model's forecast of dengue cases from 2023 to 2027. The forecast displays a repeating seasonal-like pattern with consistent peaks and troughs across each year. This pattern suggests the model has identified an underlying cyclic structure, even though the original dataset was noted to lack true seasonality. Each yearly cycle in the forecast maintains a similar amplitude and frequency, indicating that the model expects relatively stable but recurring outbreaks in the future. The relatively smooth and consistent nature of the forecast contrasts with the volatility of the historical data.

This visualization is useful for public health planning, as it offers a long-term view of potential dengue trends. The use of ARIMA helps to extrapolate future values based on past behavior, although care should be taken when interpreting results from non-seasonal data. Ultimately, this chart combines actual observations and statistical projections to aid in understanding and preparing for future dengue activity.
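The evaluation procedure reported in this section (a chronological 80/20 split followed by MSE, RMSE, MAE, and MAPE) can be sketched as below. The helper names `chronological_split` and `forecast_errors` are hypothetical, and the sample values are made up for illustration; they are not the study's data or code. The last line only checks that the reported RMSE is consistent with the reported MSE.

```python
import numpy as np


def chronological_split(values, train_fraction=0.8):
    """Split a time-ordered sequence without shuffling (hypothetical helper)."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]


def forecast_errors(actual, predicted):
    """Return the four error metrics discussed in this section."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    mse = np.mean(errors ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(errors)),
        # MAPE is undefined when any actual value is zero.
        "MAPE": np.mean(np.abs(errors / actual)) * 100,
    }


# Made-up values for illustration only.
train, test = chronological_split(list(range(10)))
metrics = forecast_errors([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(metrics)

# Consistency check on the reported figures: sqrt(MSE) should match RMSE.
reported_rmse_check = np.sqrt(1749514402.68)  # close to the reported 41827.20
```

Keeping the split chronological matters for time series: shuffling before splitting would leak future observations into the training set.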

V. RESULTS AND DISCUSSIONS

A. FINDINGS/ANALYSIS

1. What are the factors that contribute to the rise of dengue cases in the Philippines?
The rise in dengue cases in the Philippines is influenced by multiple factors including rapid urbanization, poor sanitation, stagnant water accumulation, and climate variability such as increased rainfall and temperature, which
favor mosquito breeding. The ARIMA model reveals a general increase in cases from 2006 to 2017, with several sharp peaks, particularly around 2013 to 2017. These spikes align with known high-transmission periods and may reflect lapses in vector control or heightened virus circulation. The observed seasonality-like behavior in the forecasts implies that dengue outbreaks tend to repeat in cycles, likely tied to environmental and social conditions that persist or worsen over time.

2. What will be the forecast of dengue cases for the years 2025–2030?
- While the ARIMA forecast in the graph only extends to 2027, it projects a stable, recurring pattern of dengue outbreaks with consistent yearly peaks. Assuming this trend continues and no major changes in public health measures or environmental conditions occur, the forecast for 2025–2030 would likely follow a similar cyclic pattern. This means the Philippines can expect moderate but regular surges in dengue cases each year, rather than highly erratic or extreme outbreaks, unless new disruptive factors emerge.

3. How can historical data be useful in forecasting dengue cases in the Philippines?
- Historical data is crucial for forecasting dengue because it captures long-term trends, outbreak frequencies, and any cyclic behaviors that statistical models like ARIMA can detect and extrapolate. In the graphs, training data from as early as 2006 allowed the ARIMA model to recognize the general upward trend and seasonal peaks, leading to forecasts that mirror real-world dynamics. By learning from past outbreaks, forecasting tools can estimate future case counts, guide preventive strategies, and assess the likely timing and severity of future surges.

4. What is the impact of early dengue forecasting on healthcare resource allocation?
- Early forecasting allows health authorities to prepare in advance by strategically allocating medical supplies, hospital beds, and personnel to regions projected to experience outbreaks. With reliable forecasts such as those generated by ARIMA, healthcare systems can avoid being overwhelmed, reduce mortality, and implement preemptive vector control programs. The smoother forecast pattern from 2023 to 2027 offers predictability, which is essential for timely intervention and cost-effective resource planning.

5. What are the best preventive measures to mitigate the dengue cases in the Philippines?
- To reduce dengue cases, the Philippines should prioritize eliminating mosquito breeding sites (like stagnant water), implementing widespread community cleanup campaigns, and maintaining regular fumigation in high-risk areas. Public awareness campaigns and early warning systems based on models like ARIMA can prompt communities to take timely action. Moreover, strengthening disease surveillance, promoting the use of mosquito repellents, and, where appropriate, deploying vaccines or biological control agents can further reduce transmission and outbreak severity.

B. CONCLUSION
The ARIMA model demonstrates a fair ability to forecast monthly dengue cases by capturing general trends over time. With a MAPE of 18.41%, as reported in the performance evaluation, the model's predictions fall within an acceptable error margin for public health forecasting. It effectively captures the rise and fall of cases based on historical data, making it useful for predicting overall patterns in dengue transmission. However, the model struggles to handle extreme surges or sudden drops in cases, indicating its limitation in dealing with unexpected outbreaks or sharp fluctuations.

This limitation highlights the ARIMA model's reliance on past data to predict future trends, and it lacks the ability to incorporate external factors that could influence dengue transmission, such as weather conditions or changes in population movement. By incorporating exogenous variables through an ARIMAX model, the forecasting accuracy could be significantly enhanced. External factors like rainfall, temperature, and human mobility are known to play a crucial role in dengue outbreaks, and their inclusion could make the model more responsive to sudden shifts in case numbers.

Overall, while the ARIMA model is effective for trend analysis and provides a reliable baseline for forecasting, its predictive accuracy could be improved by incorporating additional data sources. For high-stakes decision-making, especially during unpredictable outbreaks, it is essential to refine the model by including factors that account for environmental and socio-economic influences. This would improve the model's ability to predict extreme cases and make it more useful for real-time public health interventions.

C. RECOMMENDATION
To improve the ARIMA model’s forecasting
accuracy for dengue cases, incorporating exogenous
variables like rainfall, temperature, and humidity is
recommended, as these environmental factors
significantly influence transmission. Shifting to an
ARIMAX model would allow the inclusion of such
variables, enhancing the model's ability to respond to
real-world conditions. By incorporating weather data,
the model can better capture sudden fluctuations in
dengue cases caused by changes in environmental
conditions, providing more accurate and timely
predictions.
Hybrid models that combine ARIMA with
machine learning methods such as Random Forest or
LSTM (Long Short-Term Memory) can also improve
forecasting accuracy. These hybrid approaches can
capture both linear trends from ARIMA and nonlinear
patterns from machine learning algorithms, making
the model more capable of handling complex
dynamics in dengue transmission. This combination
can help account for a wider range of influencing
factors, such as socio-demographic variables,
population density, and human mobility, which can
vary significantly by region and time.
Additionally, integrating data on public
health interventions, such as vector control activities
(fumigation and clean-up drives), can help explain
sharp decreases in dengue incidence. By including this
type of information, the model can more accurately
forecast the effects of ongoing efforts to control
dengue outbreaks. Preprocessing techniques, such as
handling missing values and outliers, along with
cross-validation, can further improve the model’s
reliability. A more comprehensive, data-driven
approach that incorporates environmental,
socio-demographic, and public health intervention
data will lead to more accurate and actionable
forecasts for dengue prevention and control.
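
The preprocessing and cross-validation steps mentioned above can be sketched as follows. This is an illustrative outline using only NumPy, not the procedure used in this study. The function names are hypothetical, the seasonal-naive forecaster is only a placeholder for whatever model is being evaluated, and outliers are flagged rather than removed because a spike in dengue data may be a genuine outbreak.

```python
import numpy as np

def fill_missing(values):
    """Linearly interpolate NaN entries in a 1-D series."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(len(values))
    known = ~np.isnan(values)
    return np.interp(idx, idx[known], values[known])

def flag_outliers(values, k=3.0):
    """Return a boolean mask of points more than k robust standard
    deviations (1.4826 * MAD) away from the median."""
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return np.abs(values - median) > k * 1.4826 * mad

def seasonal_naive(train, horizon, period=12):
    """Placeholder forecaster: repeat the last observed season."""
    last_season = train[-period:]
    return np.array([last_season[h % period] for h in range(horizon)])

def rolling_origin_cv(values, forecaster, initial, horizon, step):
    """Time-series cross-validation: train on values[:end], test on the
    next `horizon` points, then move the origin forward by `step`."""
    errors = []
    for end in range(initial, len(values) - horizon + 1, step):
        train, test = values[:end], values[end:end + horizon]
        errors.append(np.mean(np.abs(forecaster(train, horizon) - test)))
    return np.array(errors)

# Toy series: a perfectly periodic 12-month pattern with one missing month.
cases = np.tile(np.arange(12.0), 5)
cases[5] = np.nan

filled = fill_missing(cases)
outlier_mask = flag_outliers(filled)
fold_errors = rolling_origin_cv(filled, seasonal_naive,
                                initial=36, horizon=12, step=6)
print("Flagged points:", int(outlier_mask.sum()))
print("MAE per fold:", fold_errors)
```

Unlike ordinary k-fold cross-validation, each fold here trains only on months that precede its test window, which avoids leaking future information into the model.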

APPENDICES

A. CODE

B. SNIPPET OF THE DATA SET

C. ADF TEST ON LOG TRANSFORM SERIES

D. ADF TEST ON FIRST DIFFERENCING

E. EVALUATION TEST SET

F. FORECASTED MONTHLY DENGUE CASES

G. FORECASTED DATASET IN .CSV FILE

REFERENCES
[1] Centers for Disease Control and Prevention. (2019). Dengue Vaccine.
https://www.cdc.gov/dengue/prevention/dengue-vaccine.html. Accessed 7 Dec 2019.
[2] Agrupis, K. A., Ylade, M., Aldaba, J., Lopez, A. L., & Deen, J. (2019). Trends in dengue research in
the Philippines: A systematic review. PLoS Neglected Tropical Diseases, 13(4), e0007280.
https://doi.org/10.1371/journal.pntd.0007280
[3] Ligue, K. D. B., & Ligue, K. J. B. (2022). Deep Learning Approach to Forecasting Dengue Cases in
Using Long Short-term Memory (LSTM).
https://www.semanticscholar.org/paper/Deep-Learning-Approach-to-Forecasting-Dengue-Cases-Ligue-Ligue/72e37a00d7d1499a50dd032ffd0dae96ef437da9
[4] Zhang, Q., Sun, K., Chinazzi, M., Pastore y Piontti, A., Dean, N. E., Rojas, D. P., Merler, S.,
Mistry, D., Poletti, P., Rossi, L., Bray, M., Halloran, M. E., Longini, I. M., & Vespignani, A. (2022).
Spread of Zika virus in the Americas. Scientific Reports, 12, 5682.
https://pmc.ncbi.nlm.nih.gov/articles/PMC4784057
[5] Rahman, M. S., Faruk, M. O., Alam, S., & Shirin, T. (2022). A systematic review of dengue outbreak
prediction models: Current practices and future directions. PLOS Neglected Tropical Diseases, 16(2),
e0010631. https://doi.org/10.1371/journal.pntd.0010631
[6] Roster, K., & Rodrigues, F. A. (2021). Neural networks for dengue prediction: A systematic review.
arXiv preprint arXiv:2106.12905. https://arxiv.org/abs/2106.12905
[7] Zahiruddin, H., Noor, N. M., & Ahmad, M. H. (2024). A scoping review and bibliometric analysis
(ScoRBA) on dengue infection and machine learning research. Journal of Informatics and Virtual
Education, 8(4), 2249. https://www.joiv.org/index.php/joiv/article/view/2249
[8] Wagner, B. (2023). Using Autoregressive Integrated Moving Average models for time series
analysis. BMJ, 383, p2739. https://doi.org/10.1136/bmj.p2739
[9] Kaur, et al. (2023). Title of the article. Environmental Science and Pollution Research,
30(Issue), Page numbers. https://link.springer.com/article/10.1007/s11356-023-25148-9
[10] Seposo, X., Valenzuela, S., Apostol, G. L. C., Wangkay, K. A., Lao, P. E., & Enriquez, A. B. (2024).
Projecting temperature-related dengue burden in the Philippines under various socioeconomic pathway
scenarios. Frontiers in Public Health, 12. https://doi.org/10.3389/fpubh.2024.1420457


CURRICULUM VITAE

Name: Elaine Casipe

Program of Study: BS in Computer Science

Date of Birth: April 6, 2004

Contact Details: [email protected]

Areas of Interest: Documentation, Programming

Name: Felicity Ginete

Program of Study: BS in Computer Science

Date of Birth: August 20, 2003

Contact Details: [email protected]

Areas of Interest: Documentation

Name: Adrian Piodena

Program of Study: BS in Information Technology

Date of Birth: June 10, 2004

Contact Details: [email protected]

Areas of Interest: Documentation

Bachelor of Science in Computer Science

