Thanks to visit codestin.com
Credit goes to github.com

Skip to content

suhasAB/store-sales-forecasting

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

74 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project Title: Store Sales Prediction based on time series data

Problem Statement:

Predicting the sale of different types of products sold in Favorita stores (Ecuador), by considoring factors like historical time series sales data, promotions, holiday seasons and oil prices.

Professor:

Carlos Rojas

Term:

Fall 2022

Team Number:

10

Team Members:

  • Hardy Leung (ksleung)
  • Loukya Tammineni (LoukyaTammineni)
  • Suhas Anand Balagar (suhasAB)
  • Xichang Yu (Codyyu36)

1.Abstract:

The ability to predict the sales of a variety of stores is highly sought out in supply chain logistics, as it finds applications in increasing customer satisfaction and reducing food waste.
We are proposing use of multiple Supervised Learning methods to predict the sales of stores based on time series dataset of Corporación Favorita, an Ecuador based grocery retailer. Ecuador is a country whose economy is strongly dependent on the oil and fluctuates with the price of oil.
We are planning to use dataset from an ongoing Kaggle competition, “Store Sales - Time Series Forecasting”. The dataset includes multiple csv sheets of time series data. We will try to evaluate the different aspects that might impact the sales in a store like Holiday seasons, Oil prices and historical sales data from a variety of stores.
The preprocessing of data will include checking for missing values and if found, imputing them so that no data is disregarded. We have also checked for frequency distribution of data elements as part of Data exploration. We have calculated and plotted correlation mapping to establish the correlations between different input and output parameters such as holiday events, oil prices, transactions per store type, dates of salary, natural calamities etc.
We plan on using different Supervised Learning models to predict the prices and then evaluate which of those models give out the best results. Supervised learning approach works best in this case, as we have huge time series data of both input and the output parameters mentioned above. Few of the supervised learning options we can consider are

  • Linear Regression
  • Gradient Boost(XGBoost)
  • Random Forest (RF)
  • Support Vector Machine(SVM) /Support Vector Regression (SVR)
  • Long Short-Term Memory (LSTM)
  • Ensemble Regression by combining XGBoost, RF, Ridge and SVR models

Once the models are trained on training data and developed, we can evaluate the performance of these models on the test dataset.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •