This project is an excuse for me to play with some tools, namely database management (setting up pipelines, SQL, Postgres), and to experiment with some optimization techniques. The first part of the project is the simulation, in which guests walk randomly through a fictional amusement park. During the simulation, transactions are recorded in a Postgres database. I then aggregate this data by day and attraction and compute quantities such as total revenue by day and attraction, average duration in the park, and standard paths taken by guests. Finally, I feed that information into an online Bayesian optimization agent, which tries to maximize total revenue by tweaking the prices of the attractions in the park.
The park is represented as a collection of attractions, each having a fixed physical location
To optimize the prices of the park and maximize revenue, we need one more ingredient. Note that by sending the prices
Every time a guest visits an attraction, the transaction is recorded in a Postgres database. Each record includes the transaction amount, the attraction identity, the guest identity, and the transaction time. The database automatically aggregates these records into summary statistics, recording the total revenue of each attraction for each day. These daily totals are then fed into the optimization agent, which attempts to maximize the total revenue of the park by adjusting the attraction prices. The agent is only allowed to change prices in between days.
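As a rough illustration of the recording and aggregation step, here is a minimal sketch. It is not the project's actual schema: the table, view, and column names are hypothetical, and it uses Python's built-in sqlite3 with an in-memory database in place of Postgres so that it runs without a server. A view is just one way to get aggregation that stays up to date automatically; triggers or materialized views would be alternatives in Postgres.

```python
import sqlite3

# In-memory SQLite stands in for Postgres so the sketch runs without a server.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per attraction visit.
conn.executescript("""
CREATE TABLE transactions (
    guest_id      INTEGER NOT NULL,
    attraction_id INTEGER NOT NULL,
    amount        REAL    NOT NULL,
    ts            TEXT    NOT NULL   -- ISO-8601 timestamp
);

-- The view recomputes per-day, per-attraction totals whenever it is queried.
CREATE VIEW daily_revenue AS
SELECT date(ts)    AS day,
       attraction_id,
       SUM(amount) AS revenue,
       COUNT(*)    AS visits
FROM transactions
GROUP BY date(ts), attraction_id;
""")

# A few made-up transactions: (guest, attraction, amount, time).
rows = [
    (1, 10, 5.0, "2024-06-01 10:15:00"),
    (2, 10, 5.0, "2024-06-01 11:40:00"),
    (1, 11, 8.5, "2024-06-01 12:05:00"),
    (3, 10, 6.0, "2024-06-02 09:30:00"),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

# One row per (day, attraction); this is the shape an optimizer could consume.
daily = conn.execute(
    "SELECT day, attraction_id, revenue, visits FROM daily_revenue "
    "ORDER BY day, attraction_id"
).fetchall()
```

Summing the revenue column of `daily` over attractions for a given day would give the single per-day number that the optimization agent tries to maximize.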
For the optimization agent, I decided to go with a Bayesian optimizer that uses a random forest regressor as its surrogate model. I used scikit-optimize (skopt) as the framework for these algorithms. I chose a Bayesian optimizer because the samples in this system are quite sparse: I have access to one sample per day, simulating a day is relatively computationally expensive, and in real life a day cannot be replicated at all. Additionally, the output is quite noisy due to the stochastic nature of the simulation. This means that the gradients of the fitness function (total revenue) cannot be reliably estimated by finite differences. Bayesian optimization seemed like a strong way to handle these constraints, although I'm certain there are better approaches. For more specific analysis of the data, see analsysis.ipynb.
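To make the finite-difference point concrete, here is a small toy sketch. It does not use the real simulation: the concave revenue curve, the noise level, and the function names are all assumptions chosen for illustration, with a single price standing in for the full price vector.

```python
import random
import statistics

def noisy_revenue(price, rng, noise_std=5.0):
    """Toy stand-in for one simulated day: concave revenue curve plus Gaussian noise."""
    return 100.0 - (price - 10.0) ** 2 + rng.gauss(0.0, noise_std)

def central_difference(f, x, h):
    """Two-sided finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

rng = random.Random(0)
price, h = 8.0, 0.1

# Without noise the estimate recovers the true slope, -2 * (8 - 10) = 4.
clean = central_difference(lambda p: noisy_revenue(p, rng, noise_std=0.0), price, h)

# With noise, every estimate costs two simulated days and is dominated by the noise term.
estimates = [
    central_difference(lambda p: noisy_revenue(p, rng), price, h)
    for _ in range(200)
]
spread = statistics.stdev(estimates)  # scatter of the gradient estimates
wrong_sign = sum(e < 0 for e in estimates) / len(estimates)  # fraction pointing the wrong way
```

In this toy setup the noise in each estimate has standard deviation noise_std / (sqrt(2) * h), roughly 35, against a true slope of 4, so a large share of the estimates even have the wrong sign. Increasing h reduces the noise but makes the estimate coarser, and averaging it away would take many simulated days per step, which is exactly the budget that is not available here.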
Some TODOs: 1.) I would like to try to compare to other optimization agents; perhaps reinforcement learning? 2.) I would like to increase the complexity of the simulation, so that guests have preferred types of rides, the seasons have a greater impact on guest behavior, and more. These could produce interesting behaviors for a reinforcement learning agent to learn. 3.) I would like to observe slightly stronger