In this experiment, we serve gpt-oss-20b with vLLM on an RTX 4090 and let it fly the ship in-game. We use prompt prefilling and some shuffling strategies to present the AI with a list of control options, from which it picks one by emitting a single token. Because each decision costs only one output token, the ship can be controlled rapidly and at low latency.
https://www.youtube.com/watch?v=Yo7GWnGtpoc
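
The single-token selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the video: the action names, the prompt layout, the `A`/`B`/`C` labels, and the helper names are all hypothetical. It also assumes each label is one token in the model's tokenizer, and that shuffling is done per request to reduce any bias toward a fixed option position. The request body targets the OpenAI-compatible `/v1/completions` endpoint that vLLM exposes, but nothing is sent here.

```python
import random
import string

# Hypothetical control options; the source does not list the actual ship controls.
ACTIONS = ["thrust", "turn_left", "turn_right", "fire", "idle"]


def build_prompt(state_text, actions, rng):
    """Shuffle the actions, label them A, B, C, ... and return (prompt, label -> action)."""
    shuffled = list(actions)
    rng.shuffle(shuffled)  # fresh order on every call (assumed purpose: counter position bias)
    labels = string.ascii_uppercase[: len(shuffled)]
    mapping = dict(zip(labels, shuffled))
    lines = [f"{label}) {action}" for label, action in mapping.items()]
    prompt = (
        f"Game state: {state_text}\n"
        "Options:\n" + "\n".join(lines) + "\n"
        "Answer:"  # prefilled stub, so the next generated token should be the label
    )
    return prompt, mapping


def completion_payload(prompt, model="openai/gpt-oss-20b"):
    """Build a request body for an OpenAI-compatible completions endpoint (assumed setup)."""
    # max_tokens=1 caps generation at one token, which keeps each decision fast.
    return {"model": model, "prompt": prompt, "max_tokens": 1, "temperature": 0.0}


def decode_choice(token_text, mapping, default="idle"):
    """Map the single output token back to an action; fall back if it is not a known label."""
    return mapping.get(token_text.strip().upper(), default)


if __name__ == "__main__":
    rng = random.Random()
    prompt, mapping = build_prompt("asteroid ahead, enemy to the left", ACTIONS, rng)
    print(prompt)
    print(completion_payload(prompt))
    # Pretend the server answered with the token " B".
    print(decode_choice(" B", mapping))
```

In a real control loop, the payload would be posted to the server on every tick and the returned token passed to `decode_choice`; the fallback action covers the case where the model emits something other than a listed label.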