Why use the 8-bit floating numbers to compute the original cost?

Hi,
The work is amazing.
When I looked through the code, I foud that you employed the 8-bit floating numbers to compute the original cost and store 
it as a lookup table. I wondered why not use the 32-bit floating(not use the flag "--half" in the pretraining process) or use the 16-bit floating (use the flag "--half" in the pretraining process)? Could you please clarify that? 
Thanks a lot!