MrMjauh/learn_gpt

Loosely follows https://arxiv.org/pdf/1706.03762, but implements only the decoder part, since GPT is a decoder-only model
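As a rough illustration of what "decoder only" means in practice, here is a minimal sketch of causal (masked) attention weights in plain Python. It is not taken from this repo: the function name `causal_attention_weights` and the toy `scores` matrix are made up for the example, and a real model would do this with tensors rather than lists.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax where position i only sees positions 0..i."""
    weights = []
    for i, row in enumerate(scores):
        # Keep only the current and earlier positions; later ones are masked.
        visible = row[: i + 1]
        # Subtract the max before exponentiating for numerical stability.
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Masked (future) positions get a weight of exactly 0.
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights

# Hypothetical raw attention scores for a sequence of 3 tokens.
scores = [
    [0.0, 1.0, 2.0],
    [0.0, 0.0, 5.0],
    [1.0, 1.0, 1.0],
]
weights = causal_attention_weights(scores)
print(weights[0])  # [1.0, 0.0, 0.0]
print(weights[1])  # [0.5, 0.5, 0.0]
```

The first token can only attend to itself, and the large score of 5.0 in the second row is ignored because it belongs to a future position.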

  • Video to follow along with, covering all the steps and most questions you may have: https://www.youtube.com/watch?v=kCc8FmEb1nY
  • Uses a simple character encoder
  • For learning purposes only, to get more insight into GPT models
  • Layer norms and residual connections are done differently in GPT than in the paper
  • Uses masking in each attention head; open question: should this not be needed?
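A simple character encoder, as mentioned in the list above, can be sketched roughly like this. The helper names (`build_vocab`, `encode`, `decode`) and the sample text are made up for illustration and may not match the code in this repo.

```python
def build_vocab(text):
    """Map every distinct character in the text to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(s, stoi):
    """Turn a string into a list of integer ids, one per character."""
    return [stoi[c] for c in s]

def decode(ids, itos):
    """Turn a list of integer ids back into a string."""
    return "".join(itos[i] for i in ids)

# Hypothetical training text; the vocabulary is just its distinct characters.
text = "hello world"
stoi, itos = build_vocab(text)

ids = encode("hello", stoi)
print(ids)                # [3, 2, 4, 4, 5]
print(decode(ids, itos))  # hello
```

The vocabulary stays tiny, which keeps the model small, but each token carries little information, so sequences get long. Characters that never appear in the training text raise a `KeyError` in this sketch.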

When running on an old gaming laptop

See training for the results

About

A simple GPT model for learning about GPT in more depth
