nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs
Impact Summary
Problem Solved
Strips away the boilerplate and complexity of large LLM frameworks such as Megatron-LM, giving researchers a minimal codebase for experiments.
Improves On
HuggingFace Accelerate / minGPT
Abstract Snapshot
A complete rewrite of minGPT, designed to be simple, hackable, and fast for academic and educational use. Capable of reproducing GPT-2 (124M) on OpenWebText in a few days on a single multi-GPU node.
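To give a flavor of the repository's "simple and hackable" design, here is a minimal sketch in the spirit of nanoGPT's character-level data preparation (its `data/shakespeare_char/prepare.py` builds a vocabulary of unique characters and integer encode/decode maps); the names and corpus below are illustrative, not the repository's actual code.

```python
# Illustrative sketch of character-level tokenization, nanoGPT-style.
# The corpus and variable names are assumptions for demonstration.
text = "hello world"

# Vocabulary: every unique character in the corpus, sorted for determinism.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for i, ch in enumerate(chars)}  # integer id -> character

def encode(s):
    """Map a string to a list of integer token ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Map a list of token ids back to a string."""
    return "".join(itos[i] for i in ids)

ids = encode(text)
assert decode(ids) == text  # round-trip is lossless
```

The encoded id list is what a training script would pack into fixed-length blocks and feed to the model; keeping this step as a few lines of plain Python is part of what makes the codebase easy to modify.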
LLM · PyTorch · GPT · Education
1 Reproduction

Yann LeCun: verified reproduction
1 Citation
OpenAI Research cited this contextually in related peer review.
