Compare language models I trained from scratch, ranging from 3.2M to 2B parameters.