Linear Layer
Transformer architecture and training tricks…
Julian Lehrer
Feb 22, 2023
A (hopefully) simple guide, explained in torch