Part of the collection "Bilinear Transformers (TinyStories)": a small collection of Transformers with bilinear MLPs, trained on the TinyStories dataset.
This repo contains a set of sparse autoencoders (SAEs) trained on our medium-sized TinyStories model. It uses a simple implementation of Top-K SAEs.
Note that this is very much a dump and is not recommended for general use. Most SAEs were trained on an as-needed basis, so their parameters vary from one SAE to the next. All SAEs were trained until convergence, measured by both mean squared error (MSE) and cross-entropy (CE) error.
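
For readers unfamiliar with Top-K SAEs, here is a minimal NumPy sketch of the forward pass. It is not the implementation used in this repo. The function name `topk_sae_forward`, the parameter names, the subtraction of the decoder bias before encoding, and the ReLU applied after the top-k selection are all assumptions borrowed from common Top-K SAE write-ups, and the weights below are random rather than loaded from a checkpoint.

```python
import numpy as np


def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Hypothetical Top-K SAE forward pass.

    x:     (batch, d_model) activations to reconstruct
    w_enc: (d_model, d_sae) encoder weights
    w_dec: (d_sae, d_model) decoder weights
    Returns (reconstruction, sparse latents).
    """
    # Encode; subtracting the decoder bias first is an assumed convention.
    pre = (x - b_dec) @ w_enc + b_enc  # (batch, d_sae)

    # Indices of the k largest pre-activations in each row.
    idx = np.argpartition(pre, -k, axis=-1)[:, -k:]

    # Keep only those k values and zero out everything else.
    latents = np.zeros_like(pre)
    np.put_along_axis(latents, idx, np.take_along_axis(pre, idx, axis=-1), axis=-1)

    # Assumed: clamp negatives so every active latent is non-negative.
    latents = np.maximum(latents, 0.0)

    # Decode back to the model's activation space.
    recon = latents @ w_dec + b_dec
    return recon, latents


# Toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
d_model, d_sae, k = 8, 32, 4
x = rng.normal(size=(5, d_model))
w_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
w_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

recon, latents = topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k)
mse = float(np.mean((recon - x) ** 2))  # reconstruction error on the toy batch
```

In this sketch each row of `latents` has at most `k` nonzero entries, which is what makes the code sparse without an explicit L1 penalty. The MSE computed at the end corresponds to the reconstruction metric mentioned above. Measuring the CE error would additionally require running the language model with `recon` substituted for `x`, which this sketch does not do.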