Toy experiment: frozen Pythia-70M can use a forward-derived fast memory for contextual one-shot symbolic recall [D]

I have been running a small toy research experiment on fast memory layered on top of a frozen open-weight transformer (Pythia-70M). The motivation is simple: ordinary transformer learning requires backprop and weight updates, whereas in-context adaptation feels more like a temporary memory built during the forward pass. I wanted to test whether a frozen model exposes enough geometry that a small external memory can do limited one-shot binding without changing any of the transformer weights.
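To make "one-shot binding in an external memory" concrete, here is a minimal sketch of the general idea, not the actual setup from the experiment. Everything in it is an assumption made for illustration: the `FastMemory` class, the cosine-similarity lookup, and especially `fake_hidden_state`, which uses random vectors as a stand-in for whatever forward-pass features the frozen model would provide. The only point is that a single write stores an association and a later read recovers it, with no gradients and no weight updates.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class FastMemory:
    """Hypothetical external key-value store (illustrative, not the post's code)."""

    def __init__(self):
        self.keys = []
        self.values = []

    def write(self, key, value):
        # One-shot binding: a single write stores the association.
        self.keys.append(list(key))
        self.values.append(value)

    def read(self, query):
        # Return the value whose stored key is most similar to the query.
        scores = [cosine(query, k) for k in self.keys]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.values[best]


def fake_hidden_state(rng, dim=64):
    # Assumption: random stand-in for a feature vector that a frozen
    # model would produce during a forward pass.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
memory = FastMemory()
key_a = fake_hidden_state(rng)
key_b = fake_hidden_state(rng)
memory.write(key_a, "blue")
memory.write(key_b, "green")

# A slightly perturbed version of key_a, as if the same symbol were
# seen again in a different context.
noisy_query = [x + rng.gauss(0.0, 0.1) for x in key_a]
recalled = memory.read(noisy_query)
```

In a real version the keys would come from the frozen model rather than a random generator, and whether those features are separable enough for this kind of lookup to work is exactly the question the experiment is asking.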