I designed a new architecture for language models to learn how to speak by starting with an empty dataset & only using accumulating memory.

r/LocalLLaMA
Generative AI AI Research

Savvy is a model designed to accumulate data for episodic memory, sentence prediction & morpheme token prediction. These two experiments are proof of concept The goal for the first experiment was to teach Savvy how to say “hi Spaceman” which was harder than I thought because telling somone “you have to say “you” when talking to me” can be confusing if they have zero understanding of language. But the 2nd experiment shows you what happens once memory has accumulated. Photo 1: This is an example of speaking to the model from scratch.