I trained a 90M parameter embedding model from scratch

r/LocalLLaMA

I trained a 90M-parameter encoder-only (embedding) model from scratch. I mostly trained it on Google Colab with a Colab Pro+ subscription. This was about the fifth run, as earlier runs had issues with exploding gradients. It was a fun project, but the model is not yet near SOTA quality. I also managed to run inference on it with AutoModel. It uses the e5-base-v2 tokeniser. I evaluated it on the STS benchmark: Spearman correlation 0.5453. If anyone would like to try the model, the Hugging Face page of the model is -

submitted by /u/ConfectionAfter2366
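
For anyone curious how an STS number like the one above is usually produced, here is a rough sketch of one common recipe: embed both sentences of each pair, take the cosine similarity, and report the Spearman rank correlation against the human scores. The post doesn't say which pooling or similarity function was used, so cosine similarity is an assumption here. `embed` is a hypothetical placeholder for whatever function turns a sentence into a vector (for example, a pooled AutoModel output), and `sts_spearman` is just an illustrative name.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def sts_spearman(embed, pairs, gold_scores):
    """Score each sentence pair by cosine similarity, then rank-correlate
    the predictions with the gold similarity scores.

    `embed` is a hypothetical callable: sentence -> list of floats.
    """
    predicted = [cosine(embed(a), embed(b)) for a, b in pairs]
    return spearman(predicted, gold_scores)
```

Since Spearman only looks at rank order, the raw cosine values don't need to be rescaled to the 0 to 5 range of the STS labels. In practice you would probably just call `scipy.stats.spearmanr`; the hand-rolled version here only keeps the sketch dependency-free.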