[R] PCA rotation makes non-Matryoshka embeddings truncatable — 27x compression at 99% recall with reranking
r/LocalLLaMA
Most embedding models (BGE-M3, E5, ada-002, Cohere) weren't trained with Matryoshka losses, so you can't just drop trailing dimensions. We tried: naively truncating BGE-M3 from 1024 to 256 dims gives 0.467 cosine similarity. Unusable.

The fix is embarrassingly simple. Fit PCA on a sample of your embeddings (~5K vectors is enough), then rotate all vectors into the principal component basis before truncating. The principal components are ordered by eigenvalue (variance explained), so truncation now discards the lowest-variance directions instead of arbitrary ones.

Result: PCA truncation to 256 dims gives 0.974 cosine similarity.
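For anyone who wants to try it, here is a minimal NumPy sketch of the fit, rotate, truncate steps described above. It is an illustration, not the exact code behind the numbers in this post. The function names `fit_pca_rotation` and `pca_truncate` are made up for this example, the random matrix stands in for real embeddings, and both the mean-centering (standard for PCA) and the re-normalization after truncation are assumptions.

```python
import numpy as np

def fit_pca_rotation(sample):
    """Fit PCA on a sample of embeddings with shape (n, d).

    Returns the sample mean and a (d, d) matrix whose columns are the
    principal axes, sorted by descending explained variance.
    """
    mean = sample.mean(axis=0)
    # SVD of the centered sample: the rows of vt are the principal axes,
    # already ordered from largest to smallest singular value.
    _, _, vt = np.linalg.svd(sample - mean, full_matrices=False)
    return mean, vt.T

def pca_truncate(vectors, mean, rotation, k):
    """Rotate vectors into the PCA basis and keep the first k dimensions."""
    reduced = (vectors - mean) @ rotation[:, :k]
    # Assumption: re-normalize so a dot product equals cosine similarity.
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    return reduced / np.maximum(norms, 1e-12)

# Stand-in data: replace with your real embedding matrix.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((6000, 64))

# Fit on a ~5K sample, then apply the same rotation to every vector.
sample = embeddings[rng.choice(len(embeddings), size=5000, replace=False)]
mean, rotation = fit_pca_rotation(sample)
compact = pca_truncate(embeddings, mean, rotation, k=16)
print(compact.shape)  # (6000, 16)
```

Queries need the same `mean` and `rotation` applied at search time, so store both alongside the index. On random data like this the rotation buys nothing, since the variance is spread evenly; the gain only shows up on real embeddings.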