[Project] PentaNet: Pushing beyond BitNet with Native Pentanary {-2, -1, 0, 1, 2} Quantization (124M, zero-multiplier inference)

Hey everyone, I've been experimenting with extreme LLM quantization following the BitNet b1.58 paper. While ternary quantization {-1, 0, 1} is great for replacing costly matrix multiplications with simple additions, I wondered if we were leaving too much model capacity on the table by overly restricting the weights.

So, I built and trained PentaNet from scratch - a custom architecture that expands the weight states to pentanary: {-2, -1, 0, +1, +2}. Why ±2? Because multiplying by 2 doesn't require a hardware multiplier! It’s just a left bit-shift (x << 1).
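
To make the shift trick concrete, here's a minimal sketch of a multiplier-free dot product over pentanary weights. This is an illustration, not the actual PentaNet kernel: the name `penta_dot` is hypothetical, and it assumes the activations are already plain integers.

```python
ALLOWED_WEIGHTS = {-2, -1, 0, 1, 2}

def penta_dot(weights, activations):
    """Dot product of pentanary weights with integer activations.

    Uses only add, subtract, and a one-bit left shift (no multiplication).
    Hypothetical helper for illustration; assumes integer activations.
    """
    acc = 0
    for w, x in zip(weights, activations):
        if w not in ALLOWED_WEIGHTS:
            raise ValueError(f"weight {w} is not in {{-2, -1, 0, 1, 2}}")
        if w == 0:
            continue  # a zero weight contributes nothing
        # |w| == 2 becomes a left shift by one bit; |w| == 1 passes x through
        term = x << 1 if abs(w) == 2 else x
        if w > 0:
            acc += term
        else:
            acc -= term
    return acc

# Each weight selects one of: skip, add x, subtract x, add 2x, subtract 2x
result = penta_dot([2, -1, 0, 1, -2], [3, 4, 5, -6, 7])
```

The ternary case is the same loop without the shift branch, so the only extra work per weight is a one-bit shift.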