Fine-Tuning TranslateGemma-4B to improve bi-directional English & Welsh translations on an H200 GPU!

r/LocalLLaMA

The repo is open source. Running 5% of the fine-tuning took 40 minutes and cost a couple of dollars, which was enough to prove the process works.

Looking forward to Flash Attention v4 leaving beta so I can test fine-tuning performance on a B200 in the cloud. That's probably a few months away, it seems?

What languages would you train TranslateGemma to translate? I was originally thinking about Klingon, but the available datasets seemed a bit lacking.

submitted by /u/ufos1111
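
For a rough sense of what a full run might take, the "5% in 40 minutes" figure can be extrapolated. This is only a back-of-the-envelope sketch: it assumes training time scales linearly with the fraction completed and ignores model loading, evaluation, and checkpointing overhead. The function name `estimate_full_run_minutes` is made up for illustration and is not from the repo.

```python
def estimate_full_run_minutes(partial_minutes: float, percent_done: float) -> float:
    """Linearly extrapolate total wall-clock minutes from a partial run.

    Assumption: throughput stays constant for the rest of the run.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return partial_minutes * 100 / percent_done


# Figures from the post: 5% of the fine-tuning took 40 minutes.
total_minutes = estimate_full_run_minutes(40, 5)
total_hours = total_minutes / 60
print(f"Estimated full run: {total_minutes:.0f} min (~{total_hours:.1f} h)")
# prints: Estimated full run: 800 min (~13.3 h)
```

Cost is left out on purpose, since "a couple of dollars" is too loose to extrapolate from; multiplying the estimated hours by your provider's hourly H200 rate would give a comparable ballpark.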