[D] Prior work using pixel shift to improve VAE accuracy?
r/MachineLearning
I'm currently trying to train an "f8ch32" VAE (8x compression factor, 32 channels). Its current performance could be rated as "better than SDXL f8ch4, but worse than AuraFlow f8ch16". My biggest challenge is improving reconstruction fidelity. From my searches so far, the publicly known methods for this mostly rely on LPIPS and GAN losses. The trouble is that LPIPS can smooth too much, and GANs start making things up. The latter is fine if all you want is "a sharp end result", but lousy if you care about actual fidelity to the original image.
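For anyone unfamiliar with the "f8chN" shorthand, here is a rough sketch of the arithmetic behind it. It assumes "f8" means each spatial side is downsampled by 8 and "chN" is the number of latent channels, with a 3-channel RGB input. The helper names `latent_shape` and `compression_ratio` are made up for illustration and don't come from any library.

```python
def latent_shape(height, width, factor=8, channels=32):
    """Latent shape (C, H/f, W/f) for an image of size height x width.

    Assumes the encoder downsamples each spatial side by `factor`.
    """
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the downsampling factor")
    return (channels, height // factor, width // factor)


def compression_ratio(factor=8, channels=32, image_channels=3):
    """Input values per latent value: (3 * f * f) / C for an RGB image."""
    return (image_channels * factor * factor) / channels


# Compare the three configurations mentioned above on a 1024x1024 image.
for name, ch in [("f8ch4", 4), ("f8ch16", 16), ("f8ch32", 32)]:
    print(name, latent_shape(1024, 1024, 8, ch), compression_ratio(8, ch))
```

Under those assumptions, f8ch4 stores 1 latent value per 48 input values, f8ch16 stores 1 per 12, and f8ch32 stores 1 per 6, so the latent has progressively more room to carry fine detail.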