"Second Thoughts" Been playing with adding a small transformer that reads output near the end of generation, and feeds it back near the top as a refinement loop. A quick test of 1.7B model showed drastic improvement in focused tasks (like coding)
r/LocalLLaMA
NLP
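For anyone trying to picture the loop from the title, here is a minimal sketch of the control flow only, in plain Python with no ML libraries. Everything in it is an assumption for illustration, not the actual setup: the "layers" are toy residual tanh blocks rather than real transformer blocks, the feedback module is a single small tanh map standing in for the small transformer, and the names `make_layer`, `make_feedback`, `forward_with_feedback`, `tap_at`, and `inject_at` are hypothetical. It also assumes "near the end" means a late layer and "near the top" means an early layer.

```python
import math
import random


def make_layer(dim, rng):
    """Toy stand-in for one transformer block: x + tanh(W x)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

    def layer(x):
        return [
            xi + math.tanh(sum(wij * xj for wij, xj in zip(row, x)))
            for xi, row in zip(x, w)
        ]

    return layer


def make_feedback(dim, rng, scale=0.1):
    """Toy stand-in for the small feedback module: scale * tanh(W h)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]

    def feedback(h):
        return [
            scale * math.tanh(sum(wij * hj for wij, hj in zip(row, h)))
            for row in w
        ]

    return feedback


def forward_with_feedback(layers, feedback, x, inject_at, tap_at, passes):
    """Run the stack `passes` times on the same input.

    After each pass, the activation tapped after layer `tap_at` goes through
    `feedback`, and the result is added to the input of layer `inject_at`
    on the next pass. The first pass uses a zero correction, so passes=1 is
    just the plain forward pass.
    """
    correction = [0.0] * len(x)
    out = x
    for _ in range(passes):
        h = x
        tapped = h
        for i, layer in enumerate(layers):
            if i == inject_at:
                # Inject the previous pass's "second thought" near the top.
                h = [a + b for a, b in zip(h, correction)]
            h = layer(h)
            if i == tap_at:
                # Read the activation near the end of the stack.
                tapped = h
        correction = feedback(tapped)
        out = h
    return out


rng = random.Random(0)
layers = [make_layer(4, rng) for _ in range(6)]
feedback = make_feedback(4, rng)
x = [0.1, -0.2, 0.3, 0.4]
for passes in (1, 2, 3):
    result = forward_with_feedback(
        layers, feedback, x, inject_at=1, tap_at=4, passes=passes
    )
    print(passes, result)
```

In a real model the tap and injection would more likely be done with hooks on the hidden states, and the feedback module would be trained while the base model stays frozen or lightly tuned; the sketch leaves all of that out and only shows where the loop closes.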
A 1.7B model can actually turn out some code, so I'm running the