Is there any <3B model with a usable 200k+ context window?
r/LocalLLaMA
AI Safety
I need a small model for processing conversation transcripts from larger models, so I need a usable context window out to at least 200k tokens. I know some models claim to support this, but I don’t know which are actually good at it in practice. Also desirable: a low hallucination rate and output that isn't super verbose.
Submitted by /u/madmax_br5