Is Min P sampling really the preferred modern alternative to Top K/Top P?
r/LocalLLaMA
Submitted by /u/bgravato

According to what I've been reading (and according to every model I've asked about this), the consensus seems to be that Min P is the better, more modern approach to sampling and that it should be preferred over Top P/Top K, which should be used only if Min P isn't available or for legacy reasons. Yet, looking at recently published LLMs on Hugging Face and elsewhere, the recommended sampling parameters are still largely Top K and/or Top P. Is this only for legacy reasons, or is there some other reason?
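For anyone comparing the three, here is a minimal sketch of how each filter is commonly described, written in plain Python over a list of probabilities. The function names (`top_k_filter`, `top_p_filter`, `min_p_filter`, `kept`) and the two example distributions are made up for illustration; real inference engines work on logits and may differ in details such as tie handling and the order in which temperature and the filters are applied.

```python
def _renormalize(probs, keep):
    """Zero out every token not in `keep` and rescale the rest to sum to 1."""
    keep = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep the k most probable tokens, regardless of how the mass is spread."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, total = [], 0.0
    for i in order:
        keep.append(i)
        total += probs[i]
        if total >= p:
            break
    return _renormalize(probs, keep)


def min_p_filter(probs, min_p):
    """Keep tokens with probability >= min_p times the top token's probability."""
    threshold = min_p * max(probs)
    keep = [i for i, q in enumerate(probs) if q >= threshold]
    return _renormalize(probs, keep)


def kept(probs):
    """Count the tokens that survived a filter."""
    return sum(1 for q in probs if q > 0)


# Illustrative distributions (assumed values, not taken from any model).
confident = [0.90, 0.05, 0.03, 0.02]      # one clear winner
flat = [0.30, 0.25, 0.20, 0.15, 0.10]     # many plausible tokens

for name, dist in [("confident", confident), ("flat", flat)]:
    print(name,
          "top_k=3 keeps", kept(top_k_filter(dist, 3)),
          "| top_p=0.8 keeps", kept(top_p_filter(dist, 0.8)),
          "| min_p=0.1 keeps", kept(min_p_filter(dist, 0.1)))
```

With these toy numbers, Top K keeps three tokens in both cases, while Min P keeps one token for the confident distribution and all five for the flat one, because its cutoff scales with the top token's probability. Top P also adapts here (one token versus four), so the sketch shows how the cutoffs are computed rather than settling which sampler is better.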