My first venture into finetuning: a Qwen 3.5 4B heretic model

How do Qwen 3.5 4B heretic variants hold up as a base for RP finetunes? I have been struggling to get a decent instruct-based model out of it, so any tips toward that goal would be really helpful. I have built synthetic datasets from frontier APIs that cover a range of RP flavours. My intuition is that Qwen 3.5 4B is not a good fit for RP, but the model is attractive because it can run on most GPU-poor systems. Is it a mistake to finetune it into an instruct variant? Either the model is not suited to RP or I am doing something wrong.

submitted by /u/Nubinu