Based on Data Balancing and Model Improvement for Multi-Label Sentiment Classification Performance Enhancement
arXiv CS.CL
arXiv:2511.14073v3 Announce Type: replace
Multi-label sentiment classification, which detects multiple emotions within a single text, is an important task in natural language processing. However, existing datasets such as GoEmotions often suffer from severe class imbalance, which hampers model performance, especially on underrepresented emotions. To address this, we constructed a balanced multi-label sentiment dataset by combining three sources: the original GoEmotions data, Sentiment140 samples labeled with emotions by a RoBERTa-base-GoEmotions model, and manually annotated texts generated by GPT-4 mini.
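To make the class-imbalance problem concrete, the following is a minimal sketch of how one might measure label imbalance in a multi-label dataset before and after adding samples from extra sources. It is not the authors' pipeline: the function names `label_counts` and `imbalance_ratio`, the max/min ratio as the imbalance measure, and the toy samples are all assumptions made for illustration.

```python
from collections import Counter


def label_counts(samples):
    """Count how many samples carry each label (a sample may carry several)."""
    counts = Counter()
    for _text, labels in samples:
        counts.update(set(labels))  # set() guards against duplicate labels
    return counts


def imbalance_ratio(counts):
    """Ratio of the most frequent label count to the least frequent one."""
    return max(counts.values()) / min(counts.values())


# Toy data, invented for illustration only (not taken from GoEmotions).
original = [
    ("thanks a lot", ["gratitude"]),
    ("thank you, love it", ["gratitude", "love"]),
    ("much appreciated", ["gratitude"]),
    ("thanks!", ["gratitude"]),
]

# Hypothetical extra samples targeting underrepresented emotions.
extra = [
    ("I miss her so much", ["grief", "love"]),
    ("we lost him today", ["grief"]),
]

before = label_counts(original)
after = label_counts(original + extra)

print("before:", dict(before), "ratio:", imbalance_ratio(before))
print("after: ", dict(after), "ratio:", imbalance_ratio(after))
```

In this toy case the added samples lower the max/min ratio, which is the effect a balanced dataset aims for. A real multi-label setup would likely need a finer measure, since adding a sample for a rare label can also raise the count of a frequent label that co-occurs with it.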