Based on Data Balancing and Model Improvement for Multi-Label Sentiment Classification Performance Enhancement

arXiv CS.CL

arXiv:2511.14073v3

Multi-label sentiment classification plays a vital role in natural language processing by detecting multiple emotions within a single text. However, existing datasets such as GoEmotions often suffer from severe class imbalance, which hampers model performance, especially on underrepresented emotions. To address this, we constructed a balanced multi-label sentiment dataset from three sources: the original GoEmotions data, samples from Sentiment140 labeled with emotions by a RoBERTa-base-GoEmotions model, and manually annotated texts generated by GPT-4 mini.
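The balancing idea can be illustrated with a small sketch. It is not the authors' pipeline: the abstract does not say how samples were selected, so the greedy top-up rule, the `target` threshold, the helper names (`label_counts`, `imbalance_ratio`, `top_up`), and the toy sentences below are all assumptions made for illustration. The sketch measures imbalance as the ratio between the most and least frequent label, then adds supplementary samples only while one of their labels is still below the target count.

```python
from collections import Counter


def label_counts(samples):
    """Count how often each label occurs across (text, labels) pairs."""
    counts = Counter()
    for _text, labels in samples:
        counts.update(labels)
    return counts


def imbalance_ratio(counts):
    """Most frequent label count divided by the least frequent one."""
    return max(counts.values()) / min(counts.values())


def top_up(base, pool, target):
    """Greedily add pool samples that carry a label still below `target`.

    Hypothetical selection rule, chosen only to keep the example short.
    """
    merged = list(base)
    counts = label_counts(merged)
    for text, labels in pool:
        if any(counts[label] < target for label in labels):
            merged.append((text, labels))
            counts.update(labels)
    return merged


# Toy stand-in for an imbalanced base dataset (made-up sentences).
base = [
    ("thanks so much!", ["gratitude"]),
    ("thank you, and sorry for your loss", ["gratitude", "grief"]),
    ("appreciate it", ["gratitude"]),
    ("many thanks", ["gratitude"]),
]

# Toy stand-in for supplementary samples (pseudo-labeled or generated).
pool = [
    ("I miss her every day", ["grief"]),
    ("thanks again", ["gratitude"]),
    ("heartbroken", ["grief"]),
    ("still grieving", ["grief"]),
]

balanced = top_up(base, pool, target=4)

print(imbalance_ratio(label_counts(base)))      # 4.0
print(imbalance_ratio(label_counts(balanced)))  # 1.0
```

In this toy run the extra "gratitude" sample is skipped because that label already meets the target, while the three "grief" samples are added. With real multi-label data a sample can raise several label counts at once, so a greedy rule like this one may overshoot the target for frequent labels.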