[P] XGBoost + TF-IDF for emotion prediction — good state accuracy but struggling with intensity (need advice)

r/MachineLearning

Hey everyone,

I’m working on a small ML project (~1200 samples) where I’m trying to predict:

- Emotional state (classification - 6 classes)
- Intensity (1-5) of that emotion

The dataset contains:

- journal_text (short, noisy reflections)
- metadata like:
  - stress_level
  - energy_level
  - sleep_hours
  - time_of_day
  - previous_day_mood
  - ambience_type
  - face_emotion_hint
  - duration_min
  - reflection_quality

🔧 What I’ve done so far

1. Text processing

Using TF-IDF:

- max_features = 500 → tried 1000+ as well
- ngram_range = (1,2)
- stop_words = 'english'
- min_df = 2

Resulting shape: ~1200 samples × 500-1500 features

2.