GraphSculptor: Sculpting Pre-training Coreset for Graph Self-supervised Learning

arXiv:2605.01310v1 Announce Type: new Graph self-supervised learning typically relies on large-scale unlabeled datasets, heavily inflating computational costs. However, empirical evidence suggests that these datasets contain substantial redundancy: our analysis reveals that uniformly subsampling 50% of graphs retains over 96% of downstream performance. To exploit this redundancy, we