AI RESEARCH
Data Leakage Detection and De-duplication in Large Scale Geospatial Image Datasets
arXiv cs.CV
arXiv:2304.02296v2
In our study, we conducted a comprehensive analysis of three widely used datasets in the domain of building footprint extraction using deep neural networks: the INRIA Aerial Image Labelling dataset, SpaceNet 2: Building Detection v2, and the AICrowd Mapping Challenge datasets. Our experiments revealed several issues in the AICrowd Mapping Challenge dataset, where nearly 90% (about 250k) of the