AI RESEARCH
Masks Can Talk: Extracting Structured Text Information from Single-Modal Images for Remote Sensing Change Detection
arXiv CS.CV
•
ArXi:2605.07178v1 Announce Type: new Remote sensing change detection is pivotal for urban monitoring, disaster assessment, and environmental resource management. Yet, unimodal deep learning methods frequently confuse genuine semantic changes with visually similar but irrelevant variations. Recent multimodal approaches incorporate text as auxiliary supervision, but their descriptions are either semantically coarse and unstructured or model-generated and thus noisy.