GeoFlowVLM: Geometry-Aware Joint Uncertainty for Frozen Vision-Language Embedding

arXiv:2605.13352v1 Announce Type: new Standard dual-encoder vision-language models that map images and text to deterministic points on a shared unit hypersphere through $\ell_2$ normalization typically expose neither \emph{aleatoric} uncertainty (cross-modal ambiguity) nor \emph{epistemic} uncertainty (lack of