license: apache-2.0
pipeline_tag: image-segmentation
tags:
- building-extraction
- remote-sensing
UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction
This repository contains the official implementation of UAGLNet, a model for building extraction from remote sensing images, as presented in the paper "UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction".
Building extraction from remote sensing images is challenging because building structures vary in complex ways. UAGLNet, an Uncertainty-Aggregated Global-Local Fusion Network, exploits high-quality global-local visual semantics under the guidance of uncertainty modeling. Specifically, it features a novel cooperative encoder with hybrid CNN and transformer layers, an intermediate cooperative interaction block (CIB) that narrows feature gaps, and a Global-Local Fusion (GLF) module. In addition, an Uncertainty-Aggregated Decoder (UAD) explicitly estimates pixel-wise uncertainty to mitigate segmentation ambiguity in uncertain regions.
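To make the notion of pixel-wise uncertainty concrete, the sketch below scores each pixel of a predicted building-probability map with its binary entropy, a common proxy for how ambiguous a prediction is. This is an illustrative assumption and not the UAD formulation from the paper; the names `binary_entropy` and `uncertainty_map` are hypothetical and do not come from the repository.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in bits) of a Bernoulli prediction with building probability p."""
    # Clamp away from 0 and 1 so that log2 is always defined.
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def uncertainty_map(prob_map):
    """Apply binary entropy to every pixel of a 2D list of probabilities."""
    return [[binary_entropy(p) for p in row] for row in prob_map]


if __name__ == "__main__":
    # Toy 2x2 probability map: confident pixels score low, ambiguous ones high.
    probs = [[0.99, 0.50], [0.10, 0.70]]
    for row in uncertainty_map(probs):
        print(["%.3f" % u for u in row])
```

Entropy peaks at a probability of 0.5 and falls toward zero as the prediction becomes confident, which is why it is often used to flag ambiguous regions such as building boundaries.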
Paper
- ArXiv: 2512.12941
- Hugging Face Papers: 2512.12941
Code
- GitHub Repository: Dstate/UAGLNet
- Hugging Face Collection: ldxxx/uaglnet
Quick Start
Installation
Clone this repository and create the environment.
```bash
git clone https://github.com/Dstate/UAGLNet.git
cd UAGLNet
conda create -n uaglnet python=3.8 -y
conda activate uaglnet
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
```
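After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch that uses only the standard library; it assumes the packages register their metadata under the distribution names `torch`, `torchvision`, and `torchaudio`, and the helper names are hypothetical rather than part of the repository.

```python
import sys
from importlib import metadata

# Versions pinned by the conda install command in this section.
EXPECTED_PACKAGES = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "torchaudio": "0.10.0",
}
EXPECTED_PYTHON = (3, 8)


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED_PACKAGES):
    """Map each package name to (wanted, found, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Drop local build tags such as "+cu113" before comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


def python_matches(expected=EXPECTED_PYTHON):
    """True if the running interpreter has the expected major.minor version."""
    return sys.version_info[:2] == expected


if __name__ == "__main__":
    print("python %d.%d ok:" % EXPECTED_PYTHON, python_matches())
    for name, (wanted, found, ok) in check_environment().items():
        print(f"{name}: wanted {wanted}, found {found}, ok: {ok}")
```

A mismatch does not necessarily mean the code will fail, but it is a useful first thing to check when results differ from those reported.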
Data Preprocessing
We conduct experiments on the Inria, WHU, and Massachusetts datasets. Detailed guidance for dataset preprocessing is provided here: DATA_PREPARATION.md.
Training & Testing
Training and testing examples on the Inria dataset:
```bash
# training
python UAGLNet_train.py -c config/inria/UAGLNet.py
# testing
python UAGLNet_test.py -c config/inria/UAGLNet.py
```
Main Results
The following table presents the performance of UAGLNet on building extraction benchmarks.
| Benchmark | IoU (%) | F1 (%) | Precision (%) | Recall (%) | Weight |
|---|---|---|---|---|---|
| Inria | 83.74 | 91.15 | 92.09 | 90.22 | UAGLNet_Inria |
| Mass | 76.97 | 86.99 | 88.28 | 85.73 | UAGLNet_Mass |
| WHU | 92.07 | 95.87 | 96.21 | 95.54 | UAGLNet_WHU |
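For readers unfamiliar with these metrics, the sketch below shows the standard binary-segmentation definitions of IoU, F1, precision (P), and recall (R) for the building class, computed from true-positive, false-positive, and false-negative pixel counts. It assumes the repository follows these standard definitions, which the table values are consistent with; the function names are hypothetical. The identities F1 = 2PR / (P + R) and IoU = F1 / (2 - F1) follow directly from the definitions.

```python
def metrics_from_counts(tp, fp, fn):
    """Return (iou, f1, precision, recall) as fractions for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return iou, f1, precision, recall


def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iou_from_f1(f1):
    """IoU expressed through F1; both depend only on tp, fp, and fn."""
    return f1 / (2 - f1)


# (P, R) in percent, copied from the table in this section.
TABLE_PR = {
    "Inria": (92.09, 90.22),
    "Mass": (88.28, 85.73),
    "WHU": (96.21, 95.54),
}


def derived_scores(table=TABLE_PR):
    """Recompute (IoU, F1) in percent from the reported P and R."""
    scores = {}
    for name, (p, r) in table.items():
        f1 = f1_from_pr(p / 100, r / 100)
        scores[name] = (100 * iou_from_f1(f1), 100 * f1)
    return scores


if __name__ == "__main__":
    for name, (iou, f1) in derived_scores().items():
        print(f"{name}: IoU {iou:.2f}, F1 {f1:.2f}")
```

The recomputed values track the reported IoU and F1 columns closely; small differences in the last digit are expected because the published P and R are already rounded.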
To reproduce the results in the table above, run Reproduce.py, which loads the pretrained checkpoints from Hugging Face and performs inference.
```bash
# Inria
python Reproduce.py -d Inria
# Massachusetts
python Reproduce.py -d Mass
# WHU
python Reproduce.py -d WHU
```
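To run all three benchmarks in one go, the commands above can be wrapped in a small driver. This is a convenience sketch, not part of the repository: the helper names are hypothetical, and it assumes it is launched from the repository root inside the activated `uaglnet` environment. By default it only builds the commands; pass `dry_run=False` to execute them.

```python
import subprocess
import sys

# Dataset identifiers accepted by the -d flag, as listed in this section.
DATASETS = ("Inria", "Mass", "WHU")


def reproduce_command(dataset, script="Reproduce.py"):
    """Build the argument list for one dataset, using the current interpreter."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [sys.executable, script, "-d", dataset]


def reproduce_all(dry_run=True):
    """Return the commands for all datasets; run them when dry_run is False."""
    commands = [reproduce_command(name) for name in DATASETS]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in reproduce_all(dry_run=True):
        print(" ".join(cmd))
```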
Citation
If you find this project useful in your research, please cite it as:
@article{UAGLNet,
title = {UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction},
author = {Siyuan Yao and Dongxiu Liu and Taotao Li and Shengjie Li and Wenqi Ren and Xiaochun Cao},
journal = {arXiv preprint arXiv:2512.12941},
year = {2025}
}
Acknowledgement
This work builds upon BuildingExtraction, GeoSeg, and SMT. We sincerely appreciate their contributions, which provide a clear pipeline and well-organized code.