TY - GEN
T1 - Toward country scale building detection with convolutional neural network using aerial images
AU - Yang, Hsiuhan Lexie
AU - Lunga, Dalton
AU - Yuan, Jiangye
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/1
Y1 - 2017/12/1
N2 - Establishing up-to-date nationwide building maps is essential for understanding urban dynamics and supports applications such as population estimation, urban planning, and many others. However, an efficient and effective solution is yet to be developed. In this paper, for the first time, we evaluate three state-of-the-art CNNs for detecting buildings across the entire United States using aerial images. The three CNN architectures, fully convolutional neural network, conditional random field as recurrent neural network, and SegNet, support semantic pixel-wise labeling and focus on capturing textural information at multiple scales. We use 1-meter resolution NAIP images as the test data set and compare the detection results across the three methods. In addition, we propose to combine signed distance function labels with SegNet, which is the preferred CNN architecture identified by our extensive evaluations. The results are further improved in terms of precision, recall rate, and the number of buildings detected. On average, model inference on test images takes less than one minute for an area of ∼56 km². With these promising results and the short time required to process images, the framework offers great potential for country-scale building mapping with remote sensing imagery.
KW - NAIP
KW - building extractions
KW - convolutional
KW - deep learning
KW - semantic segmentation
UR - http://www.scopus.com/inward/record.url?scp=85034400093&partnerID=8YFLogxK
U2 - 10.1109/IGARSS.2017.8127091
DO - 10.1109/IGARSS.2017.8127091
M3 - Conference contribution
AN - SCOPUS:85034400093
T3 - International Geoscience and Remote Sensing Symposium (IGARSS)
SP - 870
EP - 873
BT - 2017 IEEE International Geoscience and Remote Sensing Symposium
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 37th Annual IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2017
Y2 - 23 July 2017 through 28 July 2017
ER -