TY - JOUR
T1 - Building Extraction at Scale Using Convolutional Neural Network
T2 - Mapping of the United States
AU - Yang, Hsiuhan Lexie
AU - Yuan, Jiangye
AU - Lunga, Dalton
AU - Laverdiere, Melanie
AU - Rose, Amy
AU - Bhaduri, Budhendra
N1 - Publisher Copyright: © 2008-2012 IEEE.
PY - 2018/8
Y1 - 2018/8
N2 - Establishing up-to-date large-scale building maps is essential to understanding urban dynamics, with applications such as population estimation, urban planning, and many others. Although many computer vision tasks have been successfully carried out with deep convolutional neural networks, there is a growing need to understand their large-scale impact on building mapping with remote sensing imagery. Taking advantage of the scalability of convolutional neural networks (CNNs) and using only a few areas with abundant building footprints, for the first time we conduct a comparative analysis of four state-of-the-art CNNs for extracting building footprints across the entire continental United States. The four CNN architectures, namely Branch-out CNN, fully convolutional network (FCN), conditional random field as recurrent neural network (CRFasRNN), and SegNet, support semantic pixelwise labeling and focus on capturing textural information at multiple scales. We use 1-meter resolution aerial images from the National Agriculture Imagery Program as the test bed and compare the extraction results across the four methods. In addition, we propose to combine signed-distance labels with SegNet, the preferred CNN architecture identified by our extensive evaluations, to advance building extraction results to the instance level. We further demonstrate the usefulness of fusing additional near-infrared information into the building extraction framework. Large-scale experimental evaluations are conducted and reported using metrics that include precision, recall, intersection over union, and the number of buildings extracted. With the improved CNN model and no requirement for further postprocessing, we have generated building maps for the United States with an average processing time of less than one minute for an area of approximately 56 km². The quality of the extracted buildings and the processing time demonstrate that the proposed CNN-based framework fits the need for building extraction at scale.
AB - Establishing up-to-date large-scale building maps is essential to understanding urban dynamics, with applications such as population estimation, urban planning, and many others. Although many computer vision tasks have been successfully carried out with deep convolutional neural networks, there is a growing need to understand their large-scale impact on building mapping with remote sensing imagery. Taking advantage of the scalability of convolutional neural networks (CNNs) and using only a few areas with abundant building footprints, for the first time we conduct a comparative analysis of four state-of-the-art CNNs for extracting building footprints across the entire continental United States. The four CNN architectures, namely Branch-out CNN, fully convolutional network (FCN), conditional random field as recurrent neural network (CRFasRNN), and SegNet, support semantic pixelwise labeling and focus on capturing textural information at multiple scales. We use 1-meter resolution aerial images from the National Agriculture Imagery Program as the test bed and compare the extraction results across the four methods. In addition, we propose to combine signed-distance labels with SegNet, the preferred CNN architecture identified by our extensive evaluations, to advance building extraction results to the instance level. We further demonstrate the usefulness of fusing additional near-infrared information into the building extraction framework. Large-scale experimental evaluations are conducted and reported using metrics that include precision, recall, intersection over union, and the number of buildings extracted. With the improved CNN model and no requirement for further postprocessing, we have generated building maps for the United States with an average processing time of less than one minute for an area of approximately 56 km². The quality of the extracted buildings and the processing time demonstrate that the proposed CNN-based framework fits the need for building extraction at scale.
KW - Building extraction
KW - FCN
KW - convolutional neural networks (CNN)
KW - large scale
KW - SegNet
KW - signed-distance
UR - http://www.scopus.com/inward/record.url?scp=85049100808&partnerID=8YFLogxK
U2 - 10.1109/JSTARS.2018.2835377
DO - 10.1109/JSTARS.2018.2835377
M3 - Article
AN - SCOPUS:85049100808
SN - 1939-1404
VL - 11
SP - 2600
EP - 2614
JO - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
JF - IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
IS - 8
M1 - 8392725
ER -