Chang, Han*, Zhang, and Miao: Building Change Detection Using Deep Learning for Remote Sensing Images

# Building Change Detection Using Deep Learning for Remote Sensing Images

Abstract: To increase building change recognition accuracy, we present a deep learning-based building change detection method for remote sensing images. The proposed approach merges pixel-level and object-level information from multitemporal remote sensing images to create the difference image (DI), and a frequency-domain significance technique generates the DI saliency map. The coarse change detection map is then pre-classified with the fuzzy C-means clustering technique by setting a threshold on the DI saliency map. Next, we extract the neighborhood features of the unchanged pixels and the changed pixels (buildings) from the pixel-level and object-level feature images and use them as reliable training samples for deep neural networks (DNNs). The trained DNNs are then used to identify changes in the DI. The proposed method was evaluated on two datasets and compared with existing detection methods. The results suggest that it detects more building change information and improves change detection accuracy.

Keywords: Deep Neural Network (DNN), Difference Image, Frequency-Domain Significance, Fuzzy C-Means

## 1. Introduction

Since buildings are the primary locations for human activities, building change detection has been a hotspot for research in photogrammetry, remote sensing, and artificial intelligence. In recent years, scholars have proposed various building change detection technologies. Previous studies have used combinations of image spectral features and morphological building index features [1,2], combinations of image spectral, textural, and shape features [3], and combinations of the image spectral, textural, shape, and morphological building index feature differences [4,5] to detect building changes. Zhang et al. [6] integrated pixel-level and object-level features to increase the change detection accuracy of buildings. While various multi-feature fusion (MFF)-based approaches for detecting building changes have yielded positive results, in some cases, the approach may generate white spot noise if it fails to effectively highlight building change information.

Numerous scholars have used deep learning to identify buildings. For instance, convolutional neural networks (CNNs) were utilized by Nemoto et al. [7] and El Amin et al. [8] to extract structures from images and detect building changes. To accomplish high-precision extraction of local objects, Liu et al. [9] employed a deep neural network (DNN) to classify spectral and textural properties and gathered random samples of diverse ground objects for classification. To dig deeper into image features, Zhao and Du [10] and Liu et al. [11] advocated employing multi-scale CNNs, which convert the original images into pyramidal structures and then extract roads, buildings, and other features through multiple rounds of training.

Most of these deep learning methods only detect changes at the pixel level and are greatly affected by the training samples [10,11]. Therefore, researchers have conducted deep network training with buildings as the target samples for building identification. For example, Vakalopoulou et al. [12] used the Fast R-CNN method to train a large number of labeled building samples, and then used the trained model to identify buildings in remote sensing images. Wang et al. [13] also adopted the Faster R-CNN algorithm to analyze changes in remote sensing images, so as to identify various ground objects (such as buildings and roads). The effectiveness and quantity of the training samples, however, are crucial to this method.

The goal of this research is to increase the accuracy of building change detection by obtaining high-quality training samples from remote sensing images. We present a building change detection method based on pixel-level and object-level feature fusion (POFF) and DNNs. To highlight building change information in the difference image (DI) and reduce the white spot noise introduced by the POFF, we use the structural similarity method (SSIM) to identify the largest difference in textural and shape features (after multi-scale segmentation). The final DI is constructed by fusing the shape feature (SHF) DI, the morphological building index (MBI) DI, the textural feature (TF) DI, and the spectral feature (SF) DI acquired from change vector analysis (CVA). To provide reliable pixel-level training data, the DI saliency map is created using the frequency-domain significance (FDS) method. The fuzzy C-means (FCM) clustering method pre-classifies the coarse change detection map (changed, unchanged, and indeterminate pixels) by setting a DI saliency map threshold. Next, to obtain high-precision building change detection results, we extract the neighborhood features of the unchanged pixels and the changed pixels (buildings) from the multiple feature images and use them as reliable samples for DNN training. Finally, we apply the trained DNN classifier to the coarse change detection map to obtain the final building change detection result.

## 2. The Proposed Method

Three steps are included in the suggested method: the construction of difference images by multi-feature fusion, high quality training sample selection, and the deep learning network classification, as shown in Fig. 1.

##### 2.1 POFF to Construct the DI

To emphasize the changed building information while minimizing noise and data redundancy, the DI is created in three main steps. First, the spectral features, textural features, morphological building index feature, and shape features of the image are extracted. To emphasize the building change information effectively, we take the average of each band’s spectral mean as the image spectral feature and the average of each band’s textural features (grey level co-occurrence matrix [GLCM]) [14] as the image textural feature. Multi-scale segmentation of the multitemporal remote sensing images was performed using eCognition software, and the image morphological building index feature was calculated as the mean of the multi-scale top-hat transformation created by the differential morphological profile [15]. By adjusting the shape factors, scale parameters, and compactness, we perform multi-scale segmentation on the remote sensing images to extract shape features effectively. Second, to remove noise and data redundancy efficiently, we use SSIM to compute the differences in the textural and shape features of the multitemporal remote sensing images and then select the textural and shape features with the largest difference. Finally, we apply CVA to obtain the DI of each feature image of the multitemporal remote sensing images and then fuse the four DIs in a predetermined proportion to create the final DI [16], as shown in Fig. 2.
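The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the experiments: the function names, the single-window (global) form of SSIM, the min-max normalization, and the fusion weights are all assumptions made for the example, since the text only states that the four DIs are fused in a predetermined proportion.

```python
import numpy as np

def cva_magnitude(feat_t1, feat_t2):
    """Change vector analysis: per-pixel magnitude of the feature difference.

    Inputs have shape (H, W) for one band or (H, W, B) for B bands.
    """
    diff = np.asarray(feat_t2, dtype=float) - np.asarray(feat_t1, dtype=float)
    if diff.ndim == 2:
        return np.abs(diff)
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def normalize(img):
    """Rescale an image to [0, 1]; a constant image maps to all zeros."""
    img = np.asarray(img, dtype=float)
    span = img.max() - img.min()
    return np.zeros_like(img) if span == 0 else (img - img.min()) / span

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM (assumed simplification), used only to rank features."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2)
    return num / den

def most_different(pairs):
    """Index of the (t1, t2) feature pair with the lowest SSIM (largest difference)."""
    return int(np.argmin([global_ssim(a, b) for a, b in pairs]))

def fuse_difference_images(dis, weights):
    """Weighted sum of normalized DIs; the weights are rescaled to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * normalize(di) for wi, di in zip(w, dis))
```

In this sketch, `most_different` would be called once for the candidate textural features and once for the candidate shape features, and `fuse_difference_images` would receive the SF, TF, MBI, and SHF DIs together with hypothetical weights such as `[0.25, 0.25, 0.25, 0.25]`.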

Fig. 1.

Proposed method’s flow chart.

Fig. 2.

Flow chart of DI construction by fusing pixel-level and object-level features.

##### 2.2 Training Sample Acquisition and Pre-classification

This paper uses the FDS and FCM methods [15] to obtain high-quality training samples from coarse change detection maps. Two questions are considered: how to identify building change regions, and how to extract changed (building) and unchanged training samples from them.

FDS analysis is used to discover portions of an image that stand out from other areas because of strong local or global contrast. The FDS approach applies amplitude spectrum convolution to generate the saliency map, so that the shape and location of the salient areas correspond more closely to the changed regions. In this paper, the saliency map is obtained by convolving the amplitude spectrum with a scale-appropriate high-pass Gaussian kernel. Specifically, the Fourier transform converts an image $f(x, y)$ into the frequency domain, $f(x, y) \rightarrow F(f)(u, v)$. The image’s amplitude spectrum $A(u, v)=|F(f)|$ and phase spectrum $P(u, v)=\operatorname{angle}(F(f))$ are then calculated, and spikes in the amplitude spectrum $|F(f)|$ are suppressed using a Gaussian kernel $h$, as shown below [16]:

$$A_S(u, v)=|F(f)| * h \tag{1}$$

The saliency map is generated by the inverse transform of the smoothed amplitude spectrum $A_S$ combined with the original phase spectrum:

$$S=F^{-1}\left\{A_S(u, v) e^{\mathrm{i} P(u, v)}\right\} \tag{2}$$
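Eqs. (1) and (2) can be sketched with NumPy's FFT routines. The sketch below makes several assumptions that the text does not specify: the kernel width `sigma`, centering the spectrum with `fftshift` before smoothing, zero padding at the borders of the spectrum, taking the magnitude of the inverse transform, and rescaling the result to [0, 1].

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def smooth_2d(img, sigma):
    """Separable Gaussian convolution, same output size, zero padding at the borders."""
    k = gaussian_kernel_1d(sigma)
    if min(img.shape) < len(k):
        raise ValueError("image is smaller than the Gaussian kernel")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

def fds_saliency(di, sigma=2.0):
    """Saliency map from a smoothed amplitude spectrum and the original phase."""
    F = np.fft.fft2(np.asarray(di, dtype=float))
    amplitude = np.abs(F)                        # |F(f)|
    phase = np.angle(F)                          # P(u, v)
    # Eq. (1): convolve the (centered) amplitude spectrum with the kernel h.
    amp_s = smooth_2d(np.fft.fftshift(amplitude), sigma)
    amp_s = np.fft.ifftshift(amp_s)
    # Eq. (2): inverse transform of the smoothed amplitude with the original phase.
    recon = np.fft.ifft2(amp_s * np.exp(1j * phase))
    sal = np.abs(recon)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
```

Larger values of `sigma` suppress the amplitude spikes more strongly; the appropriate scale depends on the image and is left as a free parameter here.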

In this paper, we construct the saliency map of the coarse change detection map by setting a threshold value and then pre-classify the pixels in the coarse change detection map.

The FDS and FCM pre-classification algorithms highlight the most probable changed locations in the DI while narrowing the search range for training samples. They also enable a more precise classification, which increases the accuracy of the acquired training samples.
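The pre-classification step can be illustrated with a small fuzzy C-means routine applied to the DI values of the salient pixels. This is a sketch under stated assumptions, not the procedure used in the paper: the fuzzifier `m = 2`, the evenly spaced initial centers, the label coding (0 = unchanged, 1 = changed, 2 = indeterminate), and the rule that pixels below the saliency threshold are treated as unchanged are all choices made for the example.

```python
import numpy as np

def fuzzy_c_means_1d(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Fuzzy C-means on scalar values; returns (centers, memberships of shape (c, N))."""
    x = np.asarray(values, dtype=float).ravel()
    centers = np.linspace(x.min(), x.max(), n_clusters)   # deterministic initialization
    for _ in range(n_iter):
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)           # each column sums to 1
        um = u ** m
        new_centers = (um @ x) / um.sum(axis=1)
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    return centers, u

def pre_classify(di, saliency, threshold=0.5):
    """Label map: 0 = unchanged, 1 = changed, 2 = indeterminate (assumed coding)."""
    di = np.asarray(di, dtype=float)
    labels = np.zeros(di.shape, dtype=int)                 # non-salient pixels stay unchanged
    mask = np.asarray(saliency) > threshold
    if mask.sum() < 3:
        return labels
    centers, u = fuzzy_c_means_1d(di[mask], n_clusters=3)
    order = np.argsort(centers)                            # low, middle, high DI value
    code = np.empty(3, dtype=int)
    code[order] = [0, 2, 1]                                # low -> unchanged, high -> changed
    labels[mask] = code[np.argmax(u, axis=0)]
    return labels
```

Only the pixels labeled 0 or 1 would then be used as training samples, while the pixels labeled 2 are left for the trained classifier to decide.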

##### 2.3 DNNs Establishment

After pre-classification, the neighborhood features of the original, morphological building index, textural feature, spectral feature, and shape feature images are converted into vectors and fed into the neural network for training. These neighborhood features are taken at the pixels of the unchanged and changed classes in the coarse change map. Because the multilayer backpropagation neural network does not always achieve satisfactory results, the restricted Boltzmann machine (RBM), which learns one feature layer at a time, is chosen as the network training model.
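Converting neighborhood features into vectors can be sketched as follows. The window size, the reflective padding at the image borders, and the label coding (0 = unchanged, 1 = changed, any other value ignored) are assumptions for illustration; the text does not state them.

```python
import numpy as np

def neighborhood_vector(feature_images, row, col, size=5):
    """Concatenate the size x size neighborhood of (row, col) from every feature image."""
    half = size // 2                     # size is assumed to be odd
    parts = []
    for img in feature_images:
        padded = np.pad(np.asarray(img, dtype=float), half, mode="reflect")
        # After padding, pixel (row, col) sits at (row + half, col + half).
        patch = padded[row:row + size, col:col + size]
        parts.append(patch.ravel())
    return np.concatenate(parts)

def build_training_set(feature_images, labels, size=5):
    """Collect vectors and targets for pixels pre-classified as unchanged (0) or changed (1)."""
    rows, cols = np.nonzero((labels == 0) | (labels == 1))
    X = np.stack([neighborhood_vector(feature_images, r, c, size)
                  for r, c in zip(rows, cols)])
    y = labels[rows, cols].astype(float)
    return X, y
```

With five feature images and a 5 x 5 window, each sample would be a vector of length 125.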

The procedure is shown in Fig. 3, where $\omega_1, \omega_2, \omega_3, \omega_4$ are the weights of each layer; $\varepsilon_1, \varepsilon_2, \varepsilon_3, \varepsilon_4$ are the learning rates of each layer; $t_1$ and $t_2$ are remote sensing images taken at different times; and $V_{ij}$ represents the feature vector. The neighborhood features of the changed class $w_{c}$ and unchanged class $w_{uc}$ (Fig. 3(a)) are first inputted. A stack of RBMs is learned for pre-training (Fig. 3(b)), and the RBMs are then unrolled to create a deep neural network (Fig. 3(c)). Finally, we fine-tune the deep neural network using backpropagation of the error derivatives (Fig. 3(d)).

Fig. 3(e) depicts the basic structure of an RBM network. An RBM has $l$ visible units $\left(v_1, v_2, \cdots, v_l\right)$ corresponding to its input features and $n$ hidden units $\left(h_1, h_2, \cdots, h_n\right)$ that are trained, with each visible unit connected to the hidden units. $W_{l \times n}$ is the weight matrix between the visible and hidden layers, $a=\left(a_1, a_2, \cdots, a_l\right)$ are the biases of the visible units, and $b=\left(b_1, b_2, \cdots, b_n\right)$ are the biases of the hidden units. The energy of a joint configuration of the visible and hidden units is given by [17]:

$$E(v, h)=-\sum_{i=1}^{l} a_i v_i-\sum_{j=1}^{n} b_j h_j-\sum_{i, j} v_i h_j W_{i j} \tag{3}$$

Fig. 3.

Training of deep neural networks for the change detection task. (a) Neighborhood features of each position as inputs. (b) RBMs used in the pre-training stage. (c) Pre-trained RBMs unrolled into a deep neural network. (d) Fine-tuning using backpropagation. (e) Structure of an RBM.

Suppose that $\forall i, j$, $v_i \in\{0,1\}$ and $h_j \in\{0,1\}$. Then, for a given $v$, the probability that the binary state of hidden unit $h_{j}$ is set to 1 is

$$P\left(h_j=1 \mid v\right)=\sigma\left(\sum_{i=1}^l W_{i j} \times v_i+b_j\right) \tag{4}$$

In Eq. (4), $\sigma(x)=1 /\left(1+e^{-x}\right)$ is the sigmoid function. After the binary states of the hidden units are set, the reconstructed data are produced by setting each $v_{i}$ to 1 with probability

$$P\left(v_i=1 \mid h\right)=\sigma\left(\sum_{j=1}^n W_{i j} \times h_j+a_i\right) \tag{5}$$

The features of the reconstructed data are then represented by updating the states of the hidden units. The change in weight is calculated by:
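A minimal sketch of one RBM training step is given here, assuming the standard one-step contrastive divergence (CD-1) rule, in which the weight change is the learning rate times the difference between the data and reconstruction correlations of $v_i$ and $h_j$. The function names and the batch-averaged form are assumptions for the example. The weight matrix is stored with shape `(l, n)` so that `W[i, j]` links $v_i$ to $h_j$, with `a` the visible biases and `b` the hidden biases as defined in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, W, a, b):
    """Energy of one joint configuration, Eq. (3), with visible biases a and hidden biases b."""
    return -(a @ v) - (b @ h) - (v @ W @ h)

def cd1_step(v0, W, a, b, lr=0.1, rng=None):
    """One CD-1 update on a batch v0 of shape (batch, l); returns updated (W, a, b)."""
    rng = np.random.default_rng() if rng is None else rng
    p_h0 = sigmoid(v0 @ W + b)                            # Eq. (4)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)    # sample binary hidden states
    p_v1 = sigmoid(h0 @ W.T + a)                          # Eq. (5): reconstruction
    p_h1 = sigmoid(p_v1 @ W + b)                          # hidden units updated on the reconstruction
    batch = v0.shape[0]
    dW = lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch       # data minus reconstruction correlations
    da = lr * (v0 - p_v1).mean(axis=0)
    db = lr * (p_h0 - p_h1).mean(axis=0)
    return W + dW, a + da, b + db
```

Stacking RBMs then amounts to training one layer with repeated calls to `cd1_step`, passing its hidden probabilities on as the visible data of the next layer, and finally fine-tuning the unrolled network with backpropagation.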

##### 3.5.2 DNNs parameters

Numerous parameters (e.g., batch size, number of iterations, number of layers, and number of nodes per layer) must be tuned to construct high-performance DNNs. Since the batch size defines the data subset used in each training step, choosing the best batch size is critical. The number of iterations refers to the number of times Gibbs sampling is applied to each layer during training. In general, the more layers a DNN has, the better it can detect features. However, the number of nodes in each layer also matters: with too many nodes, the network may overfit because it is too complicated for the dataset being analyzed, and with too few nodes, it may fail to learn the features. In this paper, the DNN parameters were determined according to the accuracy of building change detection. The batch size was set to 50 for the first dataset and 30 for the second, and the number of iterations was set to 100 for both datasets. In the experiments, as the network deepens, the training time increases and the model becomes more prone to overfitting. Therefore, a deep network with a 50-250-1 architecture is recommended for the two datasets.
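To make the parameters concrete, the sketch below reads "50-250-1" as the widths of the input, hidden, and output layers, which is an interpretation and not something the text states explicitly. It shows how such a configuration maps to weight shapes and how a batch size of 50 splits a sample set; the initialization scale and the sigmoid output are likewise assumptions.

```python
import numpy as np

def init_layers(sizes, rng):
    """One (weights, biases) pair per consecutive pair of layer widths."""
    return [(rng.normal(0.0, 0.01, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Sigmoid forward pass; the last layer yields a change probability per sample."""
    for W, b in layers:
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))
    return x

def minibatches(X, batch_size):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size]
```

For example, `init_layers([50, 250, 1], rng)` creates weight matrices of shape `(50, 250)` and `(250, 1)`, and 120 samples with a batch size of 50 are processed as batches of 50, 50, and 20.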

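Table 1.

Outcomes of several methods for building change detection

| Dataset | Method | Com | Cor | Qua |
|---|---|---|---|---|
| Dataset #1 | POFF+FLICM | 0.6793 | 0.8707 | 0.7632 |
| Dataset #1 | POFF+SVM | 0.6930 | 0.7741 | 0.7313 |
| Dataset #1 | Zhang’s method [18] | 0.8385 | 0.8103 | 0.7401 |
| Dataset #1 | POFF+FDS+DNNs | 0.9784 | 0.8799 | 0.8631 |
| Dataset #2 | POFF+FLICM | 0.9516 | 0.4403 | 0.6020 |
| Dataset #2 | POFF+SVM | 0.8093 | 0.5565 | 0.6595 |
| Dataset #2 | Zhang’s method [18] | 0.5924 | 0.7399 | 0.4903 |
| Dataset #2 | POFF+FDS+DNNs | 0.9918 | 0.7563 | 0.7562 |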

Fig. 7.

Change detection accuracy of the different methods: (a) the first dataset and (b) the second dataset.

## 4. Conclusion

To increase the efficacy of detecting changes in buildings using remote sensing images, we created a deep learning-based building change detection technique in this research. The POFF technique of DI creation may effectively highlight the changing regions. The proposed FDS-FCM method can obtain reliable training samples, and the final building change map is produced using DNN classification. Compared to existing detection techniques, the POFF+FDS+DNNs can detect more building change information and achieve higher detection accuracy.

## Biography

##### Chang Wang
https://orcid.org/0000-0003-3132-2996

He is currently an associate professor at the School of Civil Engineering, University of Science and Technology Liaoning, Anshan, China. His research interests include remote sensing image change detection, intelligent recognition and artificial intelligence.

##### Shijing Han
https://orcid.org/0000-0002-2538-0738

She is currently a lecturer at the School of Natural Resources and Surveying, Nanning Normal University, Nanning, China. Her research interests include computer vision, digital image processing, navigation and location services.

##### Wen Zhang
https://orcid.org/0000-0001-8690-6513

She is presently a lecturer at the University of Science and Technology Liaoning's School of Civil Engineering in Anshan, China. Remote sensing image segmentation, shoreline change analysis, and geographic information science are among her research interests.

##### Shufeng Miao
https://orcid.org/0000-0002-4448-9229

He is currently a branch manager and senior engineer in Wuhan Kedao Geographical Information Engineering Co., Ltd., Wuhan, China. His research interests include project management, digital image processing, navigation and location services.

## References

• 1 X. Huang, L. Zhang, and T. Zhu, "Building change detection from multitemporal high-resolution remotely sensed images based on a morphological building index," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 7, no. 1, pp. 105-115, 2014. doi:10.1109/jstars.2013.2252423
• 2 T. Feng and Z. F. Shao, "Building change detection based on the enhanced morphological building index," Science of Surveying and Mapping, vol. 2017, no. 5, pp. 29-34, 2017.
• 3 X. Ye, Q. Qi, J. Wang, X. Zheng, and J. Wang, "Detecting damaged buildings caused by earthquake from remote sensing image using local spatial statistics method," Geomatics and Information Science of Wuhan University, vol. 44, no. 1, pp. 125-131, 2019. doi:10.13203/j.whugis20150490
• 4 J. Li, J. Dang, and Y. Wang, "Building change detection by multi-feature fusion from high resolution remote sensing images," Bulletin of Surveying and Mapping, vol. 2019, no. 10, pp. 105-108, 2019. doi:10.13474/j.cnki.11-2246.2019.0328
• 5 X. Huang, T. Zhu, L. Zhang, and Y. Tang, "A novel building change index for automatic building change detection from high-resolution remote sensing imagery," Remote Sensing Letters, vol. 5, no. 8, pp. 713-722, 2014. doi:10.1080/2150704x.2014.963732
• 6 Z. Zhang, X. Zhang, Q. Xin, and X. Yang, "Combining the pixel-based and object-based methods for building change detection using high-resolution remote sensing images," Acta Geodaetica et Cartographica Sinica, vol. 47, no. 1, pp. 102-112, 2018. doi:10.11947/j.AGCS.2018.20170483
• 7 K. Nemoto, R. Hamaguchi, M. Sato, A. Fujita, T. Imaizumi, and S. Hikosaka, "Building change detection via a combination of CNNs using only RGB aerial imageries," in Proceedings of SPIE 10431: Remote Sensing Technologies and Applications in Urban Environments II. Bellingham, WA: International Society for Optics and Photonics, 2017, pp. 107-118. doi:10.1117/12.2277912
• 8 A. M. El Amin, Q. Liu, and Y. Wang, "Zoom out CNNs features for optical remote sensing change detection," in Proceedings of 2017 2nd International Conference on Image, Vision and Computing (ICIVC), Chengdu, China, 2017, pp. 812-817. doi:10.1109/icivc.2017.7984667
• 9 D. Liu, L. Han, and X. Han, "High spatial resolution remote sensing image classification based on deep learning," Acta Optica Sinica, vol. 36, no. 4, pp. 306-314, 2016. doi:10.3788/aos201636.0428001
• 10 W. Zhao and S. Du, "Learning multiscale and deep representations for classifying remotely sensed imagery," ISPRS Journal of Photogrammetry and Remote Sensing, vol. 113, pp. 155-165, 2016. doi:10.1016/j.isprsjprs.2016.01.004
• 11 Y. Liu, Z. Zhang, R. Zhong, D. Chen, Y. Ke, J. Peethambaran, C. Chen, and L. Sun, "Multilevel building detection framework in remote sensing images based on convolutional neural networks," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 10, pp. 3688-3700, 2018. doi:10.1109/jstars.2018.2866284
• 12 M. Vakalopoulou, K. Karantzalos, N. Komodakis, and N. Paragios, "Building detection in very high resolution multispectral data with deep learning features," in Proceedings of 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 2015, pp. 1873-1876. doi:10.1109/igarss.2015.7326158
• 13 Q. Wang, X. Zhang, G. Chen, F. Dai, Y. Gong, and K. Zhu, "Change detection based on Faster R-CNN for high-resolution remote sensing images," Remote Sensing Letters, vol. 9, no. 10, pp. 923-932, 2018. doi:10.1080/2150704x.2018.1492172
• 14 C. Wang, Y. Zhang, and S. Han, "Remote sensing image change detection based on frequency domain significance method and ELM," Journal of Huazhong University of Science and Technology (Natural Science Edition), vol. 48, no. 5, pp. 19-24, 2020. doi:10.3785/j.issn.1008-973X.2020.11.009
• 15 C. Wang, Y. Zhang, S. Ji, and L. Zhang, "Multi-feature fusion and random multi-graph synthetic building change method," Acta Geodaetica et Cartographica Sinica, vol. 50, no. 2, pp. 235-247, 2021. doi:10.11947/j.AGCS.2021.20200097
• 16 C. Wang and X. Wang, "Building change detection from multi-source remote sensing images based on multifeature fusion and extreme learning machine," International Journal of Remote Sensing, vol. 42, no. 6, pp. 2246-2257, 2021. doi:10.1080/2150704X.2020.1805134
• 17 C. Wang, Y. S. Zhang, X. Wang, and Y. Yu, "Remote sensing image change detection method based on deep neural networks," Journal of ZheJiang University (Engineering Science), vol. 54, no. 11, pp. 2138-2148, 2020. doi:10.3785/j.issn.1008-973X.2020.11.009
• 18 X. Zhang, X. Chen, F. Li, and T. Yang, "Change detection method for high resolution remote sensing images using deep learning," Acta Geodaetica et Cartographica Sinica, vol. 46, no. 8, pp. 999-1008, 2017. doi:10.1109/ACCESS.2020.3047915