Evaluation of the change in synthetic aperture radar imaging using transfer learning and residual network

Change detection from synthetic aperture radar images has become a key technique for detecting areas of change related to phenomena such as floods and deformation of the Earth's surface. This paper proposes a method for change detection from two synthetic aperture radar images based on transfer learning and an 18-layer Residual Network (ResNet-18) architecture. Before applying the proposed technique, batch denoising using a convolutional neural network is applied to the two input synthetic aperture radar images for speckle noise reduction. To validate the performance of the proposed method, three well-known synthetic aperture radar datasets (the Ottawa, Mexico, and Shimen (Taiwan) datasets) are exploited in this paper. The use of these datasets is important because the ground truth is known, which makes them comparable to numerical simulations. The detected change image obtained by the proposed method is evaluated using two image metrics. The first metric is the image quality index, which measures the similarity between the obtained image and the ground-truth image; the second is the edge preservation index, which measures how well the method preserves edges. Finally, the method is applied to determine the changed area using two Sentinel-1B synthetic aperture radar images of the Eddahbi dam located in Morocco.


Introduction
In satellite imagery, change detection is a task of interest for many applications, such as urban growth monitoring. The objective is to identify and analyze changes in a scene from images acquired at different dates. In this context, radar imagery is one of the most relevant means. Indeed, thanks to its ability to observe at any time of the day or night, it is an indispensable means of observation in emergencies where weather conditions are unfavorable for acquisition in the optical domain.
Synthetic aperture radar (SAR) is an active and powerful remote sensing technology for collecting ground information at any time and under any conditions [1,2]. Change detection from remote sensing SAR images assists in assessing disasters and predicting their development, updating geographic data, and monitoring land use. Generally, the principal steps of change detection are preprocessing of the input SAR images, computation of the difference between these images, determination of the change information, and finally evaluation of the detection results [3].
Several approaches for change detection in SAR images have been proposed and exploited. Mu et al. [4] present an accelerated genetic algorithm based on search space decomposition. The difference image step is realized by decomposing the difference into sub-blocks [5]; the detected change in each sub-block then identifies the changed, unchanged, and undetermined pixels. These undetermined pixels are optimized, and the final change detection result is obtained by reconstructing all sub-blocks. The multi-objective fuzzy clustering method [6] was proposed for change detection in SAR images. This method optimizes two conflicting objective functions, constructed from the perspectives of reducing speckle noise and preserving detail. A hybrid approach based on fuzzy c-means and Gustafson-Kessel clustering was also constructed for unsupervised change detection in multitemporal SAR images [7]. Other works focus on reducing the effect of speckle noise in SAR images to improve change detection accuracy [8-10].
In another work, a method based on salient image guidance and an accelerated genetic algorithm was proposed [4]. In [11], the authors apply a saliency detection model to the difference image in order to extract the pixels containing the change.
To improve change detection performance and accuracy and to reduce running time, Wenyan et al. propose a method based on weighted image fusion and adaptive thresholding in the NSST domain [12].
Recently, deep learning-based models have gained great interest among researchers in the fields of change detection and SAR image analysis. The authors of [13] exploit a convolutional neural network (CNN) with the wavelet transform for sea ice change detection. Since SAR images are characterized by strong speckle noise, change detection accuracy becomes low; for this reason, they introduce a wavelet thresholding approach for speckle noise reduction. The CNN model then classifies the image pixels into changed and unchanged pixels. Li et al. present a novel method based on a CNN [14]; the principal idea of their work is to generate classification results directly from SAR images without any preprocessing step. Li et al. obtain the final change detection results by first producing pseudo-labels through unsupervised spatial fuzzy clustering, then training the CNN, and finally applying the trained CNN.
Gao et al. proposed two important works dedicated to change detection. The first is based on a neighborhood-based ratio and an extreme learning machine [15]; the neighborhood-based ratio is used to obtain the pixels that have a high probability of being changed or unchanged. The second work of Gao et al. concerns another high-performing method based on a channel-weighting-based deep cascade network [16]; this work is proposed to solve problems found in other deep learning-based methods, such as overfitting and exploding gradients.

Change detection methodology
Consider two SAR images of the same area of the Earth's surface, taken at times t1 and t2, respectively. The goal is to design an efficient change detection method to determine the changes between the two images. Before beginning the change detection procedure, geometric correction and registration are essential to align the two input images in the same coordinate frame.
The general procedure used to detect change is based on three important steps.
1. Preprocessing: consists of radiometric calibration, orthorectification, and speckle noise reduction on these images.
2. Computation of the image difference: a difference image can be generated by the difference or ratio method, which subtracts or divides the corresponding pixels of the two images, respectively.
3. Analysis of the image difference: the difference image represents a correlation between the two states (before and after the change), and its analysis concerns the extraction of information related to the change.
For a deep learning-based method, the general procedure used to estimate and predict a change from the image is shown in fig. 1a.
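As an illustration of step 2, the difference image computation can be sketched in a few lines of Python. This is a minimal sketch only: the paper does not fix a specific operator, and the log-ratio variant shown here is one common choice for SAR because speckle noise is multiplicative.

```python
import numpy as np

def log_ratio_difference(img1, img2, eps=1e-6):
    """Log-ratio difference image: |log(I2 / I1)|.

    The logarithm turns the multiplicative speckle into an additive term,
    so changed pixels stand out symmetrically whether the backscatter
    intensity rose or fell between the two acquisition dates.
    """
    return np.abs(np.log((img2 + eps) / (img1 + eps)))
```

With identical inputs the difference image is zero everywhere, and a pixel whose intensity was multiplied by a factor k maps to |log k| regardless of the direction of change.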
A binarization step is vital to train the architecture and improve the accuracy of the results using an image dataset. To detect the change from the SAR images, two GRD images with vertical-vertical (VV) polarization are used; then, a binarized difference image is computed (fig. 1b).
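Binarizing the difference image requires a global threshold. A minimal sketch of Otsu's method, which picks the threshold maximizing the between-class variance of the histogram, is shown below; this is an illustrative choice, as the paper does not state which thresholding rule was used.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    hist, bin_edges = np.histogram(image.ravel(), bins=nbins)
    hist = hist.astype(float)
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Cumulative pixel counts of the two candidate classes at each split
    weight1 = np.cumsum(hist)
    weight2 = np.cumsum(hist[::-1])[::-1]
    # Cumulative class means (guarded against empty classes)
    mean1 = np.cumsum(hist * bin_centers) / np.maximum(weight1, 1e-12)
    mean2 = (np.cumsum((hist * bin_centers)[::-1])
             / np.maximum(weight2[::-1], 1e-12))[::-1]
    # Between-class variance evaluated at every candidate split point
    variance = weight1[:-1] * weight2[1:] * (mean1[:-1] - mean2[1:]) ** 2
    return bin_centers[:-1][np.argmax(variance)]
```

The binarized difference image is then simply `difference > otsu_threshold(difference)`.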
After that, a network was constructed, and the binarized difference image was sent to the input layer to train the network under supervision. Finally, after several iterative training passes, a change map was obtained at the output layer of the network, as shown in fig. 1c. The dataset used for training contains 1104 images obtained after the data augmentation step explained in the next section.
The proposed approach in this work contains four important steps: image pre-processing, image clustering, data augmentation, and transfer learning based on an 18-layer Residual Network (ResNet-18).

Image pre-processing
This step concerns the pre-processing of the Sentinel-1 datasets: it begins by reducing speckle noise, enhancing image contrast, and generating binary change bands. For multiplicative speckle noise reduction, we exploit our recently proposed convolutional neural network architecture [17].
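The denoising network of [17] is beyond the scope of this section, but the multiplicative nature of speckle can be illustrated with a classical homomorphic sketch (log-transform, smooth, exponentiate). This stands in for, but is not, the CNN-based denoiser used in the paper.

```python
import numpy as np

def homomorphic_despeckle(img, k=3, eps=1e-6):
    """Classical multiplicative-noise reduction.

    Speckle becomes additive in the log domain, where a simple k x k
    mean filter (edge-replicated padding) can smooth it before mapping
    the image back to the intensity domain.
    """
    log_img = np.log(img + eps)
    pad = k // 2
    padded = np.pad(log_img, pad, mode="edge")
    h, w = log_img.shape
    out = np.zeros_like(log_img)
    for i in range(k):          # accumulate the k x k neighborhood sums
        for j in range(k):
            out += padded[i:i + h, j:j + w]
    return np.exp(out / (k * k)) - eps
```

A learned denoiser replaces the fixed mean filter with a trained nonlinear mapping, but the multiplicative-to-additive reasoning is the same.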

Image clustering
An image clustering approach is used to segment changed regions and distinguish them from areas without change. In this work, we exploited Mask R-CNN [18], a simple, flexible, and general framework for object instance segmentation. Mask R-CNN was proposed in 2017 and is considered one of the most powerful instance segmentation techniques to date. As shown in fig. 2, which represents the structure of Mask R-CNN, it consists of a backbone CNN, a region proposal network, a region-of-interest stage, and three output branches: classification, box regression, and mask prediction. Firstly, the features are searched through the region proposal network for zones that may contain foreground; this is illustrated by rectangles of different sizes that cover such regions, and the suggested rectangles are used as bounding boxes. Secondly, these bounding boxes are exploited to obtain regions of interest, on which classification and bounding box regression are then performed.

Data augmentation
The proposed approach started by creating our dataset, annotating 100 image patches of 224 × 224 × 3 containing two classes: changed and non-changed areas. This particular patch size was selected to match the size of the input layer of the ResNet-18 deep neural network, which is 224 × 224 × 3.
Moreover, the dataset was augmented to 1104 images by performing random translations along the x-axis, random flips and rotations, and random resizing and scaling of the patches.
By growing the dataset from 100 to 1104 images, data augmentation allows the deep learning architecture to achieve better performance and accuracy.
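The augmentation step can be sketched as follows. This is a minimal NumPy sketch: the exact transform parameters used in the paper are not specified, so the ranges below (for example the ±10-pixel translation) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(patch):
    """Return a randomly transformed copy of an (H, W, C) patch."""
    out = patch
    if rng.random() < 0.5:
        out = np.flip(out, axis=0)                    # vertical flip
    if rng.random() < 0.5:
        out = np.flip(out, axis=1)                    # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)),    # 0/90/180/270 deg
                   axes=(0, 1))
    shift = int(rng.integers(-10, 11))                # small translation
    out = np.roll(out, shift, axis=1)
    return out
```

Every transform here is label-preserving for the changed/non-changed classes, which is what makes this kind of augmentation safe for classification.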

Transfer learning and residual network
We trained a transfer learning framework on the previously segmented areas, thus combining both methods to predict two classes of objects in the satellite image: changed areas and non-changed areas (possibly with an additional unidentified-area class).
Transfer learning is an approach used to improve the learning of a new task by transferring and adapting knowledge from a similar task that has already been learned by a trained network. It consists essentially of reusing the weights of a pre-trained deep neural network while replacing the last layers with new ones, which are retrained to provide a model that better fits the target objects and task.
The choice of transfer learning was motivated by our relatively small dataset size, which is usually the case in satellite image and remote sensing applications such as flood area detection, while retaining the predictive power of a deep learning model.
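The idea can be illustrated with a toy NumPy sketch in which a frozen random projection stands in for the pretrained backbone and only a new logistic head is trained. This is purely illustrative: the dimensions and the "backbone" are stand-ins, not the actual ResNet-18.

```python
import numpy as np

rng = np.random.default_rng(1)

D_IN, D_FEAT = 256, 64  # toy sizes standing in for ResNet-18's 512-d features

# Stand-in for the pretrained layers: fixed weights, never updated.
W_frozen = rng.normal(size=(D_IN, D_FEAT)) / np.sqrt(D_IN)

def features(x):
    """Frozen backbone: the weights stay fixed during transfer learning."""
    return np.maximum(x @ W_frozen, 0)

# New trainable head: binary logistic regression over the frozen features.
w_head = np.zeros(D_FEAT)

def train_head(X, y, lr=0.01, epochs=200):
    """Gradient descent on the head only; the backbone is untouched."""
    global w_head
    F = features(X)                        # computed once, since frozen
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-F @ w_head))  # sigmoid predictions
        w_head -= lr * F.T @ (p - y) / len(y)
```

Freezing the backbone means only `D_FEAT` parameters are fitted, which is why transfer learning remains stable on small datasets where training all layers from scratch would overfit.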
The CNN architecture implemented in this paper is based on ResNet-18, which represents a good balance between depth (computation time) and performance. ResNet was introduced during the 2015 ImageNet Large Scale Visual Recognition Challenge and won it with a remarkable error rate of 3.57 % [19] (depending on their skill and expertise, humans generally hover around a 5-10 % error rate). This network was pre-trained on the ImageNet database, which includes more than a million images. As a result, the network has learned rich feature representations for a wide range of images and can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals [20].
The structure of ResNet-18 includes 18 layers organized in 5 convolutional stages [20] (see table 1 for details). As shown in table 1, the ResNet-18 architecture contains the following layers:
1. a convolution with a 7 × 7 kernel and 64 filters, with a stride of 2 (1 layer);
2. a max pooling layer, also with a stride of 2;
3. two residual blocks, each containing two 3 × 3 convolutions with 64 filters (4 layers);
4. two residual blocks, each containing two 3 × 3 convolutions with 128 filters (4 layers);
5. two residual blocks, each containing two 3 × 3 convolutions with 256 filters (4 layers);
6. two residual blocks, each containing two 3 × 3 convolutions with 512 filters (4 layers);
7. an average pooling layer followed by a fully connected layer of 1000 nodes and a softmax function (1 layer).
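The defining feature of these residual stages is the identity shortcut around each pair of convolutions. The basic block can be sketched for a single channel as follows (a toy illustration only; real ResNet-18 blocks also include batch normalization and operate on many channels):

```python
import numpy as np

def conv3x3(x, w):
    """'Same' 3x3 convolution of an (H, W) map with a (3, 3) kernel."""
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def residual_block(x, w1, w2):
    """Basic ResNet block: y = ReLU(F(x) + x), with F = conv-ReLU-conv.

    The identity shortcut `+ x` lets gradients flow through the block
    even when the learned residual F is near zero, which is what makes
    very deep networks trainable.
    """
    f = np.maximum(conv3x3(x, w1), 0)
    f = conv3x3(f, w2)
    return np.maximum(f + x, 0)
```

With all-zero kernels the block reduces to the identity (for non-negative inputs), which illustrates why adding residual blocks cannot easily make the network worse than its shallower counterpart.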
Finally, the last three layers of our ResNet-18 were replaced by a new fully connected layer, a softmax layer, and a new classification output layer adapted to our dataset classes (changed, non-changed).

Experimental results and analysis
The transfer learning experiments were performed in MATLAB 2020a using the Deep Learning Toolbox model for the ResNet-18 network, on a 6-core Intel i5-9600K CPU at 4.5 GHz with two Nvidia GPUs: an RTX 2070 (8 GB) and a GTX 1050 Ti (4 GB).
The entire network, a modified version of ResNet-18, was trained in 1 minute and 13 seconds.
The ResNet-18 deep learning architecture was used as the basis of the transfer learning approach. After annotating the image patches extracted from the satellite image and labeling them as changed or non-changed areas, the dataset was augmented to 1104 patches before launching the deep learning algorithm. After 810 iterations over 10 epochs, using a mini-batch size of 10 images and a validation frequency of 100 iterations, an accuracy of 94.84 % was obtained. The training and testing examples were selected randomly: our dataset was divided into 70 % of the images for learning and 30 % for testing.

Use of dataset
To evaluate the effectiveness and performance of the proposed method, three real multi-temporal SAR datasets acquired by different sensors are exploited here. Geometric corrections and co-registration were performed on these datasets before applying the proposed method.
The first dataset is the Ottawa dataset, offered by Defence Research and Development Canada, Ottawa. It contains two SAR images with a size of 290 × 350 acquired by the RADARSAT SAR sensor. These images were registered by a specific algorithm in advance. The two images and the corresponding available ground truth are shown in fig. 3. The three datasets used here are exploited to study the performance of the proposed method: the method is applied to each dataset, and the obtained results are compared with the corresponding ground-truth image using two metrics, the image quality index (Q) and the edge preservation index (EPI) [21,22]. Fig. 6 shows the procedure used in this paper.
The first metric is the well-established image quality index Q [21], defined as

Q = (4 σ_xy x̄ ȳ) / ((σ_x² + σ_y²)(x̄² + ȳ²)),

where x̄ and ȳ are the means, σ_x² and σ_y² the variances, and σ_xy the covariance of the ground-truth image and the detected change image; Q equals 1 only when the two images are identical. The second metric measures edge preservation performance; it is called the edge preservation index (EPI) [22] and is defined as the ratio between the summed gradient magnitudes of the detected change image and of the ground-truth image, so that values close to 1 indicate well-preserved edges. The proposed method is compared with the deep cascade network (DCNet) [16] and the fuzzy c-means clustering method (FCM) [23]. The results obtained with our method, FCM, and DCNet are presented in fig. 7, fig. 8, and fig. 9 for the Ottawa, Mexico, and Shimen datasets, respectively. According to the qualitative results presented in fig. 7, fig. 8, and fig. 9 and the quantitative appraisal summarized in table 1, we can deduce that the proposed method detects the change accurately.
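The image quality index can be computed directly from the image statistics. A minimal sketch, following the standard Wang-Bovik definition of Q (assuming the paper uses the same global form rather than a sliding-window variant):

```python
import numpy as np

def quality_index(x, y):
    """Universal image quality index Q (Wang & Bovik).

    Combines correlation, mean (luminance) similarity, and variance
    (contrast) similarity in a single factor; Q = 1 iff x == y.
    """
    x = x.astype(float).ravel()
    y = y.astype(float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return 4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))
```

For example, comparing an image with a copy of itself gives Q = 1, while comparing it with a doubled-intensity copy gives Q = 0.64, since the mean and variance terms both penalize the luminance and contrast mismatch.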
According to the values in table 1, our method presents a high degree of correlation and similarity with the reference image, and this holds for all three datasets.

Application to a Sentinel 1 dataset
After validating the proposed method for change detection on three well-known datasets from the literature, we apply it to determine the change caused by flooding at the EL MANSOUR EDDAHBI dam, located in the south of Morocco near the city of Ouarzazate and constructed on the Draâ river, with coordinates between 30°55'23. After pre-processing the SAR images using the SNAP software developed by the European Space Agency (ESA), we apply the proposed method; the change caused by the flooding is shown in fig. 12.

Conclusion
In this paper, a network based on the transfer learning approach and a residual network is constructed to evaluate change from SAR images. Three well-known datasets are used to study the performance of the network using two powerful quantitative metrics: the image quality index and the edge preservation index.
The experimental results verify the validity and robustness of the proposed method compared with the two other well-known techniques used for comparison (DCNet and FCM). The method still needs some improvement for application in flood monitoring, which is the focus of our future work.