By capturing the thermal infrared radiation of objects, infrared sensors can effectively highlight salient targets even under extreme conditions, bad weather, and partial occlusion (Feng et al. 2020). However, infrared images cannot provide sufficient background information and lack texture details (Liu et al. 2019). In contrast, visible images, formed by visible-light sensors from reflected light, contain abundant texture details (Ma et al. 2020), but their imaging conditions are demanding and easily degraded by natural weather. The purpose of infrared and visible image fusion is therefore to combine the complementary information of the source images, so that the fused image contains both clear infrared targets and abundant texture details (Li et al. 2020). At present, infrared and visible image fusion is widely applied in military reconnaissance, target identification and tracking, security monitoring, agricultural production, remote sensing, etc. (Zhang et al. 2021).
Fusion methods for infrared and visible images fall into two main categories: conventional image fusion algorithms and deep-learning approaches. Traditional algorithms usually measure activity levels in the spatial or transform domain and design fusion rules manually. Classical conventional frameworks include those based on multi-scale transformation (Toet et al. 1989), sparse representation (Liu et al. 2015), subspace analysis (Kong et al. 2014), saliency (Ma et al. 2017), variational models (Ma et al. 2016), etc. Deep-learning-based image fusion can be subdivided into frameworks based on the autoencoder (AE) (Li et al. 2019), the convolutional neural network (CNN) (Liu et al. 2017), and the generative adversarial network (GAN) (Ma et al. 2019).
In conventional image fusion frameworks, the fusion rules are designed manually. Such hand-crafted rules cannot adapt to complex environments, and as they are refined to improve the fused image they grow increasingly intricate. This complexity has become a major obstacle for traditional image fusion algorithms. Deep learning, with its powerful ability to extract and represent features, offers new approaches to image fusion.
The AE-based image fusion framework first trains an autoencoder on a large dataset for feature extraction and image reconstruction. A fusion strategy is then designed manually to merge the features extracted by the encoder. To enhance the feature extraction capacity of autoencoders, Li et al. (2020) added nest-connection blocks to the AE-based framework. Liu et al. (2022) integrated an attention module into the AE-based framework so that the network pays more attention to key details of the source images during feature extraction.
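As a concrete illustration of such a manual strategy, a common choice in AE-based fusion is to weight the two encoder feature maps by their l1-norm activity level (a DenseFuse-style rule). The NumPy sketch below is a minimal version under that assumption; the feature shapes are chosen only for illustration:

```python
import numpy as np

def l1_activity_fusion(feat_ir, feat_vis, eps=1e-8):
    """Fuse two encoder feature maps of shape (C, H, W) with an
    l1-norm activity-level rule: per-pixel weights come from the
    channel-wise l1-norm of each feature map."""
    a_ir = np.abs(feat_ir).sum(axis=0)    # (H, W) activity of infrared features
    a_vis = np.abs(feat_vis).sum(axis=0)  # (H, W) activity of visible features
    w_ir = a_ir / (a_ir + a_vis + eps)    # soft weights in [0, 1]
    return w_ir[None] * feat_ir + (1.0 - w_ir[None]) * feat_vis

# Toy example: each fused value is a convex combination of the inputs.
ir = np.random.rand(8, 4, 4)
vis = np.random.rand(8, 4, 4)
fused = l1_activity_fusion(ir, vis)
print(fused.shape)  # (8, 4, 4)
```

Because the weights are in [0, 1], regions where the infrared features are more active dominate the fused map there, and vice versa for the visible features.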
The CNN-based image fusion framework achieves end-to-end feature extraction, aggregation, and image reconstruction by constructing a network architecture and a loss function, avoiding manually designed fusion rules. However, when only the last-layer features are used to reconstruct the image, many useful features extracted by the middle layers are lost. Li et al. (2019) used densely connected networks to minimize information loss during feature extraction. Long et al. (2021) proposed an infrared and visible image fusion approach using aggregated residual dense networks, which automatically evaluates how much source-image information is retained and extracts hierarchical features for effective fusion.
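To show how dense connections preserve intermediate features, here is a minimal NumPy sketch; the toy "layers" and channel counts are hypothetical stand-ins for convolutions, not any cited architecture:

```python
import numpy as np

def dense_block(x, layer_fns):
    """Sketch of a densely connected block: each layer receives the
    channel-wise concatenation of the input and all earlier layer
    outputs, so early features are never discarded."""
    feats = [x]
    for fn in layer_fns:
        feats.append(fn(np.concatenate(feats, axis=0)))
    return np.concatenate(feats, axis=0)

# Toy "layers": each maps (C, H, W) -> (4, H, W), standing in for a conv.
mk = lambda: (lambda z: np.tanh(z[:4]))
x = np.random.rand(4, 8, 8)
y = dense_block(x, [mk(), mk(), mk()])
print(y.shape)  # (16, 8, 8): the input plus three 4-channel layer outputs
```

The final concatenation carries the input and every intermediate output forward, which is exactly why dense connections mitigate the loss of middle-layer features.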
In 2019, Ma et al. (2019) applied GANs to infrared and visible image fusion for the first time, treating fusion as a game between a generator and a discriminator. However, a single discriminator easily causes modal imbalance during feature extraction, so the fused image may lose either the infrared target information or the texture details of the source images. Ma et al. (2020) proposed dual discriminators to maintain balance between the different modalities in the fused image. Li et al. (2021) integrated an attention module into the GAN fusion framework, making the generator and discriminators attend to the important information in the source images and producing fused images with more prominent targets. Liu et al. (2022) proposed the joint optimization of image fusion and object detection, achieving high detection accuracy and better visual quality of the fused image.
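The generator-discriminator game can be made concrete with a small sketch. Under a common formulation (an assumption here, not the exact losses of the cited works), each discriminator outputs a probability and the generator's adversarial loss rewards fooling both of them:

```python
import numpy as np

def bce(pred, label, eps=1e-8):
    """Binary cross-entropy for discriminator scores in (0, 1)."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(label * np.log(pred) + (1 - label) * np.log(1 - pred))

# Dual-discriminator game (sketch): D_ir judges whether an image looks
# infrared, D_vis whether it looks visible; the generator tries to fool
# both, so neither modality dominates the fused result.
d_ir_on_fused = np.array([0.4, 0.6])   # hypothetical discriminator outputs
d_vis_on_fused = np.array([0.5, 0.3])
g_adv_loss = bce(d_ir_on_fused, 1.0) + bce(d_vis_on_fused, 1.0)
print(g_adv_loss)
```

If the generator fooled both discriminators perfectly (outputs near 1), this loss would approach zero; with a single discriminator, only one of the two terms would constrain the generator, which is the modal-imbalance problem noted above.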
The above fusion methods are widely used in a variety of scenarios, and the resulting fused images achieve good visual quality. However, some problems remain, such as the unbalanced representation of infrared targets and texture features in fused images, and unclear texture details.
To address these problems, we propose a pseudo-color infrared and visible image fusion method based on an attention-dense network. First, the pseudo-color-processed infrared image and the color visible image are used for feature extraction, feature aggregation, and image reconstruction to train the model; three-channel inputs carry more information, so the fused image retains more texture details. Second, we design a generator composed of convolutional layers and densely connected blocks with attention modules. The attention modules focus on key information in the source images, such as infrared targets and texture details, while the dense connections reduce information loss during feature extraction and strengthen the network's ability to extract source-image information. Finally, a content loss function is introduced to balance the infrared target and texture detail information in the fused image.
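A content loss of this general kind can be sketched as follows. This minimal NumPy illustration pairs an intensity term (pulling the fused image toward the infrared image) with a gradient term (pulling its texture toward the visible image); the weights alpha and beta and the forward-difference gradient are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def grad(img):
    """Forward-difference gradient magnitude, a simple texture proxy."""
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    return np.abs(gx) + np.abs(gy)

def content_loss(fused, ir, vis, alpha=1.0, beta=1.0):
    # Intensity term: keep the fused image close to the infrared image,
    # preserving salient thermal targets.
    intensity = np.mean((fused - ir) ** 2)
    # Texture term: match the fused image's gradients to those of the
    # visible image, preserving texture details.
    texture = np.mean((grad(fused) - grad(vis)) ** 2)
    return alpha * intensity + beta * texture
```

Tuning alpha against beta trades off thermal saliency for texture fidelity, which is the balance the content loss is meant to stabilize.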