Principle of the data acquisition system
This section describes how the reflection spectrum of a target sample is measured under sunlight; a simplified schematic is shown in Fig. 1. When light illuminates the target sample, the measurement is described by the following transformation:
$$\int L\left(\lambda \right)S\left(\lambda ,X,Y\right)T\left(\lambda \right)D\left(\lambda \right)d\lambda =I\left(X,Y\right)$$
1
which converts the scene into encoded input data that can be fed into the algorithm model. In this formula, \(L\left(\lambda \right)\) represents the daylight spectral information, \(S\left(\lambda ,X,Y\right)\) is the detection target's reflectance spectrum, where \(\left(X,Y\right)\) are the two-dimensional position coordinates of the single-point spectrum, \(D\left(\lambda \right)\) is the detection camera's response function, \(T\left(\lambda \right)\) is the designed encoding matrix, and \(I\left(X,Y\right)\) is the encoded intensity at the corresponding position. In this conversion process, the designed encoding matrix is first placed in front of the camera, which measures the spectral information of the light reflected by the sample \(S\left(\lambda ,X,Y\right)\); the spectral intensity entering the camera is therefore \(L\left(\lambda \right)S\left(\lambda ,X,Y\right)\). The encoded intensity \(I\left(X,Y\right)\) is then obtained through the encoding matrix \(T\left(\lambda \right)\) and the camera's response function \(D\left(\lambda \right)\). The designed encoding matrix can be fabricated as a chip that fits directly onto the camera, making spectral measurement more convenient.
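As an illustration, a minimal numerical sketch of Eq. (1) for a single pixel is given below. The wavelength grid (8–12 um, 0.02 um steps) follows the paper; the spectra used here are placeholders, not the actual measured quantities.

```python
import numpy as np

# Discretized evaluation of Eq. (1) at one pixel (X, Y).
wavelengths = np.arange(8.0, 12.0, 0.02)   # 200 wavelength samples (um)
L = np.ones_like(wavelengths)              # daylight spectrum L(lambda), placeholder
S = np.random.rand(wavelengths.size)       # sample reflectance S(lambda, X, Y), placeholder
T = np.random.rand(wavelengths.size)       # one encoding (filter) function T(lambda)
D = np.ones_like(wavelengths)              # camera response D(lambda), placeholder

# Encoded intensity I(X, Y): numerical integral of the product over wavelength.
I_xy = np.trapz(L * S * T * D, wavelengths)
```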
Filter function design and construction of data set
The data construction in this study is divided into two parts. To build an accurate spectral reconstruction model, suitable input and output data need to be collected. The first part of the input data is a complete library of filter functions simulated by FDTD; an expectation value of 0.85 was set as the threshold condition for an initial screening, after which the cross-correlation function was applied:
$$\rho \left(X,Y\right)=\frac{Cov(X,Y)}{\sqrt{D\left(X\right)D\left(Y\right)}}=\frac{E\left(XY\right)-E\left(X\right)E\left(Y\right)}{\sqrt{E\left({X}^{2}\right)-{E}^{2}\left(X\right)}\sqrt{E\left({Y}^{2}\right)-{E}^{2}\left(Y\right)}}$$
2
\(X,Y\) denote different curves, \(Cov(X,Y)\) is the covariance of the \(X,Y\) curves, and the result \(\rho\) is a scalar in the range \(\rho \in \left[-1,1\right]\). When \(\rho >0\), the curves are positively correlated, and the closer \(\rho\) is to 1, the stronger the correlation; when \(\rho <0\), the curves are negatively correlated, and the closer \(\rho\) is to -1, the stronger the correlation; the closer the correlation coefficient is to 0, the weaker the correlation, and it equals 0 when the two variables are uncorrelated [33]. Using this formula, the correlation between each pair of the initially screened filter functions was calculated. Finally, the 100 filter functions whose correlation coefficients have absolute values closest to 1 were selected and arranged into a 10×10 coding array.
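A brief sketch of this screening step is shown below. The array `filters` stands in for the FDTD-simulated filter library after the initial threshold screen (its size is illustrative); `np.corrcoef` computes the same covariance-based coefficient as Eq. (2).

```python
import numpy as np

# Pairwise correlation coefficients between candidate filter curves (Eq. 2).
filters = np.random.rand(500, 200)          # 500 candidate curves, 200 wavelength points
rho = np.corrcoef(filters)                  # rho[i, j]: correlation between curves i and j
abs_rho = np.abs(rho - np.eye(len(rho)))    # drop the trivial self-correlation of 1

# The pairwise |rho| values can then be ranked to pick the 100 filters
# arranged into the 10x10 coding array.
```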
In the second part of the experiment, spectral simulations were first performed using advanced basis functions to generate a series of spectral curves. These simulated curves were produced by adding varying degrees of Gaussian noise to the actual spectral data, at noise levels of 5%, 10%, 15%, and 20%. In total, 20,000 such simulated spectra were generated to ensure broad coverage of noise levels and spectral features.
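A minimal sketch of this simulation step is given below, assuming Gaussian basis functions (the exact basis used in the paper is not specified here); the band, channel count, and noise levels follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
wavelengths = np.arange(8.0, 12.0, 0.02)    # 200 spectral channels

def simulate_spectrum(noise_level):
    # Random superposition of Gaussian basis functions plus Gaussian noise.
    centers = rng.uniform(8.0, 12.0, size=5)
    widths = rng.uniform(0.1, 0.8, size=5)
    amps = rng.uniform(0.2, 1.0, size=5)
    clean = sum(a * np.exp(-0.5 * ((wavelengths - c) / w) ** 2)
                for a, c, w in zip(amps, centers, widths))
    noisy = clean + noise_level * clean.max() * rng.standard_normal(wavelengths.size)
    return noisy

# Spectra generated at the four stated noise levels (counts here are illustrative).
spectra = [simulate_spectrum(nl) for nl in (0.05, 0.10, 0.15, 0.20) for _ in range(5)]
```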
Next, these simulated spectra were encoded into input data using the pre-constructed filter-function coding matrix. The following formula converts a simulated spectrum into encoded data:
$${I}_{i}={\int }_{{\lambda }_{1}}^{{\lambda }_{2}}{T}_{i}\left(\lambda \right)\left(f\left(\lambda \right)+{e}_{i}\right)d\lambda \approx \sum _{j=1}^{M}{T}_{i}\left({\lambda }_{j}\right)\left(f\left({\lambda }_{j}\right)+{e}_{i}\right),\quad i=1,\cdots ,N$$
3
The input encoded data were obtained in this manner, where \({T}_{i}\left(\lambda \right)\) is the encoding matrix, \(f\left(\lambda \right)\) is the spectral data, and \({e}_{i}\) is the noise level. The integral can be decomposed into a sum over discrete points, where M is the number of sampled wavelengths within the measurement band and N is the number of encodings. This step was designed to simulate the encoding and reconstruction of spectral data in real applications. Through this process, a total of 100,000 simulated data were obtained, covering both the training and validation sets.
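In discretized form, Eq. (3) reduces to a weighted sum of the noisy spectrum for each filter, as sketched below. The array sizes follow the text (N = 100 encodings, M = 200 wavelength samples); the spectrum and noise values are placeholders.

```python
import numpy as np

N, M = 100, 200
T = np.random.rand(N, M)                    # encoding matrix, row i = T_i(lambda_j)
f = np.random.rand(M)                       # simulated spectrum f(lambda_j)
e = 0.05 * np.random.randn(N)               # noise term e_i for each measurement

# I_i = sum_j T_i(lambda_j) * (f(lambda_j) + e_i), i = 1..N
I = np.array([T[i] @ (f + e[i]) for i in range(N)])
```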
To ensure the accuracy and robustness of the model, these 100,000 simulated data were divided into training and validation sets in a ratio of 8:2. It is worth noting that the selection of the validation set was done randomly to ensure extensive testing of different spectral data.
Finally, an additional 1000 simulated spectral data that were not in the training and validation sets were transformed to serve as an independent test set to evaluate the accuracy and performance of the model on unseen data. This step is important to test the performance of our model in real-world applications.
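A short sketch of the 8:2 split described above is given below; `X` are the encoded inputs and `y` the corresponding spectra, and the arrays here are placeholders with the sizes stated in the text.

```python
import numpy as np

X = np.random.rand(100_000, 100)            # encoded inputs (placeholder)
y = np.random.rand(100_000, 200)            # target spectra (placeholder)

idx = np.random.permutation(len(X))         # random assignment to the validation set
n_train = int(0.8 * len(idx))
X_train, y_train = X[idx[:n_train]], y[idx[:n_train]]
X_val, y_val = X[idx[n_train:]], y[idx[n_train:]]
# A further 1000 independently simulated and encoded spectra form the test set.
```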
The framework of the neural network model
The neural network architecture for spectral reconstruction can be expressed compactly as
\(FC\left(100\right)\to LR\to Self-Attention\left(100\right)\to FC\left(120\right)\to LR\to FC\left(150\right)\to LR\to FC\left(200\right)\to LR\). Each digit represents the number of units in the corresponding layer. LR denotes the ReLU activation function and FC denotes a fully connected layer; the 100 input units correspond to the 100 random spectral filters, and 200 denotes the number of reconstructed spectral channels (8–12 um, 0.02 um steps). In this model, a self-attention mechanism and residual connections are used during training. To extract more relevant features, the input data are first passed through a self-attention operation, and residual connections are then incorporated in each neural layer to limit parameter explosion and overfitting during training. Denoising in conventional compressed-sensing algorithms, by contrast, relies heavily on prior knowledge, and parameters are typically updated manually throughout the iterative process to offset the bias introduced by noise; this is effective but does not yield convincing results when the noise level varies. Regularization parameters are therefore introduced into the training model to improve its robustness [31, 32].
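A minimal PyTorch sketch of this architecture is shown below. The linear projections used to match dimensions across residual connections, and the treatment of each 100-dimensional encoded vector as a single-token sequence for the attention step, are assumptions made for illustration; layer sizes and the ReLU activation follow the text.

```python
import torch
import torch.nn as nn

class ResidualFC(nn.Module):
    """Fully connected block with a residual connection (projection if sizes differ)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        self.act = nn.ReLU()
        self.proj = nn.Linear(in_dim, out_dim) if in_dim != out_dim else nn.Identity()

    def forward(self, x):
        return self.act(self.fc(x)) + self.proj(x)

class SpectralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_in = ResidualFC(100, 100)
        self.attn = nn.MultiheadAttention(embed_dim=100, num_heads=1, batch_first=True)
        self.fc1 = ResidualFC(100, 120)
        self.fc2 = ResidualFC(120, 150)
        self.fc_out = ResidualFC(150, 200)    # 200 reconstructed channels (8-12 um, 0.02 um step)

    def forward(self, x):                     # x: (batch, 100) encoded intensities
        h = self.fc_in(x)
        a, _ = self.attn(h.unsqueeze(1), h.unsqueeze(1), h.unsqueeze(1))
        h = h + a.squeeze(1)                  # residual around the attention block
        return self.fc_out(self.fc2(self.fc1(h)))
```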
Self-Attention is computed in the following steps:
1. Generate Query, Key and Value:
First, the query vector (Q), key vector (K), and value vector (V) are generated by a linear transformation of the input data (usually using a weight matrix).
Generating these vectors usually involves the multiplication of weight matrices to map the input data into a low-dimensional representation space for subsequent computation.
2. Compute the attention scores:
Next, the attention scores between each query (Q) and each key (K) are computed, usually using the dot-product method. For a single query vector, the dot product is computed with all key vectors, and a scaling factor is then applied to control the range of the scores. The formula is as follows:
$$Attention\left(Q,K\right)=Softmax\left(\frac{Q{K}^{T}}{\sqrt{{d}_{k}}}\right)$$
4
where Q is the query vector, K is the key vector, and \({d}_{k}\) is the dimension of the key vector. The scores are converted to a probability distribution by applying the Softmax function.
3. Weighted Summation:
The computed attention scores are used to perform a weighted summation over the value vectors (V), generating the final self-attention representation:
$$Self-Attention\left(Q,K,V\right)=Attention\left(Q,K\right)*V$$
5
This step takes a weighted sum of the value vectors based on the attention scores to produce the self-attention representation. Finally, the reconstructed spectral data are derived from the self-attention values in the subsequent fully connected layers. A residual connection is added at each fully connected layer to prevent the gradient from vanishing during training. The result is shown in Fig. 2B.
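The three steps above correspond to the short sketch below, which projects the input into Q, K, and V, computes the scaled dot-product scores of Eq. (4), and takes the weighted sum of Eq. (5); the projection sizes are illustrative.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (batch, seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) weight matrices
    Q, K, V = x @ w_q, x @ w_k, x @ w_v               # step 1: linear projections
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5     # step 2: scaled dot products (Eq. 4)
    weights = F.softmax(scores, dim=-1)               # convert scores to probabilities
    return weights @ V                                # step 3: weighted sum of values (Eq. 5)

x = torch.randn(4, 1, 100)                            # e.g. one token of 100 encoded values
w = [torch.randn(100, 100) for _ in range(3)]
out = self_attention(x, *w)                           # shape (4, 1, 100)
```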
Training and validation
In our model training process, we chose the mean square error (MSE) as the loss function, a loss commonly used in regression problems. The MSE measures the average squared difference between the model predictions and the actual observations. By minimizing the MSE, the model can better fit the spectral data and ensure that the resulting reconstruction is as close as possible to the real spectrum.
To improve the stability and speed of training, batch normalization was applied. Batch normalization normalizes the inputs of each layer in the neural network, keeping the data distribution stable and reducing exploding or vanishing gradients. This is very useful when training deep neural networks and helps the model converge faster. The Adam optimization algorithm was chosen to tune the model parameters; it combines the momentum method with an adaptive learning rate and usually converges faster.
In addition, an appropriate learning rate was set for model training to ensure that the weights were updated properly and to avoid the model falling into local minima or diverging. Meanwhile, to avoid overfitting, two regularization methods were used, namely L1 regularization and dropout. L1 regularization drives the model toward sparsity and reduces the risk of overfitting by penalizing the model's weight parameters. Dropout, in turn, randomly discards some neurons during training to prevent the model from relying too heavily on the training data and to improve generalization performance.
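The training choices described above are summarized in the sketch below: MSE loss, Adam optimizer, and an L1 penalty added to the loss, with dropout and batch normalization assumed to sit inside the fully connected blocks. The learning rate, L1 weight, and layer sizes here are illustrative, not the values used in the paper.

```python
import torch
import torch.nn as nn

# Toy model standing in for the reconstruction network.
model = nn.Sequential(nn.Linear(100, 200), nn.BatchNorm1d(200), nn.ReLU(),
                      nn.Dropout(0.2), nn.Linear(200, 200))
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
l1_weight = 1e-5                               # strength of the L1 penalty (illustrative)

def train_step(x, y):
    optimizer.zero_grad()
    pred = model(x)
    l1_penalty = sum(p.abs().sum() for p in model.parameters())
    loss = criterion(pred, y) + l1_weight * l1_penalty
    loss.backward()
    optimizer.step()
    return loss.item()
```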
To verify the performance and practical value of the model, we used actual spectral data measured by an optical spectrometer. These data have real physical significance and reflect the nature of actual spectral signals, ensuring that the model can accurately reconstruct spectra encountered in real environments and demonstrating its practical usability and effectiveness. This experimental design helps generalize the research results to practical applications and provides solutions to practical problems in related fields.
Spectral reconstruction performance indicators
Two related metrics were used to evaluate the reconstruction. The first is the R2 similarity function:
$${R}^{2}=1-\frac{\sum _{i=1}^{n}{\left({y}_{i}-{Y}_{i}\right)}^{2}}{\sum _{i=1}^{n}{\left({Y}_{i}-\stackrel{-}{Y}\right)}^{2}}$$
6
where \({y}_{i}\) is the reconstructed spectrum's intensity value, \({Y}_{i}\) is the simulated spectrum's intensity value, and \(\stackrel{-}{Y}\) is the simulated spectrum's average intensity value; as well as MSE (mean square error function):
$$MSE=\frac{1}{n}\sum _{i=1}^{n}{\left({y}_{i}-{Y}_{i}\right)}^{2}$$
7
where \({y}_{i}\) is the intensity value of the reconstructed spectrum and \({Y}_{i}\) is the intensity value of the simulated spectrum.
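Both metrics follow directly from Eqs. (6) and (7), as in the short sketch below, where `y` is the reconstructed spectrum and `Y` the reference (simulated or measured) spectrum.

```python
import numpy as np

def r_squared(y, Y):
    # Eq. (6): coefficient of determination between reconstruction and reference.
    return 1.0 - np.sum((y - Y) ** 2) / np.sum((Y - Y.mean()) ** 2)

def mse(y, Y):
    # Eq. (7): mean squared error between reconstruction and reference.
    return np.mean((y - Y) ** 2)
```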