We conducted experiments on five drug-discovery benchmark datasets sourced from MoleculeNet17 and breast cancer cell lines18: BACE, Blood-Brain Barrier Penetration (BBBP), Side Effect Resource (SIDER), BCAP37, and T-47D. BACE provides binding results for 1522 inhibitors of human \(\beta\)-secretase 1. BBBP includes 2053 molecules labeled for blood-brain barrier permeability. SIDER pairs 1427 approved drugs with their adverse drug reactions, categorized into system organ classes. The BCAP37 and T-47D breast cancer cell line datasets contain 275 molecules of the triple-negative breast cancer (TNBC) subtype and 3135 molecules of the Luminal A subtype, respectively. For each molecule, we converted the SMILES representation to an ECFP molecular fingerprint19 using the RDKit chemoinformatics toolkit20, with a radius of 6 and 1024 bits.
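For reference, the fingerprint step can be reproduced with RDKit as sketched below; the helper name and the NumPy conversion are ours, while the radius (6) and bit length (1024) match the settings above.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem

def smiles_to_ecfp(smiles, radius=6, n_bits=1024):
    """Convert a SMILES string to a fixed-length ECFP bit vector."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:                      # unparsable SMILES
        return None
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    return np.array(fp, dtype=np.float32)

x = smiles_to_ecfp("CC(=O)Oc1ccccc1C(=O)O")  # aspirin -> 1024-dim vector
```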
The long short-term memory (LSTM) network is a classic machine learning model that has been widely applied across various domains and industries because it handles sequential data effectively. The QLSTM model, its quantum counterpart, replaces the classical neural networks in the LSTM cells with Variational Quantum Circuits (VQCs) (Fig. 1). A VQC consists of three main components: data encoding, a variational layer, and quantum measurement. The data-encoding circuit transforms classical vectors into quantum states. The variational layer carries the circuit parameters, which are the learnable components updated by gradient-descent algorithms. Finally, quantum measurements retrieve classical values for subsequent processing. The QLSTM model is defined as:
$$f_{t}=Sigmoid\left({VQC}_{1}\left(v_{t}\right)\right)\tag{1}$$

$$i_{t}=Sigmoid\left({VQC}_{2}\left(v_{t}\right)\right)\tag{2}$$

$$\tilde{C}_{t}=Tanh\left({VQC}_{3}\left(v_{t}\right)\right)\tag{3}$$

$$c_{t}=f_{t}*c_{t-1}+i_{t}*\tilde{C}_{t}\tag{4}$$

$$o_{t}=Sigmoid\left({VQC}_{4}\left(v_{t}\right)\right)\tag{5}$$

$$h_{t}={VQC}_{5}\left(o_{t}*Tanh\left(c_{t}\right)\right)\tag{6}$$
where \(Sigmoid\) and \(Tanh\) are the activation functions, \(f_{t}\) is the forget gate, \(i_{t}\) is the input gate, \(o_{t}\) is the output gate, \(\tilde{C}_{t}\) is the candidate cell state, \(c_{t}\) is the cell state, \(v_{t}\) is the concatenation of the input at step \(t\) and the hidden state at step \(t-1\), and \(h_{t}\) is the hidden state of the QLSTM model.
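To make the construction concrete, the sketch below shows a minimal VQC (data encoding, variational layer, measurement) and one QLSTM step implementing Eqs. (1)-(6). It is an illustrative sketch written against the TorchQuantum API, not our exact implementation: the gate layout, the default of 4 qubits, and the classical linear projections used to match dimensions around each quantum core are assumptions.

```python
import torch
import torch.nn as nn
import torchquantum as tq
import torchquantum.functional as tqf

class VQC(tq.QuantumModule):
    """Data encoding -> variational layer -> Pauli-Z measurement."""

    def __init__(self, n_wires):
        super().__init__()
        self.n_wires = n_wires
        # encode one classical feature per qubit as an RY rotation angle
        self.encoder = tq.GeneralEncoder(
            [{"input_idx": [i], "func": "ry", "wires": [i]}
             for i in range(n_wires)])
        # trainable rotations: the learnable parameters of the circuit
        self.rots = tq.QuantumModuleList(
            [tq.RX(has_params=True, trainable=True) for _ in range(n_wires)])
        self.measure = tq.MeasureAll(tq.PauliZ)

    def forward(self, x):
        qdev = tq.QuantumDevice(n_wires=self.n_wires, bsz=x.shape[0],
                                device=x.device)
        self.encoder(qdev, x)
        for wire, rot in enumerate(self.rots):
            rot(qdev, wires=wire)
        for wire in range(self.n_wires):   # ring of CNOTs entangles qubits
            tqf.cnot(qdev, wires=[wire, (wire + 1) % self.n_wires])
        return self.measure(qdev)          # expectation values in [-1, 1]

def make_vqc(in_dim, out_dim, n_qubits=4):
    # classical projections (an assumption) match dimensions around the
    # quantum core
    return nn.Sequential(nn.Linear(in_dim, n_qubits),
                         VQC(n_qubits),
                         nn.Linear(n_qubits, out_dim))

class QLSTMCell(nn.Module):
    """One QLSTM step following Eqs. (1)-(6)."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # VQC_1..VQC_4 act on v_t; VQC_5 acts on the gated cell state
        self.vqc = nn.ModuleList(
            [make_vqc(input_size + hidden_size, hidden_size)
             for _ in range(4)]
            + [make_vqc(hidden_size, hidden_size)])

    def forward(self, x_t, h_prev, c_prev):
        v_t = torch.cat([h_prev, x_t], dim=-1)     # v_t = [h_{t-1}; x_t]
        f_t = torch.sigmoid(self.vqc[0](v_t))      # forget gate, Eq. (1)
        i_t = torch.sigmoid(self.vqc[1](v_t))      # input gate, Eq. (2)
        C_t = torch.tanh(self.vqc[2](v_t))         # candidate state, Eq. (3)
        c_t = f_t * c_prev + i_t * C_t             # cell state, Eq. (4)
        o_t = torch.sigmoid(self.vqc[3](v_t))      # output gate, Eq. (5)
        h_t = self.vqc[4](o_t * torch.tanh(c_t))   # hidden state, Eq. (6)
        return h_t, c_t
```

The classical LSTM baseline is recovered by replacing each `make_vqc` call with a single linear layer.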
To demonstrate the robustness of the QLSTM model, all models, including the classical ones, were trained with three different split seeds, and the average validation accuracy across these splits was used to evaluate model performance. We selected the Adam algorithm21 as the optimizer, with learning rates ranging from 0.1 to 0.001, a batch size of 256, and 100 training epochs.
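This setup corresponds to a standard PyTorch training loop of the following shape; the loop is a sketch, with the binary cross-entropy loss and the specific learning rate within the stated range as assumptions.

```python
import torch

def train(model, train_loader, lr=1e-3, epochs=100):
    # Adam optimizer; lr chosen from the stated range [0.001, 0.1]
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.BCEWithLogitsLoss()   # assumed binary-task loss
    for epoch in range(epochs):
        for x, y in train_loader:            # batches of size 256
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model
```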
We also compared the performance of the QLSTM model under different levels of added noise. To evaluate whether the noisy QLSTM model is suited to real NISQ devices, we used the following score function \(s\)22, 23 to estimate the overall error rate of the QLSTM model on real quantum computers:
$$s=1-\prod_{j=1}^{d}\left(1-\left(\frac{\sum_{i}E_{r_{i}}N_{i}}{\sum_{i}N_{i}}\right)_{j}\right)^{m_{j}}\tag{7}$$
where \(N_{i}\) is the number of gates of type \(i\), \(E_{r_{i}}\) is the corresponding error rate of that gate type, \(d\) is the depth of the quantum circuit, the term \(\left(\frac{\sum_{i}E_{r_{i}}N_{i}}{\sum_{i}N_{i}}\right)_{j}\) is the average error rate in the \(j\)th layer, and \(m_{j}\) is the number of gates at circuit layer \(j\).
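Equation (7) can be transcribed directly into code; in the sketch below, the per-layer `(counts, errors)` representation of the circuit is a hypothetical input format of ours.

```python
import numpy as np

def nisq_score(layers):
    """Estimate the overall circuit error rate s of Eq. (7).

    `layers` holds one (gate_counts, error_rates) pair per circuit layer
    j, where gate_counts[i] is the number N_i of gates of type i and
    error_rates[i] is the per-gate error E_{r_i} on the target device.
    """
    complement = 1.0
    for counts, errors in layers:
        counts = np.asarray(counts, dtype=float)
        errors = np.asarray(errors, dtype=float)
        avg_err = (errors * counts).sum() / counts.sum()  # layer-avg error
        m_j = counts.sum()                                # gates in layer j
        complement *= (1.0 - avg_err) ** m_j
    return 1.0 - complement

# e.g. two layers: 4 single-qubit gates + 2 CNOTs, then 3 CNOTs
s = nisq_score([([4, 2], [1e-4, 1e-2]), ([0, 3], [1e-4, 1e-2])])
```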
All experiments were performed on an NVIDIA A100 GPU in a 64-bit CentOS 8.5 server with 512 GB of RAM. The source code was written in PyTorch, using TorchQuantum as the quantum simulator. Our models have not yet been implemented on quantum hardware, but the proposed models and circuits are designed to be easily adaptable to NISQ devices. Due to equipment limitations, the numbers of qubits used in the QLSTM model comparisons were 2, 4, 8, and 12.
Table 1
Validation accuracy of the QLSTM (quantum) and LSTM (classical) models on the BACE, BBBP, SIDER, BCAP37, and T-47D datasets with 2, 4, 8, and 12 qubits.
| Qubits | BACE quantum | BACE classical | BBBP quantum | BBBP classical | SIDER quantum | SIDER classical | BCAP37 quantum | BCAP37 classical | T-47D quantum | T-47D classical |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 | 0.819 | 0.828 | 0.790 | 0.829 | 0.618 | 0.654 | 0.760 | 0.751 | 0.735 | 0.714 |
| 4 | 0.831 | 0.827 | 0.828 | 0.832 | 0.645 | 0.656 | 0.774 | 0.723 | 0.775 | 0.780 |
| 8 | 0.827 | 0.817 | 0.829 | 0.838 | 0.684 | 0.659 | 0.774 | 0.712 | 0.787 | 0.783 |
| 12 | 0.842 | 0.838 | 0.843 | 0.848 | 0.693 | 0.680 | 0.806 | 0.727 | 0.789 | 0.786 |