4.1 Quantum Accelerator: Simulation Methods
The model circuit of the QMap routine is composed of two basic component logic subcircuits. The first sub-circuit A of the dot-plot generator is executed once per QMap call. Since the setting up of the dot-matrix binary structure is accomplished, sub-circuit B will estimate the Hamming distance. Both constructions are based on well-tested sub-circuit implementations which have been simulated individually to validate the results of their combination. A universal gate-set is adopted to support all the circuitry. All the simulations are conducted with the aid of the Qiskit framework.
4.1.1 Encoding and Dot-Plot Generator
Embedding classical information into quantum states of the Hilbert space is a critical issue for any algorithmic procedure, as the adopted quantum feature mapping scheme will affect the performance of the overall algorithm in terms of time consumption. This issue is discussed in J. Clapis’s work [16] for the exponentially difficult base encoding scheme, which is admittedly a well-known problem. The most prominent theoretical encoding schemes are the basis [17], the angle [18], and the amplitude [19] encoding schemes. All of them are exponentially difficult, but there is a great utility in estimating the circuit depth and run time. Our analysis is based on the basis encoding scheme which is oriented for numerical calculations. Similar performance seems to be obtained by the application of the angle encoding scheme. The more compact way of amplitude encoding may be more advantageous in terms of qubits, but algorithmic operations are not compatible with this type of encoding. Furthermore, the instantaneous quantum polynomial protocol [20] could be applied but its restrictive non-universal model is prohibitive in our case. A comparison is given among these schemes in Table 1. Other promising ways to encode and manipulate quantum information may be employed in the near term inspired by QAOA ansatz [21].
Table 1. Comparison in the characteristics of known embedding schemes. The asterisk (*) denotes non-squared dot-plots. Lq is the length of the query sequence and Lr is the length of the reference. Symbol d is the color depth.
Embedding Schemes
|
Basis
|
Angle
|
Amplitude*
|
Inst. Q. Polynomial*
|
Qubit resource
|
d+2Lr
|
1+2Lr
|
2Lr
|
2Lr
|
Pixel-value qubit
|
d
|
1
|
0
|
0
|
Pixel-value encoding
|
Basis of qubits
|
Angle of qubits
|
Probability amplitude
|
Feature Mapping
|
Complexity
|
O(2Lq2Lr)
|
≥O(log2(LqLr))
|
≥O(log2(LqLr))
|
O(2Lq2Lr)
|
J. Clapis aptly pointed out that in the NEQR encoding scheme (a.k.a. basis encoding), the Espresso algorithm can reduce multi-controlled-X gates by an average of 38.26% allowing a 72% to 74% circuit depth reduction.
The dot-matrix generator sub-circuit is comprised of the Encoder and the Entangler as it is depicted in Figure 6. The Encoder gets a classical dataset of two strings to embed it into qubits. The Entangler makes the preparation of the values of the dot-matrix as a superposition and the measurement applies the operation XOR in a quantumly parallel way. This procedure is extensively explained by Clapis [16] providing a feasible solution to the previous problem of the Black-Box QDP considered in [12] – despite its limitations due to the NEQR encoding scheme.
Once the dot-matrix is created, the n+m qubits output the whole information of the generated matrix feeding multiple times Subcircuit-B, for as many times as it is needed to scan all the acceptable diagonals of the currently scanning window. Then, the whole process is repeated for the next dot-plot window.
4.1.2 Hamming Distance Estimation
Subcircuit-B gets the generated dot-matrix as input and tries to detect diagonal formations of consecutive dots upon it. Only one or more than one diagonal may be tested in a single run. It depends on the adopted scanning policy – further information about these policies can be found in the next section. In this model, we consider the single diagonal scanning policy. Thus, the output is the number of dots detected in the tested diagonal. Subcircuit-B is composed of three basic subcircuits: the trainable pattern subcircuit, the AND gate subcircuit, and the counting subcircuit.
Before the Hamming distance estimation for each diagonal, a new matrix of equal dimension to the already generated dot-plot is prepared with a perfect diagonal d[i] at index i to get ANDed with the generated dot-matrix from Subcircuit-A. This operation will return as a target output a matrix of equal dimension where each cell-value will be the AND result between the corresponding matrix cells of the two input control matrices. This way will isolate common 1’s in the selected diagonal throughout the whole matrix. The number of 1’s (or dots) on the perfect diagonal left will be counted by the well-known quantum counting algorithm. The result will reveal the degree of similarity at the indexed diagonal. It is obvious that the second subcircuit is connected to the first by a large Multi-(Anti)Controlled-Multi-X gate (MCMX). Such a gate is a complex gate.
Obviously, Subcircuit-B imposes a significant computing overhead which must be suppressed as much as possible. In the next section, the adopted policy scanning may provide significant suppression at a cost of the detection sensitivity. The alignment type may allow faster alternative policies. Another point to improve is the AND gate which makes use of programmable logic like the basis encoding scheme. Thus, similarly to the NEQR technique, Espresso algorithm can reduce the AND subcircuit implementation up to a significant degree. Last but not least, some interesting quantum algorithms have also been proposed for the estimation of the Hamming distance [22-24] but are not compatible with the proposed protocol.
4.2 Scanning Policy Modes
Typically, it is possible to scan matrix diagonals in a dot-plot in more than one way, saving a lot of runtimes. Among the multiple ways, there are two prevalent scanning protocols to be considered. The first scenario is the single diagonal scanning policy A, where each “acceptable” diagonal requires one execution of Subcircuit B. Thus, a dot-plot window will run the repetition of Subcircuit B as many times as the threshold-allowed diagonals in the dot-plot window. The second scenario is the parallel diagonal scanning policy B, where each pair of threshold-allowed diagonals requires again one execution of Subcircuit-B. In this case, a dot-plot window will run half the repetitions of Subcircuit B in comparison to Policy A. Alternative policies could be proposed by using more augmented diagonal schemes of more than two diagonals at a time but at the cost of reduced detection sensitivity. An illustrative example follows to help realize the two suggested policies.
In the 16x16 dot-plot example in Figure 8, it is considered that query and reference sequences have the same length generating a squared dot-plot and the threshold similarity is configured at 50%. Policy A requires the execution of Subcircuit-B 16 times in total, while policy B requires 8 applications making a reduction of 50%.
Although that the second policy provides a sensitivity reduction on the detection of the right number of 1’s, it is assumed that statistically it is rare to miss a double diagonal occurrence in the same dot-plot in accordance with the pattern type we want to recognize. Even if double diagonals exist or more dense patterns occur, the counting results will reveal this abnormality and extra software may interpret properly the results for different types of patterns.
Moreover, the circuit architecture could be implemented in a parallel way for the application of the multiple scanning Subcircuits-B, but this way demands a more expensive implementation in qubits and gates.
4.3 Comparative Method for Complexity Analysis
A straightforward way to evaluate the performance of the proposed QC system in terms of computational complexity is to conduct comparisons with other popular equivalents. Despite that it is unqualified to compare two completely different systems, even if a global correspondence is feasible between computing steps – it is still impossible to compare incompatible algorithmic objects. Quantum complexity theory has its own rights and at this moment there are no quantum analogues to be compared. Running a comparative study directly with its exact classical analogue would easily conclude to the result of the superiority of the quantum protocol as it is discussed in Table 3 in [9]. But running the same type of study including the currently best aligner is intrinsically hard to determine the “winner” in a one-way manner, due to the multiparametric settings of the SA protocols and the partial ignorance about the exact hardware of the quantum system.
The proposed grouping in Figure 9 enlists operations that belong at the same stage of the seed-and-extend protocol under consideration. Basic operations are separately enlisted in each stage either for the classical or the quantum protocol in a not strict but approximate way for comparative reasons. The classical version is supposed to utilize the best possible algorithmic component tools while the quantum one tries to exploit as much as possible the power of entanglement. This comparative examination makes sense only when the same strategy, type of data, sequences format, mapping schemes and policies are adopted.
Data preparation: Up-to-dated read-aligners adopt FM-indexing methods to build indexing structures for the reference genome which are usually based on a pre-built BWT tables. Despite their fast-indexing speed, they impose a standard-expensive computing overhead of preparation. In dot-matrix techniques there is no need for such an overhead, but it is replaced by two non-overlooked QC steps: the encoding of the classical data and the generation of the dot-matrix. The second one is accomplished very fast due to quantum parallelism since the encoding is ready.
Seeding: Using an index value, the array elements of the reference in the FM-index context are accessed in constant time, meaning that time complexity is O(1). On the other side, in quantum dot-matrix approach the diagonals are scanned by XORing them with a specific pattern utilizing again the power of quantum parallelism and the degree of similarity is estimated by the quantum counting algorithm.
Filtering: Classical filtering may make use of a cut-off threshold or scoring matrices. The same scoring process has already been fulfilled in the quantum context as the degree of similarity has already been calculated by counting the dots per diagonal. This stage is addressed by the training patterns which have considered the minimum acceptable degree of similarity per diagonal.
Extending: Alike to the majority of read-aligners, when a High-scoring Segment Pairs (HSR) passes filtering, then it is usually refined optimally by the dynamic SW or NW algorithms. The most important manipulations in this phase are (a) the accurate detection of the beginning or the finishing of an HSR and (b) the estimation of indel mutations. Both manipulations, theoretically, can be converted into their quantum analogues, but it is a great challenge to be investigated if the advantage over the classical counterpart is maintained. A possible implementation for operations (a) is considered in the Supplementary material making an extension upon the original circuit.