This article has presented a Jackknife Product algorithm, which applies to any commutative semi-group . The biological application to a circRNA-miRNA system exemplifies a general statistical method in combinatorial probability. In turn, the application in combinatorial probability exemplifies an even more general statistical test for whether a term in a sum of independent counting variates (not necessarily identically distributed) is unusually large. The Benjamini-Hochberg procedure for controlling the false discovery rate in multiple tests applies to independent variates [8] or to dependent variates with a positive regression dependency property [9]. Inconveniently, however, independent summands are negatively correlated when conditioned on a fixed sum. In the circRNA-miRNA application, therefore, the Benjamini-Hochberg procedure was unavailable to correct for multiple testing. Fortunately, a Bonferroni multiple test correction [10] sufficed because empirically there, any p-value not almost 1 was extremely small.
The Results state that for n = 3086, the Jackknifed Product algorithm computed the relevant p-values in about 45 minutes, with n products requiring about 15 minutes of computation. In contrast, the naïve algorithm avoiding inverses and requiring n (n-2) products would have taken about 3086*15 minutes, i.e., about 1 month. As explained under the “Computational Time and Storage” heading in the Theory section, without exploiting the special form of the jackknife products , a segment tree requires about n products for its construction and at least nlog2n products for the computation of the products . Alone, segment tree algorithms would therefore have taken a minimum of about (1+log<>sub23086)*15 minutes, i.e., about 3 hours.
Eq (1) might suggest that jackknife products are susceptible to computation with Fourier or Laplace transforms. The Results section notes that in the biological application, however, {gi[k]} in Eq. (1) spanned several orders of magnitude, from
to
. On one hand, the widely varying magnitudes necessitated an internal logarithmic representation of {
gi[
k]} in the computer, a minor inconvenience for direct computation with the Jackknife Product algorithm. On the other hand, they might have presented a substantial obstacle for transforms. The famous Feynman anecdote about Paul Olum’s (10
100) problem indicates the reason [
11]:
So Paul is walking past the lunch place and these guys are all excited. "Hey, Paul!" they call out. "Feynman's terrific! We give him a problem that can be stated in ten seconds, and in a minute he gets the answer to 10 percent. Why don't you give him one?" Without hardly stopping, he says, "The tangent of 10 to the 100th." I was sunk: you have to divide by pi to 100 decimal places! It was hopeless.
The Jackknife Product algorithm also abstracts to any commutative semigroup , broadening its applicability enormously. As usual, abstraction eases debugging. Consider, e.g., the commutative semigroup consisting of all bit strings of length n under the bitwise “or” operation. If the bit string gj has 1 in the j-th position and 0 s elsewhere, then the segment product g[i,j) equals the bit string with 1 s in positions and 0 s elsewhere. Similarly, the complementary segment product equals the bit string with 0 s in positions and 1 s elsewhere. The Jackknife Product algorithm is easily debugged with output consisting of the segment and complementary segment trees for the bit strings.