CRISPR (clustered regularly interspaced short palindromic repeats) is a family of DNA repeats found widely in the genomes of bacteria and archaea. These clusters occur in about 40% of sequenced bacteria and 90% of sequenced archaea. A CRISPR sequence consists of a series of short, conserved repeats separated by spacers. The repeats contain palindromic sequences that can form hairpin structures. The spacers are special: they are foreign DNA sequences captured by the bacterium, and together they act as a "blacklist" for the bacterial immune system. When the same foreign genetic material invades again, the CRISPR/Cas system strikes it with precision. The leader sequence upstream of the repeats is considered to be the promoter of the CRISPR sequence. In addition, a polymorphic family of genes encodes proteins that interact with the CRISPR sequence region, and these genes are therefore named CRISPR-associated (Cas) genes. The Cas genes coevolve with the CRISPR sequence to form a highly conserved CRISPR/Cas system in bacteria.
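The repeat-spacer layout described above can be illustrated with a small Python sketch. It is only a toy model: the function names, the 6-base repeat and the example sequences are invented for illustration, and real CRISPR repeats are longer and usually only partly palindromic.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def is_palindromic(seq):
    """A DNA palindrome reads the same as its own reverse complement."""
    return seq == reverse_complement(seq)


def extract_spacers(array, repeat):
    """Return the sequences lying between consecutive copies of the repeat.

    Assumes (for this toy model) that the repeat is exact and never
    occurs inside a spacer.
    """
    parts = array.split(repeat)
    # parts[0] is the leader, parts[-1] is whatever follows the last repeat.
    return [p for p in parts[1:-1] if p]


# Hypothetical locus: leader + repeat + spacer + repeat + spacer + repeat
repeat = "GAATTC"
array = "TTTT" + repeat + "ACGTAC" + repeat + "TTGCAA" + repeat
print(is_palindromic(repeat))          # True
print(extract_spacers(array, repeat))  # ['ACGTAC', 'TTGCAA']
```

Each spacer returned here stands for one entry in the "blacklist".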
The CRISPR/Cas9 system destroys foreign DNA in three steps.
In the first step, the CRISPR/Cas system performs "blacklist registration". It identifies the intruder's "name" (the PAM), finds its "ID card" (the protospacer), and records this identity information as a "file" (a spacer) in the "blacklist" (the CRISPR sequence). When a phage invades a host bacterium, it injects its double-stranded DNA into the cell. The CRISPR/Cas system excises a short sequence from this foreign DNA and integrates it into the genomic CRISPR sequence as a new spacer. The sequence in the foreign DNA that corresponds to the spacer is called the protospacer. The choice of protospacer is not random: the few bases flanking the protospacer are conserved and are called the protospacer adjacent motif (PAM). When the virus invades, the Cas proteins scan the foreign DNA, recognize a PAM, and take the DNA sequence adjacent to it as a candidate protospacer. The Cas proteins then cut the protospacer out of the foreign DNA and, with the assistance of other enzymes, insert it immediately downstream of the leader region of the CRISPR sequence. In this way, a new spacer is added to the CRISPR sequence of the genome.
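The registration step can be sketched in Python as a scan for PAMs followed by an insertion next to the leader. Several things here are assumptions made for illustration and are not stated in the text: the PAM is taken to be NGG located directly after the protospacer (as for the Cas9 of Streptococcus pyogenes), only one DNA strand is scanned, the protospacer length is a parameter, and the function names are hypothetical.

```python
import re


def find_protospacers(foreign_dna, spacer_len=20):
    """Return every candidate protospacer that is directly followed by an NGG PAM.

    Assumption: PAM = NGG, immediately 3' of the protospacer.
    A lookahead is used so that overlapping PAMs are all reported.
    """
    candidates = []
    for match in re.finditer(r"(?=[ACGT]GG)", foreign_dna):
        pam_start = match.start()
        if pam_start >= spacer_len:  # enough sequence upstream of the PAM
            candidates.append(foreign_dna[pam_start - spacer_len:pam_start])
    return candidates


def acquire_spacer(spacers, protospacer):
    """Return a new spacer list with the protospacer added at the leader-proximal end.

    The list is ordered from the leader outward, so index 0 is the newest spacer.
    """
    return [protospacer] + list(spacers)


# Toy example with a 4-base protospacer so the result is easy to check by eye
phage_dna = "ACGTACTGGTT"
hits = find_protospacers(phage_dna, spacer_len=4)
print(hits)                                # ['GTAC']
print(acquire_spacer(["AAAA"], hits[0]))   # ['GTAC', 'AAAA']
```

Placing the new spacer at index 0 mirrors the insertion next to the leader, so the list also records the order in which invaders were encountered.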
In the second step, the molecules needed to destroy the foreign DNA are produced. When foreign DNA invades, transcription under the control of the promoter in the leader region yields two kinds of RNA, pre-crRNA and tracrRNA. TracrRNA is an RNA with a hairpin structure that base-pairs with the repeat regions, while pre-crRNA is a long RNA transcribed from the entire CRISPR sequence. Pre-crRNA, tracrRNA and the Cas9 protein then assemble into a complex. With the help of RNase III, the pre-crRNA in this complex is cleaved so that the crRNA matching the invader's "ID card" is retained. The result is a mature complex consisting of crRNA, Cas9 and tracrRNA.
The final step is targeted interference. The complex scans the foreign DNA for a protospacer that is complementary to the crRNA and lies next to a PAM. Once the complex has located the PAM, the DNA double strand is unwound and an R-loop forms: the crRNA hybridizes to the complementary strand, while the other strand remains free. Cas9 cuts the target DNA with its two nuclease domains (HNH and RuvC), each of which cleaves one DNA strand in the R-loop. The result is a double-strand break, and expression of the foreign DNA is silenced.
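As a rough illustration of the interference step, the following Python sketch looks for a protospacer that matches the crRNA and is followed by a PAM, then splits the DNA at the cut site. It rests on assumptions that the text does not state: the PAM is NGG, the cut is blunt and falls 3 bases upstream of the PAM (the behaviour usually reported for S. pyogenes Cas9), only exact matches on one strand are considered, and the function names are made up.

```python
def find_cut_site(target_dna, crrna_spacer):
    """Return the index of the cut in target_dna, or None if no valid site exists.

    The crRNA spacer has the same sequence as the free (non-target) strand,
    with U in place of T, because it hybridizes to the complementary strand.
    Assumptions: NGG PAM directly after the protospacer, cut 3 bases before the PAM.
    """
    protospacer = crrna_spacer.replace("U", "T")
    n = len(protospacer)
    start = target_dna.find(protospacer)
    while start != -1:
        pam = target_dna[start + n:start + n + 3]
        if len(pam) == 3 and pam[1:] == "GG":
            return start + n - 3
        start = target_dna.find(protospacer, start + 1)
    return None


def cleave(target_dna, crrna_spacer):
    """Return the two fragments left by the double-strand break, or None if uncut."""
    cut = find_cut_site(target_dna, crrna_spacer)
    if cut is None:
        return None
    return target_dna[:cut], target_dna[cut:]


# Toy 10-base spacer; a matching sequence without a PAM is left intact
print(cleave("TTACGTACGTACTGGAA", "ACGUACGUAC"))  # ('TTACGTACG', 'TACTGGAA')
print(cleave("TTACGTACGTACTCAAA", "ACGUACGUAC"))  # None
```

The second call shows why the PAM matters: the same sequence without an adjacent PAM is not cut, which is also how the spacers stored in the cell's own CRISPR sequence escape cleavage.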
CRISPR/Cas9 is a powerful gene-editing tool that allows precise, targeted modification of genes. Directed by a guide RNA, Cas9 treats the genomic DNA to be edited as if it were viral or foreign DNA. There are limits to its use, however: a relatively conserved PAM sequence must be present near the region to be edited, and the guide RNA must be complementary to the sequence immediately upstream of the PAM. The most basic technique is gene knockout. Two guide RNAs are designed, one upstream and one downstream of the gene, and delivered into the cell together with a plasmid encoding Cas9. Each guide RNA finds its target sequence next to a PAM through complementary base pairing, and Cas9 breaks the DNA double strand at both sites. The cell's own DNA damage repair mechanism then joins the two ends of the break, which removes the intervening sequence and knocks out the target gene. If a repair template plasmid (donor DNA) is also introduced, the cell uses the template during repair to insert a fragment or introduce a site-specific mutation, which allows gene replacement or mutation. Editing the genes of fertilized eggs and implanting them into a surrogate mother makes it possible to construct gene-edited animal models.
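The two-guide knockout and the donor-based replacement can be modelled as string operations. The Python sketch below is deliberately simplified and makes assumptions not found in the text: an NGG PAM, a blunt cut 3 bases upstream of the PAM, exact guide matches on one strand, and perfect rejoining of the ends. A real donor would also carry homology arms, which are omitted here, and all names are hypothetical.

```python
def cut_position(genome, guide):
    """Return the cut index for a guide, assuming an NGG PAM right after its target."""
    n = len(guide)
    start = genome.find(guide)
    while start != -1:
        pam = genome[start + n:start + n + 3]
        if len(pam) == 3 and pam[1:] == "GG":
            return start + n - 3  # assumed blunt cut, 3 bases before the PAM
        start = genome.find(guide, start + 1)
    raise ValueError("no target followed by a PAM for guide " + guide)


def edit_between_guides(genome, guide_up, guide_down, donor=""):
    """Remove the segment between the two cuts and rejoin the ends.

    With an empty donor this models a knockout; with a donor sequence it
    models replacement of the removed segment by the template.
    """
    left, right = sorted((cut_position(genome, guide_up),
                          cut_position(genome, guide_down)))
    return genome[:left] + donor + genome[right:]


# Toy genome: flank + guide1 target + PAM + gene + guide2 target + PAM + flank
genome = "AA" + "ACTGAC" + "TGG" + "TTTTTT" + "CATCAT" + "AGG" + "CC"
print(edit_between_guides(genome, "ACTGAC", "CATCAT"))          # AAACTCATAGGCC
print(edit_between_guides(genome, "ACTGAC", "CATCAT", "GGGG"))  # AAACTGGGGCATAGGCC
```

Treating knockout as the special case of an empty donor keeps the two techniques in one function, which reflects the way the donor-based method builds on the knockout procedure.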
However, the commonly used CRISPR/Cas9 system still has several defects. Cas9 does not require a strict match to the guide RNA, so it may cut at sites that do not fully match the guide, resulting in off-target editing and even the loss of large segments at remote sites. A.V. Anzalone et al. [2] addressed these problems in two ways. First, they modified the Cas9 protein so that it cuts only one strand of the double-stranded DNA. Second, they extended the guide RNA with an RNA template for the desired DNA sequence, producing the prime editing guide RNA (pegRNA). A reverse transcriptase was also fused to Cas9, and the DNA reverse-transcribed from the RNA template is used to repair the target site. As a result, off-target editing and unwanted insertions and deletions are greatly reduced.
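The loose matching requirement mentioned above is why guide design usually includes a search for near-matches elsewhere in the genome. The Python sketch below lists every site that lies next to a PAM and differs from the guide by at most a chosen number of bases. The NGG PAM, the single-strand scan, the mismatch threshold and the function names are all assumptions for illustration; real off-target predictors also weight mismatches by position.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def candidate_sites(genome, guide, max_mismatches=2):
    """Return (position, mismatches) for each PAM-adjacent site close to the guide.

    Assumption: PAM = NGG directly after the site; only one strand is scanned.
    """
    n = len(guide)
    sites = []
    for i in range(len(genome) - n - 2):
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue  # no PAM, so Cas9 would not engage this site
        mismatches = count_mismatches(guide, genome[i:i + n])
        if mismatches <= max_mismatches:
            sites.append((i, mismatches))
    return sites


# Toy genome: a perfect site, a one-mismatch site, and a perfect match without a PAM
genome = "ACGTACTGG" + "TT" + "ACGAACAGG" + "ACGTACTTT"
print(candidate_sites(genome, "ACGTAC"))                    # [(0, 0), (11, 1)]
print(candidate_sites(genome, "ACGTAC", max_mismatches=0))  # [(0, 0)]
```

Any site reported with a non-zero mismatch count is a potential off-target cut in this simplified model.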
As research on CRISPR/Cas9 technology becomes deeper and more comprehensive, it will become easier to construct accurate transgenic animal models for studying human diseases, allowing the technology to make greater contributions to human health in the future.