  1.                 From en.wikipedia.org:
                    

    {{Infobox nonhuman protein | Name = Cascade (CRISPR-associated complex for antiviral defense) | image = 4QYZ.png | caption = CRISPR Cascade protein (cyan) bound to CRISPR RNA (green) and phage DNA (red)[1] | Organism = _Escherichia coli_ | TaxID = 511145 | Symbol = CRISPR | EntrezGene = 947229 | HomoloGene = | PDB = 4QYZ | UniProt = P38036 | RefSeqmRNA = | RefSeqProtein = NP_417241.1 }} [[File:Crispr.png|thumb|262px|Diagram of the CRISPR prokaryotic antiviral defense mechanism[2]]] CRISPR (an acronym of clustered regularly interspaced short palindromic repeats) is a family of DNA sequences found in the genomes of prokaryotic organisms such as bacteria and archaea.<ref name="pmid25574773"/> Each sequence within an individual prokaryotic CRISPR is derived from a DNA fragment of a bacteriophage that had previously infected the prokaryote or one of its ancestors.<ref name="Hille2018"/>[3] These sequences are used to detect and destroy DNA from similar bacteriophages during subsequent infections. Hence these sequences play a key role in the antiviral (i.e. anti-phage) defense system of prokaryotes and provide a form of heritable,<ref name="Hille2018"/> acquired immunity.[4][5][6][7] CRISPR is found in approximately 50% of sequenced bacterial genomes and nearly 90% of sequenced archaea.[8]

    Cas9 (or "CRISPR-associated protein 9") is an enzyme that uses CRISPR sequences as a guide to recognize and open up specific strands of DNA that are complementary to the CRISPR sequence. Cas9 enzymes together with CRISPR sequences form the basis of a technology known as CRISPR-Cas9 that can be used to edit genes within living organisms.[9][10] This editing process has a wide variety of applications including basic biological research, development of biotechnological products, and treatment of diseases.[11]<ref name="Hsu2014"/> The development of the CRISPR-Cas9 genome editing technique was recognized by the Nobel Prize in Chemistry in 2020 awarded to Emmanuelle Charpentier and Jennifer Doudna.[12][13]

    ** History

    *** Repeated sequences

    The discovery of clustered DNA repeats took place independently in three parts of the world. The first description of what would later be called CRISPR is from Osaka University researcher Yoshizumi Ishino and his colleagues in 1987. They accidentally cloned part of a CRISPR sequence together with the _iap_ gene (isozyme conversion of alkaline phosphatase) from their target genome, that of _Escherichia coli_.[14][15] The organization of the repeats was unusual. Repeated sequences are typically arranged consecutively, without interspersing different sequences.<ref name="Hsu2014"/><ref name="Ishino-1987"/> They did not know the function of the interrupted clustered repeats.

    In 1993, researchers of _Mycobacterium tuberculosis_ in the Netherlands published two articles about a cluster of interrupted direct repeats (DR) in that bacterium. They recognized the diversity of the sequences that intervened in the direct repeats among different strains of _M. tuberculosis_[16] and used this property to design a typing method called _spoligotyping_, still in use today.[17][18]

    Francisco Mojica at the University of Alicante in Spain studied the function of repeats in the archaeal species _Haloferax_ and _Haloarcula_. Mojica's supervisor surmised that the clustered repeats had a role in correctly segregating replicated DNA into daughter cells during cell division, because plasmids and chromosomes with identical repeat arrays could not coexist in _Haloferax volcanii_. Transcription of the interrupted repeats was also noted for the first time; this was the first full characterization of CRISPR.<ref name="Mojica2016"/>[19] By 2000, Mojica and his students, after an automated search of published genomes, identified interrupted repeats in 20 species of microbes as belonging to the same family.[20] Because those sequences were interspaced, Mojica initially called these sequences "short regularly spaced repeats" (SRSR).[21] In 2001, Mojica and Ruud Jansen, who were searching for additional interrupted repeats, proposed the acronym CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) to unify the numerous acronyms used to describe these sequences.<ref name="Mojica2016b"/>[22] In 2002, Tang, et al. showed evidence that CRISPR repeat regions from the genome of _Archaeoglobus fulgidus_ were transcribed into long RNA molecules subsequently processed into unit-length small RNAs, plus some longer forms of 2, 3, or more spacer-repeat units.[23][24]

    In 2005, yogurt researcher Rodolphe Barrangou discovered that _Streptococcus thermophilus_, after iterative phage infection challenges, develops increased phage resistance due to the incorporation of additional CRISPR spacer sequences.[25] Barrangou's employer, the Danish food company Danisco, then developed phage-resistant _S. thermophilus_ strains for yogurt production. Danisco was later bought by DuPont, which owns about 50 percent of the global dairy culture market, and the technology spread widely.[26]

    *** CRISPR-associated systems

    A major advance in understanding CRISPR came with Jansen's observation that the prokaryote repeat cluster was accompanied by four homologous genes that make up CRISPR-associated systems, _cas_ 1–4. The Cas proteins showed helicase and nuclease motifs, suggesting a role in the dynamic structure of the CRISPR loci.[27] In this publication, the acronym CRISPR was used as the universal name of this pattern, but its function remained enigmatic.

    [[File:SimpleCRISPR.jpg|thumb|Simplified diagram of a CRISPR locus. The three major components of a CRISPR locus are shown: _cas_ genes, a leader sequence, and a repeat-spacer array. Repeats are shown as gray boxes and spacers are colored bars. The arrangement of the three components is not always as shown.[28][29] In addition, several CRISPRs with similar sequences can be present in a single genome, only one of which is associated with _cas_ genes.[30]]] In 2005, three independent research groups showed that some CRISPR spacers are derived from phage DNA and extrachromosomal DNA such as plasmids.[31][32][33] In effect, the spacers are fragments of DNA gathered from viruses that previously attacked the cell. The source of the spacers was a sign that the CRISPR-_cas_ system could have a role in adaptive immunity in bacteria.<ref name="pmid20056882"/>[34] All three studies proposing this idea were initially rejected by high-profile journals, but eventually appeared in other journals.[35]

    The first publication<ref name="pmid15791728"/> proposing a role of CRISPR-Cas in microbial immunity, by Mojica and collaborators at the University of Alicante, predicted a role for the RNA transcript of spacers in target recognition, in a mechanism that could be analogous to the RNA interference system used by eukaryotic cells. Koonin and colleagues extended this RNA interference hypothesis by proposing mechanisms of action for the different CRISPR-Cas subtypes according to the predicted function of their proteins.[36]

    Experimental work by several groups revealed the basic mechanisms of CRISPR-Cas immunity. In 2007, the first experimental evidence that CRISPR was an adaptive immune system was published.<ref name="pmid17379808"/><ref name="Hsu2014"/> A CRISPR region in _Streptococcus thermophilus_ acquired spacers from the DNA of an infecting bacteriophage. The researchers manipulated the resistance of _S. thermophilus_ to different types of phages by adding and deleting spacers whose sequence matched those found in the tested phages.[37]<ref name="Marraffini2015"/> In 2008, Brouns and Van der Oost identified a complex of Cas proteins called Cascade, that in _E. coli_ cut the CRISPR RNA precursor within the repeats into mature spacer-containing RNA molecules called CRISPR RNA (crRNA), which remained bound to the protein complex.[38] Moreover, it was found that Cascade, crRNA and a helicase/nuclease (Cas3) were required to provide a bacterial host with immunity against infection by a DNA virus. By designing an anti-virus CRISPR, they demonstrated that two orientations of the crRNA (sense/antisense) provided immunity, indicating that the crRNA guides were targeting dsDNA. That year Marraffini and Sontheimer confirmed that a CRISPR sequence of _S. epidermidis_ targeted DNA and not RNA to prevent conjugation. This finding was at odds with the proposed RNA-interference-like mechanism of CRISPR-Cas immunity, although a CRISPR-Cas system that targets foreign RNA was later found in _Pyrococcus furiosus_.<ref name="Hsu2014"/><ref name="Marraffini2015"/> A 2010 study showed that CRISPR-Cas cuts strands of both phage and plasmid DNA in _S. thermophilus_.[39]

    *** Cas9

    A simpler CRISPR system from _Streptococcus pyogenes_ uses the protein Cas9, an endonuclease that functions with two small RNAs, crRNA and tracrRNA, to form a four-component complex.[40][41] In 2012, Jennifer Doudna and Emmanuelle Charpentier simplified this into a two-component system by fusing the RNAs into a "single-guide RNA", enabling Cas9 to target and cut specific DNA sequences, a breakthrough that earned them the Nobel Prize in Chemistry in 2020.[42] Parallel work showed the _S. thermophilus_ Cas9 could be similarly reprogrammed by altering the crRNA sequence.<ref name="Mojica2016"/> These developments spurred genome editing efforts, including demonstrations by groups led by Feng Zhang and George Church showing genome editing in human cells using CRISPR-Cas9.[43][44][45]

    *** Cas12a

    Cas12a, a class 2 type V CRISPR-associated nuclease, was characterized in 2015 and was formerly known as Cpf1.<ref name="Rath_2015" /> This nuclease is found in the CRISPR-Cpf1 system of bacteria such as _Francisella novicida_.<ref name="Redman_2016"/>[46] The initial designation, derived from a TIGRFAMs protein family definition established in 2012, reflected the prevalence of this CRISPR-Cas subtype in the _Prevotella_ and _Francisella_ lineages. Cas12a exhibits several key distinctions from Cas9: it generates staggered cuts in double-stranded DNA, in contrast to the blunt ends produced by Cas9;<ref name="Proteintech"/> it relies on a 'T-rich' protospacer adjacent motif (PAM) (typically 5'-TTTV-3', where V is A, C, or G), offering alternative targeting sites compared to the 'G-rich' PAMs (typically 5'-NGG-3') favored by Cas9;<ref name="Nobel_Foundation"/> and it requires only a CRISPR RNA (crRNA) for effective targeting, whereas Cas9 necessitates both a crRNA and a _trans_-activating crRNA (tracrRNA).<ref name="Rath_2015"/>
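
    The two PAM patterns quoted above can be compared with a short sketch. The function names and the toy sequence are illustrative assumptions, not part of any published tool, and the search covers only the strand given, ignoring the reverse complement and any additional targeting rules.

```python
import re

# IUPAC codes needed for the two PAMs quoted in the text:
# N = any base, V = A, C, or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "V": "[ACG]"}


def pam_regex(pam: str) -> re.Pattern:
    """Compile an IUPAC-coded PAM into a regex.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are all reported.
    """
    body = "".join(IUPAC[base] for base in pam.upper())
    return re.compile(f"(?=({body}))")


def find_pam_sites(sequence: str, pam: str) -> list[int]:
    """Return the 0-based start of every PAM match on the given strand."""
    return [m.start() for m in pam_regex(pam).finditer(sequence.upper())]


toy = "ATTTAGGCGG"                      # made-up sequence for illustration
cas9_sites = find_pam_sites(toy, "NGG")     # G-rich PAM typical of Cas9
cas12a_sites = find_pam_sites(toy, "TTTV")  # T-rich PAM typical of Cas12a
```

    On a T-rich region the second pattern finds sites that the first cannot, which is the sense in which Cas12a offers alternative targeting sites.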

    *** Cas13a

    In 2016, the nuclease Cas13a (formerly known as C2c2) from the bacterium _Leptotrichia shahii_ was characterized by researchers in Feng Zhang's group at MIT and the Broad Institute. Cas13 is an RNA-guided RNA endonuclease, which means that it does not cleave DNA, but only single-stranded RNA. Cas13 is guided by its crRNA to a ssRNA target and binds and cleaves the target. Similar to Cas12a, Cas13 remains bound to the target and then cleaves other ssRNA molecules indiscriminately.[47] This collateral cleavage property has been exploited for the development of various diagnostic technologies.[48][49][50]

    ** Locus structure

    *** Repeats and spacers

    The CRISPR array is made up of an AT-rich leader sequence followed by short repeats that are separated by unique spacers.[51] CRISPR repeats typically range in size from 28 to 37 base pairs (bp), though there can be as few as 23 bp and as many as 55 bp.[52] Some show dyad symmetry, implying the formation of a secondary structure such as a stem-loop ('hairpin') in the RNA, while others are predicted to be unstructured. The size of spacers in different CRISPR arrays is typically 32 to 38 bp (range 21 to 72 bp).<ref name="Barrangou2014"/> New spacers can appear rapidly as part of the immune response to phage infection.<ref name="pmid17894817"/> There are usually fewer than 50 units of the repeat-spacer sequence in a CRISPR array.<ref name="Barrangou2014"/>
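
    The repeat-spacer layout described above can be illustrated with a minimal sketch that pulls the spacers out of an array. It assumes the repeat is already known and occurs as exact copies, which real arrays do not guarantee; the function names and the toy sequences are hypothetical.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Split a repeat-spacer array on exact copies of the repeat.

    Assumes perfect repeats; real arrays may carry degenerate copies
    that this simple split would miss.
    """
    parts = array.split(repeat)
    # parts[0] precedes the first repeat (the leader side) and parts[-1]
    # follows the last repeat; only pieces between two repeats are spacers.
    return [piece for piece in parts[1:-1] if piece]


def is_typical_spacer(spacer: str, low: int = 32, high: int = 38) -> bool:
    """Check a spacer against the typical 32 to 38 bp size quoted above."""
    return low <= len(spacer) <= high


toy_repeat = "GTTTTAGAGCTA"   # shortened toy repeat, not a real one
toy_array = "ATAT" + toy_repeat + "AAAA" + toy_repeat + "CCCC" + toy_repeat + "TT"
toy_spacers = extract_spacers(toy_array, toy_repeat)
```

    The toy spacers here are far shorter than real ones, so `is_typical_spacer` would reject them; the defaults only encode the typical range, not the full 21 to 72 bp span.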

    *** CRISPR RNA structures

    <gallery title="Gallery of secondary structure images" perrow="5">
    Image:RF01315.png| CRISPR-DR2: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01315 (see http://rfam.xfam.org/family/RF01315).
    Image:RF01318.png| CRISPR-DR5: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01318 (see http://rfam.xfam.org/family/RF01318).
    Image:RF01319.png| CRISPR-DR6: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01319 (see http://rfam.xfam.org/family/RF01319).
    Image:RF01321.png| CRISPR-DR8: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01321 (see http://rfam.xfam.org/family/RF01321).
    Image:RF01322.png| CRISPR-DR9: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01322 (see http://rfam.xfam.org/family/RF01322).
    Image:RF01332.png| CRISPR-DR19: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01332 (see http://rfam.xfam.org/family/RF01332).
    Image:RF01350.png| CRISPR-DR41: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01350 (see http://rfam.xfam.org/family/RF01350).
    Image:RF01365.png| CRISPR-DR52: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01365 (see http://rfam.xfam.org/family/RF01365).
    Image:RF01370.png| CRISPR-DR57: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01370 (see http://rfam.xfam.org/family/RF01370).
    Image:RF01378.png| CRISPR-DR65: Secondary structure taken from the Rfam (see http://rfam.xfam.org) database. Family RF01378 (see http://rfam.xfam.org/family/RF01378).
    </gallery>

    *** Cas genes and CRISPR subtypes

    Small clusters of _cas_ genes are often located next to CRISPR repeat-spacer arrays. Collectively the 93 _cas_ genes are grouped into 35 families based on sequence similarity of the encoded proteins. 11 of the 35 families form the _cas_ core, which includes the protein families Cas1 through Cas9. A complete CRISPR-Cas locus has at least one gene belonging to the _cas_ core.[53]

    CRISPR-Cas systems fall into two classes. Class 1 systems use a complex of multiple Cas proteins to degrade foreign nucleic acids. Class 2 systems use a single large Cas protein for the same purpose. Class 1 is divided into types I, III, and IV; class 2 is divided into types II, V, and VI.[54] The 6 system types are divided into 33 subtypes.[55] Each type and most subtypes are characterized by a "signature gene" found almost exclusively in the category. Classification is also based on the complement of _cas_ genes that are present. Most CRISPR-Cas systems have a Cas1 protein. The phylogeny of Cas1 proteins generally agrees with the classification system,[56] but exceptions exist due to module shuffling.<ref name="Makarova2018"/> Many organisms contain multiple CRISPR-Cas systems suggesting that they are compatible and may share components.[57][58] The sporadic distribution of the CRISPR-Cas subtypes suggests that the CRISPR-Cas system is subject to horizontal gene transfer during microbial evolution.
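
    The class and type hierarchy described above maps naturally onto a small lookup table. The sketch below is only an illustration of that hierarchy; the names `CRISPR_CLASSES` and `class_of` are made up for this example.

```python
# Class -> types, exactly as listed in the text above.
CRISPR_CLASSES = {
    1: ("I", "III", "IV"),  # effector is a complex of multiple Cas proteins
    2: ("II", "V", "VI"),   # effector is a single large Cas protein
}


def class_of(system: str) -> int:
    """Return the class (1 or 2) for a type such as 'V' or a subtype such as 'V-A'."""
    system_type = system.split("-")[0].strip().upper()
    for crispr_class, types in CRISPR_CLASSES.items():
        if system_type in types:
            return crispr_class
    raise ValueError(f"unknown CRISPR-Cas type: {system!r}")
```

    For example, `class_of("II-A")` places the Cas9-based type II systems in class 2, while `class_of("I-E")` places the Cascade-based system of _E. coli_ in class 1.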

    {| class="wikitable center" style="width:100%"
    |+ Signature genes and their putative functions for the major and minor CRISPR-cas types
    |-
    ! Class !! Cas type !! Cas subtype !! Signature protein !! Function !! Reference
    |-
    | rowspan="19" | 1 || rowspan="8" | I || – || Cas3 || Single-stranded DNA nuclease (HD domain) and ATP-dependent helicase || [59][60]
    |-
    | I-A || Cas8a, Cas5 || rowspan="3" | Cas8 is a subunit of the interference module that is important in targeting of invading DNA by recognizing the PAM sequence. Cas5 is required for processing and stability of crRNAs. || rowspan="3" | <ref name="Makarova2015"/>[61]
    |-
    | I-B || Cas8b
    |-
    | I-C || Cas8c
    |-
    | I-D || Cas10d || rowspan="2" | Contains a domain homologous to the palm domain of nucleic acid polymerases and nucleotide cyclases || rowspan="2" | [62][63]
    |-
    | I-E || Cse1, Cse2
    |-
    | I-F || Csy1, Csy2, Csy3 || Type IF-3 have been implicated in CRISPR-associated transposons || <ref name="Makarova2015"/>
    |-
    | I-G (previously known as subtype I-U<ref name="Makarova2015"/>) || GSU0054 || || [64]
    |-
    | rowspan="7" | III || – || Cas10 || Homolog of Cas10d and Cse1. Binds CRISPR target RNA and promotes stability of the interference complex || <ref name="pmid21756346"/>[65]
    |-
    | III-A || Csm2 || Not determined || <ref name="Makarova2015"/>
    |-
    | III-B || Cmr5 || Not determined || <ref name="Makarova2015"/>
    |-
    | III-C || Cas10 or Csx11 || || <ref name="Makarova2015"/><ref name="pmid30840895"/>
    |-
    | III-D || Csx10 || || <ref name="Makarova2015"/>
    |-
    | III-E || || || <ref name="Makarova2019"/>
    |-
    | III-F || || || <ref name="Makarova2019"/>
    |-
    | rowspan="4" | IV || – || Csf1 || || <ref name="Makarova2019"/>
    |-
    | IV-A || || || <ref name="Makarova2019"/>
    |-
    | IV-B || || || <ref name="Makarova2019"/>
    |-
    | IV-C || || || <ref name="Makarova2019"/>
    |-
    | rowspan="23" | 2 || rowspan="4" | II || – || Cas9 || Nucleases RuvC and HNH together produce DSBs, and separately can produce single-strand breaks. Ensures the acquisition of functional spacers during adaptation. || [66][67]
    |-
    | II-A || Csn2 || Ring-shaped DNA-binding protein. Involved in primed adaptation in Type II CRISPR system. || [68]
    |-
    | II-B || Cas4 || Endonuclease that works with cas1 and cas2 to generate spacer sequences || [69]
    |-
    | II-C || || Characterized by the absence of either Csn2 or Cas4 || [70]
    |-
    | rowspan="12" | V || – || Cas12 || Nuclease RuvC. Lacks HNH. || <ref name="Wright2016"/>[71]
    |-
    | V-A || Cas12a (Cpf1) || Auto-processing pre-crRNA activity for multiplex gene regulation || <ref name="Makarova2019"/>[72]
    |-
    | V-B || Cas12b (C2c1) || || <ref name="Makarova2019"/>
    |-
    | V-C || Cas12c (C2c3) || || <ref name="Makarova2019"/>
    |-
    | V-D || Cas12d (CasY) || || <ref name="Makarova2019"/>
    |-
    | V-E || Cas12e (CasX) || || <ref name="Makarova2019"/>
    |-
    | V-F || Cas12f (Cas14, C2c10) || || <ref name="Makarova2019"/>
    |-
    | V-G || Cas12g || || <ref name="Makarova2019"/>
    |-
    | V-H || Cas12h || || <ref name="Makarova2019"/>
    |-
    | V-I || Cas12i || || <ref name="Makarova2019"/>
    |-
    | V-K (previously known as subtype V-U 5<ref name="Makarova2019"/>) || Cas12k (C2c5) || Type V-K have been implicated in CRISPR-associated transposons. || <ref name="Makarova2019"/>
    |-
    | V-U || C2c4, C2c8, C2c9 || || <ref name="Makarova2019"/>
    |-
    | rowspan="7" | VI || – || Cas13 || RNA-guided RNase || <ref name="Wright2016"/>[73]
    |-
    | VI-A || Cas13a (C2c2) || || <ref name="Makarova2019"/>
    |-
    | VI-B || Cas13b || || <ref name="Makarova2019"/>
    |-
    | VI-C || Cas13c || || <ref name="Makarova2019"/>
    |-
    | VI-D || Cas13d || || <ref name="Makarova2019"/>
    |-
    | VI-X || Cas13x.1 || RNA dependent RNA polymerase, Prophylactic RNA-virus inhibition || [74]
    |-
    | VI-Y || || || <ref name="Xu-2021"/>
    |}

    ** Mechanism

    CRISPR-Cas immunity is a natural process of bacteria and archaea.[75] CRISPR-Cas prevents bacteriophage infection, conjugation and natural transformation by degrading foreign nucleic acids that enter the cell.[76]

    *** Spacer acquisition

    When a microbe is invaded by a bacteriophage, the first stage of the immune response is to capture phage DNA and insert it into a CRISPR locus in the form of a spacer. Cas1 and Cas2 are found in both types of CRISPR-Cas immune systems, which indicates that they are involved in spacer acquisition. Mutation studies confirmed this hypothesis, showing that removal of Cas1 or Cas2 stopped spacer acquisition, without affecting CRISPR immune response.[77][78][79][80][81]

    Multiple Cas1 proteins have been characterised and their structures resolved.[82][83][84] Cas1 proteins have diverse amino acid sequences. However, their crystal structures are similar and all purified Cas1 proteins are metal-dependent nucleases/integrases that bind to DNA in a sequence-independent manner.<ref name="pmid22337052"/> Representative Cas2 proteins have been characterised and possess either (single strand) ssRNA-[85] or (double strand) dsDNA-[86][87] specific endoribonuclease activity.

    In the I-E system of _E. coli_, Cas1 and Cas2 form a complex where a Cas2 dimer bridges two Cas1 dimers.[88] In this complex, Cas2 performs a non-enzymatic scaffolding role,<ref name="pmid24793649"/> binding double-stranded fragments of invading DNA, while Cas1 binds the single-stranded flanks of the DNA and catalyses their integration into CRISPR arrays.[89][90][91] New spacers are usually added at the beginning of the CRISPR array next to the leader sequence, creating a chronological record of viral infections.[92] In _E. coli_, a histone-like protein called integration host factor (IHF), which binds to the leader sequence, is responsible for the accuracy of this integration.[93] IHF also enhances integration efficiency in the type I-F system of _Pectobacterium atrosepticum_,[94] but in other systems, different host factors may be required.[95]

    **** Protospacer adjacent motifs (PAM)

    Bioinformatic analysis of regions of phage genomes that were excised as spacers (termed protospacers) revealed that they were not randomly selected but instead were found adjacent to short (3–5 bp) DNA sequences termed protospacer adjacent motifs (PAM). Analysis of CRISPR-Cas systems showed PAMs to be important for type I and type II, but not type III systems during acquisition.<ref name="pmid16079334"/>[96][97][98][99][100] In type I and type II systems, protospacers are excised at positions adjacent to a PAM sequence, with the other end of the spacer cut using a ruler mechanism, thus maintaining the regularity of the spacer size in the CRISPR array.[101][102] The conservation of the PAM sequence differs between CRISPR-Cas systems and appears to be evolutionarily linked to Cas1 and the leader sequence.<ref name="pmid19143596"/>[103]

    New spacers are added to a CRISPR array in a directional manner,<ref name="pmid15758212"/> occurring preferentially,[104]<ref name="pmid18065539"/><ref name="pmid18065545"/>[105][106] but not exclusively, adjacent<ref name="pmid19239620"/><ref name="pmid22834906"/> to the leader sequence. Analysis of the type I-E system from _E. coli_ demonstrated that the first direct repeat adjacent to the leader sequence is copied, with the newly acquired spacer inserted between the first and second direct repeats.<ref name="pmid22402487"/><ref name="pmid23445770"/>
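
    The leader-proximal insertion described for the type I-E system can be sketched as a small data structure. This models only the preferred case (insertion next to the leader with duplication of the first repeat) and none of the enzymology; the class name `CrisprArray` and the toy sequences are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprArray:
    """Toy model: leader, then repeat-spacer units, then a final repeat."""

    leader: str
    repeat: str
    # Index 0 is the spacer closest to the leader, i.e. the newest one.
    spacers: list[str] = field(default_factory=list)

    def acquire(self, new_spacer: str) -> None:
        """Insert a spacer at the leader end.

        Because sequence() writes one repeat before every spacer plus a
        closing repeat, each acquisition also adds one repeat copy,
        mirroring the duplication of the first direct repeat.
        """
        self.spacers.insert(0, new_spacer)

    def sequence(self) -> str:
        units = "".join(self.repeat + spacer for spacer in self.spacers)
        return self.leader + units + self.repeat


array = CrisprArray(leader="ATAT", repeat="GGCC")
array.acquire("AAAA")  # older infection
array.acquire("TTTT")  # more recent infection
```

    Reading `array.spacers` from index 0 onward then gives the infections from most recent to oldest, which is the chronological record mentioned earlier.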

    The PAM sequence appears to be important during spacer insertion in type I-E systems. That sequence contains a strongly conserved final nucleotide (nt) adjacent to the first nt of the protospacer. This nt becomes the final base in the first direct repeat.<ref name="pmid22558257"/>[107][108] This suggests that the spacer acquisition machinery generates single-stranded overhangs in the second-to-last position of the direct repeat and in the PAM during spacer insertion. However, not all CRISPR-Cas systems appear to share this mechanism as PAMs in other organisms do not show the same level of conservation in the final position.<ref name="pmid23403393"/> It is likely that in those systems, a blunt end is generated at the very end of the direct repeat and the protospacer during acquisition.

    **** Insertion variants

    Analysis of _Sulfolobus solfataricus_ CRISPRs revealed further complexities to the canonical model of spacer insertion, as one of its six CRISPR loci inserted new spacers randomly throughout its CRISPR array, as opposed to inserting closest to the leader sequence.<ref name="pmid22834906"/>

    Multiple CRISPRs contain many spacers to the same phage. The mechanism that causes this phenomenon was discovered in the type I-E system of _E. coli_. A significant enhancement in spacer acquisition was detected where spacers already target the phage, even with mismatches to the protospacer. This 'priming' requires the Cas proteins involved in both acquisition and interference to interact with each other. Newly acquired spacers that result from the priming mechanism are always found on the same strand as the priming spacer.<ref name="pmid22558257"/><ref name="pmid22771574"/><ref name="pmid22781758"/> This observation led to the hypothesis that the acquisition machinery slides along the foreign DNA after priming to find a new protospacer.<ref name="pmid22781758"/>

    *** Biogenesis

    CRISPR-RNA (crRNA), which later guides the Cas nuclease to the target during the interference step, must be generated from the CRISPR sequence. The crRNA is initially transcribed as part of a single long transcript encompassing much of the CRISPR array.<ref name="pmid20125085"/> This transcript is then cleaved by Cas proteins to form crRNAs. The mechanism to produce crRNAs differs among CRISPR-Cas systems. In type I-E and type I-F systems, the proteins Cas6e and Cas6f respectively, recognise stem-loops[109][110][111] created by the pairing of identical repeats that flank the crRNA.[112] These Cas proteins cleave the longer transcript at the edge of the paired region, leaving a single crRNA along with a small remnant of the paired repeat region.

    Type III systems also use Cas6; however, their repeats do not produce stem-loops. Cleavage instead occurs by the longer transcript wrapping around the Cas6 to allow cleavage just upstream of the repeat sequence.[113][114][115]

    Type II systems lack the Cas6 gene and instead utilize RNaseIII for cleavage. Functional type II systems encode an extra small RNA that is complementary to the repeat sequence, known as a trans-activating crRNA (tracrRNA).<ref name="Deltcheva2011"/> Transcription of the tracrRNA and the primary CRISPR transcript results in base pairing and the formation of dsRNA at the repeat sequence, which is subsequently targeted by RNaseIII to produce crRNAs. Unlike the other two systems, the crRNA does not contain the full spacer, which is instead truncated at one end.<ref name="pmid22949671"/>

    CrRNAs associate with Cas proteins to form ribonucleoprotein complexes that recognize foreign nucleic acids. CrRNAs show no preference between the coding and non-coding strands, which is indicative of an RNA-guided DNA-targeting system.<ref name="pmid19095942"/><ref name="Garneau2010"/><ref name="pmid19120484"/><ref name="pmid22558257"/>[116][117][118] The type I-E complex (commonly referred to as Cascade) requires five Cas proteins bound to a single crRNA.[119][120]

    *** Interference

    During the interference stage in type I systems, the PAM sequence is recognized on the crRNA-complementary strand and is required along with crRNA annealing. In type I systems correct base pairing between the crRNA and the protospacer signals a conformational change in Cascade that recruits Cas3 for DNA degradation.

    Type II systems rely on a single multifunctional protein, Cas9, for the interference step.<ref name="pmid22949671"/> Cas9 requires both the crRNA and the tracrRNA to function and cleave DNA using its dual HNH and RuvC/RNaseH-like endonuclease domains. Basepairing between the PAM and the phage genome is required in type II systems. However, the PAM is recognized on the same strand as the crRNA (the opposite strand to type I systems).

    Type III systems, like type I, require six or seven Cas proteins binding to crRNAs.[121][122] The type III systems analysed from _S. solfataricus_ and _P. furiosus_ both target the mRNA of phages rather than phage DNA genome,<ref name="pmid23320564"/><ref name="pmid19945378"/> which may make these systems uniquely capable of targeting RNA-based phage genomes.<ref name="pmid22337052"/> Type III systems were also found to target DNA in addition to RNA using a different Cas protein in the complex, Cas10.[123] The DNA cleavage was shown to be transcription dependent.[124]

    The mechanism for distinguishing self from foreign DNA during interference is built into the crRNAs and is therefore likely common to all three systems. Throughout the distinctive maturation process of each major type, all crRNAs contain a spacer sequence and some portion of the repeat at one or both ends. It is the partial repeat sequence that prevents the CRISPR-Cas system from targeting the chromosome as base pairing beyond the spacer sequence signals self and prevents DNA cleavage.[125] RNA-guided CRISPR enzymes are classified as type V restriction enzymes.
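
    The self versus foreign rule above can be illustrated with a deliberately simplified sketch. It assumes the crRNA carries a repeat-derived portion (called `handle` here, a hypothetical name) immediately before the spacer, writes everything as DNA on one strand, and treats identity on that strand as a stand-in for base pairing with the opposite strand. Real systems differ in which end carries the repeat portion and in how many positions are checked.

```python
def classify_site(handle: str, spacer: str, site: str) -> str:
    """Classify a candidate site as 'self', 'foreign', or 'no match'.

    `site` is the flanking sequence followed by the candidate protospacer,
    written in the same orientation as handle + spacer.
    """
    flank = site[: len(handle)]
    protospacer = site[len(handle):]
    if protospacer != spacer:
        return "no match"  # the spacer does not pair with this site
    # In the host's own CRISPR locus the spacer sits next to a repeat, so
    # pairing continues beyond the spacer and the site is treated as self.
    return "self" if flank == handle else "foreign"


toy_handle = "GGCC"        # made-up repeat-derived portion
toy_spacer = "ATGCATGC"    # made-up spacer
own_locus = toy_handle + toy_spacer   # spacer flanked by the repeat
phage_site = "TTAA" + toy_spacer      # same sequence, unrelated flank
```

    With these toy inputs, the copy in the host's own array is classified as self and spared, while the identical sequence in the phage, lacking the repeat flank, is classified as foreign.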

    ** Evolution

    {{Infobox protein family | Symbol = CRISPR_Cas2 | Name = CRISPR associated protein Cas2 (adaptation RNase) | image = PDB 1zpw EBI.jpg | width = | caption = Crystal structure of a hypothetical protein tt1823 from Thermus thermophilus | Pfam = PF09827 | Pfam_clan = | InterPro = IPR019199 | SMART = | PROSITE = | MEROPS = | SCOP = | TCDB = | OPM family = | OPM protein = | CAZy = | CDD = cd09638 }}

    {{Infobox protein family | Symbol = CRISPR_Cse1 | Name = CRISPR-associated protein CasA/Cse1 (Type I effector DNase) | image = | width = | caption = | Pfam = PF09481 | Pfam_clan = | InterPro = IPR013381 | SMART = | PROSITE = | MEROPS = | SCOP = | TCDB = | OPM family = | OPM protein = | CAZy = | CDD = cd09729 }} {{Infobox protein family | Symbol = CRISPR_assoc | Name = CRISPR associated protein CasC/Cse3/Cas6 (Type I effector RNase) | image = PDB 1wj9 EBI.jpg | width = | caption = Crystal structure of a crispr-associated protein from Thermus thermophilus | Pfam = PF08798 | Pfam_clan = CL0362 | InterPro = IPR010179 | SMART = | PROSITE = | MEROPS = | SCOP = | TCDB = | OPM family = | OPM protein = | CAZy = | CDD = cd09727 }}

    The cas genes in the adaptor and effector modules of the CRISPR-Cas system are believed to have evolved from two different ancestral modules. A transposon-like element called casposon encoding the Cas1-like integrase and potentially other components of the adaptation module was inserted next to the ancestral effector module, which likely functioned as an independent innate immune system.[126] The highly conserved cas1 and cas2 genes of the adaptor module evolved from the ancestral module while a variety of class 1 effector cas genes evolved from the ancestral effector module.[127] The evolution of these various class 1 effector module cas genes was guided by various mechanisms, such as duplication events.[128] On the other hand, each type of class 2 effector module arose from subsequent independent insertions of mobile genetic elements.[129] These mobile genetic elements took the place of the multiple gene effector modules to create single gene effector modules that produce large proteins which perform all the necessary tasks of the effector module.<ref name="Shmakov-2017"/> The spacer regions of CRISPR-Cas systems are taken directly from foreign mobile genetic elements and thus their long-term evolution is hard to trace.[130] The non-random evolution of these spacer regions has been found to be highly dependent on the environment and the particular foreign mobile genetic elements it contains.[131]

    CRISPR-Cas can immunize bacteria against certain phages and thus halt transmission. For this reason, Koonin described CRISPR-Cas as a Lamarckian inheritance mechanism.[132] However, this was disputed by a critic who noted, "We should remember [Lamarck] for the good he contributed to science, not for things that resemble his theory only superficially. Indeed, thinking of CRISPR and other phenomena as Lamarckian only obscures the simple and elegant way evolution really works".[133] But as more recent studies have been conducted, it has become apparent that the acquired spacer regions of CRISPR-Cas systems are indeed a form of Lamarckian evolution because they are genetic mutations that are acquired and then passed on.[134] On the other hand, the evolution of the Cas gene machinery that facilitates the system evolves through classic Darwinian evolution.<ref name="Koonin-2016"/>

    *** Coevolution

    Analysis of CRISPR sequences revealed coevolution of host and viral genomes.[135]

    In the basic model of CRISPR evolution, newly incorporated spacers drive phages to mutate their genomes to avoid the bacterial immune response, creating diversity in both the phage and host populations. To resist a phage infection, the sequence of the CRISPR spacer must correspond perfectly to the sequence of the target phage gene. Phages can continue to infect their hosts given point mutations in the region targeted by the spacer.<ref name="pmid20072129"/> Similar stringency is required in the PAM; otherwise the bacterial strain remains phage-sensitive.<ref name="pmid18065545"/><ref name="pmid20072129"/>
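
    The perfect-match requirement described above can be illustrated with a minimal sketch. It assumes that the PAM lies immediately 3' of the targeted sequence, that only the given strand is searched, and that the PAM is a literal string with no degenerate positions. The function name `is_targeted` and all sequences are hypothetical toy values, not taken from the cited studies.

```python
def is_targeted(spacer: str, pam: str, phage_genome: str) -> bool:
    """Return True if the genome holds an exact copy of the spacer
    immediately followed by an exact copy of the PAM (toy model)."""
    spacer, pam, genome = spacer.upper(), pam.upper(), phage_genome.upper()
    start = genome.find(spacer)
    while start != -1:
        pam_start = start + len(spacer)
        if genome[pam_start:pam_start + len(pam)] == pam:
            return True
        # Keep looking: the spacer may occur again with a valid PAM.
        start = genome.find(spacer, start + 1)
    return False


# Hypothetical toy sequences (assumptions for illustration only).
SPACER = "ACGTTGCAAGCTTAGGCTAA"
PAM = "AGAA"
wild_type = "TTT" + SPACER + PAM + "CCC"
# One point mutation inside the targeted region.
target_mutant = "TTT" + SPACER[:10] + "T" + SPACER[11:] + PAM + "CCC"
# One point mutation inside the PAM.
pam_mutant = "TTT" + SPACER + "AGTA" + "CCC"

for name, genome in [("wild type", wild_type),
                     ("target mutant", target_mutant),
                     ("PAM mutant", pam_mutant)]:
    print(name, is_targeted(SPACER, PAM, genome))
```

    Under these assumptions a single substitution in either the targeted region or the PAM is enough for the toy phage to escape, which mirrors the stringency described in the text.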

    *** Rates

    A study of 124 _S. thermophilus_ strains showed that 26% of all spacers were unique and that different CRISPR loci showed different rates of spacer acquisition.<ref name="pmid18065539"/> Some CRISPR loci evolve more rapidly than others, which allowed the strains' phylogenetic relationships to be determined. A comparative genomic analysis showed that _E. coli_ and _S. enterica_ evolve much more slowly than _S. thermophilus_. Strains of the former two species that diverged 250,000 years ago still contained the same spacer complement.[136]

    Metagenomic analysis of two acid-mine-drainage biofilms showed that one of the analyzed CRISPRs contained extensive deletions and spacer additions versus the other biofilm, suggesting a higher phage activity/prevalence in one community than the other.<ref name="pmid17894817"/> In the oral cavity, a temporal study determined that 7–22% of spacers were shared over 17 months within an individual while less than 2% were shared across individuals.<ref name="pmid21149389"/>
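
    A figure such as "7–22% of spacers were shared" depends on how sharing is counted, and the text does not give the definition used in the cited study. The sketch below assumes one plausible definition, the fraction of distinct spacers present in both samples out of all distinct spacers seen in either (the Jaccard index). The function name and spacer labels are hypothetical.

```python
def shared_spacer_fraction(sample_a, sample_b):
    """Fraction of distinct spacers found in both samples, relative to
    all distinct spacers seen in either sample (Jaccard index).
    This definition is an assumption, not the cited study's method."""
    a, b = set(sample_a), set(sample_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical spacer labels from two time points of one individual.
month_0 = ["s1", "s2", "s3", "s4"]
month_17 = ["s3", "s4", "s5", "s6"]
print(f"{shared_spacer_fraction(month_0, month_17):.1%}")
```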

    From the same environment, a single strain was tracked using PCR primers specific to its CRISPR system. Broad-level results of spacer presence/absence showed significant diversity. However, this CRISPR added only three spacers over 17 months,<ref name="pmid21149389"/> suggesting that even in an environment with significant CRISPR diversity some loci evolve slowly.

    CRISPRs were analysed from the metagenomes produced for the Human Microbiome Project.[137] Although most were body-site specific, some within a body site are widely shared among individuals. One of these loci originated from streptococcal species and contained ≈15,000 spacers, 50% of which were unique. Similar to the targeted studies of the oral cavity, some showed little evolution over time.<ref name="pmid22719260"/>

    CRISPR evolution was studied in chemostats using _S. thermophilus_ to directly examine spacer acquisition rates. In one week, _S. thermophilus_ strains acquired up to three spacers when challenged with a single phage.[138] During the same interval, the phage developed single-nucleotide polymorphisms that became fixed in the population, suggesting that targeting had prevented phage replication absent these mutations.<ref name="pmid23057534"/>

    Another _S. thermophilus_ experiment showed that phages can infect and replicate in hosts that have only one targeting spacer. Yet another showed that sensitive hosts can exist in environments with high phage titres.[139] The chemostat and observational studies suggest many nuances to CRISPR and phage (co)evolution.

    ** Identification

    CRISPRs are widely distributed among bacteria and archaea<ref name="pmid24728998"/> and show some sequence similarities.<ref name="pmid17442114"/> Their most notable characteristic is their repeating spacers and direct repeats. This characteristic makes CRISPRs easily identifiable in long sequences of DNA, since the number of repeats decreases the likelihood of a false positive match.[140]
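
    The idea that several regularly spaced copies of the same repeat make a false positive unlikely can be sketched as follows. This is a deliberately naive detector, not the method of any published tool: it assumes exact repeats on one strand, and the function name, parameters and toy lengths (far shorter than real repeats and spacers) are all hypothetical.

```python
from collections import defaultdict


def find_crispr_like_arrays(seq, repeat_len=8, min_repeats=3,
                            min_spacer=4, max_spacer=12):
    """Return (repeat, start_positions) for every k-mer that occurs at
    least `min_repeats` times with spacer-sized gaps between copies."""
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - repeat_len + 1):
        positions[seq[i:i + repeat_len]].append(i)

    hits = []
    for kmer, starts in positions.items():
        if len(starts) < min_repeats:
            continue  # too few copies: higher risk of a chance match
        gaps = [b - a - repeat_len for a, b in zip(starts, starts[1:])]
        if all(min_spacer <= g <= max_spacer for g in gaps):
            hits.append((kmer, starts))
    return hits


# Toy array: three copies of a hypothetical repeat with two spacers.
REPEAT = "GTTTTAGA"
toy = "CC" + REPEAT + "ACGTAC" + REPEAT + "TTGCCA" + REPEAT + "GG"
print(find_crispr_like_arrays(toy))
```

    Raising `min_repeats` makes the search stricter, which corresponds to the point made above that a larger number of repeats lowers the chance of a spurious match.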

    Analysis of CRISPRs in metagenomic data is more challenging, as CRISPR loci typically do not assemble, because their repetitive nature and strain variation confuse assembly algorithms. Where many reference genomes are available, polymerase chain reaction (PCR) can be used to amplify CRISPR arrays and analyse spacer content.<ref name="pmid18065539"/><ref name="pmid21149389"/>[141][142][143][144] However, this approach yields information only for specifically targeted CRISPRs and for organisms with sufficient representation in public databases to design reliable PCR primers. Degenerate repeat-specific primers can be used to amplify CRISPR spacers directly from environmental samples; amplicons containing two or three spacers can then be computationally assembled to reconstruct long CRISPR arrays.<ref name="pmid31729390"/>
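
    How short amplicons could be chained into a longer array can be sketched under strong simplifying assumptions: each amplicon is already reduced to an ordered tuple of spacer identifiers, every spacer occurs once, and all amplicons come from a single linear array. The cited study's actual procedure is not described here, so the function below is a hypothetical illustration only.

```python
def assemble_array(amplicons):
    """Chain amplicons (tuples of consecutive spacer IDs) into one
    ordered spacer list by following spacer -> next-spacer links."""
    successor = {}
    has_predecessor = set()
    spacers = set()
    for amplicon in amplicons:
        spacers.update(amplicon)
        for left, right in zip(amplicon, amplicon[1:]):
            # setdefault keeps the first link; a different later link
            # means the amplicons disagree about the order.
            if successor.setdefault(left, right) != right:
                raise ValueError(f"conflicting successors for {left}")
            has_predecessor.add(right)

    starts = sorted(spacers - has_predecessor)
    if len(starts) != 1:
        raise ValueError("amplicons do not form a single linear array")

    order = [starts[0]]
    while order[-1] in successor:
        nxt = successor[order[-1]]
        if nxt in order:
            raise ValueError("cycle detected in spacer links")
        order.append(nxt)
    return order


# Hypothetical amplicons with two or three spacers each.
print(assemble_array([("s3", "s4"), ("s1", "s2", "s3"), ("s2", "s3")]))
```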

    The alternative is to extract and reconstruct CRISPR arrays from shotgun metagenomic data. This is computationally more difficult, particularly with second generation sequencing technologies (e.g. 454, Illumina), as the short read lengths prevent more than two or three repeat units appearing in a single read. CRISPR identification in raw reads has been achieved using purely _de novo_ identification[145] or by using direct repeat sequences in partially assembled CRISPR arrays from contigs (overlapping DNA segments that together represent a consensus region of DNA)<ref name="pmid22719260"/> and direct repeat sequences from published genomes[146] as a hook for identifying direct repeats in individual reads.
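
    Using a known direct repeat as a hook can be sketched as a simple string search over individual reads. The example assumes exact repeat matches on one strand and keeps only spacers flanked by a complete repeat on both sides. The function name and sequences are hypothetical, and real pipelines would need to tolerate mismatches and reverse complements.

```python
def extract_spacers(read, repeat):
    """Return the sequences lying between consecutive exact copies of
    `repeat` in `read` (spacers flanked by a repeat on both sides)."""
    read, repeat = read.upper(), repeat.upper()
    starts = []
    pos = read.find(repeat)
    while pos != -1:
        starts.append(pos)
        pos = read.find(repeat, pos + len(repeat))
    return [read[a + len(repeat):b] for a, b in zip(starts, starts[1:])]


# Toy short read: two complete repeats and one truncated repeat, so
# only one spacer is fully flanked.
REPEAT = "GTTTTAGA"
short_read = "AC" + REPEAT + "ACGTAC" + REPEAT + "TTGCCA" + REPEAT[:5]
print(extract_spacers(short_read, REPEAT))
```

    The truncated third repeat shows why short reads limit recovery: the second spacer is present in the read but cannot be delimited with confidence.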

    ** Use by phages

    Another way for bacteria to defend against phage infection is by having chromosomal islands. A subtype of chromosomal islands called phage-inducible chromosomal island (PICI) is excised from a bacterial chromosome upon phage infection and can inhibit phage replication.[147] PICIs are induced, excised, replicated, and finally packaged into small capsids by certain staphylococcal temperate phages. PICIs use several mechanisms to block phage reproduction. In the first mechanism, PICI-encoded Ppi differentially blocks phage maturation by binding or interacting specifically with phage TerS, hence blocking the formation of the phage TerS/TerL complex responsible for phage DNA packaging. In the second mechanism, PICI CpmAB redirects the phage capsid morphogenetic protein so that 95% of the capsids produced are SaPI-sized; phage DNA can package only one-third of its genome in these small capsids, so the resulting phage is nonviable.[148] The third mechanism involves two proteins, PtiA and PtiB, that target LtrC, which is responsible for the production of virion and lysis proteins. This interference mechanism is modulated by a modulatory protein, PtiM, which binds to one of the interference-mediating proteins, PtiA, and hence achieves the required level of interference.[149]

    One study showed that the lytic ICP1 phage, which specifically targets _Vibrio cholerae_ serogroup O1, has acquired a CRISPR-Cas system that targets a _V. cholerae_ PICI-like element. The system has two CRISPR loci and nine Cas genes. It seems to be homologous to the I-F system found in _Yersinia pestis_. Moreover, like the bacterial CRISPR-Cas system, ICP1 CRISPR-Cas can acquire new sequences, which allows phage and host to co-evolve.[150][151]

    Certain archaeal viruses were shown to carry mini-CRISPR arrays containing one or two spacers. It has been shown that spacers within the virus-borne CRISPR arrays target other viruses and plasmids, suggesting that mini-CRISPR arrays represent a mechanism of heterotypic superinfection exclusion and participate in interviral conflicts.<ref name="pmid31729390"/>

    ** Applications

    [CRISPR gene editing]

    CRISPR gene editing is a revolutionary technology that allows for precise, targeted modifications to the DNA of living organisms. Developed from a natural defense mechanism found in bacteria, CRISPR-Cas9 is the most commonly used system. Gene editing with CRISPR-Cas9 involves a Cas9 nuclease and an engineered guide RNA, which come together to allow for the precise "cutting" of one or both strands of DNA at specific locations within the genome.[152] It makes use of the cell's natural DNA repair systems, including non-homologous end joining, homology-directed repair, or mismatch repair, to modify, insert, or delete genetic material at these specific cut sites.<ref name="Koblan_2020" />[153] This technology has transformed fields such as genetics, medicine,[154][155][156] and agriculture,[157] offering potential treatments for genetic disorders, advancements in crop engineering, and research into the fundamental workings of life. However, its ethical implications and potential unintended consequences have sparked significant debate.[158][159]
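
    How a guide RNA directs Cas9 to "specific locations" can be sketched as a search for candidate target sites. The example assumes the widely used _Streptococcus pyogenes_ Cas9, a 20-nucleotide guide, and an NGG PAM immediately 3' of the target; none of these specifics is stated in this section. It scans the given strand only and ignores off-target effects. The function name and sequence are hypothetical.

```python
import re


def find_cas9_sites(dna, guide_len=20):
    """Return (start, protospacer, pam) for every position where a
    `guide_len`-nt stretch is directly followed by an NGG PAM.
    Forward strand only; assumes an SpCas9-style NGG PAM."""
    dna = dna.upper()
    # The lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(dna)]


# Toy sequence containing a single candidate site.
toy_dna = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, protospacer, pam in find_cas9_sites(toy_dna):
    print(start, protospacer, pam)
```

    In practice the guide RNA would be designed to match one of the reported protospacers; choosing among candidates involves considerations this sketch leaves out.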

    ** See also

    - CRISPR activation
    - Anti-CRISPR
    - CRISPR/Cas Tools
    - CRISPR gene editing
    - The CRISPR Journal
    - Designer baby
    - DRACO
    - Gene knockout
    - Genome-wide CRISPR-Cas9 knockout screens
    - Glossary of genetics
    - Human germline engineering
    - _Human Nature_ (2019 documentary film)
    - MAGESTIC
    - New eugenics
    - Prime editing
    - RNAi
    - SiRNA
    - Surveyor nuclease assay
    - Synthetic biology
    - Zinc finger

    ** Further reading

    - Doudna J, Mali P
    - Mohanraju P, Makarova KS, Zetsche B, Zhang F, Koonin EV, van der Oost J
    - Sander JD, Joung JK
    - Slaymaker IM, Gao L, Zetsche B, Scott DA, Yan WX, Zhang F
    - Terns RM, Terns MP
    - Westra ER, Buckling A, Fineran PC
    - Andersson AF, Banfield JF
    - Hale C, Kleppe K, Terns RM, Terns MP
    - van der Ploeg JR
    - van der Oost J, Brouns SJ
    - Karginov FV, Hannon GJ
    - Pul U, Wurm R, Arslan Z, Geissen R, Hofmann N, Wagner R
    - Díez-Villaseñor C, Almendros C, García-Martínez J, Mojica FJ
    - Deveau H, Garneau JE, Moineau S
    - Koonin EV, Makarova KS
    - The age of the red pen
    - Ran AF, Hsu PD, Wright J, Agarwala V, Scott DA, Zhang F

    ** External links

    - https://fas.org/sgp/crs/misc/R44824.pdf
    - https://www.ibiology.org/ibiomagazine/jennifer-doudna-genome-engineering-with-crispr-cas9-birth-of-a-breakthrough-technology.html
    - Human Nature

    *** Protein Data Bank

    - Q46901
    - P76632
    - Q46899
    - Q46898
    - Q46897
