Surface representation of the Cas1-Cas2 complex, consisting of four Cas1 proteins (light and dark green) and two Cas2 proteins (yellow). Donor DNA (brown) is being integrated into the target DNA (blue) at a precise location in the CRISPR array, following a short leader sequence (red). [From Wright, A. V., et al. “Structures of the CRISPR Genome Integration Complex,” Science 357(6356), 1113–1118 (2017). DOI:10.1126/science.aao0679. Reprinted with permission from AAAS.]
Bacterial DNA often contains regions of clustered regularly interspaced short palindromic repeats (CRISPRs), which work together with associated Cas proteins (CRISPR-associated endonucleases). The CRISPR-Cas system has revolutionized gene editing by vastly simplifying the insertion of short snippets of new (“donor”) DNA into very specific locations of target DNA. Researchers in this study discovered how Cas proteins recognize their target locations with such great specificity. They used x-ray crystallography to solve the structures of Cas1 and Cas2, the proteins responsible for DNA-snippet capture and integration, while the proteins were bound to synthesized DNA strands designed to mimic different stages of the process. The research also demonstrated how the system works in its native context as part of a bacterial immune system and how Cas proteins act as general-purpose molecular recording devices, that is, tools for encoding information in genomes.
Cas1 appears to have evolved from a more “promiscuous” (less selective) type of enzyme that catalyzes the movement of DNA sequences from one position to another (a transposase). At some point, Cas1 acquired an unusual degree of specificity for a particular location in the bacterial genome, the CRISPR array. This specificity is critical to the bacteria, both for acquiring immunity and for avoiding genome damage caused by the insertion of viral fragments at the wrong location. The researchers wanted to learn how Cas1-Cas2 proteins recognize the target sequence to enable comparison with previously studied transposases and integrases (i.e., enzymes that catalyze the integration of donor DNA into target DNA) and to determine whether the proteins can be altered to recognize new sequences for custom applications.
The researchers crystallized Cas1-Cas2 in complex with preformed DNA strands that mimicked reaction intermediates and products. The crystal structures showed substantial distortions in the target DNA but surprisingly few sequence-specific contacts between the DNA and the Cas1-Cas2 complex, and the resulting flexibility of the DNA produced disorder in the crystals. Attempts to model the DNA across the disordered sections indicated that the DNA had to be distorted even further. Cryo-electron microscopy experiments, coupled with the crystallography data, confirmed that an accessory protein called integration host factor (IHF) introduces an additional sharp bend in the DNA, bringing an upstream recognition sequence into contact with Cas1 and increasing both the specificity and efficiency of integration. The architecture of the CRISPR integration complex suggests that subtle adjustment of the distance between Cas1 active sites could reprogram the system to recognize different target sites. Such architectural changes could be exploited for genome tagging applications and may also explain the natural divergence of CRISPR arrays in bacteria.
Wright, A. V., et al. “Structures of the CRISPR Genome Integration Complex,” Science 357(6356), 1113–1118 (2017). [DOI:10.1126/science.aao0679].
Instruments and Facilities Used: X-ray macromolecular crystallography, protein crystallography (PX), and scattering/diffraction at beamline 8.3.1 of the Advanced Light Source at Lawrence Berkeley National Laboratory; beamline 9-2 of the Stanford Synchrotron Radiation Lightsource.