
Wandera Etal Deep Learning

This study reports on a deep learning algorithm called DeepAcr that was developed to predict anti-CRISPR proteins (Acrs) across different CRISPR-Cas systems. DeepAcr analyzed genomic data and predicted numerous putative Acrs spanning almost all CRISPR-Cas types and subtypes, including over 7,000 putative type IV and VI Acrs not predicted by other algorithms. The researchers then performed a cell-free screen and identified a potent inhibitor of Cas13b nucleases from type VI-B CRISPR-Cas systems, which they named AcrVIB1. Testing showed that AcrVIB1 blocks Cas13b-mediated defense against a targeted plasmid and lytic phage by


Technology

Anti-CRISPR prediction using deep learning reveals an inhibitor of Cas13b nucleases
Authors
Katharina G. Wandera, Omer S. Alkhnbashi, Harris v.I. Bassett, ..., Anzhela Migur, Rolf Backofen, Chase L. Beisel

Correspondence
[email protected] (R.B.), [email protected] (C.L.B.)

In brief
Wandera et al. applied deep learning to predict anti-CRISPR protein candidates associated with diverse subtypes of CRISPR-Cas immune systems. The algorithm identified one protein then shown to inhibit the Cas13b nuclease from type VI-B CRISPR-Cas systems, expanding the known range of phage-encoded inhibitors as part of the bacteria-phage arms race.

Highlights
• Deep learning predicts Acr candidates
• The approach, DeepAcr, identified candidates across CRISPR-Cas types and subtypes
• A cell-free screen revealed AcrVIB1, an inhibitor of Cas13b nucleases
• AcrVIB1 principally inhibits Cas13b upstream of ribonucleoprotein complex formation

Wandera et al., 2022, Molecular Cell 82, 2714–2726
July 21, 2022 © 2022 Elsevier Inc.
https://doi.org/10.1016/j.molcel.2022.05.003
Katharina G. Wandera,1,6 Omer S. Alkhnbashi,2,6 Harris v.I. Bassett,1 Alexander Mitrofanov,3 Sven Hauns,3 Anzhela Migur,1 Rolf Backofen,3,4,* and Chase L. Beisel1,5,7,*
1Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), 97080 Würzburg, Germany
2Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
3Universität Freiburg, 79098 Freiburg, Germany
4Signalling Research Centres BIOSS and CIBSS, University of Freiburg, 79098 Freiburg, Germany
5Medical Faculty, University of Würzburg, 97080 Würzburg, Germany
6These authors contributed equally
7Lead contact
*Correspondence: [email protected] (R.B.), [email protected] (C.L.B.)



SUMMARY

As part of the ongoing bacterial-phage arms race, CRISPR-Cas systems in bacteria clear invading phages
whereas anti-CRISPR proteins (Acrs) in phages inhibit CRISPR defenses. Known Acrs have proven extremely
diverse, complicating their identification. Here, we report a deep learning algorithm for Acr identification that
revealed an Acr against type VI-B CRISPR-Cas systems. The algorithm predicted numerous putative Acrs
spanning almost all CRISPR-Cas types and subtypes, including over 7,000 putative type IV and VI Acrs
not predicted by other algorithms. By performing a cell-free screen for Acr hits against type VI-B systems,
we identified a potent inhibitor of Cas13b nucleases we named AcrVIB1. AcrVIB1 blocks Cas13b-mediated
defense against a targeted plasmid and lytic phage, and its inhibitory function principally occurs upstream of
ribonucleoprotein complex formation. Overall, our work helps expand the known Acr universe, aiding our understanding
of the bacteria-phage arms race and the use of Acrs to control CRISPR technologies.

INTRODUCTION

Bacteria and bacterial viruses called phages have been locked in an evolutionary arms race that has led each side to develop an arsenal of offensive and defensive strategies. On the bacterial side, one of many defenses is conferred by CRISPR-Cas systems, the only known adaptive immune systems in bacteria (Barrangou et al., 2007; Nussenzweig and Marraffini, 2020). These remarkably diverse systems comprising two classes, six types, and over thirty subtypes and variants utilize RNA-guided nucleases to clear infections or shut down infected cells upon recognition of complementary genetic material (Brouns et al., 2008; Garneau et al., 2010; Jackson et al., 2017; Meeske et al., 2019; Makarova et al., 2020). On the phage side, anti-CRISPR proteins (Acrs) inhibit different steps in CRISPR-based immunity (Bondy-Denomy et al., 2013; Bondy-Denomy, 2018). Phages encoding Acrs can immediately circumvent CRISPR-Cas defenses (Lin et al., 2020; Meeske et al., 2020), or they can inhibit these defenses while being cleared, mediating the next wave of phage infection (Borges et al., 2018; Landsberger et al., 2018; Chevallereau et al., 2020).

There has been an ongoing effort to identify new Acrs to not only reveal the complexities of the bacteria-phage arms race but also to control CRISPR-Cas nucleases in their diverse applications (Marino et al., 2020). To date, numerous Acrs have been discovered that not only bear little similarity to each other but also act through different mechanisms of inhibition (Bondy-Denomy et al., 2015; Pawluk et al., 2018; Trasanidou et al., 2019). Established Acrs also have been associated with only ten subtypes of CRISPR-Cas systems (i.e., I-C, I-D, I-E, I-F, II-A, II-C, III-A, III-B, V-A, and VI-A) (Bondy-Denomy et al., 2018). As Acrs likely exist for the dozens of remaining CRISPR-Cas subtypes, identifying and characterizing these Acrs remains a major focus.

Following the discovery of Acrs in Pseudomonas aeruginosa (Bondy-Denomy et al., 2013), numerous identification strategies have been developed that have greatly expanded the known universe of Acrs. Originally, Acrs were identified by screening for phages that infect cells despite being targeted by an endogenous CRISPR-Cas system and then by identifying the responsible coding regions (Pawluk et al., 2014; Hynes et al., 2017). The identified Acrs were generally short and hydrophobic but otherwise bore virtually no resemblance to each other (Hynes et al., 2017, 2018; Rauch et al., 2017; Lee et al., 2018), preventing the identification of new Acrs based on sequence information alone. Instead, researchers noticed that the Acrs were often encoded next to proteins with a helix-turn-helix (HTH) motif later


shown to regulate Acr expression (Birkholz et al., 2019; Stanley et al., 2019). Beyond these Acr-associated (Aca) proteins, Acrs were often present in genomes with CRISPR-Cas systems encoding self-targeting spacers, where the Acr was responsible for preventing lethal self-targeting (Rauch et al., 2017; Watters et al., 2018). These insights led to a guilt-by-association approach used to identify Acr candidates in prophage regions. Eventually, machine learning was applied that leveraged guilt-by-association as well as a few features of known Acrs, such as length, hydrophobicity, amino acid composition, and any present motifs (Gussow et al., 2020; Wang et al., 2020; Huang et al., 2021). Although these approaches have expanded the known set of Acrs, they have only led to the validation of Acrs associated with CRISPR-Cas subtypes in which Acrs were already known.

Here, we report a deep-learning algorithm called DeepAcr that predicts Acr candidates based purely on protein sequence information. The predictions unique to DeepAcr were mainly associated with subtypes lacking any established Acrs (e.g., IV-A and V-B), whereas the systematic screening of the highest-scoring candidates against type VI-B CRISPR-Cas systems led to the discovery of an Acr that principally inhibited the Cas13b nuclease prior to complexing with its crRNA. Our algorithm and the use of deep learning are expected to further expand the known universe of Acrs.

DESIGN

A deep-learning approach combines compositional and sequence-related features without relying on genomic associations
Given the growing list of validated Acrs and the remaining number of CRISPR-Cas subtypes unassociated with Acrs (Bondy-Denomy et al., 2018), we sought to develop a distinct approach for Acr prediction. Unlike most of the prior approaches that relied on genomic associations (i.e., guilt-by-association), we focused on a large set of features derived only from the assessed protein. We also utilized deep learning instead of traditional machine learning to apply multiple learning structures for Acr prediction from sequence-related features (Eitzinger et al., 2020; Gussow et al., 2020). The resulting deep-learning algorithm, DeepAcr, functions through a series of defined steps inspired by a prior algorithm (Guo et al., 2019) (Figures 1A and S1). Initially, the assessed protein sequence is converted into a feature matrix using one-hot encoding, which converts amino acid sequences into numerical values. Additionally, a set of twelve features capturing properties of the entire protein (e.g., protein length and instability index) is extracted (Table S1). The one-hot encoding is then fed into a bidirectional recurrent cell incorporating three neural networks as the learning structures: long short-term memory (LSTM), linear, and gated recurrent unit (GRU). The output of these networks is concatenated and combined with the protein features. A multilayer perceptron (MLP) then converts these inputs into a confidence score (0 and 1 for lowest and highest confidence, respectively) reflecting the certainty that the input protein is an Acr. After predicting putative Acrs using DeepAcr, each candidate is paired with a CRISPR-Cas subtype by identifying a CRISPR-Cas system in the Acr-encoding genome or closely related genomes using existing tools (Figure 1B) (Padilha et al., 2020, 2021; Alkhnbashi et al., 2021). Acrs with no CRISPR-Cas subtype or multiple associated subtypes are labeled "unassigned."

The resulting model was trained by employing 420 known Acrs derived from multiple CRISPR-Cas subtypes available during initial model construction (i.e., I-D, I-E, I-F, II-A, and II-C) (Bondy-Denomy et al., 2013; Pawluk et al., 2014, 2016; Rauch et al., 2017; He et al., 2018; Marino et al., 2018) as well as 420 non-Acrs represented by diverse small accessory proteins associated with type III CRISPR-Cas systems (Figure 1A; Table S2) (Shah et al., 2019). Of these Acrs and non-Acrs, 15% (126 proteins) were used to validate the performance of the model (Table S2). DeepAcr achieved a performance accuracy of 96% using the LSTM network, 94% using the linear network, and 95% using the GRU network when testing a withheld dataset (Figure S2).

RESULTS

DeepAcr uniquely predicts candidates outside of subtypes with known Acrs
When fed protein sequences from 80,009 draft and complete bacterial and phage genomes, DeepAcr identified 1,089,152 Acr candidates with medium (≥0.65) or high (≥0.8) confidence scores (Figure 2A; Tables S3 and S4). These candidates were principally associated with type I-E CRISPR-Cas systems, likely reflecting the large fraction of known Acrs against this subtype (Figure 2B). However, candidates were associated with many more subtypes than I-E, including subtypes such as VI-B or IV-A with no established Acrs. The number of Acr candidates within these subtypes became even more pronounced when normalizing to the number of CRISPR-Cas systems within each subtype identified in the genomes used for Acr prediction (Figure S3).

To compare our candidates with those from other available prediction algorithms, we applied the same set of protein inputs to AcrDB (Huang et al., 2021) and a previously reported machine learning algorithm we will call the Gussow method (Gussow et al., 2020). The candidates extensively overlapped with the predictions from DeepAcr, a remarkable result given that DeepAcr relies only on the protein sequence and no genomic features. Within this overlap, all 37,289 candidates from AcrDB were predicted by DeepAcr and the Gussow method, whereas 1,044,013 candidates were shared between the Gussow method and DeepAcr. Therefore, our deep-learning algorithm could closely recapitulate predictions by existing algorithms despite excluding commonly used genomic features such as self-targeting spacers or a flanking and properly oriented Aca.

Despite the extensive overlap, there were numerous candidates uniquely predicted by DeepAcr (7,850) and the Gussow method (1,019) (Figure 2A). The Gussow method's candidates were associated with the I-E and I-F subtypes already possessing a large cohort of Acrs (Figure 2C). However, the candidates unique to DeepAcr were heavily enriched in subtypes with no established Acrs. DeepAcr further predicted putative Acrs within the VI-A subtype in which seven Acrs have been reported (Lin et al., 2020; Meeske et al., 2020), although none of these were


[Figure 1 graphic, panel A. Learning data: 420 positive proteins (Acrs from anti-CRISPRdb) and 420 negative proteins (accessory proteins), with balanced data sampling and a 70%/15%/15% split labeled Train, Test, and Validate. Model architecture: the input is prepared as a one-hot encoding plus protein properties (protein length, molecular weight, instability index, isoelectric point, charged residues, extinction coefficients, average of hydropathy, "fractions of AC", ...); LSTM, GRU, and linear networks (ensemble models) perform feature learning; their outputs are concatenated and passed to a multilayer perceptron; the best-performing model is chosen by validation accuracy. Predicted score: high confidence, score ≥ 0.8; medium confidence, score ≥ 0.65; low confidence, score < 0.65.]

[Figure 1 graphic, panel B. Decision flow for an Acr candidate. CRISPR-Cas system present in genome? If yes, the present subtype is assigned to the Acr candidate. If no: closely related genome containing a single CRISPR-Cas system? If yes, the present subtype is assigned to the Acr candidate. If no, the Acr candidate remains unassigned.]

Figure 1. DeepAcr applies deep learning to predict Acrs from input protein sequences
(A) Using DeepAcr to predict an Acr confidence score from an input protein sequence. Model training using a set of known Acrs and non-Acrs. The Acrs were
derived from anti-CRISPRdb (Dong et al., 2018) and are listed in Table S2. The non-Acrs were derived from accessory proteins from type III CRISPR-Cas systems
that are similar in size to known Acrs. LSTM, long short-term memory; GRU, gated recurrent unit.
(B) Assignment of a CRISPR-Cas subtype to a predicted Acr.
See also Figure S1; Table S1; Methods S1.
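
The input encoding in panel A and the assignment rule in panel B can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DeepAcr implementation: the residue ordering, the all-zero row for non-standard residues, and the names `one_hot_encode` and `assign_subtype` are choices made for this example. The assignment function follows one reading of panel B combined with the statement that candidates with no subtype or with multiple subtypes are labeled unassigned.

```python
from typing import List

# The 20 standard amino acids. The ordering is arbitrary here (an assumption);
# any fixed ordering gives an equivalent encoding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Return one 20-element row per residue, with a single 1 marking the residue.

    Letters outside the 20 standard residues (e.g. X) become all-zero rows.
    That handling is an assumption of this sketch.
    """
    matrix = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        index = AA_INDEX.get(residue)
        if index is not None:
            row[index] = 1
        matrix.append(row)
    return matrix


def assign_subtype(genome_subtypes: List[str],
                   related_genome_subtypes: List[str]) -> str:
    """Pair a candidate with a CRISPR-Cas subtype, or return "unassigned".

    genome_subtypes: subtypes found in the Acr-encoding genome.
    related_genome_subtypes: subtypes found in a closely related genome,
    consulted only when the Acr-encoding genome has no CRISPR-Cas system.
    """
    own = set(genome_subtypes)
    if len(own) == 1:
        return next(iter(own))
    if len(own) > 1:
        return "unassigned"  # multiple associated subtypes
    related = set(related_genome_subtypes)
    if len(related) == 1:
        return next(iter(related))
    return "unassigned"  # no subtype, or an ambiguous related genome


encoded = one_hot_encode("MKV")
print(len(encoded), len(encoded[0]))   # rows = residues, columns = 20
print(assign_subtype([], ["VI-B"]))    # falls back to the related genome
```

In DeepAcr this matrix feeds the recurrent networks, while the twelve whole-protein features (Table S1) are computed separately and joined later; neither step is reproduced here.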


[Figure 2 graphic. Panel A: Venn diagram with 7,850 candidates unique to DeepAcr, 1,044,013 shared by DeepAcr and Gussow, 1,019 unique to Gussow, and 37,289 from AcrDB. Panel B: "DeepAcr - all candidates", number of predicted Acrs per CRISPR-Cas subtype. Panel C: "Gussow - unique candidates" and "DeepAcr - unique candidates", number of predicted Acrs per CRISPR-Cas subtype. Panel D: number of proteins versus protein length (aa) for published Acrs and predicted Acrs.]

Figure 2. DeepAcr predicts Acrs strongly overlapping with prior methods while also identifying candidates in CRISPR-Cas subtypes without an associated Acr
(A) Overlap in Acr predictions between DeepAcr, the Gussow method, and AcrDB. AcrDB combines three different prediction methods: AcrFinder, AcRanker, and PaCRISPR. The displayed numbers include redundant sequences appearing in different genomes. The complete list of redundant and non-redundant sequences can be found in Tables S3 and S4.
(B) CRISPR-Cas subtypes associated with the Acrs predicted by DeepAcr with high and medium confidence. Candidates without a single subtype are classified as unassigned. Subtypes lacking an established Acr are shown in red. High confidence score candidates associated with the VI-B subtype (in bold) were screened experimentally (see Figure 4).
(C) CRISPR-Cas subtypes associated with Acrs uniquely predicted by DeepAcr and the Gussow method.
(D) Amino acid length distribution of known Acrs and Acrs predicted by DeepAcr.
See also Figures S2 and S3; Tables S2, S3, and S4; Methods S1.
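
The overlap counts in panel A can be cross-checked with set arithmetic. The sketch below is illustrative only: `venn_regions` and the toy protein identifiers are hypothetical, and treating the 1,044,013 shared candidates as excluding the 37,289 AcrDB candidates is an assumption. That reading is at least consistent with the reported total, since the three DeepAcr regions then sum to the 1,089,152 candidates stated in the text.

```python
from typing import Dict, Set


def venn_regions(deepacr: Set[str], gussow: Set[str], acrdb: Set[str]) -> Dict[str, int]:
    """Count the four regions that carry numbers in the panel A diagram."""
    return {
        "deepacr_only": len(deepacr - gussow - acrdb),
        "gussow_only": len(gussow - deepacr - acrdb),
        "deepacr_and_gussow_only": len((deepacr & gussow) - acrdb),
        "all_three": len(deepacr & gussow & acrdb),
    }


# Toy identifiers (hypothetical) to show the set operations.
toy = venn_regions(
    deepacr={"p1", "p2", "p3", "p4"},
    gussow={"p2", "p3", "p4", "p5"},
    acrdb={"p4"},
)
print(toy)

# Region counts as displayed in Figure 2A (redundant sequences included).
FIGURE_2A = {
    "deepacr_only": 7_850,
    "gussow_only": 1_019,
    "deepacr_and_gussow_only": 1_044_013,
    "all_three": 37_289,
}

deepacr_total = (FIGURE_2A["deepacr_only"]
                 + FIGURE_2A["deepacr_and_gussow_only"]
                 + FIGURE_2A["all_three"])
print(deepacr_total)  # 1089152, the DeepAcr candidate count reported in the text
```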
part of our training set due to the timing of their publication. The candidates predicted by DeepAcr also exhibited a distinct length distribution (Figure 2D), with 74% matching the distribution of known Acrs (50–150 aa) and 24% exhibiting longer lengths (151–300 aa). This large set of candidates offers an opportunity to identify novel Acrs, particularly those associated with subtypes in which none are known.
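
Length fractions of this kind come from simple binning. The following sketch uses made-up lengths, not the predicted set; the helper name `length_bin_fractions` and the inclusive bin edges are assumptions for illustration.

```python
from typing import Dict, Iterable, Sequence, Tuple


def length_bin_fractions(
    lengths: Iterable[int],
    bins: Sequence[Tuple[int, int]] = ((50, 150), (151, 300)),
) -> Dict[str, float]:
    """Return the fraction of proteins whose length falls in each inclusive bin."""
    values = list(lengths)
    if not values:
        raise ValueError("at least one length is required")
    return {
        f"{low}-{high}": sum(low <= n <= high for n in values) / len(values)
        for low, high in bins
    }


# Hypothetical lengths in amino acids; 115 matches the length reported for AcrVIB1.
example_lengths = [60, 115, 149, 200, 320]
fractions = length_bin_fractions(example_lengths)
print(fractions)
```

Proteins outside both bins (here, the 320-aa entry) count toward the denominator but toward neither fraction, which is why the two reported percentages need not sum to 100%.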

DeepAcr scoring aligns with TXTL characterization of known Acrs against Cas13a nucleases
Given the large number of predictions, we needed a rapid means to identify candidates that can inhibit defense by a CRISPR-Cas system's effector nuclease. We turned to cell-free transcription-translation (TXTL) systems that recapitulate transcription and translation in a specially prepared E. coli cell lysate (Shin and Noireaux, 2012; Garamella et al., 2016). As part of a reaction, we added to the lysate DNA constructs encoding a Cas nuclease, a targeting or non-targeting guide RNA (gRNA), and a targeted reporter plasmid encoding a GFP variant (deGFP) that efficiently expresses in TXTL (Garamella et al., 2016) (Figure 3A). The lysate then expresses the nuclease and gRNA, forming a ribonucleoprotein (RNP) complex that cleaves the target and silences deGFP expression. Measuring changes in deGFP fluorescence over time then provides a dynamic readout of nuclease expression and activity (Maxwell et al., 2018; Marshall et al., 2020) (Figure S4). Including a DNA construct encoding a putative Acr allows the measurement of the inhibition of nuclease activity based on recovered deGFP fluorescence. This approach has been successfully used to screen for novel Acrs and assess inhibitory activity against different Cas nucleases from type II and V CRISPR-Cas systems (Marshall et al., 2018; Watters et al., 2018; Wandera et al., 2020).
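
One way to turn recovered deGFP fluorescence into a percent inhibition is to place the Acr reading on a scale between the targeting and non-targeting controls. The formula below is an assumption for illustration; the text above does not state how inhibition was calculated. The function name and the example values are hypothetical, and the sketch ignores the non-targeting + Acr control that the assay also includes.

```python
def percent_inhibition(non_targeting: float, targeting: float,
                       targeting_with_acr: float) -> float:
    """Percent of the silencing window recovered when the Acr is present.

    0 means fluorescence equals the targeting control (no inhibition);
    100 means it equals the non-targeting control (full inhibition).
    All inputs are end-point fluorescence values in the same units.
    """
    window = non_targeting - targeting
    if window <= 0:
        raise ValueError("targeting must lower fluorescence relative to non-targeting")
    return 100.0 * (targeting_with_acr - targeting) / window


# Hypothetical end-point values in arbitrary fluorescence units.
print(percent_inhibition(non_targeting=5.0, targeting=1.0, targeting_with_acr=3.0))  # 50.0
```

An Acr that suppresses deGFP expression on its own would distort this ratio, which is the non-specific inhibition case flagged separately in the heatmaps.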

We specifically focused on type VI CRISPR-Cas systems and their Cas13 nucleases, as many of the predicted Acrs fell within multiple type VI subtypes without any reported Acrs (Figure 2C). We initially devised a TXTL-based assay to measure the inhibition of Cas13 activity following our prior work (Marshall et al., 2018; Wandera et al., 2020). In this case, we can measure on-target and collateral RNA cleavage by targeting the deGFP transcript. Building on recent reports of eight distinct Acrs that inhibit Cas13a (Lin et al., 2020; Meeske et al., 2020), we employed our


[Figure 3 graphic. Panel A: schematic of the TXTL assay, in which an Acr pre-expressed for ~16 hours is combined with constructs for the Cas13a nuclease, a targeting or non-targeting gRNA, and a deGFP reporter whose target RNA carries a PFS; fluorescence is followed over time (~16 hours) for non-targeting, targeting, non-targeting + Acr, and targeting + Acr reactions. Panel B: heatmap of % inhibition of nuclease activity (scale 0 to 100, with non-specific GFP inhibition marked) for AcrVIA1′ through AcrVIA7′ and AcrVIA1 against Lwa and Lsh. Panel C: table of Acr predictions, reproduced below.]

Protein    AcRanker  PaCRISPR  AcrHub  DeepAcr
AcrVIA1′   Yes       Yes       Yes     0.74 (medium)
AcrVIA2′   Yes       Yes       Yes     0.48 (low)
AcrVIA3′   Yes       Yes       Yes     0.49 (low)
AcrVIA4′   Yes       Yes       Yes     0.41 (low)
AcrVIA5′   Yes       Yes       Yes     0.49 (low)
AcrVIA6′   Yes       Yes       Yes     0.28 (low)
AcrVIA7′   Yes       Yes       Yes     0.47 (low)
AcrVIA1    No        Yes       Yes     0.99 (high)

Figure 3. Reported Cas13a Acrs characterized using TXTL align with scoring by DeepAcr
(A) Overview of the TXTL assay. As part of the assay, an Acr pre-expressed in one TXTL reaction is combined in a fresh reaction with constructs encoding a
Cas13a nuclease, a targeting or non-targeting gRNA, and a targeted deGFP reporter. deGFP fluorescence is then measured over time. Nuclease activation and
collateral RNA cleavage would lead to lower fluorescence, whereas the inhibition of nuclease expression or activity would restore fluorescence. PFS, protospacer
flanking sequence.
(B) Heatmap of inhibitory strength by reported Cas13a Acrs against LwaCas13a and LshCas13a in the TXTL assay. AcrVIA1′ through AcrVIA7′ come from Lin et al. (2020). AcrVIA1 comes from Meeske et al. (2020). The boxes with a white X represent non-specific inhibition of deGFP expression by the Acr. Values represent the average of four independent experiments. See Figure S4 for representative time courses.
(C) Acr predictions for three existing machine learning methods as well as for DeepAcr. Values represent the confidence scores output by DeepAcr.
See also Figure S4; Table S7.
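
The confidence labels in panel C follow from the score thresholds given for DeepAcr (high at 0.8 or above, medium at 0.65 or above, low otherwise). A minimal sketch of that binning, applied to the panel C scores; the function name `confidence_tier` is an assumption of this example, and plain apostrophes stand in for the prime marks in the protein names.

```python
HIGH_THRESHOLD = 0.8
MEDIUM_THRESHOLD = 0.65


def confidence_tier(score: float) -> str:
    """Map a DeepAcr confidence score in [0, 1] to its tier."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"


# DeepAcr scores as listed in Figure 3C.
FIGURE_3C_SCORES = {
    "AcrVIA1'": 0.74,
    "AcrVIA2'": 0.48,
    "AcrVIA3'": 0.49,
    "AcrVIA4'": 0.41,
    "AcrVIA5'": 0.49,
    "AcrVIA6'": 0.28,
    "AcrVIA7'": 0.47,
    "AcrVIA1": 0.99,
}

tiers = {name: confidence_tier(score) for name, score in FIGURE_3C_SCORES.items()}
for name, tier in tiers.items():
    print(f"{name}: {FIGURE_3C_SCORES[name]:.2f} ({tier})")
```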

TXTL assay to test each Acr against the Cas13a nuclease from Leptotrichia wadei (LwaCas13a) (Figure 3B). AcrVIA3′ from one study (Lin et al., 2020) yielded inconclusive results due to non-specific inhibition of deGFP expression, as we have observed with other Acrs (Marshall et al., 2018). The six other Acrs from the same study failed to inhibit RNA cleavage by LwaCas13a, even though the same nuclease was reported to be robustly inhibited by these Acrs in the original study (Lin et al., 2020). Our results parallel recent work that also failed to observe inhibition by the seven Acrs in cell-based assays (Meeske et al., 2021). No inhibition of LwaCas13a was observed in our TXTL-based assay with the Acr from the separate study (AcrVIA1), although this Cas13a:Acr combination had not been tested (Meeske et al., 2020). Therefore, we tested a separate nuclease from Leptotrichia shahii (LshCas13a) as part of the same TXTL-based assay (Figure 3B). AcrVIA1 strongly inhibited RNA cleavage by LshCas13a (85%). Interestingly, RNA cleavage was also inhibited by AcrVIA1′ (84%), AcrVIA4′ (77%), and AcrVIA5′ (60%). These results suggest that the seven reported AcrVIA proteins (AcrVIA1′–AcrVIA7′) may exhibit inhibitory activity, although the major conclusion is that the specific inhibitory activities originally reported for these Acrs could not be replicated (Lin et al., 2020). We also conclude that TXTL can be used to assess the inhibitory activity of putative Acrs against Cas13 nucleases.

The confirmed validity of AcrVIA1 and the uncertain validity of the other seven Acrs raise the question: what do DeepAcr and the other Acr prediction algorithms predict for these Acrs? When each sequence was fed into DeepAcr, the algorithm assigned a high confidence score to AcrVIA1 (0.99) and a moderate confidence score to AcrVIA1′ (0.74), the two Acrs exhibiting the strongest inhibition of LshCas13a. DeepAcr assigned low confidence scores (0.28–0.49) to the remaining six Acrs (Figure 3C).


Interestingly, of the available Acr prediction tools, AcRanker predicted all but AcrVIA1 as Acrs, whereas PaCRISPR and AcrHub considered all as Acrs. Therefore, our TXTL results and those assessing the validity of AcrVIA1′–AcrVIA7′ suggest that DeepAcr can predict Acrs with enhanced accuracy over existing prediction tools.

TXTL-based screening of Acr candidates against Cas13b reveals a potent inhibitor
With the TXTL-based assay established, we were positioned to begin screening Acr candidates. We focused on candidates within the subtypes of type VI CRISPR-Cas systems lacking any reported Acrs. The VI-B subtype and its Cas13b nuclease were particularly attractive, in part because a number of these nucleases have been experimentally characterized and used as technologies for gene silencing and RNA editing (Cox et al., 2017; Kellner et al., 2019). From the highest-scoring candidates associated with VI-B systems from DeepAcr, we chose 77 to assess in our TXTL-based assay (Figure 4A; Table S5). These candidates were variably predicted as Acrs by AcRanker, PaCRISPR, and AcrHub, and only a few were flanked by a predicted aca gene (Table S5). Given that many known Acrs can exhibit a narrow inhibitory spectrum (Shin et al., 2017; Watters et al., 2018; Pinilla-Redondo et al., 2020), we incorporated three phylogenetically distinct Cas13b nucleases from Porphyromonas gingivalis (PgiCas13b), Prevotella buccae (PbuCas13b), and Bergeyella zoohelcum (BzoCas13b). Of the tested Acr:Cas13b combinations, one candidate (AcrVIB_5) exhibited virtually complete inhibition (96%) against PbuCas13b (Figures 4A and S5).

To initially validate the screening hit, we repeated the TXTL assay using different dilutions of pre-expressed AcrVIB_5 (Figure 4B). We also introduced a second reporter plasmid encoding mCherry lacking the gRNA target, which would be silenced through collateral RNA cleavage triggered by targeting the deGFP transcript. Adding pre-expressed Acr inhibited the silencing of both deGFP and mCherry in a dose-dependent manner (Figure 4C), paralleling the results from the large-scale screen. We also subjected two other Cas13b homologs (PguCas13b from Porphyromonas gulae and RanCas13b from Riemerella anatipestifer) as well as two Cas13a nucleases (LshCas13a and LwaCas13a) to the TXTL-based assay with AcrVIB_5 (Figure S6). We found that only one additional nuclease, the Cas13b from Porphyromonas gulae (PguCas13b), was partially inhibited by AcrVIB_5 in a dose-dependent manner. As PguCas13b shares 52% identity with PbuCas13b within the set, we conclude that AcrVIB_5 exhibits a narrow inhibitory spectrum. Interestingly, the strain encoding AcrVIB_5 encodes a type VI-B CRISPR-Cas system, although the associated nuclease (RanCas13b) was not inhibited by AcrVIB_5 in TXTL (Figure S6). Given that AcrVIB_5 and RanCas13b are present in the same genome and RanCas13b and PbuCas13b bear little similarity (43.4%), we speculate that the co-occurrence of the Acr and Cas13b is coincidental rather than representative of direct inhibition of the endogenous VI-B CRISPR-Cas system.

To validate the inhibitory activity of AcrVIB_5 outside of TXTL, we performed two cell-based assays based on plasmid interference and phage defense. In the first assay, E. coli expressing PbuCas13b, a gRNA, and AcrVIB_5 were transformed with a plasmid constitutively expressing the gRNA target (Figure 5A). Under targeting conditions and in the absence of nuclease inhibition, widespread collateral RNA cleavage by activated Cas13b induces cellular dormancy (Abudayyeh et al., 2016; Meeske et al., 2019), leading to a drop in the number of colonies. As expected, targeting in the absence of AcrVIB_5 led to more than a 100-fold reduction in colonies compared with a non-targeting control, whereas expressing AcrVIB_5 resulted in similar colony counts under targeting and non-targeting conditions (Figure 5B). In the second assay, E. coli cells expressing the same components were infected with the lytic RNA phage MS2 followed by measuring plaque formation by the infecting phage (Figure 5C). Paralleling the plasmid interference assay, targeting in the absence of AcrVIB_5 eliminated all discernible plaques, whereas expressing AcrVIB_5 resulted in restored plaque formation. When expressing AcrVIB_5, the plaques were more opaque under targeting conditions, indicative of residual immune activity (Figure 5D). Therefore, AcrVIB_5 can inhibit the activity of Cas13b in vivo, including under conditions in which the Acr promotes phage infection. We now adopt the name AcrVIB1 following the naming convention established for Acrs (Bondy-Denomy et al., 2018).

AcrVIB1 is a 115-residue protein consistently encoded downstream of a conserved HTH-containing gene
With AcrVIB1 validated as an Acr against type VI-B CRISPR-Cas systems, we explored the properties of this protein and its homologs as well as the genomic contexts in which they are found. AcrVIB1 is 115 amino acids in length and contains no known motifs (Figure 6A). PSI-BLAST revealed only four non-identical homologs sharing between 90% and 95% amino acid sequence identity with AcrVIB1, where the next closest search hit (A5Z863) shared only 12% identity (Altschul et al., 1990). Across the related homologs associated with a genomic sequence, all were found in Riemerella anatipestifer genomes and fell within predicted prophage regions. The homologs were also flanked upstream by a conserved gene encoding an HTH domain (Figure 6B). Although this domain is a standard feature of aca genes, aca genes normally sit downstream of putative acr genes, an orientation heavily weighted by the Gussow method for Acr prediction (Gussow et al., 2020). We conclude that AcrVIB1 exhibits compositional and genomic hallmarks of other Acrs.

AcrVIB1 principally inhibits upstream of Cas13b binding the crRNA
Our TXTL and cell-based assays reported a reduction in RNA cleavage by AcrVIB1, although any of the biomolecular steps leading to RNA cleavage (from nuclease and gRNA expression and RNP complex formation to target recognition and HEPN activation) could be mechanistic targets. Therefore, we took steps to explore which of these steps was inhibited by AcrVIB1. Paralleling our prior work evaluating the timing of Cas9:sgRNA complex formation (Marshall et al., 2018), we assessed different mechanistic steps by changing the set of constructs added to a given TXTL reaction as well as the timing of when each construct is added. We began by pre-expressing


[Figure 4 graphic. Panel A: heatmap of inhibition of nuclease activity (%; scale 0 to 100, with non-specific GFP inhibition marked) for the VI-B Acr candidates against Pgi, Pbu, and Bzo. Panel B: schematic of the TXTL assay with Cas13b, gRNA, Acr, deGFP, and mCherry; the Acr is pre-expressed for ~16 hours, diluted, and added to a ~16-hour reaction. Panel C: inhibition of nuclease activity (%) and fluorescence (RFU) for deGFP and mCherry with no Acr and with 1:1, 1:2, and 1:5 dilutions of pre-expressed AcrVIB_5, under non-targeting (NT) and targeting (T) conditions.]

Figure 4. The screening of Acr candidates associated with VI-B CRISPR-Cas systems reveals a potent inhibitor of PbuCas13b
(A) Heatmap of 77 screened Acr candidates with medium or high confidence scores. See Table S5 for the list of the candidates along with their associated scores.
Boxes with a white X represent the non-specific inhibition of deGFP expression by the Acr. Values represent the average of four independent experiments. See
Table S6 for the individual values. Pgi: PgiCas13b. Pbu: PbuCas13b. Bzo: BzoCas13b. AcrVIB_5 was renamed AcrVIB1 following the current nomenclature
(Bondy-Denomy et al., 2018).
(B) Overview of TXTL assay to assess inhibition of on-target and collateral RNA cleavage by AcrVIB_5. The gRNA targets the deGFP transcript but not the
mCherry transcript.
(C) Results from the TXTL-based assay to assess inhibition of RNA cleavage activity by PbuCas13b with AcrVIB_5. Top: measured inhibitory activity based on
deGFP fluorescence (left) and mCherry fluorescence (right). Bottom: individual fluorescence end-point measurements. T, targeting gRNA; NT, non-targeting
gRNA. Error bars on the top reflect the mean and standard deviation of the inhibition values calculated from the fluorescence end-point measurements on the
bottom. Measurements are based on triplicate independent experiments.
See also Figures S4–S6; Tables S5, S6, and S7.

AcrVIB1 and adding it while PbuCas13b, the gRNA, and the targeted deGFP transcript were being produced (Figure 7A). Inhibition of deGFP silencing was complete without a delay and then decreased greatly with each hour that AcrVIB1 addition was delayed. deGFP production also did not rebound after adding AcrVIB1 (Figure S7). Therefore, AcrVIB1 was not inhibiting Cas13b-mediated RNA cleavage and instead was inhibiting some upstream step.
To evaluate whether inhibition was occurring before or after RNP complex formation, we pre-expressed PbuCas13b and the gRNA together to form an RNP complex and then combined the complex with the deGFP reporter. Pre-expressed AcrVIB1 was added immediately before the deGFP reporter or at later time points (Figure 7B). Under this setup, AcrVIB1 inhibited deGFP silencing by only 23%, followed by a similar decrease in inhibition with each hour that AcrVIB1 addition was delayed. Incubating the pre-formed RNP complex with AcrVIB1 for different lengths of time before adding the deGFP reporter did not restore inhibition (Figure 7C). The partial inhibition could reflect AcrVIB1 interfering


[Figure 5 graphic. Panels A and C: plasmid maps showing cas13b, gRNA, acr, and deGFP constructs with kanR, cmR, and ampR markers. Panel B: bacterial dilutions (1:1, 1:5, 1:25, 1:125, 1:625, 1:3,125) with or without the deGFP guide and with or without AcrVIB1. Panel D: MS2 phage added (1.3x10^9 to 1.3x10^5 PFU/ml) with or without the MS2 guide and with or without AcrVIB1.]

Figure 5. AcrVIB1 inhibits plasmid interference and phage defense by PbuCas13b in E. coli
(A) Experimental setup to evaluate the inhibitory activity of AcrVIB_5 in an in vivo killing assay. Bacteria harboring PbuCas13b, a gRNA, and AcrVIB_5 are transformed with a plasmid encoding the deGFP target.
(B) Colony formation following the transformation of the targeted plasmid. Results are representative of triplicate independent experiments.
(C) Overview of the experimental setup of the MS2 phage infection assay. Bacteria harboring PbuCas13b, a gRNA, and AcrVIB_5 are challenged with MS2
phages.
(D) Plaque formation following infection with the lytic MS2 phage. Results are representative of triplicate independent experiments.
See also Table S7.

with target recognition, although the inhibitory effect mostly occurred upstream of RNP complex formation. These insights suggest that AcrVIB1 principally exerts its inhibitory effect early in the process of CRISPR-based immunity rather than merely blocking the nuclease activity of the RNP complex.

DISCUSSION

Through this work, we developed and applied a deep-learning model called DeepAcr to predict novel Acrs. Unlike all but one of the prior machine learning algorithms, DeepAcr operates purely based on information from an input protein sequence. The one exception, AcRanker, does not use a neural network architecture and is currently used in combination with a guilt-by-association approach in AcrDB (Eitzinger et al., 2020). Focusing on the protein sequence does ignore genomic features such as a flanking HTH-containing gene, an encompassing prophage region, and the presence of a CRISPR-Cas system with self-targeting spacers that are also indicators of Acrs. Despite not using this additional information and only relying on protein-related features, our algorithm predicted the vast majority of Acr candidates predicted by the Gussow method (Gussow et al., 2020), a recently reported machine learning model that relies on genomic features. Genomic features could also be incorporated into DeepAcr, although there is the potential that these additional features lead to overfitting to existing Acrs and thus result in otherwise strong candidates being discarded. The overlap between the predictions for DeepAcr and the Gussow method was remarkable given the different features used between the two algorithms. At the same time, DeepAcr uniquely predicted Acr candidates associated with CRISPR-Cas subtypes that were not part of the training set, with some currently lacking any reported Acrs. The Acrs from these subtypes (e.g., IV-A in which little is known about the biology of these systems) provide a large set of candidates that can be screened using experimental approaches, such as TXTL. In turn, new Acrs could be revealed that expand the known Acr universe to almost every subtype of the CRISPR-Cas system and reveal new inhibitory mechanisms and the outcomes of the bacteria-phage arms race.
By screening Acr candidates associated with type VI-B CRISPR-Cas systems, we identified one Acr exhibiting strong inhibition of Cas13b. The Acr, which we call AcrVIB1, was


[Figure 6 graphic. Panel A: protein sequence alignment, with identity to AcrVIB1 (%) listed per protein: WP_004917816.1, -; WP_064968248.1, 90.4; WP_153937162.1, 94.0; WP_079206532, 93.0; uncharacterized protein, 94.8; A5Z863, 12.7. Panel B: gene neighborhoods (positions -5 to 5 around AcrVIB1, with the HTH gene marked) and, per genome, "Type VI-B CRISPR-Cas system?" and "in prophage region?": NC_014738 (WP_004917816), Yes, Yes; LUDK01000005 (WP_064968248), No, Yes; NZ_QXHV01000004 (WP_153937162), No, Yes; CP011859 (WP_079206532), No, Yes.]

Figure 6. AcrVIB1 and its homologs are 115-amino acid proteins that appear downstream of an HTH domain-encoding gene in prophage
regions of Riemerella anatipestifer genomes
(A) Protein sequence alignment of AcrVIB1 and its homologs. Homologs were identified by PSI-BLAST. A5Z863 is shown as the next PSI-BLAST hit after the
homologs. Identities to AcrVIB1 (NCBI: WP_004917816) are shown.
(B) Genetic synteny between AcrVIB1 and three homologs. The genes in gray are unrelated to any other displayed genes. The genes in non-gray colors share
>85% with the same-color genes.
See also Table S7.

identified in the genomes of Riemerella anatipestifer. AcrVIB1 and its homologs were consistently encoded immediately downstream of an HTH-encoding gene within prophage regions, similar to many other characterized Acrs. Based on the characterization of HTH-encoding proteins (Birkholz et al., 2019; Stanley et al., 2019), these genes would be expected to regulate the expression of AcrVIB1. Using TXTL, we were able to interrogate which step of immune defense AcrVIB1 is inhibiting. Our data indicated that AcrVIB1 was principally inhibiting a step upstream of RNP complex formation. These data already indicate that AcrVIB1 functions differently than most known Acrs, which act by directly binding the RNP complex to block target recognition or nuclease activity (Dong et al., 2017; Shin et al., 2017; Watters et al., 2018; Zhang et al., 2019). Instead, AcrVIB1 may inhibit RNP complex formation, or it may affect the expression or stability of the unbound gRNA or the Cas13b holoenzyme or the ability of the two components to interact. In-depth in vitro approaches can be pursued next to elucidate the exact mechanism of action. Once determined, the Acr could be adapted for controlling Cas13b in the different applications in which these nucleases are employed (Cox et al., 2017; Abudayyeh et al., 2019).

Limitations of the study
Although the TXTL-based screen led to the identification of AcrVIB1, none of the other 76 screened candidates exhibited robust inhibitory activity despite possessing medium or high confidence scores. Even though this low hit rate could reflect a need for further improvements in the model, there are other explanations independent of deep learning or CRISPR-Cas assignment. For instance, many Acrs exhibit narrow inhibitory spectra (Shin et al., 2017; Marshall et al., 2018; Watters et al., 2018; Uribe et al., 2019; Pinilla-Redondo et al., 2020). As a result, our set of screened candidates could contain Acrs with inhibitory spectra that do not encompass the three Cas13b nucleases. Separately, the Acrs could affect other aspects of adaptive immunity by VI-B CRISPR-Cas systems. These systems can harbor Csx27 or Csx28, accessory proteins that, respectively, regulate nuclease activity or augment immune defense and could be targets of


[Figure 7 graphic. Panel A: Acr added before RNP formed. Panel B: Acr added after RNP formed. Panel C: Acr and RNP pre-incubated. Each plot shows inhibition of nuclease activity (%, 0–100) against time of delay (h): no Acr, 0, 1, 2, 3, 4.]

Figure 7. AcrVIB1 primarily inhibits Cas13b upstream of RNP complex formation


(A) TXTL-based assay evaluating the addition of pre-expressed AcrVIB1 to a fresh reaction with the PbuCas13b, gRNA, and targeted reporter constructs. The
time indicates the delay between the addition of the three constructs and the addition of pre-expressed AcrVIB1. Under this setup, AcrVIB1 has the opportunity to
inhibit Cas13b at any step leading to on-target RNA cleavage.
(B) TXTL-based assay evaluating the addition of pre-expressed AcrVIB1 to a fresh reaction with a pre-formed RNP complex and the targeted reporter construct.
The time indicates the delay between the addition of the pre-expressed Cas13b:gRNA and the reporter construct and the addition of pre-expressed AcrVIB1.
Under this setup, AcrVIB1 has the opportunity to inhibit Cas13b before target recognition and on-target RNA cleavage.
(C) TXTL-based assay evaluating the addition of pre-expressed AcrVIB1 to a fresh reaction with a pre-formed RNP complex and the targeted reporter construct.
The time indicates the delay between the addition of the pre-expressed Cas13b:gRNA and AcrVIB1 and the addition of the reporter construct. Under this setup,
AcrVIB1 has the opportunity to inhibit the RNP complex before the target is expressed.
Each plot depicts the measured inhibitory activity based on deGFP fluorescence from end-point measurements. Error bars represent the mean and standard
deviation from triplicate independent experiments.
See also Figure S7; Table S7.

Acrs (VanderWal et al., 2016; Smargon et al., 2017). The Acrs could also affect the natural expression of the systems as well as spacer acquisition. Although these modes of inhibition would involve proteins beyond Cas nucleases, the sequence and mechanistic diversity of established Acrs and Cas nucleases lend to the identification of other Acrs using deep learning. By taking these other mechanisms into account, new screens could be devised and integrated into Acr identification, potentially revealing new mechanisms of action in which phages counter CRISPR-Cas defenses.

STAR+METHODS

Detailed methods are provided in the online version of this paper and include the following:

- KEY RESOURCES TABLE
- RESOURCE AVAILABILITY
  - Lead contact
  - Materials availability
  - Data and code availability
- EXPERIMENTAL MODEL AND SUBJECT DETAILS
- METHOD DETAILS
  - Data collecting and preprocessing
  - Model architecture
  - One-hot encoding
  - Model training
  - Model testing
  - Hyperparameter optimization
  - Prophage detection
  - Prediction of aca genes
  - Prediction of CRISPR-Cas subtype associated with an Acr candidate
  - Strains and growth conditions
  - Cell-free transcription-translation assays
  - Plasmid interference assay in E. coli
  - Plaque formation assay in E. coli
- QUANTIFICATION AND STATISTICAL ANALYSIS

SUPPLEMENTAL INFORMATION

Supplemental information can be found online at https://doi.org/10.1016/j.molcel.2022.05.003.

ACKNOWLEDGMENTS

We thank Raimonds Vanags and Fayyaz Hussain for their support in developing the deep-learning method; Elena Vialetto for assistance with the plasmid interference assays; Oliver Dietrich for assistance with using R for data analysis and figure preparation; pBZCas13b and pPbcas13b were a gift from Feng Zhang (Addgene plasmid # 89898 and # 89906, respectively). This work was supported by the Deutsche Forschungsgemeinschaft (BA 2168/11-2, BA 2168/23-1, BA 2168/14-1 to R.B., and BE 6703/1-2 to C.L.B.) and the Defense Advanced Research Projects Agency Safe Genes program (HR0011-17-2-0042 to C.L.B.); further support was provided by the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy (CIBSS – EXC-2189 – project ID 390939984). The views, opinions, and/or findings expressed should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

AUTHOR CONTRIBUTIONS

Conceptualization, K.G.W., O.S.A., R.B., and C.L.B.; software, O.S.A., A. Mitrofanov, S.H., and R.B.; experiments, K.G.W., H.V.I.B., and A. Migur; model development, O.S.A.; investigation, K.G.W.; writing – original draft, K.G.W. and C.L.B., with input from O.S.A. and R.B.; writing – review & editing, K.G.W., O.S.A., R.B., and C.L.B., with input from all the authors; visualization, K.G.W. and C.L.B., with input from O.S.A. and R.B.; supervision, R.B. and C.L.B.; funding acquisition, R.B. and C.L.B.

DECLARATION OF INTERESTS

C.L.B. is a co-founder and member of the Scientific Advisory Board for Locus Biosciences as well as a member of the Scientific Advisory Board for Benson Hill.

Received: December 22, 2021
Revised: March 25, 2022
Accepted: May 3, 2022
Published: May 31, 2022

REFERENCES

Abudayyeh, O.O., Gootenberg, J.S., Franklin, B., Koob, J., Kellner, M.J., Ladha, A., Joung, J., Kirchgatterer, P., Cox, D.B.T., and Zhang, F. (2019). A cytosine deaminase for programmable single-base RNA editing. Science 365, 382–386.
Abudayyeh, O.O., Gootenberg, J.S., Konermann, S., Joung, J., Slaymaker, I.M., Cox, D.B., Shmakov, S., Makarova, K.S., Semenova, E., Minakhin, L., et al. (2016). C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector. Science 353, aaf5573.
Alkhnbashi, O.S., Mitrofanov, A., Bonidia, R., Raden, M., Tran, V.D., Eggenhofer, F., Shah, S.A., Öztürk, E., Padilha, V.A., Sanches, D.S., et al. (2021). CRISPRloci: comprehensive and accurate annotation of CRISPR-Cas systems. Nucleic Acids Res. 49, W125–W130.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Arndt, D., Grant, J.R., Marcu, A., Sajed, T., Pon, A., Liang, Y., and Wishart, D.S. (2016). PHASTER: a better, faster version of the PHAST phage search tool. Nucleic Acids Res. 44, W16–W21.
Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P., Moineau, S., Romero, D.A., and Horvath, P. (2007). CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712.
Birkholz, N., Fagerlund, R.D., Smith, L.M., Jackson, S.A., and Fineran, P.C. (2019). The autoregulator Aca2 mediates anti-CRISPR repression. Nucleic Acids Res. 47, 9658–9665.
Bondy-Denomy, J. (2018). Protein inhibitors of CRISPR-Cas9. ACS Chem. Biol. 13, 417–423. https://doi.org/10.1021/acschembio.7b00831.
Bondy-Denomy, J., Davidson, A.R., Doudna, J.A., Fineran, P.C., Maxwell, K.L., Moineau, S., Peng, X., Sontheimer, E.J., and Wiedenheft, B. (2018). A unified resource for tracking anti-CRISPR names. CRISPR J. 1, 304–305.
Bondy-Denomy, J., Garcia, B., Strum, S., Du, M., Rollins, M.F., Hidalgo-Reyes, Y., Wiedenheft, B., Maxwell, K.L., and Davidson, A.R. (2015). Multiple mechanisms for CRISPR-Cas inhibition by anti-CRISPR proteins. Nature 526, 136–139.
Bondy-Denomy, J., Pawluk, A., Maxwell, K.L., and Davidson, A.R. (2013). Bacteriophage genes that inactivate the CRISPR/Cas bacterial immune system. Nature 493, 429–432.
Borges, A.L., Zhang, J.Y., Rollins, M.F., Osuna, B.A., Wiedenheft, B., and Bondy-Denomy, J. (2018). Bacteriophage cooperation suppresses CRISPR-Cas3 and Cas9 immunity. Cell 174, 917–925.e10.
Brouns, S.J.J., Jore, M.M., Lundgren, M., Westra, E.R., Slijkhuis, R.J., Snijders, A.P., Dickman, M.J., Makarova, K.S., Koonin, E.V., and van der Oost, J. (2008). Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964.
Chevallereau, A., Meaden, S., Fradet, O., Landsberger, M., Maestri, A., Biswas, A., Gandon, S., van Houte, S., and Westra, E.R. (2020). Exploitation of the cooperative behaviors of anti-CRISPR phages. Cell Host Microbe 27, 189–198.e6.
Cox, D.B.T., Gootenberg, J.S., Abudayyeh, O.O., Franklin, B., Kellner, M.J., Joung, J., and Zhang, F. (2017). RNA editing with CRISPR-Cas13. Science 358, 1019–1027.
Dong, C., Hao, G.F., Hua, H.L., Liu, S., Labena, A.A., Chai, G., Huang, J., Rao, N., and Guo, F.B. (2018). Anti-CRISPRdb: a comprehensive online resource for anti-CRISPR proteins. Nucleic Acids Res. 46, D393–D398.
Dong, D., Guo, M., Wang, S., Zhu, Y., Wang, S., Xiong, Z., Yang, J., Xu, Z., and Huang, Z. (2017). Structural basis of CRISPR–SpyCas9 inhibition by an anti-CRISPR protein. Nature 546, 436–439.
Eitzinger, S., Asif, A., Watters, K.E., Iavarone, A.T., Knott, G.J., Doudna, J.A., and Minhas, F.U.A.A. (2020). Machine learning predicts new anti-CRISPR proteins. Nucleic Acids Res. 48, 4698–4708.
El-Gebali, S., Mistry, J., Bateman, A., Eddy, S.R., Luciani, A., Potter, S.C., Qureshi, M., Richardson, L.J., Salazar, G.A., Smart, A., et al. (2019). The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432.

Finn, R.D., Clements, J., and Eddy, S.R. (2011). HMMER web server: interactive sequence similarity searching. Nucleic Acids Res. 39 (web server issue), W29–W37.
Garamella, J., Marshall, R., Rustad, M., and Noireaux, V. (2016). The all E. coli TX-TL toolbox 2.0: a platform for cell-free synthetic biology. ACS Synth. Biol. 5, 344–355.
Garneau, J.E., Dupuis, M.È., Villion, M., Romero, D.A., Barrangou, R., Boyaval, P., Fremaux, C., Horvath, P., Magadán, A.H., and Moineau, S. (2010). The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71.
Guo, L., Wang, S., Li, M., and Cao, Z. (2019). Accurate classification of membrane protein types based on sequence and evolutionary information using deep learning. BMC Bioinformatics 20 (suppl 25), 700.
Gussow, A.B., Park, A.E., Borges, A.L., Shmakov, S.A., Makarova, K.S., Wolf, Y.I., Bondy-Denomy, J., and Koonin, E.V. (2020). Machine-learning approach expands the repertoire of anti-CRISPR protein families. Nat. Commun. 11, 3784.
He, F., Bhoobalan-Chitty, Y., Van, L.B., Kjeldsen, A.L., Dedola, M., Makarova, K.S., Koonin, E.V., Brodersen, D.E., and Peng, X. (2018). Anti-CRISPR proteins encoded by archaeal lytic viruses inhibit subtype I-D immunity. Nat. Microbiol. 3, 461–469.
Huang, L., Yang, B., Yi, H., Asif, A., Wang, J., Lithgow, T., Zhang, H., Minhas, F.U.A.A., and Yin, Y. (2021). AcrDB: a database of anti-CRISPR operons in prokaryotes and viruses. Nucleic Acids Res. 49, D622–D629.
Hynes, A.P., Rousseau, G.M., Agudelo, D., Goulet, A., Amigues, B., Loehr, J., Romero, D.A., Fremaux, C., Horvath, P., Doyon, Y., et al. (2018). Widespread anti-CRISPR proteins in virulent bacteriophages inhibit a range of Cas9 proteins. Nat. Commun. 9, 2919.
Hynes, A.P., Rousseau, G.M., Lemay, M.L., Horvath, P., Romero, D.A., Fremaux, C., and Moineau, S. (2017). An anti-CRISPR from a virulent streptococcal phage inhibits Streptococcus pyogenes Cas9. Nat. Microbiol. 2, 1374–1380.
Jackson, S.A., McKenzie, R.E., Fagerlund, R.D., Kieper, S.N., Fineran, P.C., and Brouns, S.J. (2017). CRISPR-Cas: adapting to change. Science 356, eaal5056.
Kellner, M.J., Koob, J.G., Gootenberg, J.S., Abudayyeh, O.O., and Zhang, F. (2019). SHERLOCK: nucleic acid detection with CRISPR nucleases. Nat. Protoc. 14, 2986–3012.
Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. ArXiv. http://arxiv.org/abs/1412.6980.
Landsberger, M., Gandon, S., Meaden, S., Rollie, C., Chevallereau, A., Chabas, H., Buckling, A., Westra, E.R., and van Houte, S. (2018). Anti-CRISPR phages cooperate to overcome CRISPR-Cas immunity. Cell 174, 908–916.e12.
Lee, J., Mir, A., Edraki, A., Garcia, B., Amrani, N., Lou, H.E., Gainetdinov, I., Pawluk, A., Ibraheim, R., Gao, X.D., et al. (2018). Potent Cas9 inhibition in bacterial and human cells by AcrIIC4 and AcrIIC5 anti-CRISPR proteins. mBio 9, e02321-18.
Lin, P., Qin, S., Pu, Q., Wang, Z., Wu, Q., Gao, P., Schettler, J., Guo, K., Li, R., Li, G., et al. (2020). CRISPR-Cas13 inhibitors block RNA editing in bacteria and mammalian cells. Mol. Cell 78, 850–861.e5.
Loshchilov, I., and Hutter, F. (2016). SGDR: stochastic gradient descent with warm restarts. ArXiv. http://arxiv.org/abs/1608.03983.
Makarova, K.S., Wolf, Y.I., Iranzo, J., Shmakov, S.A., Alkhnbashi, O.S., Brouns, S.J.J., Charpentier, E., Cheng, D., Haft, D.H., Horvath, P., et al. (2020). Evolutionary classification of CRISPR-Cas systems: a burst of class 2 and derived variants. Nat. Rev. Microbiol. 18, 67–83.
Marino, N.D., Pinilla-Redondo, R., Csörgő, B., and Bondy-Denomy, J. (2020). Anti-CRISPR protein applications: natural brakes for CRISPR-Cas technologies. Nat. Methods 17, 471–479.
Marino, N.D., Zhang, J.Y., Borges, A.L., Sousa, A.A., Leon, L.M., Rauch, B.J., Walton, R.T., Berry, J.D., Joung, J.K., Kleinstiver, B.P., and Bondy-Denomy, J. (2018). Discovery of widespread type I and type V CRISPR-Cas inhibitors. Science 362, 240–242.
Marshall, R., Beisel, C.L., and Noireaux, V. (2020). Rapid testing of CRISPR nucleases and guide RNAs in an E. coli cell-free transcription-translation system. STAR Protoc. 1, 100003.
Marshall, R., Maxwell, C.S., Collins, S.P., Jacobsen, T., Luo, M.L., Begemann, M.B., Gray, B.N., January, E., Singer, A., He, Y., et al. (2018). Rapid and scalable characterization of CRISPR technologies using an E. coli cell-free transcription-translation system. Mol. Cell 69, 146–157.e3.
Maxwell, C.S., Jacobsen, T., Marshall, R., Noireaux, V., and Beisel, C.L. (2018). A detailed cell-free transcription-translation-based assay to decipher CRISPR protospacer-adjacent motifs. Methods 143, 48–57.
Meeske, A.J., Jia, N., Cassel, A.K., Kozlova, A., Liao, J., Wiedmann, M., Patel, D.J., and Marraffini, L.A. (2020). A phage-encoded anti-CRISPR enables complete evasion of type VI-A CRISPR-Cas immunity. Science 369, 54–59.
Meeske, A.J., Johnson, M.C., Hille, L.T., Kleinstiver, B.P., and Bondy-Denomy, J. (2021). Lack of Cas13a inhibition by anti-CRISPR proteins from Leptotrichia prophages. bioRxiv. https://doi.org/10.1101/2021.05.27.445852.
Meeske, A.J., Nakandakari-Higa, S., and Marraffini, L.A. (2019). Cas13-induced cellular dormancy prevents the rise of CRISPR-resistant bacteriophage. Nature 570, 241–245.
Nussenzweig, P.M., and Marraffini, L.A. (2020). Molecular mechanisms of CRISPR-Cas immunity in bacteria. Annu. Rev. Genet. 54, 93–120.
Padilha, V.A., Alkhnbashi, O.S., Shah, S.A., de Carvalho, A.C.P.L.F., and Backofen, R. (2020). CRISPRcasIdentifier: machine learning for accurate identification and classification of CRISPR-Cas systems. GigaScience 9, giaa062.
Padilha, V.A., Alkhnbashi, O.S., Tran, V.D., Shah, S.A., Carvalho, A.C.P.L.F., and Backofen, R. (2021). Casboundary: automated definition of integral Cas cassettes. Bioinformatics 37, 1352–1359.
Pawluk, A., Bondy-Denomy, J., Cheung, V.H., Maxwell, K.L., and Davidson, A.R. (2014). A new group of phage anti-CRISPR genes inhibits the type I-E CRISPR-Cas system of Pseudomonas aeruginosa. mBio 5, e00896.
Pawluk, A., Davidson, A.R., and Maxwell, K.L. (2018). Anti-CRISPR: discovery, mechanism and function. Nat. Rev. Microbiol. 16, 12–17.
Pawluk, A., Staals, R.H., Taylor, C., Watson, B.N., Saha, S., Fineran, P.C., Maxwell, K.L., and Davidson, A.R. (2016). Inactivation of CRISPR-Cas systems by anti-CRISPR proteins in diverse bacterial species. Nat. Microbiol. 1, 16085.
Pinilla-Redondo, R., Shehreen, S., Marino, N.D., Fagerlund, R.D., Brown, C.M., Sørensen, S.J., Fineran, P.C., and Bondy-Denomy, J. (2020). Discovery of multiple anti-CRISPRs highlights anti-defense gene clustering in mobile genetic elements. Nat. Commun. 11, 5652.
Rauch, B.J., Silvis, M.R., Hultquist, J.F., Waters, C.S., McGregor, M.J., Krogan, N.J., and Bondy-Denomy, J. (2017). Inhibition of CRISPR-Cas9 with bacteriophage proteins. Cell 168, 150–158.e10.
Shah, S.A., Alkhnbashi, O.S., Behler, J., Han, W., She, Q., Hess, W.R., Garrett, R.A., and Backofen, R. (2019). Comprehensive search for accessory proteins encoded with archaeal and bacterial type III CRISPR-cas gene cassettes reveals 39 new cas gene families. RNA Biol. 16, 530–542.
Shin, J., Jiang, F., Liu, J.J., Bray, N.L., Rauch, B.J., Baik, S.H., Nogales, E., Bondy-Denomy, J., Corn, J.E., and Doudna, J.A. (2017). Disabling Cas9 by an anti-CRISPR DNA mimic. Sci. Adv. 3, e1701620.
Shin, J., and Noireaux, V. (2012). An E. coli cell-free expression toolbox: application to synthetic gene circuits and artificial cells. ACS Synth. Biol. 1, 29–41.
Smargon, A.A., Cox, D.B.T., Pyzocha, N.K., Zheng, K., Slaymaker, I.M., Gootenberg, J.S., Abudayyeh, O.O., Essletzbichler, P., Shmakov, S., Makarova, K.S., et al. (2017). Cas13b is a type VI-B CRISPR-associated RNA-guided RNase differentially regulated by accessory proteins Csx27 and Csx28. Mol. Cell 65, 618–630.e7.

Stanley, S.Y., Borges, A.L., Chen, K.H., Swaney, D.L., Krogan, N.J., Bondy-Denomy, J., and Davidson, A.R. (2019). Anti-CRISPR-associated proteins are crucial repressors of anti-CRISPR transcription. Cell 178, 1452–1464.e13.
Tan, K.C., Lee, T.H., and Khor, E.F. (2002). Evolutionary algorithms for multi-objective optimization: performance assessments and comparisons. In Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Publications), pp. 979–986. IEEE. Cat. No.01TH8546.
Trasanidou, D., Gerós, A.S., Mohanraju, P., Nieuwenweg, A.C., Nobrega, F.L., and Staals, R.H.J. (2019). Keeping CRISPR in check: diverse mechanisms of phage-encoded anti-CRISPRs. FEMS Microbiol. Lett. 366, fnz098.
Uribe, R.V., Helm, E.vd., Misiakou, M.-A., Lee, S.-W., Kol, S., and Sommer, M.O.A. (2019). Discovery and characterization of Cas9 inhibitors disseminated across seven bacterial phyla. Cell Host Microbe 25, 233–241.e5.
VanderWal, A.R., Park, J.U., Polevoda, B., Kellogg, E.H., and O’Connell, M.R. (2016). CRISPR-Csx28 forms a Cas13b-activated membrane pore required for robust CRISPR-Cas adaptive immunity. bioRxiv. https://doi.org/10.1101/2021.11.02.466367.
Wandera, K.G., Collins, S.P., Wimmer, F., Marshall, R., Noireaux, V., and Beisel, C.L. (2020). An enhanced assay to characterize anti-CRISPR proteins using a cell-free transcription-translation system. Methods 172, 42–50.
Wang, J., Dai, W., Li, J., Xie, R., Dunstan, R.A., Stubenrauch, C., Zhang, Y., and Lithgow, T. (2020). PaCRISPR: a server for predicting and visualizing anti-CRISPR proteins. Nucleic Acids Res. 48, W348–W357.
Watters, K.E., Fellmann, C., Bai, H.B., Ren, S.M., and Doudna, J.A. (2018). Systematic discovery of natural CRISPR-Cas12a inhibitors. Science 362, 236–239.
Zhang, H., Li, Z., Daczkowski, C.M., Gabel, C., Mesecar, A.D., and Chang, L. (2019). Structural basis for the inhibition of CRISPR-Cas12a by anti-CRISPR proteins. Cell Host Microbe 25, 815–826.e4.


STAR+METHODS

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Bacterial and virus strains
For bacterial strains, see Table S7 | N/A | N/A

Chemicals, peptides, and recombinant proteins
Isopropyl-β-D-thiogalactopyranoside (IPTG) | Carl Roth | Cat#2316.4; CAS: 367-93-1

Critical commercial assays
myTXTL Sigma 70 Master Mix Kit | Arbor Bioscience | Cat#507025

Deposited data
Images for Figure 5, spreadsheets for Tables S3 and S4 | This paper | Mendeley Data: https://doi.org/10.17632/n7v3bfngm5.1

Recombinant DNA
For plasmids, see Table S7 | N/A | N/A

Software and algorithms
DeepAcr | This paper | https://github.com/BackofenLab/DeepAcr
blast+ | (Altschul et al., 1990) | http://blast.ncbi.nlm.nih.gov/Blast.cgi; RRID:SCR_004870
Casboundary | (Padilha et al., 2021) | https://github.com/BackofenLab/Casboundary
CRISPRcasIdentifier | (Padilha et al., 2020) | https://github.com/BackofenLab/CRISPRcasIdentifier
Hmmsearch | (Finn et al., 2011) | http://hmmer.org/; RRID:SCR_005305
PHASTER web-server | (Arndt et al., 2016) | https://phaster.ca/
Pfam | (El-Gebali et al., 2019) | http://pfam.xfam.org/; RRID:SCR_004726

Other
For a detailed protocol for installing and using DeepAcr, see Methods S1

RESOURCE AVAILABILITY

Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact Prof. Chase L. Beisel (chase.beisel@helmholtz-hiri.de).

Materials availability
The following plasmids generated in this study have been deposited to Addgene: pPbuCas13b-gRNA-NT (#184568), pPgiCas13b (#184833), pPgiCas13b-gRNA-1 (#184834), pPgiCas13b-gRNA-NT (#184835), pAcrVIB1-cloDF-kan (#184836), pcloDF-kan (#184837), pPlacPbuCas13b-gRNA-NT (#184838), pPlacPbuCas13b-gRNA-T-MS2 (#184839), pPlacPbuCas13b-gRNA-T-deGFP (#184840), P70a-deGFP-sc101-kan (#184841), pPJ23116-MS2-rep (#184842), pAcrVIB1-cloDF-amp (#184843), pcloDF-amp (#184844). All other plasmids can be obtained upon reasonable request to Prof. Chase L. Beisel (chase.beisel@helmholtz-hiri.de).

Data and code availability


- Original agar plate images have been deposited at Mendeley and are publicly available as of the date of publication. The DOI number is listed in the key resources table.
- All original code for DeepAcr has been deposited at GitHub and is publicly available as of the date of publication under https://github.com/BackofenLab/DeepAcr. Requests for information regarding the data and code availability should be directed to the other corresponding author, Prof. Rolf Backofen ([email protected]).
- Any additional information required to reanalyze the data reported in this paper is available from the lead contact Prof. Chase L. Beisel (chase.beisel@helmholtz-hiri.de) and the other corresponding author, Prof. Rolf Backofen ([email protected]freiburg.de) upon request.


EXPERIMENTAL MODEL AND SUBJECT DETAILS

All bacterial strains used in this study are listed below. With the exception of Escherichia coli KL740 cI857+, all strains were cultured in LB medium with incubation at 37°C and 220 rpm. Escherichia coli KL740 cI857+ was cultured at 29°C. All strains were stored as glycerol stocks at -80°C.

- BL21(DE3) Competent E. coli
- Escherichia coli KL740 cI857+
- NEB Turbo Competent E. coli (High Efficiency)
- One Shot TOP10 Chemically Competent E. coli

METHOD DETAILS

Data collecting and preprocessing


All 420 Acr proteins used in this study were selected from Anti-CRISPRdb (subtypes I-D, I-E, I-F, II-A, and II-C) (Dong et al., 2018). To keep the dataset balanced, we added the same number of non-Acr proteins derived from accessory proteins associated with Type III CRISPR-Cas systems (Shah et al., 2019).
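
The balancing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_balanced_dataset`, the label convention (1 for Acr, 0 for non-Acr), and the use of seeded random sampling to pick the non-Acr proteins are all hypothetical, since the text does not state how the non-Acr proteins were chosen from the accessory-protein pool.

```python
import random


def build_balanced_dataset(acr_seqs, non_acr_pool, seed=0):
    """Pair each Acr with one non-Acr so both classes have equal size.

    Hypothetical helper: random sampling of negatives is an assumption,
    not the procedure reported in the paper.
    """
    if len(non_acr_pool) < len(acr_seqs):
        raise ValueError("not enough non-Acr sequences to balance the dataset")
    rng = random.Random(seed)
    # Draw as many negatives as there are positives, without replacement.
    negatives = rng.sample(non_acr_pool, len(acr_seqs))
    # Label 1 = Acr, label 0 = non-Acr (assumed convention).
    data = [(seq, 1) for seq in acr_seqs] + [(seq, 0) for seq in negatives]
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    # Toy sequences only; the study used 420 Acrs from Anti-CRISPRdb.
    acrs = ["MKT", "MSE", "MAL"]
    pool = ["MGG", "MPP", "MQQ", "MRR", "MTT"]
    for seq, label in build_balanced_dataset(acrs, pool):
        print(seq, label)
```

With 420 Acr sequences as input, a helper of this kind would return 840 labeled examples, half from each class.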

Model architecture
The proposed method performs binary classification of input protein sequences (Acr versus non-Acr). The first layer of the architecture is a single convolutional layer that downsamples the protein input and creates an embedding to be processed by a recurrent layer. We therefore use a relatively large stride (usually around 10) and kernel size (usually around 15). The use of convolution here is motivated by the idea that amino acids that are closely clustered in the protein are more likely to form a feature. Additionally, creating the embedding by downsampling makes the computation in the recurrent layer easier. We use the size of the one-hot encoding of the amino acids as the number of input channels and reduce the number of channels to 9 in the output.
The recurrent layer consists of a bidirectional recurrent cell (i.e., one that includes a forward and a backward pass), based on either LSTM, linear or GRU cells, with up to two layers (standard architecture). LSTM cells implement short-term memory by using input, forget, and output gates. GRU cells are a smaller variant of the LSTM cell that does not use an output gate. Within these cells, we apply a dropout of 0.5. The additional information is preprocessed using (usually just a single) linear layer. The outcome of this preprocessing is then concatenated with the output of the recurrent layer. This concatenated output is processed by further linear layers, which yield the model output. We apply dropout with a rate of up to 0.5 to these last linear layers. ReLU functions, f(x) = max(0,x), are used as non-linear transformations. The architecture was inspired by Guo et al. (2019) and optimized for our problem. The optimization process is described under hyperparameter optimization.
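To make the downsampling step concrete, the following is a minimal sketch of how the sequence length shrinks under a strided convolution. It assumes no padding and no dilation (neither is stated in the text), takes the "around 15" kernel size and "around 10" stride as exact values, and uses a hypothetical 300-residue protein; the function name is illustrative only.

```python
def conv1d_output_length(seq_len, kernel_size=15, stride=10):
    """Number of positions produced by an unpadded, undilated 1D convolution."""
    if seq_len < kernel_size:
        raise ValueError("sequence is shorter than the convolution kernel")
    return (seq_len - kernel_size) // stride + 1


# Shapes are (channels, length). 20 input channels come from the one-hot
# encoding; 9 output channels are the value given in the text.
protein_length = 300  # hypothetical example length
in_shape = (20, protein_length)
out_shape = (9, conv1d_output_length(protein_length))
print(in_shape, "->", out_shape)
```

Under these assumptions the recurrent layer sees roughly a tenth as many time steps as there are residues, which is the computational saving the paragraph refers to.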

One-hot encoding
We created a one-hot encoding of the amino acids in our input proteins to process our data using neural networks. For each amino acid belonging to a protein, the one-hot encoding consists of a 20-dimensional zero vector v=(0,...,x_i,...,0) with a single 1 in position x_i denoting the class of the amino acid. The sum of the one-hot encoding for each amino acid is, therefore, sum(v) = 1. In cases where graph convolution is applied, the primary sequence is represented as a one-dimensional graph with connections between neighboring amino acids and self-referential relations. The rationale for one-hot encoding is that it does not imply any ranking or order among the categories.
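A minimal sketch of such an encoding is given below. The alphabetical ordering of the 20 standard amino acids and the names `AMINO_ACIDS`, `INDEX` and `one_hot_encode` are assumptions for illustration; the ordering actually used by DeepAcr is not stated here.

```python
# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 20-dimensional 0/1 vector per residue of the sequence."""
    encoded = []
    for residue in sequence.upper():
        vector = [0] * len(AMINO_ACIDS)
        # Raises KeyError for non-standard residues such as 'X'.
        vector[INDEX[residue]] = 1
        encoded.append(vector)
    return encoded


matrix = one_hot_encode("MKT")
# Each row has exactly one 1, so sum(v) = 1 for every residue.
print([sum(row) for row in matrix])
```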

Model training
We evaluate using a 50-fold cross-validation procedure. Of the remaining training set in each fold, 15% is held out as an additional validation set for early stopping, so that training does not overfit the training data. We use cross-entropy loss to evaluate our fit to the data and weight the data according to its distribution. We apply a weight decay of 0.001, a learning rate of 0.01 and the Adam optimizer (Kingma and Ba, 2014), which computes an adaptive learning rate for every parameter. We use mini-batches of 30 samples each for training. Cosine annealing (Loshchilov and Hutter, 2016) is used as a scheduler to adapt the learning rate, providing a warm restart at 25 epochs. Test and validation sets are sampled in a stratified fashion so that their class distributions are equal. We repeat this training process three times to create an ensemble of neural networks, each network trained for 50 epochs. Each network in the ensemble is initialized randomly using a Xavier normal function to capture different modes of the underlying solution space. In summary, training uses weight decay, dropout, ensembling and early stopping to prevent overfitting to the training data.
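The learning-rate schedule described above can be sketched as follows. This is an illustration rather than the authors' code: it assumes a minimum learning rate of 0 and a fixed restart period of 25 epochs (neither is stated explicitly), and the function name and parameters are hypothetical.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.01, min_lr=0.0, restart_period=25):
    """Cosine-annealed learning rate that jumps back to base_lr at every restart."""
    position = epoch % restart_period  # epochs elapsed since the last restart
    cosine = math.cos(math.pi * position / restart_period)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


# 50 training epochs give two cosine cycles: epochs 0-24 and 25-49.
schedule = [cosine_annealing_lr(epoch) for epoch in range(50)]
print(schedule[0], schedule[24], schedule[25])
```

The rate decays smoothly within each cycle and returns to 0.01 at epoch 25, which corresponds to the warm restart mentioned in the text.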

Model testing
We evaluate the validation set from the cross-validation procedure in an ensemble setting. During prediction, we calculate the mean prediction of all three networks (output1, output2, output3) and then apply a sigmoid function. The final prediction is the class with the highest output: argmax(out). The idea here is that by using multiple networks we can better model uncertainty and increase our accuracy. We report the accuracy of the ensemble prediction. The final result is the mean over all cross-validation runs.
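The ensemble step can be sketched as below. The sketch assumes that each network emits two raw scores per protein, ordered as [non-Acr, Acr]; this layout, the function name and the example numbers are illustrative assumptions, not taken from the DeepAcr code.

```python
import math


def ensemble_predict(outputs):
    """Average raw network outputs, apply a sigmoid, and return (class, scores).

    outputs: one list of per-class raw scores for each network in the ensemble.
    """
    n_classes = len(outputs[0])
    mean = [sum(out[c] for out in outputs) / len(outputs) for c in range(n_classes)]
    scores = [1.0 / (1.0 + math.exp(-value)) for value in mean]
    predicted = max(range(n_classes), key=lambda c: scores[c])  # argmax(out)
    return predicted, scores


# Made-up outputs of three networks for a single protein.
output1, output2, output3 = [-1.2, 0.8], [-0.5, 1.1], [0.3, -0.1]
label, scores = ensemble_predict([output1, output2, output3])
print(label, scores)
```

Because the sigmoid is monotonic, the argmax is the same before and after it is applied; the sigmoid only maps the averaged scores to the interval (0, 1).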

Hyperparameter optimization
To optimize our architecture, we ran a regularized Evolutionary Algorithm (EA) (Tan et al., 2002) for 120 iterations. At first, 25 initial architectures were created randomly. After evaluating all architectures, we create the next architectures to be evaluated. To do so, we draw the best three architectures from a tournament setting. In the tournament setting, three architectures are chosen at random, and only the best one is returned. The quality of an architecture is determined by its performance (single-objective). The chosen architectures are then either mutated (with a probability of 90%) or recombined with another architecture (with a probability of 10%). During mutation, a third of the hyperparameters are chosen at random and mutated using a Gaussian mutation procedure, sampling around the previous value with a standard deviation of 1/3 of the range of the hyperparameter. The values are then clipped if they exceed the range of the hyperparameter. During recombination, a third of the hyperparameters of an architecture are chosen at random and replaced with the values of a second architecture. The oldest three of all architectures are then discarded. Finally, we simply return the best-performing architectures. Regularized EA has previously been used for similar neural architecture search tasks (Tan et al., 2002), where it was concluded that evolution provides a simple optimization method that can yield good results, especially with limited computational resources.
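The loop described above can be sketched as follows. Several details are assumptions made for illustration: the search space `SPACE` and its ranges are invented, all hyperparameters are treated as continuous, "the best three" is interpreted as three independent tournaments, and `toy_fitness` is a stand-in for the cross-validated performance of a trained network.

```python
import random
from collections import deque

# Hypothetical search space: name -> (low, high). Not the ranges used in the paper.
SPACE = {"stride": (1, 20), "kernel_size": (1, 30), "dropout": (0.0, 0.5),
         "hidden_size": (8, 128), "learning_rate": (0.0001, 0.1),
         "weight_decay": (0.0, 0.01)}


def random_architecture(rng):
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SPACE.items()}


def pick_third(rng):
    """Choose a third of the hyperparameter names at random."""
    return rng.sample(sorted(SPACE), max(1, len(SPACE) // 3))


def mutate(arch, rng):
    child = dict(arch)
    for name in pick_third(rng):
        lo, hi = SPACE[name]
        value = rng.gauss(child[name], (hi - lo) / 3)  # std = 1/3 of the range
        child[name] = min(hi, max(lo, value))          # clip to the range
    return child


def recombine(arch, other, rng):
    child = dict(arch)
    for name in pick_third(rng):
        child[name] = other[name]
    return child


def tournament(population, rng, size=3):
    """Draw `size` individuals at random and return the fittest one."""
    return max(rng.sample(list(population), size), key=lambda ind: ind[1])


def regularized_evolution(fitness, iterations=120, population_size=25,
                          offspring=3, seed=0):
    rng = random.Random(seed)
    population = deque()  # ordered from oldest to newest
    for _ in range(population_size):
        arch = random_architecture(rng)
        population.append((arch, fitness(arch)))
    best = max(population, key=lambda ind: ind[1])
    for _ in range(iterations):
        parents = [tournament(population, rng) for _ in range(offspring)]
        children = []
        for parent, _ in parents:
            if rng.random() < 0.9:
                child = mutate(parent, rng)
            else:
                partner, _ = rng.choice(list(population))
                child = recombine(parent, partner, rng)
            children.append((child, fitness(child)))
        for child in children:
            population.append(child)
            population.popleft()  # discard the oldest architecture
        best = max([best] + children, key=lambda ind: ind[1])
    return best


def toy_fitness(arch):
    """Placeholder objective that peaks at the midpoint of every range."""
    return -sum(((arch[n] - (lo + hi) / 2) / (hi - lo)) ** 2
                for n, (lo, hi) in SPACE.items())


best_arch, best_score = regularized_evolution(toy_fitness)
print(best_score)
```

Discarding the oldest rather than the worst architectures is what makes the procedure "regularized": an architecture survives only if its descendants keep winning tournaments, which limits the influence of a single lucky evaluation.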

Prophage detection
Prophage locations were identified using the PHASTER web-server (Arndt et al., 2016).

Prediction of aca genes


For each of the 77 candidate acr genes, we searched for Aca proteins among the neighboring genes (five genes upstream and five genes downstream) using the hmmsearch tool (Finn et al., 2011) against the HTH models downloaded from the Pfam database (El-Gebali et al., 2019).
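The neighborhood selection can be sketched as below; the proteins encoded by the returned genes would then be passed to hmmsearch. The sketch assumes an ordered gene list on a linear contig (no wrap-around for circular genomes), and the gene identifiers and function name are hypothetical.

```python
def neighbor_window(gene_order, acr_gene, flank=5):
    """Return up to `flank` genes on each side of the candidate acr gene."""
    i = gene_order.index(acr_gene)
    upstream = gene_order[max(0, i - flank):i]
    downstream = gene_order[i + 1:i + 1 + flank]
    return upstream + downstream


# Hypothetical contig with twelve genes; the candidate sits near the contig start,
# so fewer than five upstream genes are available.
genes = ["g%d" % n for n in range(1, 13)]
print(neighbor_window(genes, "g3"))
```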

Prediction of CRISPR-Cas subtype associated with an Acr candidate


The association of subtypes with Acrs is determined as follows: 1) If the genome containing the Acr also contains a CRISPR-Cas system that can be detected using CRISPRcasIdentifier, Casboundary and CRISPRloci (Padilha et al., 2020, 2021; Alkhnbashi et al., 2021), then we use the subtype of the CRISPR-Cas system closest to the Acr on the genomic level. 2) If the genome containing the Acr does not have a detectable CRISPR-Cas system, then we investigate all genomes that are taxonomically close to the genome under investigation. We screen all these genomes using CRISPRcasIdentifier and Casboundary, and determine all possible CRISPR-Cas systems. If the taxonomically close organisms have only one subtype, e.g., VI-B, and enough organisms contain this subtype, then we classify the Acr as belonging to that subtype. If several subtypes or none are detected, the Acr is classified as "Unknown".
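The decision rule can be sketched as follows. The threshold `min_relatives` is a hypothetical placeholder, since the text does not quantify how many organisms count as "enough"; returning "Unknown" when that threshold is not met is likewise an assumption, as are the function name and the input layout.

```python
def assign_subtype(own_systems, relative_subtypes, min_relatives=3):
    """Assign a CRISPR-Cas subtype to an Acr candidate.

    own_systems: (subtype, distance_to_acr) pairs for systems detected in the
        genome that carries the Acr.
    relative_subtypes: one set of detected subtypes per taxonomically close genome.
    min_relatives: hypothetical cut-off for "enough organisms".
    """
    # Rule 1: use the closest system in the Acr's own genome.
    if own_systems:
        return min(own_systems, key=lambda system: system[1])[0]
    # Rule 2: fall back to taxonomically close genomes.
    observed = set().union(*relative_subtypes)
    if len(observed) == 1:
        subtype = next(iter(observed))
        carriers = sum(1 for subtypes in relative_subtypes if subtype in subtypes)
        if carriers >= min_relatives:
            return subtype
    return "Unknown"


print(assign_subtype([("I-E", 5000), ("VI-B", 1200)], []))
print(assign_subtype([], [{"VI-B"}, {"VI-B"}, {"VI-B"}, set()]))
print(assign_subtype([], [{"VI-B"}, {"II-C"}]))
```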

Strains and growth conditions


Supplementary Table S7 provides a list of the key resources used in this work, together with all strains and plasmids.

Cell-free transcription-translation assays


Plasmids encoding the nuclease, a crRNA or deGFP were used to assess targeting activity in a cell-free transcription-translation (TXTL) assay. The nuclease was under the control of a T7 promoter, so a plasmid encoding the T7 RNA polymerase had to be added. The crRNA encoded either a targeting (T) spacer or a non-targeting (NT) control. Plasmids encoding crRNA were cloned with Golden Gate Assembly, eliminating the restriction sites in the process of cloning. The plasmids encoding BzoCas13b and PbuCas13b were cloned by Gibson Assembly using Addgene plasmids 89898 and 89906, respectively, as a source of the encoded nuclease. The backbone was kindly provided by Chunyu Liao. The plasmids encoding the candidate Acrs were purchased from Twist Bioscience. All DNA used in TXTL was prepared with a midiprep kit followed by a second purification using a DNA clean-up kit. Each Acr candidate was pre-expressed in myTXTL master mix together with a plasmid encoding the T7 RNA polymerase and IPTG at final concentrations of 4 nM, 0.2 nM and 1.2 mM, respectively. To assess possible inhibitory effects of the putative Acrs, the nuclease plasmid was added separately, with either the targeting or the non-targeting crRNA plasmid, different amounts of the pre-expressed Acr candidates and the targeted deGFP plasmid, to myTXTL master mix (Arbor Biosciences), at a final concentration of 1 nM for each plasmid. The samples were then incubated at 29 °C for 16 h in a plate reader (BioTek Synergy Neo2) and fluorescence was measured every 3 min (excitation, emission: 485 nm, 528 nm). All data shown were produced using the Echo525 Liquid Handling system. The assays were therefore scaled down to 3-µl reactions per replicate, with four replicates each.

Plasmid interference assay in E. coli


The plasmids used here were constructed via either Golden Gate or Gibson assembly. The Addgene plasmid #89906 was used as backbone to include different spacers, either targeting deGFP or a non-targeting control. The BsaI restriction sites were removed in the process of cloning. The Acr and the no-Acr control were cloned via Gibson Assembly using a backbone encoding the cloDF origin and a kanamycin resistance marker. The constructed plasmids were prepared using the ZymoPURE II Plasmid Midiprep Kit (Zymo Research) and the sequence was confirmed via Sanger sequencing. E. coli BL21(DE3) cells carrying both the plasmid encoding the nuclease and a single-spacer CRISPR array and the plasmid encoding either AcrVIB1 or the no-Acr control were transformed with 50 ng of the plasmid encoding the deGFP target gene. After transformation, the cells were recovered in SOC for 1 h at 37 °C while shaking at 220 rpm. The cells were plated on LB agar plates with triple antibiotic (Cm, Kana, Amp) in 5-fold serial dilutions. After 16 h of growth, the colony numbers were recorded for further analysis. In addition to counting the colonies, photos of the plates were taken using the ImageQuant 800 (Amersham).

Plaque formation assay in E. coli


The plasmids encoding the nuclease and gRNAs used here were constructed via Golden Gate Assembly; the BsaI restriction sites were eliminated in the process of adding the respective spacer between the two repeats. The constructed plasmids were prepped using the ZR Plasmid Miniprep-Classic Kit (Zymo Research) and the sequence was confirmed via Sanger sequencing. Two different plasmids were transformed into NEB Turbo competent E. coli cells: the plasmid expressing both PbuCas13b and the direct repeat-spacer-direct repeat array, and another plasmid expressing AcrVIB1 or the empty control plasmid with no Acr encoded. After the transformation, the cells were plated onto LB-agar plates containing the appropriate antibiotics. Following incubation at 37 °C overnight, three single colonies (three biological replicates) were inoculated and grown overnight at 37 °C in LB-antibiotic medium. The overnight cultures were diluted to OD600 = 0.05 and grown to OD600 ≈ 0.5 at 37 °C in 10 ml LB-antibiotic medium. The cells were harvested once by centrifugation at 4000 rpm for 10 min at room temperature, the supernatant was removed and the cell pellet was resuspended in 1 ml of LB-antibiotic medium. 750 µl of the cells were mixed with 4 ml of "soft" agar (10 g/L tryptone, 5 g/L yeast extract, 5 g/L NaCl, 7.5 g/L agar) containing the appropriate antibiotics and preheated to 60 °C in advance. The cell-agar mix was poured onto LB-antibiotic agar plates (24 ml LB-antibiotic agar in Petri dishes with the following dimensions: diameter 90 mm, height 16.2 mm), and the plates were left until the top layer solidified. 10-fold serial dilutions of the MS2 phage were prepared in LB medium. 3 µl of each dilution were spotted onto the top layer using a multichannel pipette. The lowest dilution of the MS2 phage was 1.3 × 10^9 pfu, the highest dilution was 1.3 × 10^5 pfu. After the droplets with the phage had dried, the plates were incubated at 37 °C overnight. The following morning, photos of the plates were taken with the ImageQuant 800 (Amersham).
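As a worked example of the dilution series, the sketch below lists the titers spanned by 10-fold steps between the two stated end points. That this corresponds to five spots per row is an inference from those end points, and the function name is hypothetical.

```python
def dilution_series(start_titer, fold=10, steps=5):
    """Titers obtained by repeatedly diluting `start_titer` by `fold`."""
    return [start_titer / fold ** step for step in range(steps)]


# From the least diluted (1.3e9 pfu) to the most diluted (1.3e5 pfu) spot.
for titer in dilution_series(1.3e9):
    print("%.1e pfu" % titer)
```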

QUANTIFICATION AND STATISTICAL ANALYSIS

As part of the analysis of TXTL fluorescence data, the background fluorescence was subtracted from all samples. Background fluorescence was measured using samples that only contained myTXTL mix and water. When the data are presented in the form of a bar chart, Grubbs' test was performed on the values after 16 h to identify outliers between replicates (α = 0.1). If no outliers were identified, one of the four replicates was discarded randomly. The time course graph shows the average deGFP fluorescence over time together with the standard deviation. The percent inhibition of nuclease activity by each Acr candidate was calculated using the fluorescence values after 16 h in the following formula:
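\[
\%\ \text{Inhibition of nuclease activity} = 100\% \times \left( \frac{\dfrac{\mathrm{GFP}_{t,\mathrm{Acr}}}{\mathrm{GFP}_{nt,\mathrm{Acr}}} - \dfrac{\mathrm{GFP}_{t,-}}{\mathrm{GFP}_{nt,-}}}{1 - \dfrac{\mathrm{GFP}_{t,-}}{\mathrm{GFP}_{nt,-}}} \right),
\]
where $\mathrm{GFP}_{t,\mathrm{Acr}}$ is the GFP fluorescence in presence of a targeting gRNA and an Acr candidate, $\mathrm{GFP}_{nt,\mathrm{Acr}}$ is the GFP fluorescence in presence of a non-targeting gRNA and an Acr candidate, $\mathrm{GFP}_{t,-}$ is the GFP fluorescence in presence of a targeting gRNA and no Acr, and $\mathrm{GFP}_{nt,-}$ is the GFP fluorescence in presence of a non-targeting gRNA and no Acr.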
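The calculation can be sketched as a small function. It assumes the formula compares the targeting/non-targeting fluorescence ratio with an Acr candidate against the same ratio without an Acr, as the four symbol definitions suggest, and that all inputs are already background-subtracted. The function name and the fluorescence values are made up for illustration.

```python
def percent_inhibition(gfp_t_acr, gfp_nt_acr, gfp_t, gfp_nt):
    """Percent inhibition of nuclease activity from four end-point readings.

    gfp_t_acr / gfp_nt_acr: targeting / non-targeting gRNA with the Acr candidate.
    gfp_t / gfp_nt: targeting / non-targeting gRNA without an Acr.
    """
    ratio_with_acr = gfp_t_acr / gfp_nt_acr
    ratio_without_acr = gfp_t / gfp_nt
    return 100.0 * (ratio_with_acr - ratio_without_acr) / (1.0 - ratio_without_acr)


# Made-up readings: targeting alone leaves 10% of the signal, and the Acr
# restores it to 55% of the non-targeting level.
print(percent_inhibition(gfp_t_acr=550.0, gfp_nt_acr=1000.0, gfp_t=100.0, gfp_nt=1000.0))
```

With this normalisation, a candidate that has no effect on targeting scores 0%, and one that restores fluorescence to the non-targeting level scores 100%.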
