A survey of ADP-ribosyltransferase families in the pathogenic Legionella
BMC Genomics volume 26, Article number: 915 (2025)
Abstract
ADP-ribosyltransferases (ARTs) are a superfamily of enzymes implicated in various cellular processes, including pathogenic mechanisms. The Legionella genus, known for causing Legionnaires’ disease, possesses diverse ART-like effectors. This study explores the proteomes of 41 Legionella species to bioinformatically identify and characterise novel ART-like families, providing insights into their potential roles in pathogenesis and host interactions.
Results
Our analysis identified 63 proteins with convincing similarity to ARTs, organised into 39 ART-like families, including 26 novel families. Key findings include:
● DUF2971 family: exhibits sequence similarity to DarT toxins and other DNA-acting ARTs.
● DUF4291 family: the largest newly identified family shows structural and sequence similarity to the diphtheria toxin, suggesting the ability to modify proteins.
Most members of the novel ART families are predicted effectors. Although experimental validation of the predicted ART effector functions is necessary, the novel ART-like families identified present promising targets for understanding Legionella pathogenicity and developing therapeutic strategies. We published a complete catalogue of our results in the astARTe database: http://bioinfo.sggw.edu.pl/astarte/.
Conclusion
Detailed bioinformatic analyses of the proteomes of pathogenic bacteria reveal novel enzyme families likely involved in interactions with the host.
Background
ADP-ribosyltransferases (ARTs) are enzymes that chemically modify proteins, nucleic acids and small molecules by covalently attaching an ADP-ribose moiety derived from oxidised nicotinamide adenine dinucleotide (NAD⁺) via N-, O- or S-glycosidic bonds [1,2,3]. ADP-ribosyltransferases are typically classified as monoARTs or poly(ADP-ribose) polymerases (PARPs), which catalyse the addition of a single ADP-ribose moiety or of ADP-ribose polymers, respectively, to their targets. The ADP-ribose chains synthesised by PARPs can be either linear or branched [2, 4,5,6].
ADP-ribosyltransferases are common in nature and are found in all domains of life and in viruses. The characteristic feature of the ART superfamily is a conserved structural core: a split beta-sheet of six or seven antiparallel beta strands surrounded by alpha helices. The NAD⁺ molecule is sandwiched between the two halves of the split beta sheet [2, 3, 7]. Almost all ADP-ribosyltransferases have additional protein domains accompanying the ART catalytic domain that target the enzymes to their substrates and to specific cell locations [3, 8, 9]. In contrast to the structural similarity, sequence conservation between ART families is poor, which is also reflected in the diverse amino acid composition of the active sites. Classified by active site type, the ART superfamily contains four main clades: HHh, RSE, HYE, and the so-called atypical clade [2].
Mono-ADP-ribosylation is believed to have originally appeared in bacteria as a defence mechanism against viruses or bacteria [2, 10]. Poly-ADP-ribosylation emerged in Eukaryotes as a process involved in DNA-repair, modulation of chromatin structure and programmed cell death [11].
During infection, ADP-ribosylation of host proteins by bacterial effectors can change the apoptotic potential of the cell, disrupt the organisation of the actin cytoskeleton, alter the organisation of cell membranes, and interfere with the cellular immune response [2, 3, 12,13,14,15]. Effector proteins are directly translocated into host cells via specialized secretion systems (e.g., type III, IV, or VI secretion systems), where they manipulate host cellular processes to facilitate bacterial survival or replication [16]. Typically, in known effector-releasing bacteria such as Legionella, effectors are not located in the same operon as the secretion system that releases them.
Other toxins ADP-ribosylate nucleotides in single-stranded DNA (ssDNA). Modification of the chromosomal origin of replication in this way inhibits cell growth [17]. A similar mechanism is used by bacteria for self-defence during phage infection. For example, DarT from M. tuberculosis selectively ADP-ribosylates thymidine nucleotides within phage ssDNA, which blocks phage DNA replication [17].
In this study, we aimed to comprehensively identify and classify ART-like proteins, regardless of their functional annotation as toxins, effectors, or uncharacterised proteins. This inclusive approach reflects the biological reality that many ADP-ribosyltransferases used by pathogens do not fall neatly into one category, and some exhibit features of both toxins and effectors depending on their context, delivery mechanism, and host target.
Legionella is a genus comprising free-living, biofilm-associated, or host-associated bacteria [18]. Legionella infection in humans can cause severe atypical pneumonia called legionellosis or a milder version called Pontiac fever [19,20,21,22,23,24]. Currently, the genus Legionella comprises more than 70 different species of bacteria, about half of which have been found to be pathogenic to humans. Additionally, the majority are considered potential human pathogens [24,25,26]. After entering the host cell, Legionella creates a special vacuole, the Legionella-containing vacuole (LCV), in which it can survive and reproduce [27,28,29]. Effector proteins translocated by Legionella into the host cell can profoundly alter the host cell’s behaviour, which greatly facilitates Legionella replication and pathogenesis. These alterations include safeguarding the LCV from degradation, manipulating the endomembrane system, dampening the host immune response, and exploiting the host’s resources [30,31,32,33,34,35,36]. The most thoroughly characterized Legionella species, L. pneumophila, translocates over 330 effectors [16, 37]. All Legionella species contain and use a type IV secretion system, while some strains and species have additional systems (see Suppl. Table S7). Three recently described L. pneumophila effector families with ART functions and ART structures are lpg0181/Lart1 [38], lpg0080/ceg3 [39] and the SidE family of all-in-one ubiquitin ligases. Lart1 was identified as an ADP-ribosylating factor for a specific class of NAD⁺-dependent glutamate dehydrogenase (GDH) enzymes found in fungi and protists, which include many natural hosts of Legionella [38]. This modification inactivates GDH, thus reprogramming host metabolism. Ceg3 was identified as a mono-ADP-ribosyltransferase that localises to mitochondria in host cells and ADP-ribosylates the ADP/ATP translocase, which impairs its activity and hence modulates mitochondrial function [39, 40]. Notably, ceg3/lpg0080 is regulated by its metaeffector lpg0081, an ADPr hydrolase that can reverse the ADP-ribosylation [40, 41].
The SidE family consists of four multidomain proteins: lpg0234/sidE, lpg2157/sdeA, lpg2156/sdeB and lpg2153/sdeC. These proteins catalyze the ligation of ubiquitin (Ub) to Ser and Tyr residues on host proteins independently of the host Ub-conjugating enzymes. This atypical ubiquitination is achieved in two steps. The first step, performed by the ART domain of SidE, involves ADP-ribosylation of an arginine residue on ubiquitin. In the second step, the phosphodiesterase (PDE) domain of SidE hydrolyzes a bond in the ADP-ribose moiety and attaches Ub via a phosphoribose (pR) linkage to the target host protein. This atypical ubiquitination cannot be reversed by host deubiquitinases. Among the proteins modified by SidE effectors are small Rab GTPases; their modification ultimately protects the LCV from degradation by the host, allowing Legionella to establish a replication niche [42,43,44,45,46]. The biochemistry of the SidE family has recently been exploited in a novel molecular biology tool for proximity ligation analysis of protein-protein and protein-small molecule interactions [47].
Another Legionella effector protein related to the SidE family is lpg2523/lem26, which also contains ART and PDE domains, but in the reverse order compared to SidE. While the function of either domain of lem26 has not been elucidated, its genomic neighbor and metaeffector, lpg2526/mavL, a macrodomain protein, reverses the Arg ADP-ribosylation performed by the SidE ART domain [48].
In this article, drawing on our expertise in identifying novel enzyme families [38, 49], and motivated by the fact that many known bacterial effectors and toxins are ARTs [38, 39], we undertook a bioinformatics search and survey of novel ART-like domains in 41 Legionella proteomes. We identified 26 novel ART-like families, six of which we describe in detail. Presenting a catalogue of ART-like catalytic domains in Legionella, we compare the known and the novel families, predict biological functions for the novel ones, and examine their evolutionary history and occurrence across the bacterial world.
Methods
Sequence data
Proteomes for 41 species of Legionella were obtained from the study by Burstein et al. [16]. To reduce the computational burden, the sequences were clustered by sequence similarity in two steps using the CD-HIT algorithm with sequence identity cut-offs of 70% and 50% [50,51,52,53]. Next, the 21,616 resulting protein sequences from the 41 Legionella proteomes were cut into fragments of 300 amino acids with an overlap of 100 amino acids. This fragmentation facilitates the detection of domains in multi-domain proteins, while the overlap helps detect domains located at fragment boundaries.
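The fragmentation step can be illustrated with a minimal Python sketch. It is not the script used in this study: the function name is hypothetical, and the handling of the last window (truncated at the sequence end) and of sequences shorter than 300 residues (kept as a single fragment) are assumptions, since the text does not specify them.

```python
def fragment_sequence(seq, size=300, overlap=100):
    """Cut a protein sequence into windows of `size` residues.

    Consecutive windows share `overlap` residues, so the window start
    advances by size - overlap (200 residues for the values in the text).
    Assumption: the final window is simply truncated at the sequence end.
    Returns a list of (start_index, fragment) tuples (0-based start).
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not seq:
        return []
    step = size - overlap
    fragments = []
    start = 0
    while True:
        fragments.append((start, seq[start:start + size]))
        if start + size >= len(seq):
            break  # this window already reaches the end of the sequence
        start += step
    return fragments


# Toy example: a 700-residue sequence yields windows starting at 0, 200 and 400.
starts = [start for start, _ in fragment_sequence("A" * 700)]
```

Keeping the start index with each fragment makes it possible to map a hit found in a fragment back to coordinates in the full-length protein.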
Remote homology detection
For distant similarity recognition to ART families, four methods were used: (1) the profile-profile alignment and fold recognition algorithm FFAS [54] (searching the COG, Hsapiens, PDB, SCOP, ECOD and Pfam databases); (2) the homology detection and structure prediction method HHpred [55] and the HHsearch pipeline (databases searched: Pfam, PDB, SCOP), which use hidden Markov model (HMM)-to-HMM comparison; (3) a similar method, Phyre2 [56], which additionally models the 3D structure of the query and compares it with a library of 3D models; and (4) HMMER [57], which detects homology by comparing a profile HMM to either a single sequence or a database of sequences. Standard parameters and significance thresholds were used, except for HHpred, where several parameters were modified: MSA (multiple sequence alignment) generation method (all options were used); alignment mode (local: norealign; local: realign; global: realign); minimum probability in hitlist (10% or 20%); minimum coverage of MSA hits (10% or 20%); number of MSA generation iterations (0, 3, 5 or 8); and E-value cutoff (1e-3, 0.05 or 0.1) (see Fig. 1 and Suppl. Tables S1-S5).
Multiple sequence alignments and sequence logos
Members of novel families were manually collected using BLAST [58, 59] against the non-redundant sequence database (NR) [60] (E-value = 1e-4 or default options). When BLAST found fewer than five homologs, PSI-BLAST was used (3 or 5 iterations with default options) [59]. Multiple sequence alignments were made using the MAFFT algorithm with default settings [61, 62]. Sequence logos were then prepared using WebLogo3 [63], with colouring according to amino acid chemistry. For the logos, the alignments were processed with an in-house script that removes alignment columns that contain gaps in the reference sequence.
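The column-filtering step can be sketched as follows. This is an illustration rather than the in-house script itself: the function name, the dictionary representation of the alignment, and the set of characters treated as gaps are assumptions.

```python
def drop_reference_gap_columns(alignment, reference_id, gap_chars="-."):
    """Remove alignment columns in which the reference sequence has a gap.

    `alignment` maps sequence names to aligned strings of equal length
    (a hypothetical in-memory representation of a MAFFT alignment).
    After filtering, column i of the result corresponds to residue i
    of the ungapped reference sequence.
    """
    reference = alignment[reference_id]
    if any(len(seq) != len(reference) for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Indices of the columns where the reference has a residue, not a gap.
    keep = [i for i, char in enumerate(reference) if char not in gap_chars]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


# Toy alignment: columns 3 and 6 are gaps in the reference and are removed.
toy = {"ref": "MK-LV-", "hom1": "MRALVG", "hom2": "-K-LIG"}
filtered = drop_reference_gap_columns(toy, "ref")
```

With this filtering, logo positions can be numbered directly by the residues of the reference sequence.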
Structure modelling and comparison
The structures of representative proteins from new ART-like families were modelled with the use of two different tools: AlphaFold [64] and RoseTTAFold [65]. Comparisons of structures were performed with FATCAT [66], Dali server [67] and TM-align [68] (Suppl. Table S6 and data made available in the astARTe online database).
Visual clustering of families (sequence analysis of families relations)
The CLANS program [69] was used to visualise the relationships between clusters of ADP-ribosyltransferases families. The collection of the known ADP-ribosyltransferase domains was obtained from the Pfam database, as the sets of sequences from families of ADP-ribosyl clan CL0084 – Pfam 35.0 [70] (see Suppl. Table S9B). Then the sequence sets from ART families not yet included in the ADP-ribosyl Pfam clan were added. Next, collected data were supplemented with sequence sets representing the following ART-like domain families not included in the Pfam database (see Suppl. Table S9B) and, in the end, we added all 26 newly predicted ART families with their homologs collected by one of the three methods with cut-off threshold of 99%: (1) BLAST, NR, E-value = 1e-4; (2) PSI-BLAST, two iterations, NR, default; (3) PSI-BLAST, three iterations, NR, default (see Suppl. Table S9C). Additionally, two families DUF2971 (PF11185, rp15 sequence set) and DUF4291, rp15 set were included (see all details in Suppl. Table S9A).
Next, the whole collection of putative and known domain sequences was prepared for the CLANS procedure by clustering each family separately with CD-HIT and selecting representatives at a 70% (known domains) or 99% (putative domains) sequence identity threshold [53]. CLANS was run with the following parameters: BLOSUM45 scoring matrix and an E-value threshold of 1 or 1e-2 (0.01).
HMMs of known families for searching sequence databases and profile databases
Sequence sets for HMM construction were collected in a similar manner to the data for the CLANS analysis. The most important difference is that all sequence sets for families present in the Pfam database were downloaded as rp75 sequence sets from Pfam 34.0 [70] (see all details in Suppl. Table S9B).
In the next step, we rejected short sequences (less than 50 amino acids) and applied a sequence similarity cut-off threshold of 80% (CD-HIT). The ClustalO program [71] was used to build multiple sequence alignments. HMMs were prepared by the hmmbuild program from the HMMER software package [57].
Species dendrogram and estimating numbers of homologs
A dendrogram was built from an alignment of 16S rRNA sequences for the species in the family Legionellaceae. The sequences were downloaded from NCBI [72] and the SILVA database [73]. The alignment was performed via the ngphylogeny.fr server [74, 75] using the Muscle 3.8.31 algorithm with default parameters [76]. The dendrogram was constructed using the PhyML method with the Approximate Likelihood-Ratio Test (aLRT): SH-like [77]. Homologs for all families were collected using the hmmsearch algorithm (E-value = 1e-4) [57], with the HMMs of known and new ART and ART-like families as queries against the RefSeq database [60] (see all details in Suppl. Table S9C).
The species dendrogram was visualised using the iTOL server [78].
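Counting homologs per family from hmmsearch results can be sketched as below. This is an illustrative example, not the workflow used here: it assumes that results were written in the tabular `--tblout` format of HMMER3 (query name in the third column, full-sequence E-value in the fifth), that the query name equals the family name, and that the threshold is inclusive. The function name and the toy records are hypothetical.

```python
from collections import Counter


def count_homologs(tblout_lines, max_evalue=1e-4):
    """Count target sequences per query HMM in hmmsearch --tblout output.

    Each non-comment line describes one target/query pair; whitespace-
    separated field 2 is the query name and field 4 the full-sequence
    E-value (0-based indices). Hits with E-value <= max_evalue are counted.
    """
    counts = Counter()
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/footer comments
        fields = line.split()
        query_name = fields[2]
        evalue = float(fields[4])
        if evalue <= max_evalue:
            counts[query_name] += 1
    return counts


# Toy records (made-up targets and scores) in tblout column order.
toy_lines = [
    "# target name  accession  query name  accession  E-value ...",
    "seqA - DUF2971 - 1.2e-30 105.3 0.1 1.5e-30 105.0 0.1 1.0 1 0 0 1 1 1 1 toy",
    "seqB - DUF2971 - 3.0e-03 12.1 0.0 4.0e-03 11.8 0.0 1.0 1 0 0 1 1 1 0 toy",
    "seqC - Lsan_0116 - 1e-4 20.0 0.0 2e-4 19.5 0.0 1.0 1 0 0 1 1 1 1 toy",
]
homolog_counts = count_homologs(toy_lines)
```

Per-family counts of this kind, grouped by source organism, are the sort of values that can be displayed next to a species dendrogram.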
Structure similarity network
Sequences for each of the new families were collected (see Suppl. Table S9D) and protein domains were modelled using ColabFold [79] (an implementation of AlphaFold [64] using the fast homology search of MMseqs2 [80, 81]) or ESMFold [82]. The resulting 432 models and 59 ART structures from the ECOD40 (version 2022/09/12, develop286) database were compared (all to all) using the TM-align algorithm [68]. A structure similarity network was then constructed in CLANS from the TM-score values obtained for each pair of structures (see Suppl. Table S9D). We used the nTM-score, i.e. the score normalised by the smaller structure, kept pairs with an nTM-score better than 0.3, and transformed the score as −ln(nTM-score), so that the new score is in the range 0–1. We checked the quality of the models by analysing the pLDDT and pTM parameters for each model (see Suppl. Table S9E). The same set of sequences was used for the sequence similarity network.
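The score transformation described above can be written out as a short Python sketch. The function names and the pair dictionary are hypothetical, and how the transformed values were passed to CLANS is not specified here; the code only shows the threshold and the negative-logarithm step.

```python
import math


def transform_ntm_score(ntm_score, threshold=0.3):
    """Return -ln(nTM-score) for pairs above the threshold, else None.

    `ntm_score` is the TM-score normalised by the smaller structure and
    must lie in (0, 1]. A perfect match (1.0) maps to 0, and weaker
    structural similarity maps to larger values.
    """
    if not 0.0 < ntm_score <= 1.0:
        raise ValueError("nTM-score must be in the interval (0, 1]")
    if ntm_score <= threshold:
        return None  # pair not similar enough to form an edge
    return -math.log(ntm_score)


def build_edges(pair_scores, threshold=0.3):
    """Keep only the structure pairs that pass the threshold.

    `pair_scores` maps (model_a, model_b) tuples to nTM-scores.
    """
    edges = {}
    for pair, score in pair_scores.items():
        value = transform_ntm_score(score, threshold)
        if value is not None:
            edges[pair] = value
    return edges


# Toy scores for three hypothetical model pairs; the 0.25 pair is dropped.
toy_pairs = {("m1", "m2"): 0.80, ("m1", "m3"): 0.25, ("m2", "m3"): 0.50}
edges = build_edges(toy_pairs)
```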
NAD docking to models
We built ColabFold models for reference sequences from the six new families discussed in this article. NAD docking was performed using the HADDOCK 2.4 (High Ambiguity Driven protein-protein DOCKing) web service [83, 84], with automatic (CPORT [85]) and manual specification of the binding site residues. The results were visualised in UCSF Chimera [86], using the ConSurf server [87] to map MSA conservation onto the reference sequence models of the six families. Structure superpositions and the closest structural homologues of the models were obtained using the DALI server. Additionally, representative proteins of the described families were modelled with AlphaFold3 [88], including NAD as a ligand. The high-quality models obtained (pLDDT > 90, pTM > 90) confirmed the docking results, namely the location and position of the NAD (data not shown).
Additional analyses
The subcellular localization was predicted by SubCons [89]. Analysis of genomic neighbourhoods was done using the ProFaNa tool [90] and we used the BioCyc portal for detailed analysis of the operons [91]. The occurrence of additional domains was analysed using Batch CD-Search (RPS-BLAST) [92].
Predictions of effector function
Prediction of the presence of signal peptides was made in SignalP 6.0 [93]. Predictions of secreted effector function were done using EffectiveDB [94, 95] and BastionX [96] with standard parameters.
Alternative approach to the detection of distantly related homologues
Alternatively, by using RoseTTAFold structure models for Pfam families, we found an ART-like family in Legionella that was not detectable by sequence similarity. Here, the pipeline included pairwise TM-align structure alignments between RoseTTAFold models provided in the Pfam 35.0 database and a set of reference ART domain structures from the ECOD database (ECOD40).
Definition of new families
To define what is meant by “novel family” we applied several criteria. A novel family is a set of proteins not described in the literature as ADP-ribosyltransferases and not annotated as such, with similarity to known ARTs not detectable using standard sequence analysis methods (RPS-BLAST or Pfam HMM). Additionally, a family designated as “novel” must, based on CLANS clustering analysis, be sufficiently different from known and other novel ART families.
Scripts and workflow
The scripts and workflow used in this study are available at https://github.com/Mar-Gra-creator. This version reflects the exact pipeline used in the current analyses and is provided to ensure transparency and reproducibility. Further development and formal publication of the software are planned for future work.
Results
Search for ART superfamily members in Legionella
We started our survey with the pan-proteome from 41 Legionella species (see Methods). After data pre-processing (see Methods), 21,616 sequences were analysed by FFAS03, HMMER, and HHsearch for sequence similarity to known protein domains. The raw results were filtered by a Python text-mining script that retrieved all ART-like hits using Pfam, PDB, and SCOP identifiers as well as keywords in protein names. In this way, 63 potential ART-like domains exhibiting any similarity to known ART-like domains, including statistically non-significant similarity, were identified (Suppl. Tables S2-S4). In the next step, we applied RPS-BLAST validation and literature searches, and known ADP-ribosyltransferases were filtered out. The known hits included 21 proteins from 10 families: RES (Pfam identifier PF08808) [97, 98], PARP (PF00644) [99], ART (PF01129) [1, 100], Dot_icm_IcmQ (PF09475) [101], DarT (PF14487) [17], DUF952 (PF06108) [2], lpg0080 [39], Lart1 [38] and the poorly studied AbiGi family (PF10899) [102], presumably part of the type IV toxin-antitoxin system, as well as the FRG family identified recently by Aravind [2, 103] as an ART-like family present in Legionella (Suppl. Tables S1-S2). Forty-two potential ART-like FFAS-HHsearch-HMMER hits, not automatically classified as ART-like, were manually verified using distant sequence similarity search methods (Phyre2, HHpred and CLANS analysis) and structure prediction tools (RoseTTAFold and AlphaFold), followed by structural comparisons (FATCAT and Dali servers) (Suppl. Tables S5-S6). After discarding 3 false positive ARTs, the remaining 39 atypical ART proteins were grouped into 26 novel families (see Suppl. Tables S6, S7 and S11).
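The identifier-and-keyword filtering of raw hits can be sketched in Python as follows. This is only an illustration of the idea, not the script used in the study (which is available from the repository given in Methods). The hit record layout, the function names and the keyword pattern are assumptions; the Pfam identifiers are those of the known families listed above.

```python
import re

# Pfam identifiers of known ART families named in the text.
ART_PFAM_IDS = {
    "PF08808", "PF00644", "PF01129", "PF09475",
    "PF14487", "PF06108", "PF10899",
}
# Assumed keyword pattern for ART-related protein names.
ART_KEYWORDS = re.compile(r"ADP[- ]?ribos|\bPARP\b", re.IGNORECASE)


def is_art_like(hit):
    """Return True if a hit matches a known ART identifier or keyword.

    `hit` is a dict with hypothetical keys 'target_id' (database
    identifier, optionally with a version suffix such as '.1') and
    'description' (free-text name of the matched domain or protein).
    """
    target = hit["target_id"].split(".")[0].upper()
    if target in ART_PFAM_IDS:
        return True
    return bool(ART_KEYWORDS.search(hit["description"]))


def art_like_queries(hits):
    """Return the sorted, de-duplicated query names with an ART-like hit."""
    return sorted({hit["query"] for hit in hits if is_art_like(hit)})


# Toy hit records (query names and descriptions are made up).
toy_hits = [
    {"query": "q1", "target_id": "PF14487.1", "description": "DarT"},
    {"query": "q2", "target_id": "PF00005", "description": "ABC transporter"},
    {"query": "q3", "target_id": "1abc", "description": "mono-ADP-ribosyltransferase"},
    {"query": "q1", "target_id": "PF01129", "description": "ART"},
]
candidates = art_like_queries(toy_hits)
```

A permissive filter of this kind deliberately keeps statistically weak hits, leaving the rejection of false positives to later manual verification.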
We identified unequivocal active site signatures in 15 of these families. In another 7 families, we found the presence of conservative substitutions in the predicted active sites (see Fig. 2; Table 1). In the remaining 4 families, the active site is partly non-conserved which suggests they are pseudoenzymes, i.e. proteins that are homologous to active enzymes but presumably lack catalytic activity because of mutations to critical active site amino acids. As a result, we classified six of the new families into the HYE clade, one into the HHh clade, one into the atypical clade, and the rest (18 families) into the RSE clade (Fig. 3).
Fig. 2. The tripartite active site signature [R/H]x[D/T] - [S/Y][T/x][S/x] - [Q/E]x[Q/E/D] is well conserved in most new ART families. Active site motif sequence logos are shown for selected families. An archetypal ADP-ribosyltransferase, enterotoxin A (ctxA), is shown at the top. Residues are numbered (top row) according to the ctxA sequence [104]. The DB column indicates the source database of ART sequences used for the logos. N is the number of homologous sequences from the BLAST search (E-value = 1e-4). In parentheses: numbers of homologous sequences after CD-HIT clustering at 99% sequence identity.
The similarity of the DUF2971 family to DarT proteins and scabin, a DNA-acting ADP-ribosyltransferase [118], suggests that DUF2971 effectors could modify host or invader DNA.
Fig. 3. Taxonomic distribution of members of novel ART families in bacterial phyla. The family Legionellaceae was considered separately. The number of bacterial strains in which homologues of new families were found is shown on a logarithmic scale (see legend of the figure). Colours reflect ART clade membership.
CLANS sequence and structure similarity graphs capture distant relationships between the novel Legionella ART-like families and fifty-nine representative known ART domains from ECOD40 database [105] (see Figs. 4 and 5). The CLANS graph is created by comparing sequences using pairwise alignments from BLAST or pairwise structure comparisons using TM-align. The CLANS sequence and structure similarity networks confirm the characteristics of ART-like superfamily of proteins, i.e., high sequence divergence (Fig. 4A) – even within a single clade – and a highly conserved tertiary structure core (Fig. 4B), also between clades. Even well-characterised ART proteins from the ECOD40 database do not always show sequence similarity high enough to be included in the graph, even with a very relaxed threshold: E-value = 1 (Fig. 4A). It is also apparent that there is more similarity within clades than between them, both when comparing sequences and structures. The small HHh clade, the minimal version of the ART-like fold (only six strands, without any extended inserts), locates centrally in the structural CLANS graph. Our analysis showed that the novel ART-like families: (1) form separate clusters of proteins and (2) generally do not group together with established, well-studied ARTs, thus confirming their distinct family status within the ART-like superfamily.
Fig. 4. Sequence and structure similarity relationships between 6 novel and 26 known ART families (defined in the ECOD40 dataset). The CLANS graph shows: (A) the sequence similarities obtained using pairwise BLAST comparisons, taking into account significant and borderline significance similarities up to the E-value of 1 (BLOSUM45); (B) the structure similarities obtained using TM-align all-to-all comparisons (with TM-score threshold better than 0.3). Novel families discussed in detail are shown: DUF2971 – turquoise, Lsan_0116 – yellow, DUF4291 – blue, Lsan_2474 – purple, Lani_1641 – pink and Lmac_3114 – green. Known ART families (as defined in the ECOD40 dataset) are marked in red (crosses – clade HHh, dots – clade RSE, triangles – clade HYE, asterisks – atypical clade).
Fig. 5. Sequence similarity relationships between 26 new and all 51 known ART families. A and C: all ART families; B and D: subset of families that are closely similar to E. coli heat-labile enterotoxin (LT). The CLANS graph shows the sequence similarities obtained using pairwise BLAST comparisons, taking into account significant and borderline significance similarities: up to the E-value of 1 (A and B) and up to the E-value of 0.01 (C and D). Red symbols: novel ART families. Blue names: names of selected novel ART families. Black names: names of selected known ART families. Not all family names are shown to avoid clutter.
In-depth sequence similarity analyses (Fig. 5) compare all novel ART families (including those not described in detail in this article) with known ART-like families. The separation of the novel families (indicated in red) is clearly visible. Some of the ARTs show similarity to E. coli heat-labile enterotoxin (LT) (Fig. 5B and D).
Out of the twenty-six novel Legionella ART-like families (Fig. 6; Table 1), we characterised in more detail the six most interesting ones, considering the breadth of their taxonomic distribution, the numbers of representatives, the features of their operons and the conservation of key motifs. The results from this study are compiled into an ART database available at http://bioinfo.sggw.edu.pl/astarte/. This resource combines the new ART families with already known ones. In total, the database presents data on 77 ART families. We provide sequences, HMMs, sequence logos, 3D structure models, and brief descriptions.
Fig. 6. Distribution of ART families in Legionella strains. Numbers of homologues of novel and known ADP-ribosyltransferase families in selected members of the family Legionellaceae. Histograms on the right: numbers of members of known (grey) and novel (yellow) ART families identified in each species. New families are marked with asterisks above the heatmap. The DUF4291 family was not included in the diagram because Legionella homologs were not found in the RefSeq database (see Methods).
The DUF2971 family
The putative effector protein lpg1268 from Legionella pneumophila subsp. pneumophila and homologs from 13 other Legionella species are predicted as a novel ADP-ribosyltransferase family. The DUF2971 domain (DUF, domain of unknown function) shows statistically significant sequence similarity to known ARTs, especially AbiGi, DarT, and Tox-ART-HYD1 families (see Figs. 2, 4 and 5). DarT is the toxin element of a toxin-antitoxin (TA) system. It is an enzyme that specifically ADP-ribosylates thymidine nucleotides on ssDNA in a sequence-specific manner [17, 106].
The ADP-ribosyltransferase catalytic core appears to be well conserved in this family, although the histidine residue in the catalytic motif I is poorly conserved. There is a strictly conserved tyrosine residue in motif II and two conserved glutamic acid residues in the ExE motif III.
The DUF2971 family is present in 5981 species (UniProt database), mainly in Bacteria (5913 species), but also in Archaea (56 species), Eukaryota (6 hits, mostly in fungi) and Caudoviricetes, the tailed bacteriophages (12 hits). Among the DUF2971-possessing bacteria, one numerous group consists of soil-living saprophytic Clostridiales that ferment plant polysaccharides (82 species). Another large group comprises ubiquitous bacteria of the order Enterobacteriales, which are abundant in the human large intestine, on the skin and in the oropharynx, or live freely in water. Most are opportunistic pathogens that infect organisms with a weakened immune system, such as Salmonella, Escherichia coli, Klebsiella or Shigella. Notably, DUF2971 homologues can be found in strains pathogenic to humans, such as Escherichia coli O157:H7 (EHEC), Salmonella enterica and Vibrio cholerae.
The appearance of the active site and the structure of the protein (the core is built of six beta strands with inserts between them) leave no doubt that this family should be included in the HYE clade (see Fig. 2, Supplementary material 4: Fig. S1 and Supplementary material 5: Fig. S2). The structural divergence from the HYE clade seen in the CLANS analysis (see Fig. 4) is due to unusually large helical inserts between β-strands 1 and 2, as well as between β-strands 4 and 5, that do not disrupt the core of the structure (see Supplementary material 4: Fig. S1).
About 18% of proteins having a DUF2971 domain may function as bacterial effectors (Fig. 7 and Suppl. Table S7). Predicted DUF2971 effectors are found in well-known pathogenic strains of diverse bacteria: S. enterica, V. parahaemolyticus, V. cholerae, L. pneumophila (Suppl. Table S7), suggesting that DUF2971 may play a role in their pathogenic mechanisms.
The DUF2971 domain proteins are, in a few instances, found in operons together with peptidase C26 and low affinity iron permeases. We propose that these proteins may interact in response to oxidative stress, regulation of iron metabolism and protein degradation. Under stress, such as iron excess, peptidase C26 can degrade poly-gamma-glutamate, which binds iron ions, reducing their toxicity. The degradation of poly-gamma-glutamates provides glutamic acid, a precursor to important metabolites, including glutathione, which protects against oxidative stress. Peptidase C26 can influence the availability of metabolites and iron ions by modulating the activity of ADP-ribosyltransferases in response to stress or cell damage. Iron permease regulates the transport of iron ions into the cell, and degradation of poly-gamma-glutamate by peptidase C26 affects intracellular iron levels, which is crucial for homeostasis and protection against oxidative stress. The operon-coordinated response to oxidative stress and regulation of iron metabolism promotes bacterial adaptation to changing environmental conditions.
In another interesting operon variant, we found DUF2971 together with Abi_C, KfrA_N, and a dermonecrotic toxin of the papain-like fold (see Fig. 8). The grouping of these genes in a single operon suggests a coordinated response to various forms of stress, such as bacteriophage infections, competitive pressure, and the regulation of plasmid replication. The ADP-ribosyltransferase (ART) may be involved in stress response and DNA repair. The Abi_C domain induces abortive infection, which leads to the death of the infected cell and prevents bacteriophage spread. The presence of KfrA_N suggests involvement in the control of plasmid replication, which is crucial for the propagation of resistance genes. Papain-like fold toxins can destroy competing bacteria or protect against pathogens, which is vital in competitive environments. The integration of DNA repair mechanisms, bacteriophage defence, plasmid control, and toxin production indicates a complex system of defence and adaptation, enhancing the bacteria’s chances of survival in changing conditions.
Lsan_0116 – an effector family common to bacteria and archaea
The Lsan_0116 novel ART-like family, named after the cognate Legionella santicrucis protein, shows a noticeable similarity (13% sequences identity by HHpred) to the above-described DUF2971 (see Fig. 4, CLANS diagram). However, the two families present different types of active sites: Lsan_0116 presents an RSE motif, while DUF2971 presents a modified HYE motif (see Fig. 2). Therefore, we consider them to be separate families, even though distant homologs of the two families overlap. We discovered it mainly in bacteria, present in most major bacterial taxa (see Fig. 3), and also sparsely in archaeons.
The active site logo (Fig. 2) shows an arginine or lysine residue in motif I and two glutamic acid residues from motif III. Motif II, which is responsible for the groove for binding the nicotinamide ring and ribose and catalysis [107], is the most unusual. In this family, motif II has the form of [SC]WH instead of the canonical STS. The protein structure presented by the AlphaFold model is also unusual (see Supplementary material 4: Fig. S1 and Supplementary material 5: Fig. S2) – the active site is formed by the first, third and seventh beta sheets (1-3-7 instead of the typical 1-2-5 layout). The structural divergence from the RSE clade is evident in the CLANS analysis (see Fig. 4) and is due to unusual sequence insertions within the catalytic core sequence.
The majority of Lsan_0116 family members (80%) are predicted effectors, in contrast to the related DUF2971 family, in which only 18% are likely effectors (Fig. 7 and Suppl. Table S7).
The DUF4291 family – the largest atypical novel ART family, present also outside Bacteria
In addition to sequence searches for novel ARTs, we performed a structural search among RoseTTAFold models of all families in the Pfam database (see Methods). A single domain of unknown function (DUF4291), present in Legionella, stood out as significantly similar by structure to known ARTs. Based on structural similarity and conserved sequence motifs in beta sheets 1, 2 and 5, an ART-like active site could be proposed (Fig. 2).
According to our survey, the DUF4291 is a novel effector family with similarity to ARTs. According to the Pfam database annotations, there are 2 conserved motifs in this uncharacterized family. Structural alignment suggests that these motifs are atypical counterparts of the motifs I and II of the ART catalytic site (QAY and WVK correspond to HxT and STS, respectively).
Although the predicted active site in this family has a very unusual appearance, the catalytic motif III includes a “classical” glutamine. The new family is strikingly well conserved. The DUF4291 domain shows statistically significant structural similarity to the HYE clade of ARTs, especially the Pfam families Exotox-A_cataly and RES. We could not identify significant sequence similarity to any known family of ARTs (Supplementary Material 4: Fig. S1 and Supplementary Material 5: Fig. S2). In addition, the correct docking of NAD to the obtained model suggests that the DUF4291 domain may have ART activity (Supplementary Material 5: Fig. S2).
The DUF4291 family is present in 4540 species, mainly in Bacteria (3887 organisms) and in Eukaryota (646 species). This family is observed in Legionellaceae (it has only been identified in the Tatlockia genus, which is closely related to the genus Legionella [108, 109]) and in other, well-studied bacteria, e.g., Escherichia coli O103:H25, Salmonella enterica, Xanthomonas oryzae, Myxococcus xanthus and Klebsiella aerogenes. In bacteria, most hits were identified in bacterial phytopathogens and animal pathogens, including human and opportunistic pathogens, as well as in harmless commensals of mammals and in bacteria that occur in water or soil. We also found it to be common in bacteria widely used to produce antibiotics and other therapeutics (e.g. Streptomycetaceae), in photosynthesising Cyanobacteria and in the human microbiome (e.g. Firmicutes). DUF4291 was also found in abundance in organic compound-degrading bacteria of the order Myxococcales and in plant-symbiotic nitrogen-fixing rhizobia. In Eukaryota, this family is present mainly in Fungi (605 hits), particularly Dikarya. It is widespread in Aspergillaceae. It also occurs in saprophytic fungi of the genus Fusarium. In Metazoa, the DUF4291 family was found in some taxa, among others in segmented annelid worms, urchins, lancelets and molluscs, but is absent from arthropods, tunicates and vertebrates. In addition, DUF4291 is found in Alveolata, Amoebozoa, Rhodymeniophycidae and Viridiplantae. Proteins from the new family are occasionally found in Archaea and Viruses.
Genomic neighbourhood analysis of DUF4291 family proteins (Fig. 8) revealed the common presence of TetR_N and NUDIX domains (in 28% and 18% of analysed neighbourhoods, respectively). TetR_N is a domain found in several bacterial and archaeal transcriptional regulators [110]. The NUDIX hydrolase domains are widespread and usually function as pyrophosphohydrolases. Some NUDIX proteins degrade potentially mutagenic, oxidised nucleotides, while others control the levels of metabolic intermediates and signalling compounds [111]. Slightly less frequent are Macro and ADP_ribosyl_GH domains, co-occurring in 9–10% of DUF4291 neighbourhoods. Macro and ADP-ribosylglycohydrolase (ADP_ribosyl_GH) domains are elements of ADP-ribosylation signalling and complete the “writer-reader-eraser” triad. Macro domains can act as “readers” or “erasers”, whereas ADP-ribosylhydrolases are “erasers” of ADP-ribosylation marks [112,113,114,115,116]. Another known ART domain, RNA 2’-phosphotransferase (PTS_2-RNA), was observed almost as often (10%) in the genomic neighbourhoods of DUF4291. Taken together, these observations strongly support the involvement of DUF4291 in ADP-ribosylation signalling and in transcription and/or RNA biochemistry.
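The co-occurrence percentages quoted above can be understood as the fraction of analysed neighbourhoods in which a given domain appears at least once. The following is a minimal Python sketch of that calculation, not the pipeline used in this study: the function name `cooccurrence_fractions` and the toy input are hypothetical, and each neighbourhood is assumed to be available as a list of Pfam domain names.

```python
from collections import Counter


def cooccurrence_fractions(neighbourhoods):
    """Return {domain: fraction of neighbourhoods that contain it}.

    `neighbourhoods` is a list of lists of domain names, one inner list
    per genomic neighbourhood of a query protein (assumed input format).
    """
    if not neighbourhoods:
        return {}
    counts = Counter()
    for domains in neighbourhoods:
        # Count each domain at most once per neighbourhood.
        counts.update(set(domains))
    total = len(neighbourhoods)
    return {domain: n / total for domain, n in counts.items()}


# Hypothetical toy data, for illustration only (not results from the study).
toy_neighbourhoods = [
    ["TetR_N", "NUDIX"],
    ["TetR_N"],
    ["Macro", "TetR_N", "TetR_N"],
    ["ADP_ribosyl_GH"],
]
fractions = cooccurrence_fractions(toy_neighbourhoods)
for domain, fraction in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{domain}: {fraction:.0%}")
```

Counting a domain once per neighbourhood, rather than once per gene, keeps tandem duplications from inflating the reported frequency.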
It is estimated that more than 40% of proteins having a DUF4291 domain may be effectors (see Fig. 7 and Suppl. Table S7). However, because the prediction algorithm uses the presence of eukaryotic-like domains (ELDs) as one of its criteria, there is a risk of overprediction.
Lsan_2474, a new ART-like domain, similar to AbiGi
This novel family, named after the Legionella santicrucis Lsan_2474 protein, is a distant homologue of the RSE clade families AbiGi (8% sequence identity) and DUF2971. FFAS sequence alignments identified three conserved active site motifs typical of the RSE clade (R-SxS-ExE); however, motif II is often mutated to CxS. Among cellular organisms, the presence of the novel family was confirmed only in Bacteria (133 organisms) from diverse taxa: Gamma-, Beta- and Epsilonproteobacteria, Terrabacteria group, Acidobacteria, and Bacteroidetes, e.g., Vibrio cholerae, Pseudomonas syringae and Yersinia enterocolitica. It was also found in a virus (Siphoviridae). There are bacterial effectors in the Lsan_2474 family, and we estimate that about 78% of proteins having a Lsan_2474 domain may be effectors (Fig. 7 and Suppl. Table S7).
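As an illustration of the motif II variation described above, the short Python sketch below labels a three-residue motif II segment as the canonical SxS form, the CxS variant, or neither. It is a hedged example only: the function name `classify_motif_ii` and the example tripeptides are hypothetical, and the sketch assumes that the motif II residues have already been extracted from an alignment. It does not locate the motif in a full-length sequence.

```python
import re

# Patterns follow the R-SxS-ExE notation used in the text, where x is any residue.
CANONICAL_SXS = re.compile(r"S[A-Z]S")
CXS_VARIANT = re.compile(r"C[A-Z]S")


def classify_motif_ii(tripeptide):
    """Classify an already-extracted motif II tripeptide (assumed input)."""
    residues = tripeptide.upper()
    if len(residues) != 3:
        raise ValueError("motif II is expected as exactly three residues")
    if CANONICAL_SXS.fullmatch(residues):
        return "canonical SxS"
    if CXS_VARIANT.fullmatch(residues):
        return "CxS variant"
    return "other"


# Hypothetical example tripeptides, for illustration only.
for motif in ("STS", "CAS", "SWH"):
    print(motif, "->", classify_motif_ii(motif))
```

Applied column-wise to a family alignment, a tally of these labels would give a rough estimate of how often motif II departs from the canonical form.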
Lani_1641, a new ART-like domain, similar to Ntox31 and enterotoxin_a
Lani_1641 is a small, novel ART-like family named after L. anisa protein Lani_1641. Although the active site presents the RSE motif, sequence comparisons (Figs. 4 and 5) do not place this family close to the RSE clade families. The matching of the protein structure model to known ART structures shows strong similarity to Pertussis toxin, Scabin and PARPs with 11–18% sequence identity and significant DALI Z-scores between 5.5 and 11.1. The CLANS analysis also places this family in the RSE clade, and the protein model suggests the presence of a catalytic core composed of seven beta sheets.
Members of this family were detected only in Legionellaceae. It is estimated that about 65% of proteins having a Lani_1641 domain may be effectors (Fig. 7 and Suppl. Table S7). The ART-like Lani_1641 domains are not accompanied by other functional and structural domains in proteins. Also, no multigene operons involving the Lani_1641 domain were identified (Fig. 8).
Lmac_3114, a new ART-like domain, similar to DarT
The novel Lmac_3114 family, named after the Legionella maceachernii protein Lmac_3114, is a distant homologue of the DarT domain from the HYE clade of ARTs. In Lmac_3114, the HYE clade catalytic motif II [Y-x-x] contains a conserved tryptophan residue. DarT family members act as the toxins in toxin-antitoxin (TA) systems by specifically modifying thymidine nucleotides on ssDNA [17].
Its closest relatives identified by FFAS, HHpred and Phyre2 also include the ART family RNA 2’-phosphotransferase. Sequence alignments identified three conserved active site motifs (Hxx, FFW, and xxE). Structural comparisons do not make it possible to determine precisely to which clade this new family should be assigned. The greatest similarity is found to the DarT family from the HYE clade and to PTS_2-RNA from the HHh clade; however, the presence of a conserved glutamic acid residue in the third motif leads us to assign it to the HYE clade. For both clades, the presence of six beta sheets in the catalytic core is characteristic, which is reflected in the Lmac_3114 model.
This family is detected only in bacteria from Gamma-, Alpha-, Beta- and Deltaproteobacteria, Nitrospirae, Terrabacteria group, Acidobacteria, PVC group and Bacteroidetes, e.g., Pseudomonas aeruginosa, P. syringae, Xanthomonas oryzae and Burkholderia pseudomallei.
Genomic neighbourhood analysis of Lmac_3114 family proteins provided evidence that the adenylosuccinate synthetase domain is extremely common in its vicinity (co-occurrence in 60% of cases) and, in at least some cases, forms an operon with the ART-like domain (Fig. 7). The presence of genes encoding an ADP-ribosyltransferase and adenylosuccinate synthetase in the same operon suggests their functional association, e.g. co-operation in specific metabolic or regulatory processes. We have identified such an operon in nitrogen-fixing soil bacteria (Rhizobium azooxidifex and Mesorhizobium ciceri). A more elaborate version of the operon is found in Pseudomonas aeruginosa, where the ART-like domain is found together with the AAA domain, adenylosuccinate synthetase and a NUDIX domain, a hydrolase that cleaves nucleoside diphosphates linked to another moiety. The presence of adenylosuccinate synthetase and ART domains in a common operon suggests regulation of ART activity by changes in purine synthesis, e.g. indirect modulation of ADP-ribosyltransferase activity in either cellular regulation or stress response through the regulation of NAD+ availability.
Analysis of potential NAD binding sites
To evaluate the compatibility of the novel ART families with NAD binding, we used HADDOCK to dock NAD to reference protein structure models for the families discussed in detail in this article. For comparison, we performed re-docking for a crystallographic complex consisting of NAD bound to the eukaryotic mono-ADP-ribosyltransferase ART2.2 (PDB code 1OG3) [117]. Among the six novel ART families, the best NAD docking scores ranged from −60.2 to −31.5, while the score for re-docking to the NAD-bound crystal structure was −49.6. The best scores (under −60.0) were obtained for the Lsan_2474 and DUF4291 representatives. In all cases, manual identification of the amino acids forming the putative NAD pocket, based on sequence conservation and structural alignments to known ARTs, resulted in more favourable docking than automatic detection of ligand-binding residues. In almost all cases (except Lmac_3114), the docking procedure placed the NAD ligand in a correct position, i.e. one resembling the NAD poses observed in known ART crystal structures (Supplementary Material 5: Fig. S2). Intriguingly, for Lmac_3114, the NAD molecule is located in a non-standard location; however, the predicted active site residues interact with the bound NAD molecule, which may suggest an atypical catalytic mechanism, different from that of most ART families. Mapping sequence conservation onto the protein surfaces shows that highly conserved residues cluster within and near the predicted NAD binding pockets. Close-up views of the NAD sites, focusing on conserved residue side chains, show that the poses of the NAD molecule are very similar for all models, with the adenine ring stacking out (again except for Lmac_3114).
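The comparison against the re-docking reference can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the analysis code of this study: lower (more negative) docking scores are treated as more favourable, as in the text above; the reference value of −49.6 is the re-docking score reported here; and the family names and the middle score in the example dictionary are placeholders, with only the two extreme values (−60.2 and −31.5) taken from the reported range.

```python
REFERENCE_SCORE = -49.6  # re-docking of NAD to ART2.2 (PDB 1OG3), as reported in the text


def compare_to_reference(scores, reference=REFERENCE_SCORE):
    """Rank docking scores and flag those at least as favourable as the reference.

    `scores` maps a model name to its best docking score; more negative
    values are treated as more favourable. Returns a list of
    (name, score, at_least_as_favourable) tuples, best score first.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(name, score, score <= reference) for name, score in ranked]


# Placeholder names and a placeholder middle value, for illustration only.
example_scores = {"family_A": -60.2, "family_B": -45.0, "family_C": -31.5}
ranked = compare_to_reference(example_scores)
for name, score, favourable in ranked:
    label = "at least as favourable as" if favourable else "less favourable than"
    print(f"{name}: {score:.1f} ({label} the reference)")
```

A score less favourable than the reference does not by itself rule out NAD binding, since docking to predicted models is expected to be noisier than re-docking into a crystal structure.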
The active site signatures indicated by sequence logos (Fig. 2) were confirmed by structural models and NAD docking results as likely catalytically relevant residues involved in binding the NAD molecule (Supplementary material 4: Fig. S1, Supplementary material 5: Fig. S2 and Suppl. Table S8). We obtained similar results using the AlphaFold3 server [88] to model representative proteins from each family with NAD+ (data not shown). These observations support our hypothesis that the newly discovered ART-like families can bind NAD and be functional ADP-ribosyltransferases.
To gain insight into the potential functions of the novel ART-like families, we predicted the content of secreted proteins and effectors in each family by analysing the presence of signal peptides, the amino acid composition of the C-terminus of the sequence, the binding site of the secretion chaperone protein and the presence of eukaryotic-like domains (see Methods and Suppl. Table S7). All of the families discussed in detail in this article are predicted to include secreted effector proteins. Particularly prominent is the Lsan_0116 family, in which we predict that more than 80% of the proteins will be effectors (Fig. 7).
Discussion
In our bioinformatic examination of 41 Legionella species, we have documented 63 groups of Legionella orthologues with convincing sequence or structure similarity to ADP-ribosyltransferases, which can be organised into 39 ART-like families. Among these families, 26 are novel ART superfamily members (see Table 1). One of the novel families (DUF2971) is represented in L. pneumophila. The newly identified ART families, initially found by sequence searches, have been further corroborated by artificial intelligence-driven structural predictions.
The largest newly identified ART-like family, DUF4291, including approximately 5 000 members, shows structural and sequence similarity to the diphtheria toxin family of mono-ART toxins, e.g. P. aeruginosa exotoxin A, C. diphtheriae diphtheria toxin and cholix toxin from V. cholerae [119]. These ARTs are important virulence factors belonging to the class of exotoxins secreted by pathogenic bacteria that cause pneumonia, diphtheria and cholera, respectively. All of them ADP-ribosylate elongation factor 2 in host cells, so it is plausible to hypothesise that the DUF4291 family modifies proteins rather than nucleic acids. Despite the absence of the last glutamate of the key ExE motif in the DUF4291 family, we predict that it may be an active ART, since there are known examples of active mono-ARTs with substitutions in this motif, for example TRPT1 and several members of the PARP family, including PARP8, PARP11, and PARP16.
The ART complement of the Legionella genus showcases diverse effector ARTs, possibly corresponding to adaptability to various hosts. For example, the extensively studied Legionella pneumophila possesses genes known to facilitate infection in various hosts, which exhibit both similarities and distant relationships with eukaryotic ARTs and were possibly acquired through horizontal gene transfer.
While some of the identified ART-like proteins show similarity to well-characterised antibacterial toxins or effectors targeting eukaryotic host cells, we did not attempt to assign a strict functional label (toxin vs. effector) based solely on prediction tools. This is due to the known limitations of current bioinformatic methods in reliably distinguishing these categories, particularly in the absence of experimental data.
Instead, we interpret these proteins as part of a broader class of virulence factors, recognising that the distinction between toxins and effectors can be context-dependent, and some ARTs may exhibit features of both. Where available, homology to known proteins and predicted secretion signals were used to support functional hypotheses, which are described in the family annotations.
The functional boundary between toxins and effectors is not always well defined. Indeed, there is a recent example of a secreted Yersinia protein that can function as a toxin killing other bacteria while also acting as effector and targeting eukaryotic host cells [120].
Although the predicted effector functions remain to be validated, these novel ART-like families present attractive candidates for experimental studies into mechanisms of pathogenicity due to their likely crucial roles in the pathogen’s survival. While this manuscript was being finalised, a thorough study presented a detailed survey of structural domains in all the effectors of L. pneumophila [121]. While our study addresses only ART-like proteins, it is complementary to this work of Patel et al. because we surveyed 41 species instead of one and did not limit the exploration to effectors.
In summary, our survey of the Legionella ADP-ribosyltransferase world provides insights into the pathogen’s toolkit for infection and offers a broader perspective on nature’s remarkable adaptability and ingenuity in the use of the evolutionarily successful ART-like superfamily. The results from this study are available at http://bioinfo.sggw.edu.pl/astarte/.
Conclusions
We have identified and bioinformatically characterised 26 novel ART-like families in Legionella. This is a substantial increase over the 30 ART families currently classified in the Pfam database ADP-ribosyl clan. Some of the novel families, e.g., DUF2971 and DUF4291, are widespread in the bacterial kingdom, suggesting potentially important biological roles. The discovery and characterisation of new ART-like families may contribute to the elucidation of the strategies used by Legionella and other bacteria to modulate the functions of host cells. Our results should form the basis for future experimental studies aimed at determining the functions of the novel ARTs and may facilitate the development of new methods of treating bacterial infections.
Data availability
All data generated or analysed during this study are included in this published article and its supplementary information files. Sets of aligned representative sequences of Legionella ART-like families are available from the online database at http://bioinfo.sggw.edu.pl/astarte.
References
Corda D, Di Girolamo M. Functional aspects of protein mono-ADP-ribosylation. EMBO J. 2003;22:1953–8. https://doi.org/10.1093/emboj/cdg209.
Aravind L, Zhang D, de Souza RF, Anand S, Iyer LM. The natural history of ADP-Ribosyltransferases and the ADP-Ribosylation system. Curr Top Microbiol Immunol. 2015;384:3–32. https://doi.org/10.1007/82_2014_414.
Cohen MS, Chang P. Insights into the biogenesis, function, and regulation of ADP-ribosylation. Nat Chem Biol. 2018;14:236–43. https://doi.org/10.1038/nchembio.2568.
Ueda K, Hayaishi O. ADP-Ribosylation. Annu Rev Biochem. 1985;54:73–100.
Sugimura T, Miwa M. Poly(ADP-ribose). Historical perspective. Mol Cell Biochem. 1994;138:5–12.
Munnur D, Ahel I. Reversible mono-ADP-ribosylation of DNA breaks. FEBS J. 2017;284:4002–16. https://doi.org/10.1111/febs.14297.
Aravind L, de Souza RF. Identification of novel components of NAD-utilizing metabolic pathways and prediction of their biochemical functions. Mol Biosyst. 2012;8:1661–77. https://doi.org/10.1039/c2mb05487f.
Vyas S, Chesarone-Cataldo M, Todorova T, Huang Y-H, Chang P. A systematic analysis of the PARP protein family identifies new functions critical for cell physiology. Nat Commun. 2013;4:2240. https://doi.org/10.1038/ncomms3240.
Bock FJ, Chang P. New directions in poly(ADP-ribose) polymerase biology. FEBS J. 2016;283:4017–31. https://doi.org/10.1111/febs.13737.
Koch-Nolte F, editor. Endogenous ADP-Ribosylation. Springer International Publishing; 2015. https://doi.org/10.1007/978-3-319-10771-4.
Morales JC, Li L, Fattah FJ, Dong Y, Bey EA, Patel M, et al. Review of Poly (ADP-ribose) polymerase (PARP) mechanisms of action and rationale for targeting in cancer and other diseases. Crit Rev Eukaryot Gene Expr. 2014;24:15–28.
Maresso AW, Deng Q, Pereckas MS, Wakim BT, Barbieri JT. Pseudomonas aeruginosa ExoS ADP-ribosyltransferase inhibits ERM phosphorylation. Cell Microbiol. 2007;9:97–105. https://doi.org/10.1111/j.1462-5822.2006.00770.x.
Hottiger MO, Hassa PO, Lüscher B, Schüler H, Koch-Nolte F. Toward a unified nomenclature for mammalian ADP-ribosyltransferases. Trends Biochem Sci. 2010;35:208–19. https://doi.org/10.1016/j.tibs.2009.12.003.
Komander D, Randow F. Strange new world: bacteria catalyze ubiquitylation via ADP ribosylation. Cell Host Microbe. 2017;21:127–9. https://doi.org/10.1016/j.chom.2017.01.014.
Klockgether J, Tümmler B. Recent advances in Understanding Pseudomonas aeruginosa as a pathogen. F1000Res. 2017;6:1261. https://doi.org/10.12688/f1000research.10506.1.
Burstein D, Amaro F, Zusman T, Lifshitz Z, Cohen O, Gilbert JA, et al. Uncovering the Legionella genus effector repertoire - strength in diversity and numbers. Nat Genet. 2016;48:167–75. https://doi.org/10.1038/ng.3481.
Jankevicius G, Ariza A, Ahel M, Ahel I. The Toxin-Antitoxin system DarTG catalyzes reversible ADP-Ribosylation of DNA. Mol Cell. 2016;64:1109–16. https://doi.org/10.1016/j.molcel.2016.11.014.
Taylor M, Ross K, Bentham R. Legionella, protozoa, and biofilms: interactions within complex microbial systems. Microb Ecol. 2009;58:538–47. https://doi.org/10.1007/s00248-009-9514-z.
Fraser DW, Tsai TR, Orenstein W, Parkin WE, Beecham HJ, Sharrar RG, et al. Legionnaires’ disease: description of an epidemic of pneumonia. N Engl J Med. 1977;297:1189–97. https://doi.org/10.1056/NEJM197712012972201.
McDade JE, Shepard CC, Fraser DW, Tsai TR, Redus MA, Dowdle WR. Legionnaires’ disease: isolation of a bacterium and demonstration of its role in other respiratory disease. N Engl J Med. 1977;297:1197–203. https://doi.org/10.1056/NEJM197712012972202.
Brenner DJ, Steigerwalt AG, McDade JE. Classification of the legionnaires’ disease bacterium: Legionella pneumophila, genus novum, species Nova, of the family legionellaceae, Familia Nova. Ann Intern Med. 1979;90:656–8. https://doi.org/10.7326/0003-4819-90-4-656.
Newton HJ, Ang DKY, van Driel IR, Hartland EL. Molecular pathogenesis of infections caused by Legionella Pneumophila. Clin Microbiol Rev. 2010;23:274–98. https://doi.org/10.1128/CMR.00052-09.
Mondino S, Schmidt S, Rolando M, Escoll P, Gomez-Valero L, Buchrieser C. Legionnaires’ disease: state of the Art knowledge of pathogenesis mechanisms of Legionella. Annu Rev Pathol Mech Dis. 2020;15:439–66. https://doi.org/10.1146/annurev-pathmechdis-012419-032742.
Kanatani J, Watahiki M, Kimata K, Kato T, Uchida K, Kura F, et al. Detection of Legionella species, the influence of precipitation on the amount of Legionella DNA, and bacterial Microbiome in aerosols from outdoor sites near asphalt roads in Toyama prefecture, Japan. BMC Microbiol. 2021;21:215. https://doi.org/10.1186/s12866-021-02275-2.
Mercante JW, Winchell JM. Current and emerging Legionella diagnostics for laboratory and outbreak investigations. Clin Microbiol Rev. 2015. https://doi.org/10.1128/CMR.00029-14.
Chambers ST, Slow S, Scott-Thomas A, Murdoch DR. Legionellosis caused by Non-Legionella Pneumophila species, with a focus on Legionella longbeachae. Microorganisms. 2021;9:291. https://doi.org/10.3390/microorganisms9020291.
Steiner B, Weber S, Hilbi H. Formation of the Legionella-containing vacuole: phosphoinositide conversion, GTPase modulation and ER dynamics. Int J Med Microbiol. 2018;308:49–57. https://doi.org/10.1016/j.ijmm.2017.08.004.
Horwitz MA. The legionnaires’ disease bacterium (Legionella pneumophila) inhibits phagosome-lysosome fusion in human monocytes. J Exp Med. 1983;158:2108–26. https://doi.org/10.1084/jem.158.6.2108.
Horwitz MA. Formation of a novel phagosome by the legionnaires’ disease bacterium (Legionella pneumophila) in human monocytes. J Exp Med. 1983;158:1319–31. https://doi.org/10.1084/jem.158.4.1319.
Chauhan D, Shames SR. Pathogenicity and virulence of Legionella: intracellular replication and host response. Virulence. 2021;12:1122–44. https://doi.org/10.1080/21505594.2021.1903199.
Bruckert WM, Abu Kwaik Y. Complete and ubiquitinated proteome of the Legionella-Containing vacuole within human macrophages. J Proteome Res. 2015;14:236–48. https://doi.org/10.1021/pr500765x.
Shames SR. Eat or be eaten: strategies used by Legionella to acquire host-derived nutrients and evade lysosomal degradation. Infect Immun. 2023;91:e00441-22. https://doi.org/10.1128/iai.00441-22.
Price CTD, Richards AM, Abu Kwaik Y. Nutrient generation and retrieval from the host cell cytosol by intra-vacuolar Legionella Pneumophila. Front Cell Infect Microbiol. 2014;4:111. https://doi.org/10.3389/fcimb.2014.00111.
Fonseca MV, Swanson MS. Nutrient salvaging and metabolism by the intracellular pathogen Legionella Pneumophila. Front Cell Infect Microbiol. 2014;4:12. https://doi.org/10.3389/fcimb.2014.00012.
Schunder E, Gillmaier N, Kutzner E, Eisenreich W, Herrmann V, Lautner M, et al. Amino acid uptake and metabolism of Legionella Pneumophila hosted by acanthamoeba castellanii. J Biol Chem. 2014;289:21040–54. https://doi.org/10.1074/jbc.M114.570085.
Ge J, Shao F. Manipulation of host vesicular trafficking and innate immune defence by Legionella dot/icm effectors. Cell Microbiol. 2011;13:1870–80. https://doi.org/10.1111/j.1462-5822.2011.01710.x.
Wexler M, Zusman T, Linsky M, Lifshitz Z, Segal G. The Legionella genus core effectors display functional conservation among orthologs by themselves or combined with an accessory protein. Curr Res Microb Sci. 2022;3:100105. https://doi.org/10.1016/j.crmicr.2022.100105.
Black MH, Osinski A, Park GJ, Gradowski M, Servage KA, Pawłowski K, et al. A Legionella effector ADP-ribosyltransferase inactivates glutamate dehydrogenase. J Biol Chem. 2021;296:100301. https://doi.org/10.1016/j.jbc.2021.100301.
Fu J, Zhou M, Gritsenko MA, Nakayasu ES, Song L, Luo Z-Q. Legionella Pneumophila modulates host energy metabolism by ADP-ribosylation of ADP/ATP translocases. eLife. 2022;11:e73611. https://doi.org/10.7554/eLife.73611.
Kubori T, Lee J, Kim H, Yamazaki K, Nishikawa M, Kitao T, et al. Reversible modification of mitochondrial ADP/ATP translocases by paired Legionella effector proteins. Proc Natl Acad Sci. 2022;119:e2122872119. https://doi.org/10.1073/pnas.2122872119.
Fu J, Li P, Guan H, Huang D, Song L, Ouyang S, et al. Legionella Pneumophila temporally regulates the activity of ADP/ATP translocases by reversible ADP-ribosylation. mLife. 2022;1:51–65. https://doi.org/10.1002/mlf2.12014.
Akturk A, Wasilko DJ, Wu X, Liu Y, Zhang Y, Qiu J, et al. Mechanism of phosphoribosyl-ubiquitination mediated by a single Legionella effector. Nature. 2018;557:729–33. https://doi.org/10.1038/s41586-018-0147-6.
Kotewicz KM, Ramabhadran V, Sjoblom N, Vogel JP, Haenssler E, Zhang M, et al. A single Legionella effector catalyzes a multistep ubiquitination pathway to rearrange tubular Endoplasmic reticulum for replication. Cell Host Microbe. 2017;21:169–81. https://doi.org/10.1016/j.chom.2016.12.007.
Kalayil S, Bhogaraju S, Bonn F, Shin D, Liu Y, Gan N, et al. Insights into catalysis and function of phosphoribosyl-linked Serine ubiquitination. Nature. 2018;557:734–8. https://doi.org/10.1038/s41586-018-0145-8.
Qiu J, Sheedlo MJ, Yu K, Tan Y, Nakayasu ES, Das C, et al. Ubiquitination independent of E1 and E2 enzymes by bacterial effectors. Nature. 2016;533:120–4. https://doi.org/10.1038/nature17657.
Dong Y, Mu Y, Xie Y, Zhang Y, Han Y, Zhou Y, et al. Structural basis of ubiquitin modification by the Legionella effector SdeA. Nature. 2018;557:674–8. https://doi.org/10.1038/s41586-018-0146-7.
Ye JS, Majumdar A, Park BC, Black MH, Hsieh T-S, Osinski A, et al. Bacterial ubiquitin ligase engineered for small molecule and protein target identification. bioRxiv. 2025:2025.03.20.644192. https://doi.org/10.1101/2025.03.20.644192.
Zhang Z, Fu J, Rack JGM, Li C, Voorneveld J, Filippov DV, et al. Legionella metaeffector MavL reverses ubiquitin ADP-ribosylation via a conserved arginine-specific macrodomain. Nat Commun. 2024;15:2452. https://doi.org/10.1038/s41467-024-46649-2.
Wyżewski Z, Gradowski M, Krysińska M, Dudkiewicz M, Pawłowski K. A novel predicted ADP-ribosyltransferase-like family conserved in eukaryotic evolution. PeerJ. 2021;9:e11051. https://doi.org/10.7717/peerj.11051.
Li W, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics. 2001;17:282–3. https://doi.org/10.1093/bioinformatics/17.3.282.
Li W, Jaroszewski L, Godzik A. Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics. 2002;18:77–82. https://doi.org/10.1093/bioinformatics/18.1.77.
Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–9. https://doi.org/10.1093/bioinformatics/btl158.
Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT suite: a web server for clustering and comparing biological sequences. Bioinformatics. 2010;26:680–2. https://doi.org/10.1093/bioinformatics/btq003.
Xu D, Jaroszewski L, Li Z, Godzik A. FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking. Bioinformatics. 2014;30:660–7. https://doi.org/10.1093/bioinformatics/btt578.
Gabler F, Nam S-Z, Till S, Mirdita M, Steinegger M, Söding J, et al. Protein sequence analysis using the MPI bioinformatics toolkit. Curr Protocols Bioinf. 2020;72:e108. https://doi.org/10.1002/cpbi.108.
Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJE. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc. 2015;10:845–58. https://doi.org/10.1038/nprot.2015.053.
Eddy SR. Accelerated profile HMM searches. PLoS Comput Biol. 2011;7:e1002195. https://doi.org/10.1371/journal.pcbi.1002195.
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.
Boratyn GM, Schäffer AA, Agarwala R, Altschul SF, Lipman DJ, Madden TL. Domain enhanced lookup time accelerated BLAST. Biol Direct. 2012;7:12. https://doi.org/10.1186/1745-6150-7-12.
Sayers EW, Bolton EE, Brister JR, Canese K, Chan J, Comeau DC, et al. Database resources of the National center for biotechnology information. Nucleic Acids Res. 2021;50:D20–6. https://doi.org/10.1093/nar/gkab1112.
Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast fourier transform. Nucleic Acids Res. 2002;30:3059–66. https://doi.org/10.1093/nar/gkf436.
Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20:1160–6. https://doi.org/10.1093/bib/bbx108.
Crooks GE, Hon G, Chandonia J-M, Brenner SE. WebLogo: A sequence logo generator. Genome Res. 2004;14:1188–90. https://doi.org/10.1101/gr.849004.
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with alphafold. Nature. 2021;596:583–9. https://doi.org/10.1038/s41586-021-03819-2.
Baek M, DiMaio F, Anishchenko I, Dauparas J, Ovchinnikov S, Lee GR, et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science. 2021;373:871–6. https://doi.org/10.1126/science.abj8754.
Li Z, Jaroszewski L, Iyer M, Sedova M, Godzik A. FATCAT 2.0: towards a better Understanding of the structural diversity of proteins. Nucleic Acids Res. 2020;48:W60–4. https://doi.org/10.1093/nar/gkaa443.
Holm L. DALI and the persistence of protein shape. Protein Sci. 2020;29:128–40. https://doi.org/10.1002/pro.3749.
Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005;33:2302–9. https://doi.org/10.1093/nar/gki524.
Frickey T, Lupas A. CLANS: a Java application for visualizing protein families based on pairwise similarity. Bioinformatics. 2004;20:3702–4. https://doi.org/10.1093/bioinformatics/bth444.
Mistry J, Chuguransky S, Williams L, Qureshi M, Salazar GA, Sonnhammer ELL, et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 2021;49:D412–9. https://doi.org/10.1093/nar/gkaa913.
Sievers F, Higgins DG. Clustal Omega for making accurate alignments of many protein sequences. Protein Sci. 2018;27:135–45. https://doi.org/10.1002/pro.3290.
Nucleotide. [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov/nucleotide/. Accessed 8 Mar 2022.
Glöckner FO, Yilmaz P, Quast C, Gerken J, Beccati A, Ciuprina A, et al. 25 years of serving the community with ribosomal RNA gene reference databases and tools. J Biotechnol. 2017;261:169–76. https://doi.org/10.1016/j.jbiotec.2017.06.1198.
Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008;36 Web Server issue:W465–9. https://doi.org/10.1093/nar/gkn180.
Lemoine F, Correia D, Lefort V, Doppelt-Azeroual O, Mareuil F, Cohen-Boulakia S, et al. NGPhylogeny.fr: new generation phylogenetic services for non-specialists. Nucleic Acids Res. 2019;47:W260–5. https://doi.org/10.1093/nar/gkz303.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7. https://doi.org/10.1093/nar/gkh340.
Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate Maximum-Likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21. https://doi.org/10.1093/sysbio/syq010.
Letunic I, Bork P. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49:W293–6. https://doi.org/10.1093/nar/gkab301.
Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nat Methods. 2022;19:679–82. https://doi.org/10.1038/s41592-022-01488-1.
Mirdita M, Steinegger M, Söding J. MMseqs2 desktop and local web server app for fast, interactive sequence searches. Bioinformatics. 2019;35:2856–8. https://doi.org/10.1093/bioinformatics/bty1057.
Steinegger M, Söding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol. 2017;35:1026–8. https://doi.org/10.1038/nbt.3988.
Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. 2023;379:1123–30. https://doi.org/10.1126/science.ade2574.
van Zundert GCP, Rodrigues JPGLM, Trellet M, Schmitz C, Kastritis PL, Karaca E, et al. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428:720–5. https://doi.org/10.1016/j.jmb.2015.09.014.
Honorato RV, Koukos PI, Jiménez-García B, Tsaregorodtsev A, Verlato M, Giachetti A, et al. Structural biology in the clouds: the WeNMR-EOSC ecosystem. Front Mol Biosci. 2021;8:729513. https://doi.org/10.3389/fmolb.2021.729513.
de Vries SJ, Bonvin AMJJ. CPORT: a consensus interface predictor and its performance in prediction-driven docking with HADDOCK. PLoS ONE. 2011;6:e17695. https://doi.org/10.1371/journal.pone.0017695.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, et al. UCSF Chimera–a visualization system for exploratory research and analysis. J Comput Chem. 2004;25:1605–12. https://doi.org/10.1002/jcc.20084.
Ashkenazy H, Erez E, Martz E, Pupko T, Ben-Tal N. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res. 2010;38(Web Server issue):W529–533. https://doi.org/10.1093/nar/gkq399.
Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature. 2024;1–3. https://doi.org/10.1038/s41586-024-07487-w.
Salvatore M, Shu N, Elofsson A. The SubCons webserver: a user friendly web interface for state-of-the-art subcellular localization prediction. Protein Sci. 2018;27:195–201. https://doi.org/10.1002/pro.3297.
Baranowski B, Pawłowski K. Protein family neighborhood analyzer-ProFaNA. PeerJ. 2023;11:e15715. https://doi.org/10.7717/peerj.15715.
Karp PD, Billington R, Caspi R, Fulcher CA, Latendresse M, Kothari A, et al. The BioCyc collection of microbial genomes and metabolic pathways. Brief Bioinform. 2017;20:1085–93. https://doi.org/10.1093/bib/bbx085.
Marchler-Bauer A, Lu S, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, et al. CDD: a conserved domain database for the functional annotation of proteins. Nucleic Acids Res. 2011;39(Database issue):D225–9. https://doi.org/10.1093/nar/gkq1189.
Community NPB. SignalP 6.0 predicts all five types of signal peptides using protein language models. Nature Portfolio Bioengineering Community. 2021. http://bioengineeringcommunity.nature.com/posts/signalp-6-0-predicts-all-five-types-of-signal-peptides-using-protein-language-models. Accessed 8 Mar 2022.
Eichinger V, Nussbaumer T, Platzer A, Jehl M-A, Arnold R, Rattei T. EffectiveDB—updates and novel features for a better annotation of bacterial secreted proteins and type III, IV, VI secretion systems. Nucleic Acids Res. 2016;44(Database issue):D669–74. https://doi.org/10.1093/nar/gkv1269.
Arnold R, Brandmaier S, Kleine F, Tischler P, Heinz E, Behrens S, et al. Sequence-based prediction of type III secreted proteins. PLoS Pathog. 2009;5:e1000376. https://doi.org/10.1371/journal.ppat.1000376.
Wang J, Li J, Hou Y, Dai W, Xie R, Marquez-Lago TT, et al. BastionHub: a universal platform for integrating and analyzing substrates secreted by Gram-negative bacteria. Nucleic Acids Res. 2021;49:D651–9. https://doi.org/10.1093/nar/gkaa899.
Skjerning RB, Senissar M, Winther KS, Gerdes K, Brodersen DE. The RES domain toxins of RES-Xre toxin-antitoxin modules induce cell stasis by degrading NAD+. Mol Microbiol. 2019;111:221–36. https://doi.org/10.1111/mmi.14150.
Piscotta FJ, Jeffrey PD, Link AJ. ParST is a widespread toxin–antitoxin module that targets nucleotide metabolism. Proc Natl Acad Sci U S A. 2019;116:826–34. https://doi.org/10.1073/pnas.1814633116.
Leutert M, Pedrioli DML, Hottiger MO. Identification of PARP-specific ADP-ribosylation targets reveals a regulatory function for ADP-ribosylation in transcription elongation. Mol Cell. 2016;63:181–3. https://doi.org/10.1016/j.molcel.2016.07.006.
Okazaki IJ, Kim HJ, Moss J. Cloning and characterization of a novel membrane-associated lymphocyte NAD:arginine ADP-ribosyltransferase. J Biol Chem. 1996;271:22052–7. https://doi.org/10.1074/jbc.271.36.22052.
Farelli JD, Gumbart JC, Akey IV, Hempstead A, Amyot W, Head JF, et al. IcmQ in the type 4b secretion system contains an NAD+ binding domain. Structure. 2013;21:1361–73. https://doi.org/10.1016/j.str.2013.05.017.
Matteoli FP, Passarelli-Araujo H, Reis RJA, da Rocha LO, de Souza EM, Aravind L, et al. Genome sequencing and assessment of plant growth-promoting properties of a Serratia marcescens strain isolated from vermicompost. BMC Genomics. 2018;19. https://doi.org/10.1186/s12864-018-5130-y.
Burroughs AM, Aravind L. Identification of uncharacterized components of prokaryotic immune systems and their diverse eukaryotic reformulations. J Bacteriol. 2020;202. https://doi.org/10.1128/JB.00365-20.
O’Neal CJ, Amaya EI, Jobling MG, Holmes RK, Hol WGJ. Crystal structures of an intrinsically active cholera toxin mutant yield insight into the toxin activation mechanism. Biochemistry. 2004;43:3772–82. https://doi.org/10.1021/bi0360152.
Cheng H, Schaeffer RD, Liao Y, Kinch LN, Pei J, Shi S, et al. ECOD: an evolutionary classification of protein domains. PLoS Comput Biol. 2014;10:e1003926. https://doi.org/10.1371/journal.pcbi.1003926.
Schuller M, Butler RE, Ariza A, Tromans-Coia C, Jankevicius G, Claridge TDW, et al. Molecular basis for DarT ADP-ribosylation of a DNA base. Nature. 2021;596:597–602. https://doi.org/10.1038/s41586-021-03825-4.
Bell CE, Eisenberg D. Crystal structure of diphtheria toxin bound to nicotinamide adenine dinucleotide. Biochemistry. 1996;35:1137–49. https://doi.org/10.1021/bi9520848.
Tindall BJ. Taking a closer look at the valid publication and authorship of Legionella bozemanae Brenner et al. 1980, Fluoribacter bozemanae Garrity et al. 1980, Legionella pittsburghensis Pasculle et al. 1980, Legionella micdadei Hébert et al. 1980 and Tatlockia micdadei (Hébert et al. 1980) Garrity et al. 1980. Curr Microbiol. 2020;77:146–53. https://doi.org/10.1007/s00284-019-01793-7.
Saini N, Gupta RS. A robust phylogenetic framework for members of the order Legionellales and its main genera (Legionella, Aquicella, Coxiella and Rickettsiella) based on phylogenomic analyses and identification of molecular markers demarcating different clades. Antonie Van Leeuwenhoek. 2021;114:957–82. https://doi.org/10.1007/s10482-021-01569-9.
Kisker C, Hinrichs W, Tovar K, Hillen W, Saenger W. The complex formed between Tet repressor and tetracycline-Mg2+ reveals mechanism of antibiotic resistance. J Mol Biol. 1995;247:260–80. https://doi.org/10.1006/jmbi.1994.0138.
McLennan AG. The Nudix hydrolase superfamily. Cell Mol Life Sci. 2006;63:123–43. https://doi.org/10.1007/s00018-005-5386-7.
Chen D, Vollmar M, Rossi MN, Phillips C, Kraehenbuehl R, Slade D, et al. Identification of macrodomain proteins as novel O-acetyl-ADP-ribose deacetylases. J Biol Chem. 2011;286:13261–71. https://doi.org/10.1074/jbc.M110.206771.
Barkauskaite E, Brassington A, Tan ES, Warwicker J, Dunstan MS, Banos B, et al. Visualization of poly(ADP-ribose) bound to PARG reveals inherent balance between exo- and endo-glycohydrolase activities. Nat Commun. 2013;4:2164. https://doi.org/10.1038/ncomms3164.
Jankevicius G, Hassler M, Golia B, Rybin V, Zacharias M, Timinszky G, et al. A family of macrodomain proteins reverses cellular mono-ADP-ribosylation. Nat Struct Mol Biol. 2013;20:508–14. https://doi.org/10.1038/nsmb.2523.
O’Sullivan J, Tedim Ferreira M, Gagné J-P, Sharma AK, Hendzel MJ, Masson J-Y, et al. Emerging roles of eraser enzymes in the dynamic control of protein ADP-ribosylation. Nat Commun. 2019;10:1182. https://doi.org/10.1038/s41467-019-08859-x.
Wu H, Lu A, Yuan J, Yu Y, Lv C, Lu J. Mono-ADP-ribosylation, a MARylationmultifaced modification of protein, DNA and RNA: characterizations, functions and mechanisms. Cell Death Discov. 2024;10:1–15. https://doi.org/10.1038/s41420-024-01994-5.
Ritter H, Koch-Nolte F, Marquez VE, Schulz GE. Substrate binding and catalysis of ecto-ADP-ribosyltransferase 2.2 from rat. Biochemistry. 2003;42:10155–62. https://doi.org/10.1021/bi034625w.
Lyons B, Ravulapalli R, Lanoue J, Lugo MR, Dutta D, Carlin S, et al. Scabin, a novel DNA-acting ADP-ribosyltransferase from Streptomyces scabies. J Biol Chem. 2016;291:11198–215. https://doi.org/10.1074/jbc.M115.707653.
Jørgensen R, Purdy AE, Fieldhouse RJ, Kimber MS, Bartlett DH, Merrill AR. Cholix toxin, a novel ADP-ribosylating factor from Vibrio cholerae. J Biol Chem. 2008;283:10671–8. https://doi.org/10.1074/jbc.M710008200.
Wang Y, Li Y, Yu Z, Zhang W, Ren Z, Zhang H, et al. A trans-kingdom T6SS RNase effector targeting both prokaryotic and host cells for pathogenesis. Cell Rep. 2025;44:116074. https://doi.org/10.1016/j.celrep.2025.116074.
Patel DT, Stogios PJ, Jaroszewski L, Urbanus ML, Sedova M, Semper C, et al. Global atlas of predicted functional domains in Legionella pneumophila Dot/Icm translocated effectors. Mol Syst Biol. 2025;21:59–89. https://doi.org/10.1038/s44320-024-00076-z.
Acknowledgements
We thank Dr A. Muszewska, Dr V. Tagliabracci, Dr B. Mayro and M. Roberts for critical reading of the manuscript and helpful comments.
Funding
K.P. was supported by the Polish National Science Centre grant 2019/33/B/NZ2/01409.
M.G. was supported by the Polish National Science Centre grant 2019/35/N/NZ2/02844.
M.K. was supported by the Polish National Science Centre grant 2023/49/N/NZ2/03440.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author information
Contributions
MK: conceptualised the project, performed the experiments, analysed the data, prepared figures and tables, authored or reviewed drafts of the paper, and approved the final draft. MG: performed the experiments, analysed the data, prepared supplemental tables, and approved the final draft. BB: performed the experiments, analysed the data, and approved the final draft. KP: conceptualised the project, co-supervised the project, authored or reviewed drafts of the paper, and approved the final draft. MD: conceptualised the project, co-supervised the project, performed the experiments, analysed the data, prepared supplemental figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
12864_2025_11994_MOESM1_ESM.xlsx
Supplementary Material 1. Tables S1-S11. S1 – Table of CD-search results. S2 – Table of FFAS results. S3 – Table of HMM search results. S4 – Table of HH search results. S5 – Table of Phyre2 results. S6 – Structures analysis. S7 – Effector prediction. S8 – Docking results. S9 – Data for CLANS, HMMs and structure similarity network. S10 – List of all ART-like proteins in Legionella. S11 – List of ART families in astARTe database.
12864_2025_11994_MOESM4_ESM.png
Supplementary Material 4. Fig. S1. NAD docked to representatives of the novel ART families. (A) Crystallographic structures showing the greatest structural similarity (according to the DALI server) to the AlphaFold model for the novel ART domains, colouring by secondary structures; (B) Structure models for novel ART families with docked NAD, superposed on structures of closest PDB matches (A), colouring by secondary structures; (C) As in (B) – surface, colouring by sequence conservation in families: magenta – the highest conservation in family, blue – poor conservation, grey – no data; (D) As in (C), close-up views of the NAD docking sites.
12864_2025_11994_MOESM5_ESM.png
Supplementary Material 5. Fig. S2. Close-up views of the NAD-binding pockets predicted by docking, with proteins coloured by sequence conservation in the protein family. Active site signatures are shown as sticks. (A) DUF2971 (representative sequence KTD32344), (B) DUF4291 (MBA2747162), (C) Lani_1641 (WP_019232367), (D) Lmac_3114 (WP_058453766), (E) Lsan_0116 (KTD69838), (F) Lsan_2474 (WP_058514560). Colouring by sequence conservation in families: magenta – the highest conservation in family MSA, blue – poor conservation, grey – no data.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Krysińska, M., Gradowski, M., Baranowski, B. et al. A survey of ADP-ribosyltransferase families in the pathogenic Legionella. BMC Genomics 26, 915 (2025). https://doi.org/10.1186/s12864-025-11994-z
DOI: https://doi.org/10.1186/s12864-025-11994-z