Hierarchical Graph Learning For Protein–Protein Interaction

https://doi.org/10.1038/s41467-023-36736-1

Ziqi Gao 1,2, Chenran Jiang 3, Jiawen Zhang 1, Xiaosen Jiang 4, Lanqing Li 5, Peilin Zhao 5, Huanming Yang 4, Yong Huang 6 & Jia Li 1,2

Received: 18 October 2022; Accepted: 14 February 2023
Biological functions are accomplished by interactions and chemical reactions among biomolecules. Among them, protein–protein interactions (PPIs) are arguably one of the most important molecular events in the human body and an important source of therapeutic interventions against diseases. A comprehensive dictionary of PPIs can help connect the dots in complicated biological pathways and expedite the development of therapeutics1,2. In biology, hierarchy information has been widely exploited to gain in-depth information about phenotypes of interest, for example, in disease biology3–5, proteomics6–8, and neurobiology9–11. Naturally, PPIs encapsulate a two-view hierarchy: on the top view, proteins interact with each other; on the bottom view, key amino acids or residues assemble to form important local domains. Following this logic, biologists often take hierarchical approaches to understand PPIs12,13. Experimentally, scientists often employ high-throughput mapping14–16 to pre-build the PPI network at scale, and use bioinformatics clustering methods to identify functional modules of the network (top view). On the individual protein level, isolation methods such as co-immunoprecipitation17, pull-down18, and crosslinking19 are used to establish the structures of individual proteins, so that surficial 'hotspots' can be located and analyzed. In short, hierarchical knowledge of structure information is important for understanding the molecular details of PPIs.

More recently, the massive growth in the demand for, and the cost of, experimentally validating PPIs has made it impossible to characterize most unknown PPIs in wet laboratories. To map out the human interactome efficiently and inexpensively, computational methods are increasingly being used to predict PPIs automatically.
1 Data Science and Analytics, The Hong Kong University of Science and Technology, Guangzhou 511400, China. 2 Division of Emerging Interdisciplinary Areas, The Hong Kong University of Science and Technology, Hong Kong SAR, China. 3 Pingshan Translational Medicine Center, Shenzhen Bay Laboratory, Shenzhen 518118, China. 4 The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Chinese Academy of Sciences, Hangzhou 310022, China. 5 AI Lab, Tencent, Shenzhen 518000, China. 6 Department of Chemistry, The Hong Kong University of Science and Technology, Hong Kong SAR, China. e-mail: [email protected]; [email protected]
Over the past decade, Deep Learning (DL), one of the most revolutionary tools in computation, has been applied to study PPIs. Development in this field has mostly focused on two aspects: learning appropriate protein representations20,21 and inferring potential PPIs by link prediction22,23. The former focuses on extracting structural information from protein sequences. In particular, Convolutional Neural Networks (CNNs)24,25 and Recurrent Neural Networks (RNNs)26–28 have demonstrated high generalization and fast inference speed in capturing key sequence fragments for PPIs29. 3D CNNs21,30,31 have been shown to be better at extracting 3D structural features of proteins and thus capturing the spatial-biological arrangements of residues32 that are important to PPI prediction. However, 3D CNNs suffer from high computational burdens and limited resolution that is prone to quantization errors29. The latter aspect of DL in PPI prediction focuses on PPI network structures and involves developing link prediction methods to identify missing interactions within the known network topology. Link prediction methods based on common neighbors (CN)33 assign high probabilities of PPI to protein pairs that are known to share common PPI partners. CN can be generalized to consider neighbors at a greater path length (L3)22, which captures the structural and evolutionary forces that govern biological networks such as the interactome. Additionally, distance-based methods measure the possible distances between protein pairs, such as Euclidean commute time (ECT)34 and random walk with restart (RWR)35. Most traditional link prediction methods focus on known interactions but tend to overlook important network properties such as node degrees and community partitions.

More importantly, these methods perceive only one of the two views, outside-of-protein or inside-of-protein. Few can model the natural PPI hierarchy by connecting both views. To address this issue, we present a hierarchical graph that applies two Graph Neural Networks (GNNs)36,37 to represent protein and network structures, respectively. In this way, the limitations of 3D CNNs and link prediction methods mentioned above can be circumvented. First, GNNs can learn protein 3D structures on more efficient graph representations, even when facing high-resolution requirements for structure processing. Second, due to the propagation mechanism, GNNs are capable of recovering network properties such as node degrees and community partitions. In short, this hierarchical graph approach aims at modeling the natural PPI hierarchy with more effective and efficient structure perception.

Here we describe a generic DL platform tailored for predicting PPIs, Hierarchical Graph Neural Networks for Protein–Protein Interactions (HIGH-PPI). HIGH-PPI models structural protein representations with a bottom inside-of-protein view GNN (BGNN) and the PPI network with a top outside-of-protein view GNN (TGNN). In the bottom view, HIGH-PPI constructs protein graphs by treating amino acid residues as nodes and physical adjacencies as edges. Thus, BGNN integrates the information of protein 3D structures and residue-level properties in a synergistic fashion. In the top view, HIGH-PPI constructs the PPI graph by taking protein graphs (the bottom view) as nodes and interactions as edges, and learns protein–protein relationships with TGNN. In an end-to-end training paradigm, HIGH-PPI gains mutual benefits from both views. On the one hand, the bottom view feeds protein representations to the top view to learn accurate protein relationships. On the other hand, protein relationships learned by the top view provide insights to further optimize the bottom view and establish better protein representations. HIGH-PPI outputs the probabilities of interactions for given protein pairs and predicts key "contact" sites for such interactions by calculating residue importance. We show the effectiveness of HIGH-PPI on the human interactome from the STRING database38 and compare it with leading DL methods. We demonstrate the superiority of HIGH-PPI with higher prediction accuracy and better interpretability. We also show examples where HIGH-PPI identifies binding and catalytic sites with high precision automatically.

Results
HIGH-PPI introduces a hierarchical graph for learning structures of proteins and the PPI network
Although deep learning (DL) models for protein–protein interaction (PPI) prediction have been studied extensively, they have not yet been developed to model the natural PPI hierarchy. Here, we suggest HIGH-PPI, a hierarchical graph neural network, for accurate and interpretable PPI prediction. HIGH-PPI works like biologists in a hierarchical manner, as it contains a bottom inside-of-protein view and a top outside-of-protein view (schematic view in Fig. 1c and detailed architecture in Supplementary Fig. 1a). On the one hand, HIGH-PPI applies the bottom view when dealing with a protein, where a protein is represented by a protein graph with residues as nodes and their physical adjacencies as edges. On the other hand, from the top view, protein graphs and their interactions are considered nodes and edges of the PPI graph, respectively. Correspondingly, two GNNs are employed to learn from protein graphs in the bottom view (BGNN) and from the PPI graph in the top view (TGNN). Consequently, a set of graphs is interconnected by edges in a hierarchical graph to present a potent data representation.

In the proposed end-to-end model, the initial stage is to create protein graphs for learning appropriate protein representations. The adjacency matrix of a protein graph is derived from a contact map connecting physically close residues (see Section 4.1 in "Methods" for details). Node attributes are defined with residue-level features expressing the physicochemical properties of proteins (see Section 4.1 in "Methods" for details). To produce a protein graph representation, a Graph Convolutional Network (GCN)36 is used in BGNN to optimize the protein graphs. As shown in Fig. 1c, BGNN contains two GCN blocks, and we construct three components for each GCN block to obtain a fixed-length embedding vector for a protein graph. Both the adjacency matrix and the residue-level feature matrix are inputs to a GCN layer. To improve model expressiveness and accelerate training convergence, respectively, the ReLU nonlinear activation function and Batch Normalization (BN) are used. A readout operation, including self-attention graph (SAG) pooling39 and average aggregation, is used to ensure a fixed-length embedding vector output. Regardless of the number and permutation of residues, a 1D embedding vector is obtained after two GCN blocks. By the end of these operations, the final protein representations are assembled and employed as initial features of the PPI graph. In TGNN, features are propagated along interactions in the PPI network to learn network community and degree properties. In the top view, we specifically design a GIN block that contains a Graph Isomorphism Network (GIN)37 layer, a ReLU activation function and a BN layer. Node features of the PPI graph are updated with the recursive neighborhood aggregations of three GIN blocks. Two arbitrary protein embeddings are combined by a concatenation operation, and a Multi-Layer Perceptron (MLP) is then applied as a classifier for prediction. Moreover, we also consider graph attention (GAT) layers and arbitrarily deploy two of the three GNN layer types (i.e., GCN, GIN and GAT) on BGNN and TGNN. The performance of HIGH-PPI with various GNN layers is shown in Supplementary Fig. 2.

We train and evaluate HIGH-PPI on multi-type human PPIs from the STRING database38, which contains a critical assessment and integration of PPIs. SHS27k26, a Homo sapiens subset of STRING38 that comprises 1,690 proteins and 7,624 PPIs, is used to train and evaluate HIGH-PPI unless otherwise noted. However, a small fraction of proteins (∼8%) sometimes needs to be removed because their native structures are missing from the PDB database. When evaluating prediction performance for multi-type PPIs, we consider the prediction of each PPI type as a one-vs-all binary classification problem, for which two metrics, the F1 score and the area under the precision-recall curve (AUPR), are used to predict the presence or absence of the corresponding PPI class. The overall micro-F1 and AUPR scores for multi-type PPI prediction are averaged across all PPI types.
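As a concrete illustration of this one-vs-all protocol, the two metrics can be computed with scikit-learn as sketched below. The variable names and the threshold sweep used for "best-F1" are illustrative assumptions, not the released evaluation code.

```python
# Sketch of the one-vs-all evaluation described above (illustrative names).
# y_true, y_score: arrays of shape (n_pairs, n_types) holding binary labels
# and predicted probabilities for each PPI type.
import numpy as np
from sklearn.metrics import average_precision_score, f1_score

def evaluate_multitype(y_true: np.ndarray, y_score: np.ndarray):
    # AUPR per PPI type (one-vs-all), then averaged across types.
    auprs = [average_precision_score(y_true[:, t], y_score[:, t])
             for t in range(y_true.shape[1])]
    # "best-F1": maximum micro-F1 over a threshold sweep, pooling all types.
    best_f1 = max(
        f1_score(y_true, (y_score >= t).astype(int), average="micro")
        for t in np.linspace(0.05, 0.95, 19)
    )
    return best_f1, float(np.mean(auprs))
```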
Fig. 1 | Schematic view of the HIGH-PPI architecture. Both the protein structure (biology structure) and the network structure (interactome structure) are essential for predictions of PPIs. a The PPIs with protein structure information. Although protein sequence usually provides details among PPIs, it can also lead to low predictability for PPI prediction. Left: As an example, SERPINA1 and SERPINA3, protein members of a shared superfamily, bind to almost the same binding surface (TM-score is 0.74) of ELANE, whereas they share low sequence consistency (identity is 0.13) locally in the binding surface. Right: From a global perspective, gaps in the sequence and structure of proteins also exist. SERPINA1 and SERPINA3 align highly in structure (TM-score is 0.89) but share a low sequence consistency (identity is 0.43). b The PPIs with network structure information. PPI networks tend to yield community structures that divide proteins into groups with dense connections internally (internal edges) and sparse connections externally (external edges). c HIGH-PPI is a hierarchical model for learning both protein structure information and network structure information. HIGH-PPI contains two views, the top view and the bottom view. In the bottom view, residues serve as nodes, residue-level physicochemical properties as node features, and edges connect physically adjacent residues. Two trainable graph convolutional blocks are applied for learning complex protein representations. In the top view, proteins serve as nodes, interactions as edges, and representations from the bottom view as node features. Three trainable graph isomorphism blocks are applied to update protein representations, and after concatenating a pair of query proteins, the resulting embedding is passed through the linear classifier to learn protein correlations.
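To make the two-view layout of Fig. 1c concrete, the sketch below expresses it with PyTorch Geometric data objects. This is a minimal illustration under assumed shapes (7 residue features, a 256-dimensional protein embedding); it is not the released implementation.

```python
import torch
from torch_geometric.data import Data

def make_protein_graph(num_residues: int, contact_edges: torch.Tensor) -> Data:
    # Bottom view: residues as nodes with 7 physicochemical features each;
    # edge_index ([2, num_edges]) comes from the residue contact map.
    return Data(x=torch.randn(num_residues, 7), edge_index=contact_edges)

# Toy example: two small proteins and a single known interaction between them.
protein_graphs = [
    make_protein_graph(5, torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])),
    make_protein_graph(4, torch.tensor([[0, 1, 2], [1, 2, 3]])),
]

# Top view: proteins as nodes (features are filled later with BGNN embeddings),
# known PPIs as edges.
ppi_graph = Data(x=torch.zeros(len(protein_graphs), 256),
                 edge_index=torch.tensor([[0], [1]]))
```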
HIGH-PPI shows the best performance, robustness and generalization
To validate the predictive power of our model, we compare HIGH-PPI with leading methods from four perspectives: (1) the overall performance under a random data split, (2) the robustness of HIGH-PPI against random interaction perturbation, (3) model generalization for predicting PPI pairs containing unknown proteins, and (4) evaluations in terms of AUPR on five separate PPI types. For each method, all the proposed modules and strategies are involved to get the best performance.

First, we compare the overall performance of HIGH-PPI with leading baselines in Fig. 2a. To ensure native PDB structures for all proteins, we filter SHS27k and construct a dataset containing ∼1600 proteins (see Supplementary Data File 1) and ∼6600 PPIs. We randomly select 20% of PPIs for testing and compare HIGH-PPI to one state-of-the-art DL method (i.e., GNN-PPI24), one sequence-based method (i.e., PIPR26), one 2D CNN-based method (i.e., DrugVQA40) and one machine learning (ML) method based on random forest (i.e., RF-PPI41). GNN-PPI applies a GNN module to learn the PPI network topology and a 1D CNN to learn protein representations, taking pre-trained residue embeddings as inputs. PIPR, an end-to-end framework based on recurrent neural networks (RNNs), represents proteins with only pre-trained residue embeddings. DrugVQA applies a visual question-answering mode to learn from protein contact maps with a 2D CNN model and extracts semantic features with a sequential model. Supplementary Data File 2 contains predictions of HIGH-PPI for all test PPIs from SHS27k. We provide the precision-recall curves in Fig. 2a. In terms of best micro-F1 scores (best-F1), HIGH-PPI obtains the best performance. The pre-trained residue embedding method GNN-PPI takes second place by effectively generalizing to unknown proteins. Without using any pre-training techniques, HIGH-PPI surpasses GNN-PPI by an average of ∼4%, showing the superiority of the hierarchical modeling approach. DrugVQA achieves relatively poor performance (best-F1 ≈ 0.7), which could be attributed to its neglect of residue property information and of the structure of the PPI network.

Second, to evaluate the robustness of HIGH-PPI, we analyze the model's tolerance against interaction data perturbation, including random addition or removal of known interactions. This simulates scenarios where PPI datasets omit undiscovered interactions and may introduce mislabeled ones. Based on the perturbed PPI network, we split the training and test sets at an 8:2 ratio.
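Our reading of this perturbation protocol can be sketched as follows; the function name and set-based bookkeeping are illustrative assumptions.

```python
import random

def perturb_network(edges: set, n_proteins: int, ratio: float,
                    mode: str, rng: random.Random) -> set:
    """Randomly remove a fraction of known interactions, or inject the same
    number of random non-edges as mislabeled interactions (illustrative)."""
    k = int(ratio * len(edges))
    if mode == "remove":
        return edges - set(rng.sample(sorted(edges), k))
    added = set()
    while len(added) < k:
        u, v = rng.randrange(n_proteins), rng.randrange(n_proteins)
        pair = (min(u, v), max(u, v))
        if u != v and pair not in edges:
            added.add(pair)
    return edges | added
```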
Fig. 2 | Performance of HIGH-PPI in predicting PPIs. a Precision-recall curves of PPI prediction on SHS27k (a sub-dataset of STRING) containing ∼6600 PPIs and ∼1500 human proteins with native PDB structures, showing the performance of HIGH-PPI compared to the baselines GNN-PPI, PIPR, DrugVQA and RF-PPI. b Robustness evaluation showing the best micro-F1 scores (best-F1) of baseline predictions against link perturbations of various cases where links are randomly added or removed with different ratios. Error bands of a and b represent the standard deviation of the mean under 9 independent runs. c Generalization evaluation showing best-F1s of baselines tested on a regular case and 4 out-of-distribution (OOD) cases, in which datasets are constructed with random split (R), Breadth-First Search (BFS) and Depth-First Search (DFS), and three ratios represent probabilities of overlap of proteins between the training and test datasets. Distributions of best-F1s under 9 independent runs of HIGH-PPI and the second-best baseline (GNN-PPI) are represented as boxplots (center line, the median; upper and lower edges, the interquartile range; whiskers, 0.5 × interquartile range); moreover, dotted lines show the mean results of 9 independent runs of PIPR, DrugVQA and RF-PPI under DFS-0.4, the easiest OOD pattern. The significance of HIGH-PPI versus GNN-PPI is shown in each case (two-sided t-test results: ****P = 1.1 × 10⁻⁵ for BFS-0.3, ***P = 4.5 × 10⁻³ for DFS-0.3, *P = 1.0 × 10⁻⁷ for BFS-0.4, ***P = 2.0 × 10⁻² for DFS-0.4 and **P = 3.0 × 10⁻⁶ for R-0.65). d Distributions of AUPR scores of 5 independent runs computed on 5 PPI types and corresponding proportions. Each panel shows the performance significance of HIGH-PPI versus the second-best baseline (GNN-PPI) (two-sided t-test results: ****P = 2.0 × 10⁻⁵ for binding, ***P = 1.7 × 10⁻⁴ for reaction, *P = 4.4 × 10⁻² for ptmod, ***P = 3.2 × 10⁻⁴ for catalysis and **P = 6.0 × 10⁻³ for inhibition). Error bars represent standard deviation of the mean. Source data are provided as a Source Data file.
We observe in Fig. 2b that our method exhibits stable performance in terms of best-F1 under random perturbations of up to 40%. Compared to the second-best baseline (i.e., GNN-PPI), HIGH-PPI offers a significant performance gain of up to 19%, demonstrating the strongest model robustness among all methods. It is worth noting that although RF-PPI and DrugVQA perform comparably in the overall evaluation (see Fig. 2a), DrugVQA performs significantly more robustly than RF-PPI, demonstrating the superiority of DL methods over ML ones.

Furthermore, we perform a false discovery analysis on our method, which investigates the effect of training data unreliability (i.e., false negatives (FN) and false positives (FP)) on our model and a solid baseline (GNN-PPI). Specifically, we consider the original dataset to be reliable and artificially add perturbations to represent data unreliability. Supplementary Table 1 shows the 9 created datasets with different FP rates (FPR_train) and FN rates (FNR_train). We train the model on the reliable training set and on the 9 created unreliable ones, and report the FP rate (FPR_pre), FN rate (FNR_pre) and false discovery rate (FDR_pre) metrics on the test sets (see Supplementary Tables 2 and 3). Without unreliability, our model achieves the best performance, with insignificant superiority (*P = 3.8 × 10⁻²) in the FPR_pre metric and considerable superiority in the FNR_pre (***P = 1.2 × 10⁻⁴) and FDR_pre (***P = 1.5 × 10⁻⁴) metrics. When introducing data unreliability, we find that our model substantially improves its significance in the FPR_pre metric (****P = 4.0 × 10⁻⁵) while retaining the original significance in FNR_pre and FDR_pre. In addition to showing the excellent robustness of our model, we provide more in-depth insights in Section 3.2.

Generalization ability is investigated by testing HIGH-PPI in various out-of-distribution (OOD) scenarios where unknown proteins arrive in the test sets with different probabilities (see Fig. 2c). For example, BFS-0.3 denotes that the test set involves 30% known proteins via the Breadth-First Search approach24. For PIPR, DrugVQA and RF-PPI, we visualize their best performances among all OOD cases using dotted lines, to demonstrate the dominance of HIGH-PPI and GNN-PPI. Furthermore, we observe that HIGH-PPI consistently outperforms GNN-PPI, the second-best method, by large margins in all five scenarios. BFS typically produces worse performance than DFS, because BFS creates a more challenging and realistic mode where unknown proteins exist in cluster form. The ML method (RF-PPI) exhibits poor generalization. Furthermore, we follow Park and Marcotte42 to explore the differences in model performance on 3 kinds of PPI pairs with different degrees of OOD. Specifically, C1 stands for the percentage of PPIs for which both proteins were present in the training set (Class 1), C2 stands for the percentage of PPIs for which one (but not both) of the proteins was present in the training set (Class 2), and C3 stands for the percentage of PPIs for which neither protein was present in the training set (Class 3). The detailed experimental protocol is presented in Supplementary Method 3. We come to the same conclusion as Park and Marcotte did42: there is a noticeable difference in model test performance across the 3 distinct classes of test pairs. In particular, both models (HIGH-PPI and GNN-PPI) perform best on Class 1 test pairs, second best on Class 2 test pairs, and poorest on Class 3 test pairs. Furthermore, we find that for each model, the class proportion (i.e., C1/C2/C3) has an impact on the overall performance of the model despite having little effect on performance on the respective classes.
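The Park-and-Marcotte pair classes can be reproduced with a few lines; this sketch assumes pairs are tuples of protein identifiers and is only meant to illustrate the definition of C1/C2/C3.

```python
def pair_class(train_proteins: set, pair: tuple) -> int:
    """Class 1: both proteins seen in training; Class 2: exactly one seen;
    Class 3: neither seen (following Park and Marcotte)."""
    seen = sum(p in train_proteins for p in pair)
    return {2: 1, 1: 2, 0: 3}[seen]

def class_proportions(train_proteins: set, test_pairs: list) -> dict:
    # Proportions C1/C2/C3 of the three classes over a test split.
    counts = {1: 0, 2: 0, 3: 0}
    for pair in test_pairs:
        counts[pair_class(train_proteins, pair)] += 1
    return {c: n / max(1, len(test_pairs)) for c, n in counts.items()}
```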
Fig. 3 | Performance of the bottom view GNN of HIGH-PPI to represent a protein for PPI prediction. a Effectiveness demonstrated with or without protein 3D information (3D coordinates of Cα atoms in all residues). The protein is represented with backbones including Random Forest (RF from RF-PPI, gray), Recurrent Neural Networks (RNN from PIPR, purple) and Convolutional Neural Networks (CNN (Seq) from GNN-PPI, CNN (+3D) from DeepRank, blue), respectively. Converting 3D information into protein contact maps (CM), a backbone with graph-structured data outperforms all other methods with high performance significance (two-sided t-test results: graph versus RF (+3D) ****P = 1.1 × 10⁻⁸, graph versus RNN (+3D) ****P = 6.1 × 10⁻¹², graph versus CNN (+3D) ****P = 2.3 × 10⁻⁷). Error bars represent standard deviation of the mean under 9 independent runs. b HIGH-PPI can outperform other baselines without absolutely precise structures of query proteins. The blue dotted line (mean value of 9 independent runs) represents the best-F1 score of the second-best baseline (GNN-PPI) without 3D information, and the boxplot (9 runs with independent seeds) shows the relationship between best-F1 scores of HIGH-PPI and the root-mean-square deviation (RMSD) of the tested structures relative to the native structures. For boxplots, the center line represents the median, upper and lower edges represent the interquartile range, and the whiskers represent 0.5 × interquartile range. c As an example, HIGH-PPI can easily identify the binding site containing four physically adjacent residues via a conventional graph motif search method (PDB id: 1BJP). CNN- and RNN-based backbones may miss (missed) or mis-identify (non-essential) residues with Grad-CAM and RNNVis. d The residue-level feature importance for overall (leftmost column) and type-specific (right six columns) PPI prediction, calculated as the average z-score resulting from dropping each individual feature dimension from our model and calculating changes in AUPR before and after. Source data are provided as a Source Data file.
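For reference, the RMSD used in Fig. 3b to grade input structure quality can be computed from matched Cα coordinates. The sketch assumes the two structures are already superposed and residue-aligned.

```python
import numpy as np

def rmsd(ca_model: np.ndarray, ca_native: np.ndarray) -> float:
    """Root-mean-square deviation over matched C-alpha coordinates of shape
    (n_residues, 3); assumes prior superposition and residue alignment."""
    assert ca_model.shape == ca_native.shape
    return float(np.sqrt(np.mean(np.sum((ca_model - ca_native) ** 2, axis=1))))
```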
Thus, it seems that the proportion of the three test pair classes (Supplementary Table 6), as well as the percentage of unknown proteins (Fig. 2c) in the test sets, may both play a significant role in determining the degree of OOD in the dataset.

Finally, for each of the five PPI types, we offer a separate performance analysis in terms of AUPR. In all five types, HIGH-PPI consistently beats the other baselines with high significance, as shown in Fig. 2d. As anticipated, PPI types with high proportions (such as binding, reaction, and catalysis) can be predicted more easily, since the model can learn enough relevant information. In addition, we find that when predicting binding PPIs, HIGH-PPI outperforms GNN-PPI most significantly (****P = 2.0 × 10⁻⁵). This is reasonable, as HIGH-PPI is designed to recognize spatial-biological patterns of proteins, which is highly relevant to binding-type PPIs. Similar trends are also found in the performance of HIGH-PPI and GNN-PPI on various PPI types under OOD cases (Supplementary Fig. 5).

Bottom inside-of-protein view improves the performance
We investigate the role of the bottom inside-of-protein view from four perspectives: (1) the effectiveness of graph representations and backbones with native protein structures, (2) the model tolerance of low-quality protein structures, (3) the capability to predict motifs (i.e., functional sites) in a protein, and (4) the overall and type-specific feature importance.

First, we explore the effectiveness of the RF, RNN, CNN and GNN backbones in Fig. 3a. For fairness, we feed the same residue sequence features to RF, RNN and CNN, whose results are displayed by bar charts labeled 'Seq'. We directly use RF-PPI as the RF backbone. For the RNN and CNN backbones, we respectively employ the RNN module of PIPR and the CNN module of GNN-PPI to extract sequence embeddings for representing proteins, and apply the same fully connected layer as the classifier. We then test the predictive power of each model with 3D information. For RF and RNN, we employ concatenations of sequence data and the Cartesian 3D coordinates of each Cα. For CNN, we apply the 3D CNN module suggested in DeepRank21, a deep learning framework for identifying interfaces of PPIs. For GNN, we learn from protein graphs in which the adjacency matrix is determined by the Cα–Cα contact map. With the aid of 3D information, we discover that the performance of all models can be improved, indicating that 3D information is an important complement to sequence-alone information. Importantly, GNN performs best compared to RF (+3D), RNN (+3D) and CNN (+3D), which shows that GNN is the best approach for capturing the spatial-biological arrangements of residues within a protein. Moreover, GNN performs significantly better than 3D CNN in memory and time efficiency (Supplementary Fig. 3).

Second, we examine the model tolerance when testing with low-quality structure data (see Fig. 3b). This meets realistic scenarios, where native structure information is not always available for predicting PPIs.
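The Cα–Cα contact maps that define these protein graphs (see "Methods") can be sketched as below; the 10 Å cutoff matches the value reported there, and the vectorized distance computation is an implementation assumption.

```python
import numpy as np

def contact_map(ca_coords: np.ndarray, cutoff: float = 10.0) -> np.ndarray:
    """Binary residue adjacency A_b from C-alpha coordinates of shape (n, 3):
    residues i, j are connected if their C-alpha distance is below cutoff (Å)."""
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]
    adj = (np.sqrt((diff ** 2).sum(-1)) < cutoff).astype(np.int8)
    np.fill_diagonal(adj, 0)  # self-loops are added inside the GCN as A_b + I_n
    return adj
```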
We prefer a model whose performance is not seriously limited by structure quality and which is robust to inputs taken directly from computational models (e.g., AlphaFold43). We evaluate the quality of an input protein structure by calculating the root-mean-square deviation (RMSD) between the native structure and the input. Native protein structures (RMSD = 0) are retrieved from the PDB database at the highest resolutions. We compute the best-F1 scores (box plots) of our method on a set of AlphaFold structures with various RMSDs (0.80, 1.59, 2.39, 3.19, 5.36, 7.98), and show the average result of the second-best method (GNN-PPI) as a blue dotted line. As can be seen, our model's performance is always better than GNN-PPI, even with RMSD up to 8. The comparison with the 3D CNN model21 further proves the denoising ability of the hierarchical graph against protein structure errors (Supplementary Fig. 4a). In short, our model's performance is not significantly affected by structure errors in settings where powerful pre-trained features are not available.

Further, to interpret decisions made by RNN, CNN and GNN, an experiment is conducted to explore the ability to capture protein functional sites. We apply the 3D-Grad-CAM approach44 to the trained 3D CNN model DeepRank21, and the RNNVis approach45 to the trained PIPR26 model with 3D information. All three methods identify more than one motif, of which we show only the most crucial site. Figure 3c displays the binding site of an isomerase protein's chain A (PDB id: 1BJP). The binding site is made up of four residues with sequence numbers 6, 42, 43, and 44. As can be seen, whereas neither CNN nor RNN can identify the His-6 residue, our method can precisely identify the binding site by graph motif search. It seems to be a challenge for the sequence models (i.e., RNN, CNN) to connect His-6 to the other residues, probably because of their weak connections in sequence. Moreover, 3D CNN performs even worse than RNN, as it incorrectly classifies the non-essential Ile-41 residue.

For node features in protein graphs, we select seven important features from twelve easily available residue-level feature options (see Supplementary Table 4). The feature selection process (see Supplementary Method 1 for details) produces the optimal set of seven features that lets our model peak at both AUPR and best-F1 scores. We list the selected seven residue-level physicochemical properties in Fig. 3d and discuss their importance for different types of PPIs, both to better interpret our model and to discover enlightening biomarkers for PPI interfaces. The average z-score, obtained by deleting each feature dimension and analyzing the change in AUPR before and after, is calculated to determine the importance of a feature. We choose a representative type (i.e., binding) to explain because it is the most prevalent in the STRING database. As a consequence, HIGH-PPI regards topological polar surface area (TPSA) and the octanol-water partition coefficient (KOW) as dominant features. This finding supports the conventional wisdom that TPSA and KOW play a key role in drug transport processes46, protein interface recognition47,48, and PPI prediction49.

Top outside-of-protein view improves the performance
We investigate the role of the top outside-of-protein view TGNN from three perspectives: (1) the importance of degree and community recovery for predicting network structures, (2) comparison results of TGNN and other leading link prediction methods, and (3) a real-life example showing the shortcomings of the leading link prediction methods.

Recently, various works have demonstrated the usefulness of structural properties (e.g., degree, community) of networks for predicting missing links. HIGH-PPI is accordingly designed to efficiently recover the degree and community partitions of the PPI network by utilizing the network topology. We show an empirical study in Fig. 4a to illustrate the impact of degree and community recovery on link prediction. We randomly select the test results from the model trained at different epochs and calculate the negative mean absolute error (-MAE) between the predicted and real degrees to represent degree recovery. Similarly, we quantify community recovery using the normalized mutual information (NMI). As can be seen, we observe a significant correlation (R = 0.66) between degree recovery and model performance (i.e., best-F1), as well as a high correlation (R = 0.68) between community recovery and model performance, which means that better recovery of the degree and community structure of the PPI network implies better PPI prediction performance.

Second, we evaluate the performance of TGNN and leading link prediction methods using the PPI network structure as input. Our method (TGNN) takes interactions as edges and node degrees as node features. We compare HIGH-PPI with six heuristic methods and one DL-based method. The heuristic methods, simple yet effective ones utilizing heuristic node similarities as link likelihoods, include common neighbors (CN)33, the Katz index (Katz)50, Adamic-Adar (AA)51, preferential attachment (PA)52, SimRank (SR)53 and paths of length three (L3)22. MLP_IP, a DL approach, learns node representations using a multilayer perceptron (MLP) and identifies node similarity via an inner product (IP) operation. We calculate the MAE and NMI values of the recovered networks and highlight those with a high capacity for recovery (NMI ≥ 0.7 and MAE ≤ 0.35) in orange. The results show that link prediction methods that are more adept at recovering network properties typically perform better. This gain validates our findings in Fig. 4a and highlights the need for TGNN in the top view. In addition, a comparison of MLP_IP and L3 elucidates that pairwise learning is insufficient to capture the network information well. Although L3 can capture the evolutionary principles of PPIs to some extent, our method beats L3 by better recovering the structure of the PPI network.

We provide an example on an SHS27k sub-network. As can be seen, there exist two distinct communities connected by two inter-community edges. We use the original sub-network as input and find that non-TGNN link prediction methods (i.e., CN, Katz, SR, AA, PA) tend to give high scores to inter-community interactions. As an interesting observation, when we apply the Louvain community detection algorithm54 to the recovered structure, it cannot produce an accurate community partition, as the abundant inter-community interactions disrupt the original community structure. To examine degree recovery ability, we randomly select 50% of interactions as inputs and show each method's degree recovery result for the node KIF22 in Fig. 4c. We find that non-TGNN approaches cannot recover the links connecting the node KIF22 well, while the TGNN approach can. In short, these experiments demonstrate that the structural properties of the PPI network are not always reflected in traditional link prediction methods, and moreover, that capturing and learning the network structures in our top view improves prediction performance.

HIGH-PPI accurately identifies key residues constituting functional sites
Typically, functional sites are spatially clustered sets of residues. They control protein functions and are thus important for PPI prediction. As our proposed model has the capacity to capture the spatial-biological arrangements of residues in the bottom view, this characteristic can be used to explain the model's decisions. It is meaningful to notice that HIGH-PPI can automatically learn residue importance without any residue-level annotations. In this section, we provide (1) a case study of predicting residue importance for the binding surface, (2) two cases of estimating residue importance for catalytic sites, and (3) a comparison of explainable ability in terms of precision in predicting binding sites.

First, a binding example between the query protein (PDB id: 2B6H-A) and its partner (PDB id: 2REY-A) is investigated. The ground truth binding surface is retrieved from the PDBePISA database55 and colored red in Fig. 5a. Subsequently, we apply the GNN explanation approach (see Section 4.5 in "Methods" for details) to the HIGH-PPI model. As can be seen from Fig. 5a, HIGH-PPI can accurately and automatically identify the residues belonging to the binding surface.
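For intuition only, a gradient-based saliency over node features is sketched below as a stand-in explainer; the actual GNN explanation approach is described in Section 4.5 of "Methods", and the BGNN forward signature assumed here is illustrative.

```python
import torch

def residue_saliency(bgnn: torch.nn.Module, x: torch.Tensor,
                     edge_index: torch.Tensor) -> torch.Tensor:
    """Per-residue importance as the gradient norm of the protein embedding
    with respect to node features (hypothetical stand-in explainer)."""
    x = x.clone().requires_grad_(True)
    embedding = bgnn(x, edge_index)  # assumed forward: (n, features) -> (1, d2)
    embedding.sum().backward()
    return x.grad.norm(dim=1)        # one importance score per residue
```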
Fig. 4 | Performance of the top view GNN of HIGH-PPI to learn relational information in the PPI network. a Pearson correlations (R) between the prediction performance (best-F1) and degree recovery (left) and community recovery (right). It can be observed that high recovery of the degree and community structure of the PPI network indicates better performance for PPI prediction. Degree recovery is quantified with the mean absolute error (MAE) between the true and predicted degree distributions. Community recovery is quantified with the normalized mutual information (NMI) of true and predicted communities. The shaded area (error band) represents the 95% confidence interval. b Boxplots (center line, the median; upper and lower edges, the interquartile range; whiskers, 0.5 × interquartile range) showing the best-F1 distributions (5 runs with independent seeds) using various link prediction methods. Methods (green) predicting PPI networks for which NMI < 0.7 and MAE > 0.35 significantly underperform the others (orange). c Left: An example showing a PPI network with the area of each node representing its degree value and only two external edges connecting the two communities detected. Middle: Real calculation results showing how other link prediction methods generate mislinks as external edges, which may disrupt the community partitions. Right: Real calculation results showing the inability of other link prediction methods to recover degrees. Source data are provided as a Source Data file.
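The two recovery metrics of Fig. 4 can be sketched as follows; reading degree recovery as a per-node mean absolute error is our interpretation, and community labels are assumed to come from a detector such as Louvain.

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def degree_mae(true_adj: np.ndarray, pred_adj: np.ndarray) -> float:
    # MAE between true and recovered node degrees (one reading of Fig. 4a).
    return float(np.mean(np.abs(true_adj.sum(1) - pred_adj.sum(1))))

def community_nmi(true_labels, pred_labels) -> float:
    # NMI between community partitions of the true and recovered networks.
    return normalized_mutual_info_score(true_labels, pred_labels)
```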
Another observation, shown in Fig. 5c, indicates that our learned residue importance is quite close to the real profiles. We show another six cases in which HIGH-PPI correctly identifies binding surfaces in Supplementary Fig. 7.

Second, in order to evaluate the prediction of catalytic sites for PPIs, we utilize the same GNN explanation approach in our model. The ground truth catalytic site is retrieved from the Catalytic Site Atlas56 (CSA), a database of catalytic residue annotations for enzymes. We calculate the residue importance of catalytic sites for the query proteins (PDB id: 1S9I-A, 1I0O-A). As seen in Fig. 5b, our proposed HIGH-PPI can correctly predict both residues for 1S9I-A and two out of three for 1I0O-A. We show another nine cases of HIGH-PPI identifying catalytic sites in Supplementary Fig. 6, where a total of 25 out of 34 catalytic sites are correctly identified.

Additionally, we compare the model interpretability of the CNN, 3D CNN and HIGH-PPI models. We employ the CNN module from GNN-PPI24 and the 3D CNN module from DeepRank21, respectively, and apply the Grad-CAM57 and 3D-Grad-CAM44 approaches to determine residue importance for the CNN and 3D CNN models, correspondingly. We use the binding-type PPIs from the STRING dataset as the training set and randomly select 20 binding-type PPIs as the test set. We use the ground truth from PDBePISA for each query protein and treat its residues with importance >0 as surface compositions. To gauge the precision of the surface prediction, intersection over union (IoU) is used, and box plots of the IoU score distributions are shown in Fig. 5d. The results elucidate that HIGH-PPI significantly outperforms the other models in terms of interpretability, with minimum variance. In addition, 3D CNN outperforms CNN with a smaller variance, showing that 3D information supports the learning of reliable and generalized protein representations.

Protein functional site prediction sheds light on the model's decisions and on how to carry out additional experimental validations for PPI investigation. Excellent model interpretability also shows that our approach can accurately describe biological evidence for proteins.

Discussion
Hierarchical graph learning
In this paper, we study the PPI problem from a hierarchical graph perspective and develop a hierarchical graph learning model named HIGH-PPI to predict PPIs. Empirically, HIGH-PPI outperforms leading methods for PPI prediction by a significant margin. The hierarchical graph exhibits high generalization for recognizing unknown proteins and robustness against protein structure errors and PPI network perturbations.
Fig. 5 | Automatic explanation for residue importance without supervision. a Top: Depiction of a complex protein (left, query protein, PDB id: 2B6H-A; right, interacting protein, PDB id: 2REY-A) modeled in surface representation. Residues on the binding surface of the query protein are highlighted in red (important) and others in blue (non-important). Bottom: Residue importance of the query protein learned from HIGH-PPI, with coloring ranging from low (blue) to high (red). Important regions are magnified to show the cartoon representation. b Depiction of two proteins (left, PDB id: 1S9I-A; right, PDB id: 1I0O-A) modeled in cartoon representations. Residues are colored to match the importance scores, with more important residues highlighted in red and unimportant ones in blue. Residues with catalytic functions that are correctly or incorrectly identified are highlighted in red and black, respectively. c Polylines showing the consistency of the highest peaks that represent the learned (gray) and real (red) functional regions for the binding interaction case shown in a. d Boxplots (center line, the median; upper and lower edges, the interquartile range; whiskers, 0.5 × interquartile range) showing the explainable ability for binding PPIs, calculated as the overlap of real and learned functional regions (IoU, intersection over union) with 20 PPI pairs and their real interfaces retrieved from STRING and the PDBePISA database, respectively. HIGH-PPI shows significantly greater explainable ability (two-sided t-test results: HIGH-PPI versus CNN ****P = 4.4 × 10⁻⁶, HIGH-PPI versus CNN (+3D) ****P = 4.4 × 10⁻⁸). No information about residue importance was used to train our model. Source data are provided as a Source Data file.
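The IoU score reported in Fig. 5d reduces to a set overlap between predicted and real interface residues, as sketched below with residue identifiers as set elements.

```python
def interface_iou(pred_residues: set, true_residues: set) -> float:
    """Intersection over union of predicted and PDBePISA interface residues."""
    if not pred_residues and not true_residues:
        return 1.0
    return len(pred_residues & true_residues) / len(pred_residues | true_residues)
```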
Even without explicit supervision from binding site information, HIGH-PPI demonstrates its ability to capture residue importance for PPIs with the aid of a hierarchical graph, which is a good indicator of excellent interpretability. Were HIGH-PPI to predict the presence of a catalytic interaction for a protein pair while identifying important sites unrelated to catalysis, we would hardly trust the model's decision. Moreover, interpretability provides trusted guidance for subsequent wet experimental validations. For example, if HIGH-PPI regards a catalytic site as important, experiments may be designed to target that specific site for validation.

In conclusion, interpretable, end-to-end learning with a hierarchical graph revealing the nature of PPIs can pave the way to mapping out the human interactome and deepen our understanding of PPI mechanisms.

Limitations and future work
We describe our intuitions behind hierarchical graph learning for PPIs. The world is hierarchical. Humans tend to solve problems or learn knowledge by conceptualizing the world from a hierarchical view58. Due to the huge semantic gaps between hierarchical views, humans often use a multi-view learning strategy to deepen the understanding of one view from the other. Given rich hierarchical information, recent machine intelligence methods can effectively learn knowledge in each separate view but are not experts at gaining mutual benefits from both views. This is the challenge that our hierarchical world presents to machine intelligence. Here we connect both views by employing the forward and backward propagation of DL models. The forward propagation benefits the learning of the PPI network in the top view. In turn, the backward propagation optimizes the PPI-appropriate protein representations in the bottom view.

We describe the main limitations of HIGH-PPI and outline potential solutions for future work. (1) We did not explore in depth how to use protein-level annotations. Annotations for protein functions are becoming more available due to the recent growth of protein function databases (e.g., the UniProt Knowledgebase59) and computational methods29 for protein function prediction. Some annotations may speed up learning PPIs. For example, two proteins with low scores for the "protein binding" function term hardly interact with each other. We suggest that future work consider leveraging function annotations to enhance the expressiveness of protein representations. Inspired by the contrastive learning principle, a potentially feasible solution is to enhance the consistency between protein representations and functions. (2) Protein domain information may be beneficial for hierarchical models. We clarify the core ideas here and provide a detailed description in Supplementary Method 2. Domains are distinct functional or structural units in proteins and are responsible for PPIs and specific protein functions. Both in terms of structures and functions, the protein domain can represent a crucial middle scale for the PPI hierarchy. However, to our knowledge, true (native) domain annotations are not easily available, and predicted ones are usually retrieved
from computational tools, which inevitably leads to data unreliability. If we employed the domain scale as a separate view, data unreliability might spread to other views and impair the entire hierarchical model. On this basis, we prefer to recommend domain annotations as supervised information at the residue level. Precisely, a well-designed regularization is required to guarantee that all functional sites discovered by HIGH-PPI belong to the prepared domain database. The domain regularization and the PPI prediction loss form a flexible trade-off of learning objectives, which can appropriately tolerate domain annotation unreliability. (3) Memory requirements grow with the number of views in a hierarchical graph. HIGH-PPI employs two views to form the hierarchical graph and treats amino acid residues as the microscopic components of proteins. However, we did not further consider one more microscopic view in which atoms, the components of residues, provide information for representing residues. It might be beneficial to introduce an atom-level view and develop a memory-efficient way of storing and processing explicit 3D atom-level information. (4) In future work, model robustness can be further improved. Although our model outperforms in the robustness evaluation (see Supplementary Table 3), we observe that FDR_pre is most impacted by unreliable data, mostly because the number of FPs increases significantly (up to 6 times) from Data 1 to Data 9. A possible explanation for the significant rise in FPs is that the model's "low demand" for a positive sample permits certain controversial samples to be predicted as true. To address this issue, we recommend that future work consider a straightforward method, the voting strategy, which uses the voting outcomes of various independent classifiers to identify true PPIs. Independence makes it unlikely for voting classifiers to commit the same errors. A test pair can only be predicted as true if it is approved by most voting classifiers, which makes the model more demanding about PPI presence.

Methods
Construction of a hierarchical graph
We denote the set of amino acid residues in a protein as Prot = {r_1, r_2, ..., r_n}. Each residue is described with θ kinds of physicochemical properties. For the bottom inside-of-protein view, a protein graph g_b = (V_b, A_b, X_b) is constructed to model the relationship between residues in Prot, where V_b = Prot is the set of nodes, A_b is an n × n adjacency matrix representing the connectivity in g_b, and X_b ∈ R^{n×θ} is a feature matrix containing the properties of all residues.

For the top outside-of-protein view, a set of protein graphs can be interconnected within a PPI graph g_t, which is denoted as g_b ∈ V_t. The connectivity (i.e., interactions) between protein graphs can be denoted as an m × m adjacency matrix A_t. In addition, X_t ∈ R^{m×d_2} represents a feature matrix containing the representations of all proteins. We model the protein graphs and their connections as a hierarchical graph, in which four key variables (i.e., A_b, X_b, A_t, X_t) need to be clarified.

(1) The adjacency matrix A_b ∈ {0,1}^{n×n} of the protein graph and the protein contact map are exactly equivalent. Contact maps are obtained from the atomic-level 3D coordinates of proteins. First, we retrieve native protein structures from the Protein Data Bank60 and protein structures of various RMSD scores from AlphaFold43. Then we represent the location of each residue by the 3D coordinate of its Cα atom. The presence or absence of contact between a pair of residues is decided by their Cα–Cα physical distance. We perform a sensitivity analysis (see Supplementary Fig. 8) and find that our model produces similar results when trained on contact maps with cutoff distances ranging between 9 Å and 12 Å. Finally, we choose the optimal cutoff distance of 10 Å, which allows our model to peak in performance. (2) In the feature matrix X_b, each row represents the set of properties of one amino acid residue. In this work, seven residue-level properties are considered (i.e., θ = 7): isoelectric point, polarity, acidity and alkalinity, hydrogen bond acceptor, hydrogen bond donor, octanol-water partition coefficient, and topological polar surface area. Supplementary Data File 3 contains quantitative values of the seven types of properties for each amino acid. All properties can be easily retrieved from the RDKit repository61. (3) The PPI network structure determines the adjacency matrix A_t ∈ {0,1}^{m×m}, in which the element in the i-th row and j-th column is 1 if the i-th and j-th proteins interact. (4) The i-th row of the feature matrix X_t is the representation vector of the i-th protein graph g_b.

BGNN for learning protein representations
We use the bottom view graph neural network (BGNN) to learn protein representations. Graph convolutional networks (GCNs) have shown great effectiveness for relational data and are suitable for learning graph-structured protein representations. Thus, we propose BGNN based on GCNs.

Given the adjacency matrix A_b ∈ {0,1}^{n×n} and the feature matrix X_b ∈ R^{n×θ} of an arbitrary protein graph g_b, BGNN outputs the residue-level representations of the first GCN block, H^{(1)} ∈ R^{n×d_1}:

    H^{(1)} = \mathrm{GCN}(A_b, X_b)    (1)

where d_1 is the embedding dimension of the first GCN layer. Formally, we update residue representations with neighbor aggregations based on the work of Kipf and Welling36:

    H^{(1)} = \mathrm{BN}\big(\mathrm{ReLU}\big(\tilde{D}^{-1/2}(A_b + I_n)\tilde{D}^{-1/2} X_b W^{(1)}\big)\big)    (2)

where I_n ∈ R^{n×n} is the identity matrix, \tilde{D} ∈ R^{n×n} is the diagonal degree matrix with entries \tilde{D}_{ii} = \sum_j (A_b + I_n)_{ij}, W^{(1)} ∈ R^{θ×d_1} is a learnable weight matrix for the GCN layer, and ReLU and BN denote the ReLU activation function and batch normalization, respectively.

With the learnable weight matrix W^{(2)} ∈ R^{d_1×d_2}, the second GCN block produces the output H^{(2)} ∈ R^{n×d_2}:

    H^{(2)} = \mathrm{BN}\big(\mathrm{ReLU}\big(\tilde{D}^{-1/2}(A_b + I_n)\tilde{D}^{-1/2} H^{(1)} W^{(2)}\big)\big)    (3)

Finally, we perform the readout operation with a self-attention graph pooling layer39 and average aggregation to obtain an entire-graph representation of fixed size, x ∈ R^{1×d_2}. To clarify, we use x_i ∈ R^{1×d_2} to represent the final representation of the i-th protein graph.

TGNN for learning PPI network information
We use the top view graph neural network (TGNN) to learn PPI network information. We are inspired by the graph isomorphism network (GIN37), which has superb expressive power to capture graph structures. Formally, we are given the PPI graph g_t = (V_t, A_t, X_t), where X_t ∈ R^{m×d_2} is defined as the feature matrix whose row vectors are the final protein representations from BGNN (i.e., X_t[i,:] = x_i, i = 1, 2, ..., m). TGNN updates the representation of protein v in the k-th GIN block:

    x_v^{(k)} = \mathrm{BN}\Big(\mathrm{ReLU}\Big(\mathrm{MLP}^{(k)}\Big((1+\epsilon)\, x_v^{(k-1)} + \sum_{u \in N(v)} x_u^{(k-1)}\Big)\Big)\Big)    (4)

where x_v^{(k)} denotes the representation of protein v after the k-th GIN block, N(v) is the set of proteins adjacent to v, and ϵ is a learnable parameter. We denote the input protein representations of the first GIN block as x_i^{(0)} = x_i, i = 1, 2, ..., m.

After three GIN blocks, TGNN produces representations for all proteins. For an arbitrary query pair containing the i-th and j-th proteins, we use the concatenation operation to combine the representations x_i^{(3)} and x_j^{(3)}. A fully connected layer (FC) is employed as the classifier. The final vector \hat{y}_{ij} ∈ R^{1×c} of PPI presence probabilities is given by \hat{y}_{ij} = \mathrm{FC}\big(x_i^{(3)} \,\Vert\, x_j^{(3)}\big), where c denotes the total number of PPI types involved and ∥ denotes the concatenation operation.
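Putting the pieces of this section together, a compact PyTorch Geometric sketch of BGNN (Eqs. (1)–(3) with the SAG-pooling readout) and TGNN (Eq. (4) with three GIN blocks and the pairwise classifier) might look as follows. This is our reading of the architecture, not the released implementation; layer widths and the number of PPI types are illustrative.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, GINConv, SAGPooling, global_mean_pool

class BGNN(nn.Module):
    """Bottom view: two GCN blocks (Eqs. (2)-(3)) plus SAG-pooling/mean readout."""
    def __init__(self, theta: int = 7, d1: int = 128, d2: int = 256):
        super().__init__()
        self.gcn1, self.bn1 = GCNConv(theta, d1), nn.BatchNorm1d(d1)
        self.gcn2, self.bn2 = GCNConv(d1, d2), nn.BatchNorm1d(d2)
        self.pool = SAGPooling(d2)

    def forward(self, x, edge_index, batch):
        h = self.bn1(torch.relu(self.gcn1(x, edge_index)))   # H^(1), Eq. (2)
        h = self.bn2(torch.relu(self.gcn2(h, edge_index)))   # H^(2), Eq. (3)
        h, edge_index, _, batch, _, _ = self.pool(h, edge_index, batch=batch)
        return global_mean_pool(h, batch)                    # fixed-length x_i

class GINBlock(nn.Module):
    """One TGNN block: GIN layer with learnable epsilon (Eq. (4)) + ReLU + BN."""
    def __init__(self, d: int):
        super().__init__()
        mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
        self.gin, self.bn = GINConv(mlp, train_eps=True), nn.BatchNorm1d(d)

    def forward(self, x, edge_index):
        return self.bn(torch.relu(self.gin(x, edge_index)))

class TGNN(nn.Module):
    """Top view: three GIN blocks, then FC over concatenated pair embeddings."""
    def __init__(self, d2: int = 256, num_types: int = 7):  # c PPI types, assumed
        super().__init__()
        self.blocks = nn.ModuleList(GINBlock(d2) for _ in range(3))
        self.classifier = nn.Linear(2 * d2, num_types)       # y_hat_ij

    def forward(self, x_t, ppi_edge_index, pair_index):
        for block in self.blocks:                            # x^(1) ... x^(3)
            x_t = block(x_t, ppi_edge_index)
        i, j = pair_index                                    # query pairs (i, j)
        return self.classifier(torch.cat([x_t[i], x_t[j]], dim=1))
```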
16. Huttlin, E. L. et al. Architecture of the human interactome defines protein communities and disease networks. Nature 545, 505–509 (2017).
17. Kaboord, B. & Perr, M. Isolation of proteins and protein complexes by immunoprecipitation. Methods Mol. Biol. 424, 349–364 (2008).
18. Aronheim, A., Zandi, E., Hennemann, H., Elledge, S. J. & Karin, M. Isolation of an AP-1 repressor by a novel method for detecting protein-protein interactions. Mol. Cell. Biol. 17, 3094–3102 (1997).
19. Su, J. F., Huang, Z., Yuan, X. Y., Wang, X. Y. & Li, M. Structure and properties of carboxymethyl cellulose/soy protein isolate blend edible films crosslinked by Maillard reactions. Carbohydr. Polym. 79, 145–153 (2010).
20. Zhao, L., Wang, J., Hu, Y. & Cheng, L. Conjoint feature representation of GO and protein sequence for PPI prediction based on an inception RNN attention network. Mol. Ther.-Nucleic Acids 22, 198–208 (2020).
21. Renaud, N. et al. DeepRank: a deep learning framework for data mining 3D protein-protein interfaces. Nat. Commun. 12, 1–8 (2021).
22. Kovács, I. A. et al. Network-based prediction of protein interactions. Nat. Commun. 10, 1–8 (2019).
23. Nasiri, E., Berahmand, K., Rostami, M. & Dabiri, M. A novel link prediction algorithm for protein-protein interaction networks by attributed graph embedding. Computers Biol. Med. 137, 104772 (2021).
24. Lv, G., Hu, Z., Bi, Y. & Zhang, S. Learning unknown from correlations: graph neural network for inter-novel-protein interaction prediction. In 30th International Joint Conference on Artificial Intelligence (IJCAI). https://doi.org/10.48550/arXiv.2105.06709 (2021).
25. Kulmanov, M., Khan, M. A. & Hoehndorf, R. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier. Bioinformatics 34, 660–668 (2018).
26. Chen, M. et al. Multifaceted protein–protein interaction prediction based on Siamese residual RCNN. Bioinformatics 35, i305–i314 (2019).
27. Hsieh, Y. L., Chang, Y. C., Chang, N. W. & Hsu, W. L. In Proc. 8th International Joint Conference on Natural Language Processing Vol. 2 (Short Papers) 240–245 (Asian Federation of Natural Language Processing, 2017).
28. Saha, S. & Raghava, G. P. S. Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins: Struct., Funct., Bioinforma. 65, 40–48 (2006).
29. Gligorijević, V. et al. Structure-based protein function prediction using graph convolutional networks. Nat. Commun. 12, 1–14 (2021).
30. Jiménez, J. et al. DeepSite: protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics 33, 3036–3042 (2017).
31. Amidi, A. et al. EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation. PeerJ 6, e4750 (2018).
32. Tubiana, J., Schneidman-Duhovny, D. & Wolfson, H. J. ScanNet: an interpretable geometric deep learning model for structure-based protein binding site prediction. Nat. Methods 19, 1–10 (2022).
33. Goldberg, D. S. & Roth, F. P. Assessing experimentally derived interactions in a small world. Proc. Natl Acad. Sci. 100, 4372–4376 (2003).
34. Fouss, F., Pirotte, A., Renders, J. M. & Saerens, M. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans. Knowl. Data Eng. 19, 355–369 (2007).
35. Tong, H., Faloutsos, C. & Pan, J. Y. 6th International Conference on Data Mining (ICDM) (IEEE, 2006).
36. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations (ICLR). https://doi.org/10.48550/arXiv.1609.02907 (2017).
37. Xu, K., Hu, W., Leskovec, J. & Jegelka, S. How powerful are graph neural networks? In 7th International Conference on Learning Representations (ICLR). https://doi.org/10.48550/arXiv.1810.00826 (2019).
38. Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
39. Lee, J., Lee, I. & Kang, J. Self-attention graph pooling. In 36th International Conference on Machine Learning (ICML). https://doi.org/10.48550/arXiv.1904.08082 (2019).
40. Zheng, S., Li, Y., Chen, S., Xu, J. & Yang, Y. Predicting drug–protein interaction using quasi-visual question answering system. Nat. Mach. Intell. 2, 134–140 (2020).
41. Wong, L., You, Z. H., Li, S., Huang, Y. A. & Liu, G. Detection of protein–protein interactions from amino acid sequences using a rotation forest model with a novel PR-LPQ descriptor. Adv. Intell. Syst. Comput. https://doi.org/10.1007/978-3-319-22053-6_75 (2015).
42. Park, Y. & Marcotte, E. M. Flaws in evaluation schemes for pair-input computational predictions. Nat. Methods 9, 1134–1136 (2012).
43. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
44. Yang, C., Rangarajan, A. & Ranka, S. Visual explanations from deep 3D convolutional neural networks for Alzheimer's disease classification. AMIA Annu. Symp. Proc. 2018, 1571–1580 (2018).
45. Ming, Y. et al. Understanding hidden memories of recurrent neural networks. In 2017 IEEE Conference on Visual Analytics Science and Technology (VAST). https://doi.org/10.48550/arXiv.1710.10777 (2017).
46. Fernandes, J. & Gattass, C. R. Topological polar surface area defines substrate transport by multidrug resistance associated protein 1 (MRP1/ABCC1). J. Medicinal Chem. 52, 1214–1218 (2009).
47. Hu, Z., Ma, B., Wolfson, H. & Nussinov, R. Conservation of polar residues as hot spots at protein interfaces. Proteins: Struct., Funct., Bioinforma. 39, 331–342 (2000).
48. Young, L., Jernigan, R. & Covell, D. A role for surface hydrophobicity in protein-protein recognition. Protein Sci. 3, 717–729 (1994).
49. Korn, A. P. & Burnett, R. M. Distribution and complementarity of hydropathy in multisubunit proteins. Proteins: Struct., Funct., Bioinforma. 9, 37–55 (1991).
50. Katz, L. A new status index derived from sociometric analysis. Psychometrika 18, 39–43 (1953).
51. Zhou, T., Lv, L. & Zhang, Y. C. Predicting missing links via local information. Eur. Phys. J. B 71, 623–630 (2009).
52. Barabási, A. L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
53. Jeh, G. & Widom, J. Proc. 8th International Conference on Knowledge Discovery and Data Mining (ACM, 2002).
54. De Meo, P., Ferrara, E., Fiumara, G. & Provetti, A. Generalized Louvain method for community detection in large networks. In 11th International Conference on Intelligent Systems Design and Applications (ISDA). https://doi.org/10.1109/ISDA.2011.6121636 (2011).
55. Krissinel, E. & Henrick, K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. 372, 774–797 (2007).
56. Porter, C. T., Bartlett, G. J. & Thornton, J. M. The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res. 32, D129–D133 (2004).
57. Selvaraju, R. R. et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). https://doi.org/10.1007/s11263-019-01228-7 (2017).
58. Li, J. et al. Semi-supervised graph classification: a hierarchical graph perspective. In 2019 The World Wide Web Conference (WWW). https://doi.org/10.48550/arXiv.1904.05003 (2019).
59. Boutet, E., Lieberherr, D., Tognolli, M., Schneider, M. & Bairoch, A. UniProtKB/Swiss-Prot 89–112 (Humana Press, 2007).
60. Berman, H. M. et al. The protein data bank. Nucleic Acids Res. 28, 235–242 (2000).
61. Landrum, G., Tosco, P. & Kelley, B. rdkit/rdkit: 2021_09_4 (Q3 2021) Release. https://zenodo.org/record/5835217#.Y_JocB9Bzcs (2022).
62. Ying, Z., Bourgeois, D., You, J., Zitnik, M. & Leskovec, J. GNNExplainer: generating explanations for graph neural networks. In 33rd Advances in Neural Information Processing Systems (NeurIPS). https://doi.org/10.48550/arXiv.1903.03894 (2019).

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41467-023-36736-1.

Correspondence and requests for materials should be addressed to Yong Huang or Jia Li.

Peer review information Nature Communications thanks Pufeng Du and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.