Using Energy Centrality Relationship (ECR) to identify and predict Functionally-Linked Interacting Proteins (FLIPs)
Interacting networks of proteins are responsible for a multitude of biological functions. These Functionally-Linked Interacting Proteins (FLIPs) interact at specific interfaces. It is therefore important to distinguish them from Functionally uncorrelated Contacts (FunCs). Here we use geometric, energetic, and sequence conservation characteristics of the interface to identify factors that may contribute to an interface being a FLIP or a FunC. We studied these interface properties by analyzing FLIPdb, a protein database we created that contains proteins belonging to various functional sub-categories. In our approach, which we term the Energy Centrality Relationship (ECR), we coupled Kortemme and Baker's computational alanine scanning, used to estimate the energetic sensitivity of each amino acid at the center of the interface, with geometric features of the interface. Principal Component Analysis (PCA) and K-means clustering on FLIPdb distinguished FLIPs from FunCs with an accuracy of 76%. To investigate whether evolutionary pressure plays a role in maintaining FLIPs, similar analyses were carried out on a set of 154 interfaces. For these interfaces we used Lichtarge's Evolutionary Trace (ET) method to calculate the ET score (ρ) and alignment variability (number of states) of residues within various types of interfaces. Using PCA and K-means clustering, we were able to distinguish FLIPs from FunCs with an accuracy of 69%. We also tested ECR's ability to identify near-native (≤ 5 Å RMSD) poses in a docking run, since a common problem in molecular docking is the generation of a large number of false positives. The ECR methodology predicted near-native poses in 50% of the cases, an increase of 9% relative to HEX (a well-known docking software package) alone. Overall, we found that FLIPs have a stronger central organizing tendency than FunCs.
Although FLIPs also show more conservation at the core than at the edges, they exhibit more overall variability than FunCs, suggesting that energy is conserved at the expense of sequence stability. Finally, we indicate how our ECR method may be used to reduce false-positive predictions in docking calculations.
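
As an illustration of the classification step described above, the following is a minimal sketch of PCA followed by two-cluster K-means, with accuracy scored against known labels. It is not the pipeline used in this work: the function names (`pca_project`, `kmeans`, `two_cluster_accuracy`), the four unnamed feature columns, and all numbers in the demo are assumptions, and the data are synthetic rather than drawn from FLIPdb.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Standardize features and project onto the leading principal axes."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ vt[:n_components].T


def kmeans(X, k=2, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster id per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def two_cluster_accuracy(labels, truth):
    """Cluster ids are arbitrary, so score both possible id-to-class mappings."""
    agree = np.mean(labels == truth)
    return max(agree, 1.0 - agree)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic stand-ins for per-interface features (hypothetical values).
    flip_like = rng.normal(loc=1.0, scale=1.0, size=(60, 4))
    func_like = rng.normal(loc=-1.0, scale=1.0, size=(60, 4))
    X = np.vstack([flip_like, func_like])
    truth = np.array([1] * 60 + [0] * 60)

    labels = kmeans(pca_project(X), k=2)
    print("clustering accuracy:", two_cluster_accuracy(labels, truth))
```

Because K-means is unsupervised, the cluster identifiers carry no class meaning; the accuracy function therefore takes the better of the two possible mappings, which is one simple way an accuracy figure can be attached to a two-cluster result.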