BindML

Software for Predicting and Classifying Protein-Protein Interaction Sites


Description:

BindML+ predicts permanent and transient protein-protein interface residues of a given protein structure using information from the multiple sequence alignment (MSA) of its protein family [1]. BindML+ is an extension of the earlier method, BindML [2]. It uses amino acid substitution patterns found in strong and weak protein complexes to predict the interface type. Permanent interfaces are defined as those of protein complexes with nanomolar dissociation constants, while transient interfaces are those of protein-protein interactions with micromolar dissociation constants.

Method:

[Figure: BindML]

Input:

PDB Structure and MSA submission:

The main BindML submission page (http://kiharalab.org/bindml/) requests four input fields:

1. Email (A link to the results on the server will be sent to this address once your submission has been processed.)
2. PDB File (Standard PDB format)
3. PDB Chain ID (For example, chain 'A' or chain 'B'. If your PDB file has no chain IDs, use the underscore "_" instead.)
4. Multiple Sequence Alignment File (This must be in FASTA format, and the sequence of your PDB structure must be included in the submitted MSA.) If the MSA field is left empty, the server will try the PFAM-A, PFAM-B and HMMER databases [4], in that order, to retrieve family sequences, and will automatically generate the MSA (including the sequence of your input PDB file) with MUSCLE [5,6].
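
Because the submitted MSA must contain the sequence of the PDB chain, it can help to check this locally before uploading. The following is a minimal sketch, not part of BindML: the function names are hypothetical, only the 20 standard amino acids are translated, and treating "_" as a blank chain column is an assumption about how the server reads chainless files.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequence(pdb_lines, chain_id):
    """Return the one-letter sequence of the ATOM residues in one chain."""
    # Assumption: "_" stands for a blank chain column in the PDB file.
    wanted = " " if chain_id == "_" else chain_id
    seq = []
    last_key = None
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        if line[21] != wanted:           # chain ID, PDB column 22
            continue
        key = line[22:27]                # residue number + insertion code
        if key != last_key:              # first atom of a new residue
            seq.append(THREE_TO_ONE.get(line[17:20], "X"))
            last_key = key
    return "".join(seq)


def read_fasta(lines):
    """Parse FASTA text lines into a dict of header -> aligned sequence."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def msa_contains(msa, query):
    """Return the header of the first MSA row equal to query once gaps are removed, else None."""
    for name, aligned in msa.items():
        ungapped = aligned.replace("-", "").replace(".", "").upper()
        if ungapped == query:
            return name
    return None
```

To use it, pass the lines of your PDB file and your FASTA file (for example, from `open(path).read().splitlines()`) and confirm that `msa_contains(read_fasta(fasta_lines), chain_sequence(pdb_lines, "A"))` is not `None`. The check requires an exact match, so a chain with missing residues or modified amino acids may fail even when the MSA is acceptable.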

Output:

The interactive BindML output consists of an integrative structural-level view and a residue-level view with associated prediction scores.

On the left panel, the structural view displays the submitted PDB structure colored by BindML predictions. BindML scores range from negative to positive values and are colored on a spectrum from red to blue, respectively. More negative scores (red) are stronger, higher-confidence predictions, while more positive scores (blue) are weaker predictions.

The right panel shows the scores of all surface residue predictions. High-confidence scores are colored in red. Clicking on a one-letter residue name highlights the location of that residue (shown as a red sphere) in the structural panel on the left. In addition to the interface prediction, each residue is classified as either permanent or transient according to its tL Z-score. tL Z-scores greater than or equal to zero correspond to permanent PPI site predictions, while tL Z-scores below zero correspond to transient PPI site predictions. The residue dL-scores, dL Z-scores, tL-scores and tL Z-scores are listed in a "wrapped" manner: they are displayed in sequence order from left to right and continue from the top to the bottom of the screen.
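
The permanent/transient rule described above can be written as a one-line test on the tL Z-score. The sketch below is illustrative only: the function name and the residue values are hypothetical, not BindML output.

```python
def classify_site(tl_z):
    """Apply the stated rule: tL Z-score >= 0 is permanent, below 0 is transient."""
    return "permanent" if tl_z >= 0 else "transient"


# Hypothetical (residue, residue number, tL Z-score) rows for illustration.
residues = [("A", 12, 1.30), ("K", 13, -0.40), ("G", 14, 0.00)]

for aa, number, tl_z in residues:
    print(f"{aa}{number}\t{tl_z:+.2f}\t{classify_site(tl_z)}")
```

Note that a score of exactly zero falls on the permanent side of the threshold.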

Furthermore, a PDB file in which the dL Z-scores and tL Z-scores are mapped to the B-factor and occupancy fields of the protein structure can be downloaded from the link above the left and right panels.
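
The scores in the downloaded PDB file can be read back from the fixed-width occupancy and B-factor columns of the ATOM records. This is a minimal sketch under stated assumptions: the function name is hypothetical, every atom of a residue is assumed to carry the same value (so only the first atom is read), and the code returns both columns without deciding which one holds the dL Z-score and which the tL Z-score.

```python
def per_residue_columns(pdb_lines, chain_id):
    """Map (residue number, insertion code) -> (occupancy, B-factor) for one chain."""
    values = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        if line[21] != chain_id:                 # chain ID, PDB column 22
            continue
        key = (int(line[22:26]), line[26].strip())
        if key not in values:                    # keep the first atom of each residue
            occupancy = float(line[54:60])       # PDB columns 55-60
            b_factor = float(line[60:66])        # PDB columns 61-66
            values[key] = (occupancy, b_factor)
    return values
```

Call it with the lines of the downloaded file and the chain ID you submitted, then look up a residue by its number and insertion code, for example `scores[(42, "")]`.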

References:

1. La, D., Kong, M., Hoffman, W., Choi, Y. I., & Kihara, D. (2013). Predicting permanent and transient protein-protein interfaces. Proteins, 81(5), 805–818. doi:10.1002/prot.24235
2. La, D., & Kihara, D. (2011). A novel method for protein-protein interaction site prediction using phylogenetic substitution models. Proteins, 80(1), 126–141. doi:10.1002/prot.23169
3. Guindon, S., & Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 52(5), 696–704.
4. Finn, R. D., Mistry, J., Tate, J., Coggill, P., Heger, A., Pollington, J. E., et al. (2009). The Pfam protein families database. Nucleic Acids Research, 38(Database), D211–D222. doi:10.1093/nar/gkp985
5. Edgar, R. C. (2004a). MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 113. doi:10.1186/1471-2105-5-113
6. Edgar, R. C. (2004b). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792–1797. doi:10.1093/nar/gkh340


© Purdue University, 2010.  All rights reserved.