Algorithm Overview
BindUP-Alpha is an automated server that predicts DNA- and RNA-binding proteins from the three-dimensional structure of the protein (or a structural model). The prediction is based on the electrostatic patches on the protein surface and does not rely on either sequence or structural homology. In addition to providing a functional prediction (i.e. whether the protein binds nucleic acids or not), the server displays the largest positive and/or negative electrostatic patches on the protein surface, as predicted by our Patch Finder algorithm.
BindUP-Alpha holds information for all the protein structures in the RCSB PDB database and is updated every month. The server can be run in either single or batch mode and can test hundreds of proteins simultaneously in a highly efficient manner.
The Patch Finder algorithm
The Patch Finder algorithm automatically assigns positive and negative surface patches by looking for adjacent points on the protein surface that meet a given electrostatic potential cutoff (2 kT/e for positive patches, -2 kT/e for negative patches).
The algorithm consists of five major steps:
  1. Calculating the electrostatic potential of the protein on a three-dimensional grid, using the Poisson-Boltzmann equation.
  2. Defining the grid points that fall closest to the protein surface while omitting all non-surface points.
  3. Extracting all three-dimensional patches of adjacent grid points which meet the defined cutoff.
  4. Choosing the largest positive patch for each protein chain.
  5. Assigning the protein residues related to the patch.
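Steps 3 and 4 of the list above can be illustrated with a minimal sketch. It is not the Patch Finder implementation: the function names, the dictionary-based grid representation, the 26-neighbour definition of "adjacent" and the use of "greater than or equal to" for the cutoff are all assumptions made for illustration. Only the 2 kT/e cutoff comes from the description above.

```python
from collections import deque
from itertools import product

# Assumption: two grid points are "adjacent" if they differ by at most one
# index step along each axis (26-connectivity). The text only says "adjacent".
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def find_patches(surface_potential, cutoff=2.0):
    """Group surface grid points with potential >= cutoff into connected patches.

    surface_potential maps (i, j, k) grid indices of surface points to the
    electrostatic potential in kT/e. Returns a list of sets, largest first.
    """
    selected = {p for p, phi in surface_potential.items() if phi >= cutoff}
    patches = []
    while selected:
        seed = selected.pop()
        patch = {seed}
        queue = deque([seed])
        # Breadth-first flood fill over the selected neighbouring grid points.
        while queue:
            i, j, k = queue.popleft()
            for di, dj, dk in NEIGHBOURS:
                neighbour = (i + di, j + dj, k + dk)
                if neighbour in selected:
                    selected.remove(neighbour)
                    patch.add(neighbour)
                    queue.append(neighbour)
        patches.append(patch)
    patches.sort(key=len, reverse=True)
    return patches


def largest_positive_patch(surface_potential, cutoff=2.0):
    """Return the largest patch at or above the cutoff, or an empty set."""
    patches = find_patches(surface_potential, cutoff)
    return patches[0] if patches else set()


# Toy surface grid (hypothetical values, in kT/e).
toy_grid = {
    (0, 0, 0): 2.5, (1, 0, 0): 3.1, (2, 0, 0): 2.0,  # three adjacent points
    (3, 0, 0): 0.4,                                  # below the cutoff
    (5, 0, 0): 4.0,                                  # isolated positive point
    (5, 5, 5): -3.0,                                 # negative point
}
toy_patches = find_patches(toy_grid)
toy_largest = largest_positive_patch(toy_grid)
```

In this toy grid the three adjacent points along the first axis form the largest positive patch, while the isolated point at (5, 0, 0) forms a separate patch of size one. A negative patch could be found the same way by negating the potentials before the call.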
As a preparation step for the electrostatic calculation, hydrogen atoms are assigned to the structure using the PDB2PQR package.
The electrostatic potential is calculated using the APBS software. The grid spacing is set to 1 Å and all other parameters are left at their default values.
The molecular surface is computed using the open source program DMS.
A detailed description of the algorithm is found in Shazman et al., 2007 and Stawiski et al., 2003.
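The final step of Patch Finder, assigning protein residues to a patch, could be sketched as a simple distance test. This is an illustration only: the function name, the input layout and in particular the 3.0 Å distance threshold are hypothetical and are not taken from the published algorithm.

```python
import math


def residues_near_patch(patch_points, atoms, max_distance=3.0):
    """Return residues with at least one atom close to any patch point.

    patch_points: iterable of (x, y, z) coordinates in Å.
    atoms: iterable of (chain_id, residue_number, (x, y, z)) tuples.
    max_distance: hypothetical threshold in Å (an assumption, not a
    documented Patch Finder parameter).
    """
    points = list(patch_points)
    residues = set()
    for chain_id, residue_number, xyz in atoms:
        if any(math.dist(xyz, point) <= max_distance for point in points):
            residues.add((chain_id, residue_number))
    return sorted(residues)


# Toy input (hypothetical coordinates).
toy_atoms = [
    ("A", 10, (0.0, 0.0, 0.0)),
    ("A", 10, (1.5, 0.0, 0.0)),
    ("A", 11, (10.0, 0.0, 0.0)),  # too far from the patch
    ("B", 5, (0.0, 2.0, 0.0)),
]
toy_patch_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_residues = residues_near_patch(toy_patch_points, toy_atoms)
```

Each residue is reported once even if several of its atoms lie near the patch, which is why the result is collected in a set before sorting.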
The NAbind algorithm
The algorithm for predicting DNA- and RNA-binding proteins (NAbind) is a structure-based predictor that does not rely on homology. NAbind uses information from the largest positive and negative electrostatic patches on the protein surface, together with other features extracted from the protein structure, to distinguish nucleic acid-binding proteins (specifically DNA-binding and RNA-binding proteins) from proteins that do not bind nucleic acids. The method uses a Support Vector Machine (SVM) classifier (GIST) that is trained on an ensemble of structural and electrostatic features extracted from the protein surface and the electrostatic patches. A detailed description of the algorithm is found in Shazman et al., 2008.
NAbind was recently used to predict DNA-binding and RNA-binding proteins, achieving high accuracy.