- I keep getting errors that my file has unknown residues. What should I do?
The problem is that some records in your PDB file are marked as ATOM but do not belong to one of the
20 standard amino acids or an RNA base. Some programs will write HETATM entries as ATOM records.
You can edit the file to remove these residues or change them back to HETATM records.
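If you would rather not edit the file by hand, a short script can do the conversion. Here is a minimal Python sketch, not part of ClusPro: the residue list covers the 20 standard amino acids and the four RNA bases, and the file names are placeholders you should replace with your own.

standard = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
    "A", "C", "G", "U",
}

with open("model.pdb") as src, open("model.fixed.pdb", "w") as dst:
    for line in src:
        # The residue name occupies columns 18-20 of ATOM/HETATM records.
        if line.startswith("ATOM") and line[17:20].strip() not in standard:
            # "HETATM" is exactly six characters, so columns stay aligned.
            line = "HETATM" + line[6:]
        dst.write(line)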
- I see four different choices for my docking results, "Balanced", "Electrostatic-favored", and so on. Which one should I choose?
We provide many different options for docking because we believe good results go hand-in-hand with
experimental knowledge of the complex. If you don't have any prior knowledge of what forces dominate
in your complex, we recommend using the balanced coefficients. If your complex is antibody-antigen, we recommend
using our antibody mode.
- I only see my receptor in my results. Where is the ligand?
This is probably due to your molecular viewer. Some molecular viewers do not support multiple PDB entries in one file.
There are two ways to proceed. The first is to switch to a molecular viewer, such as
PyMOL, that supports multiple entries in a single file. The second is to split the model file into receptor and ligand and load them independently
into your viewer. On Linux and Mac OS X, you can do this by calling
file=model.000.00.pdb; csplit --prefix=${file/pdb/} --suffix-format="%02d.pdb" $file %HEADER% /HEADER/
substituting your chosen model for model.000.00.pdb. This should give you two files, model.000.00.00.pdb and model.000.00.01.pdb, containing the receptor and ligand respectively.
On Windows, you can open the file in Notepad or
Notepad++, search for the END record in the middle of the file, and manually copy
the two halves into separate files. Alternatively, simply removing the lines that say END may allow your viewer to load both the receptor and ligand into one object. On Linux and
Mac OS X, this can be done by calling
grep -Ev '^(HEADER|END)' model.000.00.pdb > model.000.00.stripped.pdb
On Windows, you can remove those lines manually in
one of the text editors mentioned above.
(If you have a simple way to do either of these that is built into Windows, we would love to hear about it.)
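In the meantime, a few lines of Python will do either job on any platform, including Windows. This is only a sketch, assuming the model file contains exactly two entries, each introduced by a HEADER record; the file names are placeholders.

lines = open("model.000.00.pdb").readlines()

# Split into entries: each HEADER record starts a new entry.
entries, current = [], []
for line in lines:
    if line.startswith("HEADER") and current:
        entries.append(current)
        current = []
    current.append(line)
entries.append(current)

# Write the receptor and the ligand to separate files...
for i, entry in enumerate(entries):
    with open("model.000.00.%02d.pdb" % i, "w") as out:
        out.writelines(entry)

# ...or strip HEADER and END records so both load as one object.
with open("model.000.00.stripped.pdb", "w") as out:
    out.writelines(l for l in lines
                   if not l.startswith(("HEADER", "END")))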
- What is Piper and what is ClusPro? How does this version differ from the previous ClusPro?
Piper is the FFT-based rigid docking program developed in our lab. It provides 1000 low-energy results to our clustering program, ClusPro, which
attempts to find the native site under the assumption that it will lie in a wide free-energy attractor containing the largest number of results.
The previous version of ClusPro used a similar clustering algorithm, but obtained 2000 results from other docking programs, not Piper.
- What are these Model Scores? Should I use them to rank my results?
We provide the Piper scores for our models only because a large number of people have asked for them.
Our experience is that the best way to rank models is by cluster size, which is how ClusPro ranks them.
This is the method we have used with great success in CAPRI and on various protein docking benchmarks.
As a brief explanation, the way ClusPro works is:
- We rotate the ligand through 70,000 rotations. For each rotation, we
translate the ligand in x, y, and z relative to the receptor on a grid,
and keep the translation with the best score for that rotation.
- Of the 70,000 rotations, we choose the 1000 rotation/translation
combinations with the lowest scores.
- We greedily cluster these 1000 ligand positions using a 9
angstrom C-alpha RMSD radius: the ligand position with the most
"neighbors" within 9 angstroms becomes a cluster center, and its
neighbors become the members of that cluster. These are then removed
from the set, we look for the next cluster center, and so on (see the
sketch after this list).
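To make the clustering step concrete, here is a minimal Python sketch of the greedy clustering. It assumes a precomputed pairwise C-alpha RMSD matrix for the 1000 retained positions; the function name and the matrix are illustrative, not ClusPro's actual code.

import numpy as np

def greedy_cluster(rmsd, radius=9.0):
    """Greedily cluster ligand positions given a pairwise RMSD matrix.

    rmsd: (n, n) symmetric array of C-alpha RMSDs between positions.
    Returns a list of (center_index, member_indices) tuples.
    """
    n = rmsd.shape[0]
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Neighbors of i are the unassigned positions within the radius
        # (i counts as its own neighbor, since rmsd[i, i] == 0).
        neighbors = {i: [j for j in unassigned if rmsd[i, j] <= radius]
                     for i in unassigned}
        # The position with the most neighbors becomes the next center.
        center = max(neighbors, key=lambda i: len(neighbors[i]))
        members = neighbors[center]
        clusters.append((center, members))
        unassigned -= set(members)
    return clusters

Cluster size, the quantity we rank by, is simply the number of members of each cluster.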
Note that in step 1, we sample around 10^9 positions of the
ligand relative to the receptor. From these 10^9, we choose 1000, or
10^3, positions. That means these 1000 are in the top millionth of all
positions of the ligand relative to the receptor. At this level, the
scoring function is too rough to discriminate meaningfully between
these 1000. The scoring function's purpose is to pull them out of the
10^9 positions we started from.
In summary, we strongly encourage you not to judge models by these
scores, because that is not what the scoring function was designed for.
- I want to submit a lot of jobs. Is there a way I can do that?