ProRMSD Documentation
a tool for automatic atom matching and RMSD calculation
ProRMSD calculates the root-mean-square deviation (RMSD) between a reference molecular structure and different conformations of the same molecule, usually the structures of the molecule predicted by a docking program.
The user can submit the input files in different chemical file formats: SDF, PDB, MOL2, MOL.
The two molecules may contain different numbers of hydrogens, because the RMSD is calculated only on heavy atoms.
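As an illustration of a heavy-atom RMSD, the following minimal sketch drops hydrogens and then computes the RMSD over the remaining atoms. It is not ProRMSD's implementation: the function name `heavy_atom_rmsd` and the `(element, x, y, z)` tuple layout are assumptions made for this example, the atoms are assumed to be listed in the same order in both structures, and no superposition is performed.

```python
import math

def heavy_atom_rmsd(reference, pose):
    """RMSD over the non-hydrogen atoms of two equally ordered atom lists.

    Each atom is a tuple (element, x, y, z). Hydrogens are dropped first,
    so the two inputs may contain different numbers of hydrogens.
    """
    ref_heavy = [a for a in reference if a[0].upper() != "H"]
    pose_heavy = [a for a in pose if a[0].upper() != "H"]
    if len(ref_heavy) != len(pose_heavy) or not ref_heavy:
        raise ValueError("heavy-atom counts differ or are zero")
    squared = 0.0
    for (_, x1, y1, z1), (_, x2, y2, z2) in zip(ref_heavy, pose_heavy):
        squared += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(squared / len(ref_heavy))

# Hypothetical coordinates in angstroms: the pose is shifted by 1 Å along z
# and contains no hydrogens at all.
reference = [("C", 0.0, 0.0, 0.0), ("O", 1.2, 0.0, 0.0), ("H", -0.5, 0.9, 0.0)]
pose = [("C", 0.0, 0.0, 1.0), ("O", 1.2, 0.0, 1.0)]
print(heavy_atom_rmsd(reference, pose))  # 1.0
```

The hydrogen in the reference has no counterpart in the pose, yet the calculation still works because only the two heavy atoms are compared.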
Because ProRMSD also performs atom matching, different structures of the same molecule do not need to list their atoms in an equivalent order. ProRMSD can calculate the RMSD even when the atoms are labelled arbitrarily.
However, when an input file contains multiple structures, all of them must be indexed in the same way (this is usually the case with docking outputs).
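To see why atom matching matters, the toy sketch below compares a reference with a pose whose atoms are listed in a different order. This document does not describe the matching procedure used by ProRMSD, so the code is only an assumption-laden illustration: it tries every pairing that preserves the chemical element and keeps the one with the lowest RMSD. It ignores bonds and is feasible only for a handful of atoms. All names and coordinates are hypothetical.

```python
import itertools
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long, equally ordered coordinate lists."""
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def best_element_matching(reference, pose):
    """Brute-force search over element-preserving atom pairings.

    Atoms are (element, (x, y, z)) tuples. Returns (lowest RMSD, order),
    where order[i] is the pose index paired with reference atom i.
    """
    ref_elements = [element for element, _ in reference]
    ref_coords = [xyz for _, xyz in reference]
    best = (math.inf, None)
    for order in itertools.permutations(range(len(pose))):
        if [pose[i][0] for i in order] != ref_elements:
            continue  # this pairing would match different elements
        value = rmsd(ref_coords, [pose[i][1] for i in order])
        if value < best[0]:
            best = (value, order)
    return best

reference = [("C", (0.0, 0.0, 0.0)), ("C", (1.5, 0.0, 0.0)), ("O", (0.0, 1.2, 0.0))]
# Same geometry, but the atoms are written in a different order.
shuffled = [("O", (0.0, 1.2, 0.0)), ("C", (1.5, 0.0, 0.0)), ("C", (0.0, 0.0, 0.0))]

naive = rmsd([xyz for _, xyz in reference], [xyz for _, xyz in shuffled])
matched, order = best_element_matching(reference, shuffled)
print(naive > 0.0, matched, order)  # True 0.0 (2, 1, 0)
```

Pairing atoms by file position reports a non-zero deviation for two identical geometries, while the matched pairing correctly returns 0.0.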
The user must choose the RMSD calculation mode.
How to choose the calculation mode
The RMSD calculated on all heavy atoms of the molecule is the most commonly used calculation mode, but some specific cases may require a different type of calculation.
The “backbone RMSD” is useful for peptides, where it can be informative to also measure the deviation of the backbone alone, leaving out the highly flexible side chains. If the side chains are included, the standard RMSD can be very high even when the core structure of the ligand has been placed accurately.
Inputs
ProRMSD requires two different files:
- a file containing a molecular structure chosen as reference, e.g., the ligand of the crystallised structure
- a file containing one or more molecular structures, e.g., the file generated by a docking program with the predicted poses of the ligand
The two inputs must be in the same file format (maximum file size: 50 MB). The inputs are handled by OpenBabel, which converts them into the SDF format read by ProRMSD. The user is therefore encouraged to submit SDF files directly, to avoid the connectivity problems that can be present in PDB files.
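Users who prefer to prepare SDF files themselves before submission can run Open Babel locally. The sketch below assumes the `obabel` command-line program is installed and on the PATH; the helper names and the file name `poses.mol2` are hypothetical. It uses the standard `obabel <input> -O <output>` form, in which the formats are inferred from the file extensions.

```python
import subprocess
from pathlib import Path

def obabel_to_sdf_command(source):
    """Build the obabel command that converts `source` into an SDF file
    with the same base name (formats are inferred from the extensions)."""
    source = Path(source)
    target = source.with_suffix(".sdf")
    return ["obabel", str(source), "-O", str(target)]

def convert_to_sdf(source):
    """Run the conversion; requires the obabel executable on the PATH."""
    command = obabel_to_sdf_command(source)
    subprocess.run(command, check=True)
    return command[-1]  # path of the SDF file that was written

# Only the command is printed here, so the example runs without Open Babel.
print(obabel_to_sdf_command("poses.mol2"))
# ['obabel', 'poses.mol2', '-O', 'poses.sdf']
```

After converting from PDB in particular, it is worth inspecting the bond orders in the resulting SDF file, since PDB files may not carry complete connectivity information.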
The user must submit two identical molecules:
- the two files must both contain either aromatic structures or Kekulé structures
- pay attention to the bond orders, especially if the file was converted from the PDB format
Calculation modes
ProRMSD can execute 3 different types of RMSD calculation:
- RMSD: calculates RMSD on all heavy atoms of the molecule
- Backbone RMSD: calculates RMSD only on the molecular backbone (with this option all the side chains will not be considered)
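The difference between the two modes listed above can be sketched with a single hypothetical residue. How ProRMSD identifies the backbone is not stated in this document; the example assumes the common PDB peptide backbone atom names N, CA, C and O, and all coordinates are invented for illustration.

```python
import math

# Assumption: PDB-style peptide backbone atom names.
BACKBONE_NAMES = {"N", "CA", "C", "O"}

def select(atoms, backbone_only=False):
    """Return a name -> coordinates dict, optionally limited to the backbone."""
    return {name: xyz for name, xyz in atoms.items()
            if not backbone_only or name in BACKBONE_NAMES}

def rmsd(reference, pose):
    """RMSD over atoms paired by name; both dicts must share the same keys."""
    total = sum(math.dist(reference[name], pose[name]) ** 2 for name in reference)
    return math.sqrt(total / len(reference))

reference = {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0), "C": (2.0, 1.4, 0.0),
             "O": (3.2, 1.6, 0.0), "CB": (1.9, -1.0, 1.0)}
# Same backbone, but the side-chain atom CB is displaced by 3 Å along z.
pose = dict(reference, CB=(1.9, -1.0, 4.0))

all_atoms = rmsd(select(reference), select(pose))
backbone = rmsd(select(reference, True), select(pose, True))
print(round(all_atoms, 2), backbone)  # 1.34 0.0
```

A single displaced side-chain atom raises the all-atom value, while the backbone value shows that the core of the residue is placed exactly.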
Output
The output is the numeric RMSD value in angstroms (Å). Each predicted molecular structure returns its own RMSD value when compared to the reference. If the output contains several RMSD values, one for each predicted pose compared with the reference pose, they are listed in the same order as the poses in the submitted input.
How to interpret the results
RMSD is used to benchmark docking programs by their ability to reproduce the pose of a ligand from a known experimental protein-ligand complex structure, in a process called “re-docking”. When comparing the predicted pose with the reference pose of the ligand, the lower the RMSD, the more accurate the prediction. A prediction is typically regarded as successful when the RMSD is below 2 Å, whereas high RMSD values suggest an inaccurate pose prediction. However, comparing docking methods is not straightforward, since some methods perform better with certain classes of ligands and targets; a good practice is therefore to try several docking methods to determine the best one for the specific problem. Alternative measures of success can also be considered, such as whether the correct ligand-target interactions are recovered.
Tips for beginners
If the RMSD value is high for a protein or a peptide with very flexible side chains, the user should try calculating the “backbone RMSD”.
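The 2 Å criterion can be applied to a list of RMSD values as in the minimal sketch below. The function name `summarise_poses` and the example values are hypothetical. Because the values are returned in input order, the index of the lowest value identifies the corresponding pose in the submitted file.

```python
def summarise_poses(rmsd_values, threshold=2.0):
    """Label each pose against the threshold (in angstroms).

    Returns (labels, success rate, index of the lowest RMSD). The index
    refers to the pose order in the submitted input file.
    """
    labels = ["success" if v < threshold else "failure" for v in rmsd_values]
    rate = labels.count("success") / len(labels) if labels else 0.0
    best = min(range(len(rmsd_values)), key=rmsd_values.__getitem__) if rmsd_values else None
    return labels, rate, best

# Hypothetical RMSD values for four docking poses, in input order.
values = [0.85, 1.97, 2.40, 6.13]
labels, rate, best = summarise_poses(values)
print(labels)      # ['success', 'success', 'failure', 'failure']
print(rate, best)  # 0.5 0
```

The threshold is a parameter rather than a constant because, as noted above, 2 Å is a convention and other measures of success may suit a given problem better.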