MMTSB
Tool Set Documentation

Convpdb.pl

Revision as of 22:47, 13 September 2010 by Feig (talk | contribs)

Usage

usage:   convpdb.pl [options] [PDBfile]
options: [-center] [-translate dx dy dz] [-orient]
         [-rotate m11 m12 m13 m21 m22 m23 m31 m32 m33]
         [-rotatex phi] [-rotatey phi] [-rotatez phi]
         [-biomt num] [-smtry num]
         [-scale factor] [-diff PDBfile] [-difflsqfit] [-add PDBFile]
         [-nmode file amplitude weight]
         [-nmodesample file prefix from to delta] [-skipzero]
         [-sel list] [-exclude list]
         [-chain id] [-model num] [-firstmodel] [-nohetero]
         [-selseq abbrev]
         [-nsel Selection]
         [-merge pdbfile]
         [-renumber start] [-addres value]
         [-renumwatersegs]
         [-match pdbfile]
         [-setchain id] [-setseg id] [-setall]
         [-readseg] [-chainfromseg] [-splitseg] [-alternate]
         [-charmm19] [-amber]
         [-out charmm19 | charmm22 | amber | generic]
         [-genres]
         [-crd] [-crdext] [-crdinp]
         [-segnames]
         [-fixcoo]
         [-ssbond res1:res2[=res1:res2]] [-nossbond]
         [-solvate] [-cutoff value] [-solvcut value]
         [-octahedron] [-cubic]
         [-ions NAME:num[=NAME:num]]
         [-replace PDB:num]
         [-info] [-listseg] [-residues] [-rescount]
         [-fill inx:seq]
         [-mol2]
         [-cleanaux]
         [-setaux1 value] [-setaux2 value]
         [-removeclashes] [-clashes] [-clashcut value]
         [-wrap boxx boxy boxz] [-by chain|atom|system]
         [-reimage cx cy cz]


Description

Converts and manipulates a PDB file containing a protein structure. The input is read from standard input or from a file given as a command line argument. With the option -renumber, the residues are renumbered continuously starting from a given number. Alternatively, the option -addres adds a constant to every residue number, which is useful when residues are missing in the PDB file and continuous renumbering would not be desirable. As a third option, the residue numbering may be adjusted to match the numbering in a reference PDB file given with -match by searching for the best sequence match.
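The difference between continuous renumbering and a constant offset can be illustrated with a short Python sketch. This is not the script's own code: the helper names (atom_line, renumber, add_offset) are hypothetical, the records are made up for illustration, and the sketch assumes standard fixed-column PDB records with the residue number in columns 23 to 26.

```python
def atom_line(serial, name, resname, resnum, x, y, z, chain=" "):
    """Format a minimal fixed-column PDB ATOM record (hypothetical helper)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")


def residue_number(line):
    """Read the residue number from columns 23-26."""
    return int(line[22:26])


def set_residue_number(line, number):
    """Return a copy of the record with a new residue number."""
    return f"{line[:22]}{number:4d}{line[26:]}"


def renumber(lines, start):
    """Continuous numbering: each new residue in file order gets the next integer."""
    out, current, previous = [], start - 1, None
    for line in lines:
        key = line[21:27]  # chain ID + residue number + insertion code
        if key != previous:
            current += 1
            previous = key
        out.append(set_residue_number(line, current))
    return out


def add_offset(lines, value):
    """Constant offset: gaps in the original numbering are preserved."""
    return [set_residue_number(l, residue_number(l) + value) for l in lines]


# Illustrative records with a gap in the numbering after residue 42.
records = [
    atom_line(1, " N", "MET", 41, 1.177, -10.035, -3.493),
    atom_line(2, " CA", "MET", 41, 0.292, -8.839, -3.377),
    atom_line(20, " N", "LEU", 42, -0.523, -7.832, -1.331),
    atom_line(39, " N", "GLU", 45, 0.0, 0.0, 0.0),  # hypothetical record
]

print([residue_number(l) for l in renumber(records, 1)])     # [1, 1, 2, 3]
print([residue_number(l) for l in add_offset(records, -40)])  # [1, 1, 2, 5]
```

Continuous renumbering closes the gap, while the constant offset (listed as -addres in the usage) keeps it.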

If the input PDB file comes from CHARMM with the CHARMM19 force field, the option -charmm19 needs to be specified to correctly identify histidine residues. Output is written by default for the CHARMM22 protein force field, but the format can be selected with the -out option. Possible values are charmm19, charmm22, amber, and generic. With the generic option all histidine residues are named HIS regardless of the name or protonation state in the input file. It is also possible to append noh to the format name to request exclusion of all hydrogen atoms from the output.
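As a rough sketch of how a value such as charmm22noh splits into a format name and a hydrogen flag, and of what generic histidine naming amounts to, consider the following Python fragment. The function names are hypothetical, and the set of histidine variants is an assumption based on names commonly used with CHARMM (HSD, HSE, HSP) and Amber (HID, HIE, HIP); which names convpdb.pl itself recognizes is not stated here.

```python
# Format names as listed for the -out option.
OUTPUT_FORMATS = ("charmm19", "charmm22", "amber", "generic")

# Assumed histidine residue names (not taken from the script itself).
HISTIDINE_VARIANTS = {"HIS", "HSD", "HSE", "HSP", "HID", "HIE", "HIP"}


def parse_output_format(value):
    """Split an -out value into (format name, exclude hydrogens flag)."""
    no_hydrogens = value.endswith("noh")
    name = value[:-3] if no_hydrogens else value
    if name not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {value}")
    return name, no_hydrogens


def generic_residue_name(resname):
    """Map any assumed histidine variant to HIS; leave other names alone."""
    return "HIS" if resname in HISTIDINE_VARIANTS else resname


print(parse_output_format("charmm22noh"))  # ('charmm22', True)
print(generic_residue_name("HSD"))         # HIS
```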

The molecule can be centered at the origin with -center or shifted with -translate dx dy dz.
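The two operations can be sketched in a few lines of Python. This is an illustration only: it assumes that centering means moving the unweighted geometric center of all atoms to the origin, which the text above does not specify, and the function names are hypothetical.

```python
def translate(coords, dx, dy, dz):
    """Shift every (x, y, z) coordinate by the given displacements."""
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]


def center(coords):
    """Move the geometric center (assumed, unweighted) to the origin."""
    n = len(coords)
    cx = sum(c[0] for c in coords) / n
    cy = sum(c[1] for c in coords) / n
    cz = sum(c[2] for c in coords) / n
    return translate(coords, -cx, -cy, -cz)


points = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0)]
print(center(points))                    # [(-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)]
print(translate(points, 1.0, 0.0, -3.0))  # [(2.0, 2.0, 0.0), (4.0, 4.0, 2.0)]
```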

The option -sel followed by a list of residues can be used to select a subset of residues. This may be done, e.g., for loop modeling applications where only the neighborhood of the loop under consideration is needed for modeling. This option is complemented by -merge for merging a template structure from a PDB file with another PDB file. Again, this functionality is particularly useful for loop modeling in order to reassemble a complete protein structure if only the loop vicinity is being used during modeling. Alternatively, one may also specify a list of residues with -exclude that should be excluded from the output.
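The residue lists used in the examples on this page have the form from:to, with several ranges joined by = (for example 1:5=10:21). A minimal Python sketch of parsing such a list and applying it as a selection or an exclusion might look as follows. The function names are hypothetical, and only the from:to form shown in the examples is handled.

```python
def parse_ranges(spec):
    """Parse a list such as '1:5=10:21' into [(1, 5), (10, 21)]."""
    ranges = []
    for part in spec.split("="):
        first, last = part.split(":")
        ranges.append((int(first), int(last)))
    return ranges


def is_listed(resnum, ranges):
    """True if the residue number falls inside any of the ranges."""
    return any(first <= resnum <= last for first, last in ranges)


def select(resnums, spec):
    """Keep only listed residues (the idea behind -sel)."""
    ranges = parse_ranges(spec)
    return [r for r in resnums if is_listed(r, ranges)]


def exclude(resnums, spec):
    """Drop listed residues (the idea behind -exclude)."""
    ranges = parse_ranges(spec)
    return [r for r in resnums if not is_listed(r, ranges)]


print(parse_ranges("1:5=10:21"))           # [(1, 5), (10, 21)]
print(select(range(1, 13), "1:5=10:21"))   # [1, 2, 3, 4, 5, 10, 11, 12]
print(exclude(range(1, 13), "1:5=10:21"))  # [6, 7, 8, 9]
```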

A structure fragment can also be selected based on its amino acid sequence given with the option -selseq. The sequence has to match exactly part of the sequence of the input structure for this option to work and only a single fragment can be extracted at a time.
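The exact-match requirement can be pictured as a plain substring search on the one-letter sequence. The Python sketch below is an illustration, not the script's implementation. The function name is hypothetical, and the sequence string is assumed for illustration, chosen so that AFANLPL falls at residues 17 through 23 as in the -selseq example on this page.

```python
def find_fragment(sequence, fragment, first_resnum=1):
    """Return (first, last) residue numbers of the first exact match, or None."""
    index = sequence.find(fragment)  # only the first occurrence is reported
    if index < 0:
        return None
    start = first_resnum + index
    return (start, start + len(fragment) - 1)


# Assumed one-letter sequence, numbered from residue 1.
sequence = "MLSDEDFKAVFGMTRSAFANLPLWKQQNLKKEKGLF"

print(find_fragment(sequence, "AFANLPL"))  # (17, 23)
print(find_fragment(sequence, "AFANLPW"))  # None, no exact match
```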

For structures with multiple chains, the option -chain is available to select a particular chain. The chain ID may be set or changed with -setchain.

Files from the Protein Data Bank often contain residues in addition to the biomolecule of interest, such as solvent or small ligands. They are usually denoted with HETATM records. The option -nohetero is available to ignore such atoms when a PDB structure is read.
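Conceptually, ignoring hetero atoms amounts to skipping every record that begins with HETATM. A minimal Python sketch, with a hypothetical function name and made-up sample records:

```python
def drop_hetero(lines):
    """Return the records that are not HETATM records."""
    return [line for line in lines if not line.startswith("HETATM")]


# Made-up sample records for illustration.
sample = [
    "ATOM      1  N   MET     1       1.177 -10.035  -3.493  1.00  0.00",
    "ATOM      2  CA  MET     1       0.292  -8.839  -3.377  1.00  0.00",
    "HETATM  601  O   HOH   101       0.000   0.000   0.000  1.00  0.00",
]

for line in drop_hetero(sample):
    print(line)
```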

A few options are available to handle CHARMM segment names. If -readseg is given, the CHARMM segment names are read from the input. The option -chainfromseg is available to set chain IDs from the last letter of the segment names.
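Setting a chain ID from the last letter of a segment name can be sketched as follows in Python. The sketch assumes fixed-column PDB records with the chain ID in column 22 and the segment name in columns 73 to 76. The function name and the segment name PROA are hypothetical.

```python
def chain_from_segment(line):
    """Copy the last character of the segment name into the chain ID column."""
    segid = line[72:76].strip()
    if not segid:
        return line  # no segment name present, leave the record unchanged
    return line[:21] + segid[-1] + line[22:]


# Build a record with a hypothetical segment name in columns 73-76.
record = "ATOM      1  N   MET     1       1.177 -10.035  -3.493  1.00  0.00"
line = record.ljust(72) + "PROA"

print(chain_from_segment(line))
```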

With -segnames, segment names are included in the output file. Segment IDs are necessary for using a PDB file with CHARMM. Unless they have been read from the input file, they are generated automatically if this option is given.

The option -fixcoo can be used to ensure reasonable C-terminal oxygen coordinates. If the second terminal oxygen is missing or has incorrect coordinates, it will be rebuilt with this option.

If SSBOND records are present in the input file to indicate the presence of disulfide bonds, they are maintained. The option -nossbond is available to suppress SSBOND records. In order to add disulfide bonds to a PDB file, the option -ssbond may be used with a list of cysteine residue pairs.

Finally, this script can be used to solvate the input PDB structure in a rectangular (default), cubic, or octahedral box of pre-equilibrated water molecules with the option -solvate. The type of box is selected with -cubic or -octahedron. A cutoff value may be specified with -cutoff to indicate the minimum margin from the molecule that is being solvated to the edge of the box.
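The role of the cutoff can be made concrete with a small Python sketch that estimates box dimensions from the extent of the molecule. This is an assumption about the geometry, not the solvation code itself: the rectangular box is taken as the coordinate extent plus the margin on both sides, and the cubic box as the largest of those edges. The octahedral case is not covered. The function names are hypothetical.

```python
def rectangular_box(coords, cutoff):
    """Box edges: extent of the molecule plus the margin on each side (assumed)."""
    edges = []
    for axis in range(3):
        values = [c[axis] for c in coords]
        edges.append(max(values) - min(values) + 2.0 * cutoff)
    return tuple(edges)


def cubic_box(coords, cutoff):
    """Cube whose edge is the largest rectangular edge (assumed)."""
    edge = max(rectangular_box(coords, cutoff))
    return (edge, edge, edge)


# Two made-up atoms spanning 10 x 20 x 30 Angstroms.
atoms = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)]

print(rectangular_box(atoms, 9.0))  # (28.0, 38.0, 48.0)
print(cubic_box(atoms, 9.0))        # (48.0, 48.0, 48.0)
```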

Options

-help 
usage information
-center 
centers the molecule with respect to the origin
-translate dx dy dz 
translates the molecule according to the given displacements
-rotate m11 m12 m13 m21 m22 m23 m31 m32 m33 
rotates the molecule according to the given 3x3 rotation matrix (in 3D)
-rotatex phi 
rotates the molecule about the x-axis according to the given phi angle
-rotatey phi 
rotates the molecule about the y-axis according to the given phi angle
-rotatez phi 
rotates the molecule about the z-axis according to the given phi angle
-scale factor 
scales the molecule's coordinates according to the given factor
-diff PDBfile 
returns the difference in coordinate values between two PDB files
-difflsqfit 
perform least-squares fit before calculating difference
-add PDBfile 
returns the summed coordinate values between two PDB files
-sel list 
select a subset of residues according to a user defined list
-exclude list 
exclude a subset of residues according to a user defined list
-chain id 
select a specific chain according to the given id
-model num 
select a specific NMR model according to the given number
-nohetero 
exclude hetero atoms
-selseq abbrev 
select a specific amino acid sequence according to the given single letter abbreviated amino acid code
-nsel Selection
select part of the structure with new selection syntax
-merge PDBfile 
appends a PDB file
-renumber start 
renumbers the residues according to the given start value
-addres value 
add the given value to all residue numbers
-renumwatersegs 
renumbers water segment IDs
-match PDBfile 
renumber residues to match the numbering in the given PDB file
-setchain id 
sets the chain ID according to the given ID
-readseg 
read segment IDs from last column of PDB file
-chainfromseg 
set the chain ID based on the last letter of the segment ID
-charmm19 
read input PDB as CHARMM19 format
-amber 
read input PDB as Amber format
-out charmm19|charmm22|amber|generic 
specify output format
-segnames 
automatically generate segment IDs
-fixcoo 
fix C-terminal atoms
-ssbond res1:res2[=res1:res2] 
add disulfide information in the form of SSBOND record(s)
-nossbond 
do not write out SSBOND records
-solvate 
solvates the molecule by calling external solvate program
-cutoff value 
defines the minimum distance from molecule to edge of solvation box
-octahedron 
solvate the molecule using an octahedron
-cubic 
solvate the molecule using a cubic box
-ions NAME:num[=NAME:num] 
add ions called NAME according to the given number
-info 
write out some information about a given PDB structure
-fill inx:seq 
add C-alpha atom records with zero coordinates for missing residues at the given index with the given sequence (this is useful for Modeller)
-mol2 
output MOL2 format
-cleanaux 
reset AUX1 column to 1.0 and AUX2 column to 0.0
-removeclashes 
removes atoms with clashes from PDB

Examples

convpdb.pl -out charmm19 1vii.orig.pdb
converts the input PDB file (from the PDB databank) to a format suitable for the CHARMM19 force field.

ATOM      1  N   MET    41       1.177 -10.035  -3.493  1.00  0.00          
ATOM      2  CA  MET    41       0.292  -8.839  -3.377  1.00  0.00          
ATOM      3  C   MET    41      -0.488  -8.912  -2.063  1.00  0.00          
ATOM      4  O   MET    41      -1.039  -9.937  -1.709  1.00  0.00          
ATOM      5  CB  MET    41      -0.674  -8.793  -4.565  1.00  0.00          
ATOM      6  CG  MET    41      -0.091  -7.889  -5.657  1.00  0.00          
ATOM      7  SD  MET    41      -0.153  -8.747  -7.255  1.00  0.00          
ATOM      8  CE  MET    41      -0.971  -7.432  -8.193  1.00  0.00          
ATOM      9 1H   MET    41       0.835 -10.784  -2.856  1.00  0.00          
ATOM     10 2H   MET    41       1.166 -10.381  -4.475  1.00  0.00          

...


convpdb.pl -renumber 1 -out charmm22noh -segnames 1vii.orig.pdb
converts the input PDB file (from the PDB databank) to a format suitable for CHARMM22. Hydrogen atoms are not included in the output and residues are renumbered to start at 1. Segment IDs are generated and included in the output.

ATOM      1  N   MET     1       1.177 -10.035  -3.493  1.00  0.00      PRO0
ATOM      2  CA  MET     1       0.292  -8.839  -3.377  1.00  0.00      PRO0
ATOM      3  C   MET     1      -0.488  -8.912  -2.063  1.00  0.00      PRO0
ATOM      4  O   MET     1      -1.039  -9.937  -1.709  1.00  0.00      PRO0
ATOM      5  CB  MET     1      -0.674  -8.793  -4.565  1.00  0.00      PRO0
ATOM      6  CG  MET     1      -0.091  -7.889  -5.657  1.00  0.00      PRO0
ATOM      7  SD  MET     1      -0.153  -8.747  -7.255  1.00  0.00      PRO0
ATOM      8  CE  MET     1      -0.971  -7.432  -8.193  1.00  0.00      PRO0
ATOM     20  N   LEU     2      -0.523  -7.832  -1.331  1.00  0.00      PRO0
ATOM     21  CA  LEU     2      -1.241  -7.824  -0.028  1.00  0.00      PRO0

...


convpdb.pl -sel 10:21 1vii.exp.pdb
copies only residues 10 through 21 from the input PDB file to the output.

ATOM    141  N   VAL    10      -1.787  -4.543   8.123  1.00  0.00          
ATOM    142  CA  VAL    10      -0.514  -3.998   7.587  1.00  0.00          
ATOM    143  C   VAL    10      -0.582  -2.467   7.545  1.00  0.00          
ATOM    144  O   VAL    10      -0.049  -1.793   8.404  1.00  0.00          
ATOM    145  CB  VAL    10      -0.291  -4.552   6.183  1.00  0.00          
ATOM    146  CG1 VAL    10       0.935  -3.888   5.559  1.00  0.00          
ATOM    147  CG2 VAL    10      -0.064  -6.066   6.275  1.00  0.00          
ATOM    148  H   VAL    10      -2.636  -4.140   7.863  1.00  0.00          
ATOM    149  HA  VAL    10       0.303  -4.301   8.225  1.00  0.00          
ATOM    150  HB  VAL    10      -1.160  -4.352   5.575  1.00  0.00          

...


convpdb.pl -match 1vii.shift.pdb 1vii.exp.pdb
matches the residue numbering of the input file with the numbering in 1vii.shift.pdb after aligning both sequences.

ATOM      1  N   MET     6       1.177 -10.035  -3.493  1.00  0.00          
ATOM      2  CA  MET     6       0.292  -8.839  -3.377  1.00  0.00          
ATOM      3  C   MET     6      -0.488  -8.912  -2.063  1.00  0.00          
ATOM      4  O   MET     6      -1.039  -9.937  -1.709  1.00  0.00          
ATOM      5  CB  MET     6      -0.674  -8.793  -4.565  1.00  0.00          
ATOM      6  CG  MET     6      -0.091  -7.889  -5.657  1.00  0.00          
ATOM      7  SD  MET     6      -0.153  -8.747  -7.255  1.00  0.00          
ATOM      8  CE  MET     6      -0.971  -7.432  -8.193  1.00  0.00          
ATOM      9 1H   MET     6       0.835 -10.784  -2.856  1.00  0.00          
ATOM     10 2H   MET     6       1.166 -10.381  -4.475  1.00  0.00          

...


convpdb.pl -merge 1vii.exp.pdb 1vii.sel10:21.pdb
merges the fragment in 1vii.sel10:21.pdb with the structure in 1vii.exp.pdb.

ATOM      1  N   MET     1       1.177 -10.035  -3.493  1.00  0.00      PRO0
ATOM      2  CA  MET     1       0.292  -8.839  -3.377  1.00  0.00      PRO0
ATOM      3  C   MET     1      -0.488  -8.912  -2.063  1.00  0.00      PRO0
ATOM      4  O   MET     1      -1.039  -9.937  -1.709  1.00  0.00      PRO0
ATOM      5  CB  MET     1      -0.674  -8.793  -4.565  1.00  0.00      PRO0
ATOM      6  CG  MET     1      -0.091  -7.889  -5.657  1.00  0.00      PRO0
ATOM      7  SD  MET     1      -0.153  -8.747  -7.255  1.00  0.00      PRO0
ATOM      8  CE  MET     1      -0.971  -7.432  -8.193  1.00  0.00      PRO0
ATOM      9 1H   MET     1       0.835 -10.784  -2.856  1.00  0.00      PRO0
ATOM     10 2H   MET     1       1.166 -10.381  -4.475  1.00  0.00      PRO0

...


convpdb.pl -sel 1:5=10:21 -setchain B -segnames 1vii.exp.pdb
extracts residues 1 through 5 and 10 through 21 from the input file. The chain ID is set to B and CHARMM segment names are generated in the output.

ATOM      1  N   MET B   1       1.177 -10.035  -3.493  1.00  0.00      PR01
ATOM      2  CA  MET B   1       0.292  -8.839  -3.377  1.00  0.00      PR01
ATOM      3  C   MET B   1      -0.488  -8.912  -2.063  1.00  0.00      PR01
ATOM      4  O   MET B   1      -1.039  -9.937  -1.709  1.00  0.00      PR01
ATOM      5  CB  MET B   1      -0.674  -8.793  -4.565  1.00  0.00      PR01
ATOM      6  CG  MET B   1      -0.091  -7.889  -5.657  1.00  0.00      PR01
ATOM      7  SD  MET B   1      -0.153  -8.747  -7.255  1.00  0.00      PR01
ATOM      8  CE  MET B   1      -0.971  -7.432  -8.193  1.00  0.00      PR01
ATOM      9 1H   MET B   1       0.835 -10.784  -2.856  1.00  0.00      PR01
ATOM     10 2H   MET B   1       1.166 -10.381  -4.475  1.00  0.00      PR01

...


convpdb.pl -selseq AFANLPL 1vii.exp.pdb
extracts residues 17 through 23 corresponding to the sequence AFANLPL from the input file.

ATOM    250  N   ALA    17      -6.563   3.127  -1.620  1.00  0.000         
ATOM    251  CA  ALA    17      -6.531   4.418  -0.879  1.00  0.000         
ATOM    252  C   ALA    17      -5.098   4.662  -0.409  1.00  0.000         
ATOM    253  O   ALA    17      -4.613   5.776  -0.400  1.00  0.000         
ATOM    254  CB  ALA    17      -7.464   4.346   0.332  1.00  0.000         
ATOM    255  H   ALA    17      -7.104   2.381  -1.285  1.00  0.000         
ATOM    256  HA  ALA    17      -6.842   5.221  -1.532  1.00  0.000         
ATOM    257 1HB  ALA    17      -7.940   3.377   0.364  1.00  0.000         
ATOM    258 2HB  ALA    17      -6.892   4.496   1.236  1.00  0.000         
ATOM    259 3HB  ALA    17      -8.218   5.115   0.254  1.00  0.000         

...


convpdb.pl -rotate 1 0 0 0 1 0 0 0 1 1vii.exp.pdb
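applies the given 3x3 rotation matrix to the molecule. The identity matrix used here leaves the coordinates unchanged, as the output shows. A rotation around the x-axis by an angle phi corresponds to the matrix:

Rx (phi) = [[ 1 0 0 ],[ 0 cos(phi) sin(phi) ],[ 0 -sin(phi) cos(phi) ]]

so a rotation by 180 degrees would be requested with -rotate 1 0 0 0 -1 0 0 0 -1.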

ATOM      1  N   MET     1       1.177 -10.035  -3.493  1.00  0.00          
ATOM      2  CA  MET     1       0.292  -8.839  -3.377  1.00  0.00          
ATOM      3  C   MET     1      -0.488  -8.912  -2.063  1.00  0.00          
ATOM      4  O   MET     1      -1.039  -9.937  -1.709  1.00  0.00          
ATOM      5  CB  MET     1      -0.674  -8.793  -4.565  1.00  0.00          
ATOM      6  CG  MET     1      -0.091  -7.889  -5.657  1.00  0.00          
ATOM      7  SD  MET     1      -0.153  -8.747  -7.255  1.00  0.00          
ATOM      8  CE  MET     1      -0.971  -7.432  -8.193  1.00  0.00          
ATOM      9 1H   MET     1       0.835 -10.784  -2.856  1.00  0.00          
ATOM     10 2H   MET     1       1.166 -10.381  -4.475  1.00  0.00          

...
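
The relation for Rx (phi) in the last example can be checked numerically with a short Python sketch. It assumes that the matrix multiplies each coordinate as a column vector, which is not stated on this page, and the function names are hypothetical. With the identity matrix a coordinate is unchanged. With phi = 180 degrees the y and z components change sign.

```python
import math


def rotation_x(phi_degrees):
    """Rx(phi) = [[1, 0, 0], [0, cos, sin], [0, -sin, cos]] as written above."""
    phi = math.radians(phi_degrees)
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def rotate(matrix, point):
    """Multiply a 3x3 matrix with an (x, y, z) column vector (assumed convention)."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3))


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
first_atom = (1.177, -10.035, -3.493)  # N of MET 1 in the listing above

print(rotate(IDENTITY, first_atom))
print(tuple(round(v, 3) for v in rotate(rotation_x(180.0), first_atom)))
```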