file: tutorial.four
Considering the ionization equilibria in Conformational search.
--------------------------------------------------------------
In this example we are going to used the electrostatically Driven Monte Carlo
method to produce a conformational search that considers explicitly the coupling
between the conformation of the molecule and the ionization equilibria as a
function of pH.
Since this type of runs are very time consuming, we are going to produce a short
test run with only five accepted conformations.
We are going to use here a four-residue sequence to show how to carry
out this type conformational search with ECEPPAK. The amino acid sequence is
shorter than those used in previous examples to speed up the computations.
1- Generate an input file with suffix "inp" ( i.e., ionz.inp) that will
contain the instructions for ECEPPAK.
2.- Include in ionz.inp the $CNTRL Data Group to define the type of run
(EDMC).
$CNTRL
runtyp = edmc
$END
3.- Include the $SEQ data group with the amino acid sequence. As a test,
let's consider the four residue-sequence of ionizable residues LYS-HIS-LYS-GLU,
with the AMINO-COCH3 and CARBOXYL-NHCH3 at the N- and C-terminus, respectively.
Let's use ECEPP numbers to specify it.
The $SEQ data group should read,
$SEQ
4
09 07 09 04
15
$END
4.- We must include in ionz.inp file the $GEOM data group with the set of
dihedral angles defining the conformation of the polypeptide chain.
For this run, we are going to use randomly-generated dihedral angles for the initial
conformation. Consequently, the $GEOM data group can be filled with arbitrary values.
Even empty lines (one per residue or end-group) works.
The $GEOM data group reads,
$GEOM
$END
5.- The conformational search protocol is defined through a set of specific
keywords. These keywords must be included in the data group $EDMC. Most of
the EDMC keywords (see manual) are assigned default values. We are going to
enter a few of them to produce a short run.
Length of the run: One possible manner of specifying the length of the Monte
Carlo run is to define the maximum number of conformations accepted by the Monte
Carlo criterion. This is done by using the using the keyword MAXIT. Since we
want five (5) accepted conformations, we include inside the $EDMC data group:
MAXIT=5
Random numbers: since the EDMC procedure uses random numbers, we need to
initialize the random number generator by providing an integer (positive or
negative). This is done using the keyword SEED:
SEED= -677676
The keyword THERMAL_SHOCK is used to indicate the procedure to follow
if the search becomes trapped in a region of the conformational space.
This requires two additional parameters: the high and low temperature
defined by the keywords T_UP and T_LOW, respectively.
Let used
THERMAL_SHOCK T_LOW = 200 T_UP = 8000
In an EDMC run, the initial conformation can be generated randomly.
The RAND_START keyword is used to specify this request.
In addition, since the omega dihedral angles are ussually close to 180 degrees,
it is possible to set the initial omega's to 180 by using the keyword OMEGA_180.
Let's use both keywords in our example
The $EDMC data group reads,
$EDMC
MAXIT=5
SEED= -677676
TEMP= 300
THERMAL_SHOCK T_LOW=200 T_UP=8000
RAND_START
OMEGA_180
$END
7.- To carry out the the calculation of the ionization equilibria, we need to add
to additional data groups to our ionz.inp file: $FFIELD and $IBEM_SIMS
The $FFIELD is used to specify the pH at which the ionization equilibria should be
calculated, e.g. pH 7
The data group should read
$FFIELD
PH=7.0
$END
8.- The IBEM_SIMS data group is used to input the parameters used by the IBEM algorithm.
The IBEM algorithm is used to solved the Poisson-Boltzmann equation.
The following is a list of the parameters and some standart values
$IBEM_SIMS
PROBE_RADIUS=1.4 ! radius of solvent probe sphere; default = 1.4
DETAIL_OUTPUT= 0 # values are= 0, 1, 2, 3
! 0 (no output), 1 (some), 2 (more), 3 (max)
! default = 0
DOT_DENSITY_L=1.25 ! default = 1.0
EPS_SOL= 80.0 ! Dielectric constant of solvent; default = 80
EPS_MOL = 2.0 ! Dielectric constant of molecule interior; default = 2.0
RRLOCK = 5.0 ! radius of local surface element; default = 5.0
RRINTERM =8.0 ! radius of intermediate surface element; default = 8.0
NITER_MAX = 50 ! maximun number of iterations; default = 50
DELTA_SIGMA = 0.0001 ! convergence criterion; default = 0.0001
!
LINMETH = BCG ! 'BCG' or 'G-S' default = 'BCG'
! 'BCG' linear system is solved using a biconjugate
! gradient method
! 'G-S' linear system is solved by GAUSS-SEIDEL method
RRLOCK_EN = 10.0 ! radius of calculation by small surface element;
! default = 10.0
ICHARATOM = 0 ! 1 0 sigma is calculated for each charge
ICHARGROUP= 1 ! 0 1 charges as a collection of charged groups
IONECENTR=1 ! 0 1 all charges are considered as one group
$END
9.- The complete ionz.inp file now reads,
$CNTRL
runtyp = edmc
$END
$SEQ
4
09 07 09 04
15
$END
$GEOM
$END
$EDMC
MAXIT=5
SEED= -677676
TEMP= 300
THERMAL_SHOCK T_LOW=200 T_UP=8000
RAND_START
OMEGA_180
$END
$IBEM_SIMS
PROBE_RADIUS=1.4 ! radius of solvent probe sphere; default = 1.4
DETAIL_OUTPUT= 0 # values are= 0, 1, 2, 3
! 0 (no output), 1 (some), 2 (more), 3 (max)
! default = 0
DOT_DENSITY_L=1.25 ! default = 1.0
EPS_SOL= 80.0 ! Dielectric constant of solvent; default = 80
EPS_MOL = 2.0 ! Dielectric constant of molecule interior; default = 2.0
RRLOCK = 5.0 ! radius of local surface element; default = 5.0
RRINTERM =8.0 ! radius of intermediate surface element; default = 8.0
NITER_MAX = 50 ! maximun number of iterations; default = 50
DELTA_SIGMA = 0.0001 ! convergence criterion; default = 0.0001
!
LINMETH = BCG ! 'BCG' or 'G-S' default = 'BCG'
! 'BCG' linear system is solved using a biconjugate
! gradient method
! 'G-S' linear system is solved by GAUSS-SEIDEL method
RRLOCK_EN = 10.0 ! radius of calculation by small surface element;
! default = 10.0
ICHARATOM = 0 ! 1 0 sigma is calculated for each charge
ICHARGROUP= 1 ! 0 1 charges as a collection of charged groups
IONECENTR=1 ! 0 1 all charges are considered as one group
$END
6.- Save the file and run ECEPPAK.
Using four processors, we should type in the command line:
recepp.s EDMC ionz IONZ x x 4
7.- As output, the program writes three different type of files:
(a) main_out.000IONZ with a description of
the results of the conformational search procedure
and,
(b) outo.000IONZ a file containing all the conformations accepted
by the Monte Carlo procedure (for each of them, the first line
lists the different energy terms, the next line(s) contains the sequence
(in ECEPP format) followed by the list of dihedral angles that describe the
conformation.
The file main_out.000IONZ provides the energy components of all the
accepted conformations and a description of the state of ionization of the
different ionizable residues.
Analysis of the results shows that the ionization of HIS is greatly affected by the
conformation adopted by the molecule.