file: tutorial.four



Considering the ionization equilibria in Conformational search.
--------------------------------------------------------------


In this example we are going to used the electrostatically Driven Monte Carlo 
method to produce a conformational search that considers explicitly the coupling 
between the conformation of the molecule and the ionization equilibria as a 
function of pH.
Since this type of runs are very time consuming, we are going to produce a short 
test run with only five accepted conformations.

We are going to use here a four-residue sequence to show how to carry 
out this type conformational search with ECEPPAK. The amino acid sequence is 
shorter than those used in previous examples to speed up the computations. 

1- Generate an input file with suffix "inp" ( i.e., ionz.inp) that  will 
contain the instructions for ECEPPAK.

2.-  Include in ionz.inp the $CNTRL Data Group to define the type of run 
(EDMC).

$CNTRL
runtyp = edmc
$END

3.- Include the $SEQ data group with the amino acid sequence.  As a test, 
let's consider the four residue-sequence of ionizable residues LYS-HIS-LYS-GLU, 
with the AMINO-COCH3 and CARBOXYL-NHCH3 at the N- and  C-terminus, respectively. 
Let's use ECEPP numbers to specify it.   

The $SEQ data group should  read, 

$SEQ
4
09 07 09 04
15
$END

4.- We must include in ionz.inp file the $GEOM data group with the set of 
dihedral angles defining the conformation of the polypeptide chain. 
For this run, we are going to use randomly-generated dihedral angles for the initial
conformation. Consequently, the $GEOM data group  can be filled with arbitrary values.
Even empty lines (one per residue or end-group) works. 


The $GEOM data group reads,

$GEOM






$END
 
5.- The conformational search protocol is defined through a set of specific 
keywords.  These keywords must be included in the data group $EDMC.  Most of 
the EDMC keywords (see manual) are assigned default values.  We are going to 
enter a few of them to produce a short run. 

Length of the run: One possible manner of specifying the length of the Monte 
Carlo run is to define the maximum number of conformations accepted by the Monte 
Carlo criterion.  This is done by using the using the keyword MAXIT. Since we 
want five (5) accepted conformations, we include inside the $EDMC data group: 
MAXIT=5

Random numbers: since the EDMC procedure uses random numbers, we need to 
initialize the random number generator by providing an integer (positive or 
negative). This is done using the keyword SEED:
SEED= -677676

The keyword THERMAL_SHOCK  is used to indicate the procedure to follow
if the search becomes trapped in  a region of the conformational space.
This requires two additional parameters: the high and low temperature 
defined  by the keywords T_UP  and T_LOW, respectively.
Let used 
 THERMAL_SHOCK  T_LOW = 200 T_UP = 8000

In an  EDMC run, the initial conformation can be generated randomly.  
The RAND_START  keyword  is used to specify this request. 
In addition, since the omega dihedral angles are ussually close to 180 degrees,
it is possible to set the initial omega's to 180 by using the  keyword OMEGA_180.

Let's use both keywords in our example    


The $EDMC data group reads,

$EDMC
MAXIT=5
SEED= -677676
TEMP= 300
THERMAL_SHOCK T_LOW=200 T_UP=8000
RAND_START
OMEGA_180
$END



7.- To carry out the the calculation of the ionization equilibria, we need to add
to additional data groups to our ionz.inp file: $FFIELD and $IBEM_SIMS 

The $FFIELD is used to specify the pH  at  which the ionization equilibria should be 
calculated, e.g. pH 7
The data group should read

$FFIELD
  PH=7.0
$END


8.-  The IBEM_SIMS data group is used to input the parameters used by the IBEM algorithm.
 The IBEM algorithm is used to solved the Poisson-Boltzmann equation.
The following is a list of the parameters and some standart values


$IBEM_SIMS
PROBE_RADIUS=1.4   ! radius of solvent probe sphere; default = 1.4

DETAIL_OUTPUT=  0      #  values are= 0, 1, 2, 3
! 0 (no output), 1 (some), 2 (more), 3 (max)
! default = 0

DOT_DENSITY_L=1.25    ! default = 1.0

EPS_SOL= 80.0        ! Dielectric constant of solvent; default = 80

EPS_MOL = 2.0        ! Dielectric constant of molecule interior; default = 2.0

RRLOCK =  5.0         ! radius of local surface element; default = 5.0

RRINTERM =8.0        ! radius of intermediate surface element; default = 8.0

NITER_MAX = 50       ! maximun number of iterations; default = 50

DELTA_SIGMA = 0.0001 ! convergence criterion;  default = 0.0001

!
LINMETH =  BCG       !  'BCG' or 'G-S' default = 'BCG'
                     ! 'BCG' linear system is solved using a biconjugate
                     !   gradient method
                     ! 'G-S' linear system is solved by GAUSS-SEIDEL method

RRLOCK_EN =  10.0     ! radius of calculation by small surface element;
                     !  default = 10.0

ICHARATOM = 0        ! 1  0 sigma is calculated for each charge

ICHARGROUP= 1        ! 0  1 charges as a collection of charged groups

IONECENTR=1          ! 0  1  all charges are considered as one group

$END


9.- The complete ionz.inp  file now reads,


$CNTRL
runtyp = edmc
$END

$SEQ
4
09 07 09 04
15
$END

$GEOM






$END

$EDMC
MAXIT=5
SEED= -677676
TEMP= 300
THERMAL_SHOCK T_LOW=200 T_UP=8000
RAND_START
OMEGA_180
$END


$IBEM_SIMS
PROBE_RADIUS=1.4   ! radius of solvent probe sphere; default = 1.4

DETAIL_OUTPUT=  0      #  values are= 0, 1, 2, 3
! 0 (no output), 1 (some), 2 (more), 3 (max)
! default = 0

DOT_DENSITY_L=1.25    ! default = 1.0

EPS_SOL= 80.0        ! Dielectric constant of solvent; default = 80

EPS_MOL = 2.0        ! Dielectric constant of molecule interior; default = 2.0

RRLOCK =  5.0         ! radius of local surface element; default = 5.0

RRINTERM =8.0        ! radius of intermediate surface element; default = 8.0

NITER_MAX = 50       ! maximun number of iterations; default = 50

DELTA_SIGMA = 0.0001 ! convergence criterion;  default = 0.0001

!
LINMETH =  BCG       !  'BCG' or 'G-S' default = 'BCG'
                     ! 'BCG' linear system is solved using a biconjugate
                     !   gradient method
                     ! 'G-S' linear system is solved by GAUSS-SEIDEL method

RRLOCK_EN =  10.0     ! radius of calculation by small surface element;
                     !  default = 10.0

ICHARATOM = 0        ! 1  0 sigma is calculated for each charge

ICHARGROUP= 1        ! 0  1 charges as a collection of charged groups

IONECENTR=1          ! 0  1  all charges are considered as one group

$END



6.- Save the file and run ECEPPAK.

Using four processors, we should type in the command line:
    
     recepp.s EDMC ionz IONZ x x  4

7.- As output, the program writes three different type of files:
(a) main_out.000IONZ with a description of
the results of the conformational search procedure  
and, 
(b) outo.000IONZ a file containing all the conformations accepted
by the Monte Carlo procedure (for each of them, the first line
lists the different energy terms, the next line(s) contains the sequence
(in ECEPP format) followed by the list of dihedral angles that describe the 
conformation.

The file main_out.000IONZ provides the energy components of all the 
accepted conformations and a description of the state of ionization of the 
different ionizable residues.
Analysis of the results shows that the ionization of HIS is greatly affected by the 
conformation adopted by the molecule.