file: tutorial.four Considering the ionization equilibria in Conformational search. -------------------------------------------------------------- In this example we are going to used the electrostatically Driven Monte Carlo method to produce a conformational search that considers explicitly the coupling between the conformation of the molecule and the ionization equilibria as a function of pH. Since this type of runs are very time consuming, we are going to produce a short test run with only five accepted conformations. We are going to use here a four-residue sequence to show how to carry out this type conformational search with ECEPPAK. The amino acid sequence is shorter than those used in previous examples to speed up the computations. 1- Generate an input file with suffix "inp" ( i.e., ionz.inp) that will contain the instructions for ECEPPAK. 2.- Include in ionz.inp the $CNTRL Data Group to define the type of run (EDMC). $CNTRL runtyp = edmc $END 3.- Include the $SEQ data group with the amino acid sequence. As a test, let's consider the four residue-sequence of ionizable residues LYS-HIS-LYS-GLU, with the AMINO-COCH3 and CARBOXYL-NHCH3 at the N- and C-terminus, respectively. Let's use ECEPP numbers to specify it. The $SEQ data group should read, $SEQ 4 09 07 09 04 15 $END 4.- We must include in ionz.inp file the $GEOM data group with the set of dihedral angles defining the conformation of the polypeptide chain. For this run, we are going to use randomly-generated dihedral angles for the initial conformation. Consequently, the $GEOM data group can be filled with arbitrary values. Even empty lines (one per residue or end-group) works. The $GEOM data group reads, $GEOM $END 5.- The conformational search protocol is defined through a set of specific keywords. These keywords must be included in the data group $EDMC. Most of the EDMC keywords (see manual) are assigned default values. We are going to enter a few of them to produce a short run. Length of the run: One possible manner of specifying the length of the Monte Carlo run is to define the maximum number of conformations accepted by the Monte Carlo criterion. This is done by using the using the keyword MAXIT. Since we want five (5) accepted conformations, we include inside the $EDMC data group: MAXIT=5 Random numbers: since the EDMC procedure uses random numbers, we need to initialize the random number generator by providing an integer (positive or negative). This is done using the keyword SEED: SEED= -677676 The keyword THERMAL_SHOCK is used to indicate the procedure to follow if the search becomes trapped in a region of the conformational space. This requires two additional parameters: the high and low temperature defined by the keywords T_UP and T_LOW, respectively. Let used THERMAL_SHOCK T_LOW = 200 T_UP = 8000 In an EDMC run, the initial conformation can be generated randomly. The RAND_START keyword is used to specify this request. In addition, since the omega dihedral angles are ussually close to 180 degrees, it is possible to set the initial omega's to 180 by using the keyword OMEGA_180. Let's use both keywords in our example The $EDMC data group reads, $EDMC MAXIT=5 SEED= -677676 TEMP= 300 THERMAL_SHOCK T_LOW=200 T_UP=8000 RAND_START OMEGA_180 $END 7.- To carry out the the calculation of the ionization equilibria, we need to add to additional data groups to our ionz.inp file: $FFIELD and $IBEM_SIMS The $FFIELD is used to specify the pH at which the ionization equilibria should be calculated, e.g. pH 7 The data group should read $FFIELD PH=7.0 $END 8.- The IBEM_SIMS data group is used to input the parameters used by the IBEM algorithm. The IBEM algorithm is used to solved the Poisson-Boltzmann equation. The following is a list of the parameters and some standart values $IBEM_SIMS PROBE_RADIUS=1.4 ! radius of solvent probe sphere; default = 1.4 DETAIL_OUTPUT= 0 # values are= 0, 1, 2, 3 ! 0 (no output), 1 (some), 2 (more), 3 (max) ! default = 0 DOT_DENSITY_L=1.25 ! default = 1.0 EPS_SOL= 80.0 ! Dielectric constant of solvent; default = 80 EPS_MOL = 2.0 ! Dielectric constant of molecule interior; default = 2.0 RRLOCK = 5.0 ! radius of local surface element; default = 5.0 RRINTERM =8.0 ! radius of intermediate surface element; default = 8.0 NITER_MAX = 50 ! maximun number of iterations; default = 50 DELTA_SIGMA = 0.0001 ! convergence criterion; default = 0.0001 ! LINMETH = BCG ! 'BCG' or 'G-S' default = 'BCG' ! 'BCG' linear system is solved using a biconjugate ! gradient method ! 'G-S' linear system is solved by GAUSS-SEIDEL method RRLOCK_EN = 10.0 ! radius of calculation by small surface element; ! default = 10.0 ICHARATOM = 0 ! 1 0 sigma is calculated for each charge ICHARGROUP= 1 ! 0 1 charges as a collection of charged groups IONECENTR=1 ! 0 1 all charges are considered as one group $END 9.- The complete ionz.inp file now reads, $CNTRL runtyp = edmc $END $SEQ 4 09 07 09 04 15 $END $GEOM $END $EDMC MAXIT=5 SEED= -677676 TEMP= 300 THERMAL_SHOCK T_LOW=200 T_UP=8000 RAND_START OMEGA_180 $END $IBEM_SIMS PROBE_RADIUS=1.4 ! radius of solvent probe sphere; default = 1.4 DETAIL_OUTPUT= 0 # values are= 0, 1, 2, 3 ! 0 (no output), 1 (some), 2 (more), 3 (max) ! default = 0 DOT_DENSITY_L=1.25 ! default = 1.0 EPS_SOL= 80.0 ! Dielectric constant of solvent; default = 80 EPS_MOL = 2.0 ! Dielectric constant of molecule interior; default = 2.0 RRLOCK = 5.0 ! radius of local surface element; default = 5.0 RRINTERM =8.0 ! radius of intermediate surface element; default = 8.0 NITER_MAX = 50 ! maximun number of iterations; default = 50 DELTA_SIGMA = 0.0001 ! convergence criterion; default = 0.0001 ! LINMETH = BCG ! 'BCG' or 'G-S' default = 'BCG' ! 'BCG' linear system is solved using a biconjugate ! gradient method ! 'G-S' linear system is solved by GAUSS-SEIDEL method RRLOCK_EN = 10.0 ! radius of calculation by small surface element; ! default = 10.0 ICHARATOM = 0 ! 1 0 sigma is calculated for each charge ICHARGROUP= 1 ! 0 1 charges as a collection of charged groups IONECENTR=1 ! 0 1 all charges are considered as one group $END 6.- Save the file and run ECEPPAK. Using four processors, we should type in the command line: recepp.s EDMC ionz IONZ x x 4 7.- As output, the program writes three different type of files: (a) main_out.000IONZ with a description of the results of the conformational search procedure and, (b) outo.000IONZ a file containing all the conformations accepted by the Monte Carlo procedure (for each of them, the first line lists the different energy terms, the next line(s) contains the sequence (in ECEPP format) followed by the list of dihedral angles that describe the conformation. The file main_out.000IONZ provides the energy components of all the accepted conformations and a description of the state of ionization of the different ionizable residues. Analysis of the results shows that the ionization of HIS is greatly affected by the conformation adopted by the molecule.