This software package is copyrighted, and may be used subject to the following conditions: * This software is provided free of charge to academic users, subject to the condition that no part of it be sold or used otherwise for commercial purposes, including, but not limited to its incorporation into commercial software packages, without written consent from the authors. For permission contact Prof. H. A. Scheraga, Cornell University. * This software package is provided on an "as is" basis. We in no way warrant either this software or results it may produce. * Reports or publications using this software package must contain an acknowledgement to the authors and the NIH Resource in the form commonly used in academic research. Theoretical Prediction of Crystal Structure by Global Optimization of Potential Energy Jaroslaw Pillardy and Yelena Arnautova Script for hands-on session Ethylene molecule has been chosen for this hands-on session, because it is a small, highly symmetric organic molecule, and all calculations with it are very fast. The global optimization for this molecule is not complicated, therefore the full prediction is possible in a very short time. For more complicated systems (less symmetric molecules containing heteroatoms) global optimization runs are longer by orders of magnitude (e.g., for maleic anhydride, with 4 molecules in the unit cell, full global optimization required 19 hours, using 20 processors of the SP2 supercomputer). The single global optimization run is designed to find a global minimum for a given number of molecules in the unit cell (Z). In order to predict the value of Z one should carry out several global optimization runs for different Z’s and choose the Z, for which the energy per molecule for the global minimum is the lowest. We suggest carrying out global optimization runs for four values of Z=1,2, 3, and 4. All files for hands-on session are located in subdirectory tutorial. There are 8 subdirectories there: 00, 01, …, 08 corresponding to Part 0, …, Part 8 of this tutorial, respectively. Part 0 Preparing input files for local and global optimizations, obtaining reference structure. The input file for the global optimization program (ethlen.inp) consists of following parts (see Fig 1): - title - two sets of global optimization parameters - information about atom types and connectivity (bonds) - molecular geometry - potential energy parameters Detailed information about global optimization parameters and the structure of an input file is provided in “Crystalg Manual”. Molecular geometry may be obtained in different ways. One of them is to prepare the molecule using any molecular modeling package, and then put atom coordinates in proper format into the input file. It is also possible to use experimental molecular geometry taken from the Cambridge Structural Database (CSD). In this case program molgen should be used. - The file crystdata.csds contains data from the CSD and parameters for program molgen, see “Molgen Manual” for details. - Run molgen - Program molgen produces molgen.out output file containing molecular models. Cut out molecular geometry (see “Molgen Manual”) and put it into the proper place in ethlen.inp. - Fill in the atom type and connectivity information according to numbering of atoms in crystdata.csds file (see Fig 1). - The part of the ethlen.inp describing the force field parameters is already prepared, we have chosen the DISCOVER force field. For details about force field parameters’ format see “Crystalg Manual”. Preparing reference structure: local minimization of the experimental structure. In order to validate the crystal structure prediction we should prepare the experimental structure of the crystal minimized locally with the force field we have chosen. Program crystalg has options allowing producing such a structure using the output from program molgen. The file molgen.out contains parameters of the experimental structure in the representation used in crystalg (see manual). - Rename file molgen.out into ethlen.coord. Set the parameter MODE to 2 and run start. - Rename file ethlen_W000.out into ethlen_exp.out. To visualize results of your calculations transfer files ethlen_exp.mol2 and ethlen_exp.pdb to your local workstation and use XXX for visualization. Part 1 Global optimization using initial (the simpliest) parameters. - Copy the file ethlen.inp prepared in Part 0 to your current directory (you should change directory to 01 for Part 1). - Set parameter MODE to 0 in order to run global optimization. Check if all global optimization parameters are set to proper values: ¨ number of macroiterations 2 ¨ number of iterations in a macroiteration 1 ¨ total number of iterations 2 ¨ number of trajectories 2 ¨ number of local searches 1 ¨ maximum deformation 10.0 ¨ minimum non-zero deformation 0.05 ¨ deformation step 1.0 ¨ number of molecules in the unit cell 2 - Run start. Watch progress of calculations: each newly find lowest-energy minimum geometry is stored in ethlen@XXX.pdb or ethlen@XXX.mol2 (XXX is the number), current lowest-energy minimum information is shown in ethylene.global; general information about progress in calculation is shown in ethlen.chkpt. The program is fully restartable, all necessary information for restarting is stored in ethlen.rst. - Compare results of the global optimization with the reference structure. Has the experimental structure been found as the global minimum of the DISCOVER potential for Z=2? How many minima has the program found? For this extremely simple system the minimized experimental structure has been found as a global minimum even with the simplest set of parameters (and minimum computational effort). However, it is not usually the case for more complicated systems. It is not possible to tell a priori if a given set of global optimization parameters is suitable for a given crystal, therefore few runs using different values of parameters are necessary. Usually one starts with the set allowing very fast calculations, and then increases the numerical effort until the result (the lowest-energy minimum found) does not change. This is so-called tuning-up. In the case of the ethylene crystal, since the global minimum has been found in the first global optimization the effectiveness of the global search may be evaluated using the total number of minima found during the run. Part 2 Tuning-up the global optimization parameters: number of iterations. - Copy the file ethlen.inp from Part 1 to your current directory (you should change directory to 02 for Part 2). - Change number of iterations in a macroiteration to 2 and total number of iterations to 4. Run start. Watch progress of calculations. - How many minima has the program found? Part 3 Tuning-up the global optimization parameters: number of trajectories, iterations and local search. - Copy the file ethlen.inp from Part 2 to your current directory (you should change directory to 03 for Part 3). - Change total number of iterations to 5, and number of trajectories to 4. Switch on linear search for deformed potential (previously only perturbations were used). Run start. Watch progress of calculations. - How many minima has the program found? Part 4 Tuning-up the global optimization parameters: systematic search. - Copy the file ethlen.inp from Part 3 to your current directory (you should change directory to 04 for Part 4). - Switch on linear search for undeformed surface (previously only perturbations were used for undeformed search). Switch on systematic search in rotations for deformed surface. Run start. Watch progress of calculations. - How many minima has the program found? Part 5 Global optimization for one molecule in the unit cell (Z=1). - Copy the file ethlen.inp from Part 4 to your current directory (you should change directory to 05 for Part 5). - Change number of molecules to 1. Run start. Watch progress of calculations. - How many minima has the program found? - Is the global minimum for Z=1 lower in energy than the one for Z=2? What value of Z should be chosen as a prediction? Please remember that program prints out energy per unit cell; in order to compare energies you have to divide them by Z. Part 6 Global optimization for three molecules in the unit cell (Z=3). In Part 6 the input file from Part 2 is used in order to make calculation time reasonable. - Copy the file ethlen.inp from Part 2 to your current directory (you should change directory to 06 for Part 6). - Change number of molecules to 3. Run start. Watch progress of calculations. - How many minima has the program found ? - Is the global minimum for Z=3 lower in energy than the one for Z=2? What value of Z should be chosen as a prediction ? Please remember that program prints out energy per unit cell; in order to compare energies you have to divide them by Z. Part 7 Global optimization for four molecules in the unit cell (Z=4). The global optimization for Z=4 takes about 20 minutes, even when using the input file from Part 2. Therefore we have already carried out these calculations and the output files are placed in the directory 07. We have used input file based on the Part 4, in this case the calculations took about 30 minutes using 20 processors of the IBM SP2. The global optimization run with doubled number of molecules in the unit cell (Z=4 compared to Z=2) validates the prediction obtained previously. The minimum being the representation of the global minimum structure for smaller Z (Z=2) with the number of molecules in the unit cell doubled should be found as a global minimum in the run with larger Z (Z=4) in order to confirm the prediction for Z=2. Unfortunately, in the case of ethylene with Z=4 the structure corresponding to the minimized experimental structure (Z=2) was found as a minimum number 4 (see output file ethlen_W000.out). The unit cell of this structure is equivalent to two unit cells for Z=2 concatenated together along the c axis. This shows that the DISCOVER potential is not able to predict successfully the crystal structure of ethylene. Part 8 Real-world example. Results of the global optimization for succinic anhydride are placed in directory 08. These calculations took about 7 hours using 30 processors of the IBM SP2 supercomputer in the Cornell Theory Center. The minimized experimental structure has been found as the global minimum, showing that the DISCOVER force field may be used for crystal structure calculations for succinic anhydride molecule. Progress of calculations is shown in sucanh.chkpt. Ten lowest-energy minima are stored in sucanh_XXX.pdb or sucanh_XXX.mol2, transfer them to your local workstation and visualize using XXX. Ethlen10 flat & Hagler (Z=2) SEED=1146441793 RAND_CONF PDBOUT MOL2OUT IRST=0 ENEDIF=0.0005 IPRINT=1 IPRT=0 & RTOLF1=1.0d-9 TOLF1=1.0d-9 RTOLF2=1.0d-9 TOLF2=1.0d-9 RTOLF3=1.0d-9 & TOLF3=1.0d-9 MCM_TYPE=0 INITMODE=0 MCM_TYPE1=8 IGRAND_MCM=3 NOMAP KUTOFF_1=2 & KUTOFF_R_1=3 KUTOFF_2=4 KUTOFF_R_2=5 EPMIN=0.4 EPMAX=1.0 IDIPOLE_SWITCH=0 & INPUTTYPE=0 R_LATTICE_FACTOR=10.0 WELEC=1. MAXREDUCE=10 BEL_DEF=1.75 TIMESTART=10. TIMEEND=0.0500 IGRANDMAX=5 INI_SEARCH=1 NUM_THREAD=4 DECAY=1.3 & MID_SEARCH=1 D_ROT=30.0 D_TRL=1.5 D_RST=1.5 DD_ROT=0.0 DD_TRL=0.16 DD_RST=0.16& DELTATIME=1.00 IPERT=5 RAN_SEARCH=0 NUM_REP_THREAD=2 NUM_THREAD_SS=20 MODE=0 & SC_TYPE=1 NEW_FIRST NUM_SHRINK=2 LOW_SEARCH=1 MULT_MAX1=1 HIGH_SEARCH=0 & MULT_MAX2=1 LOC_SEARCH PERT_SEARCH STEP_SEARCH=0.300 STEPH=0.01 MAXSTEP=50 & SYS_SEARCH=1 SYS_SEARCH1=0 IGRAND_CLEAR=0 B_DEF=1.75 NSYSPT=100 2 numbmol 6 numbat 1 C 2 2 H 1 3 H 1 4 C 2 5 H 1 6 H 1 5 nnbond 1 2 1 3 1 4 4 5 4 6 csds pattern_f: 1 -0.526491584389267131 0.356321226077682462 -0.165266114829753230 2 -0.908137376247887596 1.13766112481512693 0.491966303875811584 3 -1.04939095252346570 0.187079962199671240 -1.10655971695707533 4 0.526491584389267020 -0.356321226077682407 0.165266114829753341 5 0.908137376247887484 -1.13766112481512716 -0.491966303875811362 6 1.04939095252346570 -0.187079962199671268 1.10655971695707533 csds pattern_f: 1 -0.526491584389267131 0.356321226077682462 -0.165266114829753230 2 -0.908137376247887596 1.13766112481512693 0.491966303875811584 3 -1.04939095252346570 0.187079962199671240 -1.10655971695707533 4 0.526491584389267020 -0.356321226077682407 0.165266114829753341 5 0.908137376247887484 -1.13766112481512716 -0.491966303875811362 6 1.04939095252346570 -0.187079962199671268 1.1065597169570753 1 .064 16.08 cp 2 .020 8.9700 h 3 .020 8.9700 h 4 .064 16.08 cp 5 .020 8.9700 h 6 .020 8.9700 h 1 -4.6214 cp 2 2.3107 h 3 2.3107 h 4 -4.6214 cp 5 2.3107 h 6 2.3107 h Fig 1.