file: tutorial.one

Generating a conformation and evaluating the energy with ECEPPAK
-----------------------------------------------------------------

The general input to the program is given through a file containing a
set of instructions, which the program reads with a parser. The parser
reads and interprets the first 78 characters of each line. No distinction
is made between lower-case and upper-case letters. The symbols # and !
mark the beginning of a comment: when either symbol is encountered, the
parser ignores the rest of the line.

Instructions related to a given procedure are grouped into so-called
"Data Groups". A Data Group is identified by a main keyword whose first
character is the symbol '$', e.g. $EDMC or $CNTRL. The keyword $end (or
$END) must also be present to mark the end of the Data Group (see the
manual for a list of the Data Groups). Any word between the main keyword
and $end is considered an instruction.

Three Data Groups must be included in every input file: $CNTRL, $SEQ and
$GEOM. They describe, respectively, the type of calculation to be carried
out by the program, the amino acid sequence of the molecule to be
considered, and the set of dihedral angles that determine the initial
conformation.

Example: generating a single polyalanine chain (10 residues) and
computing its energy.

1.- Create an input file with the suffix "inp" (e.g., ten_ala.inp) to
hold the instructions for ECEPPAK.

2.- Include in ten_ala.inp the $CNTRL Data Group to define the type of
ECEPPAK run:

$CNTRL
runtyp = energy
$end

3.- Include in ten_ala.inp the $SEQ Data Group with the amino acid
sequence. The default sequence specification uses ECEPP residue types
(ALA is referred to as type 1); one- or three-letter codes can also be
used. ECEPP ALWAYS expects the sequence to be terminated with end groups.
In this example, we will use AMINO-COCH3 and CARBOXYL-NHCH3 at the N- and
C-terminus, respectively.
These two end groups are identified as type numbers 4 and 15 in ECEPP.
The $SEQ Data Group reads:

$SEQ
4 1 1 1 1 1 1 1 1 1 1 15
$END

4.- Finally, include in ten_ala.inp the $GEOM Data Group, with the set of
dihedral angles needed to define the conformation of the polypeptide
chain. The dihedral angles describing the conformation of each residue
must be entered on a single line in (10F8.3) format. The conformation of
any natural amino acid can be described with 10 or fewer dihedral angles.
As an example, we will generate an alpha-helical conformation. These are
typical values of phi, psi, omega and chi1 for an ALA residue in an
alpha-helix:

 -66.000 -40.000 180.000  60.000

The $GEOM Data Group reads:

$GEOM
 180.000 180.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 180.000
$END

5.- This should do it! The complete file reads:

$CNTRL
runtyp = energy
$end
$SEQ
4 1 1 1 1 1 1 1 1 1 1 15
$END
$GEOM
 180.000 180.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 -66.000 -40.000 180.000  60.000
 180.000
$END

6.- Save the file and run ECEPPAK. On the command line, type:

recepp.s ENERGY ten_ala TEN_ALA x x 1

7.- As output, the program writes a file named main_out.TEN_ALA with the
results of the energy evaluation. The total energy of the molecule in
this particular conformation, provided as a control, should be:

ETOT -0.20945E+02 (total)
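
The input file assembled in steps 2 to 5 can also be generated by a short
script, which helps keep the (10F8.3) columns aligned when the chain
length changes. The following is a minimal Python sketch, not part of
ECEPPAK. The helper names format_dihedrals and build_input are
hypothetical, and placing one keyword or record per line is an assumption
about the layout. The type numbers and angles are the ones used in the
example above.

```python
# Sketch: build an ECEPPAK input file for an N-residue polyalanine helix.
# Assumptions (not taken from the ECEPPAK manual): one keyword or record
# per line; all helper names below are illustrative only.

ALA_TYPE = 1          # ECEPP residue type for ALA
N_TERM_TYPE = 4       # end group used at the N-terminus in the tutorial
C_TERM_TYPE = 15      # end group used at the C-terminus in the tutorial

N_TERM_ANGLES = [180.0, 180.0]              # values from the tutorial's $GEOM
HELIX_ANGLES = [-66.0, -40.0, 180.0, 60.0]  # phi, psi, omega, chi1
C_TERM_ANGLES = [180.0]                     # value from the tutorial's $GEOM


def format_dihedrals(angles):
    """Format up to 10 angles as one Fortran-style (10F8.3) record."""
    if len(angles) > 10:
        raise ValueError("a (10F8.3) record holds at most 10 values")
    # Each value occupies 8 columns with 3 decimals, right-justified.
    return "".join(f"{a:8.3f}" for a in angles)


def build_input(n_ala=10):
    """Return the text of the input file as a single string."""
    seq = [N_TERM_TYPE] + [ALA_TYPE] * n_ala + [C_TERM_TYPE]
    geom = [N_TERM_ANGLES] + [HELIX_ANGLES] * n_ala + [C_TERM_ANGLES]
    lines = ["$CNTRL", "runtyp = energy", "$end"]
    lines += ["$SEQ", " ".join(str(t) for t in seq), "$END"]
    lines += ["$GEOM"] + [format_dihedrals(a) for a in geom] + ["$END"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the 10-residue example used in this tutorial.
    with open("ten_ala.inp", "w") as handle:
        handle.write(build_input(10))
```

Running the script writes ten_ala.inp in the current directory. Calling
build_input with a different n_ala gives a chain of another length with
the same end groups.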