Instructions to run BLAST2GO
1. Run BLAST on http://cbsuapps.tc.cornell.edu. (To run BLAST on the BioHPC lab Linux computers with the command-line BLAST+ software instead, follow the instructions in the appendix at the end of this document. That route is faster and more reliable than the web site.)
a. Log into the web site http://cbsuapps.tc.cornell.edu with your BioHPC user name and password. Then click “Sequence analysis” -> “P-BLAST”.
b. Modify the following fields:
Job name: Provide a unique name for your job. (No space character in the name)
Query file: Upload your FASTA file.
BLAST program: Use “blastx” if your FASTA file contains nucleotide sequences, or “blastp” if it contains protein sequences.
Choose database for BLAST: Select “swissprot”, then click the arrow button.
Output File Format: XML (-m 7)
Cutoff E-value: 1e-6 (This is the default value used by BLAST2GO. It can be changed if needed.)
Maximum Targets: 20
Nodes and Cluster: Check http://cbsuapps.tc.cornell.edu/nodes.aspx before you set the cluster and the number of nodes. The number of nodes must not exceed the total nodes for that cluster, and should preferably stay below the number of free nodes. Expect about 15 seconds per sequence per node.
c. You can monitor the job status through the “My jobs” link on the web site. Notify us if you do not get the results within two days.
d. After you receive an email with the BLAST results, download the file using the link in the email. Change the file name extension to .xml.
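The rule of thumb in step 1b (about 15 seconds per sequence per node) can be turned into a rough run-time estimate before you submit the job. The sketch below is only an illustration: the function name estimate_blast_seconds is hypothetical, and it assumes the work divides evenly across the nodes, which the web site does not guarantee.

```shell
# Rough estimate of P-BLAST wall-clock time, in seconds.
# Assumption: 15 seconds per sequence per node, split evenly across nodes.
estimate_blast_seconds() {
  local fasta="$1"   # path to the query FASTA file
  local nodes="$2"   # number of nodes you plan to request
  local nseq
  # Each FASTA record starts with a ">" header line.
  nseq=$(grep -c '^>' "$fasta")
  # Integer arithmetic; the result is rounded down.
  echo $(( nseq * 15 / nodes ))
}

# Example (test.fa is a placeholder file name):
#   estimate_blast_seconds test.fa 4
```

Under these assumptions, 10,000 sequences on 4 nodes come to 37500 seconds, or roughly 10.4 hours.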
2. Run BLAST2GO on BioHPC computers with the local BLAST2GO database server at Cornell. (The Cornell database server is accessible only from BioHPC lab computers.)
a. Upload the FASTA file and the BLAST XML file to the BioHPC lab computer cbsulogin.tc.cornell.edu. Instructions for transferring files can be found at http://cbsu.tc.cornell.edu/lab/doc/Remote_access.pdf
b. To run BLAST2GO, you need to reserve a BioHPC lab computer at http://cbsu.tc.cornell.edu. Follow the user guide at http://cbsu.tc.cornell.edu/lab/userguide.aspx (click the “Quick Start Guide” tab) to reserve a computer.
c. Start VNC on your reserved computer. Follow the user guide at http://cbsu.tc.cornell.edu/lab/userguide.aspx (click the “Access” tab, then read the section “Access with VNC”). Once VNC is connected, you can close the VNC window at any time without affecting the software running on the workstation. To return to the VNC window, click the “Connect VNC” link on the “My Reservation” page.
d. Start BLAST2GO by clicking the “BLAST2GO” icon on the desktop in the VNC window. It is normal to see nothing on the screen for about a minute; the software takes some time to start.
e. Change the database setting of BLAST2GO to point to the Cornell server. From the menu, click “File” -> “DataAccess setting”, then check the “Own database” checkbox.
Fill in the following information (copy-paste does not work in VNC, so you will have to type it):
DB Name: b2gdb
DB Host: cbsuss06.tc.cornell.edu
DB User: blast2go
DB Password: blast4it
f. Load the sequences from the FASTA file (the FASTA file that you transferred to your home directory on the BioHPC computer).
From the menu, click “File” -> “Load sequences”.
g. Load the BLAST results (the BLAST XML file that you transferred to your home directory on the BioHPC computer).
“File” -> “Import” -> “Import blast results” -> “One xml file”, then click the triangle start button to start importing.
h. Run mapping. “Mapping” -> “Run Mapping”. This step might take a long time. You can close your laptop and come back later by clicking the “My Reservation” -> “Connect VNC” link on the cbsu.tc.cornell.edu web site.
i. Run annotation. “Annotation” -> “Run Annotation”. This step might take a long time. You can close your computer and come back later.
j. (Optional) Run interproscan. “Annotation” -> “Interproscan” -> “Run interproscan”, followed by merging the annotation: “Annotation” -> “Interproscan” -> “Merge interproscan GO to annotation”.
k. After the annotation is finished, you can export the annotations to a file: “File” -> “Export annotations” -> “Export annotations(.annot)”. Then you can transfer the “.annot” file to your local computer and continue the work with the BLAST2GO installed on your own computer (“File” -> “Load annotations”).
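Before transferring the exported file, a quick command-line check can confirm that the export is not empty. The sketch below assumes that the .annot file is tab-separated, with the sequence name in the first column and the GO term in the second; please verify this against your own export. The function name annot_summary and the example file name are hypothetical.

```shell
# Summarize an exported .annot file.
# Assumption: tab-separated columns, column 1 = sequence name,
# column 2 = annotation ID (GO terms start with "GO:").
annot_summary() {
  local annot="$1"
  local nseq nterm
  # Number of distinct annotated sequences.
  nseq=$(cut -f1 "$annot" | sort -u | wc -l | tr -d ' ')
  # Number of annotation lines that carry a GO term.
  nterm=$(cut -f2 "$annot" | grep -c '^GO:')
  echo "sequences=${nseq} go_annotations=${nterm}"
}

# Example (myproject.annot is a placeholder file name):
#   annot_summary myproject.annot
```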
Appendix. Run BLAST on BioHPC lab Linux computers.
(If you are not familiar with the Linux operating system, you need training first, either through our Linux workshop or by signing up for one of our office hours at http://cbsu.tc.cornell.edu/lab/office1.aspx)
First, transfer your file to the /workdir/myUserName directory on the Linux computer using an SFTP client. Create the directory if it does not exist.
cd /workdir/myUserName
cp /shared_data/genome_db/BLAST_NCBI/swissprot* ./
### protein query (blastp)
blastp -num_threads 8 -query test.fa -db swissprot -out blastresults.xml -max_target_seqs 20 -evalue 1e-5 -outfmt 5 -culling_limit 10 >& logfile &
### nucleotide query (blastx)
blastx -num_threads 8 -query test.fa -db swissprot -out blastresults.xml -max_target_seqs 20 -evalue 1e-5 -outfmt 5 -culling_limit 10 >& logfile &
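Because both commands run in the background, it can help to check how far a job has progressed. The sketch below assumes that the BLAST+ XML output (-outfmt 5) contains one <Iteration> element per query, with each opening tag on its own line, and compares that count with the number of FASTA records. The count is approximate while the job is still writing. The function name blast_progress is hypothetical; the file names match the commands above.

```shell
# Report how many queries appear in the XML output so far,
# as "finished/total".
# Assumption: one <Iteration> opening tag per query, one per line.
blast_progress() {
  local fasta="$1"   # query FASTA file, e.g. test.fa
  local xml="$2"     # BLAST XML output, e.g. blastresults.xml
  local total finished
  total=$(grep -c '^>' "$fasta")
  finished=$(grep -c '<Iteration>' "$xml")
  echo "${finished}/${total}"
}

# Example:
#   blast_progress test.fa blastresults.xml
#   tail logfile    # look for error messages from BLAST
```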