1.1 Set up your Claude account (if needed)
If you have not set up your Claude account on a BioHPC server, follow the instructions here.
https://biohpc.cornell.edu/doc/setup_account_claude.html
1.2 Connect to your assigned server.
Find your assigned server on this page:
https://biohpc.cornell.edu/ww/machines.aspx?i=169
On Cornell campus or using Cornell VPN:
ssh your_user_id@cbsuxxxxxx.biohpc.cornell.edu
Off campus (without VPN):
ssh your_user_id@cbsulogin.biohpc.cornell.edu
ssh cbsuxxxxxx
1.3 Prepare your working directory and data
mkdir /workdir/$USER
cp -r /shared_data/RNAseq/exercise1 /workdir/$USER/
cd /workdir/$USER/exercise1
ls -l
You should see:
FASTQ files (6 samples):
ERR458493.fastq.gz
ERR458494.fastq.gz
ERR458495.fastq.gz
ERR458500.fastq.gz
ERR458501.fastq.gz
ERR458502.fastq.gz
Metadata file
sampleMeta.txt
Reference files:
R64.fa
R64.gtf
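The listing above can also be checked automatically. The following is a minimal sketch that uses only the file names listed above; the function name `check_inputs` is hypothetical and not part of the workshop materials.

```shell
# check_inputs: report any expected exercise file missing from a directory.
# The function name is illustrative; the file names come from the listing above.
check_inputs() {
    local dir="${1:-.}"
    local missing=0
    local f
    for f in ERR458493.fastq.gz ERR458494.fastq.gz ERR458495.fastq.gz \
             ERR458500.fastq.gz ERR458501.fastq.gz ERR458502.fastq.gz \
             sampleMeta.txt R64.fa R64.gtf; do
        if [ ! -e "$dir/$f" ]; then
            echo "MISSING: $f"
            missing=$((missing + 1))
        fi
    done
    echo "$missing file(s) missing"
    # Return success only when nothing is missing.
    [ "$missing" -eq 0 ]
}

# Usage: check_inputs "/workdir/$USER/exercise1"
```

If any file is reported missing, repeating the copy step is likely the simplest fix.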
1.4 Inspect the two guardrail files.
cp /programs/ai_pipelines/AGENTS.md /home/$USER/.claude/CLAUDE.md
cp /programs/ai_pipelines/rnaseq/*.md /workdir/$USER/exercise1/
cat /home/$USER/.claude/CLAUDE.md
cat /workdir/$USER/exercise1/rnaseq.md
cat /workdir/$USER/exercise1/rnaseq_genomics_core.md
Descriptions:
CLAUDE.md: System prompt provided from the Bioinformatics Facility with usage guidelines.
rnaseq.md: nf-core RNA-seq workflow protocol (modifiable; e.g., CPU usage).
rnaseq_genomics_core.md: Protocol matching Cornell Genomics Facility standards.
You may use either RNA-seq protocol for this project. If you want to run both protocols, run them in two separate sessions.
2.1 Start Claude Code
cd /workdir/$USER/exercise1
claude
If your Claude account has not been linked, you will be prompted to connect it.
Use number keys or arrow keys + Enter to navigate prompts.
2.2 Default file access policy
By default, Claude Code always asks for permission before modifying or deleting a file. If this default policy has been changed, you can find the setting in the file ~/.claude/settings.json.
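To see whether a user-level settings file exists at all, you can print it from the shell. This is a minimal sketch that assumes only the path given above; `show_claude_settings` is a hypothetical helper name, and the contents of the file are not interpreted here.

```shell
# show_claude_settings: print the user-level Claude Code settings file, if present.
# Only the path mentioned in the text is checked; the helper name is illustrative.
show_claude_settings() {
    local settings="$HOME/.claude/settings.json"
    if [ -f "$settings" ]; then
        echo "Found $settings:"
        cat "$settings"
    else
        echo "No $settings found"
    fi
}

# Usage: show_claude_settings
```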
2.3 Generate the RNA-seq pipeline script
read rnaseq.md
Create a script run_rnaseq.sh to run rna-seq data analysis using 10 CPU cores. Output directory: results.
⏱ Runtime: ~2 minutes. Claude will generate a script.
Exit Claude:
Press Ctrl + C, or
Type /exit
To resume later:
claude --continue
2.4 Run the RNA-seq data analysis script
Inspect the script and the formatted sample file:
cat run_rnaseq.sh
cat samplesheet.csv
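As part of the inspection, it can help to confirm that every FASTQ path in the sample sheet points to an existing file. The sketch below assumes the sheet is comma-separated with a header row containing a `fastq_1` column (the nf-core/rnaseq convention) and has no quoted fields; the sheet generated in your session may differ. The function name `check_samplesheet` is hypothetical.

```shell
# check_samplesheet: verify that every file in the fastq_1 column exists.
# Assumes a simple comma-separated sheet with a header row containing "fastq_1";
# relative paths are resolved from the current directory.
check_samplesheet() {
    local sheet="${1:-samplesheet.csv}"
    local missing=0
    local path
    if ! head -n 1 "$sheet" | grep -q 'fastq_1'; then
        echo "No fastq_1 column in $sheet"
        return 2
    fi
    # awk locates the fastq_1 column from the header, then prints that column.
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        if [ ! -e "$path" ]; then
            echo "MISSING: $path"
            missing=$((missing + 1))
        fi
    done <<EOF
$(awk -F',' '
    { sub(/\r$/, "") }
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "fastq_1") col = i; next }
    col     { print $col }
' "$sheet")
EOF
    echo "$missing FASTQ file(s) missing"
    [ "$missing" -eq 0 ]
}

# Usage: check_samplesheet samplesheet.csv
```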
Then run
./run_rnaseq.sh
⏱ Runtime: ~5–10 minutes (small training dataset)
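If you want to keep a record of the run, the script's output can be copied to a log file while it is still shown on screen. This is a minimal sketch; the helper name `run_logged` and the default log file name are assumptions, not part of the workshop materials.

```shell
# run_logged: run a script with bash and keep a copy of its output in a log file.
# The log name defaults to the script name with .sh replaced by .log.
# Unless "set -o pipefail" is active, the exit status is tee's, not the script's,
# so check the log for errors.
run_logged() {
    local script="$1"
    local log="${2:-${script%.sh}.log}"
    bash "$script" 2>&1 | tee "$log"
}

# Usage: run_logged run_rnaseq.sh
```

Running the script through `bash` also sidesteps a missing executable bit on the generated file.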
Optional (skip computation and copy the pre-made results):
cp -r /shared_data/RNAseq/exercise1_results /workdir/$USER/exercise1/results
Pre-made results with Genomics Facility protocol: /shared_data/RNAseq/exercise1_results
2.5 Verify results
Resume Claude:
cd /workdir/$USER/exercise1
claude --continue
Ask:
Can you check the output in the results directory?
Other useful prompts:
What files should I check?
What should I do next?
You can download HTML reports using FileZilla and view them locally.
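To find which HTML reports exist before downloading them, you can list them from the shell. This is a minimal sketch, assuming the output directory is named results as in the prompt above; `list_reports` is a hypothetical helper name.

```shell
# list_reports: list HTML reports under a results directory, sorted by path.
# The directory defaults to "results"; the helper name is illustrative.
list_reports() {
    local dir="${1:-results}"
    find "$dir" -type f -name '*.html' | sort
}

# Usage: list_reports results
```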
2.6 Downstream data analysis
Example tasks:
Identify differentially expressed genes.
Make a PCA plot of the samples, mu in blue, and wt in red. Using triangles for mu and circles for wt.
If Claude creates but does not run a script:
Run this script for me.
Plots are saved as .png files. You can:
Download via FileZilla
View in VS Code (recommended)
2.7 Functional over-representation analysis (ORA).
ORA requires a GO annotation file. For most model organisms, GO annotation files are available online.
If the GO annotation file is not available, generate it in two steps:
Ask the agent to create a protein fasta file:
Create a protein sequence fasta file using the genome fasta and the gtf file
Use BLAST2GO on BioHPC to generate the GO annotation: https://biohpc.cornell.edu/lab/userguide.aspx?a=software&i=73#c
For this workshop, use the pre-made GO annotation file: R64.go.txt
Run functional over-representation analysis using topGO with R64.go.txt. Use R version 4.4.3.
Alternative:
Use clusterProfiler for ORA.
Adjust thresholds:
Redo topGO using genes with log2 fold change > 2.
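Before rerunning the enrichment, it can be useful to see how many genes pass such a threshold. The sketch below assumes a tab-separated differential expression table with a header row and a column named `log2FoldChange` (the DESeq2 column name); the file name and columns in your session depend on what Claude generated. The helper name `filter_lfc` is hypothetical.

```shell
# filter_lfc: print the header and the rows whose log2 fold change exceeds a threshold.
# Assumes a tab-separated file with a header row; the column name defaults to
# "log2FoldChange" and the threshold to 2. Rows with NA in that column are skipped.
filter_lfc() {
    local file="$1"
    local threshold="${2:-2}"
    local colname="${3:-log2FoldChange}"
    awk -F'\t' -v thr="$threshold" -v name="$colname" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; print; next }
        col && $col != "NA" && ($col + 0) > (thr + 0)
    ' "$file"
}

# Usage (hypothetical file name): filter_lfc de_results.tsv 2 | wc -l
```

As written, the filter keeps only genes above the threshold in the positive direction, matching the wording of the prompt; selecting strongly down-regulated genes as well would require comparing the absolute value instead.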
Documentation is essential in research.
Example prompts:
Organize all generated scripts into a scripts directory.
Add a README file describing each script.
Write a project summary including software versions for manuscript use.
VS Code improves visualization and workflow.
Setup instructions: https://biohpc.cornell.edu/doc/setup_account_claude.html
VS Code layout:
File explorer (left): Open files and images
AI Agent (right): Manage agent sessions
Editor (center): View/edit files
Terminal (bottom): Run commands.
Running Claude in VS Code
Option 1: Terminal
claude
Option 2: Agent panel (GUI)
Key concepts:
Switching agents:
At the top of the panel, you can select which Agent to use for your project:
Codex: OpenAI agent
Claude: Anthropic agent
Chat: Microsoft Copilot agent
Project directory = folder opened in VS Code
Open with: Ctrl + Shift + P → Open Folder
Session management:
Resume previous session (default)
Start new session via “New Chat” button (upper right corner)
Make a PCA plot of the samples, wt in blue and mu in red.
The .png will appear in the file explorer—double-click to view.