institute of biotechnology >> brc >> bioinformatics >> internal >> biohpc cloud: user guide
 

BioHPC Cloud:
: User Guide

 

 


BioHPC Cloud Software

There are 1120 software titles installed in BioHPC Cloud. The sofware is available on all machines (unless stated otherwise in notes), complete list of programs is below, please click on a title to see details and instructions. Tabular list of software is available here

Please read details and instructions before running any program, it may contain important information on how to properly use the software in BioHPC Cloud.

3D Slicer, 3d-dna, 454 gsAssembler or gsMapper, a5, ABRicate, ABruijn, ABySS, AdapterRemoval, adephylo, Admixtools, Admixture, AF_unmasked, AFProfile, AGAT, agrep, albacore, Alder, AliTV-Perl interface, AlleleSeq, ALLMAPS, ALLPATHS-LG, Alphafold, alphapickle, Alphapulldown, AMOS, AMPHORA, amplicon.py, AMRFinder, analysis, ANGSD, AnnotaPipeline, Annovar, ant, antiSMASH, anvio, apollo, arcs, ARGweaver, aria2, ariba, Arlequin, ART, ASEQ, aspera, assembly-stats, ASTRAL, atac-seq-pipeline, ataqv, athena_meta, ATLAS, Atlas-Link, ATLAS_GapFill, atom, ATSAS, Augustus, AWS command line interface, AWS v2 Command Line Interface, axe, axel, BA3, BactSNP, bakta, bamsnap, bamsurgeon, bamtools, bamUtil, barcode_splitter, BarNone, Basset, BayeScan, Bayescenv, bayesR, baypass, bazel, BBMap/BBTools, BCFtools, BCL convert, bcl2fastq, BCP, Beagle, Beast2, bedops, BEDtools, bettercallsal, bfc, bgc, bgen, bicycle, BiG-SCAPE, bigQF, bigtools, bigWig, bioawk, biobakery, biobambam, Bioconductor, biom-format, BioPerl, BioPython, Birdsuite, Bismark, Blackbird, blasr, BLAST, BLAST_to_BED, blast2go, BLAT, BlobToolKit, BLUPF90, BMGE, bmtagger, bonito, Boost, Bowtie, Bowtie2, BPGA, Bracken, BRAKER, BRAT-NextGen, BRBseqTools, BreedingSchemeLanguage, breseq, brocc, bsmap, BSseeker2, BUSCO, BUSCO Phylogenomics, BWA, bwa-mem2, bwa-meth, bwtool, cactus, CAFE, caffe, cagee, canu, Canvas, CAP3, caper, CarveMe, catch, cBar, CBSU RNAseq, CCMetagen, CCTpack, cd-hit, cdbfasta, cdo, CEGMA, CellRanger, cellranger-arc, cellranger-atac, cellranger-dna, centrifuge, centroFlye, CFM-ID, CFSAN SNP pipeline, CheckM, CheckM2, chimera, chimerax, chip-seq-pipeline, chromosomer, Circlator, Circos, Circuitscape, CITE-seq-Count, ClermonTyping, clues, CLUMPP, clust, Clustal Omega, CLUSTALW, Cluster, cmake, CMSeq, CNVnator, coinfinder, colabfold, CombFold, Comparative-Annotation-Toolkit, compat, CONCOCT, Conda, Cooler, copyNumberDiff, cortex_var, CoverM, crabs, CRISPRCasFinder, CRISPResso, crispron, Cromwell, CrossMap, CRT, cuda, Cufflinks, curatedMetagenomicDataTerminal, cutadapt, cuteSV, dadi, dadi-1.6.3_modif, dadi-cli, danpos, DAS_Tool, DBSCAN-SWA, dDocent, DeconSeq, Deepbinner, deeplasmid, DeepTE, deepTools, Deepvariant, defusion, delly, DESMAN, destruct, DETONATE, dfast, diamond, dipcall, diploSHIC, discoal, Discovar, Discovar de novo, distruct, DiTASiC, DIYABC, dnmtools, Docker, dorado, DRAM, dREG, dREG.HD, drep, Drop-seq, dropEst, dropSeqPipe, dsk, dssat, Dsuite, dTOX, duphold, DWGSIM, dynare, ea-utils, ecopcr, ecoPrimers, ectyper, EDGE, edirect, EDTA, eems, EgaCryptor, EGAD, eggnog-mapper, EIGENSOFT, elai, ElMaven, EMBLmyGFF3, EMBOSS, EMIRGE, Empress, enfuse, EnTAP, entropy, epa-ng, ephem, epic2, ermineJ, ete3, EukDetect, EukRep, EVM, exabayes, exonerate, ExpansionHunterDenovo-v0.8.0, eXpress, FALCON, FALCON_unzip, Fast-GBS, fasta, FastANI, fastcluster, FastME, FastML, fastp, FastQ Screen, fastq-multx-1.4.3, fastq_demux, fastq_pair, fastq_species_detector, FastQC, fastqsplitter, fastsimcoal2, fastspar, fastStructure, FastTree, FASTX, fcs, feems, feh, FFmpeg, fgbio, figaro, Filtlong, fineRADstructure, fineSTRUCTURE, FIt-SNE, flash, flash2, flexbar, Flexible Adapter Remover, Flye, FMAP, FragGeneScan, FragGeneScan, FRANz, freebayes, FSA, funannotate, FunGene Pipeline, FunOMIC, G-PhoCS, GADMA, GAEMR, Galaxy, Galaxy in Docker, GATK, gatk4, gatk4amplicon.py, gblastn, Gblocks, GBRS, gcc, GCTA, GDAL, gdc-client, GEM library, GEMMA, GeMoMa, GENECONV, geneid, GeneMark, Genespace, genomad, Genome STRiP, Genome Workbench, GenomeMapper, Genomescope, GenomeThreader, genometools, GenomicConsensus, genozip, gensim, GEOS, germline, gerp++, GET_PHYLOMARKERS, gfaviz, GffCompare, gffread, giggle, git, glactools, GlimmerHMM, GLIMPSE, GLnexus, Globus connect personal, GMAP/GSNAP, GNU Compilers, GNU parallel, go-perl, GO2MSIG, GONE, GoShifter, gradle, graftM, grammy, GraPhlAn, graphtyper, graphviz, greenhill, GRiD, gridss, Grinder, grocsvs, GROMACS, GroopM, GSEA, gsort, GTDB-Tk, GTFtools, Gubbins, gunc, GUPPY, hail, hal, HapCompass, HAPCUT, HAPCUT2, hapflk, HaploMerger, Haplomerger2, haplostrips, HaploSync, HapSeq2, HarvestTools, haslr, hdf5, hget, hh-suite, HiC-Pro, hic_qc, HiCExplorer, HiFiAdapterFilt, hifiasm, hificnv, HISAT2, HMMER, Homer, HOTSPOT, HTSeq, htslib, https://github.com/CVUA-RRW/RRW-PrimerBLAST, hugin, humann, HUMAnN2, hybpiper, hyperopt, HyPhy, hyphy-analyses, iAssembler, IBDLD, idba, IDBA-UD, IDP-denovo, idr, idseq, IgBLAST, IGoR, IGV, IMa2, IMa2p, IMAGE, ImageJ, ImageMagick, Immcantation, impute2, impute5, IMSA-A, INDELseek, infernal, Infomap, inStrain, inStrain_lite, InStruct, Intel MKL, InteMAP, InterProScan, ipyrad, IQ-TREE, iRep, JaBbA, jags, Jane, java, jbrowse, JCVI, jellyfish, juicer, julia, jupyter, jupyterlab, kaiju, kallisto, Kent Utilities, keras, khmer, kinfin, king, kma, KMC, KmerFinder, KmerGenie, kneaddata, kraken, KrakenTools, KronaTools, kSNP, kWIP, LACHESIS, lammps, LAPACK, LAST, lastz, lcMLkin, LDAK, LDhat, LeafCutter, leeHom, lep-anchor, Lep-MAP3, LEVIATHAN, lftp, Liftoff, Lighter, LinkedSV, LINKS, localcolabfold, LocARNA, LocusZoom, lofreq, longranger, Loupe, LS-GKM, LTR_retriever, LUCY, LUCY2, LUMPY, lyve-SET, m6anet, MACE, MACS, MaCS simulator, MACS2, macs3, maffilter, MAFFT, mafTools, MAGeCK, MAGeCK-VISPR, Magic-BLAST, magick, MAGScoT, MAKER, manta, mapDamage, mapquik, MAQ, MARS, MASH, mashtree, Mashtree, MaSuRCA, MATLAB, Matlab_runtime, Mauve, MaxBin, MaxQuant, McClintock, mccortex, mcl, MCscan, MCScanX, medaka, medusa, megahit, MeGAMerge, MEGAN, MELT, MEME Suite, MERLIN, merqury, MetaBAT, MetaBinner, MetaboAnalystR, MetaCache, MetaCRAST, metaCRISPR, metamaps, MetAMOS, MetaPathways, MetaPhlAn, metapop, metaron, MetaVelvet, MetaVelvet-SL, metaWRAP, methpipe, mfeprimer, MGmapper, MicrobeAnnotator, microtrait, MiFish, Migrate-n, mikado, MinCED, minigraph, Minimac3, Minimac4, minimap2, mira, miRDeep2, mirge3, miRquant, MISO, MITObim, MitoFinder, mitohelper, MitoHiFi, mity, MiXCR, MixMapper, MKTest, mlift, mlst, MMAP, MMSEQ, MMseqs2, MMTK, MobileElementFinder, modeltest, MODIStsp-2.0.5, module, moments, MoMI-G, mongo, mono, monocle3, mosdepth, mothur, MrBayes, mrsFAST, msdial, msld, MSMC, msprime, MSR-CA Genome Assembler, msstats, MSTMap, mugsy, MultiQC, multiz-tba, MUMandCo, MUMmer, mummer2circos, muscle, MUSIC, Mutation-Simulator, muTect, MZmine, nag-compiler, nanocompore, nanofilt, NanoPlot, Nanopolish, nanovar, ncbi_datasets, ncftp, ncl, NECAT, Nemo, Netbeans, NEURON, new_fugue, Nextflow, NextGenMap, NextPolish2, nf-core/rnaseq, ngmlr, NGS_data_processing, NGSadmix, ngsDist, ngsF, ngsLD, NGSNGS, NgsRelate, ngsTools, NGSUtils, NINJA, NLR-Annotator, NLR-Parser, Novoalign, NovoalignCS, nQuire, NRSA, NuDup, numactl, nvidia-docker, nvtop, Oases, OBITools, Octave, OMA, Oneflux, OpenBLAS, openmpi, openssl, orthodb-clades, OrthoFinder, orthologr, Orthomcl, pacbio, PacBioTestData, PAGIT, pal2nal, paleomix, PAML, panaroo, pandas, pandaseq, pandoc, PanPhlAn, Panseq, Parsnp, PASA, PASTEC, PAUP*, pauvre, pb-assembly, pbalign, pbbam, pbh5tools, PBJelly, pblat, pbmm2, PBSuite, pbsv, pbtk, PCAngsd, pcre, pcre2, PeakRanger, PeakSplitter, PEAR, PEER, PennCNV, peppro, PERL, PfamScan, pgap, PGDSpider, ph5tools, Phage_Finder, pharokka, phasedibd, PHAST, phenopath, Phobius, PHRAPL, PHYLIP, PhyloCSF, phyloFlash, phylophlan*, PhyloPhlAn2, phylophlan3, phyluce, PhyML, phyx, Picard, PICRUSt2, pigz, Pilon, Pindel, piPipes, PIQ, PlasFlow, platanus, Platypus, plink, plink2, Plotly, plotsr, Point Cloud Library, popbam, PopCOGenT, PopLDdecay, Porechop, poretools, portcullis, POUTINE, pplacer, PRANK, preseq, pretext-suite, primalscheme, primer3, PrimerBLAST, PrimerPooler, prinseq, prodigal, progenomics, progressiveCactus, PROJ, prokka, Proseq2, ProtExcluder, protolite, PSASS, psmc, psutil, pullseq, purge_dups, pyani, PyCogent, pycoQC, pyfaidx, pyGenomeTracks, PyMC, pymol-open-source, pyopencl, pypy, pyRAD, Pyro4, pyseer, PySnpTools, python, PyTorch, PyVCF, qapa, qcat, QIIME, QIIME2, QTCAT, Quake, Qualimap, QuantiSNP2, QUAST, quickmerge, QUMA, R, RACA, racon, rad_haplotyper, RADIS, RadSex, RagTag, rapt, RAPTR-SV, RATT, raven, RAxML, raxml-ng, Ray, rck, rclone, Rcorrector, RDP Classifier, REAGO, REAPR, Rebaler, Red, ReferenceSeeker, regenie, regtools, Relate, RelocaTE2, Repbase, RepeatMasker, RepeatModeler, RERconverge, ReSeq, RevBayes, RFdiffusion, RFMix, RGAAT, rgdal, RGI, Rgtsvm, Ribotaper, ripgrep, rJava, rMATS, RNAMMER, rnaQUAST, Rnightlights, Roary, Rockhopper, rohan, RoseTTAFold-All-Atom, RoseTTAFold2NA, rphast, Rqtl, Rqtl2, RSAT, RSEM, RSeQC, RStudio, rtfbs_db, ruby, run_dbcan, sabre, SaguaroGW, salmon, SALSA, Sambamba, samblaster, sample, SampleTracker, samplot, samtabix, Samtools, Satsuma, Satsuma2, SCALE, scanorama, scikit-learn, Scoary, scythe, seaborn, SEACR, SecretomeP, self-assembling-manifold, selscan, Sentieon, seqfu, seqkit, SeqPrep, seqtk, SequelTools, sequenceTubeMap, Seurat, sf, sgrep, sgrep sorted_grep, SHAPEIT, SHAPEIT4, SHAPEIT5, shasta, Shiny, shore, SHOREmap, shortBRED, SHRiMP, sickle, sift4g, SignalP, SimPhy, simuPOP, sina, singularity, sinto, sirius, sistr_cmd, SKESA, skewer, SLiM, SLURM, smap, smcpp, smoove, SMRT Analysis, SMRT LINK, snakemake, snap, SnapATAC, SNAPP, SnapTools, snATAC, SNeP, Sniffles, snippy, snp-sites, SnpEff, SNPgenie, SNPhylo, SNPsplit, SNVPhyl, SOAP2, SOAPdenovo, SOAPdenovo-Trans, SOAPdenovo2, SomaticSniper, sorted_grep, spaceranger, SPAdes, SPALN, SparCC, sparsehash, SPARTA, split-fasta, sqlite, SqueezeMeta, SQuIRE, SRA Toolkit, srst2, stacks, Stacks 2, stairway-plot, stampy, STAR, Starcode, statmodels, stellarscope, STITCH, STPGA, StrainPhlAn, strawberry, Strelka, stringMLST, StringTie, STRUCTURE, Structure_threader, Struo2, stylegan2-ada-pytorch, subread, sumatra, supernova, suppa, SURPI, surpyvor, SURVIVOR, sutta, SV-plaudit, SVaBA, SVclone, SVDetect, svengine, SVseq2, svtools, svtyper, svviz2, SWAMP, sweed, SweepFinder, SweepFinder2, sweepsims, swiss2fasta.py, sword, syri, tabix, tagdust, Taiji, Tandem Repeats Finder (TRF), tardis, TargetP, TASSEL 3, TASSEL 4, TASSEL 5, tax_myPHAGE, tbl2asn, tcoffee, telescope, TensorFlow, TEToolkit, TEtranscripts, texlive, TFEA, tfTarget, thermonucleotideBLAST, ThermoRawFileParser, TMHMM, tmux, Tomahawk, TopHat, Torch, traitRate, Trans-Proteomic Pipeline (TPP), TransComb, TransDecoder, TRANSIT, transrate, TRAP, tree, treeCl, treemix, Trim Galore!, trimal, trimmomatic, Trinity, Trinotate, TrioCNV2, tRNAscan-SE, Trycycler, UCSC Kent utilities, ultraplex, UMAP, UMI-tools, umi-transfer, UMIScripts, Unicycler, UniRep, unitig-caller, unrar, usearch, VALET, valor, vamb, Variant Effect Predictor, VarScan, VCF-kit, vcf2diploid, vcfCooker, vcflib, vcftools, vdjtools, Velvet, vep, VESPA, vg, Vicuna, ViennaRNA, VIP, viral-ngs, virmap, VirSorter, VirusDetect, VirusFinder 2, vispr, VizBin, vmatch, vsearch, vt, WASP, webin-cli, wget, wgs-assembler (Celera), WGSassign, What_the_Phage, wiggletools, windowmasker, wine, Winnowmap, Wise2 (Genewise), wombat, Xander_assembler, xpclr, yaha, yahs

Details for MAKER (If the copy-pasted commands do not work, use this tool to remove unwanted characters)

Name:MAKER
Version:3.01
OS:Linux
About:MAKER is an easy-to-configure, portable genome annotation pipeline. MAKER allows smaller eukaryotic genome projects and prokaryotic genome projects to annotate their genomes and to create genome databases.
Added:1/23/2013 4:46:28 PM
Updated:7/18/2020 3:24:14 PM
Link:http://gmod.org/wiki/MAKER
Notes:

#Maker is only available to academic users (https://github.com/Yandell-Lab/maker/blob/master/LICENSE )

#set environment 

export PATH=/programs/maker-3.01/bin:$PATH

 

Here is the instruction to run maker with MPI. You need to copy maker software to the local /workdir.

cd /workdir

cp -r /programs/Augustus-3.3.2/config/ /workdir/

 

## create a working directory for your data, and put your genome fasta file in your working directory

mkdir /workdir/$USER

cd /workdir/$USER

## create configure file

maker -CTL


## create a tmp directory
mkdir /workdir/tmp

## Modify the maker_opts.ctl file created by the previous command

#1. modify the last line of the maker_opts.ctl, so that it looks like: TMP=/workdir/tmp

#2. modify the parameters in maker_opts.ctl and maker_bopts.ctl that are appropriate for your project

##: Run mpi maker.

## Note: As maker take several days to finish, you need to run maker with "screen" or "nohup" so that the job will keep running in the background. 

## The following instruction uses "screen" (Documentation for "screen": https://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/ ) :

screen

export PATH=/workdir/maker/bin:$PATH

export PATH=/programs/snap:$PATH
export AUGUSTUS_CONFIG_PATH=/workdir/config
export ZOE=/programs/snap/Zoe

export LD_LIBRARY_PATH=/programs/boost_1_80_0/lib

## before running the next step, check how many CPU cores that your reserved computer has. You can find the information at "My Reservation" page of the  BioHPC web site under "System info". Replace "N" in the next command with the number of cores.

mpiexec -n N maker -qq >& log &

## press "ctrl-a" followed by press "d" to detach from "screen", now you can safely close your laptop.

## to re-attach to "screen", use the command "screen -r".

 

Please note, the maker software created large number of intermediate files. With our current file system, it is not recommended to save large number of small files in home directory.   If you want to save the maker results from /workdir to your home directory, make sure you make a tarball, and copy the tar file to home directory. For example: 

 

tar -cvfz my.maker.output.tar.gz my.maker.output

When you want to work on the directory later, you can copy the .tar.gz  file to work directory, do 

tar -xvfz my.maker.output.tar.gz

 

#instructions to install

You will need to install latest Maker by yourself, as it requires an academic license. Follow instructions in this page to licence and download Maker: http://www.yandell-lab.org/software/maker.html

After you download Maker, keep the file maker-3.01.03.tgz under /workdir.

cd /workdir
tar xvfz maker-3.01.03.tgz
cd maker/src
perl Build.PL
#when prompted: answer: 1. Y ; 2. use default 3. use default
./Build install

Now the maker is ready for use. You might want to keep a copy of maker directory in your home directory, so that it can be used later

A protocol for using maker can be found at https://biohpc.cornell.edu/doc/annotation_2019_exercises1_v2.html

Maker is slow, and it could take up to a week to finish a maker run with MPI, and much longer without MPI. It is recommended then you run maker with MPI, and use medium memory generation 2 (for medium size job) or large memory generation 2 computer (for large genome)  on BioHPC. If a job is stopped during the run, it can be restarted from where it stops. 

 


Notify me if this software is upgraded or changed [You need to be logged in to use this feature]

 

Website credentials: login  Web Accessibility Help