 

BioHPC Cloud: User Guide

 

 


BioHPC Cloud Software

There are 1095 software titles installed in BioHPC Cloud. The software is available on all machines (unless stated otherwise in the notes). The complete list of programs is below; please click on a title to see details and instructions. A tabular list of software is available here.

Please read the details and instructions before running any program; they may contain important information on how to properly use the software in BioHPC Cloud.

3D Slicer, 3d-dna, 454 gsAssembler or gsMapper, a5, ABRicate, ABruijn, ABySS, AdapterRemoval, adephylo, Admixtools, Admixture, AF_unmasked, AFProfile, AGAT, agrep, albacore, Alder, AliTV-Perl interface, AlleleSeq, ALLMAPS, ALLPATHS-LG, Alphafold, AMOS, AMPHORA, amplicon.py, AMRFinder, analysis, ANGSD, AnnotaPipeline, Annovar, ant, antiSMASH, anvio, apollo, arcs, ARGweaver, aria2, ariba, Arlequin, ART, ASEQ, aspera, assembly-stats, ASTRAL, atac-seq-pipeline, ataqv, athena_meta, ATLAS, Atlas-Link, ATLAS_GapFill, atom, ATSAS, Augustus, AWS command line interface, AWS v2 Command Line Interface, axe, axel, BA3, BactSNP, bakta, bamsnap, bamsurgeon, bamtools, bamUtil, barcode_splitter, BarNone, Basset, BayeScan, Bayescenv, bayesR, baypass, bazel, BBMap/BBTools, BCFtools, bcl2fastq, BCP, Beagle, Beast2, bedops, BEDtools, bfc, bgc, bgen, bicycle, BiG-SCAPE, bigQF, bigWig, bioawk, biobakery, biobambam, Bioconductor, biom-format, BioPerl, BioPython, Birdsuite, Bismark, Blackbird, blasr, BLAST, BLAST_to_BED, blast2go, BLAT, BlobToolKit, BLUPF90, BMGE, bmtagger, bonito, Boost, Bowtie, Bowtie2, BPGA, Bracken, BRAKER, BRAT-NextGen, BRBseqTools, BreedingSchemeLanguage, breseq, brocc, bsmap, BSseeker2, BUSCO, BUSCO Phylogenomics, BWA, bwa-mem2, bwa-meth, bwtool, cactus, CAFE, caffe, cagee, canu, Canvas, CAP3, caper, CarveMe, catch, cBar, CBSU RNAseq, CCMetagen, CCTpack, cd-hit, cdbfasta, cdo, CEGMA, CellRanger, cellranger-arc, cellranger-atac, cellranger-dna, centrifuge, centroFlye, CFM-ID, CFSAN SNP pipeline, CheckM, CheckM2, chimera, chimerax, chip-seq-pipeline, chromosomer, Circlator, Circos, Circuitscape, CITE-seq-Count, ClermonTyping, clues, CLUMPP, clust, Clustal Omega, CLUSTALW, Cluster, cmake, CMSeq, CNVnator, coinfinder, colabfold, CombFold, compat, CONCOCT, Conda, Cooler, copyNumberDiff, cortex_var, CoverM, crabs, CRISPRCasFinder, CRISPResso, Cromwell, CrossMap, CRT, cuda, Cufflinks, curatedMetagenomicDataTerminal, cutadapt, cuteSV, dadi, dadi-1.6.3_modif, dadi-cli, 
danpos, DAS_Tool, DBSCAN-SWA, dDocent, DeconSeq, Deepbinner, deeplasmid, DeepTE, deepTools, Deepvariant, defusion, delly, DESMAN, destruct, DETONATE, diamond, dipcall, diploSHIC, discoal, Discovar, Discovar de novo, distruct, DiTASiC, DIYABC, dnmtools, Docker, dorado, DRAM, dREG, dREG.HD, drep, Drop-seq, dropEst, dropSeqPipe, dsk, dssat, Dsuite, dTOX, duphold, DWGSIM, dynare, ea-utils, ecopcr, ecoPrimers, ectyper, EDGE, edirect, EDTA, eems, EgaCryptor, EGAD, EIGENSOFT, elai, ElMaven, EMBLmyGFF3, EMBOSS, EMIRGE, Empress, enfuse, EnTAP, entropy, epa-ng, ephem, epic2, ermineJ, ete3, EukDetect, EukRep, EVM, exabayes, exonerate, ExpansionHunterDenovo-v0.8.0, eXpress, FALCON, FALCON_unzip, Fast-GBS, fasta, FastANI, fastcluster, FastME, FastML, fastp, FastQ Screen, fastq-multx-1.4.3, fastq_demux, fastq_pair, fastq_species_detector, FastQC, fastqsplitter, fastsimcoal2, fastspar, fastStructure, FastTree, FASTX, fcs, feems, feh, FFmpeg, fgbio, figaro, Filtlong, fineRADstructure, fineSTRUCTURE, FIt-SNE, flash, flash2, flexbar, Flexible Adapter Remover, Flye, FMAP, FragGeneScan, FragGeneScan, FRANz, freebayes, FSA, funannotate, FunGene Pipeline, FunOMIC, G-PhoCS, GADMA, GAEMR, Galaxy, Galaxy in Docker, GATK, gatk4, gatk4amplicon.py, gblastn, Gblocks, GBRS, gcc, GCTA, GDAL, gdc-client, GEM library, GEMMA, GeMoMa, GENECONV, geneid, GeneMark, Genespace, genomad, Genome STRiP, Genome Workbench, GenomeMapper, GenomeThreader, genometools, GenomicConsensus, genozip, gensim, GEOS, germline, gerp++, GET_PHYLOMARKERS, gfaviz, GffCompare, gffread, giggle, git, glactools, GlimmerHMM, GLIMPSE, GLnexus, Globus connect personal, GMAP/GSNAP, GNU Compilers, GNU parallel, go-perl, GO2MSIG, GONE, GoShifter, gradle, graftM, grammy, GraPhlAn, graphtyper, graphviz, greenhill, GRiD, gridss, Grinder, grocsvs, GROMACS, GroopM, GSEA, gsort, GTDB-Tk, GTFtools, Gubbins, GUPPY, hail, hal, HapCompass, HAPCUT, HAPCUT2, hapflk, HaploMerger, Haplomerger2, haplostrips, HaploSync, HapSeq2, HarvestTools, haslr, 
hdf5, hget, hh-suite, HiC-Pro, hic_qc, HiCExplorer, HiFiAdapterFilt, hifiasm, hificnv, HISAT2, HMMER, Homer, HOTSPOT, HTSeq, htslib, https://github.com/CVUA-RRW/RRW-PrimerBLAST, hugin, humann, HUMAnN2, hybpiper, hyperopt, HyPhy, hyphy-analyses, iAssembler, IBDLD, idba, IDBA-UD, IDP-denovo, idr, idseq, IgBLAST, IGoR, IGV, IMa2, IMa2p, IMAGE, ImageJ, ImageMagick, Immcantation, impute2, impute5, IMSA-A, INDELseek, infernal, Infomap, inStrain, inStrain_lite, InStruct, Intel MKL, InteMAP, InterProScan, ipyrad, IQ-TREE, iRep, JaBbA, jags, Jane, java, jbrowse, JCVI, jellyfish, juicer, julia, jupyter, jupyterlab, kaiju, kallisto, Kent Utilities, keras, khmer, kinfin, king, kma, KmerFinder, KmerGenie, kneaddata, kraken, KrakenTools, KronaTools, kSNP, kWIP, LACHESIS, lammps, LAPACK, LAST, lastz, lcMLkin, LDAK, LDhat, LeafCutter, leeHom, lep-anchor, Lep-MAP3, LEVIATHAN, lftp, Liftoff, Lighter, LinkedSV, LINKS, localcolabfold, LocARNA, LocusZoom, lofreq, longranger, Loupe, LS-GKM, LTR_retriever, LUCY, LUCY2, LUMPY, lyve-SET, m6anet, MACE, MACS, MaCS simulator, MACS2, macs3, maffilter, MAFFT, mafTools, MAGeCK, MAGeCK-VISPR, Magic-BLAST, magick, MAGScoT, MAKER, manta, mapDamage, mapquik, MAQ, MARS, MASH, mashtree, Mashtree, MaSuRCA, MATLAB, Matlab_runtime, Mauve, MaxBin, MaxQuant, McClintock, mccortex, mcl, MCscan, MCScanX, medaka, medusa, megahit, MeGAMerge, MEGAN, MELT, MEME Suite, MERLIN, merqury, MetaBAT, MetaBinner, MetaboAnalystR, MetaCache, MetaCRAST, metaCRISPR, metamaps, MetAMOS, MetaPathways, MetaPhlAn, metapop, metaron, MetaVelvet, MetaVelvet-SL, metaWRAP, methpipe, mfeprimer, MGmapper, MicrobeAnnotator, MiFish, Migrate-n, mikado, MinCED, minigraph, Minimac3, Minimac4, minimap2, mira, miRDeep2, mirge3, miRquant, MISO, MITObim, MitoFinder, mitohelper, MitoHiFi, mity, MiXCR, MixMapper, MKTest, mlift, mlst, MMAP, MMSEQ, MMseqs2, MMTK, MobileElementFinder, modeltest, MODIStsp-2.0.5, module, moments, MoMI-G, mongo, mono, monocle3, mosdepth, mothur, MrBayes, mrsFAST, msld, 
MSMC, msprime, MSR-CA Genome Assembler, msstats, MSTMap, mugsy, MultiQC, multiz-tba, MUMandCo, MUMmer, mummer2circos, muscle, MUSIC, Mutation-Simulator, muTect, MZmine, nag-compiler, nanocompore, nanofilt, NanoPlot, Nanopolish, nanovar, ncftp, ncl, NECAT, Nemo, Netbeans, NEURON, new_fugue, Nextflow, NextGenMap, NextPolish2, nf-core/rnaseq, ngmlr, NGS_data_processing, NGSadmix, ngsDist, ngsF, ngsLD, NGSNGS, NgsRelate, ngsTools, NGSUtils, NINJA, NLR-Annotator, NLR-Parser, Novoalign, NovoalignCS, nQuire, NRSA, NuDup, numactl, nvidia-docker, nvtop, Oases, OBITools, Octave, OMA, Oneflux, OpenBLAS, openmpi, openssl, orthodb-clades, OrthoFinder, orthologr, Orthomcl, pacbio, PacBioTestData, PAGIT, pal2nal, paleomix, PAML, panaroo, pandas, pandaseq, pandoc, PanPhlAn, Panseq, Parsnp, PASA, PASTEC, PAUP*, pauvre, pb-assembly, pbalign, pbbam, pbh5tools, PBJelly, pblat, pbmm2, PBSuite, pbsv, pbtk, PCAngsd, pcre, pcre2, PeakRanger, PeakSplitter, PEAR, PEER, PennCNV, peppro, PERL, PfamScan, pgap, PGDSpider, ph5tools, Phage_Finder, pharokka, phasedibd, PHAST, phenopath, Phobius, PHRAPL, PHYLIP, PhyloCSF, phyloFlash, phylophlan*, PhyloPhlAn2, phylophlan3, phyluce, PhyML, Picard, PICRUSt2, pigz, Pilon, Pindel, piPipes, PIQ, PlasFlow, platanus, Platypus, plink, plink2, Plotly, plotsr, Point Cloud Library, popbam, PopCOGenT, PopLDdecay, Porechop, poretools, portcullis, POUTINE, pplacer, PRANK, preseq, primalscheme, primer3, PrimerBLAST, PrimerPooler, prinseq, prodigal, progenomics, progressiveCactus, PROJ, prokka, Proseq2, ProtExcluder, protolite, PSASS, psmc, psutil, pullseq, purge_dups, pyani, PyCogent, pycoQC, pyfaidx, pyGenomeTracks, PyMC, pymol-open-source, pyopencl, pypy, pyRAD, Pyro4, pyseer, PySnpTools, python, PyTorch, PyVCF, qapa, qcat, QIIME, QIIME2, QTCAT, Quake, Qualimap, QuantiSNP2, QUAST, quickmerge, QUMA, R, RACA, racon, rad_haplotyper, RADIS, RadSex, RagTag, rapt, RAPTR-SV, RATT, raven, RAxML, raxml-ng, Ray, rck, rclone, Rcorrector, RDP Classifier, REAGO, REAPR, 
Rebaler, Red, ReferenceSeeker, regenie, regtools, Relate, RelocaTE2, Repbase, RepeatMasker, RepeatModeler, RERconverge, ReSeq, RevBayes, RFdiffusion, RFMix, RGAAT, rgdal, RGI, Rgtsvm, Ribotaper, ripgrep, rJava, rMATS, RNAMMER, rnaQUAST, Rnightlights, Roary, Rockhopper, rohan, RoseTTAFold2NA, rphast, Rqtl, Rqtl2, RSAT, RSEM, RSeQC, RStudio, rtfbs_db, ruby, run_dbcan, sabre, SaguaroGW, salmon, SALSA, Sambamba, samblaster, sample, SampleTracker, samplot, samtabix, Samtools, Satsuma, Satsuma2, SCALE, scanorama, scikit-learn, Scoary, scythe, seaborn, SEACR, SecretomeP, self-assembling-manifold, selscan, Sentieon, seqfu, seqkit, SeqPrep, seqtk, SequelTools, sequenceTubeMap, Seurat, sf, sgrep, sgrep sorted_grep, SHAPEIT, SHAPEIT4, SHAPEIT5, shasta, Shiny, shore, SHOREmap, shortBRED, SHRiMP, sickle, sift4g, SignalP, SimPhy, simuPOP, singularity, sinto, sirius, sistr_cmd, SKESA, skewer, SLiM, SLURM, smap, smcpp, smoove, SMRT Analysis, SMRT LINK, snakemake, snap, SnapATAC, SNAPP, SnapTools, snATAC, SNeP, Sniffles, snippy, snp-sites, SnpEff, SNPgenie, SNPhylo, SNPsplit, SNVPhyl, SOAP2, SOAPdenovo, SOAPdenovo-Trans, SOAPdenovo2, SomaticSniper, sorted_grep, spaceranger, SPAdes, SPALN, SparCC, sparsehash, SPARTA, split-fasta, sqlite, SqueezeMeta, SQuIRE, SRA Toolkit, srst2, stacks, Stacks 2, stairway-plot, stampy, STAR, Starcode, statmodels, STITCH, STPGA, StrainPhlAn, strawberry, Strelka, stringMLST, StringTie, STRUCTURE, Structure_threader, Struo2, stylegan2-ada-pytorch, subread, sumatra, supernova, suppa, SURPI, surpyvor, SURVIVOR, sutta, SV-plaudit, SVaBA, SVclone, SVDetect, svengine, SVseq2, svtools, svtyper, svviz2, SWAMP, sweed, SweepFinder, SweepFinder2, sweepsims, swiss2fasta.py, sword, syri, tabix, tagdust, Taiji, Tandem Repeats Finder (TRF), tardis, TargetP, TASSEL 3, TASSEL 4, TASSEL 5, tbl2asn, tcoffee, TensorFlow, TEToolkit, TEtranscripts, texlive, TFEA, tfTarget, thermonucleotideBLAST, ThermoRawFileParser, TMHMM, tmux, Tomahawk, TopHat, Torch, traitRate, 
Trans-Proteomic Pipeline (TPP), TransComb, TransDecoder, TRANSIT, transrate, TRAP, tree, treeCl, treemix, Trim Galore!, trimal, trimmomatic, Trinity, Trinotate, TrioCNV2, tRNAscan-SE, Trycycler, UCSC Kent utilities, ultraplex, UMAP, UMI-tools, UMIScripts, Unicycler, UniRep, unitig-caller, unrar, usearch, valor, vamb, Variant Effect Predictor, VarScan, VCF-kit, vcf2diploid, vcfCooker, vcflib, vcftools, vdjtools, Velvet, vep, VESPA, vg, Vicuna, ViennaRNA, VIP, viral-ngs, virmap, VirSorter, VirusDetect, VirusFinder 2, vispr, VizBin, vmatch, vsearch, vt, WASP, webin-cli, wget, wgs-assembler (Celera), WGSassign, What_the_Phage, windowmasker, wine, Winnowmap, Wise2 (Genewise), wombat, Xander_assembler, xpclr, yaha, yahs

Details for Docker (If the copy-pasted commands do not work, use this tool to remove unwanted characters)

Name: Docker
Version: 20.10.17
OS: Linux
About: Executes applications in containers that are isolated from the main operating system (OS-level virtualization)
Added: 2/14/2017 3:44:40 PM
Updated: 6/2/2023 12:32:11 PM
Link: https://www.docker.com/
Notes:

This link points to our Docker Quick Start Guide - an example-based fast introduction to our Docker implementation.

This link points to our "Using Docker at BioHPC" virtual workshop, which is an in-depth, example-based introduction to Docker.

For more details read below.

Docker allows users to run applications in a way that is isolated from the host operating system. This prevents compatibility issues and makes it possible to run applications that would otherwise require custom installations, such as native Ubuntu programs on CentOS. It also allows users to install and run applications as administrators ("root"). Unfortunately, Docker does not sufficiently isolate users running applications as administrators from the main machine, so we had to deploy a modified version of Docker that is safe and secure in the BioHPC Lab environment and still gives users great freedom to run, install and modify applications.

IMPORTANT: The original Docker command is "docker". In BioHPC Cloud this command has been replaced by "docker1". Whenever you read a Docker book or website, replace "docker" with "docker1" when you want to run the command on BioHPC Lab machines. Most options and syntax are the same; the differences are discussed below. The syntax of any command can be displayed with "docker1 commandname --help", and "docker1 --help" will display all available commands. Please note that you need to put any Docker command options BEFORE the container or image name.
If you run "docker" instead of "docker1" you will get the error "Cannot connect to the Docker daemon. Is the docker daemon running on this host?" or "Got permission denied while trying to connect to the Docker daemon socket".
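
When copying commands from a generic Docker tutorial, the substitution described above can be scripted. The sketch below is only an illustration: the helper name to_docker1 is hypothetical and not part of BioHPC, and it changes nothing except a leading "docker" word, so any differences in options still have to be handled by hand.

```shell
# Hypothetical helper (not a BioHPC command): rewrite a command line copied
# from a generic Docker tutorial so that it calls docker1 instead of docker.
# Only a leading "docker" word is replaced; image names, paths and other
# words such as "docker-compose" are left untouched.
to_docker1() {
    printf '%s\n' "$1" | sed -E 's/^([[:space:]]*)docker([[:space:]]|$)/\1docker1\2/'
}

to_docker1 "docker run -it ubuntu /bin/bash"
# prints: docker1 run -it ubuntu /bin/bash
```

The function only prints the rewritten command, so the result can be reviewed before it is pasted into the terminal.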

The text below is a quick crash course to help you get started with Docker. Please refer to online tutorials or our Docker workshop for a more in-depth introduction.

Docker images.

A Docker image is a template that Docker uses to create instances of running programs, which are called containers. Before running any Dockerized application you need to know how to access its Docker image. There are two ways:
 

  • Images are stored in Docker registries (or hubs), and their names and addresses are described in the respective software documentation. You can import images from repositories with the command "docker1 pull imagename". A number of customized images for BioHPC users are in the "biohpc" repository (see below).
     
  • Images can be imported from a file ("docker1 import filename" or "docker1 load -i filename"). We provide a number of custom images in the directory /programs/docker/images (they can also be imported from the biohpc repository). You can export your own modified container to a file for later use - typically the workflow is to pull a basic image, run a container, install software in it and then export it for later use.

Any imported image is stored as a local copy. If it is imported from a repository, the image name is the same as in the repository ("reponame/imagename"). If it is imported from a file, the image name will be "biohpc_labid/imagename", where labid is your Lab ID. If your image is in a file, you have to import it before you can run it. If you use an image from a repository, you can run it directly with the "run" command - it will be pulled automatically. Example of import from a file:

docker1 import /programs/docker/images/cowsay.tar

This is an example of repository import:

docker1 pull biohpc/cowsay

You can list local images with "docker1 images", here is what I got after the above import from file:

[jarekp@cbsum1c2b011 ~]$ docker1 images
REPOSITORY                            TAG                 IMAGE ID            CREATED             SIZE
biohpc_jarekp/cowsay                  latest              eac8cfea6661        4 seconds ago       319.7 MB

After importing from repository the result is slightly different - same image, different naming:

[jarekp@cbsum1c2b011 ~]$ docker1 pull biohpc/cowsay
Using default tag: latest
Trying to pull repository dtr.cucloud.net/biohpc/cowsay ...
latest: Pulling from dtr.cucloud.net/biohpc/cowsay

08d48e6f1cff: Pull complete
a1aa994f5ff7: Pull complete
Digest: sha256:b4ec86cdbb2d564d7ea94c9b49196f6b82e3c635a6581ee4eae02687e8ba91b8
Status: Downloaded newer image for dtr.cucloud.net/biohpc/cowsay:latest
[jarekp@cbsum1c2b011 ~]$ docker1 images
REPOSITORY                      TAG                 IMAGE ID            CREATED             SIZE
dtr.cucloud.net/biohpc/cowsay   latest              195f168235c9        2 weeks ago         337.1 MB
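
The naming rule for images imported from a file can be written down as a small function. This is a sketch under one assumption: the name part is taken from the file's base name without the .tar suffix, as happened with cowsay.tar above. The function name imported_image_name is hypothetical, and the rule has not been confirmed for other files, so check the result with "docker1 images".

```shell
# Hypothetical helper: predict the local image name after "docker1 import".
# Assumption: the name part equals the file's base name without ".tar",
# which matches the cowsay example but is not documented as a general rule.
imported_image_name() {
    local labid=$1 file=$2
    printf 'biohpc_%s/%s\n' "$labid" "$(basename "$file" .tar)"
}

imported_image_name jarekp /programs/docker/images/cowsay.tar
# prints: biohpc_jarekp/cowsay
```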

Running Docker applications.

The command to run a Docker container is "docker1 run [OPTIONS] IMAGE [COMMAND] [ARG...]". This command has a lot of options, but its basics are very simple. A simple test to check that Docker is working is "docker1 run hello-world". In principle there are 3 main ways to run a Docker container (single command, interactive mode and background mode); the list below also shows how to run as your host user:
 

  • Single command. An image is run with "docker1 run image cmd", after the command is completed the container stops. It cannot be rerun, but its output and structure can be still examined. The container can be saved.

    [jarekp@cbsum1c2b011 ~]$ docker1 run biohpc/cowsay cowsay 'Hi there!'
     ___________
    < Hi there! >
     -----------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||
    [jarekp@cbsum1c2b011 ~]$

     
  • User in Docker container. You can run with your host user ID inside the Docker container, so that files created there have the same owner as on the host machine.
    docker1 run --rm --user $(id -u):$(id -g) -e HOME=/workdir biohpc/cowsay cowsay 'Hi there!'
     
  • Interactive mode. An image is run with "docker1 run -it image cmd", the container input is now linked to the keyboard, output to the screen and it will run interactively as long as the "cmd" is active. Typically "cmd" is a shell like "/bin/bash":

    [jarekp@cbsum1c2b011 ~]$ docker1 run -it biohpc_jarekp/cowsay /bin/bash
    [root@a605b04a7ca5 workdir]#

    The container is now available to run commands until we exit the shell.
     
  • Background mode. The container can be started in the background (with "-d" option), then users can execute commands inside the container using "docker1 exec" command.

    [jarekp@cbsum1c2b011 ~]$ docker1 run -d -t biohpc/cowsay /bin/bash
    5ab4520a337fbb01b2b3f45c14688e095446237930657d7293fa7238c91c8864
    [jarekp@cbsum1c2b011 ~]$ docker1 ps -a
    CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS                      PORTS               NAMES
    5ab4520a337f        biohpc/cowsay       "/bin/bash"            6 seconds ago       Up 4 seconds                                    jarekp__biohpc_5

    [jarekp@cbsum1c2b011 ~]$ docker1 exec 5ab4520a337f /bin/bash -c "fortune | cowsay"
     _________________________________________
    / If a person (a) is poorly, (b) receives \
    | treatment intended to make him better,  |
    | and (c) gets better, then no power of   |
    | reasoning known to medical science can  |
    | convince him that it may not have been  |
    | the treatment that restored his health. |
    |                                         |
    | -- Sir Peter Medawar, "The Art of the   |
    \ Soluble"                                /
     -----------------------------------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||
    [jarekp@cbsum1c2b011 ~]$

All the current containers can be listed with "docker1 ps -a" command - without "-a" only running containers will show.

[jarekp@cbsum1c2b011 ~]$ docker1 ps -a
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS                      PORTS               NAMES
5ab4520a337f        biohpc/cowsay       "/bin/bash"            4 minutes ago       Up 4 minutes                                    jarekp__biohpc_5
7e25f8af4981        biohpc/cowsay       "/bin/bash"            7 minutes ago       Exited (0) 7 minutes ago                        jarekp__biohpc_4
b30bf95712c7        biohpc/cowsay       "cowsay 'Hi there!'"   20 minutes ago      Exited (0) 20 minutes ago                       jarekp__biohpc_3
6b073ad025c7        biohpc/cowsay       "echo 'Hi there!'"     21 minutes ago      Exited (0) 21 minutes ago                       jarekp__biohpc_2
6111cccdbf71        biohpc/cowsay       "ls -a -l /"           22 minutes ago      Exited (0) 22 minutes ago                       jarekp__biohpc_1
[jarekp@cbsum1c2b011 ~]$ docker1 ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
5ab4520a337f        biohpc/cowsay       "/bin/bash"         4 minutes ago       Up 4 minutes                            jarekp__biohpc_5
[jarekp@cbsum1c2b011 ~]$

 

Containers have their own root directory and system directories - after all, they are isolated. In BioHPC Cloud each container has direct access to the /workdir/labid directory (where labid is your Lab ID), which is mounted inside the container as /workdir. You can use this directory to copy files to or from the container, and you can also copy all necessary data there for use inside the container. Because you run inside the container as "root", new files created in /workdir/labid will be owned by root and may therefore be difficult to handle outside the container. You can use the "docker1 claim" command (a custom BioHPC command) to change the ownership of all files in /workdir/labid to your labid. If you would like NOT to have /workdir mounted, please use the --noworkdir option, which disables mounting /workdir and setting the default working directory.

[jarekp@cbsum1c2b011 ~]$ docker1 run biohpc/cowsay df -h
Filesystem                                                                                           Size  Used Avail Use% Mounted on
/dev/mapper/docker-253:2-128408749-19d0dcbcbed14933b4bd8db3d769b34820cb37457317eb3247e69e85cd7c7790   10G  365M  9.7G   4% /
tmpfs                                                                                                7.8G     0  7.8G   0% /dev
tmpfs                                                                                                7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/mapper/rhel-local                                                                               813G  121G  692G  15% /workdir
shm                                                                                                   64M     0   64M   0% /dev/shm
[jarekp@cbsum1c2b011 ~]$
[jarekp@cbsum1c2b011 ~]$ docker1 run biohpc/cowsay /bin/bash -c "echo test > /workdir/testfile"
[jarekp@cbsum1c2b011 ~]$ ls -al /workdir/jarekp/
total 4
drwxr-xr-x  2 jarekp root 21 Feb 15 16:02 .
drwxrwxrwx. 4 root   root 30 Feb  8 16:51 ..
-rw-r--r--  1 root   root  5 Feb 15 16:02 testfile
[jarekp@cbsum1c2b011 ~]$ docker1 claim
[jarekp@cbsum1c2b011 ~]$ ls -al /workdir/jarekp/
total 4
drwxr-xr-x  2 jarekp root 21 Feb 15 16:02 .
drwxrwxrwx. 4 root   root 30 Feb  8 16:51 ..
-rw-r--r--  1 jarekp root  5 Feb 15 16:02 testfile
[jarekp@cbsum1c2b011 ~]$
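
The mapping between the two views of the work directory can be expressed as a path translation. The following is a minimal sketch, assuming the default mount described above; the function name host_path is hypothetical, and paths outside /workdir are simply rejected rather than resolved.

```shell
# Hypothetical helper: translate a path as seen inside a container
# (/workdir/...) into the matching host path (/workdir/<labid>/...).
# Any other path is not handled by this sketch and returns status 1.
host_path() {
    local labid=$1 cpath=$2
    case $cpath in
        /workdir)   printf '/workdir/%s\n' "$labid" ;;
        /workdir/*) printf '/workdir/%s/%s\n' "$labid" "${cpath#/workdir/}" ;;
        *)          return 1 ;;
    esac
}

host_path jarekp /workdir/testfile
# prints: /workdir/jarekp/testfile
```

This matches the transcript above, where the file written to /workdir/testfile inside the container appears as /workdir/jarekp/testfile on the host.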

You can pull and run any image from public repositories. We have found, however, that most of them are very "light", i.e. they do not include development tools or libraries. Therefore we provide several development images to use as starting points to install your applications. We use Cornell Docker repository at dtr.cucloud.net, full path is dtr.cucloud.net/biohpc/imagename, but dtr.cucloud.net is added to the Docker repo search path on BioHPC Lab machines so using biohpc/imagename is just fine.

CentOS 7 cowsay
    Description: Basic image for testing with two extra commands installed: fortune and cowsay.
    Repository image: biohpc/cowsay
    File: /programs/docker/images/cowsay.tar

CentOS 7 development
    Description: CentOS 7 image with development tools and libraries installed (compilers, Java, Perl, Python etc).
    Repository image: biohpc/centos7dev
    File: /programs/docker/images/centos7dev.tar

Ubuntu development
    Description: Ubuntu image with development tools and libraries installed (compilers, Java, Perl, Python etc).
    Repository image: biohpc/ubuntudev
    File: /programs/docker/images/ubuntudev.tar

CentOS 7 development with GUI and sshd
    Description: CentOS 7 image with a standard set of development tools and libraries, built on the centos7dev image. Includes X11 and GUI tools and libraries. Automatically starts sshd; it must be run in the background and connected to with ssh (see "Running GUI ..." below).
    Repository image: biohpc/centos7devgui
    File: /programs/docker/images/centos7devgui.tar

Of course you can run any public image; for example, "docker1 run -it ubuntu /bin/bash" will start an Ubuntu image for interactive use, and the image will be downloaded from the official Ubuntu repository. Also, many programs or pipelines can be installed in basic images; development images are only needed when building from source is required.

All containers you create are named "labid__biohpc_##". Over time, lots of stopped containers will accumulate. They can be deleted using the "docker1 rm container_name_or_id" command, but it can only deal with one container at a time. We provide the "docker1 clean [options]" command, which helps with groups of containers:

docker1 clean --help
Usage:  docker1 clean [OPTIONS]

Remove containers from local machine

        remove all my non-running containers (default)
  all   remove all my containers (running or not)
  nores remove all containers from users not having current reservation

 

Saving containers for future use.

The best way to save your container is to save it as an image. This way the internal structure is preserved, and it is also the most compatible option should you want to load it back using a different Docker version. First you need to commit any changes you made in the container to an image using the command below. It takes the container container_name_or_id and creates an image image_name that contains everything that is in the container.

docker1 commit container_name_or_id image_name

Then you can save the resulting Docker image to a file with

docker1 save -o filename image_name

The resulting file can be loaded back with

docker1 load -i filename

Finally you will need to use "docker1 run" to create a new running container.
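
The commit and save steps can be wrapped in one function. This is a sketch, not a BioHPC tool: the function name save_container and the DOCKER variable are hypothetical, and the container ID, image name and file path in the example are placeholders. It is not stated here whether docker1 commit adds the biohpc_labid/ prefix to the image name, so check "docker1 images" if the save step cannot find the image.

```shell
# Hypothetical helper: commit a container to an image, then save that image
# to a tar file. DOCKER defaults to docker1; setting DOCKER=echo turns the
# call into a dry run that prints the two subcommands instead of running them.
save_container() {
    local container=$1 image=$2 file=$3
    "${DOCKER:-docker1}" commit "$container" "$image" &&
    "${DOCKER:-docker1}" save -o "$file" "$image"
}

# Dry run with placeholder values; nothing is executed by Docker here.
DOCKER=echo save_container 97651589ec95 mycowsay /workdir/jarekp/mycowsay.tar
```

Dropping the DOCKER=echo prefix runs the real commands; the saved file can later be restored with "docker1 load -i filename".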

A container can be exported to a file using the following command "docker1 export -o filename name_or_id", e.g.

docker1 export -o /home/jarekp/mycowsay.tar 97651589ec95

The resulting file can be imported into Docker with the "docker1 import" command. This approach saves the container as a simple image. It does not need the commit step, but the resulting image is simpler and less compatible.

Accessing local storage.

As discussed above, the local directory /workdir/labid is mapped to /workdir inside a container. On hosted machines, local storage is available in the /local/storage directory. Any subdirectory of this directory can be mapped to a container if it is owned by the user launching the container. /local/storage from a hosted server can be mounted on another server (see details here), in which case it is mounted as /fs/servername/storage. As with /local/storage, any subdirectory of /fs/servername/storage may be mapped to a container if it is owned by the user. For example:

docker1 run -it -v /local/storage/jarekp:/storage centos /bin/bash

In this case /local/storage/jarekp is mapped as /storage in the container. Again, /local/storage is available only on hosted servers, on regular servers all storage is available via /workdir.

NOTE: The container's /tmp directory is automatically mapped to /local/tmpdocker/$user/$$/tmp on the host server, where $user is the user's labid and $$ is the container process number. This way each container has temporary space equal to the total available space on the server, which is important for some pipelines. Users can override this setting with the -v option.

 

Accessing storage from other containers.

It is possible to share storage between containers, i.e. share internal storage space from one container with another. To do so, a Docker volume must be created, either in a container or directly. A volume may be created in a running container (as an option to docker1 run), or a special "storage" container with a volume can be created. A volume from a container may be used whether that container is running or has only been created. Here is an example:

docker1 create -v /data --name data centos

The command creates a container named labid__biohpc_data (labid is BioHPC ID) with volume /data inside. Now you can use it with --volumes-from option:

docker1 run -it --volumes-from labid__biohpc_data centos /bin/bash

/data in the current container is now mapped from labid__biohpc_data and its content will be preserved as long as labid__biohpc_data container is present, even if the current container is exited or deleted. You can also export the container with data for future reuse or backup. Volume can be created directly and managed with "docker1 volume" commands:

docker1 volume create --name testvol

It will create volume labid__biohpc_testvol, all the volumes present on the server can be viewed with

docker1 volume ls

Volumes can be inspected with "docker1 volume inspect" and deleted (only ones owned by the current user) with "docker1 volume rm volumename".

Volumes can be mapped to directory inside container with -v option, volume name should be the first argument:

docker1 run -it -v labid__biohpc_testvol:/data centos /bin/bash
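
Because the names you pass with --name are expanded with a prefix, it can help to compute the full name before using it with -v or --volumes-from. The helper below is a hypothetical sketch (biohpc_name is not a BioHPC command), and using $USER as the Lab ID in the usage comment is an assumption.

```shell
# Hypothetical helper: build the full name that docker1 gives to a container
# or volume created with --name NAME, i.e. labid__biohpc_NAME (note the
# double underscore after the Lab ID).
biohpc_name() {
    local labid=$1 name=$2
    printf '%s__biohpc_%s\n' "$labid" "$name"
}

biohpc_name jarekp testvol
# prints: jarekp__biohpc_testvol

# Possible use, assuming $USER holds your Lab ID:
#   docker1 run -it -v "$(biohpc_name "$USER" testvol)":/data centos /bin/bash
```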

 

Building images with Dockerfile.

Docker images can be built (docker1 build path) using a file with a set of instructions called a Dockerfile. The path option of the docker1 build command should point to a directory with a Dockerfile in it. Please refer to online books or tutorials for more information about building images using a Dockerfile. BioHPC Lab restricts the Dockerfile build path to /workdir/labid, i.e. build path /workdir/labid/myimagedir is fine, but /home/labid/myimagedir will be denied (as will any system directories). Use the "-t" parameter instead of "--tag" to set the image name.
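
The build path restriction can be checked before calling docker1 build. The function below is an illustrative sketch and not the check that docker1 itself performs: the name build_path_allowed is hypothetical, it compares path strings only (it does not resolve symbolic links or ".." components), and it assumes that /workdir/labid itself is also an acceptable build path.

```shell
# Hypothetical pre-check: succeed only if the build path is /workdir/<labid>
# or a directory below it. This is a plain string comparison; it does not
# resolve symlinks or ".." and is not the actual docker1 implementation.
build_path_allowed() {
    local labid=$1 path=$2
    case $path in
        /workdir/"$labid"|/workdir/"$labid"/*) return 0 ;;
        *) return 1 ;;
    esac
}

build_path_allowed jarekp /workdir/jarekp/myimagedir && echo "allowed"
build_path_allowed jarekp /home/jarekp/myimagedir || echo "denied"
```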

 

Running GUI/graphical/X-Windows/X11 applications in Docker container.

In order to run graphical applications in Docker container, the image must have X11 components installed and it must be able to start sshd program on launch. We provide one such image: centos7devgui. First you need to pull or import the image

docker1 pull biohpc/centos7devgui

Then the container must be started in the background, with ssh port mapped to your local machine's internal network:

docker1 run -d -p 127.0.0.1:5000:22 -P -t biohpc/centos7devgui /start.sh

Once the container is running you can connect to it from the machine where you run docker1 command using ssh to tunnel X11 graphics to your normal display:

ssh -X root@localhost -p 5000

It will ask you for a password which is 'biohpc' (without quotation marks). Remember that you need to run a program on your local machine that can accept and render the graphics. Consult our online User Guide or "Linux for Biologists" workshops for more details on using GUI applications.

 

Limiting CPUs and memory available to the container

The CPU cycles and memory available to the container can be limited using options to the docker1 run command. For example, 

docker1 run -it --memory="4g" --cpus=4 biohpc/centos7dev /bin/bash

will create an interactive CentOS7 container with memory limited to 4GB and able to consume CPU cycles equivalent to 4 CPU cores (cpu-quota/cpu-period=4). The imposed memory and CPU limits are not immediately obvious to a user working in the container. For example, the top or free commands will still show the full memory of the host rather than the limited amount, and cat /proc/cpuinfo will still show all the CPU cores of the host. However, the imposed limits will affect programs running within the container.
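
The relation cpu-quota/cpu-period=4 can be made concrete with a little arithmetic. The sketch below assumes Docker's default scheduler period of 100000 microseconds and whole numbers of CPUs; the function name cpu_quota is hypothetical.

```shell
# Hypothetical helper: compute the cpu-quota value (in microseconds) that
# corresponds to a whole number of CPUs for a given cpu-period.
# The default period of 100000 microseconds is Docker's standard value.
cpu_quota() {
    local cpus=$1 period=${2:-100000}
    echo $(( cpus * period ))
}

cpu_quota 4
# prints: 400000   (400000 / 100000 = 4 CPU cores)
```

In other words, with --cpus=4 the container may use up to 400000 microseconds of CPU time in every 100000 microsecond period, spread over any of the host's cores.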

Sharing running containers with other users

A docker1 container can be accessed (e.g. with "docker1 exec") ONLY by the user who started it. We provide a custom command to share this access with other BioHPC users who have access to the same server. The command is

docker1 access options

  • docker1 access list
       Prints current access list

  • docker1 access remove container_name user_id
       Removes access for user_id to container_name

  • docker1 access add container_name user_id
       Adds access for user_id to container_name

Sharing directories with other users' containers

By default a docker1 container can access only directories that belong to the current user or the user's group. Users can share directories and files with other users' containers using the docker1 access path commands

  • docker1 access path list
       prints current file/directory access list

  • docker1 access path remove pathname user_id
       Removes access for user_id to pathname

  • docker1 access path add pathname user_id
       Adds access for user_id to pathname

Deleting images and containers.

Unused containers and images take space and therefore they need to be periodically pruned. Every night a script is run that removes unused containers and images. The rules are as follows.
 

  • Any non-running container older than 1 week is deleted regardless of reservations.
  • Any container from a user that does not have a reservation is deleted.
  • Unused images, i.e. images that are not linked to containers are deleted. Images imported by users that do have a reservation are not deleted.

If you have custom images, please make sure they are named properly (biohpc_labid/name; you can use the docker1 tag command to change image names). You only need to provide the name part in the docker1 tag command; biohpc_labid will be prepended automatically, i.e. 'docker1 tag f49eec89601e myimage' will name image f49eec89601e 'biohpc_labid/myimage'. When you import an image from a file it will be properly named automatically.

You can manually delete your own containers and images.

 

Summary of custom BioHPC Docker commands.

docker1 claim

Enables the user to take ownership of all files under /workdir/labid on a local machine.

docker1 clean [options]

Removes a set of Docker containers. Supports 3 options:

  • docker1 clean   
    remove all my non-running containers (default)
  • docker1 clean all
    remove all my containers (running or not)
  • docker1 clean nores
    remove all containers from users not having current reservation

docker1 run

Various options relating to Docker volumes are disabled. Two special options have been added:
  --noworkdir disables mapping the /workdir directory and setting the default cwd in the container
  --nodefcwd disables setting the default cwd in the container

docker1 build

docker1 restricts Dockerfile build path to  /workdir/labid, i.e. build path /workdir/labid/myimagedir is fine, but /home/labid/myimagedir will be denied.

docker1 access

This command allows sharing of a container with other BioHPC users having access to the server where the container runs.

docker1 access options

  • docker1 access list
       prints current access list

  • docker1 access remove container_name user_id
       Removes access for user_id to container_name

  • docker1 access add container_name user_id
       Adds access for user_id to container_name

naming volumes and containers

All container and volume names start with labid__biohpc_.

  

