This two-session workshop will cover various aspects of using multiple processors to parallelize your work in BioHPC Cloud and beyond. We will discuss different parallelization paradigms (shared memory, distributed memory, and a mixed approach) and briefly introduce the respective programming tools. The common problem of processing multiple independent tasks simultaneously will be discussed in detail, along with tools used to distribute and balance the computational load over available resources. Among such tools, job schedulers such as SLURM play an important role as a means to balance and prioritize tasks in complex environments with multiple machines, groups of users, and jobs with different CPU and memory requirements. We will introduce the recently developed 'SLURM cluster on demand' feature of BioHPC Cloud as an efficient way to streamline your work.
Access to BioHPC Cloud workstations requires a BioHPC Cloud account. If you did not already have an account, you were asked to create one when you registered for the workshop. If you need to reset your BioHPC password, you can do so at https://biohpc.cornell.edu//lab/labpassreset.aspx. If you do not know your BioHPC user ID, contact us at support@biohpc.cornell.edu.
Since the BioHPC resources are behind the Cornell firewall, the easiest way to access them is from the Ithaca campus network (not possible at present) or from any other location through the Cornell VPN. The VPN is available to all users who have a Cornell NetID. Please check the relevant CIT website to see if you are eligible for a NetID, and obtain one if possible. While off-campus access to BioHPC Cloud without the VPN is still possible, it is somewhat more complicated.
A Linux machine for hands-on exercises will be assigned to you automatically (you do not have to make your own reservation).
The workstations will be accessed using the Secure Shell (ssh) protocol. To participate in the exercises, please bring your own laptop with an ssh client installed. Mac and Linux laptops come with native ssh clients, and no extra installation is needed. For Windows, the recommended ssh client is PuTTY; please install it prior to the workshop (just download the executable file http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe, put it anywhere on your hard drive, and double-click to launch). To transfer files between your laptop and a Linux machine, you will need an sftp client, such as FileZilla (although Mac and Linux laptops come with native sftp clients and no extra installation is necessary, FileZilla would be helpful on these platforms as well). For detailed instructions and more information on access to BioHPC machines, please refer to the following document: http://biohpc.cornell.edu/lab/doc/Remote_access.pdf, especially points 1 and 2.2-2.4.
Lecture slides
Exercise 1 handouts
PDF HTML
Exercise 2 handouts
PDF HTML
Workshop presentation 1
Workshop presentation 2
Workshop presentation 3
Workshop presentation 4
Workshop server assignment