University of Florida - The Foundation of the Gator Nation
University of Florida College of Liberal Arts and Sciences
Quantum Theory Project (QTP)
Slater Lab

Slater Lab clusters User Guide

More details on each system: Linux clusters

Changes in QTP clusters

In the summer of 2008, we are upgrading the software (operating system and compilers) on all QTP clusters, as most of them are three years old. The new software and its organization will match those used at the UF HPC Center, making it easier to use both systems in the same project.

The IBM clusters simu and atanasoff use LoadLeveler, but these clusters will be dismantled during the summer.

The QTP Linux clusters are managed by the PBSPro resource manager and scheduler. All Linux clusters will be upgraded to the new software, and at the same time the resource manager and scheduler will be replaced by Torque and Moab, respectively. This makes their behavior consistent with the UF HPC Center and prepares for UF-wide grid computing and grid scheduling.

During the transition, the QTP Linux clusters will be managed by two schedulers accessible from two interactive nodes.

  • Linux clusters arwen, haku, surg, ra: use ssh to connect to linx32, and use the commands qstat, qsub, qdel, etc. to manage jobs. The PBSPro scheduler and resource manager run on arwen, so all jobs have IDs of the form ######.arwen. This list of clusters will shrink as the clusters are upgraded to the new software.
  • Linux clusters wukong and ock: use ssh to connect to linx64, and use the commands qstat, qsub, qdel, etc. to manage jobs. ock can also be managed from ock itself, but it is to be used only by members of the Bartlett group. The Moab scheduler and Torque resource manager run on wukong, so all jobs have IDs of the form ######.wukong. This list of clusters will grow as the clusters are upgraded to the new software.
  • AIX cluster simu: use rlogin to connect to any node and use commands llclass, llq, llsubmit, llcancel, llstatus, etc. to manage jobs.
  • AIX cluster atanasoff: use ssh to connect to atanasoff and use commands llclass, llq, llsubmit, llcancel, llstatus, etc. to manage jobs.
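Because the job ID suffix identifies which scheduler owns a job, a small helper can tell you where to run qstat or qdel for a given job. The following is only a sketch: the table and the function name route_job are hypothetical, and it assumes that the wukong-side scheduler is reached from linx64, the second interactive node.

```python
# Hypothetical lookup table derived from the transition list above.
# Keys are the server names that appear as job ID suffixes.
SCHEDULERS = {
    "arwen": {
        "login_host": "linx32",
        "software": "PBSPro",
        "clusters": ("arwen", "haku", "surg", "ra"),
    },
    "wukong": {
        "login_host": "linx64",  # assumption: second interactive node
        "software": "Torque/Moab",
        "clusters": ("wukong", "ock"),
    },
}


def route_job(job_id):
    """Return (login_host, software) for a job ID such as '12345.arwen'."""
    number, sep, server = job_id.partition(".")
    if not sep or not number.isdigit():
        raise ValueError("expected an ID of the form ######.server: %r" % job_id)
    if server not in SCHEDULERS:
        raise ValueError("unknown scheduler host: %r" % server)
    entry = SCHEDULERS[server]
    return entry["login_host"], entry["software"]


if __name__ == "__main__":
    for example in ("12345.arwen", "678.wukong"):
        host, software = route_job(example)
        print("%s: log in to %s (%s)" % (example, host, software))
```

The table will need updating as clusters move from one scheduler to the other during the upgrade.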


Connection to HPC

The new software on the nodes and on linx64 is the same as the software on the HPC Center cluster. Thus the latest Intel compilers are available on the new nodes.

In addition, the large parallel file systems /scratch/ufhpc (30 TB) and /scratch/crn (80 TB) are mounted on linx64 and on the wukong nodes. This gives easier access to the same files when working on a large project that uses both QTP and HPC Center resources. However, this connection runs over Gigabit Ethernet and not over InfiniBand. The performance is therefore good enough for general file manipulation, but not good enough to store high-intensity scratch files, such as integral files for Gaussian.

Details about the HPC Center cluster can be found at the HPC Center web site.


The clusters amun, arwen, haku, ra, surg, ock, wukong

arwen was installed in Spring 2004; amun, haku, and ra in Summer 2005; surg in Winter 2006; ock in Winter 2007; and wukong in Summer 2008. amun was dismantled to make room for wukong.

Characteristics
The clusters have a total of 436 cores: 2.8 GHz IA32 Xeon, 3.2 GHz EM64T Xeon, 2.2 GHz AMD64 Opteron. surg has two dual-core AMD Opteron CPUs. wukong has two quad-core Intel E5420 2.5 GHz CPUs.
Several TB-sized file systems are mounted on /scr_2 inside the clusters and on /scr/arwen_2, /scr/arwen_3, /scr/haku_2, and /scr/ra_2 for other QTP hosts. The 3 TB ock file system is mounted on both /scr_1 and /scr_2 inside ock and on /scr/ock_2 for other QTP hosts.

The Linux clusters are suited for calculations that require:

  • Fast CPUs
  • Large RAM for each CPU
  • Fine-grained parallelism with up to 2 CPUs (use OpenMP)
  • I/O to local disk (use /scr_1/tmp)

The QTP Linux clusters do not have a fast interconnect. For parallel jobs in which many processors must communicate with each other, the HPC Center clusters are a better choice.

Commands
Jobs are managed with PBSPro or with Torque and Moab. Use the commands qstat, qsub, qdel, etc. from the interactive nodes linx32 or linx64.
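As an illustration of a job that fits these clusters (one node, at most two OpenMP threads, I/O on local disk), the sketch below generates a batch script that could be passed to qsub. The function name make_job_script, the job name, and the per-job directory layout under /scr_1/tmp are hypothetical, and the resource-request line uses the common nodes=1:ppn=N form, which may need adjusting for the local PBSPro or Torque configuration.

```python
def make_job_script(name, command, threads=2, walltime="01:00:00"):
    """Build the text of a PBS batch script for a small OpenMP job."""
    if not 1 <= threads <= 2:
        # The guide recommends OpenMP with up to 2 CPUs on these clusters.
        raise ValueError("threads must be 1 or 2")
    lines = [
        "#!/bin/bash",
        "#PBS -N %s" % name,
        "#PBS -l nodes=1:ppn=%d" % threads,
        "#PBS -l walltime=%s" % walltime,
        "#PBS -j oe",  # merge stdout and stderr into one output file
        "",
        "# Run in node-local scratch (assumed per-job directory layout).",
        'WORKDIR="/scr_1/tmp/$PBS_JOBID"',
        'mkdir -p "$WORKDIR"',
        'cd "$WORKDIR"',
        "export OMP_NUM_THREADS=%d" % threads,
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my_program" is a placeholder for the actual executable.
    print(make_job_script("example", "./my_program"))
```

Saving the output to a file, for example job.pbs, and running qsub job.pbs from the appropriate interactive node would submit it; qstat then shows the job under its ######.arwen or ######.wukong ID.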

Logging in
Use ssh to log in to linx32 or linx64. You can also log in to any compute node, but avoid doing so as much as possible.

Scheduling principles
It may be helpful to consider the scheduling principles and priorities used by the Moab scheduler when planning your jobs and workflow.


Cluster ownership and usage

The primary use of each QTP cluster is as follows:

  • arwen is for use by the Roitberg group
  • haku is for use by the Hirata group
  • ra and wukong are for use by the Merz group
  • surg is for use by all members of QTP
  • ock is for use by the Bartlett group
  • simu and atanasoff are for use by any member of QTP, but they will soon be dismantled.

A member of QTP can make individual arrangements with the principal investigator of a research group to use that group's cluster for special projects.

The HPC Center cluster can be used by any researcher on campus. Prof. Cheng (Phase II and Phase III) and Prof. Merz (Phase III) have invested in the HPC Center at the faculty level, which gives jobs submitted by members of their research groups higher priority. QTP has invested at the department/institute level in Phase III, which gives jobs from all members of QTP higher priority. The College of Liberal Arts and Sciences (CLAS) has invested at the college level, which gives an advantage to all researchers in CLAS.


Have a Question? Contact us.
Last Updated 6/29/08
 