System Architecture

The Frank cluster is managed with Scyld ClusterWare running Red Hat Enterprise Linux 6. The batch system is built on the Torque PBS resource manager, the Moab scheduler, and the Gold Allocation Manager.
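
As a rough sketch of how a user interacts with this stack from a login node, the snippet below lists the configured queues with Torque's qstat and checks the remaining allocation with Gold's gbalance. It assumes both command-line tools are on the user's PATH; the user name is a placeholder and the exact gbalance output format depends on how Gold is configured here.

    # Sketch: query the batch system from a login node. Assumes the standard
    # Torque `qstat` and Gold `gbalance` command-line tools are on PATH.
    import subprocess

    def show_queues():
        # `qstat -q` prints a one-line summary of every configured queue.
        out = subprocess.run(["qstat", "-q"], capture_output=True, text=True)
        print(out.stdout)

    def show_allocation(user):
        # `gbalance -u <user>` reports the remaining Gold allocation for a user;
        # the exact output format depends on the site's Gold configuration.
        out = subprocess.run(["gbalance", "-u", user], capture_output=True, text=True)
        print(out.stdout)

    show_queues()
    show_allocation("my_username")  # placeholder user name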

Node Definitions

Frank is divided into two major partitions. The Shared Memory partition supports sequential or multi-threaded jobs that use one or more cores of a single node. The Distributed Memory partition supports jobs that use distributed parallelism, via MPI or other means, across two or more nodes simultaneously. Memory configurations shown below are physical values. Follow the links for detailed information about each queue.
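
To illustrate the difference between the two partitions, the sketch below builds and submits one single-node job for the shared queue and one multi-node MPI job for the dist_small queue. The queue names come from the node lists below; the walltime, core counts, and the commands being run are placeholders, and per-queue limits depend on the site configuration.

    # Sketch: build and submit a Torque job script for either partition.
    # Walltimes, core counts, and the commands being run are placeholders.
    import subprocess, tempfile

    def submit(queue, nodes, ppn, command, walltime="01:00:00"):
        script = "\n".join([
            "#!/bin/bash",
            f"#PBS -q {queue}",                  # target queue (see node lists below)
            f"#PBS -l nodes={nodes}:ppn={ppn}",  # node count and cores per node
            f"#PBS -l walltime={walltime}",
            "cd $PBS_O_WORKDIR",                 # start in the submission directory
            command,
            "",
        ])
        with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
            f.write(script)
            path = f.name
        # qsub prints the new job identifier on success.
        print(subprocess.run(["qsub", path], capture_output=True, text=True).stdout)

    # Shared memory partition: one node, a multi-threaded executable on up to 48 cores.
    submit("shared", nodes=1, ppn=48, command="./my_threaded_app")

    # Distributed memory partition: four 16-core nodes, launched with MPI.
    submit("dist_small", nodes=4, ppn=16, command="mpirun ./my_mpi_app")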

Shared Memory

  • 15 quad-socket 12-core AMD Magny Cours (6172) 2.1 GHz CPU (48 core) nodes. They have 125 GB of memory and 1 TB of local disk. Queue: shared

  • 8 dual-socket 8-core Intel Sandy Bridge (E5-2670) 2.6 GHz (16 core) nodes. They have 63 GB of memory and 2 TB of local disk. Queue: shared_large

  • 2 dual-socket 8-core Intel Sandy Bridge (E5-2670) 2.6 GHz (16 core) nodes. They have 126 GB of memory and 3 TB of local disk. Queue: shared_heavy

  • 2 quad-socket 16-core AMD Interlagos (Opteron 6276) 2.3 GHz (64 core) nodes. They have 256 GB of RAM and 2 TB of local scratch. Queue: shared_amd

Distributed Memory

  • 40 dual-socket 6-core Intel Westmere (X5650) 2.67 GHz CPU (12 core) nodes. They have 48 GB of memory and are connected by QDR InfiniBand. Queues: distributed, jordan

  • 36 dual-socket 8-core Intel Sandy Bridge (E5-2670) 2.6 GHz (16 core) nodes. They have 31 GB of memory, 1 TB of local disk and are connected by QDR InfiniBand. Queue: dist_small

  • 36 dual-socket 8-core Intel Sandy Bridge (E5-2670) 2.6 GHz (16 core) nodes. They have 63 GB of memory, 1 TB of local disk and are connected by QDR InfiniBand. Queue: dist_big

  • 36 quad-socket 16-core AMD Interlagos (Opteron 6276) 2.3 GHz (64 core) nodes. They have 128 GB of RAM and 2 TB of local scratch, and are connected by QDR InfiniBand. Queues: dist_amd, amd_compbio

  • 24 quad-socket 16-core AMD Interlagos (Opteron 6276) 2.3 GHz (64 core) nodes. They have 256 GB of RAM and 2 TB of local scratch. All nodes are connected by QDR InfiniBand. Queues: shared_amd, amd_compbio

  • 24 dual-socket 8-core Intel Sandy Bridge (E5-2670) 2.6 GHz (16 core) nodes. They have 128 GB of memory, 1 TB of local disk and are connected by FDR InfiniBand. Queue: dist_fast

  • 20 dual-socket 8-core Intel Ivy Bridge (E5-2650v2) 2.6 GHz (16 core) nodes. They have 64 GB of memory, 1 TB of local disk and are connected by FDR InfiniBand. Queue: dist_ivy

GPU

Frank has GPU-equipped nodes: 4 nodes with 4 NVIDIA Tesla GPGPUs each, plus nodes equipped with NVIDIA GTX Titan GPGPUs. All GPUs are accessed through the gpu queue; a sample GPU job sketch follows the list.

  • 12 NVIDIA Tesla C2050 GPGPUs (1.15 GHz) each with 448 cores and 2 GB of memory. Queue: gpu

  • 4 NVIDIA Tesla C2075 GPGPUs (1.15 GHz) each with 448 cores and 6 GB of memory. Queue: gpu

  • 24 NVIDIA GTX Titan GPGPUs (837 MHz) each with 2688 cores and 6 GB of memory. Queue: gpu
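
The sketch below submits a short job to the gpu queue and runs nvidia-smi to confirm which devices are visible. The gpus= resource request is an assumption about how GPUs are exposed through Torque on this system, and the executable name is a placeholder; the exact GPU request syntax depends on the site configuration.

    # Sketch: submit a short job to the gpu queue and list the visible GPUs.
    # The `gpus=` resource request is an assumption about this Torque setup;
    # the executable name is a placeholder.
    import subprocess

    script = """#!/bin/bash
    #PBS -q gpu
    #PBS -l nodes=1:ppn=4:gpus=4
    #PBS -l walltime=00:10:00
    cd $PBS_O_WORKDIR
    nvidia-smi       # report the GPUs visible to this job
    ./my_cuda_app    # placeholder for a CUDA executable
    """

    with open("gpu_job.pbs", "w") as f:
        f.write(script)

    print(subprocess.run(["qsub", "gpu_job.pbs"], capture_output=True, text=True).stdout)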

Legacy and Specialty Hardware

  • 8 quad-socket 12-core AMD Magny Cours (6172) 2.1 GHz CPU (48 core) nodes. They have varying amounts of memory and 1 TB of local disk. Queue: jordan

  • 110 dual-socket 4-core Intel Nehalem CPU (8 core) nodes with varying amounts of memory and local disk. Queues: mem24g, mem48g, ib, one_day

Shared Storage Systems

  • MobyDisk, a Xyratex ClusterStor 1500 Lustre filesystem. MobyDisk is connected to the central FDR InfiniBand gateway switch to provide high-bandwidth, low-latency disk access to the Frank cluster. All home directories are stored on MobyDisk and mounted on the login and compute nodes. Coming Spring 2015.

  • SuperCell. Coming Spring 2015.

  • Backups. A tape backup of the data stored in $HOME is maintained by CSSD, and snapshots are taken daily. No backup is kept for data stored in groupshares on MobyDisk (see the copy sketch after this list). More information forthcoming.
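
Because groupshare data on MobyDisk is not backed up, the sketch below shows one way a user might copy selected files into the backed-up $HOME area with rsync. Both paths are placeholders; the actual groupshare mount point depends on the site layout.

    # Sketch: copy selected groupshare data into the backed-up $HOME area.
    # Both paths are placeholders; adjust them to the actual mount points.
    import os, subprocess

    src = "/mobydisk/groupshare/my_group/important_results/"  # hypothetical path
    dst = os.path.join(os.environ["HOME"], "backup_of_results/")

    # rsync preserves timestamps/permissions and only transfers changed files.
    subprocess.run(["rsync", "-av", src, dst], check=True)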