History, Objectives, Definitions, and All That

CORE.SAM

CORE.SAM is a web service provided by the SAM team, and its primary audience is the SAM cluster user community.

"CORE" is the acronym for Collaborative Online Resource Exchange. With the word "Resource" we mean knowledge, information, help and support of all sorts. CORE.SAM's objective is to create a collaborative/conversational online environment for the users of our HPC clusters, where the "Resource" is collectively edited and grown.

About SAM

SAM is a cross-disciplinary, multi-departmental center whose mission is to advance the application of computing to research at the University through education and multi-disciplinary collaborations and partnerships. The Center serves as a collaboration portal for the modeling and simulation community at the University and has assembled a group of more than 50 collaborators from across the University who are engaged in computational research. These include faculty in Chemistry, Biology, Computational Biology, Physics and Astronomy, Mathematics, Computer Science, Economics, and several departments of the Swanson School of Engineering, as well as faculty from the Schools of Public Health and Medicine and the Graduate School of Public and International Affairs, bringing together a community of expertise to tackle grand challenge problems.

SAM provides in-house HPC resources allocated for shared usage, free of charge, to campus researchers. The systems are housed, administered, and maintained in collaboration with CSSD. Currently (last update 7/11/2012), the HPC cluster consists of 45 12-core Intel Westmere, 110 8-core Intel Nehalem, 23 48-core AMD Magny-Cours, and 65 8-core Intel Harpertown compute nodes, totaling 3,044 computation-only CPU cores. The nodes have between 12 GB and 128 GB of shared memory per node, and 1.5 PB of shared and scratch storage is provided, including the two parallel filesystems, Panasas and GlusterFS. Four of the 12-core nodes host a total of 16 general-purpose NVIDIA Fermi GPU accelerator cards to support hybrid parallelization efforts. The nodes are interconnected by a fast, low-latency InfiniBand network fabric to enable efficient distributed-parallel (MPI-based) runs. The infrastructure is designed for future scaling via additional resources funded by national instrumentation grants, internal University funds, or faculty contributions from grants or start-up funds. Also available is PittGRID, a grid computing platform that recycles unused CPU cycles across campus and makes them available for research.
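To give a concrete sense of what a distributed-parallel (MPI-based) run on such a cluster looks like, here is a minimal sketch of an MPI "hello world" program in C. It is illustrative only: the compiler wrapper (mpicc) and launcher (mpirun) shown in the comments are generic MPI conventions, not SAM-specific instructions, and the actual modules and job submission procedure on the cluster may differ.

```c
/* Minimal MPI example: each process (rank) reports which compute node
 * it is running on.  Illustrative sketch only; the build/run commands
 * below are generic MPI conventions, not SAM-specific instructions.
 *
 *   Build:  mpicc hello_mpi.c -o hello_mpi
 *   Run:    mpirun -np 16 ./hello_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, name_len;
    char node_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                  /* start the MPI runtime     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank       */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */
    MPI_Get_processor_name(node_name, &name_len);

    printf("Rank %d of %d running on node %s\n", rank, size, node_name);

    MPI_Finalize();                          /* shut down the MPI runtime */
    return 0;
}
```

When launched across several nodes, each rank prints the hostname of the node it landed on, which makes this a quick check that the launcher is actually distributing processes across the cluster.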

The Center employs full-time research faculty whose expertise covers a wide range of areas in HPC and academic research, including parallel programming for distributed-memory, shared-memory, and graphics processing unit (GPU) architectures, as well as various areas of theoretical and computational science and engineering. The SAM faculty are responsible for preparing training and educational material, teaching, cluster user support and consulting, and focused software development and research support for various projects at Pitt. SAM provides user support, training, and project management services on a continual basis through Web 2.0 based platforms (core.sam.pitt.edu and collab.sam.pitt.edu), and organizes year-round workshops and training sessions on cluster usage, parallel programming, and various topics in HPC-based research. The Center also acts as a liaison to national computational resources through partnerships with the Pittsburgh Supercomputing Center and the NSF/XSEDE Campus Champions program.

CSSD's Network Operations Center (NOC) was established to maintain a stable, reliable technology environment for University research, research data, and the University enterprise systems. In addition to housing the current SAM HPC hardware, the NOC monitors all of the University’s critical research and enterprise applications and services, such as the University network, specialized research clusters, the enterprise course management system, the enterprise IMAP and Exchange email systems, the enterprise student information system, the enterprise library services, the University’s central directory service, enterprise financial systems, enterprise web and portal services, and central authentication systems. The NOC also maintains internet connectivity to the Pittsburgh Supercomputing Center (PSC) and Internet2.

The facility is situated 12 miles from the main campus and provides 9,680 square feet of dedicated data center floor space. A dedicated 10 Gbit, high-bandwidth, fiber-optic network connection ties the facility to the main campus area. The data center is cooled by centrally controlled air conditioning units, and a specialized in-row, refrigerant-based cooling infrastructure is dedicated to supporting the HPC clusters. The facilities are protected by a preemptive fire detection and suppression system. Power is supplied via two separate utility power feeds, with redundant uninterruptible power supply (UPS) units and redundant diesel generators backed by 4,000 gallons of diesel fuel stored on site. Power is monitored in real time by systems that provide both visual and audio alarms, and switching between power feeds, UPSs, and the diesel generators is fully automated. All server racks are equipped with rack power distribution units and redundant, monitored power feeds and switches. A power management utility provides power and cooling trend analysis. Several server racks and a UPS are dedicated exclusively to supporting the compute clusters.

The NOC is operated as a secure environment with both external and internal video surveillance. The location of the facility is not made public, and physical access to the building is restricted by a secure access card system. The NOC is staffed 24 hours a day, seven days a week; the staff includes two HPC engineers dedicated to cluster support, seven professional system engineers, seven network engineers, one network management system engineer, and nine NOC monitors. In contrast to many other network operations services in higher education and corporate environments, CSSD's NOC provides both network monitoring and support for applications and other critical enterprise and research systems. The sheer scale of the NOC's network monitoring activities also sets it apart from most other network operations centers in higher education. CSSD is responsible for the design, implementation, and maintenance of the network serving all five campuses of the University and links to affiliated hospitals. The University’s network joins hundreds of local Ethernets into a large, geographically distributed network supporting more than 100,000 networked devices.

Read more at the Center web site.

Alternative (shorter) versions of the SAM description text are available here

About CMMS

The University of Pittsburgh established the Center for Molecular and Materials Simulations (CMMS) in 2000 to provide computational resources to researchers in the sciences and engineering. The effort to establish CMMS was spearheaded by researchers in the Departments of Chemistry and Chemical Engineering, with support from the College of Engineering and the School of Arts and Sciences.

When originally established in 2000, CMMS's resources included a 50-CPU IBM RS/6000 POWER3 workstation cluster and a 9-processor Pentium III cluster. Sixteen of the RS/6000 computers were connected optically via switched Gigabit Ethernet to support parallel applications using up to 32 CPUs. Funding for this hardware was provided by the Major Research Instrumentation (MRI) program of the National Science Foundation and the SUR program of IBM.

Between 2000 and 2003, CMMS added several new computer systems, including four 4-processor IBM 44p workstations, a 32-processor 1.0 GHz Pentium III cluster, a 32-processor Athlon 1700 MP cluster, and a 20-processor Athlon 2200 MP cluster.

In the fall of 2004, two new clusters were added: one with six 8-CPU IBM POWER4+ p655 nodes, and the other with 80 nodes, each containing two 2.4 GHz Opteron CPUs. The Opteron nodes are connected via Gigabit Ethernet. These new systems were funded by an MRI grant from NSF and a SUR grant from IBM. In 2005, six new Opteron nodes, each with two dual-core CPUs, were added to the cluster.

In January 2007, a 24-node cluster with an InfiniBand network was installed. Each node of this cluster has two dual-core 2.6 GHz Opteron 2218 CPUs. Funding for this cluster was provided by the University.

In 2008, the computational facilities were further enhanced with the addition of 66 nodes, each with two quad-core Xeon E5430 CPUs running at 2.66 GHz and between 8 and 16 GB of memory. All Xeon nodes are interconnected with a low-latency InfiniBand fabric. These Xeon nodes were funded by the University. After their addition, the cluster had a total of 964 cores dedicated for use by CMMS researchers.

June 2009 brought the most significant hardware upgrade yet for CMMS: 98 nodes were purchased from Penguin Computing, together with a QDR InfiniBand fabric. Each of the 98 nodes contains two quad-core Intel Xeon E5550 (Nehalem) CPUs. Ninety of the nodes have 12 GB of RAM, and eight have 48 GB of RAM; these latter eight nodes also have 1.8 TB of local disk for I/O-heavy calculations. The Penguin nodes are managed by Scyld Clusterware, and the queue software employed is Taskmaster/Moab. With the addition of these nodes, there are now 1,748 cores available to CMMS users. Funding for the purchase from Penguin Computing was provided by the University.

See here for the most up-to-date hardware details.

Since October 2008, CMMS has worked in conjunction with the Center for Simulation and Modeling (SaM) to facilitate High Performance Computing research at the University of Pittsburgh. The home page of the documentation and forums web site for both CMMS users and the SaM community can be found here.

Contact

You can leave a message using the contact form below. Make sure to choose a Category that best identifies your request. For the mailing address and phone/fax numbers, see SAM/contact. Current cluster users: if you already have a SAM account and are able to log in to this website, please log in and use the POST menu above to submit a support ticket or start a forum discussion rather than using this contact form.