Stella Computer Cluster Architecture
Hardware
Stella is composed of:
- Four identical 4-way nodes (comp02-comp05), each of them with:
  - 4 dual-core Opterons 2.4GHz
  - 16GB RAM
  - 73GB SCSI320 HD
  - Infinipath HTX adapter [not delivered yet]
  - 1U chassis
- One extensible 4-way node (comp00):
  - Same configuration as comp02-05 plus:
    - Extra 300GB SCSI320 HD
    - 4U chassis with available PCI-Express slots
- One 2-way node (comp01)
  - 2 dual-core Opterons 2.4GHz
  - 8GB RAM
  - 73GB SATA Raptor HD
  - Infinipath PCI-Express adapter
  - 4U chassis with available HTX and PCI-Express slots
- Front-end node + NFS file server (stella)
  - 2 single-core Opterons 2.4GHz
  - 4GB RAM
  - 150GB RAID1
  - 2TB RAID5 accessible via NFS over Gigabit Ethernet
- Cluster Management Appliance (CMA)
- Gigabit Ethernet Switch
- Silverstorm Infiniband Switch
- UPS
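As a quick sanity check on the list above, the aggregate compute capacity can be worked out with a few lines of Python. This is only an illustrative sketch: the `NodeGroup` class and the helper names are hypothetical, the front-end node is deliberately left out, and comp00 is assumed to carry the same 16GB as comp02-05 since it is described as having the same configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """Hypothetical record for a group of identical compute nodes."""
    name: str
    count: int             # number of identical nodes in the group
    sockets: int           # Opteron sockets per node ("4-way" = 4 sockets)
    cores_per_socket: int  # 2 for dual-core parts
    ram_gb: int            # RAM per node, in GB


# Compute nodes only, transcribed from the hardware list.
COMPUTE_NODES = [
    NodeGroup("comp02-comp05", count=4, sockets=4, cores_per_socket=2, ram_gb=16),
    NodeGroup("comp00", count=1, sockets=4, cores_per_socket=2, ram_gb=16),
    NodeGroup("comp01", count=1, sockets=2, cores_per_socket=2, ram_gb=8),
]


def total_cores(groups):
    """Sum of cores over all nodes in all groups."""
    return sum(g.count * g.sockets * g.cores_per_socket for g in groups)


def total_ram_gb(groups):
    """Sum of RAM (GB) over all nodes in all groups."""
    return sum(g.count * g.ram_gb for g in groups)
```

With the figures above, `total_cores(COMPUTE_NODES)` gives 44 and `total_ram_gb(COMPUTE_NODES)` gives 88.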
Cluster interconnect
The following figure outlines the functional connectivity within the cluster. Users connect via the front-end node, which is a full Linux system with shells, compilers, editors, etc. Jobs are submitted to the front-end node's scheduler, which distributes them to the compute nodes via gigabit ethernet. The compute nodes have a further fast interconnect for high-bandwidth low-latency communications.
Gigabit ethernet
The Gigabit Ethernet switch connects all the components of the cluster (front-end node, compute nodes and CMA). The Gigabit Ethernet LAN is mainly used for the following tasks:
- Job control between front-end node and compute nodes
- NFS file server transfers
- Monitoring functions by the CMA
- Serial-Over-Ethernet communications
MPI Infinipath/Infiniband interconnect
As of 2006, the Infinipath/Infiniband interconnect provides the lowest latency available for MPI communications. A further small fraction of the latency was saved by selecting Infinipath HTX adapters instead of PCI-Express ones.
This low-latency interconnect is exploited by the MPI libraries installed on the cluster.
Software
The front-end and compute nodes are based on SUSE Linux 9.3. As such, all the usual Linux tools and libraries are available:
- gcc
- liblapack, libscalapack
- etc.
- Java 1.5 and 1.6
- Synopsys VCS with SystemC
- Matlab with Distributed Toolkit
Scheduler: Sun N1 Grid Engine
Every task intended to run on the cluster's compute nodes must be submitted to the Sun Grid Engine software. Grid Engine keeps track of available resources, running jobs, user history, etc. It automatically decides where and when each job should run, so that the compute nodes are used to the best of their capabilities and the resources are shared equitably between users.
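Submission is normally done with Grid Engine's `qsub` command from the front-end node. The sketch below only assembles a `qsub` command line and does not run it. The helper name `build_qsub_command`, the script name, and the parallel environment name `mpi` are assumptions made for illustration; the parallel environments and queues actually configured on Stella may differ.

```python
import shlex


def build_qsub_command(script, job_name, slots=1, parallel_env=None):
    """Return the qsub argument list for submitting `script`.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    if slots > 1 and parallel_env is None:
        raise ValueError("multi-slot jobs need a parallel environment")

    # -N: job name, -cwd: run in the submission directory,
    # -j y: merge stderr into the stdout file.
    cmd = ["qsub", "-N", job_name, "-cwd", "-j", "y"]
    if parallel_env is not None:
        # -pe <name> <slots>: request slots in a parallel environment.
        # The environment name is site-specific (assumed here).
        cmd += ["-pe", parallel_env, str(slots)]
    cmd.append(script)
    return cmd


def as_shell_line(cmd):
    """Render the argument list as a safely quoted shell command."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

For example, `as_shell_line(build_qsub_command("run_mpi.sh", "demo", slots=8, parallel_env="mpi"))` yields `qsub -N demo -cwd -j y -pe mpi 8 run_mpi.sh`, which could then be typed on the front-end node.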
TODO: Add a description of all those useful things in SGE. Add description of queues.
Network File System
Two common ways of copying files across computers are using scp and sharing them via NFS.
scp is available to and from Stella.
Regarding NFS, we decided not to mount any departmental NFS share on stella for two reasons:
- They are unreliable: machines mounting them usually hang when the department's network goes down.
- They are slow: a file written to a departmental NFS file server goes via a series of 100Mbps switches.
Stella is linked to the APT group via a 1Gbps switch.
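To put the two link speeds in perspective, the following sketch computes an idealised lower bound on transfer time. It ignores protocol overhead, disk speed and network contention, so real transfers will be slower; the function name and the 1GB example size are chosen purely for illustration.

```python
def ideal_transfer_seconds(size_bytes, link_mbps):
    """Lower bound on transfer time over a link of `link_mbps` megabits/s.

    Assumes the link is the only bottleneck and carries no other traffic.
    """
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    bits = size_bytes * 8
    return bits / (link_mbps * 1_000_000)


ONE_GB = 1_000_000_000  # example file size in bytes (decimal gigabyte)

departmental = ideal_transfer_seconds(ONE_GB, 100)   # 100Mbps switches
apt_link = ideal_transfer_seconds(ONE_GB, 1000)      # 1Gbps switch
```

Under these assumptions a 1GB file needs at least 80 seconds over the 100Mbps path and at least 8 seconds over the 1Gbps link.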
