Version 6 (modified by admin, 14 years ago)
Automatic Power Control
One key design component of the new Limulus Case is software-controlled power to the nodes. This feature allows nodes to be powered on only when needed. As an experiment, a simple script was written that monitors the Grid Engine queue. If there are jobs waiting in the queue, nodes are powered on. Once the work is done (i.e. nothing more in the queue), the nodes are powered off.
As an example, an 8 core job was run on the Norbert cluster (in the Limulus case). The head node has 4 cores and each worker node has 2 cores, for a total of 10 cores. The 8 core job was submitted via Grid Engine with only the head node powered on. The script noticed the job waiting in the queue and turned on a single worker node to give 6 cores total, which was still not enough. Another node was powered on, the total reached 8 cores, and the job started to run. After completion, the script noticed that there was nothing in the queue and shut down the worker nodes.
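The decision logic described above can be sketched in a few lines of Python. This is only an illustrative model, not the actual script: the `Cluster` class, the `power_step` function, and the returned action strings are hypothetical names, and the queue is passed in as plain numbers rather than read from Grid Engine. It assumes workers are powered on one at a time and that all workers are powered off once no jobs are pending or running.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cluster:
    head_cores: int       # cores on the always-on head node
    worker_cores: int     # cores per worker node
    total_workers: int    # worker nodes installed in the case
    workers_on: int = 0   # worker nodes currently powered on

    def cores_online(self) -> int:
        """Cores available on the head node plus powered-on workers."""
        return self.head_cores + self.workers_on * self.worker_cores


def power_step(cluster: Cluster, pending_cores: List[int], running_jobs: int) -> str:
    """Run one pass of the (hypothetical) control loop and return the action.

    pending_cores: core request of each job waiting in the queue.
    running_jobs:  number of jobs currently executing.
    """
    if not pending_cores:
        if running_jobs == 0 and cluster.workers_on > 0:
            cluster.workers_on = 0          # queue is empty: power workers off
            return "power off workers"
        return "wait"
    needed = max(pending_cores)
    if cluster.cores_online() >= needed:
        return "enough cores"               # scheduler can start the job
    if cluster.workers_on < cluster.total_workers:
        cluster.workers_on += 1             # bring up one more worker
        return "power on one worker"
    return "wait"                           # no more workers to power on


# Replay of the Norbert example: 4 head cores, 2 cores per worker.
norbert = Cluster(head_cores=4, worker_cores=2, total_workers=3)
actions = [power_step(norbert, [8], 0) for _ in range(3)]
```

With these numbers the first two passes each power on one worker (6, then 8 cores online) and the third pass reports that enough cores are available, matching the sequence in the example. A real script would replace the function arguments with the state of the Grid Engine queue and the return values with calls to the power hardware.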
Update January 2011
The following is a list of basic cluster RPMs that will be included in the software stack. The base distribution will be Scientific Linux.
- Scientific Linux V5.4
- Perceus Cluster Toolkit - Cluster administration
- PDSH - Parallel Distributed Shell for collective administration
- Sun Grid Engine - Resource Scheduler
- Torque - Alternative/optional resource scheduler (previously OpenPBS)
- Ganglia - Cluster Monitoring System
- GNU Compilers (gcc, g++, g77, gdb) - Standard GNU compiler suite
- Modules - Manages User Environments
- PVM - Parallel Virtual Machine (message passing middleware)
- MPICH2 - MPI Library (message passing middleware)
- Open MPI - MPI Library (message passing middleware)
- ATLAS - host-tuned BLAS library
- FFTW - Optimized FFT library (version 2 with MPI, and version 3)
- FFTPACK - FFT library
- LAPACK and BLAS - Linear Algebra library
- GNU GSL - GNU Scientific Library (over 1000 functions)
- Userstat - a "top"-like job queue/node monitoring application
- Beowulf Performance Suite - benchmark and testing suite
Attachments
- Ganglia-sge-control3-600x258.jpg (52.6 KB), added by admin 14 years ago.