Changes between Version 13 and Version 14 of LimulusSoftware


Timestamp: 12/09/11 10:51:28
Author: admin
Comment: moved software to top

== Limulus Software ==

The following is a list of basic cluster RPMs that will be included in the software stack. The base distribution will be Scientific Linux.
 * relayset - power relay control utility
 * ssmtp - mail forwarder for nodes

== Automatic Power Control ==

One key design component of the new [wiki:LimulusCase Limulus Case] is software-controlled power to the nodes. This feature allows nodes to be powered on only when needed. As an experiment, a simple script was written that monitors the Grid Engine queue. If there are jobs waiting in the queue, nodes are powered on. Once the work is done (i.e., nothing more in the queue), the nodes are powered off.
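The queue-watching behavior just described can be sketched as a small polling policy. This is an illustrative Python sketch, not the actual script (which is not shown here): the function names are invented, the Grid Engine query and the relayset power calls are stubbed, and the check that avoids shutting down nodes with still-running jobs is an assumption.

```python
# Illustrative sketch only -- the actual monitoring script is not shown on this page.
# pending_jobs() stands in for parsing Grid Engine output (e.g. from qstat),
# and the returned action stands in for invoking the relayset power utility.

def pending_jobs():
    """Stub: the real script would count jobs waiting in the Grid Engine queue."""
    return 0

def decide(pending, running):
    """Policy from the text: power nodes on while jobs wait in the queue,
    power them off once the queue is empty.  The `running` check (do not
    shut down nodes that still have work) is an assumption, not stated
    explicitly in the text."""
    if pending > 0:
        return "power-on"    # jobs waiting: bring another worker node online
    if running == 0:
        return "power-off"   # queue drained and nothing running: shut workers down
    return "wait"            # jobs still running: leave the nodes alone

def poll_once():
    """One cycle of the monitoring loop."""
    return decide(pending_jobs(), running=0)
```

A real deployment would run this in a loop (or from cron), sleeping between polls.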
     31 
As an example, an 8-core job was run on the Norbert cluster (in the Limulus case). The head node has 4 cores and each of the three worker nodes has 2 cores, for a total of 10 cores. The job was submitted via Grid Engine with only the head node powered on. The script noticed the job waiting in the queue and turned on a single worker node, giving 6 cores total, which was still not enough. Another node was powered on, the total reached 8 cores, and the job started to run. After completion, the script noticed that there was nothing in the queue and shut down the nodes.
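The core counting in this example can be written out explicitly. The helper below is hypothetical (the function name and interface are invented for illustration); the figures of 4 head-node cores, 2 cores per worker, and 3 workers come from the text.

```python
def workers_needed(requested_cores, head_cores=4, cores_per_worker=2, max_workers=3):
    """How many worker nodes must be powered on so that
    head_cores + workers * cores_per_worker >= requested_cores."""
    deficit = requested_cores - head_cores
    if deficit <= 0:
        return 0                               # the head node alone is enough
    workers = -(-deficit // cores_per_worker)  # ceiling division
    return min(workers, max_workers)           # only 3 workers exist (10 cores total)

# The 8-core job above: 4 head cores fall short, one worker gives 6,
# a second worker gives 8, so two workers must be powered on.
print(workers_needed(8))   # -> 2
```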
     33 
To see how the number of nodes changes with the work in the queue, consider the Ganglia trace below. Note that the number of nodes in the cluster load graph (green line) changes from 1 to 2 to 3, then back down to 1. Similarly, the number of CPUs (red line) rises to 8 and then back down to 4 (the original 4 cores in the head node). Similar changes can be seen in the system memory.
     35 
[[Image(Ganglia-sge-control3-600x258.jpg)]]