Latest revision as of 09:16, 11 August 2015
The Baobab Cluster
CBB's internal cluster.
The new cluster consists of 6 nodes, each with 24 cores and 64 GB of RAM.
To use the Baobab cluster, set up your environment to use MPI. I generally recommend adding the following to your CBB .bashrc file before you get started:
if [ -n "$(hostname | grep baobab)" ]; then
    module add mpi/openmpi-x86_64
fi
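The host name check above can also be written as a small function, which is easier to reuse and to try out by hand. This is only a sketch: the function name is_baobab_host is made up for illustration, and the extra check for the module command is an assumption about what is convenient, not something the cluster requires.

```shell
# Hypothetical helper: succeeds when the given host name contains "baobab".
# With no argument it checks the current machine's host name.
is_baobab_host() {
    local name="${1:-$(hostname)}"
    case "$name" in
        *baobab*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Load the MPI module only on the cluster, and only if `module` is available.
if is_baobab_host && command -v module >/dev/null 2>&1; then
    module add mpi/openmpi-x86_64
fi
```

Running is_baobab_host baobab-3 && echo yes from a shell is a quick way to confirm that the pattern matches the node names.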
Another recommendation is to create an SSH key pair for yourself if you don't already have one. If your jobs are not running, a missing key pair is probably the reason.
cd ~/.ssh
ssh-keygen -t dsa
# Press Enter three times to accept the defaults; do NOT set a passphrase.
# Note: OpenSSH 7.0 and later disable DSA keys by default. If the key is rejected, use -t rsa and id_rsa.pub instead.
cat id_dsa.pub >> authorized_keys
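Key-based logins can still fail when the key files are too permissive, because sshd's StrictModes setting (on by default) rejects them. The sketch below tightens the permissions; the function name tighten_ssh_perms is hypothetical, and whether this is needed on Baobab is an assumption, so treat it as something to try if logins keep asking for a password.

```shell
# Hypothetical helper: restrict permissions on an SSH directory so that sshd
# (with its default StrictModes setting) will accept the keys inside it.
# Usage: tighten_ssh_perms            (defaults to ~/.ssh)
#        tighten_ssh_perms /some/dir
tighten_ssh_perms() {
    local dir="${1:-$HOME/.ssh}"
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 1
    chmod 700 "$dir"
    # authorized_keys must not be writable by group or others.
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"
    fi
}
```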
After you have done this, you can ssh to the baobab headnode. You can do this from any machine connected to the internal CBB network.
ssh baobab.cbb.lan
Add the node host keys to your known_hosts file (this only needs to be done once):
pdsh -w baobab-[1-6] "uptime"
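Once the host keys are in place, it can help to confirm that every node accepts a login without prompting. The following is a minimal sketch, not an official tool: node_names and check_nodes are hypothetical names, and the baobab-1 to baobab-6 naming is taken from the pdsh command above.

```shell
# Print node host names PREFIX-1 .. PREFIX-COUNT, one per line.
node_names() {
    local prefix="$1" count="$2" i=1
    while [ "$i" -le "$count" ]; do
        echo "$prefix-$i"
        i=$((i + 1))
    done
}

# Hypothetical check: report nodes that refuse a non-interactive login.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_nodes() {
    local node
    for node in $(node_names baobab 6); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true 2>/dev/null; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
        fi
    done
}
```

Calling check_nodes on the headnode prints one line per node; any FAILED entry points at a key or host key problem on that node.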
Please use the PBS submission system to run your jobs. Here is a quick rundown of what you will need to know.
Make yourself a PBS submission script like the following:
#PBS -l nodes=6:ppn=24,walltime=00:01:00
## nodes = total number of nodes you need
## ppn = processors per node that you will need
## walltime = amount of time your job will be allowed before being forcefully removed
#####################################################################################
cd $PBS_O_WORKDIR
## How many cores total do we have?
NO_OF_CORES=`cat $PBS_NODEFILE | egrep -v '^#'\|'^$' | wc -l | awk '{print $1}'`
##
## Main execution
echo "Job Started at: `date`"
mpiexec -np $NO_OF_CORES -machinefile $PBS_NODEFILE hello
echo "Job Ended at: `date`"
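The core count in the script comes from counting the entries in $PBS_NODEFILE, which PBS typically fills with one line per allocated core. The sketch below shows the same counting on a mock node file so it can be tried outside a job. The function name count_cores and the mock file contents are assumptions for illustration only.

```shell
# Count usable entries (one per core) in a PBS-style node file, skipping
# comment lines and blank lines.
count_cores() {
    grep -Evc '^#|^$' "$1"
}

# Demonstration with a mock node file (made-up contents): 2 nodes x 3 cores.
mock_nodefile=$(mktemp)
printf '%s\n' baobab-1 baobab-1 baobab-1 baobab-2 baobab-2 baobab-2 > "$mock_nodefile"
echo "cores: $(count_cores "$mock_nodefile")"
rm -f "$mock_nodefile"
```

With the full request of nodes=6:ppn=24, the same count would be 6 x 24 = 144.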
To submit the jobs to the cluster from the headnode:
qsub <yourPBSscript>
To check the status of your jobs:
qstat -a
To delete your jobs from the queue:
qdel <Job_Number/Identifier>
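The three commands above fit together through the job identifier that qsub prints on submission, usually a sequence number followed by a server name. Here is a rough sketch of capturing and reusing it; job_number and submit_and_report are hypothetical helper names, and the identifier "1234.baobab" in the comment is a made-up example.

```shell
# Extract the numeric sequence number from a job identifier such as
# "1234.baobab" (the host part here is a made-up example).
job_number() {
    echo "${1%%.*}"
}

# Hypothetical wrapper: submit a script, show the job's status, and print
# the command that would remove it from the queue.
submit_and_report() {
    local jobid
    jobid=$(qsub "$1") || return 1
    qstat "$jobid"
    echo "To cancel: qdel $(job_number "$jobid")"
}
```

On the headnode, submit_and_report followed by the path to your PBS script would submit the job and print the matching qdel command.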
For more examples, see: https://wiki.cs.vt.edu/wiki/Cbb/Baobab/examples