Thursday, October 20, 2016

[Linux Cluster: PBS/ Torque]: Installing Torque 4.2.5 on CentOS 6


References:
Do take a look at the Torque Admin Manual
Step 1: Download the Torque Software from Adaptive Computing
Download the Torque tarball from Torque Resource Manager Site
Step 2: Ensure the gcc, gcc-c++, openssl-devel, and libxml2-devel packages are installed
# yum install libxml2-devel openssl-devel gcc gcc-c++
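Before moving on to configure, it can help to confirm that the build packages really are present. The sketch below is only an illustration: the function name missing_packages is made up for this post, and it assumes rpm -q is the query command, as it is on CentOS.

```shell
#!/bin/sh
# Report which of the required build packages are not yet installed.
REQUIRED="gcc gcc-c++ openssl-devel libxml2-devel"

missing_packages() {
    # $1 is the command used to query a package ("rpm -q" on CentOS).
    query="$1"
    for pkg in $REQUIRED; do
        # Print the package name only when the query command fails.
        $query "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Only run the real check on hosts that actually have rpm.
if command -v rpm >/dev/null 2>&1; then
    missing_packages "rpm -q"
fi
```

An empty result means nothing is missing; any names printed can be passed back to yum install.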
Step 3: Configure the Torque Server
./configure \
--prefix=/opt/torque \
--exec-prefix=/opt/torque/x86_64 \
--enable-docs \
--disable-gui \
--with-server-home=/var/spool/torque \
--enable-syslog \
--with-scp \
--disable-rpp \
--disable-spool \
--enable-gcc-warnings \
--with-pam
Step 4: Compile and install Torque
# make -j8
# make install
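With the --prefix and --exec-prefix values used in Step 3, the standard autoconf layout should place the Torque commands under /opt/torque/x86_64/bin and /opt/torque/x86_64/sbin. A minimal sketch of a profile snippet that adds both to PATH follows; the file name /etc/profile.d/torque.sh is an assumption, not something the Torque install creates.

```shell
# Hypothetical /etc/profile.d/torque.sh: put the Torque commands on PATH.
TORQUE_EXEC_PREFIX=/opt/torque/x86_64

for dir in "$TORQUE_EXEC_PREFIX/bin" "$TORQUE_EXEC_PREFIX/sbin"; do
    case ":$PATH:" in
        *":$dir:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$dir" ;;     # append once
    esac
done
export PATH
```

The case test keeps the entries from being added twice when the file is sourced more than once.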
Step 5: Configure the trqauthd daemon to start automatically at system boot for the PBS Server
# cp contrib/init.d/trqauthd /etc/init.d/
# chkconfig --add trqauthd
# echo /opt/torque/x86_64/lib > /etc/ld.so.conf.d/torque.conf
# ldconfig
# service trqauthd start
Step 6a: Copy the pbs_server and pbs_sched init scripts for the PBS Server
# cp contrib/init.d/pbs_server /etc/init.d/pbs_server
# cp contrib/init.d/pbs_sched /etc/init.d/pbs_sched
Step 6b: Initialize serverdb by executing the torque.setup script for the PBS Server
# ./torque.setup root
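The PBS Server also needs to know which compute nodes it manages. Torque reads these from the nodes file under the server home, which with the configure options above would be /var/spool/torque/server_priv/nodes. The sketch below prints one line per node and tags each with a dqueue property, so that it matches the resources_default.neednodes = dqueue setting in the queue configuration later in this post. The node names are the ones used in Step 8a; the function name nodes_file is made up for illustration.

```shell
#!/bin/sh
# Print nodes-file entries: one "<hostname> <property>" line per compute node.
NODES="node01 node02 node03 node04"
PROPERTY="dqueue"    # must match the neednodes value set on the queue

nodes_file() {
    for node in $NODES; do
        echo "$node $PROPERTY"
    done
}

# Print to stdout for review; redirect to the real nodes file when satisfied.
nodes_file
```

After reviewing the output, redirect it to /var/spool/torque/server_priv/nodes and restart pbs_server so the file is read. No np value is written here because the server settings later in this post enable auto_node_np.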
Step 7: Build self-extracting packages for the Client Nodes
# make packages
Building ./torque-package-clients-linux-i686.sh ...
Building ./torque-package-mom-linux-i686.sh ...
Building ./torque-package-server-linux-i686.sh ...
Building ./torque-package-gui-linux-i686.sh ...
Building ./torque-package-devel-linux-i686.sh ...
Done
Step 7b: Run libtool --finish /opt/torque/x86_64/lib
libtool: finish: PATH="/opt/xcat/bin:/opt/xcat/sbin:/opt/xcat/share/xcat/tools:/usr/lib64/qt-3.3/bin:/usr/local/intel/composer_xe_2011_sp1.11.339/bin/intel64:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/ibutils/bin:/usr/local/intel/composer_xe_2011_sp1.11.339/mpirt/bin/intel64:/opt/maui/bin:/opt/torque/x86_64/bin:/root/bin:/sbin" ldconfig -n /opt/torque/x86_64/lib
----------------------------------------------------------------------
Libraries have been installed in:
/opt/torque/x86_64/lib

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
- add LIBDIR to the `LD_LIBRARY_PATH' environment variable
during execution
- add LIBDIR to the `LD_RUN_PATH' environment variable
during linking
- use the `-Wl,-rpath -Wl,LIBDIR' linker flag
- have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------

Step 8a: Copy and install on the Client Nodes
for i in node01 node02 node03 node04 ; do scp torque-package-mom-linux-i686.sh ${i}:/tmp/. ; done
for i in node01 node02 node03 node04 ; do scp torque-package-clients-linux-i686.sh ${i}:/tmp/. ; done
for i in node01 node02 node03 node04 ; do ssh ${i} /tmp/torque-package-mom-linux-i686.sh --install ; done
for i in node01 node02 node03 node04 ; do ssh ${i} /tmp/torque-package-clients-linux-i686.sh --install ; done
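The four loops above can be folded into one. The sketch below only prints the commands it would run, so the plan can be reviewed first; the function name deploy_plan is hypothetical, and the node and package names are the ones from this step.

```shell
#!/bin/sh
# Print (rather than run) the copy-and-install commands for every node.
NODES="node01 node02 node03 node04"
PKGS="torque-package-mom-linux-i686.sh torque-package-clients-linux-i686.sh"

deploy_plan() {
    for node in $NODES; do
        for pkg in $PKGS; do
            echo "scp $pkg ${node}:/tmp/."
            echo "ssh $node /tmp/$pkg --install"
        done
    done
}

deploy_plan
```

Once the printed plan looks right, piping it to a shell (deploy_plan | sh) would execute it, assuming passwordless ssh to each node.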
Step 8b: Alternatively, you can use xCAT to push and run the packages from the PBS Server to the Client Nodes (assuming xCAT is installed on the PBS Server)
# pscp torque-package-mom-linux-i686.sh compute_noderange:/tmp
# pscp torque-package-clients-linux-i686.sh compute_noderange:/tmp
# psh compute_noderange /tmp/torque-package-mom-linux-i686.sh --install
# psh compute_noderange /tmp/torque-package-clients-linux-i686.sh --install
Step 9: Enabling Torque as a service for the Client Node
# cp contrib/init.d/pbs_mom /etc/init.d/pbs_mom
# chkconfig --add pbs_mom
Step 10a: Start the Services for each of the client nodes
# service pbs_mom start
Step 10b: Alternatively, use xCAT to start the service on all the Client Nodes
# psh compute_noderange "/sbin/service pbs_mom start"
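Once pbs_mom is running on the Client Nodes, pbsnodes -a on the PBS Server should list each node with its state. The sketch below condenses that kind of output into one line per node. The sample text is a trimmed, illustrative stand-in for real pbsnodes output, and the function name node_states is made up for this post.

```shell
#!/bin/sh
# Summarise "pbsnodes -a" style output as "<node> <state>" pairs.
node_states() {
    awk '
        /^[^ \t]/ { node = $1 }                        # unindented line: node name
        $1 == "state" && $2 == "=" { print node, $3 }  # indented state line
    '
}

# Illustrative sample input; on the PBS Server use: pbsnodes -a | node_states
sample='node01
     state = free
     np = 8

node02
     state = down
     np = 8'

printf '%s\n' "$sample" | node_states
```

A node that stays in the down state usually means its pbs_mom cannot reach the server, so that is the first thing to check.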

Configuring the Torque Default Queue


qmgr -c "create queue dqueue"
qmgr -c "set queue dqueue queue_type = Execution"
qmgr -c "set queue dqueue resources_default.neednodes = dqueue"
qmgr -c "set queue dqueue enabled = True"
qmgr -c "set queue dqueue started = True"

qmgr -c "set server scheduling = True"
qmgr -c "set server acl_hosts = headnode.com"
qmgr -c "set server default_queue = dqueue"
qmgr -c "set server log_events = 127"
qmgr -c "set server mail_from = Cluster_Admin"
qmgr -c "set server query_other_jobs = True"
qmgr -c "set server resources_default.walltime = 240:00:00"
qmgr -c "set server resources_max.walltime = 720:00:00"
qmgr -c "set server scheduler_iteration = 60"
qmgr -c "set server node_check_rate = 150"
qmgr -c "set server tcp_timeout = 6"
qmgr -c "set server node_pack = False"
qmgr -c "set server mom_job_sync = True"
qmgr -c "set server keep_completed = 300"
qmgr -c "set server submit_hosts = headnode1.com"
qmgr -c "set server submit_hosts += headnode2.com"
qmgr -c "set server allow_node_submit = True"
qmgr -c "set server auto_node_np = True"
qmgr -c "set server next_job_number = 21293"
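With dqueue set as the default queue, a small test job is a quick way to check the whole chain. The script below is a minimal sketch: the job name and the five-minute walltime are arbitrary choices, well inside the limits set above. The function job_banner exists only so the script can also be run by hand outside Torque.

```shell
#!/bin/sh
#PBS -N hello_torque
#PBS -q dqueue
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:05:00
#PBS -j oe

# Torque sets PBS_O_WORKDIR and PBS_JOBID at run time; the fallbacks
# let the script also be run by hand as a quick syntax check.
job_banner() {
    echo "job ${PBS_JOBID:-interactive} on $(uname -n)"
}

cd "${PBS_O_WORKDIR:-.}" || exit 1
job_banner
```

Saved as, say, hello.sh (a hypothetical file name), it would be submitted with qsub hello.sh and followed with qstat.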
