Building and running Lustre clients

We should use the Lustre client version recommended by Seagate. Currently, the recommended version is 2.5.1; the Lustre servers do not yet support newer client versions.

wget https://downloads.hpdd.intel.com/public/lustre/lustre-2.5.1/el6/client/S...
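
To check which client version, if any, is already installed on a node, something like the following can be used (a quick sketch; the /proc entry only exists once the lustre modules are loaded):

rpm -qa | grep lustre-client
cat /proc/fs/lustre/version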

To monitor MobyDisk visit https://cstor00.frank.sam.pitt.edu/ or https://cstor01.frank.sam.pitt.edu/. The username is admin and the password is the Frank root password.

Yum install

Yum install is not recommended; you should always use a custom build. If you would like to try yum install anyway, here is the command.

> yum --nogpgcheck install lustre-client

Custom builds

You can download the source:

wget https://downloads.hpdd.intel.com/public/lustre/lustre-2.5.1/el6/client/S...

Full OS

The following example will build RPMs for a full OS node.

Unmount Lustre (if already mounted)

umount /mnt/mobydisk

Unload Lustre Modules

lustre_rmmod

Ethernet

rpm -ivh --nodeps lustre-client-2.5.1-2.6.32_431.5.1.el6.x86_64.src.rpm
cd /root/rpmbuild/SOURCES/
 
tar xzvf lustre-2.5.1.tar.gz
 
cd lustre-2.5.1
 
./configure --disable-server --with-linux=/usr/src/kernels/2.6.32-504.16.2.el6.x86_64
 
make && make rpms

Note that the Linux kernel version passed to --with-linux should match the kernel version running on the machine. Use 'uname -r' to check the running kernel version.
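
A quick way to confirm that the kernel sources used in the configure step match the running kernel (a sketch; the path assumes the kernel-devel package for the running kernel is installed):

uname -r
ls -d /usr/src/kernels/$(uname -r)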

IB

The only difference from the Ethernet build is the configure step.

./configure --disable-server --with-linux=/usr/src/kernels/2.6.32-358.23.2.el6.x86_64 --with-o2ib=/usr/src/ofa_kernel/default

The ofa_kernel sources should be installed under /usr/src/ofa_kernel/default.
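
A quick sanity check that the OFED kernel sources are where the configure step expects them (only a sketch, assuming an OFED install that provides this tree):

ls /usr/src/ofa_kernel/default/include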

Check whether a Lustre client is already installed:

rpm -qa |grep lustre
# remove any existing lustre client packages
rpm -e <package name>

Install Lustre client 2.5.1:

cd /root/rpmbuild/RPMS/x86_64
rpm -Uvh <package name>

Warewulf

If building on a Warewulf head node, make sure that the compute nodes' chroot has kernel-devel and kernel-headers installed. Here we assume that IB is used.
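
If they are missing, they can be installed into the chroot with the same yum wrapper used in the rest of this section (a sketch, assuming the golden chroot and kernel-devel/kernel-headers packages matching the compute-node kernel):

yum --nogpgcheck -c /var/chroots/golden/root/yum-ww.conf --tolerant --installroot /var/chroots/golden/ install kernel-devel-2.6.32-358.23.2.el6 kernel-headers-2.6.32-358.23.2.el6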

rpmbuild --define 'kversion 2.6.32-358.23.2.el6.x86_64' --define 'kdir /var/chroots/golden/usr/src/kernels/2.6.32-358.23.2.el6.x86_64/' --define "configure_args --with-o2ib=/var/chroots/golden/usr/src/ofa_kernel/default" --rebuild --without servers lustre-client-2.5.1-2.6.32_431.5.1.el6.x86_64.src.rpm
 
# check installed lustre clients
yum --nogpgcheck -c /var/chroots/golden/root/yum-ww.conf --tolerant --installroot /var/chroots/golden/ list installed |grep lustre
 
# remove lustre clients
yum --nogpgcheck -c /var/chroots/golden/root/yum-ww.conf --tolerant --installroot /var/chroots/golden/ remove <package name>
 
# install lustre client 2.5.1
cd /root/rpmbuild/RPMS/x86_64/
yum --nogpgcheck -c /var/chroots/golden/root/yum-ww.conf --tolerant --installroot /var/chroots/golden/ --setopt=obsoletes=0 localinstall <package name>
 
 
>wwvnfs golden --chroot=/var/chroots/golden
Creating VNFS image from golden
Building new chroot...
Building and compressing the final image
Cleaning temporary files
WARNING:  Do you wish to overwrite 'golden' in the Warewulf data store?
Yes/No> yes
>wwbootstrap --chroot=/var/chroots/golden 2.6.32-358.23.2.el6.x86_64
>#reboot compute nodes
  • RPMs for Scyld compute/head nodes are built by Penguin Computing. Contact support for more information.

Startup and shutdown scripts

On Warewulf, special care has to be taken to make sure that the Lustre kernel modules are unloaded early in the reboot process, or compute nodes may hang.

The attached mount_lustre script (see Attachments below) has been provided to handle this scenario.

The following steps have to be taken when adding Lustre to a Warewulf image (a command sketch follows the list):

  1. Copy mount_lustre to /var/chroots/IMAGE/etc/init.d and make it executable
  2. Chroot into the image and run chkconfig mount_lustre on
  3. While in the chroot image run ln -s ../init.d/mount_lustre /etc/rc6.d/K01mount_lustre
  4. Exit the chroot and rebuild the VNFS
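
A minimal command sketch of steps 1-4, assuming the image lives in /var/chroots/golden and mount_lustre has been downloaded to the current directory:

cp mount_lustre /var/chroots/golden/etc/init.d/mount_lustre
chmod +x /var/chroots/golden/etc/init.d/mount_lustre
chroot /var/chroots/golden chkconfig mount_lustre on
chroot /var/chroots/golden ln -s ../init.d/mount_lustre /etc/rc6.d/K01mount_lustre
wwvnfs golden --chroot=/var/chroots/golden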

Configure lnet for clients

There are two classes of Lustre clients based on network connectivity. The appropriate configuration below must be placed in /etc/modprobe.d/lnet.conf before mounting MobyDisk.

InfiniBand clients

options lnet networks=o2ib(ib0)
options lnet dead_router_check_interval=60
options lnet avoid_asym_router_failure=1
options lnet check_routers_before_use=1
options lnet live_router_check_interval=60
options lnet router_ping_timeout=50
options lnet large_router_buffers=1025 small_router_buffers=16384

Ethernet clients

Clients that do not have a direct IB connection to MobyDisk must route traffic through mobydisk-router (see below).

options lnet dead_router_check_interval=60
options lnet avoid_asym_router_failure=1
options lnet check_routers_before_use=1
options lnet live_router_check_interval=60
options lnet router_ping_timeout=50
options lnet large_router_buffers=1025 small_router_buffers=16384
options lnet networks="tcp(<interface>)"
options lnet routes="o2ib 10.201.0.[10,20]@tcp"
options ko2iblnd peer_credits=16
options ksocklnd tx_buffer_size=0
options ksocklnd rx_buffer_size=65536
  • NOTE: remember to change <interface> to the correct value for the internal 10.201 network.
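
Once lnet.conf is in place (IB or Ethernet), the LNet setup can be sanity-checked before mounting; this is only a quick sketch:

modprobe lnet
lctl network up
lctl list_nids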

Mounting

Finally, mount the filesystem

modprobe lustre
mount -t lustre 10.201.0.24@o2ib:10.201.0.23@o2ib:/mobydisk /mnt/mobydisk

Or add the following to /etc/fstab

10.201.0.24@o2ib:10.201.0.23@o2ib:/mobydisk /mnt/mobydisk lustre defaults 0 0

Usually no other actions are required to mount. The kernel modules will be loaded automatically.
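
To confirm that the filesystem is mounted and the OSTs are visible, something like the following can be used (lfs is installed with the client RPMs):

mount -t lustre
lfs df -h /mnt/mobydisk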

Stopping kernel modules

If something goes wrong, or during a WhamCloud upgrade, simply run lustre_rmmod to unload all of the Lustre kernel modules.

Client Tuning

Proper tuning of the Lustre clients is described in the attached PowerPoint file, which was provided by Mike Solari (Xyratex).

FDR IB clients (2.6)

echo 384 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_dirty_mb
echo 256 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_rpcs_in_flight
lctl set_param ldlm.namespaces.*osc*.lru_size=0

QDR IB clients (2.6)

echo 256 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_dirty_mb
echo 256 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_rpcs_in_flight
lctl set_param ldlm.namespaces.*osc*.lru_size=0

TCP-based clients; including VMs (2.6)

echo 32 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_dirty_mb
echo 32 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_rpcs_in_flight
lctl set_param ldlm.namespaces.*osc*.lru_size=0

Lustre v1.8.8/1.8.9 clients

echo 128 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_dirty_mb
echo 32 > /proc/fs/lustre/osc/mobydisk-OST[0000-0007]-osc-fff88010a937000/max_rpcs_in_flight
lctl set_param ldlm.namespaces.*osc*.lru_size=(ncpus*100)

Note: These settings are not persistent, meaning that if Lustre is unmounted for any reason, they must be reset by an admin.
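
Because of this, one option is to reapply them from /etc/rc.d/rc.local after the mount. The following is only a sketch for an FDR IB client; substitute the values for the appropriate client class above:

# reapply Lustre client tuning at boot (FDR IB values shown)
for f in /proc/fs/lustre/osc/mobydisk-OST*-osc-*/max_dirty_mb; do echo 384 > "$f"; done
for f in /proc/fs/lustre/osc/mobydisk-OST*-osc-*/max_rpcs_in_flight; do echo 256 > "$f"; done
lctl set_param ldlm.namespaces.*osc*.lru_size=0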

Lustre Routers

A single Lustre router provides access to MobyDisk for clients that have only an Ethernet connection. Lustre routers need at least the lustre-client-modules RPM package installed.

  • mobydisk-router.sam.pitt.edu has a bonded pair of 10 Gbps Ethernet links and a single QDR IB connection.

To start the Lustre router, run the following commands:

> modprobe ksocklnd
> modprobe ko2iblnd
> modprobe lustre
> modprobe mgc

service lnet start may also work.

The /etc/modprobe.d/lnet.conf file on the router is as follows; bond1 is the bonded Ethernet interface and ib0 is the IB interface.

options lnet networks=tcp0(bond1)
options lnet dead_router_check_interval=60
options lnet avoid_asym_router_failure=1
options lnet check_routers_before_use=1
options lnet live_router_check_interval=60
options lnet router_ping_timeout=50
options lnet large_router_buffers=1025 small_router_buffers=16384
options lnet networks="tcp(bond1),o2ib(ib0)" forwarding=enabled
options lnet routes="o2ib0 10.201.0.[10,20]@tcp0"
options ksocklnd peer_credits=16
  • NOTE: in this config and in the client config above, the 10.201.0.10 IP address is left in in case a second router is added to the network. In that event, clients will not need to be remounted.
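
To verify routing from an Ethernet client once the router is up, the route table can be listed and one of the MDS NIDs from the mount command above pinged (a quick sketch):

lctl show_route
lctl ping 10.201.0.24@o2ib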

Notes from Fangping Mu

For "Permission Denied" error after failover, Mathieu Dube from seagate fixed it. This is a user and group upcall problem. If it comes up again, please run the following on the MDS node:

echo NONE > /proc/fs/lustre/mdt/mobydisk-MDT0000/identity_upcall

The user either has to be in the MDS passwd file, or the upcall must be set so that the MDS can look up user permissions. Setting identity_upcall to NONE makes the MDS skip the lookup and trust the uid:gid supplied by the clients.
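
To check the current value on the MDS, lctl can read the same proc entry (a quick sketch):

lctl get_param mdt.mobydisk-MDT0000.identity_upcall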

Attachments

  • ClientConnectivity_Benchmarking_0514_ver2.pptx (953.81 KB)
  • mount_lustre (1.89 KB)