GlusterFS on ARM

ARM devices like the Raspberry Pi offer an inexpensive way to learn and play with new technologies. Combine a few Raspberry Pis with GlusterFS to create a distributed, fault-tolerant storage system that can be used by Docker and Kubernetes.

What is GlusterFS

GlusterFS is a clustered file system. Without going into the mechanics, a clustered file system provides a reliable way to replicate files across the nodes of the cluster. GlusterFS is well documented and free, which makes it an excellent way to learn how clustered file systems work and how they can fit into a private (non-cloud) storage solution.

Hardware setup

4x Raspberry Pi 3B
4x Micro SD cards
4x USB thumb drives
1x 4-layer stackable case
1x USB Power (4 x 2.0A)
1x Pack of micro USB cables
1x 5 port gigabit ethernet switch
1x Pack of flat/flexible network cables

Installing XFS and GlusterFS packages

The XFS and GlusterFS packages need to be installed on all devices. Log into each Raspberry Pi and run:

sudo apt-get install -y xfsprogs glusterfs-server

Preparing the thumb drives

Note: The following steps will wipe all the data from the thumb drives. Copy any data you need to keep off the thumb drives before proceeding.

Ensure the thumb drives are inserted into the Raspberry Pis before continuing. First we need to identify the device name for each thumb drive. This should be /dev/sda if only a single thumb drive is inserted, and the lsblk command can be used to confirm:

lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda           8:0    1 57.9G  0 disk 
└─sda1        8:1    1 57.9G  0 part 
mmcblk0     179:0    0 29.7G  0 disk 
├─mmcblk0p1 179:1    0 43.2M  0 part /boot
└─mmcblk0p2 179:2    0 29.7G  0 part /

As shown in the output above, there are two drives: sda and mmcblk0. The mount points / and /boot indicate that mmcblk0 is the Micro SD card that Raspbian is installed on, and sda is a 57.9GB disk with no mount point, which is the USB thumb drive. Make note of the device name for the USB thumb drive and substitute it for sda in the commands below as necessary.
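
If you would rather not read the lsblk table by eye, the same check can be scripted. The sketch below assumes the thumb drive is the only disk that lsblk flags as removable (RM is 1, as in the output above); some USB drives report 0 there, so treat the result as a hint and confirm it against the full lsblk output. The function name removable_disks is just an illustration.

```shell
# Print the device path of every removable whole disk.
# Reads "NAME RM TYPE" rows on stdin, as produced by:
#   lsblk -dn -o NAME,RM,TYPE
removable_disks() {
    awk '$2 == 1 && $3 == "disk" { print "/dev/" $1 }'
}

# Usage on a Raspberry Pi (commented out so the snippet does nothing when sourced):
# lsblk -dn -o NAME,RM,TYPE | removable_disks
```

The function only filters text, so it can be tried safely with pasted lsblk rows before pointing any destructive command at the device it reports.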

The USB thumb drive needs to be partitioned and formatted for GlusterFS to use. The first step is to create a new partition with the correct type using fdisk. On all Raspberry Pis, run fdisk and enter the following commands at its prompts. The complete session output follows the list.

  1. sudo fdisk -w auto /dev/sda
  2. g
  3. n
  4. (Just press the Enter key to accept the default)
  5. (Just press the Enter key to accept the default)
  6. (Just press the Enter key to accept the default)
  7. w
sudo fdisk -w auto /dev/sda
Welcome to fdisk (util-linux 2.29.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): g
Created a new GPT disklabel (GUID: 8BAE955A-00E2-4FAB-BF57-79B32194E5CD).
Command (m for help): n
Partition number (1-128, default 1): 
First sector (2048-121307102, default 2048): 
Last sector, +sectors or +size{K,M,G,T,P} (2048-121307102, default 121307102):
Created a new partition 1 of type 'Linux filesystem' and of size 57.9 GiB.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
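
Typing the same answers on four devices gets tedious, so the keystrokes can also be piped into fdisk. This is only a sketch: it assumes fdisk asks exactly the prompts shown in the session above and in the same order, which may not hold for other fdisk versions or for drives that trigger extra questions. The function name fdisk_answers is made up for this example.

```shell
# Emit the same answers typed interactively above:
# g (new GPT label), n (new partition), three empty lines to accept the
# default partition number, first sector and last sector, then w (write).
fdisk_answers() {
    printf 'g\nn\n\n\n\nw\n'
}

# Usage (destructive, so commented out; substitute your device for /dev/sda):
# fdisk_answers | sudo fdisk -w auto /dev/sda
```

Check the result with lsblk afterwards, since a mismatched prompt would shift every answer that follows it.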

The new partition on each thumb drive must be formatted with an XFS file system for GlusterFS.

sudo mkfs.xfs -f -L myvol-brick1 /dev/sda1

Mounting the thumb drives

The thumb drive must be mounted before it can be used. Using the GlusterFS administrator guide as a reference, the naming convention is /data/glusterfs/[volume]/[brick]/brick. On all Raspberry Pis, create the directory to mount the USB drive at.

sudo mkdir -p /data/glusterfs/myvol1/brick1/

Using the raw device path, such as /dev/sda1, is not recommended for USB drives because the device path can change when more than one USB drive is plugged in. The recommended, modern approach is to identify the target partition by its partition UUID, which is what Raspbian already does in its own /etc/fstab. The blkid command can be used to find the partition UUID, and the output is passed to tee to append a new entry to /etc/fstab.

printf $(sudo blkid -o export /dev/sda1|grep PARTUUID)" /data/glusterfs/myvol1/brick1 xfs defaults,noatime 1 2\n" | sudo tee -a /etc/fstab
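
A slightly more defensive variant of the same idea is sketched below. It builds the fstab entry in a small function (fstab_line is a hypothetical name) that refuses to print anything when blkid reports no PARTUUID, so a missing or unpartitioned drive does not leave a broken line in /etc/fstab. The mount options are the ones used above.

```shell
# Build an /etc/fstab entry from `blkid -o export` output read on stdin.
# $1 is the mount point. Prints nothing and returns 1 if no PARTUUID is found.
fstab_line() {
    partuuid=$(grep '^PARTUUID=' | head -n 1)
    [ -n "$partuuid" ] || return 1
    printf '%s %s xfs defaults,noatime 1 2\n' "$partuuid" "$1"
}

# Usage (commented out because it edits /etc/fstab):
# sudo blkid -o export /dev/sda1 \
#   | fstab_line /data/glusterfs/myvol1/brick1 \
#   | sudo tee -a /etc/fstab
```

Running the first two stages of the pipeline without tee is a cheap way to preview the entry before it is written.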

Now the partition is ready to be mounted so it can be used by glusterd.

sudo mount /data/glusterfs/myvol1/brick1

Configuring GlusterFS

The gluster daemon must be told which other nodes are in the cluster. From one Raspberry Pi, run the following command 3 times, replacing PEER_HOSTNAME with the hostname of each of the other 3 Raspberry Pis. Gluster will automatically share the peer list with the other nodes, so the probes only need to be run from one node.

sudo gluster peer probe PEER_HOSTNAME
peer probe: success.
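
The three probes can be wrapped in a loop. The helper below is a minimal sketch; probe_peers is a made-up name, and the hostnames in the usage line are placeholders for your own.

```shell
# Probe every hostname given as an argument, stopping at the first failure.
probe_peers() {
    for peer in "$@"; do
        sudo gluster peer probe "$peer" || return 1
    done
}

# Usage, run from one node only (placeholder hostnames):
# probe_peers pi-node2 pi-node3 pi-node4
```

Stopping at the first failure makes it obvious which hostname could not be reached, rather than burying the error among later successes.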

Now the status of the gluster network should be checked to ensure proper configuration on each Raspberry Pi.

sudo gluster peer status
Number of Peers: 3
Hostname: pi-node2
Uuid: 3abf3a5f-1d41-47f9-b4b4-0f1d92def46d
State: Peer in Cluster (Connected)
Hostname: pi-node3
Uuid: d635c767-1257-41f5-b598-0be84bbfa75d
State: Peer in Cluster (Connected)
Hostname: pi-node4
Uuid: d635c767-1257-41f5-b598-0be84bbfa75d
State: Peer in Cluster (Connected)

With the gluster network running, a new volume needs to be created. Be sure to replace HOSTNAME1 through HOSTNAME4 with your hostnames. Since I want each of my Raspberry Pis to have a copy of the data, I have specified replica 4 and provided all 4 hosts and paths that will replicate the data. Tailor these to how you want your clustered file system to be replicated.

Note: Notice that an additional /brick was added to the end of the thumb drive mount point of /data/glusterfs/myvol1/brick1. This is done intentionally so that if the thumb drive is not mounted, gluster will fail to start the volume. This prevents gluster from replicating the contents of the cluster volume to the root drive (in this case the Micro SD card where Raspbian is installed).

sudo gluster volume create myvol1 replica 4 HOSTNAME1:/data/glusterfs/myvol1/brick1/brick HOSTNAME2:/data/glusterfs/myvol1/brick1/brick HOSTNAME3:/data/glusterfs/myvol1/brick1/brick HOSTNAME4:/data/glusterfs/myvol1/brick1/brick
volume create: myvol1: success: please start the volume to access data
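
Because the brick path is identical on every node, the long command can be generated from a list of hostnames. The sketch below assumes one brick per host under the brick1 path used in this post and sets the replica count to the number of hosts, which matches my layout but may not match yours. volume_create_cmd is a hypothetical helper; it only prints the command so it can be reviewed before running it with sudo.

```shell
# Print a "gluster volume create" command with one brick per host.
# $1 is the volume name; the remaining arguments are hostnames.
# The replica count is set to the number of hosts.
volume_create_cmd() {
    volume=$1
    shift
    bricks=""
    for host in "$@"; do
        bricks="$bricks $host:/data/glusterfs/$volume/brick1/brick"
    done
    printf 'gluster volume create %s replica %s%s\n' "$volume" "$#" "$bricks"
}

# Usage: inspect the printed command first, then run it with sudo.
# volume_create_cmd myvol1 HOSTNAME1 HOSTNAME2 HOSTNAME3 HOSTNAME4
```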

Then the volume needs to be started.

sudo gluster volume start myvol1
volume start: myvol1: success

Testing GlusterFS

With the gluster volume running, we should conduct a test to ensure the data replicates as expected. On one node only, create a directory to mount the clustered volume, mount the volume, create a test file, and unmount the clustered volume.

sudo mkdir -p /mnt/myvol1
sudo mount -t glusterfs localhost:/myvol1 /mnt/myvol1
sudo touch /mnt/myvol1/helloworld.txt
sudo umount /mnt/myvol1

Now we can check the thumb drive to see if helloworld.txt was replicated to all the nodes. On each Raspberry Pi, check the thumb drive for files.

ls -al /data/glusterfs/myvol1/brick1/brick
total 0
drwxr-xr-x  4 root root  63 Jul 13 09:04 .
drwxr-xr-x  3 root root  19 Jul 13 09:04 ..
drw------- 10 root root 194 Jul 13 09:04 .glusterfs
-rw-r--r--  2 root root   0 Jul 13 09:04 helloworld.txt
drwxr-xr-x  3 root root  25 Jul 13 09:04 .trashcan
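
Logging into each node to run ls works, but the check can also be done from one machine. The sketch below assumes SSH access to every Raspberry Pi, which this post does not otherwise set up, and that the brick directory is readable by the login user. check_replicas is a made-up helper name.

```shell
# Report whether helloworld.txt exists in the brick directory on each host.
# Assumes SSH access to every node; returns non-zero if any node lacks the file.
check_replicas() {
    status=0
    for host in "$@"; do
        if ssh "$host" test -e /data/glusterfs/myvol1/brick1/brick/helloworld.txt; then
            echo "$host: ok"
        else
            echo "$host: missing"
            status=1
        fi
    done
    return "$status"
}

# Usage (replace with your hostnames):
# check_replicas HOSTNAME1 HOSTNAME2 HOSTNAME3 HOSTNAME4
```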

That is all it takes to get your very own GlusterFS clustered file system running! The next step is to incorporate it into your service architecture and put it to good use.

