Raspberry Pi Cluster Build
Building a bioinformatic computing cluster from single-board computers
Why Build a Raspberry Pi Cluster?
Single-board computers (SBCs) aren't suitable for every analysis—they have relatively little memory and are slower than contemporary workstation CPUs. However, they excel at simple calculations, and their low cost makes it economical to parallelize certain jobs across many boards. Use cases include fitting many candidate models to a single dataset, running large numbers of iterative simulations, and other embarrassingly parallel problems.
Beyond practical computing, hosting a cluster in our lab provides students with the unique opportunity to experiment with cluster computing in ways that are difficult when using university-administered HPC systems. Plus, this project is a great hands-on learning experience in systems administration, distributed computing, and open-source software.
Hardware & Supplies
Most components were sourced from Amazon. With careful shopping or repurposed equipment, costs could be reduced.
| Quantity | Description | Link | Unit Cost | Total Cost |
|---|---|---|---|---|
| 25 | Raspberry Pi 4 B with 4 GB RAM | canakit | $55.00 | $1,375.00 |
| 1 | Raspberry Pi 4 B with 8 GB RAM | canakit | $89.95 | $89.95 |
| 5 | Rack and fan setups | amazon | $25.00 | $125.00 |
| 6 | Micro SD 32 GB 5-pack | amazon | $33.00 | $198.00 |
| 1 | 24-port Gigabit Ethernet switch | amazon | $65.90 | $65.90 |
| 5 | CAT6 cables 1 foot 6-pack | amazon | $12.99 | $64.95 |
| 8 | USB-C power cords 8" 3-pack | amazon | $8.99 | $71.92 |
| 6 | USB 60 Watt charging stations 6-socket | amazon | $25.99 | $155.94 |
| | **Total Project Cost** | | | $2,146.66 |
Not included in this cost are items already on hand and "donated" to the project: a 1TB SSD external hard drive, monitor, keyboard, mouse, 8-port Gigabit Ethernet switch, and miscellaneous cables.
Power Configuration Note: Six 6-port chargers were purchased, giving 36 total ports for 26 Pis. After researching power requirements, most chargers run only four Pis each to ensure adequate power delivery, especially when USB devices are connected.
Total System Specs
104 cores (26 quad-core processors) with 108 GB of RAM (25 × 4 GB + 1 × 8 GB)
Inspiration & References
This project builds upon excellent existing tutorials:
Part 1: Physical Assembly
Assembling the Pi racks was straightforward:
- Install heatsinks on all Pis
- Attach fans to mounting plates (designed for five Pis each)
- Combine plates into three towers: two with 8 Pis and one with 9 Pis
- Mount each Pi on standoffs and stack plates in towers
- Connect fans to pins on the Pi below each plate
The head node is a separate Raspberry Pi 4 (8GB RAM version) housed in a standard case. This design allows for future replacement with a more powerful Linux machine from System76 without disrupting worker nodes. At this stage, cabling, disk setup, and software installation remained.
Part 2: Operating System & Software Installation
OS Selection
Initial testing compared Ubuntu Server 20.04 with the Lubuntu desktop to the standard 32-bit Raspbian (Buster, based on Debian 10). While Ubuntu is more general-purpose, Raspbian proved more responsive with lower RAM overhead and was ultimately chosen for all nodes.
Head Node Setup
The installation process on the head node follows these steps:
- Install Raspbian Buster with system updates
- Compile R from source (version 4.0.2)
- Install required system libraries and R packages
- Install OpenMPI for distributed computing
R Compilation: Compiling R requires installing multiple dependencies. If the configure script fails, search for the missing library using patterns like "how to install libXXX on raspberry pi"—solutions are typically found in the first few results.
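As a rough sketch, the build follows the standard GNU source pattern. The dependency list below is typical for Raspbian Buster rather than an exact record of this build—expect to add packages as `configure` complains:

```shell
# Install common build dependencies for R (package names as on Raspbian Buster)
sudo apt-get update
sudo apt-get install -y build-essential gfortran libreadline-dev \
    libx11-dev libxt-dev libpng-dev libjpeg-dev libcairo2-dev \
    libbz2-dev liblzma-dev libcurl4-openssl-dev libpcre2-dev texinfo

# Download, configure, and compile R 4.0.2 from source
wget https://cran.r-project.org/src/base/R-4/R-4.0.2.tar.gz
tar -xzf R-4.0.2.tar.gz
cd R-4.0.2
./configure --enable-R-shlib   # shared-library build, useful for add-on tooling
make -j4                       # use all four Pi cores
sudo make install
```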
Note: PDF man pages failed to build (texi2any version conflict), but base R and all packages installed successfully.
SD Card Cloning & Worker Nodes
After head node configuration, use Raspbian's built-in SD card copier utility to create 25 copies (approximately 20 minutes per card). To enable headless SSH setup, add an empty file named ssh to the boot partition before first boot. However, connecting a monitor during initial setup is recommended to verify HDMI detection.
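With a cloned card still mounted on another Linux machine, the marker file can be created like this (the mount point is an assumption—adjust it for your system):

```shell
# Create an empty file named "ssh" on the card's boot partition;
# Raspbian enables the SSH server on first boot when it sees this file.
touch /media/$USER/boot/ssh
```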
HDMI Detection Issue: If Pis boot without HDMI cables connected, they don't start the X11 windowing system, producing no video output even after plugging in a monitor. Always connect HDMI cables before the first boot.
Node Configuration
Configure each node (head node first, then workers) by performing these steps after its first boot:
1. System Configuration
Open the Pi configuration application and:
- Change the hostname (head node: "stevens"; worker nodes: "node1", "node2", etc.)
- Update the default password if desired
- Select "No" when prompted to reboot—we'll reboot after additional changes
Naming Convention: All lab computers are named after great scientists. The head node is named "stevens" after Nettie Stevens, a pioneer in genetics who wrote the groundbreaking Studies in Spermatogenesis volumes (1905-1906) on sex chromosomes.
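The same hostname change can be made without the GUI; a sketch using `raspi-config`'s non-interactive mode (substitute each worker's number):

```shell
# Non-interactive equivalent of the GUI hostname change
sudo raspi-config nonint do_hostname node1

# Or edit the files directly:
# echo "node1" | sudo tee /etc/hostname
# sudo sed -i 's/raspberrypi/node1/' /etc/hosts
```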
2. Static IP Configuration
Set up static IP addresses to enable consistent node identification:
Add these lines to the network configuration file (increment the final octet for each node). For example, the head node uses 10.0.0.1, node1 uses 10.0.0.2, and so on up to 10.0.0.26 for node25.
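On Raspbian Buster, static addressing is handled by dhcpcd, so the file to edit is /etc/dhcpcd.conf. A sketch for the head node—the router and DNS addresses are placeholders for your own network:

```shell
# /etc/dhcpcd.conf -- head node example (use 10.0.0.2 on node1, etc.)
interface eth0
static ip_address=10.0.0.1/24
static routers=10.0.0.254              # gateway address is a placeholder
static domain_name_servers=10.0.0.254  # DNS address is a placeholder
```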
3. Enable SSH
SSH is disabled by default, and enabling it through the GUI does not persist across reboots. Enable it via the command line instead:
Verify SSH status:
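A minimal sketch of both steps using systemd (the service name `ssh` is standard on Raspbian):

```shell
# Enable SSH so it starts now and on every subsequent boot
sudo systemctl enable ssh
sudo systemctl start ssh

# Verify the service is running
sudo systemctl status ssh    # look for "active (running)"
```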
4. Reboot and Verify
After reboot, verify the hostname change and confirm SSH is running.
SSH Key Exchange
Generate SSH keys for passwordless communication between nodes:
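A minimal sketch, assuming the default RSA key location (`~/.ssh/id_rsa`):

```shell
# Generate an RSA key pair for the pi user
ssh-keygen -t rsa -b 4096
```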
Press Enter through all prompts to use defaults. Configure key sharing:
On Worker Nodes: Copy the head node's key:
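One way to do this, assuming the head node is reachable at 10.0.0.1 under the default `pi` user (you will be prompted for its password this one time):

```shell
# Run on each worker: fetch the head node's public key and
# append it to this worker's authorized_keys
mkdir -p ~/.ssh
ssh pi@10.0.0.1 'cat ~/.ssh/id_rsa.pub' >> ~/.ssh/authorized_keys
```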
On Head Node: After all nodes are configured, copy the head node's key to all workers:
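A sketch using `ssh-copy-id` in a loop; the `nodeN` hostnames assume matching entries in /etc/hosts (otherwise substitute the 10.0.0.x addresses):

```shell
# Run once on the head node after every worker is up
for i in $(seq 1 25); do
    ssh-copy-id "pi@node$i"    # or pi@10.0.0.$((i + 1)) by IP
done
```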
Verification: From the head node, test seamless SSH access to a worker:
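For example (the hostname `node1` and user `pi` follow the scheme above):

```shell
# Should log in, print the worker's hostname, and exit
# without ever prompting for a password
ssh pi@node1 hostname
```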
Part 3: Cabling & Finalization
Once all nodes are configured, finalize the physical cabling setup. In hindsight, slightly longer network cables would have provided more flexibility for cable management and rack organization. The current setup works but leaves minimal slack in some cable runs.
Part 4: Test Run
Coming soon—distributed computing benchmarks and first job submissions.
Last Updated: 2024