High Availability Installation (Basic Failover)

This guide covers installing Netsocs across two Linux nodes in an active-passive high availability configuration. If Node A (master) goes down, Node B (backup) automatically takes over and Netsocs keeps running — no manual intervention required.

Scope of this guide

This configuration covers failover of the application layer only (Docker stack + shared files + Virtual IP). Databases (MySQL, MongoDB, Redis) are assumed to run on external servers and are not part of this setup.


Requirements

Both nodes must have

  • Ubuntu 20.04+ or Debian 11+ (or compatible)
  • Docker Engine ≥ 24.x installed
  • Docker Compose plugin installed
  • The netsocs-docker-compose repository cloned at the same path on both nodes (recommended: /opt/netsocs)
  • Network connectivity between both nodes (mutual ping)

Network ports to open between nodes

Port / Protocol           Direction          Purpose
VRRP (IP protocol 112)    Both ways          Keepalived heartbeat
2049 TCP/UDP              Node B → Node A    NFS (shared volumes)
111 TCP/UDP               Node B → Node A    NFS portmapper

Note that VRRP is IP protocol number 112, not a TCP/UDP port, so it cannot be opened with a port rule.
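As an example, with ufw on Node A the NFS ports could be opened like this (ufw is an assumption; substitute your own NODE_B_IP). The VRRP heartbeat is IP protocol 112 rather than a TCP/UDP port and is covered separately under Common Issues:

```shell
# On Node A: allow NFS and portmapper traffic from Node B.
# NODE_B_IP below is the example value from this guide.
NODE_B_IP="192.168.1.11"
if command -v ufw >/dev/null 2>&1; then
  ufw allow from "$NODE_B_IP" to any port 2049    # NFS
  ufw allow from "$NODE_B_IP" to any port 111     # portmapper
else
  echo "ufw not found; open TCP/UDP 2049 and 111 from $NODE_B_IP in your firewall"
fi
```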

Information you will need

Variable         Description                                   Example
NODE_A_IP        Real IP of Node A (master)                    192.168.1.10
NODE_B_IP        Real IP of Node B (backup)                    192.168.1.11
VIP_ADDRESS      Floating Virtual IP (must not be in use)      192.168.1.100
NETWORK_IFACE    Network interface name on both nodes          eth0 / ens3
AUTH_PASS        VRRP shared password (max 8 characters)       Ns3c2024

VIP must be free

The VIP_ADDRESS must be an IP that is not assigned to any device on your network. Both nodes must be on the same subnet as this IP.
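A quick sanity check before installing (a minimal sketch; substitute your own VIP_ADDRESS):

```shell
# Ping the candidate VIP: no replies means it is very likely free.
# An ARP-level check (arping) is more reliable on the local subnet, if available.
VIP_ADDRESS="192.168.1.100"
if ping -c 2 -W 1 "$VIP_ADDRESS" >/dev/null 2>&1; then
  echo "WARNING: $VIP_ADDRESS answered - pick a different VIP"
else
  echo "OK: $VIP_ADDRESS appears to be free"
fi
```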


Step 1 — Configure Node A (Master)

On Node A, navigate to the Netsocs project directory and run the master setup script as root:

cd /opt/netsocs
sudo bash failover/start-as-master.sh

The script will ask for the required values interactively. If you prefer to pre-set them, export the variables before running:

export NODE_A_IP="192.168.1.10"
export NODE_B_IP="192.168.1.11"
export VIP_ADDRESS="192.168.1.100"
export NETWORK_IFACE="eth0"
export AUTH_PASS="Ns3c2024"
sudo bash failover/start-as-master.sh

What the script does on Node A

  1. Installs and configures an NFS server to share Netsocs volumes with Node B
  2. Creates the compose.override.yml symlink to activate NFS-backed volumes
  3. Brings up the Docker stack (docker compose up -d)
  4. Installs Keepalived and generates its configuration (MASTER mode, priority 100)
  5. Starts Keepalived and assigns the Virtual IP to Node A
  6. Places the notify-master.sh, notify-backup.sh, and health-check.sh scripts in /etc/keepalived/

When the script finishes, Node A is live and serving traffic on the VIP.
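The Keepalived configuration the script generates on Node A presumably looks something like the following sketch; the instance name, virtual router ID, and check interval are assumptions, since the real file is produced by the setup script:

```
# /etc/keepalived/keepalived.conf on Node A (illustrative sketch)
vrrp_script health_check {
    script "/etc/keepalived/health-check.sh"
    interval 2
    fall 2
}

vrrp_instance NETSOCS {
    state MASTER
    interface eth0              # NETWORK_IFACE
    virtual_router_id 51        # must match on both nodes
    priority 100                # Node B uses 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass Ns3c2024      # AUTH_PASS (max 8 characters)
    }
    virtual_ipaddress {
        192.168.1.100           # VIP_ADDRESS
    }
    track_script {
        health_check
    }
    notify_master /etc/keepalived/notify-master.sh
    notify_backup /etc/keepalived/notify-backup.sh
}
```

Node B's file would differ only in state (BACKUP) and priority (50); because preemption is on by default, the higher-priority node reclaims the VIP whenever it is healthy.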


Step 2 — Configure Node B (Backup)

On Node B, navigate to the Netsocs project directory and run the backup setup script as root. Use the exact same values you used on Node A:

cd /opt/netsocs
sudo bash failover/start-as-backup.sh

What the script does on Node B

  1. Installs the NFS client and mounts the 8 shared volumes from Node A (persisted in /etc/fstab)
  2. Creates the compose.override.yml symlink (same as Node A)
  3. Installs Keepalived and generates its configuration (BACKUP mode, priority 50)
  4. Starts Keepalived in standby — the Docker stack is not started
  5. Places the same notify and health-check scripts in /etc/keepalived/

Docker stack on Node B

The Netsocs stack does not run on Node B during normal operation. Keepalived will start it automatically via notify-master.sh when Node B takes the VIP.
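As a rough sketch of the usual Keepalived notify pattern (the real notify-master.sh ships with the repo, so the paths and log format here are assumptions):

```shell
#!/bin/bash
# Illustrative sketch of /etc/keepalived/notify-master.sh (not the shipped file).
STACK_DIR=/opt/netsocs
LOG=/var/log/keepalived-failover.log

# Append to the failover log; fall back to stdout if the log is not writable.
log() { echo "$(date '+%F %T') $*" >> "$LOG" 2>/dev/null || echo "$(date '+%F %T') $*"; }

log "transition to MASTER: VIP acquired, starting Docker stack"
# Guard so this sketch is safe to dry-run on a machine without the stack.
if command -v docker >/dev/null 2>&1 && [ -d "$STACK_DIR" ]; then
  (cd "$STACK_DIR" && docker compose up -d) && log "stack started"
else
  log "docker or $STACK_DIR missing (dry run, nothing started)"
fi
```

notify-backup.sh would mirror this with `docker compose down` when the node loses the VIP.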


Verification

After both scripts complete, run these checks.

Check Node A holds the VIP

On Node A:

ip addr show eth0 | grep 192.168.1.100
# The VIP should appear on the interface

Check Keepalived status

On either node:

systemctl status keepalived
journalctl -u keepalived -f

Check NFS mounts on Node B

On Node B:

df -h | grep netsocs-nfs
# Should show 8 NFS mounts
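Each of the eight entries the backup script persists in /etc/fstab presumably looks something like this (the mount point and export path are assumptions; check /etc/fstab on Node B for the real ones):

```
# /etc/fstab on Node B -- one line per shared volume (illustrative)
192.168.1.10:/opt/netsocs/volumes/config  /mnt/netsocs-nfs/config  nfs  defaults,_netdev  0 0
```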

Access Netsocs

From any browser on your network, navigate to the VIP address:

http://192.168.1.100

Failover Behavior

The system follows an active-passive model:

Normal state:
  Node A → MASTER  → holds VIP  → stack running  → serving traffic
  Node B → BACKUP  → standby    → stack stopped

Node A fails:
  Node B → takes VIP  → notify-master.sh → docker compose up -d
  Node B → MASTER  → stack running  → serving traffic

Node A recovers:
  Node A → reclaims VIP (preempt)  → notify-master.sh → docker compose up -d
  Node B → releases VIP  → notify-backup.sh  → docker compose down
  Back to normal state

Failover time is approximately 3–6 seconds.


Testing Failover Manually

Simulate Node A failure

On Node A:

systemctl stop keepalived

On Node B (within ~5 seconds):

# VIP should appear on Node B
ip addr show eth0 | grep 192.168.1.100

# Docker stack should be running
docker compose ps

# Check the failover event log
tail -20 /var/log/keepalived-failover.log

Restore Node A as master

On Node A:

systemctl start keepalived
# Node A reclaims the VIP automatically via preempt (~10 seconds)

On Node B:

# VIP should be gone from Node B
ip addr show | grep 192.168.1.100  # no output expected

# Stack should be stopped
docker compose ps  # no containers expected


Useful Log Files

File Purpose
/var/log/keepalived-failover.log Failover events (VIP taken / released, stack start/stop)
/var/log/keepalived-health.log Periodic health check results
journalctl -u keepalived -f Live Keepalived daemon log

Common Issues

VIP not appearing on Node A after setup

Keepalived heartbeats use VRRP, which is IP protocol 112, not a TCP/UDP port, so a port rule will not open it. ufw cannot match the VRRP protocol directly; the simplest fix is to allow all traffic from the peer node (or add an iptables rule for protocol 112):

# On Node A, allow traffic from Node B
ufw allow from 192.168.1.11

# On Node B, allow traffic from Node A
ufw allow from 192.168.1.10

NFS mounts failing on Node B

Verify the NFS server is running on Node A and the firewall allows ports 2049 and 111:

# On Node A
showmount -e localhost       # should list 8 exports
systemctl status nfs-kernel-server

# On Node B
showmount -e 192.168.1.10   # must be reachable
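If exports are missing, compare against what the master script presumably wrote to /etc/exports on Node A (the path and options shown are assumptions):

```
# /etc/exports on Node A -- one line per shared volume (illustrative)
/opt/netsocs/volumes/config  192.168.1.11(rw,sync,no_subtree_check,no_root_squash)
```

After correcting the file, run sudo exportfs -ra on Node A and re-check with showmount -e localhost.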

Node B takes the VIP immediately after setup

This is expected if Node A's Keepalived was not yet running when Node B started. Once both nodes are running, Node A will reclaim the VIP automatically within ~10 seconds due to the preempt setting.