High Availability Installation (Basic Failover)
This guide covers installing Netsocs across two Linux nodes in an active-passive high availability configuration. If Node A (master) goes down, Node B (backup) automatically takes over and Netsocs keeps running — no manual intervention required.
Scope of this guide
This configuration covers failover of the application layer only (Docker stack + shared files + Virtual IP). Databases (MySQL, MongoDB, Redis) are assumed to run on external servers and are not part of this setup.
Requirements¶
Both nodes must have¶
- Ubuntu 20.04+ or Debian 11+ (or compatible)
- Docker Engine ≥ 24.x installed
- Docker Compose plugin installed
- The `netsocs-docker-compose` repository cloned at the same path on both nodes (recommended: `/opt/netsocs`)
- Network connectivity between both nodes (mutual ping)
Network ports to open between nodes¶
| Port | Protocol | Direction | Purpose |
|---|---|---|---|
| n/a (IP protocol 112) | VRRP | Both ways | Keepalived heartbeat |
| 2049 | TCP/UDP | Node B → Node A | NFS (shared volumes) |
| 111 | TCP/UDP | Node B → Node A | NFS portmapper |
Information you will need¶
| Variable | Description | Example |
|---|---|---|
| `NODE_A_IP` | Real IP of Node A (master) | `192.168.1.10` |
| `NODE_B_IP` | Real IP of Node B (backup) | `192.168.1.11` |
| `VIP_ADDRESS` | Floating Virtual IP; must not be in use | `192.168.1.100` |
| `NETWORK_IFACE` | Network interface name on both nodes | `eth0` / `ens3` |
| `AUTH_PASS` | VRRP shared password (max 8 characters) | `Ns3c2024` |
VIP must be free
The VIP_ADDRESS must be an IP that is not assigned to any device on your network. Both nodes must be on the same subnet as this IP.
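You can sanity-check the same-subnet requirement before running the scripts. The helper below is illustrative only (it is not part of the Netsocs scripts) and assumes IPv4 and that you know your netmask:

```shell
#!/usr/bin/env bash
# same_subnet IP1 IP2 NETMASK: succeeds when both IPs fall in the same subnet.
same_subnet() {
  local -a a b m
  IFS=. read -r -a a <<< "$1"
  IFS=. read -r -a b <<< "$2"
  IFS=. read -r -a m <<< "$3"
  local i
  for i in 0 1 2 3; do
    # Compare the network portion of each octet under the mask
    [ $(( a[i] & m[i] )) -eq $(( b[i] & m[i] )) ] || return 1
  done
}

# The VIP should share the nodes' subnet (here a /24):
same_subnet 192.168.1.10 192.168.1.100 255.255.255.0 && echo "VIP is on the nodes' subnet"
```

To confirm the VIP itself is unused, a quick `ping` to it from either node should get no reply before setup.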
Step 1 — Configure Node A (Master)¶
On Node A, navigate to the Netsocs project directory and run the master setup script as root:
cd /opt/netsocs
sudo bash failover/start-as-master.sh
The script will ask for the required values interactively. If you prefer to pre-set them, export the variables before running:
export NODE_A_IP="192.168.1.10"
export NODE_B_IP="192.168.1.11"
export VIP_ADDRESS="192.168.1.100"
export NETWORK_IFACE="eth0"
export AUTH_PASS="Ns3c2024"
sudo bash failover/start-as-master.sh
What the script does on Node A¶
- Installs and configures an NFS server to share Netsocs volumes with Node B
- Creates the `compose.override.yml` symlink to activate NFS-backed volumes
- Brings up the Docker stack (`docker compose up -d`)
- Installs Keepalived and generates its configuration (MASTER mode, priority 100)
- Starts Keepalived and assigns the Virtual IP to Node A
- Places the `notify-master.sh`, `notify-backup.sh`, and `health-check.sh` scripts in `/etc/keepalived/`
When the script finishes, Node A is live and serving traffic on the VIP.
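For reference, the Keepalived configuration generated on Node A should look roughly like the sketch below, using the example values from this guide. The instance name and `virtual_router_id` are placeholders; the exact file written by `start-as-master.sh` is authoritative.

```conf
# /etc/keepalived/keepalived.conf on Node A (illustrative sketch)
vrrp_script check_health {
    script "/etc/keepalived/health-check.sh"
    interval 5
}

vrrp_instance NETSOCS {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100            # Node B uses 50, so Node A preempts on recovery
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass Ns3c2024  # VRRP uses at most 8 characters
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        check_health
    }
    notify_master /etc/keepalived/notify-master.sh
    notify_backup /etc/keepalived/notify-backup.sh
}
```

Node B's file differs only in `state BACKUP` and `priority 50`.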
Step 2 — Configure Node B (Backup)¶
On Node B, navigate to the Netsocs project directory and run the backup setup script as root. Use the exact same values you used on Node A:
cd /opt/netsocs
sudo bash failover/start-as-backup.sh
What the script does on Node B¶
- Installs the NFS client and mounts the 8 shared volumes from Node A (persisted in `/etc/fstab`)
- Creates the `compose.override.yml` symlink (same as Node A)
- Installs Keepalived and generates its configuration (BACKUP mode, priority 50)
- Starts Keepalived in standby; the Docker stack is not started
- Places the same notify and health-check scripts in `/etc/keepalived/`
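Each NFS mount the script persists on Node B is an `/etc/fstab` line of roughly this shape. The export path shown is a placeholder; the real paths depend on the Netsocs volume layout written by the script:

```conf
# /etc/fstab on Node B (illustrative; one line per shared volume)
# _netdev delays mounting until the network is up
192.168.1.10:/opt/netsocs/volumes/example  /opt/netsocs/volumes/example  nfs  defaults,_netdev  0  0
```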
Docker stack on Node B
The Netsocs stack does not run on Node B during normal operation. Keepalived will start it automatically via notify-master.sh when Node B takes the VIP.
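The notify hook is what ties Keepalived to Docker. Below is a minimal sketch of what `notify-master.sh` does, written as a function for illustration; the script shipped in `netsocs-docker-compose` is authoritative and may differ in detail:

```shell
#!/usr/bin/env bash
# Illustrative sketch of the MASTER-transition hook. Keepalived invokes it
# when this node wins the VIP; it starts the stack and logs the event.
notify_master() {
  local compose_dir="${1:-/opt/netsocs}"
  local log="${2:-/var/log/keepalived-failover.log}"
  echo "$(date '+%F %T') became MASTER: starting Netsocs stack" >> "$log"
  ( cd "$compose_dir" && docker compose up -d ) >> "$log" 2>&1 \
    && echo "$(date '+%F %T') stack started" >> "$log" \
    || echo "$(date '+%F %T') ERROR: stack failed to start" >> "$log"
}
```

`notify-backup.sh` is the mirror image: it logs the demotion and runs `docker compose down`.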
Verification¶
After both scripts complete, run these checks.
Check Node A holds the VIP¶
On Node A:
ip addr show eth0 | grep 192.168.1.100
# The VIP should appear on the interface
Check Keepalived status¶
On either node:
systemctl status keepalived
journalctl -u keepalived -f
Check NFS mounts on Node B¶
On Node B:
df -h | grep netsocs-nfs
# Should show 8 NFS mounts
Access Netsocs¶
From any browser on your network, navigate to the VIP address:
http://192.168.1.100
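If you script your checks, a small retry loop is handy, since the stack can take a few seconds to come up after a failover. This helper is illustrative and not part of Netsocs:

```shell
#!/usr/bin/env bash
# wait_for_vip URL [TRIES]: poll the VIP until it answers HTTP, up to TRIES times.
wait_for_vip() {
  local url="$1" tries="${2:-10}" i
  for i in $(seq 1 "$tries"); do
    if curl -fsS -o /dev/null --max-time 2 "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    sleep 1
  done
  echo "still down after $tries attempts"
  return 1
}

# Usage: wait_for_vip http://192.168.1.100 15
```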
Failover Behavior¶
The system follows an active-passive model:
Normal state:
Node A → MASTER → holds VIP → stack running → serving traffic
Node B → BACKUP → standby → stack stopped
Node A fails:
Node B → takes VIP → notify-master.sh → docker compose up -d
Node B → MASTER → stack running → serving traffic
Node A recovers:
Node A → reclaims VIP (preempt) → notify-master.sh → docker compose up -d
Node B → releases VIP → notify-backup.sh → docker compose down
Back to normal state
Failover time is approximately 3–6 seconds.
Testing Failover Manually¶
Simulate Node A failure¶
On Node A:
systemctl stop keepalived
On Node B (within ~5 seconds):
# VIP should appear on Node B
ip addr show eth0 | grep 192.168.1.100
# Docker stack should be running
docker compose ps
# Check the failover event log
tail -20 /var/log/keepalived-failover.log
Restore Node A as master¶
On Node A:
systemctl start keepalived
# Node A reclaims the VIP automatically via preempt (~10 seconds)
On Node B:
# VIP should be gone from Node B
ip addr show | grep 192.168.1.100 # no output expected
# Stack should be stopped
docker compose ps # no containers expected
Useful Log Files¶
| Log source | Purpose |
|---|---|
| `/var/log/keepalived-failover.log` | Failover events (VIP taken / released, stack start/stop) |
| `/var/log/keepalived-health.log` | Periodic health check results |
| `journalctl -u keepalived -f` | Live Keepalived daemon log |
Common Issues¶
VIP not appearing on Node A after setup¶
Check that VRRP traffic is allowed between the nodes. VRRP is IP protocol 112, not a TCP or UDP port, so a plain `ufw allow` rule cannot match it; add iptables rules (or the equivalent in `/etc/ufw/before.rules`) instead:
# On Node A, allow VRRP from Node B
sudo iptables -I INPUT -p 112 -s 192.168.1.11 -j ACCEPT
# On Node B, allow VRRP from Node A
sudo iptables -I INPUT -p 112 -s 192.168.1.10 -j ACCEPT
NFS mounts failing on Node B¶
Verify the NFS server is running on Node A and the firewall allows ports 2049 and 111:
# On Node A
showmount -e localhost # should list 8 exports
systemctl status nfs-kernel-server
# On Node B
showmount -e 192.168.1.10 # must be reachable
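For comparison, each export on Node A is an `/etc/exports` entry of roughly the following shape. The volume path is a placeholder; `start-as-master.sh` writes the real entries:

```conf
# /etc/exports on Node A (illustrative; one line per shared volume)
/opt/netsocs/volumes/example  192.168.1.11(rw,sync,no_subtree_check,no_root_squash)
```

After editing exports by hand, apply them with `sudo exportfs -ra`.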
Node B takes the VIP immediately after setup¶
This is expected if Node A's Keepalived was not yet running when Node B started. Once both nodes are running, Node A will reclaim the VIP automatically within ~10 seconds due to the preempt setting.