OpenBSD Handbook

    • Part I. Install & Configure
      • Introduction
      • Installing OpenBSD
      • The X Window System
      • Networking
      • System Configuration
      • OpenBSD Basics
      • Managing Software: Packages and Ports
    • Part II. Daily Operations
      • Graphical Environments
      • Multimedia
      • Printing
      • Linux Compatibility
      • Windows Compatibility
      • Games
    • Part III. System Administration
      • Security
      • Virtualization
      • Storage and File Systems
      • Updating and Upgrading
      • Localization
      • The OpenBSD Boot Process
    • Part IV. Networking & Daemons
      • Services
        • Database
          • MariaDB
          • PostgreSQL
          • Redis
          • memcached
        • Directory
          • YP (NIS)
          • LDAP
        • File
          • NFS
          • Samba
        • FTP Services
          • ftpd
          • ProFTPD
          • vsftpd
          • TFTP
        • Mail
          • Dovecot
          • smtpd
          • Postfix
          • Exim
          • Rspamd
        • Name
          • Named
          • Unbound
          • NSD
        • Networking
          • OpenBGPD
          • rtadvd
          • DHCP
          • slaacd
        • Web
          • Apache
          • nginx
          • httpd
          • relayd
        • Logging
          • syslogd
        • Monitoring
          • SNMP
        • Remote Access
          • Audit OpenSSH
          • sshd
        • File Synchronization
          • rsync
        • Messaging
          • RabbitMQ
        • Time
          • NTP
      • PF
        • pfctl cheat sheet
        • PF Anchors
        • PF Filter Rules
        • PF Forwarding
        • PF Lists and Macros
        • PF Load Balancing
        • PF Logging
        • PF NAT
        • PF Options
        • PF Policies
        • PF Shortcuts
        • PF Tables
      • Advanced Networking
        • High Availability and State Replication
        • Multi-WAN and Policy-Based Routing
        • VPN and Cryptographic Tunneling
        • Classic and Lightweight Tunnels
        • IPv6 at Scale
        • QoS and Traffic Shaping
        • MPLS and Label Distribution
        • Network Services at Scale
        • Virtualization and Host Networking
        • Large-Scale L2 and L3 Design
        • Telemetry, Logging, and Flow Export
        • Hardening and Operational Safety
        • Reference Architectures
        • Troubleshooting Playbooks
      • Serial Communication
    • Part V. Miscellaneous
      • Virtualization Cheat Sheet
      • OpenBSD Cheatsheet
      • Howto
        • Install Z shell (zsh)
        • Set Up WordPress
        • Build a Simple Router and Firewall
      • OpenBSD for Linux Users
      • OpenBSD for FreeBSD Users
      • OpenBSD for macOS Users
    • Package Search
      High Availability and State Replication
      • Synopsis
      • Design Considerations
      • Configuration
        • System settings (both nodes)
        • Node A (preferred MASTER)
        • Node B (BACKUP)
        • PF policy (both nodes)
        • Optional: L4 VIP for a service using relayd (both nodes)
        • Optional: event-driven demotion on SYNC or upstream loss
      • Verification
      • Troubleshooting
      • See Also

      High Availability and State Replication

      Synopsis #

      This chapter describes building a two-node OpenBSD firewall or gateway cluster with carp(4) virtual IPs, pfsync(4) state replication, and optional service failover using relayd(8) . It provides canonical configurations for inside and outside virtual IPs (VIPs), a dedicated state-sync network, and safe failover behavior. Use this pattern whenever an outage of a single gateway or firewall would impact production traffic. Packet filtering is configured in pf.conf(5) and managed at runtime with pfctl(8) .

      Design Considerations #

      • Topology. Use three physical interfaces per node: WAN, LAN, and SYNC. Place pfsync(4) on a private L2 segment with no other traffic.
      • Addressing. Assign each node a fixed unicast address on WAN and LAN, and create per-segment CARP VIPs with unique VHIDs. Example:
        • WAN: node A 203.0.113.2/29, node B 203.0.113.3/29, VIP 203.0.113.1/29, VHID 10
        • LAN: node A 10.10.10.2/24, node B 10.10.10.3/24, VIP 10.10.10.1/24, VHID 20
      • Priority and preemption. Lower advskew wins. Enable net.inet.carp.preempt=1 so the preferred node retakes MASTER after repair.
      • Isolation and filtering. Allow proto carp on member interfaces and proto pfsync only on the SYNC interface. Skip filtering on the SYNC interface.
      • Failure domains. A SYNC link failure must trigger CARP demotion so a healthy peer becomes MASTER. Monitor upstream reachability if asymmetric failures are possible.
      • Configuration parity. Keep identical PF and service configuration on both nodes. States replicate; rules do not.
      • Operational guardrails. Test failover during maintenance windows. Log CARP events at an elevated level during rollout.

      Configuration #

      The examples assume interface names em0 (WAN), em1 (LAN), and em2 (SYNC). Adjust to your hardware. Interface files are described in hostname.if(5) . Interface configuration and CARP parameters are managed with ifconfig(8) . System controls are set with sysctl(8) and persisted via sysctl.conf(5) . Services are managed with rcctl(8) .

      System settings (both nodes) #

      # printf '%s\n' \
          'net.inet.carp.preempt=1' \
          'net.inet.carp.log=2' \
          >> /etc/sysctl.conf
        # Enable CARP preemption and raise log verbosity; persists across reboots
      
      # sysctl net.inet.carp.preempt=1 net.inet.carp.log=2
        # Apply immediately at runtime
      

      Node A (preferred MASTER) #

      # cat > /etc/hostname.em0 <<'EOF'
      inet 203.0.113.2 255.255.255.248
      EOF
        # WAN unicast on em0
      
      # cat > /etc/hostname.em1 <<'EOF'
      inet 10.10.10.2 255.255.255.0
      EOF
        # LAN unicast on em1
      
      # cat > /etc/hostname.em2 <<'EOF'
      inet 10.255.255.2 255.255.255.252
      EOF
        # SYNC network on em2 (private /30)
      
      # cat > /etc/hostname.carp0 <<'EOF'
      inet 203.0.113.1 255.255.255.248 vhid 10 carpdev em0 pass sharedsecret advskew 0
      EOF
        # WAN VIP: VHID 10, lowest advskew to prefer MASTER
      
      # cat > /etc/hostname.carp1 <<'EOF'
      inet 10.10.10.1 255.255.255.0 vhid 20 carpdev em1 pass sharedsecret advskew 0
      EOF
        # LAN VIP: VHID 20
      
      # cat > /etc/hostname.pfsync0 <<'EOF'
      up syncdev em2
      EOF
        # pfsync over em2; multicast by default on the SYNC segment
      
      # sh /etc/netstart em0 em1 em2 carp0 carp1 pfsync0
        # Bring interfaces up now without reboot
      

      Node B (BACKUP) #

      # cat > /etc/hostname.em0 <<'EOF'
      inet 203.0.113.3 255.255.255.248
      EOF
      
      # cat > /etc/hostname.em1 <<'EOF'
      inet 10.10.10.3 255.255.255.0
      EOF
      
      # cat > /etc/hostname.em2 <<'EOF'
      inet 10.255.255.3 255.255.255.252
      EOF
      
      # cat > /etc/hostname.carp0 <<'EOF'
      inet 203.0.113.1 255.255.255.248 vhid 10 carpdev em0 pass sharedsecret advskew 100
      EOF
        # Higher advskew so this node remains BACKUP while A is healthy
      
      # cat > /etc/hostname.carp1 <<'EOF'
      inet 10.10.10.1 255.255.255.0 vhid 20 carpdev em1 pass sharedsecret advskew 100
      EOF
      
      # cat > /etc/hostname.pfsync0 <<'EOF'
      up syncdev em2
      EOF
      
      # sh /etc/netstart em0 em1 em2 carp0 carp1 pfsync0
      

      PF policy (both nodes) #

      The following minimal policy enables pfsync and CARP traffic, skips filtering on the SYNC interface, and allows all on LAN while blocking by default on WAN. Edit to fit your security model. Syntax is defined in pf.conf(5) . Load and introspect with pfctl(8) .

      ## /etc/pf.conf (minimal HA skeleton)
      
      set skip on { lo, pfsync0 }
      
      # Interfaces
      wan = "em0"
      lan = "em1"
      sync = "em2"
      
      # Allow control-plane for HA
      pass on $wan proto carp
      pass on $lan proto carp
      pass on $sync proto pfsync
      
      # Example policy (tighten for production)
      block all
      pass on $lan
      pass out on $wan proto { tcp udp icmp } from ($wan) to any keep state
      
      # pfctl -f /etc/pf.conf
        # Atomic reload of rules
      # pfctl -sr
        # Verify rules include proto carp and proto pfsync
      

      Optional: L4 VIP for a service using relayd (both nodes) #

      Use relayd.conf(5) to front-end a service VIP on the LAN side that fails over with CARP. Manage the daemon with rcctl(8) and runtime with relayctl(8).

      ## /etc/relayd.conf (simple TCP health-checked relay)
      table <web_backends> { 10.10.10.11, 10.10.10.12 }
      
      relay "www" {
          listen on 10.10.10.1 port 80
          forward to <web_backends> check tcp port 80
      }
      
      # rcctl enable relayd
        # Start on boot
      # rcctl start relayd
        # Launch now
      

      Optional: event-driven demotion on SYNC or upstream loss #

      In addition to automatic kernel demotion on certain failures, operators often demote CARP when SYNC breaks or when upstream reachability fails. The simplest operational control is to adjust advskew on all CARP VIPs of the affected node. The following example forces a node to BACKUP or returns it to preferred MASTER:

      # ifconfig carp0 advskew 240
        # Heavily demote WAN VIP on this node
      
      # ifconfig carp1 advskew 240
        # Heavily demote LAN VIP on this node
      
      # ifconfig carp0 advskew 0
        # Restore preferred priority for WAN VIP
      
      # ifconfig carp1 advskew 0
        # Restore preferred priority for LAN VIP
      

      Automating this decision can be done with policy tooling such as ifstated(8) (not shown here; refer to its manual for condition syntax) that triggers demotion on link or reachability failure.

      Verification #

      • Interface roles. Confirm MASTER/BACKUP on each CARP VIP:
      $ ifconfig carp0
        # Shows "MASTER" or "BACKUP" and VHID/advskew
      $ ifconfig carp1
      $ ifconfig pfsync0
        # Confirms syncdev em2 and up status
      
      • State replication. Start a connection through the cluster, then verify states exist on both nodes with pfctl(8) :
      $ pfctl -vvss | head
        # Show replicated states with creators and age/counters
      
      • Wire-level sync. Observe pfsync traffic on the SYNC interface with tcpdump(8) :
      # tcpdump -n -e -ttt -i pfsync0
        # Expect steady state updates; bursts during new flows and failover
      
      • CARP health. Inspect CARP sysctls with sysctl(8) :
      $ sysctl net.inet.carp
        # Shows allow, preempt, log level, and current demotion value
      
      • Runtime dashboards. Use systat(1) pf view for live counters, and check /var/log/messages for CARP role changes.

      Troubleshooting #

      • MASTER/MASTER condition. Two nodes both claim MASTER on a VLAN. Check for VHID collisions, mismatched pass, or isolated broadcast domains. Inspect ifconfig carpX and correct VHID/password.
      • No state replication. Verify pfsync0 is up with the correct syncdev, PF is enabled, and proto pfsync is permitted only on the SYNC interface. Confirm with tcpdump -i pfsync0.
      • Flapping roles. Upstream instability or asymmetric paths can cause preemption loops. Temporarily raise advskew on the unstable node, investigate physical links, and consider disabling preemption until repaired.
      • Failover works, but connections drop. Ensure rulesets are identical on both nodes and NAT settings match. Confirm states replicate (pfctl -vvss) and that hosts on LAN receive gratuitous ARP from CARP during role change.
      • Relayd not serving after failover. Ensure relayd runs on both nodes, listens on the CARP VIP, and that the health checks match the service. Review /var/log/relayd and relayctl show relays.
      • Demotion stuck high. Inspect sysctl net.inet.carp.demotion. Find and resolve the underlying cause (for example, SYNC down). Reset by lowering advskew back to normal and clearing the fault.

      See Also #

      • Networking
      • OpenBGPD
      • Related: Multi-WAN and Policy-Based Routing
      • Related: Troubleshooting Playbooks
      Report a bug
      • Synopsis
      • Design Considerations
      • Configuration
        • System settings (both nodes)
        • Node A (preferred MASTER)
        • Node B (BACKUP)
        • PF policy (both nodes)
        • Optional: L4 VIP for a service using relayd (both nodes)
        • Optional: event-driven demotion on SYNC or upstream loss
      • Verification
      • Troubleshooting
      • See Also