Flannel over Tailscale¶

Overview¶

By default k3s uses Flannel VXLAN with each node's LAN IP as the VTEP (tunnel endpoint). If DHCP reassigns a node's IP, the other nodes' forwarding database (fdb) tables go stale and cross-node pod traffic silently drops - causing DNS timeouts, Flux reconciliation failures, and general cluster instability.

The primary fix is to set node-ip to the node's Tailscale IP. Tailscale IPs (100.x.x.x) are assigned by the Tailscale control plane and stay fixed for as long as the node remains registered in the tailnet, regardless of what DHCP does to the LAN interfaces. This ensures the Kubernetes control plane and node registration always use stable IPs.

For the Flannel VXLAN data plane, this cluster uses flannel-iface: eth0 (the LAN interface) rather than tailscale0. This is an intentional performance trade-off - see Flannel iface trade-off below.

Node registration:  100.x.x.x  (Tailscale IP - stable, via node-ip flag)
Flannel data plane: <lan-cidr> (LAN IP - fast, via flannel-iface: eth0)

Current Node IPs¶

Node	LAN IP	Tailscale IP
k3s-server
k3s-agent-1
k3s-agent-2

How It Works¶

The config¶

Each node has /etc/rancher/k3s/config.yaml written before k3s starts:

Server (k3s-server):

write-kubeconfig-mode: "644"
tls-san:
  - k3s-server.tailnet.ts.net
  - <k3s-server-ts-ip>
  - <k3s-server-lan-ip>
node-ip: "<k3s-server-ts-ip>"
flannel-iface: eth0

Agents (k3s-agent-1, k3s-agent-2):

node-ip: "100.x.x.x"
node-external-ip: "100.x.x.x"
flannel-iface: eth0

What each flag does¶

Flag	Effect
`flannel-iface: eth0`	Flannel binds its VXLAN VTEP to the LAN interface for the data plane
`node-ip`	The IP the node registers with the API server (its internal IP) - set to the Tailscale IP for stability
`node-external-ip`	The externally-routable IP for the node (agents only)
`tls-san`	Adds the listed names and IPs (Tailscale hostname, Tailscale IP, LAN IP) to the API server's TLS certificate (server only)

What happens at the kernel level¶

Flannel maintains a forwarding database (fdb) entry per remote node. With flannel-iface: eth0, VXLAN packets are sent directly to LAN IPs:

# fdb entries point at LAN IPs (fast, direct)
bridge fdb show dev flannel.1
aa:bb:cc:dd:ee:01 dst <k3s-agent-1-lan-ip> self permanent

The Kubernetes node registration (control plane routing, kubectl, etc.) still uses Tailscale IPs because node-ip is set to 100.x.x.x. If the LAN IP changes, only the Flannel data plane is affected - the cluster control plane remains healthy.
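To spot stale entries quickly, the fdb output can be filtered against the LAN IPs you expect. The following is a minimal sketch, not part of the cluster tooling: the function names `fdb_remote_vteps` and `stale_vteps` are hypothetical, and it assumes each entry follows the `<mac> dst <ip> self permanent` layout shown above.

```shell
#!/bin/sh
# fdb_remote_vteps: read `bridge fdb show dev flannel.1` output on stdin
# and print the destination IP of every entry, one per line.
fdb_remote_vteps() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dst") print $(i + 1) }'
}

# stale_vteps EXPECTED_IP...: print every fdb destination that is NOT in
# the list of expected LAN IPs given as arguments.
stale_vteps() {
  expected=" $* "
  fdb_remote_vteps | while read -r ip; do
    case "$expected" in
      *" $ip "*) ;;          # known LAN IP, entry is fine
      *) echo "$ip" ;;       # unknown destination, probably stale
    esac
  done
}
```

On a node this could be used as `bridge fdb show dev flannel.1 | stale_vteps <k3s-agent-1-lan-ip> <k3s-agent-2-lan-ip>`; empty output means every VTEP points at a known LAN IP.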

Flannel iface trade-off¶

Why not `flannel-iface: tailscale0`?¶

An earlier version of this cluster used flannel-iface: tailscale0, which routes all Flannel VXLAN traffic through the Tailscale WireGuard tunnel. While this encrypts inter-node pod traffic, it creates a severe MTU cascade when combined with the Tailscale operator:

tailscale0 MTU 1280 − 50 (VXLAN) = pod MTU 1230

The Tailscale operator proxy pods run their own tailscaled inside them - a second layer of WireGuard on top of the already-encapsulated pod network. This triple-encapsulates every proxied packet:

data → pod WireGuard → Flannel VXLAN → host WireGuard → eth0

Effective payload MTU dropped to ~1150 bytes vs the expected ~1420, causing significant IP fragmentation and throughput degradation on all Tailscale-exposed services.
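The arithmetic behind these numbers can be reproduced with a short script. This is a sketch under stated assumptions: the 50-byte VXLAN overhead is the figure used above, while the 80-byte WireGuard overhead and the 1500-byte eth0 MTU are assumptions chosen because they match the ~1150 and 1450 values quoted in this document.

```shell
#!/bin/sh
VXLAN_OVERHEAD=50       # bytes, as used in the calculation above
WIREGUARD_OVERHEAD=80   # bytes, assumed worst-case WireGuard overhead

# pod_mtu IFACE_MTU: MTU Flannel gives pods when VXLAN runs over that interface.
pod_mtu() {
  echo $(( $1 - $VXLAN_OVERHEAD ))
}

# proxy_payload_mtu IFACE_MTU: payload left once a proxy pod adds its own
# WireGuard layer on top of the pod network.
proxy_payload_mtu() {
  echo $(( $(pod_mtu "$1") - $WIREGUARD_OVERHEAD ))
}

echo "tailscale0 (MTU 1280): pod MTU $(pod_mtu 1280), proxied payload $(proxy_payload_mtu 1280)"
echo "eth0 (MTU 1500, assumed): pod MTU $(pod_mtu 1500)"
```

The first line reports 1230 and 1150 for the tailscale0 path, and the second reports 1450 for eth0, in line with the trade-off table below.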

See Tailscale Proxy Performance Degradation for the full diagnosis and fix.

Current trade-off¶

Concern	`flannel-iface: tailscale0`	`flannel-iface: eth0` (current)
Stable node IPs	✅ (via tailscale0)	✅ (via `node-ip: 100.x.x.x`)
Pod MTU	1230	1450
Inter-node traffic encrypted	✅ WireGuard	❌ Plain VXLAN on LAN
Tailscale proxy throughput	Degraded (triple-tunnel)	Normal
Breaks if LAN IP changes	No	Yes - restart k3s-agent

Best practice: Ensure LAN IPs are static or DHCP-reserved so Flannel's fdb table stays valid. See Provisioning New Nodes.

Verifying the Configuration¶

Check node registration IPs¶

kubectl get nodes -o json | jq -r '.items[] |
  .metadata.name +
  "  internal=" + .metadata.annotations["k3s.io/internal-ip"] +
  "  flannel=" + .metadata.annotations["flannel.alpha.coreos.com/public-ip"]'

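Expected output - every internal IP should be 100.x.x.x, while the flannel IP is the node's LAN IP because flannel-iface is eth0 (it matches the dst addresses in the fdb):

k3s-agent-1  internal=<k3s-agent-1-ts-ip>  flannel=<k3s-agent-1-lan-ip>
k3s-agent-2  internal=<k3s-agent-2-ts-ip>  flannel=<k3s-agent-2-lan-ip>
k3s-server   internal=<k3s-server-ts-ip>  flannel=<k3s-server-lan-ip>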
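The internal= column can also be checked mechanically. The sketch below is illustrative only (the function names `is_tailscale_ip` and `non_tailscale_nodes` are hypothetical); it relies on the fact that Tailscale assigns IPv4 addresses from the 100.64.0.0/10 CGNAT range, and the pattern match is a loose check rather than strict IP validation.

```shell
#!/bin/sh
# is_tailscale_ip IP: succeed if IP looks like an address in 100.64.0.0/10,
# i.e. first octet 100 and second octet between 64 and 127.
is_tailscale_ip() {
  case $1 in
    100.6[4-9].*.*|100.[7-9][0-9].*.*|100.1[01][0-9].*.*|100.12[0-7].*.*) return 0 ;;
    *) return 1 ;;
  esac
}

# non_tailscale_nodes: read "name internal=IP flannel=IP" lines on stdin and
# print "name IP" for every node whose internal IP is not a Tailscale IP.
non_tailscale_nodes() {
  while read -r name internal _rest; do
    [ -n "$name" ] || continue        # skip blank lines
    ip=${internal#internal=}          # strip the "internal=" label
    is_tailscale_ip "$ip" || echo "$name $ip"
  done
}
```

Piping the output of the jq command above into `non_tailscale_nodes` should print nothing on a correctly configured cluster.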

Test cross-node connectivity¶

# Run a pod and test DNS (this crosses nodes whenever the test pod lands on a different node than CoreDNS)
kubectl run dns-test --image=alpine --rm -it --restart=Never -- nslookup kubernetes.default.svc.cluster.local

Check the fdb table on a node¶

# Via a node debug pod, which mounts the host filesystem at /host
kubectl debug node/k3s-server -it --image=alpine -- chroot /host bridge fdb show dev flannel.1

Provisioning New Nodes¶

The Ansible playbooks handle this automatically. Before installing k3s, each playbook:

Gets the node's Tailscale IP: tailscale ip -4
Writes /etc/rancher/k3s/config.yaml with flannel-iface: eth0, node-ip set to the Tailscale IP, and any agent-specific flags
Installs k3s (which picks up the config file)
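For illustration, the first two steps can be approximated by hand in shell. This is a rough sketch of the idea, not the playbooks' actual implementation; `render_agent_config` is a hypothetical helper that emits the agent config shown earlier for a given Tailscale IP.

```shell
#!/bin/sh
# render_agent_config TAILSCALE_IP: print an agent config.yaml to stdout.
# Fails if no IP is given, e.g. because Tailscale is not yet authenticated.
render_agent_config() {
  if [ -z "$1" ]; then
    echo "render_agent_config: missing Tailscale IP" >&2
    return 1
  fi
  cat <<EOF
node-ip: "$1"
node-external-ip: "$1"
flannel-iface: eth0
EOF
}
```

Run as root on the node, `render_agent_config "$(tailscale ip -4)" > /etc/rancher/k3s/config.yaml` would write the file before k3s is installed (the directory must already exist).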

Prerequisite: Tailscale must be installed and authenticated on the node before running the k3s playbook so that node-ip (the Tailscale IP) resolves correctly at startup.

Relevant playbooks:

ansible/playbooks/deploy_k3s.yml - server node
ansible/playbooks/deploy_k3s_worker_tailscale.yml - worker joining via Tailscale network
ansible/playbooks/deploy_k3s_worker_local.yml - worker joining via LAN (still uses its Tailscale IP for node-ip)

Applying the Change to an Existing Node¶

If you add a new node that wasn't provisioned with this config, or need to re-apply:

1. Write the config file¶

Use a privileged pod (replace IP and node name):

cat > /tmp/write-cfg.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: write-cfg
  namespace: default
spec:
  hostPID: true
  hostNetwork: true
  nodeName: k3s-agent-1          # <-- change this
  containers:
  - name: nsenter
    image: alpine
    command:
    - sh
    - -c
    - |
      nsenter -t 1 -m -u -i -n -- sh << 'SCRIPT'
      mkdir -p /etc/rancher/k3s
      cat > /etc/rancher/k3s/config.yaml << 'CONF'
      node-ip: "<k3s-agent-1-ts-ip>"       # <-- Tailscale IP of this node
      node-external-ip: "<k3s-agent-1-ts-ip>"
      flannel-iface: eth0
      CONF
      cat /etc/rancher/k3s/config.yaml
      SCRIPT
    securityContext:
      privileged: true
  restartPolicy: Never
EOF
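kubectl apply -f /tmp/write-cfg.yaml
# Wait for the pod to finish before reading its logs (kubectl 1.23+)
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/write-cfg --timeout=60s
kubectl logs write-cfg
kubectl delete pod write-cfg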

2. Restart k3s on the node¶

For agents:

cat > /tmp/restart.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: restart-k3s
  namespace: default
spec:
  hostPID: true
  hostNetwork: true
  nodeName: k3s-agent-1          # <-- change this
  containers:
  - name: nsenter
    image: alpine
    command: ["nsenter", "-t", "1", "-m", "-u", "-i", "-n", "--", "systemctl", "restart", "k3s-agent"]
    securityContext:
      privileged: true
  restartPolicy: Never
EOF
kubectl apply -f /tmp/restart.yaml

For the server (kubectl will briefly disconnect and reconnect):

# Replace k3s-agent with k3s in the restart pod, nodeName: k3s-server
# command: [..., "systemctl", "restart", "k3s"]

3. Verify¶

sleep 30
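# internal should be the Tailscale IP; flannel should be the LAN IP (flannel-iface: eth0)
kubectl get nodes -o json | jq -r '.items[] | .metadata.name + " internal=" + .metadata.annotations["k3s.io/internal-ip"] + " flannel=" + .metadata.annotations["flannel.alpha.coreos.com/public-ip"]'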
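A fixed 30-second sleep can be too short on a slow node and needlessly long on a fast one. One alternative is to poll until the node reports Ready. The `retry` helper below is a hypothetical, generic sketch and not something shipped with this cluster.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY_SECONDS between tries; fail after ATTEMPTS tries.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 12 5 kubectl wait --for=condition=Ready node/k3s-agent-1 --timeout=5s` tries up to 12 times with a 5-second pause between attempts before giving up.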

Troubleshooting¶

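Node still showing LAN IP as its internal IP¶

If the node registered before the config was written, it may still be registered with its LAN IP as the internal IP. (With flannel-iface: eth0, a LAN IP in the flannel public-ip annotation and in the fdb is expected.) Check the config, then force re-registration: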

# Check the config was actually written
kubectl debug node/<node> -it --image=alpine -- chroot /host cat /etc/rancher/k3s/config.yaml

# If missing, re-apply the write-cfg pod above, then restart again

Flannel performance is degraded (slow Tailscale proxy)¶

If pod MTU shows 1230 instead of 1450, or if Tailscale-proxied services are slow, the cluster may have been provisioned with flannel-iface: tailscale0 (the previous configuration). See Tailscale Proxy Performance Degradation for full diagnosis and the fix.

Cross-node pods can't communicate after adding a node¶

If the new node's LAN IP differs from what other nodes have in their fdb, VXLAN packets will be misrouted. Check:

# On an agent node, check what IPs flannel knows about
kubectl debug node/k3s-agent-1 -it --image=alpine -- chroot /host bridge fdb show dev flannel.1

# If stale: restart k3s on the server and k3s-agent on each agent to force re-registration

Verifying flannel data plane¶

# Check the fdb - entries should show LAN IPs (<lan-cidr>)
kubectl debug node/k3s-server -it --image=alpine -- chroot /host bridge fdb show dev flannel.1

# On the node itself: check that FLANNEL_MTU is 1450
cat /run/flannel/subnet.env
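The MTU check can be scripted instead of read by eye. The sketch below assumes the usual `KEY=value` layout of subnet.env with a `FLANNEL_MTU` line; the function names are hypothetical.

```shell
#!/bin/sh
# flannel_mtu FILE: print the FLANNEL_MTU value from a Flannel subnet.env file.
flannel_mtu() {
  sed -n 's/^FLANNEL_MTU=//p' "$1"
}

# check_flannel_mtu FILE EXPECTED: succeed only if FILE reports EXPECTED.
check_flannel_mtu() {
  actual=$(flannel_mtu "$1")
  if [ "$actual" = "$2" ]; then
    echo "ok: FLANNEL_MTU=$actual"
  else
    echo "mismatch: FLANNEL_MTU=${actual:-unset}, expected $2" >&2
    return 1
  fi
}
```

On a node, `check_flannel_mtu /run/flannel/subnet.env 1450` exits non-zero if the data plane is still using the smaller tailscale0-derived MTU.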