Flannel over Tailscale¶
Overview¶
By default k3s uses Flannel VXLAN with each node's LAN IP as the VTEP (tunnel endpoint). If DHCP reassigns a node's IP, the other nodes' forwarding database (fdb) tables go stale and cross-node pod traffic silently drops - causing DNS timeouts, Flux reconciliation failures, and general cluster instability.
The primary fix is to set node-ip to the node's Tailscale IP. Tailscale IPs (100.x.x.x, from the 100.64.0.0/10 range) are assigned by the Tailscale control plane and stay fixed for as long as the node remains registered in the tailnet, regardless of what DHCP does to the LAN interfaces. This ensures the Kubernetes control plane and node registration always use stable IPs.
For the Flannel VXLAN data plane, this cluster uses flannel-iface: eth0 (the LAN interface) rather than tailscale0. This is an intentional performance trade-off - see Flannel iface trade-off below.
Node registration: 100.x.x.x (Tailscale IP - stable, via node-ip flag)
Flannel data plane: <lan-cidr> (LAN IP - fast, via flannel-iface: eth0)
Current Node IPs¶
| Node | LAN IP | Tailscale IP |
|---|---|---|
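| k3s-server | <k3s-server-lan-ip> | <k3s-server-ts-ip> |
| k3s-agent-1 | <k3s-agent-1-lan-ip> | <k3s-agent-1-ts-ip> |
| k3s-agent-2 | <k3s-agent-2-lan-ip> | <k3s-agent-2-ts-ip> |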
How It Works¶
The config¶
Each node has /etc/rancher/k3s/config.yaml written before k3s starts:
Server (k3s-server):
write-kubeconfig-mode: "644"
tls-san:
- k3s-server.tailnet.ts.net
- <k3s-server-ts-ip>
- <k3s-server-lan-ip>
node-ip: "<k3s-server-ts-ip>"
flannel-iface: eth0
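Agents (k3s-agent-1, k3s-agent-2), shown here for k3s-agent-1; substitute each agent's own Tailscale IP:
node-ip: "<k3s-agent-1-ts-ip>"
node-external-ip: "<k3s-agent-1-ts-ip>"
flannel-iface: eth0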
What each flag does¶
| Flag | Effect |
|---|---|
| flannel-iface: eth0 | Flannel binds its VXLAN VTEP to the LAN interface for the data plane |
| node-ip | The IP the node registers with the API server as its internal IP - set to the Tailscale IP for stability |
| node-external-ip | The externally-routable IP for the node (agents only) |
| tls-san | Adds the Tailscale hostname, Tailscale IP, and LAN IP to the API server's TLS certificate as subject alternative names (server only) |
What happens at the kernel level¶
Flannel maintains a forwarding database (fdb) entry per remote node. With flannel-iface: eth0, VXLAN packets are sent directly to LAN IPs:
# fdb entries point at LAN IPs (fast, direct)
bridge fdb show dev flannel.1
aa:bb:cc:dd:ee:01 dst <k3s-agent-1-lan-ip> self permanent
The Kubernetes node registration (control plane routing, kubectl, etc.) still uses Tailscale IPs because node-ip is set to 100.x.x.x. If the LAN IP changes, only the Flannel data plane is affected - the cluster control plane remains healthy.
Flannel iface trade-off¶
Why not flannel-iface: tailscale0?¶
An earlier version of this cluster used flannel-iface: tailscale0, which routes all Flannel VXLAN traffic through the Tailscale WireGuard tunnel. While this encrypts inter-node pod traffic, it creates a severe MTU cascade when combined with the Tailscale operator:
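The Tailscale operator proxy pods run their own tailscaled inside them - a second layer of WireGuard on top of the already-encapsulated pod network. Every proxied packet is therefore encapsulated three times: by the proxy pod's WireGuard tunnel, then by Flannel VXLAN, then by the node's Tailscale WireGuard tunnel.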
Effective payload MTU dropped to ~1150 bytes vs the expected ~1420, causing significant IP fragmentation and throughput degradation on all Tailscale-exposed services.
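The figures above can be reproduced with simple subtraction. The sketch below is illustrative only: the overhead constants are assumed typical values (a 1280-byte tailscale0 MTU, 50 bytes for VXLAN, 80 bytes for WireGuard with an IPv6 outer header), not measurements taken from this cluster.

```shell
#!/bin/sh
# MTU arithmetic sketch. All constants are assumptions, not measured values.
LAN_MTU=1500          # assumed Ethernet MTU on eth0
TAILSCALE_MTU=1280    # assumed MTU of tailscale0
VXLAN_OVERHEAD=50     # outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14)
WG_OVERHEAD=80        # worst case: outer IPv6 (40) + UDP (8) + WireGuard (32)

# Pod MTU is the MTU of the interface Flannel binds to, minus VXLAN overhead.
pod_mtu_tailscale0=$((TAILSCALE_MTU - VXLAN_OVERHEAD))
pod_mtu_eth0=$((LAN_MTU - VXLAN_OVERHEAD))

# A proxy pod's own tailscaled adds one more WireGuard layer inside the pod network.
proxy_payload_tailscale0=$((pod_mtu_tailscale0 - WG_OVERHEAD))

# Reference point: a single WireGuard tunnel directly over a 1500-byte link.
wg_over_lan=$((LAN_MTU - WG_OVERHEAD))

echo "pod MTU with flannel-iface: tailscale0 = $pod_mtu_tailscale0"
echo "pod MTU with flannel-iface: eth0       = $pod_mtu_eth0"
echo "proxy payload over tailscale0 path     = $proxy_payload_tailscale0"
echo "single WireGuard tunnel over LAN       = $wg_over_lan"
```

Under these assumptions the script yields the 1230 and 1450 pod MTUs listed in the trade-off table, and the ~1150 versus ~1420 payload sizes quoted above.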
See Tailscale Proxy Performance Degradation for the full diagnosis and fix.
Current trade-off¶
| Concern | flannel-iface: tailscale0 | flannel-iface: eth0 (current) |
|---|---|---|
| Stable node IPs | ✅ (via tailscale0) | ✅ (via node-ip: 100.x.x.x) |
| Pod MTU | 1230 | 1450 |
| Inter-node traffic encrypted | ✅ WireGuard | ❌ Plain VXLAN on LAN |
| Tailscale proxy throughput | Degraded (triple-tunnel) | Normal |
| Breaks if LAN IP changes | No | Yes - restart k3s-agent |
Best practice: Ensure LAN IPs are static or DHCP-reserved so Flannel's fdb table stays valid. See Provisioning New Nodes.
Verifying the Configuration¶
Check node registration IPs¶
kubectl get nodes -o json | jq -r '.items[] |
.metadata.name +
" internal=" + .metadata.annotations["k3s.io/internal-ip"] +
" flannel=" + .metadata.annotations["flannel.alpha.coreos.com/public-ip"]'
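Expected output - internal should be the node's Tailscale IP (100.x.x.x); flannel should be its LAN IP, because flannel-iface: eth0 binds the VTEP to the LAN interface:
k3s-agent-1 internal=<k3s-agent-1-ts-ip> flannel=<k3s-agent-1-lan-ip>
k3s-agent-2 internal=<k3s-agent-2-ts-ip> flannel=<k3s-agent-2-lan-ip>
k3s-server internal=<k3s-server-ts-ip> flannel=<k3s-server-lan-ip>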
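Checking the internal IPs by eye gets tedious as the node count grows. The following is a minimal sketch of an automated check; check_internal_ips is a hypothetical helper name, and it assumes the line format produced by the jq command above and that Tailscale addresses fall in 100.64.0.0/10.

```shell
#!/bin/sh
# check_internal_ips (hypothetical helper): reads lines of the form
#   <node> internal=<ip> ...
# on stdin and prints the name of every node whose internal IP is missing
# or lies outside 100.64.0.0/10. Other fields on the line are ignored.
check_internal_ips() {
  awk '{
    ip = ""
    for (i = 2; i <= NF; i++)
      if ($i ~ /^internal=/) ip = substr($i, 10)   # text after "internal="
    n = split(ip, o, ".")
    # 100.64.0.0/10 covers 100.64.0.0 through 100.127.255.255
    if (n != 4 || o[1] != 100 || o[2] < 64 || o[2] > 127)
      print $1
  }'
}

# Example with made-up addresses: only the second node is reported.
printf '%s\n' \
  'node-a internal=100.101.102.103' \
  'node-b internal=192.168.1.12' | check_internal_ips
```

To run it against the cluster, pipe the output of the kubectl and jq command above into check_internal_ips; empty output means every node registered with a Tailscale address.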
Test cross-node connectivity¶
# Run a pod and test DNS (which requires cross-node pod routing)
kubectl run dns-test --image=alpine --rm -it --restart=Never -- nslookup kubernetes.default.svc.cluster.local
Check the fdb table on a node¶
# Via an ephemeral debug pod on the node (kubectl debug mounts the host filesystem at /host)
kubectl debug node/k3s-server -it --image=alpine -- chroot /host bridge fdb show dev flannel.1
Provisioning New Nodes¶
The Ansible playbooks handle this automatically. Before installing k3s, each playbook:
- Gets the node's Tailscale IP: tailscale ip -4
- Writes /etc/rancher/k3s/config.yaml with flannel-iface: eth0, node-ip set to the Tailscale IP, and any agent-specific flags
- Installs k3s (which picks up the config file)
Prerequisite: Tailscale must be installed and authenticated on the node before running the k3s playbook so that node-ip (the Tailscale IP) resolves correctly at startup.
Relevant playbooks:
- ansible/playbooks/deploy_k3s.yml - server node
- ansible/playbooks/deploy_k3s_worker_tailscale.yml - worker joining via Tailscale network
- ansible/playbooks/deploy_k3s_worker_local.yml - worker joining via LAN (still uses its Tailscale IP for node-ip)
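For readers without the playbooks at hand, the config-writing step can be sketched in plain shell. This is not the playbooks' actual implementation: write_k3s_config is a hypothetical helper, and it writes only the two keys common to all nodes (agent-specific flags and tls-san are left out).

```shell
#!/bin/sh
# write_k3s_config <tailscale-ip> <config-path>   (hypothetical helper)
# Writes a minimal k3s config with node-ip set to the given Tailscale IP.
write_k3s_config() {
  ts_ip=$1
  cfg=$2
  # Refuse to write a config with an empty node-ip, which is what you would
  # get if Tailscale is not yet installed and authenticated.
  if [ -z "$ts_ip" ]; then
    echo "no Tailscale IP given; is tailscale up?" >&2
    return 1
  fi
  mkdir -p "$(dirname "$cfg")"
  cat > "$cfg" <<EOF
node-ip: "$ts_ip"
flannel-iface: eth0
EOF
}

# On a real node, before installing k3s:
#   write_k3s_config "$(tailscale ip -4)" /etc/rancher/k3s/config.yaml
```

The empty-IP guard mirrors the prerequisite above: if Tailscale is not yet authenticated, tailscale ip -4 returns nothing useful and the node would otherwise start without a valid node-ip.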
Applying the Change to an Existing Node¶
If you add a new node that wasn't provisioned with this config, or need to re-apply:
1. Write the config file¶
Use a privileged pod (replace IP and node name):
cat > /tmp/write-cfg.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: write-cfg
  namespace: default
spec:
  hostPID: true
  hostNetwork: true
  nodeName: k3s-agent-1  # <-- change this
  containers:
  - name: nsenter
    image: alpine
    command:
    - sh
    - -c
    - |
      nsenter -t 1 -m -u -i -n -- sh << 'SCRIPT'
      mkdir -p /etc/rancher/k3s
      cat > /etc/rancher/k3s/config.yaml << 'CONF'
      node-ip: "<k3s-agent-1-ts-ip>"  # <-- Tailscale IP of this node
      node-external-ip: "<k3s-agent-1-ts-ip>"
      flannel-iface: eth0
      CONF
      cat /etc/rancher/k3s/config.yaml
      SCRIPT
    securityContext:
      privileged: true
  restartPolicy: Never
EOF
kubectl apply -f /tmp/write-cfg.yaml
kubectl logs -f write-cfg
kubectl delete pod write-cfg
2. Restart k3s on the node¶
For agents:
cat > /tmp/restart.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: restart-k3s
  namespace: default
spec:
  hostPID: true
  hostNetwork: true
  nodeName: k3s-agent-1  # <-- change this
  containers:
  - name: nsenter
    image: alpine
    command: ["nsenter", "-t", "1", "-m", "-u", "-i", "-n", "--", "systemctl", "restart", "k3s-agent"]
    securityContext:
      privileged: true
  restartPolicy: Never
EOF
kubectl apply -f /tmp/restart.yaml
For the server (kubectl will briefly disconnect and reconnect):
# Replace k3s-agent with k3s in the restart pod, nodeName: k3s-server
# command: [..., "systemctl", "restart", "k3s"]
3. Verify¶
sleep 30
kubectl get nodes -o json | jq -r '.items[] | .metadata.name + " flannel=" + .metadata.annotations["flannel.alpha.coreos.com/public-ip"]'
Troubleshooting¶
Node still showing LAN IP as its internal IP¶
If the node registered before the config was written, it is still registered with its old LAN IP. Confirm the config file is present, then restart k3s on the node to force re-registration:
# Check the config was actually written
kubectl debug node/<node> -it --image=alpine -- chroot /host cat /etc/rancher/k3s/config.yaml
# If missing, re-apply the write-cfg pod above, then restart again
Flannel performance is degraded (slow Tailscale proxy)¶
If pod MTU shows 1230 instead of 1450, or if Tailscale-proxied services are slow, the cluster may have been provisioned with flannel-iface: tailscale0 (the previous configuration). See Tailscale Proxy Performance Degradation for full diagnosis and the fix.
Cross-node pods can't communicate after adding a node¶
If the new node's LAN IP differs from what other nodes have in their fdb, VXLAN packets will be misrouted. Check:
# On an agent node, check what IPs flannel knows about
kubectl debug node/k3s-agent-1 -it --image=alpine -- chroot /host bridge fdb show dev flannel.1
# If stale: restart k3s-agent on all nodes to force re-registration
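Spotting a stale entry means comparing each dst address in the fdb output against the LAN IPs the nodes currently hold. A minimal sketch of that comparison follows; stale_fdb_entries is a hypothetical helper, and it assumes fdb lines in the "MAC dst IP self permanent" shape shown earlier.

```shell
#!/bin/sh
# stale_fdb_entries "<ip> <ip> ..."   (hypothetical helper)
# Reads `bridge fdb show dev flannel.1` output on stdin and prints
# "<mac> <dst-ip>" for every entry whose dst is not in the expected list.
stale_fdb_entries() {
  awk -v expected="$1" '
    BEGIN {
      # Build a lookup set from the space-separated list of expected IPs.
      n = split(expected, e, " ")
      for (i = 1; i <= n; i++) ok[e[i]] = 1
    }
    {
      for (i = 1; i < NF; i++)
        if ($i == "dst") {
          dst = $(i + 1)
          if (!(dst in ok)) print $1, dst
        }
    }'
}

# Example with made-up addresses: the second entry is reported as stale.
printf '%s\n' \
  'aa:bb:cc:dd:ee:01 dst 192.168.1.11 self permanent' \
  'aa:bb:cc:dd:ee:02 dst 192.168.1.99 self permanent' |
  stale_fdb_entries "192.168.1.11 192.168.1.12"
```

The expected list would typically be the current LAN IPs of the other nodes; any line the helper prints is an entry pointing at an address no node holds any more.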