Building a Kubernetes Homelab: My Journey from ISP Router to VLAN-Segmented Network
This is the second post in our “Building a Kubernetes Homelab” series. If you haven’t read the first post yet, I recommend starting there for the full context of this project.
The Beginning: A Dream and a Plan
It all started with a simple dream: transform my basic homelab into a properly segmented, Kubernetes-based infrastructure. I had been running everything on a single network for years, and the security implications were starting to keep me up at night. Smart home devices mixed with servers, personal devices sharing the same subnet as my lab equipment – it was a security nightmare waiting to happen.
In my previous post, I outlined the vision: 5 VLANs (Management, Lab, IoT, Devices, Guests) with proper isolation, automated configuration, and a path to Kubernetes. Now it was time to make it real.
Chapter 1: The First Hardware Choice and Its Betrayal
The Budget Router: A Love Story That Wasn’t Meant to Be
I started with what seemed like a reasonable choice: a budget OpenWRT-compatible router. It was affordable, had good reviews, and supported OpenWRT. Perfect for a homelab project, right?
Wrong.
The first red flag came when I tried to configure it through SSH. The router would accept my configuration changes, everything would work perfectly… until I rebooted it. Then it would reset to factory defaults, losing everything I had painstakingly configured.
I spent hours troubleshooting this issue. I tried:
- Disabling proprietary GL.iNet services (repeater, gl-config)
- Force remounting the overlay filesystem as read-write
- Multiple sync operations and cache clearing
- Enhanced save processes with explicit commit commands
Nothing worked. Every single reboot would wipe my configuration clean.
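For the record, the persistence attempts above boiled down to variations of the following (a sketch of standard OpenWRT recovery steps, none of which survived this particular firmware):

```sh
# Remount the overlay read-write in case it was stuck read-only
mount -o remount,rw /overlay

# Explicitly commit all pending UCI changes and flush buffers to flash
uci commit
sync

# Reboot and check whether anything survived (it never did)
reboot
```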
The Realization: Proprietary Firmware Hell
After extensive research and countless failed attempts, I finally understood the problem: this router runs a heavily modified version of OpenWRT that resets all configuration after every reboot. This isn’t a bug – it’s by design. The manufacturer’s custom firmware is designed to work with their proprietary web interface and mobile app, not with standard OpenWRT UCI configuration management.
I was fighting against the fundamental architecture of the device. No amount of Ansible automation, no clever configuration tricks, no amount of persistence commands would change this behavior.
The Hard Decision: Hardware Upgrade
I had to face the music: I needed different hardware. After researching alternatives, I ordered a mid-range router that can be flashed with vanilla OpenWRT firmware.
This was a significant investment, but I was committed to doing this right. The new router would give me:
- Full control over configuration persistence
- Standard UCI commands working as expected
- No proprietary service interference
- Better long-term maintainability
- Access to the full OpenWRT package ecosystem
Chapter 2: The GL-MT3000 Arrives – A New Beginning
Unboxing and First Impressions
When the new router arrived, I was excited but cautious. I had been burned by the previous router, so I approached this with measured optimism.
The hardware felt solid, and the setup process was straightforward. I flashed it with vanilla OpenWRT, configured SSH access, and updated my inventory to use the standard 192.168.1.x management network.
The First Success: Configuration Persistence
The moment of truth came when I configured the router, rebooted it, and… it kept my configuration! This was a revelation. For the first time, I had a router that actually respected my configuration changes.
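The test itself was deliberately trivial (a sketch; the hostname value is just an example):

```sh
# Make a visible change, commit it, and reboot
uci set system.@system[0].hostname='persistence-test'
uci commit system
reboot

# After the router comes back up, the change should still be there
uci get system.@system[0].hostname   # -> persistence-test
```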
I immediately started building my Ansible automation system, confident that this time it would work.
Chapter 3: Building the Automation Foundation
The Ansible Infrastructure Takes Shape
With a reliable router, I could finally focus on building proper automation. I created a comprehensive Ansible structure with roles for:
- Common: Shared configuration and package management
- Router: OpenWRT-specific configuration
- Switch: Managed switch configuration (manual for now)
- Testing: Network validation and health checks
The Template System: My First Major Breakthrough
The first major challenge was getting Jinja2 templates to work properly. I initially tried using the `copy` module, but variables weren't being resolved. After some research, I discovered I needed to use the `template` module instead.
```yaml
# This didn't work
- name: Deploy network configuration
  copy:
    src: network.conf
    dest: /etc/config/network

# This worked
- name: Deploy network configuration
  template:
    src: network.conf
    dest: /etc/config/network
    backup: yes
    mode: '0644'
```
This was a small change, but it unlocked the power of dynamic configuration generation. Now I could use variables like `{{ vlans.management.gateway }}` and have them properly resolved.
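A convenient way to preview what a template will change, without touching the router, is Ansible's check/diff mode (the playbook name here is just illustrative):

```sh
# Render the templates and show the resulting diff without applying it
ansible-playbook site.yml --limit router --check --diff
```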
The Python3 Challenge
Another early hurdle was getting Python3 working on OpenWRT for full Ansible module support. The default installation was minimal, so I had to install additional packages:
```yaml
- name: Install required packages
  raw: |
    opkg update && opkg install python3 python3-pip
  register: package_install
  retries: 3
  delay: 5
  until: package_install.rc == 0
```
This took several attempts due to network instability, but the retry logic I built into the playbook eventually succeeded.
Chapter 4: The VLAN Configuration Journey
The First VLAN Attempt: A Learning Experience
With the basic automation working, I turned my attention to VLAN configuration. I started with a simple approach: configure the router VLANs first, then the switch.
This seemed logical, but I quickly discovered that order matters. When I configured the switch VLANs before the router was ready, I lost connectivity entirely. The switch was sending VLAN-tagged traffic that the router couldn’t handle yet.
The Breakthrough: Router First, Then Switch
After several frustrating attempts, I learned the correct sequence:
1. Router VLANs: Configure all VLAN interfaces and bridges on the router
2. Switch VLANs: Configure the managed switch with proper trunk and access ports
3. Testing: Validate connectivity and DHCP functionality
This order was crucial because the router needed to be ready to handle VLAN-tagged traffic before the switch started sending it.
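Between stages I ran a quick reachability check before moving on. A sketch, assuming the 192.168.&lt;vlan-id&gt;.1 gateway convention used elsewhere in this post:

```sh
# Each configured VLAN gateway should answer before the switch is touched
for gw in 192.168.1.1 192.168.20.1 192.168.26.1; do
  ping -c 2 -W 2 "$gw" >/dev/null && echo "$gw OK" || echo "$gw FAILED"
done
```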
The MAC Address Format Mystery
One of the most frustrating issues I encountered was with DHCP static assignments. I had configured a static lease for my switch:
```
config host
    option name 'switch-device'
    option mac 'aa:bb:c:d:e:f'   # This didn't work
    option ip '192.168.1.2'
```
The switch wasn’t getting its static IP. After hours of debugging, I discovered the issue: MAC addresses need leading zeros. The correct format was:
```
config host
    option name 'switch-device'
    option mac 'aa:bb:0c:0d:0e:0f'   # This worked
    option ip '192.168.1.2'
```
This was a subtle but critical difference that taught me to always verify MAC address formats in network configurations.
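One way to avoid the problem entirely is to normalize MAC addresses before they ever reach a template. A minimal shell helper of my own devising, not part of the original playbooks:

```sh
# Zero-pad each octet of a MAC address: aa:bb:c:d:e:f -> aa:bb:0c:0d:0e:0f
normalize_mac() {
  local IFS=':' out='' octet
  for octet in $1; do
    out="${out}${out:+:}$(printf '%02x' "0x$octet")"
  done
  echo "$out"
}

normalize_mac 'aa:bb:c:d:e:f'
```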
Chapter 5: The WiFi Configuration Saga
The WPA2 Package Nightmare
With VLANs working, I turned my attention to WiFi configuration. This turned out to be one of the most challenging parts of the entire project.
The first issue was WPA2 support. OpenWRT ships with `wpad-basic-mbedtls` by default, which doesn't support WPA2-PSK encryption. I needed the full `wpad` package:
```yaml
- name: Install full wpad package for WPA2 support
  raw: |
    opkg update && opkg remove wpad-basic-mbedtls && opkg install wpad
  register: wpad_install
  retries: "{{ ansible_retry_attempts }}"
  delay: "{{ ansible_retry_delay }}"
  until: wpad_install.rc is defined and wpad_install.rc == 0
```
This process was fragile and required multiple attempts, but it was essential for proper WiFi security.
The Hardware Path Discovery
The next challenge was finding the correct hardware paths for the GL-MT3000’s WiFi radios. The standard OpenWRT paths didn’t work, and I spent considerable time experimenting with different configurations.
Eventually, I discovered the correct paths:
- 2.4GHz: `platform/soc/xxxxx.wifi`
- 5GHz: `platform/soc/xxxxx.wifi+1`
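Rather than guessing, OpenWRT can detect the radios itself: regenerating the default wireless config writes the correct device paths, which can then be read back out of UCI. Note that this wipes any existing wireless settings:

```sh
# Regenerate the default wireless config, then read the detected paths
rm /etc/config/wireless
wifi config
uci get wireless.radio0.path
uci get wireless.radio1.path
```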
The Encryption Syntax Puzzle
Even with the correct hardware paths, WiFi configuration wasn't working. The issue was with the encryption syntax: I was using `'wpa2'`, but OpenWRT expects `'psk2'`:
```
# This didn't work
option encryption 'wpa2'

# This worked
option encryption 'psk2'
```
This was another subtle difference that took time to debug, but it was the final piece needed for WiFi functionality.
Chapter 6: The DHCP Crisis
The APIPA Address Mystery
With WiFi networks broadcasting and VLANs configured, I expected everything to work smoothly. Instead, I encountered a new problem: WiFi clients were receiving APIPA addresses (169.254.x.x) instead of proper VLAN IP addresses.
This was a critical issue that indicated DHCP wasn’t working for WiFi clients on VLAN networks.
The Root Cause Investigation
I spent days investigating this issue. The problem had multiple components:
Issue 1: DNSMasq Service Restart
- DNSMasq wasn’t automatically restarting after DHCP configuration changes
- VLAN DHCP ranges weren’t being generated in the DNSMasq configuration file
Issue 2: WiFi VLAN Bridging
- WiFi interfaces were not properly bridged to VLAN interfaces
- VLAN interfaces were configured directly on `eth1.27`, `eth1.28`, etc.
- WiFi clients couldn't communicate with DHCP servers on VLAN interfaces
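Two checks that would have shortened this investigation considerably (standard OpenWRT diagnostics; the exact path of the generated dnsmasq config varies by release):

```sh
# Did dnsmasq actually pick up DHCP ranges for the VLANs?
grep dhcp-range /var/etc/dnsmasq.conf*

# Which interfaces are actually members of each bridge?
brctl show
```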
The Solution: VLAN Bridge Configuration
The breakthrough came when I realized I needed to create bridges for each VLAN network. Instead of configuring VLAN interfaces directly, I needed to bridge them:
Before (broken configuration):
```
config interface 'vlan20'
    option proto 'static'
    option ipaddr '192.168.20.1'
    option netmask '255.255.255.0'
    option ifname 'eth1.20'
```
After (working configuration):
```
config device
    option name 'br-vlan20'
    option type 'bridge'
    list ports 'eth1.20'

config interface 'vlan20'
    option device 'br-vlan20'
    option proto 'static'
    option ipaddr '192.168.20.1'
    option netmask '255.255.255.0'
```
This change, combined with automatic DNSMasq restarts, finally made DHCP work for WiFi clients.
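For reference, applying such a change comes down to reloading the network configuration and then restarting dnsmasq so its DHCP ranges bind to the new bridges:

```sh
# Apply the new interface/bridge layout, then restart the DHCP server
/etc/init.d/network reload
/etc/init.d/dnsmasq restart
```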
Chapter 7: The Firewall Configuration Nightmare
The Final Hurdle: Firewall Syntax
Even with VLAN bridges working, I was still getting APIPA addresses. The issue was with my firewall configuration syntax. After analyzing the original OpenWRT firewall configuration, I discovered several critical syntax differences:
Key Issues Found:
- Network List Format: Used `option network 'lan'` instead of `list network 'lan'`
- ICMP Type Format: Used `option icmp_type` with space-separated values instead of `list icmp_type`
- Conflicting Rules: Added an unnecessary `wan→lan` forwarding that broke Management VLAN connectivity
- Structure Mismatch: Didn't preserve the original OpenWRT firewall structure
The Breakthrough: Analyzing Original Configuration
The solution was to study the default OpenWRT firewall configuration and use the correct syntax:
```
# Correct syntax
config zone
    option name 'lan'
    list network 'lan'              # Not: option network 'lan'
    option input 'ACCEPT'
    option output 'ACCEPT'
    option forward 'ACCEPT'

config rule
    option name 'Allow-ICMPv6-Input'
    option src 'wan'
    option proto 'icmp'
    list icmp_type 'echo-request'   # Not: option icmp_type 'echo-request'
    list icmp_type 'echo-reply'
    option family 'ipv6'
    option target 'ACCEPT'
```
This was the final piece of the puzzle. With correct firewall syntax, DHCP finally worked properly for all VLANs.
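On fw4-based OpenWRT releases (22.03 and later), the firewall config can at least be parse-checked before it is applied, which catches some of these mistakes earlier:

```sh
# Validate the firewall configuration, then apply it
fw4 check
/etc/init.d/firewall restart
```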
Chapter 8: Device Integration and Real-World Testing
HomeAssistant: The First Real Device
With the network infrastructure working, I could finally start connecting real devices. The first was my HomeAssistant instance, which I wanted on the IoT VLAN.
This was relatively straightforward:
- Connected HomeAssistant to the IoT Ethernet network
- Configured a static DHCP lease with MAC address `aa:bb:cc:dd:ee:ff`
- Set up firewall rules to allow the Management, Lab, Devices, and Guest VLANs to access the IoT VLAN
The HomeAssistant integration was successful, and I could access it from any VLAN as intended.
The Printer Saga: Dual-Band WiFi Discovery
The printer integration was more challenging. I had a network printer that I wanted on the Devices VLAN, but it couldn’t see the “HomeDevices” WiFi network.
After some investigation, I discovered the issue: the printer only supports 2.4GHz, but HomeDevices was only broadcasting on 5GHz.
The Dual-Band Solution
The solution was to configure HomeDevices to broadcast on both 2.4GHz and 5GHz:
```yaml
# HomeDevices dual-band configuration
- name: "HomeDevices"
  vlan: 20
  security: "wpa2"
  password_key: "devices"
  radio: "radio0"   # 2.4GHz for printer compatibility

- name: "HomeDevices"
  vlan: 20
  security: "wpa2"
  password_key: "devices"
  radio: "radio1"   # 5GHz for modern devices
```
This required fixing a configuration conflict: the same SSID on different radios would create duplicate `wifi-iface` identifiers. I solved this by including the radio name in the identifier:
```
config wifi-iface 'homedevices_radio0'   # 2.4GHz
config wifi-iface 'homedevices_radio1'   # 5GHz
```
The MAC Address Correction
Even with dual-band WiFi working, the printer still wasn't getting its static IP. After checking the DHCP leases, I discovered the printer was using a different MAC address than I had configured. I had to update the inventory with the correct MAC address `ff:ee:dd:cc:bb:aa`.
This taught me the importance of verifying actual device MAC addresses rather than relying on documentation or assumptions.
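On OpenWRT, the live lease file shows exactly which MAC each client is presenting, which is where the mismatch showed up:

```sh
# dnsmasq lease database: expiry, MAC, IP, hostname, client-id
cat /tmp/dhcp.leases
```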
Chapter 9: Network Security and Access Control
Implementing Firewall Rules
With devices connected, I needed to implement proper network security. I created firewall rules that allowed controlled access between VLANs:
- IoT VLAN Access: Management, Lab, Devices, and Guest VLANs can access IoT VLAN
- Devices VLAN Access: Management VLAN can access Devices VLAN for printer management
- Bidirectional Isolation: IoT and Devices VLANs cannot access other VLANs
- Internet Access: All VLANs have proper internet connectivity
The Security Matrix
I documented the access policies in a clear matrix:
| Source VLAN | Management VLAN | Lab VLAN | IoT VLAN | Devices VLAN | Guest VLAN | VPN VLAN | Internet |
|---|---|---|---|---|---|---|---|
| Management | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Lab | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | |
| IoT | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | |
| Devices | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | |
| Guests | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | |
| VPN | ❌ | ✅ | ❌ | ❌ | ❌ | | |
This provided clear documentation of which VLANs could access which resources, making it easy to understand and maintain the security model.
Chapter 10: VPN Configuration - Secure Remote Access
The Need for Remote Access
With the network infrastructure working and devices properly segmented, I realized I needed a way to access my homelab remotely. Whether I was traveling, working from a different location, or simply wanted to manage my infrastructure from anywhere, I needed secure remote access to my lab environment.
However, I didn’t want to expose my entire network to the internet. The security matrix I had carefully designed needed to be maintained even for remote access. This meant I needed a VPN solution that would provide controlled access to specific VLANs only.
Choosing Wireguard: Modern VPN Technology
After researching VPN options, I chose Wireguard for several reasons:
- Modern cryptography: Uses state-of-the-art cryptographic primitives
- Lightweight: Minimal attack surface with a small codebase
- High performance: Faster than traditional VPN protocols like OpenVPN
- OpenWRT support: Native support with excellent integration
- Simple configuration: Easy to manage and troubleshoot
VPN Architecture: Lab Access Only
Following the principle of least privilege, I designed the VPN to provide access to the Lab VLAN only. This meant:
- Remote developers could access the Kubernetes cluster for development work
- IoT devices remained isolated and inaccessible from VPN
- Personal devices and guest networks stayed protected
- Management access was still possible through the VPN for administration
The Implementation: Ansible-Driven VPN Configuration
The VPN configuration was integrated into my existing Ansible automation system, ensuring consistency and repeatability.
Network Interface Configuration
The VPN interface was defined in the network configuration template:
```
# VPN Interface - Wireguard VPN server
config interface 'vpn'
    option proto 'wireguard'                                              # Wireguard protocol
    option private_key '{{ hostvars["router"].wireguard_keys.private }}'  # Router's private key
    option listen_port '51820'                                            # UDP port for VPN
    option addresses '{{ hostvars["router"].vlans.vpn.ip }}/24'           # Router IP (192.168.30.1/24)
```
This created a dedicated VPN interface on the router with IP `192.168.30.1/24`, separate from all other VLANs.
Peer Configuration Management
The Wireguard peer configuration was dynamically generated from the Ansible inventory:
```
# Wireguard Peers - Client configurations
{% for host in groups['vpn'] %}
{% if hostvars[host].wireguard_keys.public is defined and host != 'router' %}
# Peer: {{ host }} ({{ hostvars[host].description }})
config wireguard_{{ host }}
    option public_key '{{ hostvars[host].wireguard_keys.public }}'   # Client's public key
    option allowed_ips '{{ hostvars[host].allowed_ips }}'            # Allowed IP ranges for this client
    option persistent_keepalive '25'                                 # Keep-alive interval (seconds)
    option interface 'vpn'                                           # Associate with VPN interface
{% endif %}
{% endfor %}
```
This approach allowed me to manage VPN clients through the Ansible inventory, making it easy to add or remove users.
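The keys referenced in the inventory are ordinary WireGuard keypairs; they can be generated anywhere wireguard-tools is installed:

```sh
# Generate a private key and derive the matching public key
wg genkey | tee workstation.key | wg pubkey > workstation.pub
```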
Hotplug Integration
One of the challenges with Wireguard on OpenWRT is ensuring peer configurations are applied when the VPN interface comes up. I solved this with a custom hotplug script:
```sh
#!/bin/sh
# Wireguard Hotplug Script
# This script is triggered by network interface events
# It ensures Wireguard peers are configured when the VPN interface comes up

# Only process ifup events for the VPN interface
[ "$ACTION" = "ifup" ] || exit 0
[ "$INTERFACE" = "vpn" ] || exit 0

# Wait a moment for the interface to be fully ready
sleep 2

# Configure Wireguard peers from UCI configuration
{% for host in groups['vpn'] %}
{% if hostvars[host].wireguard_keys.public is defined and host != 'router' %}
# Configure peer: {{ host }}
wg set vpn peer {{ hostvars[host].wireguard_keys.public }} allowed-ips {{ hostvars[host].allowed_ips }} persistent-keepalive 25
{% endif %}
{% endfor %}

# Log the event
logger -t wireguard "VPN interface up, peers configured"
```
This script automatically configures all VPN peers whenever the VPN interface is brought up, ensuring reliable connectivity.
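After an `ifup`, the live WireGuard state and the system log confirm that the hotplug script actually ran:

```sh
# Live peer state for the vpn interface
wg show vpn

# The script's logger line should show up here
logread | grep wireguard
```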
Firewall Integration: Enforcing the Security Matrix
The VPN integration required careful firewall configuration to maintain the security matrix. The key rules were:
```
# VPN VLAN access rules - VPN can only access Lab VLAN

# VPN to Lab forwarding (for remote development access)
config forwarding
    option src 'vlan30'    # Source: VPN VLAN
    option dest 'vlan26'   # Destination: Lab VLAN

# Management to VPN forwarding (for VPN management)
config forwarding
    option src 'lan'       # Source: Management VLAN
    option dest 'vlan30'   # Destination: VPN VLAN
```
This ensured that:
- VPN users could only access the Lab VLAN
- Management VLAN could access VPN for administration
- All other VLANs remained isolated from VPN access
Client Configuration: Workstation Integration
The VPN client configuration was managed through the Ansible inventory:
```yaml
# VPN
vpn:
  hosts:
    workstation:
      ansible_host: 192.168.30.10
      hostname: "workstation.vpn.home.example.net"
      description: "Workstation with VPN client"
      public_key: "{{ hostvars['workstation'].wireguard_keys.public }}"
      private_key: "{{ hostvars['workstation'].wireguard_keys.private }}"
      allowed_ips: "192.168.30.10/32"
```
This allowed me to assign static IP addresses to VPN clients and manage their access through the same automation system.
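For completeness, the matching client side would look roughly like this. The endpoint hostname is a placeholder, and the Lab subnet (192.168.26.0/24) is my inference from the `vlan26` zone above:

```sh
# Hypothetical wg-quick client config (keys elided)
cat > /etc/wireguard/vpn.conf <<'EOF'
[Interface]
PrivateKey = <workstation private key>
Address = 192.168.30.10/32

[Peer]
PublicKey = <router public key>
Endpoint = vpn.home.example.net:51820
AllowedIPs = 192.168.26.0/24, 192.168.30.0/24
PersistentKeepalive = 25
EOF

wg-quick up vpn
```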
Service Management: Reliable VPN Operation
The VPN service management was integrated into the Ansible handlers:
```yaml
- name: restart wireguard
  # 'command' cannot chain with '&&', so this needs the shell module
  shell: ifdown vpn && ifup vpn
  retries: 3
  delay: 5
```
This ensured that VPN configuration changes were properly applied and the service was restarted reliably.
Testing and Validation
After implementing the VPN, I conducted comprehensive testing:
- Connectivity Testing: Verified VPN clients could connect and receive proper IP addresses
- Access Control Testing: Confirmed VPN users could only access the Lab VLAN
- Isolation Testing: Verified that IoT, Devices, and Guest VLANs remained inaccessible
- Performance Testing: Measured VPN throughput and latency
- Failover Testing: Tested VPN reconnection after network interruptions
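The isolation checks reduce to a few pings from a connected VPN client (a sketch; the gateway addresses assume the 192.168.&lt;vlan-id&gt;.1 convention used earlier):

```sh
# From a VPN client: Lab should answer, IoT should not
ping -c 3 -W 2 192.168.26.1 && echo "Lab reachable (expected)"
ping -c 3 -W 2 192.168.27.1 || echo "IoT unreachable (expected)"
```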
The Result: Secure Remote Development
The VPN implementation provided exactly what I needed:
- Secure remote access to the Kubernetes cluster for development work
- Maintained network isolation with the security matrix intact
- Automated management through Ansible for easy client addition/removal
- Reliable operation with proper service management and monitoring
- Modern security with Wireguard’s state-of-the-art cryptography
This completed the network infrastructure, providing both local segmentation and secure remote access while maintaining the security principles I had established.
Chapter 11: The Automation Maturity
Comprehensive Template Documentation
Throughout this journey, I learned the importance of documentation. I added comprehensive comments to all router configuration templates:
```
# Network configuration for OpenWRT router
# This file defines all network interfaces including VLANs

# Loopback interface - standard localhost interface
config interface 'loopback'
    option device 'lo'           # Loopback device
    option proto 'static'        # Static IP configuration
    option ipaddr '127.0.0.1'    # Standard loopback IP
    option netmask '255.0.0.0'   # Loopback subnet mask
```
This documentation proved invaluable for troubleshooting and future maintenance.
The Complete Automation System
By the end of this journey, I had built a comprehensive automation system that included:
- Network Configuration: Automated VLAN setup with proper bridging
- DHCP Management: Static and dynamic IP assignment with proper MAC address handling
- WiFi Configuration: Dual-band support with proper security
- Firewall Management: Controlled access between VLANs
- Service Management: Automatic restart of services after configuration changes
- Testing and Validation: Comprehensive health checks and connectivity testing
- Backup and Restore: Automated configuration backup with timestamped archives
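The backup item in that list maps onto OpenWRT's built-in config archiver; roughly, each run boils down to:

```sh
# Archive everything registered for backup (/etc/config and friends)
sysupgrade -b "/tmp/router-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
```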
This is part of the “Building a Kubernetes Homelab” series. In the next post, we’ll deploy the Kubernetes cluster on the Lab VLAN and begin migrating services to our new network infrastructure.