YAABAS - Yet Another Admin Building A System

A Warning

I am not a web developer but a "backend code kinda person"; as such, no good web design should be expected from my site. Anything decent here can usually be attributed to a CSS/HTML tutorial. Bear with it, as I will not commit to getting better. I am also not a great example of someone who follows Industry Standards, so DO NOT (under any circumstances!) use what I am doing here as a template/guide. This is not a production network, and it should be treated as a learning log. DO NOT USE MY CONFIGURATIONS FOR YOUR ENVIRONMENT.

Do your own research, look up settings you don't fully understand, and make informed decisions on your own. I will not assist you.

Get to the Site's point, already

As an avid player of the factory-management game affectionately referred to as "Crack-torio", I have an odd addiction to complex automated systems. I used to build spaghetti bases that didn't scale with the playthrough, and I got so frustrated with entire bases being un-knowably complex (a "magical mechanicus moment", as described by MandaloreGaming) that I decided to tear my whole factory down before entering the Space Age and adopt a scalable, blueprint-based factory design. After 1k hours building and expanding that scalable factory, I want to apply the same method to my homelab.

After reading the Wiki.C2 article "Kill Your Darlings", I finally pushed myself to think harder about server scaling. I need to scale if I want to continue my game- and communication-server hobby. Since planning and implementing a multi-service, self-hosted environment from scratch is my idea of a fun project, my server scaling experiments are going to be done within that scope. Let's start talking about my plans.

Identity

Since a few of the self-hosted services I want to run could grow to serve multiple users with different access scopes, I knew I wanted to set up a proper framework for group-driven, role-based access. From that standpoint I want a fully federated authentication and authorization workflow, and OpenLDAP looked to me like the right choice. While reading this, you might be asking yourself: "Why didn't they choose lldap? It's easier to set up..." That is a great question, and the answer initially boiled down to me not knowing that lldap existed. However, the more I looked into the documentation around OpenLDAP, the more I realized I could grow into managing it. Being a Windows sysadmin and not knowing LDAP queries and proper structure management... for shame, for shame... On top of the LDAP service, I want to support OpenID Connect/single sign-on/JWT-based logins, so I'll be attempting to set up Authelia to handle federated identities.
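
To make that concrete, here is the rough shape of a group-driven entry I have in mind, written as an Ansible task (jumping ahead to the automation tooling covered below) using the community.general.ldap_entry module. Every DN, group name, and credential below is a placeholder for illustration, not my actual tree:

  # A minimal sketch: one OU plus one role group in OpenLDAP.
  # All DNs, names, and secrets here are placeholders.
  - name: Lay down a group-driven access structure
    hosts: ldap_server
    tasks:
      - name: Ensure the groups OU exists
        community.general.ldap_entry:
          dn: ou=groups,dc=example,dc=lan
          objectClass: organizationalUnit
          server_uri: ldap://localhost
          bind_dn: cn=admin,dc=example,dc=lan
          bind_pw: "{{ ldap_admin_pw }}"

      - name: Ensure a media-access role group exists
        community.general.ldap_entry:
          dn: cn=media-users,ou=groups,dc=example,dc=lan
          objectClass: groupOfNames
          attributes:
            # groupOfNames requires at least one member entry
            member: cn=placeholder,ou=users,dc=example,dc=lan
          server_uri: ldap://localhost
          bind_dn: cn=admin,dc=example,dc=lan
          bind_pw: "{{ ldap_admin_pw }}"

The idea is that services only ever check group membership, so granting or revoking access becomes a one-line change to a group rather than per-service user wrangling.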

In the end, I believe the daunting initial OpenLDAP setup is going to help force me to plan out my network; but as I don't want to retain all of the knowledge of how I set it up in my stupid meat knowledge-container (read: brain), I hope to use a management tool to configure everything remotely.

Automation

For a bit of background, I work as a Windows sysadmin managing a Hyper-V cluster using mostly GUIs. Sacrilege, I know. In an effort to sharpen my general cluster-management skills and work the "Kill Your Darlings" method into my toolset, I am looking at Ansible. While not directly useful for my work cluster, it has one fantastic advantage across almost my entire stack: it's free. I have been picking software solutions somewhat haphazardly during this process, mostly based on licensing cost and general support/usage within the community, but I believe Ansible can reasonably automate all of them. Some services have specific modules I plan to leverage; others I am going to configure with templated files. One of the bigger lifts the Ansible community modules are going to help with is the community.proxmox collection, as the cluster will be Proxmox-based. I aim to have most of the containers built and maintained by Ansible.
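
As a taste of what that looks like, here is a minimal sketch of a container build through the community.proxmox collection. The host names, token, storage, and template names are assumptions for illustration, not my real config:

  # A rough sketch of the container builds I have in mind.
  # Host names, credentials, and storage names are placeholders.
  - name: Build a service container on the cluster
    hosts: localhost
    tasks:
      - name: Create an LXC container via the Proxmox API
        community.proxmox.proxmox:
          api_host: pve1.example.lan
          api_user: ansible@pam
          api_token_id: automation
          api_token_secret: "{{ proxmox_token }}"
          node: pve1
          hostname: jellyfin
          ostemplate: local:vztmpl/debian-12-standard_12.7-1_amd64.tar.zst
          storage: ceph-pool
          cores: 4
          memory: 4096
          state: present

Pair that with templated config files pushed into the container afterwards and, in theory, a whole service can be rebuilt from scratch without me touching a GUI.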

The Cluster

For the cluster itself I am going to be using a three-node Proxmox Virtual Environment (PVE) cluster of Dell R630s, with 1Gb networks for cluster management and access (at least two separate subnets!) and 10Gb networks for storage and backups (again, two separate subnets). On those servers, I'll be configuring a Ceph storage cluster using the non-OS disks (4x 512GB Samsung SSDs per node) and hosting containers/virtual machines on that Ceph cluster. The cluster will also utilize High Availability (HA), so that when a node needs maintenance I can put it into maintenance mode, flush its guests to another node, and pull it from the cluster (I do have some consistent users who get grumpy if I turn off specific services like Matrix, Jellyfin, etc.).
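
Since (as far as I can tell) not every Ceph and HA knob is exposed through a dedicated Ansible module, my current plan is to drive Proxmox's own CLI tools from Ansible. A back-of-the-napkin sketch, where the device paths and the ct:101 ID are stand-ins:

  # Bootstrap Ceph OSDs and HA via Proxmox's CLI tools.
  # Device paths and the container ID below are assumptions.
  - name: Create OSDs on the non-OS SSDs
    hosts: pve_nodes
    tasks:
      - name: Turn each data SSD into a Ceph OSD
        ansible.builtin.command: pveceph osd create {{ item }}
        loop:
          - /dev/sdb
          - /dev/sdc
          - /dev/sdd
          - /dev/sde
        register: osd_result
        changed_when: osd_result.rc == 0
        failed_when: false   # crude: a disk may already be an OSD on re-runs

  - name: Register HA resources
    hosts: pve_nodes[0]
    tasks:
      - name: Mark the Matrix container as HA-managed
        ansible.builtin.command: ha-manager add ct:101
        failed_when: false   # crude: already-registered resources error out

The failed_when: false lines are a lazy idempotence hack; I'll swap them for proper state checks once I see what these tools actually return on re-runs.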

Future Updates

For now, that is all I want to cover. I have outlines for the rest of the project, but they will be broken into blog posts covering everything from single-service configurations to overarching network changes to simple automation-driven updates. Below is a general list of the topics I'd like to cover at the moment:

Hosted Services

User-Side/Provided Services

  • Media Management
    • Jellyfin
    • Invidious for YouTube control, as a doom-scrolling avoidance mechanism
  • Game Servers
    • Minecraft
    • Palworld
  • Communication Servers
    • Matrix
  • File Sharing
    • Undecided as yet

Management Services

  • Identity
    • OpenLDAP
      • Why I didn't use lldap
      • Who gets access/describing user roles
    • Authelia
      • Why: different auth methods on different services
  • Monitoring
    • Beszel (?)
  • Cluster Management
    • DNS/DHCP/Routing
      • Pi-hole for local DNS + DHCP
      • Un-Holy Forbidden Router setup
        • Yeah I am going to Double-NAT, I'm kinda stupid but Services are exposed on Tailscale sooooo.... we'll tackle that when we get to it :T
    • Backups
      • Why not PVE-Backup?
      • Debian Host with Tape Backups
      • Rsync
    • Ansible
      • LDAP Inventory?
      • Telnet switch configs?
      • General chaos?
  • Proxy
    • Caddy
      • VPN-Exposed Services
      • Internet-Exposed Services
  • VPN
    • Tailscale
      • Subnet router
      • Specific Clients

Here are the Cluster's specs, for interested parties:

  • 3x Dell R630 (Details are Per-Node)
    • CPU: 2x Intel Xeon
    • RAM: 64GB (32GB per CPU)
    • Disks: 5x 512GB Samsung 980 Pro
      • 1x for OS
      • 4x for Ceph
    • NICs:
      • 4x 1Gb Ports
        • General Service access
        • Management
        • Cluster Comms
        • Cluster Heartbeat
      • 2x 10Gb Ports
        • Storage
        • Backups

Miscellaneous Crap:

  • 1x Dell Optiplex 7050 (backup machine, reviewing replacement)
  • 1x Lenovo Thinkpad T420 (Ansible control, reviewing replacement)
  • 2x HP ProCurve J9450A Switches (horrible coil whine, crying softly about it and reviewing replacement)
  • 1x TrendNet 10Gb Switch

I plan on continuing to add blog posts, and I have added an RSS feed so you can add my awful ramblings to your RSS tracker. Expect nothing, but updates may eventually happen.

Site Crudely Constructed By: Dextano