An Overdue Update
Well, a lot has happened in the months I haven't updated here, both professionally and in the lab. I'll spare you the details, but as a heads up this one is more blog than technical overview.
TL;DR on the lab side:
- I've switched to FreeIPA for account management. OpenLDAP is fine and was working, I was just frustrated with certificate management and Kerberos (read: I was burnt out on OpenLDAP configuration...)
- 10GbE Links for storage felt limiting compared to the 10Gb fiber switches + NICs I saw on eBay and well... I made a few poor financial decisions and I'm using bonded 10Gb fiber connections in the lab now.
- Ansible has taken a backseat while figuring out how to configure things before introducing automation.
- THE FORBIDDEN ROUTER HAS BEEN BANISHED. Double-NAT remains, sue me.
- I have added a few Dell OptiPlex 7050 Micro units to the lab. They're relatively cheap and powerful.
- A TrueNAS storage server has also joined the fray.
- The HP ProCurves with coil whine have been replaced by an Arista DCS-7010T-48-F and an Arista DCS-7150S-24-F. And I love them very much.
- VMs and LXCs are provisioned and the system is mostly stable.
- Lenovo T420 was replaced with a Lenovo IdeaPad Slim 5i. It's nice.
- TailScale is an admittedly limited godsend.
Identity
So....yeah. If you've read the TL;DR entry and got a giggle out of the first line, this is a tad awkward with what I said in the first blog concerning OpenLDAP. After a few hours of trying to configure Kerberos with it I looked at FreeIPA.
And you know what? I realized I was a lot better off for it.
Joining a PC to the domain with a single command, having it manage its own DDNS and Service Certificates AND I don't have to configure my Domain LDIF-by-LDIF? Sign me _RIGHT_ the fuck up.
I now have a web interface that doesn't operate like it's built on Java from the mid-90s, one that lets my hosts manage their own DNS records and shows all computer/host/service records in a single pane. I didn't even mind that a lot of the Help articles on Red Hat's site were paywalled, because it was likely that the second result in my search was someone else on Stack Overflow with the same question, answered just as effectively.
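For the curious, here's roughly what that "single command" looks like, written as a minimal Python sketch that only builds the `ipa-client-install` call. The domain, server, and realm names are made-up placeholders and not my actual setup, and the flag selection is just one plausible combination; service certificates are a separate step and aren't covered here.

```python
import shlex


def ipa_join_command(domain, server, realm, principal="admin"):
    """Build the argv for enrolling a host with ipa-client-install."""
    return [
        "ipa-client-install",
        f"--domain={domain}",
        f"--server={server}",
        f"--realm={realm}",
        f"--principal={principal}",   # account allowed to enroll hosts
        "--mkhomedir",                # create home directories on first login
        "--enable-dns-updates",       # let the host keep its own DNS record current
    ]


# Placeholder names, purely for illustration.
cmd = ipa_join_command("lab.example.com", "ipa01.lab.example.com", "LAB.EXAMPLE.COM")
print(shlex.join(cmd))
```

Run the printed command as root on the machine being joined; it will prompt for the enrolling account's password.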
Caddy is really cool, and TailScale Saves the Day
'Cause I can't wrap my fuckin' head around Nginx. Rather, I could, but that's a lot more effort than I'm willing to put in right now. My certificates are auto-managed, things are connecting, and I am happy.
The coolest bit? Caddy + TailScale, which lets me route traffic from sub-domains to internal nodes. Score! Eat your heart out, TailScale Funnel!
Anyways, I was disappointed with TailScale Funnel's HTTPS-only forwarding. Not to say it's bad, I just wanted to use my nice domains I bought and paid for.
My solution was to reverse-proxy traffic from an EC2 instance hosting Caddy to local machines on my (Dual NAT'd!!) network over the TailScale interface. Let me tell you how lovely it is, because I don't have to think about:
- opening ports in my firewall
- forwarding those ports over my local network to a SECOND port-forwarding firewall (Yuck!)
- dealing with isolating my local network from this traffic (not that I don't already have it on an isolated VLAN)
- rolling my own tunnel with a VPN client and server, and all the configs and management that comes with it (stay tuned, we might try to break from TailScale's server as well...)
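To make the shape of that setup concrete, here's a minimal Python sketch that renders a Caddyfile with one site block per sub-domain, each reverse-proxying to a node's TailScale address. The hostnames, tailnet IPs, and ports are hypothetical stand-ins (not my real config), and it leaves out the DNS and CrowdSec plugin directives entirely.

```python
def render_caddyfile(routes):
    """Render one Caddy site block per sub-domain -> tailnet upstream."""
    blocks = []
    for host, upstream in sorted(routes.items()):
        # Doubled braces are literal braces inside an f-string.
        blocks.append(f"{host} {{\n\treverse_proxy {upstream}\n}}\n")
    return "\n".join(blocks)


# Hypothetical mapping: public sub-domain -> TailScale IP and port of the internal node.
routes = {
    "media.example.com": "100.64.0.10:8096",
    "books.example.com": "100.64.0.11:5000",
}
caddyfile = render_caddyfile(routes)
print(caddyfile)
```

The EC2 box only needs ports 80/443 open to the world; everything behind it travels over the TailScale interface, which is why none of the items in the list above come into play.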
For all I just hyped up Caddy+TailScale, I do have to warn that I had to custom compile it with some specific plugins (route53, caddy-dynamicdns, crowdsec bouncer). Which for someone who's never used, let alone INSTALLED the Go programming language before, was confusing. Nevertheless, once that was done and my config was in place, the sweet sweet connections started rolling in. Not from me, mind you, but they DID start rolling in...
CrowdSec and me
I absolutely detest having the existence of anything I'm serving to my friends be known by the wider internet. However, they hold the same amount of disdain for secure solutions ("too much effort", "too inconvenient to use for simple occasional game chatter") that I hold for systems that sell my data so that when I shit a little too loudly, they're right there to sell me some sound deadening for bathrooms at 30% markup. Or to let the feds know to aim for my stomach for max effectiveness, whichever gets them top dollar. /rant
Where was I? Oh yeah, I live for my friends' convenience and I also don't like being probed. The latter only on weekdays. Typically. So to appease my buddies and keep my systems relatively secure, I have started using CrowdSec. It automatically adds blocks for known-baddies. As in bad-bitc--
CrowdSec and AWS' DDoS protection have me sleeping snug as a bug in a rug, but there's some more stuff I know for a fact I can do.
Authentication, Matrix, TURN, and my everliving nightmare trying to leave Discord
Discord announced their ID and age verification workflows and I decided it was time to leave, as did many of my friend group.
As I'm the most technical of the group (I host all game servers and help everyone with troubleshooting), I volunteered to look into alternatives. Way back when, we used to use Skype, then Steam chat for text comms and TeamSpeak3 for calls, and in this refresh we looked at TeamSpeak again alongside Stoat and Matrix. Many of my friends only use Discord for our tight-knit group, so I offered to host the instance on the domain we bought for the TeamSpeak waaaaaaaaaaaaay back in...oh lord, 2012? Jesus, I'm getting old...
I'd like to state that any software that limits simple setup, OIDC, and LDAP authentication behind a paywall can chortle my fucking balls. I have an LDAP Server, I have an OIDC Provider, and I want Single-Sign-On for my buddies and me. Why do I need an enterprise license for that? Let me rephrase, I know _why_, I just don't agree.
Long story short, I had a whole rant here about how I disliked Element's Matrix Server implementation called Synapse. As it turns (ha!) out, it's the..."easiest"...one to set up.
TURN/STUN and not wanting to use Google drove me right back to Synapse. The initial gripe I had was it was limited to 100 users (get over yourself, dude. You have ~5 friends) and LDAP connections were gated behind a paid, enterprise-level license.
So, I began to look into what I could do to host the Element ess-stack. And there began my foray into k3s, Helm, and how much I don't know about Kubernetes. I should have known that something selling itself as an "orchestrator" would have put me through the wringer. I didn't even know what I was orchestrating, and truthfully still don't. I need to spend more time and effort in understanding Kubernetes as a system, and then I can think about something more high-impact/production like Matrix hosting.
As always, this is a temporary skill issue. I've still got more thinking, planning, and testing here, but I'm stubborn and set enough on Matrix that I believe I can get it to a point my friends will use it. We will see if that statement proves true or not.
Nvidia - The Transcoding Server from Hell
Speaking of my stubbornness...I have had quite the battle with Nvidia's support pages. After I posted Blog 0, I realized I wanted a GPU transcoding server and some automated scripting to help with placement of files on my NAS. I wanted to offload transcoding from my Plex/Jellyfin servers to another machine where I could review the content before final transfer to the server.
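As a rough idea of where I want that workflow to end up, here's a minimal Python sketch that builds an NVENC-backed ffmpeg command and drops the result into a staging folder for review. The paths, the `.hevc.mkv` naming, and the quality value are all assumptions for illustration, and it only works if the ffmpeg build actually has NVENC support.

```python
from pathlib import Path


def nvenc_transcode_command(source, staging_dir, quality=24):
    """Build an ffmpeg argv that re-encodes video to H.265 on the GPU.

    Returns the argv plus the staging path the output will be written to.
    """
    source = Path(source)
    target = Path(staging_dir) / (source.stem + ".hevc.mkv")
    argv = [
        "ffmpeg",
        "-hwaccel", "cuda",      # decode on the GPU where possible
        "-i", str(source),
        "-c:v", "hevc_nvenc",    # NVENC H.265 encoder
        "-preset", "p5",         # slower presets trade speed for quality
        "-rc", "vbr",
        "-cq", str(quality),     # constant-quality target, lower is better
        "-c:a", "copy",          # leave audio untouched
        str(target),
    ]
    return argv, target


# Hypothetical paths; nothing is executed here, the command is only assembled.
argv, target = nvenc_transcode_command("/srv/incoming/movie.mkv", "/srv/staging")
print(" ".join(argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` would do the actual encode; moving the file from staging to the NAS stays a manual step so there's a chance to look at it first.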
I found what I wanted in a used Nvidia Tesla A2, offered as the cheapest option on eBay that covered all the Codecs I wanted (up thru H.265, but no AV1, sadly). This led me down a rabbit hole of helm-based frustration distraction concerning NVENC-supported transcoding that I wasn't prepared to spend time on but really wanted to get into. I mean I REEEEEEEEEEEEEEALLY wanted to get into it. Like Elon Musk begging for an invite to a part--
Anyways, I ordered an InWin 1U server chassis, a FlexATX PSU, an ASRock Industrial board, and the aforementioned Tesla A2. To anyone looking to see if an ASRock IMB-1220-D will fit into an InWin IW-RF100-S315 chassis: don't do it. Stop. Order a different board or a different chassis. The northbridge heatsink on the back of the motherboard and the backing plate for the CPU cooler WILL prevent you from using the included standoffs unless you wrench that fucker down so hard that I--uh, SOMEONE...nearly made a splinter-fest of motherboard pieces...don't ask for images...
Anyways, once the machine was together and it stopped booting with PCIe Graphics as default (A2 has no outputs, it is a datacenter-focused card) and I fixed the BIOS settings not sticking around, as the motherboard doesn't have a coin-cell battery to save them (seriously, just don't buy this board), I was finally able to install Debian 13 and the Nvidia drivers and we-- wait...the Debian repos are 40 versions behind on Nvidia drivers....I needed version 590, but I was capped at 550.
Well I'm not going to be using anything other than Debian on this here fine machinery, so my ass needed to figure out how to compile the Nvidia datacenter drivers on top of compiling ffmpeg with NVENC support.
I am in so much pain.
The State of the Cluster
The Network
The previous plan to use some HP 1GbE switches fell through as soon as I booted my eBay purchases. The coil whine was LOUD. Loud enough that I decided I didn't want them. The Arista switches I had been eyeing up the previous months started looking really nice right about then.
Hosted Services
- Hosted for friends
  - Valhelsia 5
  - Jellyfin
  - Kavita
  - Seerr
- Hosted for me
  - DokuWiki
  - KanBoard
  - Apache Guacamole
  - *Arrs
  - Identity
  - Monitoring
  - Proxy
  - VPN
- Dashboards
  - Homer - Users
  - Homer - Admins
Here are the Lab's specs, for interested parties:
Cluster Nodes:
- 3x Dell R630 (Details are Per-Node)
  - CPU: 2x Intel Xeon
  - RAM: 64GB (32GB per-CPU)
  - Disks: 5x 512GB Samsung 980 Pro
  - NICs:
    - 4x 1Gb Ports
    - 2x 10GbE Ports
    - 2x 10Gb FC Ports
Other Nodes:
- 3x Dell Optiplex 7050s (Details are Per-Node)
  - CPU: 1x Intel 7th Gen 8 Core
  - RAM: 32GB
  - Disks: 1x 512GB Samsung 980 Pro NVMe
  - NICs:
    - 1x 1Gb Port
    - Addtl. 1x 1Gb Port (Router Machine)
- 1x Custom-Built Transcode Server
  - CPU: 1x Intel 10th Gen 8 Core
  - RAM: 16GB
  - Disks:
    - 1x 512GB Samsung 980 Pro NVMe
    - 2x 1TB Samsung 980 EVO SATA
  - GPU: Nvidia Tesla A2
Storage Server:
- 1x SuperMicro CSE-847E16-R1400LPB
  - CPU: 2x Intel Xeon
  - RAM: 256GB
  - Disks:
    - 10x 14TB Blend of WD Red and Seagate Exos drives (shucked from externals)
    - 3x 22TB Seagate Drives (I don't recall the model, shucked from externals)
    - 4x 8TB Dell-Branded Seagate Drives
  - NICs:
    - 4x 1Gb Ports
    - 2x 10Gb FC Ports
I plan on continuing to add Blog posts, and I have added an RSS Feed so you can add my awful ramblings to your RSS Tracker. Expect nothing, but updates may eventually be done.