Self Hosting Fail

I woke up this morning to a text from my ISP, "There is an outage in your area, we are working to resolve the issue"

I laugh, this is what I live for! Almost all of my services are self hosted, I'm barely going to notice the difference!

Wrong.

When the internet went out, the power also went out for a few seconds. Four small computers host all of my services. Of those, one shut down and three rebooted. On the three that ugly-rebooted, some services came back online and some didn't.

30 minutes later, ISP sends out the text that service is back online.

2 hours later I'm still finding down services on my network.

Moral of the story: A UPS has moved to the top of the shopping list! Any suggestions??

Kuinox ,

When you are bored, back up a VM, then hard kill it and see if it manages to restart properly.
Software should be able to recover from that.
If it doesn't, troubleshoot.

pezhore ,
@pezhore@lemmy.ml avatar

While I appreciate the sentiment, most traditional VMs do not like to have their power killed (especially non-journaling file systems).

Even crash consistent applications can be impacted if the underlying host fs is affected by power loss.

I do think that backups are a valid suggestion here, provided that the backup isn't interrupted by a power surge or loss.

taladar ,

most traditional VMs do not like to have their power killed (especially non-journaling file systems).

Why are you using a non-journaling file system in 2024 when those were common 10+ years ago?

ryannathans ,

Or even better use something like ZFS with CoW that can't corrupt on power loss

emptiestplace ,

and don't fuck with sync writes

taladar ,

I would still consider that generation of filesystem to be effort to use while regular journaling filesystems have been so ubiquitous that you need to invest effort to avoid using one.

ryannathans ,

It was supported and the default out of the box when I installed my OS

taladar ,

Maybe on some distros that is the case if you install a recent version but to get a non-journaling filesystem you literally have to partition manually to avoid using one on any distro that is still supported today and meant for full sized PCs (as opposed to embedded devices).

ryannathans ,

Are you talking about Linux distros? What manual partitioning has to occur?

taladar ,

If you want to use a filesystem that is so bad that it doesn't even have journaling you need to manually select it. None of them have been using one of those by default for 10-15 years now.

lazynooblet ,
@lazynooblet@lazysoci.al avatar

It's been a while since a power cut affected my services, is this why?

I remember having to troubleshoot mysql corruption following abrupt power loss, is this no longer a thing?

taladar ,

Databases shouldn't even need a journaling filesystem, they usually pay attention to when to use fsync and fdatasync.

In fact journaling filesystems basically use the same mechanisms as databases only for filesystem metadata.
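As a rough illustration of the fsync pattern mentioned above, here is a minimal Python sketch of a crash-safe file update: write to a temporary file, fsync it, then atomically rename it over the old one. The helper name `durable_write` is made up for this example, and real databases do considerably more (write-ahead logs, checksums), so treat it as a sketch of the idea rather than how any particular database works.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Replace path with data so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push the data to the disk
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself survives a power loss (POSIX only).
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The point of the ordering is that the new contents are on disk before the rename makes them visible, so a power cut at any step should not leave a half-written file under the real name.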

possiblylinux127 ,
@possiblylinux127@lemmy.zip avatar

Your system should be fine after a hard kill. If it's not, stop using it, as that's going to be a problem down the road.

BlackAura ,

When I built my home server this is what I did with all VMs. Learned how to change the startup delay time in ESXi and ensured everything came back online with no issues from a cold boot.

Rip VMware.

Deebster ,
@Deebster@programming.dev avatar

That reminds me of Netflix's Chaos Monkey (basically in office hours this tool will randomly kill stuff).

redcalcium ,

I feel your pain. Just the other day the disk on my home assistant machine died after a power outage and I had to replace it with another disk and restore from backup.

JovialSodium ,

My suggestion just changes your threat model, so may not be a good one based on your wants.

Perhaps consolidate systems? Managing fewer devices = fewer points of failure. But it adds the risk of any given failure being more severe.

padook OP ,
@padook@feddit.nl avatar

This thought came to me this morning. I have 4 machines both because the BEAST grows organically, and because we're always trying to avoid that single point of failure. Then a scenario comes along that makes you question your whole way of thinking: diversifying may actually create more problems.

Static_Rocket ,
@Static_Rocket@lemmy.world avatar

I present to you the holy hardware compatibility table:

https://networkupstools.org/stable-hcl.html

Anything not listed there is not worth buying.

catloaf ,

A lot of stuff on there isn't worth buying either, like anything from APC. If you want good stuff, just get Eaton.

But also you have to understand that UPSes aren't set and forget. The batteries need replacement every 3-5 years. And they're not for extended outages, they're mostly to bridge the gap between mains power going out and a generator starting up.

Personally I just have everything running from docker-compose, so I run one command and everything not running gets started. I don't worry about stuff being down for a bit.

anamethatisnt ,

Eaton's batteries are usually really simple to switch, see
https://www.eaton.com/content/dam/eaton/products/backup-power-ups-surge-it-power-distribution/backup-power-ups/eaton-5s-ups/eaton-5s-120v-user-manual-700-1000-1500-lcd.pdf

For me they are meant for allowing a graceful shutdown in a power outage scenario and for protecting the hardware behind them from power surges.

acockworkorange ,

That's why you integrate with NUT. So you can automate a graceful shutdown when battery levels drop to a set level.
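For a feel of what that automation decides, here is a small Python sketch that parses the `key: value` lines printed by NUT's `upsc` client and checks the `ups.status` flags (`OB` for on battery, `LB` for low battery) plus `battery.charge`. The function names, the 30% threshold, and the UPS name `myups@localhost` are assumptions for illustration. In a real setup NUT's own `upsmon` normally handles the shutdown, so this only shows the logic.

```python
import subprocess


def read_ups(ups_name: str = "myups@localhost") -> str:
    """Return raw `upsc` output. The UPS name here is a placeholder."""
    result = subprocess.run(
        ["upsc", ups_name], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_upsc(text: str) -> dict[str, str]:
    """Turn 'key: value' lines into a dictionary, skipping anything else."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def should_shut_down(values: dict[str, str], min_charge: float = 30.0) -> bool:
    """Decide whether to shut down. The 30% threshold is an arbitrary example."""
    flags = values.get("ups.status", "").split()
    if "OB" not in flags:
        return False  # still on mains power
    if "LB" in flags:
        return True   # the UPS itself reports a low battery
    try:
        charge = float(values.get("battery.charge", ""))
    except ValueError:
        return False  # charge unknown: rely on the LB flag only
    return charge <= min_charge
```

A cron job or small service could call `should_shut_down(parse_upsc(read_ups()))` and trigger a clean shutdown when it returns true.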

wreckedcarzz ,
@wreckedcarzz@lemmy.world avatar

Tl;dr apc bad? I have 4 cyberpower so no experience with them

corsicanguppy ,

They got bought. They started to suck.

Now they've been 'sploited, they're overpriced, and they still schlepp the same bad software.

clmbmb ,

What's wrong with APC? I've had one for 6-7 years. I've changed the battery once and I think I'll have to change it again this year. I haven't had any problems with it.

Appoxo ,

Trash software.
At least their service is good.

Static_Rocket ,
@Static_Rocket@lemmy.world avatar

Can confirm. Software is trash. Wanted me to connect it to the internet and set up a cloud access account. Like, dude, you're a glorified battery pack. I'm not adding a backdoor because you want to tell APC when my warranty is about to expire so I can get marketing emails.

lemmyvore ,

IMHO you're optimizing for the wrong thing. 100% availability is not something that's attainable for a self-hoster without driving yourself crazy.

Like the other comment suggested, I'd rather invest time into having machines and services come back up smoothly after reboots.

That being said, a UPS may be relevant to your setup in other ways. For example, it can allow a parity RAID array to shut down cleanly and reduce the risk of write holes. But that's just one example, and a UPS is just one solution for that (others being ZFS, or non-parity RAID, or SAS/SATA controller cards with built-in battery and/or hardware RAID support etc.)

pezhore ,
@pezhore@lemmy.ml avatar

I agree that 99.999% uptime is a pipedream for most home labs, but I personally think a UPS is worth it, if only to give yourself the option to gracefully shut down systems in the event of a power outage.

Eventually, I'll get a working script that checks the battery backup for mains power loss and handles the graceful shutdown for me, but right now that extra 10-15 minutes of battery backup is enough for a manual effort.

shnizmuffin ,
@shnizmuffin@lemmy.inbutts.lol avatar

Some of the nicer models of UPS have little servers built in for remote management, and also communicate to their tenants via USB or Serial or Emergency Power Off (EPO) Port.

You shouldn't have to write a script that polls battery status, the UPS should tell you. Be told, don't ask.

knobbysideup ,
@knobbysideup@sh.itjust.works avatar

I run nut on a pi.

JustEnoughDucks ,
@JustEnoughDucks@feddit.nl avatar

The problem is that for most self-hosters, they would be working and unavailable to do a graceful shutdown in any case even if they had a UPS unless they work fully from home with 0 meetings. If they are sleeping or at work, (>70% of the day for many or most) then it is useless without graceful shutdown scripts.

I just don't worry about it and go through the 10 minute startup and verification process if anything happens. Easier to use an uptime monitor like uptimekuma and log checker like dozzle for all of your services available locally and remotely and see if anything failed to come back up.

pezhore ,
@pezhore@lemmy.ml avatar

This is why I have about five of these bad boys: CyberPower CP1500PFCLCD.

One is in my utility room for my cable modem and our chest freezer, three back up my homelab and wifi AP, and one is for my office.

They've been bulletproof through storms, and when we've lost power but not Internet, I can keep on working.

The big thing to look for is number of battery+surge outlets vs just surge outlets. Typically they top out at 1500VA - the more overhead for what you're powering, the longer you can go without mains power.

A screen/display is helpful for at-a-glance information like expected runtime, current output, etc.

Catsrules ,

Never heard of someone using a UPS on a Fridge/Freezer.
Does it make a difference? Seems like the UPS would just die after 10-20 minutes and not really make much difference to your freezer.

pezhore ,
@pezhore@lemmy.ml avatar

I didn't intend to use it on the chest freezer - it was mostly for the modem, but since I had spare battery capacity and outlets I thought what the heck.

The power load is practically nothing until it cycles, and even then it's fairly efficient - my current runtime is estimated to be about 18 hours, more than enough to come up with an alternative if we lose power in a storm.

Decronym Bot , (edited )

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters More Letters
CSAM Child Sexual Abuse Material
DNS Domain Name Service/System
NAS Network-Attached Storage
PiHole Network-wide ad-blocker (DNS sinkhole)
Plex Brand of media server package
RAID Redundant Array of Independent Disks for mass storage
SATA Serial AT Attachment interface for mass storage
VPS Virtual Private Server (opposed to shared hosting)
ZFS Solaris/Linux filesystem focusing on data integrity

8 acronyms in this thread; the most compressed thread commented on today has 11 acronyms.

[Thread for this sub, first seen 3rd Mar 2024, 16:05]
[FAQ] [Full list] [Contact] [Source code]

anamethatisnt ,

Figure out how much power your servers use on average with the help of a wattage meter, then enter that number and how many minutes of battery backup you want in Eaton's UPS Power Calculator to find a suitable unit. I'm sure other vendors have similar tools too.
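If you want a back-of-the-envelope number before reaching for a vendor calculator, the arithmetic can be sketched in Python as below. Everything here is an assumption for illustration: the function names, the 0.9 power factor, the 25% headroom, and the 85% inverter efficiency. Real lead-acid batteries deliver less than their rated capacity at high loads, so a vendor tool based on measured runtime curves should be more trustworthy than this.

```python
def required_va(load_watts: float, power_factor: float = 0.9,
                headroom: float = 0.25) -> float:
    """Rough VA rating to look for: measured load plus headroom, converted to VA."""
    return load_watts * (1 + headroom) / power_factor


def estimated_runtime_minutes(load_watts: float, battery_wh: float,
                              inverter_efficiency: float = 0.85) -> float:
    """Idealized runtime: usable battery energy divided by the load."""
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    return battery_wh * inverter_efficiency / load_watts * 60
```

For example, a made-up 120 W load on a unit with two 12 V 9 Ah batteries (216 Wh) gives `estimated_runtime_minutes(120, 216)`, which works out to roughly 92 minutes on paper. Expect noticeably less in practice.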

JoMiran ,
@JoMiran@lemmy.ml avatar

A UPS should always be your first or second purchase if only for power conditioning and brown-out protection.

jkrtn ,

They will do power conditioning? My modem is such a sensitive baby I cannot plug anything else in next to it or it starts dropping packets. Would a UPS help with that? Unfortunately I cannot replace the modem, that's the only one the ISP will give me.

catloaf ,

Yes. An online/double-conversion UPS will be the most effective, because it actually runs off the battery the whole time, so it's disconnected from any line quality issues.

A line-interactive UPS is cheaper, but doesn't do full power conditioning.

An offline UPS doesn't do any at all, only comes online when power drops.

https://community.fs.com/article/line-interactive-vs-online-vs-offline-ups.html

Catsrules , (edited )

FYI
A downside of an online/double-conversion UPS is that it will use extra power, if that is something you're trying to avoid.

Also some of them will have a 24/7 fan so there will be extra noise.

atzanteol ,

You should buy a UPS if those things are concerns for you. If not, then don't.

ChojinDSL ,
@ChojinDSL@discuss.tchncs.de avatar

A UPS with USB allows you to configure a script to properly shut down your server when a power outage happens and the UPS battery is about to run out.

piefedderatedd ,
  • UPS, good idea.

  • backups too.

CameronDev ,

Did the services fail to come back due to the bad reboot, or would they have failed to come back on a clean reboot? I ugly reboot my stuff all the time, and unless the hardware fails, I can be pretty sure it's all going to come back. Getting your stuff to survive a reboot is probably a better spend of effort.

fuckwit_mcbumcrumble ,

Yeah an unclean reboot shouldn’t break anything as long as it wasn’t doing anything when it went down. I’ve never had any issues when I have to crash a computer unless it was stuck doing an update.

padook OP ,
@padook@feddit.nl avatar

I didn't mean to imply that services actually broke. Only that they didn't come back after a reboot. A clean reboot may have caused some of the same issues because I'm learning as I go. Some services are restarted by systemctl, some by cron, some... manually. This is certainly a wake-up call that I need to standardize and simplify the way the services are started.
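One way to make the post-reboot check less manual, assuming the services end up as systemd units, is a small script that asks `systemctl is-active` about each one and reports anything that is not running. The sketch below is Python, and the unit names in the usage note are placeholders, not anything from my setup.

```python
import subprocess
from typing import Callable, Iterable


def systemctl_state(unit: str) -> str:
    """Return the state printed by `systemctl is-active`, e.g. 'active' or 'failed'."""
    result = subprocess.run(
        ["systemctl", "is-active", unit], capture_output=True, text=True
    )
    return result.stdout.strip() or "unknown"


def find_down_units(units: Iterable[str],
                    get_state: Callable[[str], str] = systemctl_state) -> list[str]:
    """List every unit whose state is anything other than 'active'."""
    return [unit for unit in units if get_state(unit) != "active"]
```

Calling `find_down_units(["nginx.service", "jellyfin.service"])` after a reboot would return the units that still need attention. The `get_state` parameter only exists so the logic can be tried without systemd.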

CameronDev ,

We've all committed that sin before. It's better to rely on it surviving the reboot than to try to prevent the reboot.

Also worth looking into some form of uptime monitoring software. When something goes down, you want to know about it asap.

And documenting your setup never hurts :D

nimmo ,
@nimmo@lem.nimmog.uk avatar

On the uptime monitoring I've been quite happy with uptime kuma, but... If you put it on the same host that's down... Well, that's not going to work :p (I nearly made that mistake)

CameronDev ,

Same, Uptime Kuma is fantastic. I put it on my most critical server, if Kuma is down, everything is down :D

elvith ,

It's not the most detailed thing, but I just use a free account on cron-job.org to send a HEAD request every two minutes to a few services that are reachable from the internet (either just their homepage or some ping endpoint in the API) and then use the status page functionality to have a simple second status page on a third-party server.

You can do a bit more on their paid tier, but so far I didn't need that.

On the other hand, you could check whether a free tier or a cheap small VPS on one of the many cloud providers is sufficient for an Uptime Kuma installation. Just don't use the same cloud provider that all your other services run on.
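The HEAD-request check described above is simple enough to sketch in Python with only the standard library, in case you would rather run it from your own cheap VPS. The function names and the "anything from 200 to 399 counts as up" rule are assumptions for this example, not how cron-job.org works internally.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def head_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and return the HTTP status, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None        # DNS failure, refused connection, timeout, ...


def is_up(status: Optional[int]) -> bool:
    """Treat any 2xx or 3xx answer as 'up' (an arbitrary rule for this sketch)."""
    return status is not None and 200 <= status < 400


def check_all(urls: Iterable[str],
              probe: Callable[[str], Optional[int]] = head_status) -> dict[str, bool]:
    """Probe each URL once and map it to True (up) or False (down)."""
    return {url: is_up(probe(url)) for url in urls}
```

Run it every couple of minutes from a scheduler of your choice and alert on any `False` value.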

nimmo ,
@nimmo@lem.nimmog.uk avatar

Oh, I'm fine with my setup, I have a couple of external servers that can monitor all my web accessible stuff with kuma and then I've got another local one to monitor my non-web accessible stuff.

Thanks for those tips though, definitely useful to consider other options

iknowitwheniseeit ,

I reboot every box monthly to flush out such issues. It's not perfect, since it won't catch things like circular dependencies or clusters failing to start if every member is down, but it gets lots of stuff.

Swarfega ,

This is why I gave up self hosting. It's great when it works but it just becomes an expensive second job. I still have Plex/Jellyfin etc but for emails and password vaults I just pay for external services.

jelloeater85 ,
@jelloeater85@lemmy.world avatar

I self host stuff that I feel the need to. But TBH, you don't really need to self host much, outside of media collections. PhotoPrism and JellyFin are about the only two I need, aside from a PiHole. Most folks would be fine with a beefy NAS.

cypherix93 ,

if you don't want it to feel like a second job, you could always quit your first!

padook OP ,
@padook@feddit.nl avatar

I could have the best self hosted setup.... living in a van, down by the river!
https://feddit.nl/pictrs/image/50ed2716-bdba-4376-8cfa-7d9d512eb2ae.png

atmur ,

I like to host as many services as possible and I’m fine with it being a second job at times since this is my main hobby, but I actually agree with you on your examples. The three things I won’t self-host are:

  1. Emails - I am not willing to put in the effort on this. Plus, my ISP blocks those ports so I’d already be into using a VPS even if I wanted to host this. I’d rather just pay someone else, like Proton.

  2. Password manager - I actually did self-host Bitwarden for a long time, but after thinking about it for a while, I decided to take the pay someone else approach here too. I’m pretty sure I’m doing everything correctly, but I’m not a security expert. I’d rather be 100% sure my passwords are in safe hands rather than be 95% sure that I’m doing everything right on this one.

  3. Lemmy - I’ve heard about (luckily never seen) CSAM attacks on Lemmy/Kbin and will not risk that kind of content being downloaded because I’m federated with an instance dealing with those attacks. I’m happy to throw a couple bucks at lemmy.world’s Patreon and let them handle that.

cyborganism ,

Yeah if you self host, a UPS is very important.

knobbysideup ,
@knobbysideup@sh.itjust.works avatar

In addition to ups, an LTE failover. I've had my Comcast crap be offline for hours.

bitwolf ,

Does this require a lot of gear? Or does it simply act as another gateway?

themoonisacheese ,
@themoonisacheese@sh.itjust.works avatar

It requires an LTE-capable gateway and a data plan. As for the rest, you can simply write your routing tables so that if the main gateway doesn't work, traffic uses the secondary gateway, which has a lower priority.

knobbysideup ,
@knobbysideup@sh.itjust.works avatar

There are devices like the Netgear LM1200 that can do it inline by themselves.

I have that device, but configured as a second gateway. My firewall manages the failover based on primary packet loss and latency.
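The failover decision itself is not complicated. Here is a Python sketch of the idea: each gateway has a routing metric (lower is preferred) and the firewall picks the preferred gateway that still passes a health check on packet loss and latency. The class, the function names, and the 20% loss and 500 ms thresholds are invented for illustration, and a real firewall does this internally with its own settings.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Gateway:
    name: str
    metric: int             # lower metric = preferred, as in a routing table
    packet_loss_pct: float  # measured loss to a monitor address
    latency_ms: float       # measured round-trip time


def is_healthy(gw: Gateway, max_loss_pct: float = 20.0,
               max_latency_ms: float = 500.0) -> bool:
    """Health check with example thresholds (assumptions, tune to taste)."""
    return gw.packet_loss_pct <= max_loss_pct and gw.latency_ms <= max_latency_ms


def pick_gateway(gateways: Iterable[Gateway]) -> Optional[Gateway]:
    """Prefer the lowest-metric healthy gateway; if none is healthy, the lowest metric."""
    candidates = list(gateways)
    if not candidates:
        return None
    healthy = [gw for gw in candidates if is_healthy(gw)]
    return min(healthy or candidates, key=lambda gw: gw.metric)
```

With a cable gateway at metric 10 and an LTE gateway at metric 100, traffic stays on cable until its loss or latency crosses the threshold, then moves to LTE, and moves back once cable recovers.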

ripcord ,
@ripcord@lemmy.world avatar

I'd like that, but also a really long-running UPS. multi-hour power outages are surprisingly common in my area.

towerful ,

That's no longer a UPS.
You could get something like a Powerwall, something designed to power things from batteries for a long time.
Or get a generator with an automatic failover. The UPS then covers the downtime between the power failure and the generator taking the load.

ripcord ,
@ripcord@lemmy.world avatar

Why is that no longer a UPS?

towerful ,

Generally, UPS (lead acid) batteries are not designed for long-cycle deep discharge.
They are designed to hold their rated load for a minute or so until the power is restored (generators start, power-uncuts) or the servers have a chance to shut down.
But maybe that's dated information, and modern UPSs are designed to run from batteries for a few hours.

ripcord ,
@ripcord@lemmy.world avatar

That seems like a weirdly and artificially narrow definition of UPS.
