yote_zip

@yote_zip@pawb.social

Every community I care about is dead

yote_zip ,

Break the beef into smaller pieces first so the germs can't find it.

yote_zip ,

I've seen a trend where people move the goalposts on the reasons they're not able to switch. "If only this program worked I could switch", but when that program is ported it'll be a new excuse next. Sooner or later you'll have to draw a line and say "99% of my stuff works, the 1% that doesn't can get bent".

yote_zip ,

Syncthing - No introduction needed. Couldn't live without it.

Healthchecks.io (you can self host this) - Dead man's switch monitoring for all my automation. Most of my automated scripts hit up a Healthchecks endpoint when they run, and if they fail to hit the endpoint on a regular schedule I get notified. Mandatory for my anxiety.
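
A minimal sketch of that pattern in Python, for anyone who wants to wire it up. The check URL is a placeholder and `run_with_healthcheck` is a made-up helper name; the `/start` and `/fail` suffixes are the ping endpoints Healthchecks provides for signalling a run's start and failure.

```python
import urllib.request

# Placeholder: replace with the ping URL of your own check.
CHECK_URL = "https://hc-ping.com/your-uuid-here"


def ping_url(check_url, signal=""):
    """Build the ping URL; signal is "", "start", or "fail"."""
    base = check_url.rstrip("/")
    return f"{base}/{signal}" if signal else base


def send_ping(url, timeout=10):
    """Hit the endpoint; a monitoring hiccup should never crash the job."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
    except OSError:
        pass


def run_with_healthcheck(job, check_url=CHECK_URL, ping=send_ping):
    """Run job(), signalling start, then success or failure."""
    ping(ping_url(check_url, "start"))
    try:
        result = job()
    except Exception:
        ping(ping_url(check_url, "fail"))
        raise
    ping(ping_url(check_url))
    return result
```

If the script never runs at all (machine off, cron broken), no success ping arrives and the missed schedule is what triggers the notification.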

yote_zip ,

ZFS doesn't eat your SSD endurance. If anything it is the best option since you can enable ZSTD compression for smaller reads/writes and reads will often come from the RAM-based ARC cache instead of your SSDs. ZFS is also practically allergic to rewriting data that already exists in the pool, so once something is written it should never cost a write again - especially if you're using OpenZFS 2.2 or above which has reflinking.

My guess is you were reading about SLOG devices, which do need heavier endurance as they replicate every write coming into your HDD array (every synchronous write, anyway). SLOG devices are only useful in HDD pools, and even then they're not a must-have.

IMO just throw in whatever is cheapest or has your desired performance. Modern SSD write endurance is way better than it used to be and even if you somehow use it all up after a decade, the money you save by buying a cheaper one will pay for the replacement.

I would also recommend using ZFS or BTRFS on the data drive, even without redundancy. These filesystems store checksums of all data so you know if anything has bitrot when you scrub it. XFS/Ext4/etc store your data but they have no idea if it's still good or not.
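
To illustrate what "knowing if anything has bitrot" means, here is a toy sketch of checksum-on-write plus a scrub pass. This is not how ZFS or BTRFS are implemented: SHA-256 is a stand-in for their own checksum algorithms, and `write_blocks` and `scrub` are made-up names.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def write_blocks(blocks):
    """Store each block alongside the checksum computed at write time."""
    return [(bytes(b), checksum(b)) for b in blocks]


def scrub(stored):
    """Return indexes of blocks whose data no longer matches its checksum."""
    return [i for i, (data, digest) in enumerate(stored) if checksum(data) != digest]


stored = write_blocks([b"holiday photos", b"tax records"])
data, digest = stored[1]
stored[1] = (b"tax recorcs", digest)  # simulate silent corruption on disk
print(scrub(stored))  # [1]
```

A filesystem without data checksums has nothing to compare against, so the corrupted block would simply be returned as if it were fine.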

yote_zip ,

ZFS without redundancy is not great in the sense that redundancy is ideal in all scenarios, but it's still a modern filesystem with a lot of good features, just like BTRFS. The main problem is that it can detect data corruption but can't heal it automatically. Transparent compression, snapshotting, data checksums, copy-on-write (power loss resiliency), and reflinking are modern features of both ZFS and BTRFS. BTRFS additionally offers offline deduplication, meaning you can deduplicate any data block that exists twice in your pool without paying the massive resource cost that ZFS deduplication requires. ZFS is the more mature of the two, and I would use that if you've already got ZFS tooling set up on your machine.

Note that the TrueNAS forums spread a lot of FUD about ZFS, but ZFS without redundancy is ok. I would take anything alarmist from there with a grain of salt. BTRFS and ZFS both store 2 copies of all metadata by default, so bitrot in the metadata will still be auto-healed on a filesystem level when it's read or scrubbed.

Edit: As for write amplification, just use ashift=12 and don't worry too much about it.
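
For context, `ashift` is the base-2 logarithm of the sector size the pool assumes, so `ashift=12` means 4096-byte sectors. A rough sketch of the arithmetic follows; `allocated_bytes` is a made-up helper and deliberately ignores compression, metadata, and RAIDZ padding.

```python
def sector_size(ashift: int) -> int:
    """Sector size in bytes for a given ashift (2 ** ashift)."""
    return 1 << ashift


def allocated_bytes(write_size: int, ashift: int) -> int:
    """Round a write up to a whole number of sectors (simplified model)."""
    sector = sector_size(ashift)
    return -(-write_size // sector) * sector  # ceiling division


print(sector_size(12))            # 4096
print(allocated_bytes(5000, 12))  # 8192
```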

yote_zip ,

ZFS can grow if it has extra space on the disk. The obvious answer is that you should really be using RAIDZ2 instead if you are going with ZFS, but I assume you don't like the inflexibility of RAIDZ resizing. RAIDZ expansion has been merged into OpenZFS, but it will probably take a year or so to actually land in the next release. RAIDZ2 could still be an option if you aren't planning on growing before it lands. I don't have much experience with mdadm, but my guess is that with mdadm+ZFS, features like self-healing won't work because ZFS isn't aware of the RAID at a low-level. I would expect it to be slightly janky in a lot of ways compared to RAIDZ, and if you still want to try it you may become the foremost expert on the combination.

yote_zip ,

The main problem with self-healing is that ZFS needs to have access to two copies of data, usually solved by having 2+ disks. When you expose an mdadm device ZFS will only perceive one disk and one copy of data, so it won't try to store 2 copies of data anywhere. Underneath, mdadm will be storing the two copies of data, so any healing would need to be handled by mdadm directly instead. ZFS normally auto-heals when it reads data and when it scrubs, but in this setup mdadm would need to start the healing process through whatever measures it has (probably just scrubbing?)

yote_zip ,

Mirrored vdevs allow growth by adding a pair at a time, yes. Healing works with mirrors, because each of the two disks in a mirror is supposed to have the same data as the other. When a read or scrub happens, if there are any checksum failures, ZFS will replace the failed block on Disk1 with Disk2's copy of that block.
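
As a toy model of that read-and-repair step (not ZFS's actual code): two Python lists stand in for the mirror's disks, SHA-256 stands in for the real checksum, and `read_with_heal` is a made-up name.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def read_with_heal(disk1, disk2, checksums, index):
    """Read block `index` from a two-way mirror, repairing a bad copy in place.

    disk1/disk2 are lists of bytes standing in for the two mirror members;
    checksums holds the digest recorded when each block was written.
    """
    expected = checksums[index]
    for primary, other in ((disk1, disk2), (disk2, disk1)):
        if checksum(primary[index]) == expected:
            if checksum(other[index]) != expected:
                other[index] = primary[index]  # overwrite the bad copy
            return primary[index]
    raise IOError(f"block {index}: both copies failed their checksum")
```

With only one visible copy (the mdadm case discussed earlier), the loop would have nothing to fall back on: the checksum failure is detected but can't be repaired at this layer.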

Many ZFS'ers swear by mirrored vdevs because they give you the best performance, they're more flexible, and resilvering from a failed mirror disk is an order of magnitude faster than resilvering from a failed RAIDZ - leaving less time for a second disk failure. The big downside is that they eat 50% of your disk capacity. I personally run mirrored vdevs because it's more flexible for a small home NAS, and I make up for some of the disk inefficiency by being able to buy any-size disks on sale and throw them in whenever I see a good price.

yote_zip ,

Blind automatic upgrades are a bad idea even for casual home users. You could run into a Linus Tech Tips "do as I say" scenario where it uninstalls half your system due to a dependency issue. Or it could accidentally uninstall part of your system that you don't notice.

I'm not sure how stable Gentoo's default branch is but I know that daily upgrades on Arch Linux are close to suicide - you have a higher chance of installing a buggy package before it's fixed if you install every package version as it comes in.

I'm surprised this strategy was approved for a public server - it's playing with a loaded revolver and it looks like you were finally shot.

yote_zip ,

There's no way $3k/month is an accurate number to run a single community. Lemmy.world (and mastodon.world etc) combined takes ~$1k/month to run, and they're far larger and more active.

Also, anyone who goes back to Reddit doesn't belong with the 196 community. No bootlickers allowed.

yote_zip ,

A shortlist:

  • it has the best lossy image compression (not counting extremely low bitrate images, where AVIF starts to win)
  • it can losslessly recompress JPEGs for a free 20% space savings - no image quality loss
  • it supports parallel decoding for extra speed
  • it supports progressive decoding (viewing a lower quality version of the image while it loads), unlike WebP/AVIF which just "pop up" when you've downloaded the whole thing
  • it supports lossless
  • it compresses lossless extremely well (notably unlike AVIF and PNG which fall on their face with lossless compression)
  • it supports animation (though AVIF is generally a better format for animation, because it's based on a proper video codec)
  • it supports HDR
  • it has a very strong resilience against generation loss (the classic "JPEG degradation" of resaving images)
  • it is royalty-free
  • it otherwise has roughly every image format feature we've ever thought of included in its spec

If JXL is not the next image format then we will never ever get rid of JPEG and PNG. There has never been a more obviously superior image format in history.
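
The lossless JPEG recompression point is easy to check yourself. Below is a sketch that assumes the `cjxl` and `djxl` tools from libjxl are installed and on your PATH; the helper names are made up. With a JPEG input, `cjxl` transcodes losslessly by default, and asking `djxl` for a `.jpg` output should give the original file back.

```python
import hashlib
import subprocess
from pathlib import Path


def transcode_commands(jpeg, jxl, restored):
    """Command lines for JPEG -> JXL and JXL -> JPEG using cjxl/djxl."""
    return [
        ["cjxl", str(jpeg), str(jxl)],      # JPEG input: lossless transcode by default
        ["djxl", str(jxl), str(restored)],  # .jpg output: reconstruct the original JPEG
    ]


def file_digest(path):
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def roundtrip_is_lossless(jpeg, jxl, restored):
    """Run both commands and check the restored file is byte-identical."""
    for cmd in transcode_commands(jpeg, jxl, restored):
        subprocess.run(cmd, check=True)
    return file_digest(jpeg) == file_digest(restored)
```

Comparing the sizes of the `.jpg` and `.jxl` files afterwards shows how much space the recompression saved for that particular image.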

This might help: Image format comparison table

yote_zip ,

I think you forgot a pretty crucial point, that it is also royalty free.

I'll go back and add it - there's a lot of great stuff that I didn't mention just for brevity. The biggest royalty concern is HEIC atm, which is basically a nonstarter. I'm not sure how the licensing on the other free formats compares against JXL.

I wonder how the Chrome team managed to test it so poorly they claimed it wasn’t worth it? Just the versatility alone should make it a no-brainer.

Make no mistake, it was a political killing. They didn't kill it because of perceived performance, they killed it ahead of their public benchmarks because of "lack of interest". Their cited lack of interest was determined after only a few months of the format going live behind opt-in experimental flags, and once they made their original decision, just about every large tech company spoke up in favor of JXL against Google's decision on their bugtracker, including Adobe, Intel, Nvidia, Facebook, Shopify, and Flickr. Google still plugged their ears and pretended no one was interested.

Google is trying to push WebP (2.0?) and AVIF, and using their browser marketshare to kill JXL and make that happen. Why they went through all this trouble to kill a format that they themselves co-developed, I really have no idea. I follow JXL relatively closely and I still am not 100% sure why they went through with this. All I know is that the decision was politically-motivated, and without applying political/ecosystem pressure they're not going to change their minds with data.

Edit: by the way, the last few comments still trickling in on that bugtracker are a great read, especially . reads so similarly to my comment I'm surprised I didn't write it, haha.

yote_zip ,

You don't need to use the high compression profiles to get good performance though. If you have a usecase where you are resource limited you should stick to effort levels 5-7 for very little loss in quality, or even 3-4 for lightning quick speed (the default is effort 7). Reference this benchmark against AVIF for effort values vs. speed (SSIMULACRA 2 is a deterministic psychovisual metric - higher is better).

Also, an important consideration in this realm is that JXL makes really clever use of variable-DCT (how big a chunk is) and adaptive quantization (what quality should be used for that chunk), allowing "quality levels" that you specify to be much more visually consistent across every image, instead of other codecs that make some images look bad at quality level 90 and some images look good at level 70. This allows you to select a consistent quality level and lower your encoding effort to compensate, instead of needing to always drive a high quality+effort level to account for every region in a picture looking good.

(If you want a slightly deeper dive into JXL's performance, this is a concise post on various metrics)

yote_zip ,

No one uses hardware decoding for images - it's just not a good fit for the reality of how we use images. Images are small and easy to decode, whereas starting up a hardware decoder takes a non-trivial amount of time. Additionally, GPU decoders only work single-threaded, so each image would have to be decoded one by one, instead of all at once like with CPU decoding. This was already attempted with VP8/WebP and they gave up trying to make it any good. Videos are good candidates for hardware decoding since they're large and you're only looking at one at a time.

If you have benchmarks or some proof showing otherwise by all means post here.

yote_zip ,

JXL has been ready for practical use for a while now - the only place where JXL support is still missing is browsers (due to Google's politically-motivated removal from Chromium). I'm not sure if anyone has tried using JXL with ML, but it's certainly ready to be tested right now. IMO JXL has been ready since the libjxl 0.7.0 release, which happened in September 2022. They're still working towards a 1.0 but just about every image-related application has built-in support for JXL already and it can more or less be considered ready.

haven’t seen any major downsides, besides less optimal performance for very low resolution images

Just to note here, to be precise AVIF starts (barely) winning at low fidelity ranges, not low resolution. Meaning if you want a blurry mess that looks like this, AVIF will compress slightly better (that's an actual AVIF converted to PNG by the way).

At the risk of sounding like sour grapes, this compression advantage doesn't truly matter. This level of compression is almost never used, and even if it was, even drastic relative filesize savings would ultimately amount to bytes/kilobytes in the grand scheme of all images you're serving. It's more impactful to compress large images simply because they are larger. Smaller images are already small and efficiency deltas in a 1kB vs 1.1kB image are meaningless compared to a 600kB vs 800kB image.
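
To put numbers on that, using the two example pairs above (`savings` is just an illustrative helper):

```python
def savings(before_kb, after_kb):
    """Return (absolute saving in kB, relative saving as a fraction)."""
    saved = before_kb - after_kb
    return saved, saved / before_kb


for before, after in ((1.1, 1.0), (800, 600)):
    saved, relative = savings(before, after)
    print(f"{before} kB -> {after} kB: saves {saved:.1f} kB ({relative:.0%})")
# 1.1 kB -> 1.0 kB: saves 0.1 kB (9%)
# 800 kB -> 600 kB: saves 200.0 kB (25%)
```

Even if the tiny image had been cut by 50%, the absolute saving would still be about half a kilobyte.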

I also wonder if the support for progressive loading might be useful for more efficient, low resolution variants of high resolution models. Just store one set of high res images and load them in progressive steps to make smaller data sets. Like, say you have a bunch of 8k images, but you only want to make a website banner based on the model from those 8k res images. I wonder if it’s possible to use the progressive loading support to halt reading in the images at 1k

I'm not fully confident on this aspect but I'm pretty sure that JXL supports more than just traditional progressive decoding - you can actually pull "complete" images out of the bitstream from arbitrary ranges. Meaning you could efficiently store a full range of quality options in just one image, then serve them on the fly.

Any time I see a big feature jump, like better file size, I assume the trade off in another feature negates at least half the benefit. It’s pretty rare, from what I’ve seen, to have improvements on all fronts.

JXL is self-described "alien technology from the future", and it was made by a "dream team" of image engineers who have had a hand in just about every image codec and compression technique from our past. It also benefits from being a real image codec, whereas every recent image format that has gained widespread adoption has been derived from a video codec (WebP, AVIF, HEIC).

The only truly useful thing it doesn't perform best-in-class at is animation encoding (losing to AVIF because it's based on the amazing AV1 video codec), and I would honestly recommend just serving AV1 videos instead, and skipping image formats entirely.

A neutral aspect of JXL is that it does worse in single-core decode speed compared to JPEG (which is disgustingly fast), but JXL can be parallelized whereas JPEG cannot. This is ultimately an advantage for JXL in the general use case where users have at least 4 cores available, but for large-scale distributed processing I imagine this property of JPEG may still give it an edge?

If you're curious about the technical aspects of JXL, I recommend reading their official slidedeck. The nitty-gritty details start at page 59, but the whole thing is a good read.

yote_zip ,

With effort level 7 you should be getting images roughly 2/3 of the size of the original PNG on average (assuming the PNG is already properly optimized). I would try again with at least effort levels 3, 4, 5, and 7. Also consider that PNGs need very expensive CPU time to properly compress them, using a tool like oxipng.

What sort of balance are you looking for with regards to filesize and encode time? At the very least, effort levels 1 through 3 will probably still give you better results than PNG while being ridiculously quick, so there shouldn't be any configuration where PNG is a better choice than JXL with regards to speed.
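
If it helps, here is a rough sketch for finding your own balance point by timing lossless encodes at a few effort levels. It assumes `cjxl` from libjxl is on your PATH; `cjxl_command` and `sweep` are made-up helper names, and `-d 0` requests lossless while `-e` sets the effort.

```python
import subprocess
import time
from pathlib import Path


def cjxl_command(src, dst, effort, distance=0):
    """Build a cjxl command line; distance 0 means lossless."""
    return ["cjxl", str(src), str(dst), "-d", str(distance), "-e", str(effort)]


def sweep(src, efforts=(1, 3, 5, 7), run=subprocess.run):
    """Encode src at each effort level; return (effort, seconds, bytes) rows."""
    rows = []
    for effort in efforts:
        dst = Path(src).with_suffix(f".e{effort}.jxl")
        start = time.perf_counter()
        run(cjxl_command(src, dst, effort), check=True)
        rows.append((effort, time.perf_counter() - start, dst.stat().st_size))
    return rows
```

Calling `sweep("image.png")` writes `image.e1.jxl`, `image.e3.jxl`, and so on next to the source, so you can compare each row against the PNG's own size and whatever time your PNG optimizer takes.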

yote_zip OP ,

It actually happens a decent amount, e.g. your boss is required to sign in each shipment, but they're lazy so they tell you to handle the shipments and sign for them.
