AlexanderESmith

@AlexanderESmith@social.alexanderesmith.com

This profile is from a federated server and may be incomplete. Browse more on the original instance.

AlexanderESmith ,

85453462. Burned into my memory forever.

AlexanderESmith ,

Out of nowhere, I logged in like a month ago. Everything was still there (at least, my friends list was). Hadn't logged in since the early 2000s. A few weeks later, I heard it was being shuttered. Weird.

US Record Labels Sue AI Music Generators Suno and Udio for Copyright Infringement (www.wired.com)

The music industry has officially declared war on Suno and Udio, two of the most prominent AI music generators. A group of music labels including Universal Music Group, Warner Music Group, and Sony Music Group has filed lawsuits in US federal court on Monday morning alleging copyright infringement on a “massive scale.”...

AlexanderESmith ,

We need to prepare for the future where there is no jobs and AI replaced all of them.

You seem to think that the natural extension of this is that everyone who used to have a job continues to flourish, and doesn't die in the gutter because they have no money/shelter/food.

The naivety would be adorable, if it weren't also extremely dangerous and playing directly into rich assholes' plans to bleed everything dry for themselves.

AlexanderESmith ,

Hey, I've been a Linux gamer for many, many years, and before Steam Deck it was exclusively on nVidia hardware (mostly because I also wanted CUDA cores for Blender).

[Thread, post or comment was deleted by the author]

  • AlexanderESmith ,

    To add to this; I've done some corporate work in this area as a systems admin. If something like this comes up (within the context of being a representative of a company that finds out that someone has a domain that we may hold rights to), one of the things I've been asked to do is submit a "Uniform Domain-Name Dispute-Resolution Policy" (UDRP) complaint to ICANN (icann.org - Internet Corporation for Assigned Names and Numbers). They basically regulate domain usage and ownership, among many other things.

    To read about how these complaints work, see; https://www.icann.org/resources/pages/help/dndr/udrp-en

    Read that over while deciding whether you want to use the domain and how you use it. Give particular attention to https://www.icann.org/resources/pages/udrp-rules-2024-02-21-en , section 3-b-ix (titled "Describe, in accordance with the Policy, the grounds on which the complaint is made..."), and its sub-items.

    AlexanderESmith ,

    Not a lawyer: I've never seen it be an issue if whoever's running the site isn't pretending to be something they're not. Take that for what you will.

    AlexanderESmith ,

    LinkedIn is Facebook, if the people you follow could fire you for not being a total brown-nosing boot licker.

    Well, the other option is an unemployable dipshit that needs somewhere to rant, thereby making themselves even less employable.

    AlexanderESmith ,

    Adding to the increased attention: it was Microsoft night at the ballpark, with thousands of fans in attendance with ties to the Redmond-based software giant.

    Honestly, my gut tells me this was a stunt.

    Edit: Yeah, this is BS; You can hear typing in the video. What, they have a hot mic in the booth? And the article straight up calls it a gimmick. Definitely a stunt.

    Hell, the fact that XP is handling an ultra wide display is enough to call bullshit xD

    AlexanderESmith ,

    Yeah, I posted a knee-jerk reaction, then followed up with an edit that says exactly that. Congrats for being able to read.

    AlexanderESmith , (edited )

    There's a whole lot of advice here, and practically none of it is aimed at a beginner. You don't need a reverse proxy or SSL to get started.

    1. Install the OS - You've done this already.
    2. Install some kind of http server - Apache is fine, people recommending anything else are overcomplicating. The package is called either apache2 or httpd, depending on your flavor of Linux.
    3. Put your files in the web root - Usually /var/www/html/. If the file is something like index.html, it'll load as the default page without having to type http://youraddress/index.html
    4. Restart Apache - different across OSes, Google will get you there. Something like systemctl restart httpd, but "systemctl" might be "service", and "httpd" might be "apache2".
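
    Steps 2 through 4 can be sketched as a small script. This is only a sketch: it assumes apt (Debian/Ubuntu) or dnf/yum (Fedora/RHEL-style) as the package manager and /var/www/html as the web root, and the function names are made up for illustration. It prints the commands rather than running them, so you can read them before pasting anything into a root shell.

```shell
#!/usr/bin/env bash
# Sketch of steps 2-4. Assumptions: apt or dnf/yum is the package manager,
# and /var/www/html is the web root (check your distro's defaults).
set -eu

# Map a package manager to the Apache package/service name (apache2 or httpd).
apache_name() {
    case "$1" in
        apt) echo "apache2" ;;
        dnf|yum) echo "httpd" ;;
        *) echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}

# Print the commands for a given package manager instead of running them.
print_setup_commands() {
    local pm="$1"
    local name
    name="$(apache_name "$pm")"
    echo "sudo $pm install -y $name"          # step 2: install the http server
    echo "sudo cp index.html /var/www/html/"  # step 3: put files in the web root
    echo "sudo systemctl restart $name"       # step 4: restart Apache
}

# Default to apt when no package manager is given as the first argument.
print_setup_commands "${1:-apt}"
```

    Run it with no argument for the apt flavor, or pass dnf or yum as the first argument to get the httpd flavor.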

    Once you've done that, you have a computer that will serve your html files when someone hits http://[yourIP]/ . At this point, make sure your router/etc is allowing connections on port 80 (the http port), specifically to that one computer. Also, don't allow that computer to connect to the rest of your home network (not getting into a step-by-step here; every home network uses different hardware), because now that the Internet can touch it, it's a target for hackers. If all they can touch is this one computer (start calling it a server), the risk is minimal.

    If you want to point a domain at it, that gets into DNS (the Domain Name System; literally how domains are mapped to IPs so humans don't have to remember them). Cloudflare has guides for this.

    Since it's your home IP, it might change. Either be fine changing your DNS if your IP changes (which usually isn't often if you have a decent connection), or look into something called "dynamic DNS" (just a thing that grabs your current IP and updates your domain to point at it).
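
    The "grab your current IP and compare it to DNS" part of dynamic DNS might look roughly like this. Treat it as a sketch under assumptions: it uses the third-party api.ipify.org service to learn your public IP, it expects dig to be installed, and DDNS_DOMAIN and needs_update are names invented for this example. The actual record update is left out because it depends entirely on your DNS provider.

```shell
#!/usr/bin/env bash
# Sketch: detect when a home IP no longer matches the DNS record.
# Assumptions: curl and dig are installed; api.ipify.org is reachable.
set -eu

# Succeeds (exit 0) only when we know the current IP and it differs from DNS.
needs_update() {
    local current_ip="$1"
    local dns_ip="$2"
    [ -n "$current_ip" ] && [ "$current_ip" != "$dns_ip" ]
}

# Only do network lookups when a domain was provided,
# e.g.: DDNS_DOMAIN=example.com ./check_ip.sh
if [ -n "${DDNS_DOMAIN:-}" ]; then
    current_ip="$(curl -fsS https://api.ipify.org)"
    dns_ip="$(dig +short "$DDNS_DOMAIN" A | head -n 1)"
    if needs_update "$current_ip" "$dns_ip"; then
        echo "IP changed ($dns_ip -> $current_ip): update the record with your DNS provider"
    else
        echo "DNS is up to date ($dns_ip)"
    fi
fi
```

    Something like this could run from cron every few minutes; a real dynamic DNS client does the same check and then calls the provider's update mechanism for you.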

    NOW you can start getting into things like SSL. Remember that SSL doesn't protect you from some guy trying to hack your site/server, it just makes it harder for them to view or change content while it's being sent from the server to a site visitor (or back again, if you have a form).

    Google "add SSL to Apache", you'll find references to "VirtualHost" and a bunch of config lines starting with "SSLCertificate...". You'll also find plenty of references to "LetsEncrypt" (a free SSL provider) and "Certbot" (a program that lets you generate the certificates with LetsEncrypt). Follow those.

    As above with port 80, you'll need to make sure that port 443 (the https port) is allowed for your server through your router. Again, block your server from connecting to the rest of your network. The Internet can touch it, someone will try to hack it. The SSL doesn't save you from this.

    As for reverse proxies, you don't need one unless you're getting into load balancing or header manipulation (which means you'll probably never need one for this project).

    I'm happy to answer follow-up questions.

    AlexanderESmith , (edited )

    My professional experience is in systems administration, cloud architecture, and automation, with considerations for corporate disaster recovery and regular 3rd party audits.

    The short answer to all of your questions boils down to two things;

    1: If you're going to maintain a system, write a script to build it, then use the script (I'll expand this below).

    2: Expect a catastrophic failure. Total loss, server gone. As such; backup all unique or user-generated data regularly, and practice restoring it.

    Okay back to #1; I prefer shell scripts (pick your favorite shell, doesn't matter which), because there are basically zero requirements. Your system will have your preferred shell installed within minutes of existing, there is no possibility that it won't. But why shell? Because then you don't need docker, or python, or a specific version of a specific module/plugin/library/etc.

    So okay, we're gonna write a script. "I should install by hand as I'm taking down notes" right? Hell, "I can write the script as I'm manually installing", "why can't that be my notes?". All totally valid, I do that too. But don't use the manually installed one and call it done. Set the server on fire, make a new one, run the script. If everything works, you didn't forget that "oh right, this thing real quick" requirement. You know your script will bring you from blank OS to working server.
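
    A bare-bones shape for that kind of script, as a sketch only: the step names (install_packages, write_config, start_services) and the BUILD variable are placeholders I made up, and the bodies just echo where your real commands would go.

```shell
#!/usr/bin/env bash
# Skeleton build script: blank OS in, working server out.
# All step names below are hypothetical placeholders.
set -eu

# Run one build step, announcing it first so the log doubles as your notes.
run_step() {
    echo "==> $*"
    "$@"
}

# Replace these bodies with the commands you ran by hand.
install_packages() { echo "install packages here"; }
write_config()     { echo "write config files here"; }
start_services()   { echo "start services here"; }

# Steps run in order; with 'set -e' the first failure stops the build.
build_server() {
    run_step install_packages
    run_step write_config
    run_step start_services
    echo "build complete"
}

# Only build when asked, e.g.: BUILD=1 ./build.sh
if [ "${BUILD:-0}" = "1" ]; then
    build_server
fi
```

    The point of the structure is that every "oh right, this thing real quick" fix becomes another step in the file instead of something you did once by hand and forgot.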

    Once you have those, the worst case scenario is "shit, it's gone... build new server, run script, restore backup". The penalty for critical loss of infrastructure is some downtime. If you want to avoid that, see if you can install the app on two servers, the DB on another two (with replication), and set up a cluster. Worst case (say the whole region is deleted) is the same; make new server, run script, restore backups.
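
    For the "backup regularly, and practice restoring it" half, here is a minimal sketch using plain tar. It assumes your unique data lives in one directory and that a local tarball is an acceptable backup format; the function names are invented for the example, and a database would need its own dump step first.

```shell
#!/usr/bin/env bash
# Sketch: back up a data directory and rehearse the restore.
set -eu

# Back up a directory into a timestamped tarball; prints the archive path.
backup_dir() {
    local src="$1"
    local dest_dir="$2"
    local archive
    archive="$dest_dir/backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    mkdir -p "$dest_dir"
    tar -czf "$archive" -C "$src" .
    echo "$archive"
}

# Restore a tarball into a target directory (created if missing).
restore_dir() {
    local archive="$1"
    local target="$2"
    mkdir -p "$target"
    tar -xzf "$archive" -C "$target"
}

# Practice run: back up, restore into a scratch directory, and compare.
# Fails (non-zero exit) if the restored copy differs from the source.
practice_restore() {
    local src="$1"
    local work archive
    work="$(mktemp -d)"
    archive="$(backup_dir "$src" "$work/backups")"
    restore_dir "$archive" "$work/restored"
    diff -r "$src" "$work/restored"
}
```

    Calling practice_restore on your data directory is the cheap version of the fire drill: if it exits non-zero, you found out before the server was gone.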

    If you really want to get into docker or etc after that, there's no blocker. You know how to build the system "bare metal", all that's left is describing it to docker. Or cloudformation, terraform, etc, etc, etc. I highly recommend doing it with shell first, because A: You learn a lot about the system and B: you're ready to troubleshoot it (if you want to figure out why it failed and try to mitigate it before it happens again, rather than just hitting "reset" every time).

    AlexanderESmith ,

    I just started my mbin instance a week or two ago. When I did, I wrote a guided install script (it's a long story, but I ended up having to blow away the server like 7 times and re-install).

    This might be overkill for your purposes, but it's the kind of thing I have in mind.

    Note1: Sorry, it's kinda sloppy. I need to clean it up before I submit a PR to the mbin devs for possible inclusion in their documentation.
    Note2: It assumes that you're running a single-user instance, and on a single, small server, with no external requirements.

    https://alexanderesmith.com/mbin/install_mbin.bash

    AlexanderESmith ,

    This is fucked.

    I worked in call centers for many years (technical support and sales). I need to hear the customer's tone; ecstatic, livid, and everything in between. I sit on the other end, shut my mouth, and listen to the whole rant, then calmly offer suggestions. Do they scream some more? Maybe. Do I need to take it personally? Of course not.

    It drives me fucking crazy when some dipshit customer service rep hears one swear word (not even directed at them, like "I hate this fuckin' thing", not "you're a fuckin' dumbass") and starts in on the "if you keep swearing at me, I'll end the call". Grow up, you work in a service industry, and your company probably fucked up.

    My favorite calls were the ones where someone called to cancel and tore up their voice yelling about all the reasons our product was garbage. Very, very roughly, about 15% of the time there was nothing I could do (even if I fixed the problem, they had lost faith and would get their money back, or sue trying, so I just refunded and moved on). Another 25% was me fixing the problem and offering a credit because we fucked up. About half the time, it was something stupid and simple and they got their problem solved, and the rest of the time it was some absolutely crazy broken shit that had me working with someone two tiers above me for a few hours fixing it (for everyone, not just that caller), and then the customer was so happy they renewed everything for a year because they knew they were gonna get great support.

    I loved those calls. They were the reason I kept showing up to work. I learned a ton in those jobs, and my favorite thing was hearing someone go from completely apoplectic to surprised and elated that everything was fixed.

    AlexanderESmith ,

    There is /kbin which seems down all the time and its fork MBin which seems to have a good community but is written in PHP which I try to avoid.

    Can you expand on the reasoning for avoiding PHP? I get avoiding Java; the JRE is a disaster, and a resource hog.

    AlexanderESmith ,

    I'll grant that PHP is set up to allow some super shitty code, but in fairness to the language; WordPress is a dumpster fire (made worse by endless awful plugins). That's compounded by its ubiquity, so it's a massive target.

    I just set up mbin as a single-user instance, and other than a bug I found (that they fixed live with me, in chat, including PRs), it's been awesome.

    I hope your instance continues to work well for you 👍

    AlexanderESmith ,

    I might have missed it, but it doesn't look like their site accepts payment data, or has a login of any kind.

    Why would the lack of SSL concern you?

    AlexanderESmith ,

    You'll waste more time trying to figure out how to do this than it would take to move a monitor and keyboard to the server, do the install, and plug the monitor and keyboard back into your main computer. Once the server is up, you can administer it over the network via ssh.

    AlexanderESmith ,

    Agreed on all counts.

    My reply initially had a "if you had a fleet of these things..." addendum, but OP's post read (to me) as though he was converting commodity hardware into a makeshift home server, so I removed it because it was almost certainly not relevant.

    AI Loophole #1; Your GitHub README.md (lemmy.world)

    I used to be the Security Team Lead for Web Applications at one of the largest government data centers in the world but now I do mostly "source available" security mainly focusing on BSD. I'm on GitHub but I run a self-hosted Gogs (which gitea came from) git repo at Quadhelion Engineering Dev....

    AlexanderESmith ,

    It's not paranoia if you have proof that they're stealing your content without permission or compensation.

    You come off as an AI bro apologist. What they're doing isn't okay.

    AlexanderESmith ,

    I'm not quite sure whose argument you're making here. It reads like you agree with OP and me (e.g. "LLMs shouldn't be using other people's content without permission", et al).

    But you called OP paranoid... I assumed because you thought OP thought their content was being used without their permission. And it's extremely clear that this is what is happening...

    What am I missing?

    AlexanderESmith ,

    I agree that their replies are a little... over the top. That's all kind of a distraction from the main topic though, isn't it? Do we really need to be rendering armchair diagnoses about someone we know very little about?

    I mean, if I posted a legitimate concern - with evidence - and I was dog-piled with a bunch of responses that I was a nutter, I'd probably go on the defensive too. Some people don't know how to handle criticism or stressful interactions, it doesn't mean we should necessarily write them (or their verified concerns) off.

    AlexanderESmith ,

    Eh. This is not a new argument, and not the first evidence of it. I don't think you're gonna be high on their list of retaliation targets, if you register at all (to say nothing of the low-to-middling reach of the fediverse in general).

    Hell, just look at photographers/painters v. image generators, or the novel/article/technical authors v. ... practically all LLMs really, or any other of a dozen major stories about "AI" absorbing content and spitting out huge chunks of essentially unmodified code/writing/images.

    AlexanderESmith ,

    "The world seeing [their] work" is not equal to "Some random company selling access to their regurgitated content, used without permission after explicitly attempting to block it".

    LLMs and image generators that weren't trained on content wholly owned by the group creating the model are theft.

    Not saying LLMs and image generators are innately thievery. It's like the whole "illegal mp3" argument. mp3s are just files with compressed audio. If they contain copyrighted work, and were obtained illegitimately, THEN they're thievery. Same with content generators.

    AlexanderESmith ,

    you got some criticism and now you’re saying everyone else is a bot or has an agenda

    Please look up ad hominem, and stop doing it. Yes, their responses are a distraction from the topic at hand, but so were the random posts calling OP paranoid. I'd have been on the defensive too.

    [Our company] publish[s] open source work ... anyone is free to use it for any purpose, AI training included

    Great, I hope this makes the models better. But you made that decision. OP clearly didn't. In fact, they attempted to use several methods to explicitly block it, and the model trainers did it anyway.

    I think that the anti-AI hysteria is stupid virtue signaling for luddites

    Many loudly outspoken figures against the use of stolen data for the training of generative models work in the tech industry, myself included (I've been in the industry for over two decades). We're far from Luddites.

    LLMs are here

    I've heard this used as a justification for using them, and reasonable people can discuss the merits of the technology in various contexts. However, this is not a justification for defending the blatant theft of content to train the models.

    whether or not they train on your random project isn’t going to affect them in any meaningful way

    And yet, they did it while ignoring explicit instructions to the contrary.

    there are more than enough fully open source works to train on

    I agree, and model trainers should use that content, instead of whatever they happen to grab off every site they happen to scrape.

    Better to have your work included so that the LLM can recommend it to people or answer questions about it

    I agree if you give permission for model trainers to do so. That's not what happened here.

    AlexanderESmith ,

    In fairness, a lot of the more exceptional engineers I've worked with couldn't write their way out of a wet paper bag.

    On top of that, even great technical writers are often bad at picking - or sticking with - an appropriate target audience.

    AlexanderESmith ,

    This "fair use" argument is excellent if used specifically in the context of "education, not commercialization". Best one I've seen yet, actually.

    The only problem is that perplexity.ai isn't marketing itself as educational, or as a commentary on the work, or as parody. They tout themselves as a search engine. They also have paid "pro" and "enterprise" plans. Do you think they're specifically contextualizing their training data based on which user is asking the question? I absolutely do not.

    AlexanderESmith ,

    "Your honor, we can use whatever data we want because model training is probably fair use, or whatever".

    I don't know what's worse, the fact that you think creators don't have the right to dictate how their works are used, or that you apparently have no idea what fair use is.

    This might help; https://copyright.gov/fair-use/

    AlexanderESmith ,

    The MPAA and music industry would beg to differ. As would the US courts, as well as any court in a country we share copyright agreements with.

    Consider that if a movie uses a scene from another movie without permission, or a music producer uses a melody without permission, or either of them use too much of an existing song without permission, everyone sues everyone else, and they win.

    Consider also that if a large corporation uses an individual's content without permission, we have documented cases of the individual suing, and winning (or settling).

    Some other facts to consider;

    • An mp3 file is not inherently illegal. Nor is a torrent file/tracker/download.
    • If the mp3 file contains audio you don't own the rights to, it is illegal, same for the torrent you used to download/distribute it. In the eyes of the law, it's theft.
    • A trained LLM or image generation model is not inherently theft, if you only use open-source or licensed/owned content to train it
    • (at odds in our conversation) What of a model that was trained with content the trainer didn't own?

    In the mp3 example, it's largely an individual stealing from a large company. On the Internet, this is frequently cheered as the user "sticking it to the man" (unless, of course, you're an indie creator who can't support yourself because everyone's downloading your content for free). Discussions regarding the morality of this have been had - and will be had - for a long time, but its legality is a settled matter: It's not legal.

    In the case of "AI" models, it's large companies stealing from a huge number of individuals who have no support or established recourse.

    You're suggesting that it's fine because, essentially, the creators haven't lost anything. This makes it extremely clear to me that you've never attempted to support yourself as a creator (and I suspect you haven't created anything of meaning in the public domain either).

    I guess what it comes down to is this; If creators can be stolen from without consequence, what incentive does anyone have to create anything? Are you going to work your 40-60 hours a week, then come home and work another 20-40 hours to create something for no personal benefit other than the act of creation? Truly, some people will. Most won't.

    AlexanderESmith ,

    I already replied to the essence of this in my reply to your other post about how "illegal downloads aren't theft because it's a copy", but I'll mention here that this is even more evidence that you aren't a creator, and I suggest that your opinions on this subject aren't relevant, and you should avoid subjecting other people to them.

    AlexanderESmith ,

    Agreed on all points, except my personal interpretation of "fair use" specific to the case of generative models.

    You call out "doesn't replace the original work". Is that not how you see an LLM Q/A bot replacing a user going to a git repo for established examples, or a website for an article (generating page views, subscriptions, ad revenue), or similar? Why would anyone go to the source materials if they're getting their answer from the bot?

    This is practically the same as when Google started showing articles in AMP, and not bringing people to the original website, is it not?

    AlexanderESmith ,

    "evidence suggests that you probably aren't a creator"
    "As a result, I suggest that your opinions aren't relevant"

    Aside from the fact that these are not character attacks, I encourage you to refute my assumptions. Otherwise, my points will stand on their own.

    AlexanderESmith ,

    The first sentence directly addresses your comment "it's not theft" with "the law says it is".

    The rest of the post attempts to explain why it is so and some of the moral or ethical discussions surrounding some examples.

    AlexanderESmith ,

    First, a chat bot is not an API. Second, they were talking about the formatting and delivery method of the data, not the content.

    Regarding the output of the model: Some repos are entirely READMEs by their nature. No code, just documentation and walkthroughs. Notwithstanding that; If I set a flag that says "don't use my data" and they use it anyway, that's theft, even if it's only one file, even if the file is just a description of the code. That's my work, not yours. You don't get to use it however you want, unless I specifically note that it's public domain (or you use it and follow the license, like attributing me, or linking to the repo, etc).

    As to the difference between a bot and a human (re: stack overflow)? The former is a representative of a company (automation or not, whether it's a bot or a page on their corporate site), the latter is a person relating experience and opinion. The legal difference is that one is using the data commercially, and the other is just a person in the world, answering another person's question for no reason other than a desire to be helpful (and if they're decent, attributing the source instead of claiming that they're generating wisdom on their own).

    That last parenthetical used to be called plagiarism, by the way.

    AlexanderESmith ,

    I'm currently avoiding Apple silicon until more apps are compiled to work on it. My last bad experience with this was trying to run VirtualBox on the host and Ubuntu as a guest, and it ran slow as crap because some part of VirtualBox wasn't ready for Apple silicon yet.

    Disclaimer: I generally avoid Apple like the plague, my comment and experience are specific to a job that really wanted me to use a macbook in my role as a Linux systems admin. My specific complaint may well have been addressed literally years ago by now.

    AlexanderESmith ,

    I was wondering what that ominous music was when I woke up this morning

    AlexanderESmith ,

    I always use the browser versions (partly because I don't like installing things, and partly because I run Linux), so it pretty much always shows me away. And I don't care.
