Yeah, I heard that, too. Consider that people who don't like tech may not have very reliable knowledge of tech. Regardless, OAI would appreciate your business.
For text, AI training AI wouldn't be all that great for giving data sets a little poison ivy rubdown, because at the end of the day, the message is still moderated by a non-bot. I think a better way would be to write more unconventionally but heavily contextually, so that if specific texts are ripped and tossed into the bot blender, they'll make no sense without the context alongside them.
Slang, edge-case wording, and verbing non-verbs would likely do a lot of heavy lifting in that department.
LLMs used for corporate communications - automatically generated complaint responses and the like - usually have swearing disabled, so if you want to fuck up their shit, be sure to express yourself with as many fucking swears as possible. Let's get that shit into those cunts' language models ASAP.
AI trainers do a lot of work filtering and reformatting the training data. Often that's the most expensive part. There's a lot of synthetic data used these days too, reprocessed by other AIs.
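To give a rough idea of what that filtering can look like, here's a minimal sketch, not how any particular lab actually does it. The `clean_corpus` function, the `BLOCKLIST` words, and the `min_words` threshold are all made up for illustration. Real pipelines are far more elaborate (classifiers, fuzzy dedup, and so on).

```python
import re

# Hypothetical blocklist for illustration only; real pipelines use
# much larger curated lists or trained classifiers.
BLOCKLIST = {"damn", "hell"}

def clean_corpus(comments, min_words=3):
    """Normalize whitespace, then drop short, blocklisted, or duplicate comments."""
    seen = set()
    kept = []
    for text in comments:
        normalized = " ".join(text.split())      # collapse runs of whitespace
        key = normalized.lower()                 # case-insensitive dedup key
        words = re.findall(r"[a-z']+", key)
        if len(words) < min_words:               # too short to be useful
            continue
        if any(w in BLOCKLIST for w in words):   # contains a blocked word
            continue
        if key in seen:                          # exact duplicate after normalizing
            continue
        seen.add(key)
        kept.append(normalized)
    return kept

sample = [
    "This is a   fine comment.",
    "this is a fine comment.",
    "lol",
    "Well damn that is bad",
]
print(clean_corpus(sample))
```

Even a crude word filter like this would throw out a sweary comment entirely, which is why the cleaning step matters so much for what ends up in a model.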
spez says that's how he got reddit off the ground in the first place: faking content/engagement (well, genuinely engaging with his account(s?), but essentially shouting into the void and hoping enough people heard and wanted to stick around).
with a RedditUserBot trained on reddit users, you might be able to fake another decade of growth.
Yes, but I did not mean retroactively. Nor did I mean only on Reddit, by the way. However, making money from already published content is not what I consented to when I joined Reddit like 15 years ago.
You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:
When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.
you agree that by posting messages, uploading files, inputting data, or engaging in any other form of communication with or through the Website, you grant us a royalty-free, perpetual, non-exclusive, unrestricted, worldwide license to use, reproduce, modify, adapt, translate, enhance, transmit, distribute, publicly perform, display, or sublicense any such communication in any medium (now in existence or hereinafter developed) and for any purpose, including commercial purposes, and to authorize others to do so.
Haven't dug up anything earlier than this, do you know of any?
Basically, you gave Reddit your approval long ago.
Yes I did, but it is not clear whether these are enforceable in court, given that they make us read those multi-page agreements that most people skip. Moreover, AI like today's did not exist, and one can easily argue that the agreement does not cover data use for AI like ChatGPT, since neither side understood the implications. It is like how owning nukes is not covered by the Second Amendment.
The important thing here IMO is not so much the enforceability as the intent. It was always obvious that Reddit would do whatever they wanted with the stuff we published there because they said they would do whatever they wanted with the stuff we published there. Personally, I knew this and just shrugged because it's no skin off my back if they do whatever they want with the stuff I published there - I was having fun posting, which was my goal. If they figured out some way to make those posts valuable then bully for them. They weren't otherwise valuable to me so it costs me nothing.
It's the same here on the Fediverse. When I post this stuff I'm tossing it out into the ether. It's on an open protocol intended to broadcast my comments to any compatible instances, so even if there isn't some literal terms of service that I signed that says "this content may show up on Threads or wherever" I know that it might show up on Threads or wherever. If I was truly fundamentally opposed to that then I wouldn't post.
As you could have guessed, I am on the same page with one exception (or addition) - I want my content to be usable for free for AI training. My objection to the Reddit agreement is that they want to paywall information needed for future progress.
Fortunately they may not really be able to. Reddit's comments and submissions are available here, and since this includes deleted content as well as the stuff that users have later edited away with scripts it may even be a better resource than what Reddit is offering itself. You'd need to train your AI in a legally permissive environment, of course, but there's places like that around the world and this is actually something that would advantage the "little guys" since they aren't as easy to target.