“You may not instantly see why I bring the subject up, but that is because my mind works so phenomenally fast, and I am at a rough estimate thirty billion times more intelligent than you. Let me give you an example. Think of a number, any number.”
“Er, five,” said the mattress.
“Wrong,” said Marvin. “You see?”
― Douglas Adams, Life, the Universe and Everything
If you discount the pop-culture numbers (for us 7, 42, and 69), it's the number most often chosen by people when you ask them for a random number between 1 and 100. To a lot of people it just seems like the most random one to choose. Veritasium just did a video about it.
I'm curious about that too. Something is twisting the weights for 57 fairly strongly in the model, but I'm not sure what. Maybe it's been trained on a bunch of old Heinz 57 varieties marketing.
Unsolicited fact: Heinz picked the number 57 at random. It just sounded like good marketing at a time when things were generally marketed as "tonic #4" and the like.
You don’t even need a calculator for a quick estimate. Round to the nearest ten: 3x7 = 21, and 21x37 is roughly 20x40 = 800, which is close to the actual number, 777.
I don't like the inclusion of 37%. The real value is 1/e, which isn't even exactly 37% (it's about 36.8%); it only becomes 37% through pretty arbitrary rounding. Veritasium videos are usually OK, but this one is pretty meh.
Probably just because it's prime. It's just that humans are terrible at understanding the concept of randomness. A study by Theodore P. Hill showed that when tasked to pick a random number between 1 and 10, almost a third of the subjects (n was over 8500) picked 7. 10 was the least picked number (if you ditch the few idiots that picked 0).
I remember watching a lecture about probability, and the professor said that only quantum processes are really random; the rest of the things that we call random just reflect the human inability to measure the variables that affect the outcome. I'm an actuary, and it changed my perspective on how I see and study random processes, and it made me think about ways to influence the outcome of random processes.
Even quantum just appears random, I think. It's beyond our scope of perspective; it works in multiple dimensions. We only see part of the process.
That's my guess though it could be totally wrong
it's a matter of interpretation, but generally the consensus is that quantum measurements are truly probabilistic (random), Bell proved that there can't be any hidden variables that influence the outcome
Didn't Bell just put that up as a theory and it got proven somewhat recently by other researchers? The 2022 physics Nobel Prize was about disproving hidden variables and they titled their finding with the catchy phrase "the universe is not locally real".
No problem! Interpretations of quantum mechanics are also still very much under discussion, and Bell's inequality only says that there are no local hidden variables. While QM very accurately describes observations so far, it's by no means solved, and there's a good chance that a new theory will upend much of it in the future
Interpretation for sure. To me, Bell's theory and then its being proven and winning a Nobel Prize only proves even more that we really don't understand the world around us and only perceive what we need to survive. And that maybe we should be less standoffish to ideas that change our current paradigm, because we obviously have a lot to learn.
Bell's inequality is a statement about math: it gives an inequality that could only be violated if there were no local hidden variables (read: if measurements were truly random). That was a statement of math, which is rigorously provable. It took experimental confirmation, but we can now say with high confidence that there are no local hidden variables (i.e. there is no hidden information that we simply cannot measure; instead the outcome is only decided the moment you measure).
Global hidden variables are still an option, but they would require much of the rest of physics to be rewritten
...which is kind of a hilarious tautology, because "quantum processes" are by definition "processes that we are unable to decompose into more basic parts".
The moment we learn about some more fundamental processes being the reason for a given process, it stops being "quantum" and the new ones become "it".
My art professor wrote a book about famous artists and thinkers dying at 37: Raffaello, Parmigianino, Valentin de Boulogne, Cantarini, Watteau, Van Gogh, Toulouse-Lautrec, Tancredi, Gnoli, Manai, Majakovskij, Rimbaud, Byron, Mozart, Robespierre
What you've described would be like looking at a chart of various fluid boiling points at atmospheric pressure and being like "Wow, water boils at 100 C!" It would only be interesting if that somehow weren't the case.
Where is the "Wow!" in this post? It states a fact, like "Water boils at 100C under 1 atm", and shows that the student (ChatGPT) has correctly reproduced the experiment.
Why do you think schools keep teaching that "Water boils at 100C under 1 atm"? If it's so obvious, should they stop putting it on the test and failing those who say it boils at "69C, giggity"?
Derek feeling the need to comment that the bias in the training data correlates with the bias of the corrected output of a commercial product just seemed really bizarre to me. Maybe it's got the same appeal as a zoo or something, I never really got into watching animals be animals in a zoo.
Hm? Watching animals be animals at a zoo is a way better sampling of how animals are animals than, for example, watching that wildlife "documentary" where they'd throw lemmings off a cliff "for dramatic effect" (a "commercially corrected bias"?).
In this case, the "corrected output" is just 42, not 37, but as the temperature increases on the Y axis, we get a glimpse of internal biases, which actually let through other patterns of the training data, like the 37.
"we don't need to prove the 2020 election was stolen, it's implied because trump had bigger crowds at his rallies!" -90% of trump supporters
Another good example is the Monty Hall "paradox" where 99% of people are going to incorrectly tell you the chance is 50% because they took math and that's how it works.
Just because something seems obvious to you doesn't mean it is correct. Always a good idea to test your hypothesis.
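For anyone who wants to test the Monty Hall claim rather than argue about it, here is a minimal simulation sketch. It assumes the standard rules (three doors, the host always opens a goat door that isn't your pick); the function names and the trial count are made up for illustration.

```python
import random

def play_monty_hall(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first_pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != first_pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won with a fixed strategy (seeded for repeatability)."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(trials))
    return wins / trials

if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Staying should land near 1/3 and switching near 2/3, not 50/50.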
Trump Rallies would be a really stupid sample data set for American voters. A crowd of 10,000 people means fuck all compared to 158,429,631. If OpenAI has been training their models on such a small pool then I'd call them absolute morons.
A crowd of 10,000 people means fuck all compared to 158,429,631.
I agree that it would be a bad data set, but not because it is too small. That size would actually give you a pretty good result if it was sufficiently random. Which is, of course, the problem.
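To put a rough number on "that size would actually give you a pretty good result": a sketch of the usual 95% margin-of-error formula for a simple random sample. This assumes a truly random sample, a population far larger than the sample, and the normal approximation; the function name is hypothetical.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample.

    proportion=0.5 is the worst case; z=1.96 is the 95% normal quantile.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, round(100 * margin_of_error(n), 2), "percentage points")
```

Under those assumptions, 10,000 random respondents gets you to about one percentage point either way. None of that helps when the sample is self-selected, which is the actual problem with a rally crowd.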
But you're missing the point: just because something is obvious to you does not mean it's actually true. The model could be trained in a way to not be biased by our number choice, but to actually be pseudo-random. Is it surprising that it would turn out this way? No. But to think your assumption doesn't need to be proven, in such a case, is almost equivalent to thinking a Trump rally is a good data sample for determining the opinion of the general public.
Yes, but it's significant because the prompt was to choose a number. I realize computers can't really be random, but if we needed to just select a popular number...we can already do that!
There are devices that measure radioactive decay for operations where truly random numbers are very important. Or something like that, I am not an expert, sorry.
Interesting. As I understand it, pure computing (not sensors recording external data) is incapable of generating truly random numbers. But I'm obviously not an expert either!
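A small sketch of that difference in Python, in case it helps. A seeded pseudo-random generator is pure computation and repeats itself exactly, while the standard library's `secrets` module asks the operating system for randomness, which typically mixes in noise from outside the program. The helper names here are made up for illustration.

```python
import random
import secrets

def seeded_rolls(seed, count=5):
    """Pseudo-random: the same seed always reproduces the same sequence."""
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(count)]

def os_entropy_rolls(count=5):
    """Drawn from the operating system's randomness source via secrets."""
    return [secrets.randbelow(100) + 1 for _ in range(count)]

if __name__ == "__main__":
    print(seeded_rolls(42))    # identical on every call with seed 42
    print(seeded_rolls(42))
    print(os_entropy_rolls())  # not reproducible from inside the program
```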
I've been using "Perfect Passwords" for years, which apparently generate nearly random passwords from server noise, but he admits it's still not truly 100% random...
I'm not a hundred percent sure, but afaik it has to do with how random the output of the GPT model will be. At 0 it will always pick the most probable next continuation of a piece of text according to its own prediction. The higher the temperature, the more chance there is for less probable outputs to get picked. So it's most likely to pick 42, but as the temperature increases you see the chance of (according to the model) less likely numbers increase.
This is how temperature works in the softmax function, which is often used in deep learning.
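For the curious, a minimal sketch of softmax with temperature. The logits below are made-up numbers standing in for the model's scores for "42", "37" and "57"; they are not real model values, and this is not how any particular vendor implements it.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as 'pick the max'")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    # Hypothetical logits for the answers "42", "37" and "57".
    logits = [5.0, 3.0, 2.0]
    for t in (0.2, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

Low temperatures pile almost all of the probability onto the top answer; higher ones spread it out over the runners-up.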
I mean... they didn't specify it had to be random (or even uniform)? But yeah, it's a good showcase of how GPT acquired the same biases as people, from people..
Reminds me of my previous job where our LLM was grading things too high. The AI "engineer" adjusted the prompt to tell the LLM that the average output should be 3. I had a hard time explaining that wouldn't do anything at all, because all the chats were independent events.
Anyways, I quit that place and the project completely derailed.
In his video, he shows that the more common answers are actually 42 and 69.
I discard them because they're picked for a reason rather than by a human genuinely trying to pick a random number, but they're still way more common than 37.
That's because they asked the internet for those polls. The internet thinks they're funny by picking the meme numbers. So I can understand why they chose to omit those numbers from their results.
HA, funny that this comes up. DND Beyond doesn't have a d100, so I opened my ChatGPT sub and had it roll a d100 for me a few times so I could use my magic beans properly.
But why use ChatGPT for that? Why not a DuckDuckGo action? I just don't understand why we're asking an LLM, whose goal is consistency, not randomness, to do random.
Yup! Also one has to mind the order in which one rolls the dice. Since 10 and 5 could be either 05 or 50. As a bonus, if you roll them in order of "tens" to "ones", getting 10 on the first dice has added suspense since the latter dice determines if it is going to count as a low roll of 0X (by rolling 1-9 on the next dice X) or if it is going to be a max roll of 100 (by rolling another 10).
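The two-dice rule above can be sketched in a few lines. This assumes each d10 shows faces 1 to 10, that a 10 counts as the digit 0, and that the tens die is read first; the function names are made up.

```python
import random

def d100_from_faces(tens_face, ones_face):
    """Combine two d10 faces (1-10 each), tens die first, into a 1-100 result."""
    tens = tens_face % 10  # a face of 10 counts as the digit 0
    ones = ones_face % 10
    value = 10 * tens + ones
    # Double 10 (digits 0 and 0) is the max roll of 100, not 0.
    return 100 if value == 0 else value

def roll_d100(rng=random):
    """Roll two d10s and combine them; rng can be a seeded random.Random."""
    return d100_from_faces(rng.randint(1, 10), rng.randint(1, 10))

if __name__ == "__main__":
    print(d100_from_faces(10, 5))   # tens die shows 10, ones die shows 5: 05
    print(d100_from_faces(5, 10))   # tens die shows 5, ones die shows 10: 50
    print(d100_from_faces(10, 10))  # double 10: 100
    print(roll_d100())
```

Every result from 1 to 100 comes from exactly one pair of faces, so the roll stays uniform.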
LMs aren't thinking, aren't inventing, they are predicting what is supposed to be answered next, so it's expected that they will produce the same results every time
This graph actually shows a little more about what's happening with the randomness or "temperature" of the LLM.
It's actually predicting the probability of every word (token) it knows of coming next, all at once.
The temperature then says how random it should be when picking from that list of probable next words. A temperature of 0 means it always picks the most likely next word, which in this case ends up being 42.
As the temperature increases, it gets more random (but you can see it still isn't a perfect random distribution with a higher temperature value)
They add some fuzziness to it so it doesn't give the exact same result. Say one gets a score of 90, another 85, and other 80. The 90 will be picked more often, but they sometimes let it pick the 85, or even the 80. It's perfectly expected, and you can see that result here with 42 being very common, but then a few others being fairly common, and most being extremely uncommon.
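Here is a rough sketch of that picking step using the 90 / 85 / 80 scores from the example. Treating those scores as softmax inputs and the specific temperature values are assumptions made for illustration; real models work with far more candidates and their own scales.

```python
import math
import random
from collections import Counter

def pick(scores, temperature, rng):
    """Return the index of the chosen candidate.

    Temperature 0 always takes the top score; higher (positive)
    temperatures let lower-scored candidates through more often.
    """
    if temperature == 0:
        return max(range(len(scores)), key=lambda i: scores[i])
    m = max(scores)
    # Unnormalised softmax weights; random.choices normalises them.
    weights = [math.exp((s - m) / temperature) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]

if __name__ == "__main__":
    scores = [90, 85, 80]
    rng = random.Random(0)
    for t in (0, 5, 50):
        counts = Counter(pick(scores, t, rng) for _ in range(10_000))
        print(t, [counts[i] for i in range(len(scores))])
```

At temperature 0 only the 90 is ever picked; as the temperature goes up the counts even out, but the 90 stays in front.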
I'm curious, is there actually so many 42's in the system? (more than 69 sounds unlikely)
What if the LLM is getting tripped up because 42 is always referred to as the answer to "the Ultimate Question of Life, the Universe, and Everything".
So you ask it a question like "give a number between 1-100", and it answers 42 because that's the answer to "Everything", according to its training data.
Something similar happened to Gemini. Google discouraged Gemini from giving unsafe advice because it's unethical. Then Gemini refused to answer questions about C++ because it's considered "unsafe" (referring to memory management). But Gemini thinks C++ is "unsafe" (the normal meaning), therefore it's unethical. It's like those jailbreak tricks but from its own training set.
I’m curious, is there actually so many 42’s in the system?
Sort of, it's not actually picking a random number. It does not know what "random" means. It is analyzing the number of times the question "pick a random number" was asked and what the most common responses to that question looked like.
It's a human thing, though. This is just more evidence of LLMs' garbage in, garbage out problem: human biases are present in a system that people want to claim doesn't have them.
People do mention Veritasium, though he doesn't give any significant explanation of the phenomenon.
I still wonder about 47. In Veritasium plots, all these numbers provide a peak, but not 47. I recall from my childhood that I indeed used to notice that number everywhere, but idk why.