Large Language Models are Evidence Against the Fact Value Distinction
Embodied facts are expressions of value
The fact that large language models work as well as they do is evidence against one of the cornerstones of the mythology that dominates much of the world today: the fact/value distinction.
The idea that facts and values are fundamentally separate concepts has been the basis of beliefs like utilitarianism and much of economics. There’s a reason that distinction became so widespread: it is quite valuable. In many cases, it is very effective to try and separate “what is actually true here” from “which outcome do we want.” You get better outcomes that way. It is also far easier to agree on what is true than it is on what is good. The fact that there’s no easy way to agree on what is good seems to be evidence that there is no truth about what ‘good’ means.
Because this “fact/value” distinction worked so well in so many different domains, lots of people came to think that facts were a fundamental part of reality, whereas value was merely a cognitive operation, a side effect of human brains. As the philosopher Hume is usually paraphrased, “You can’t get an ought from an is.”
The problem here is that Hume is factually correct, but still fundamentally confused.
Imagine a man looking up at the stars, saying, “Isn’t it good how orderly and predictable those stars are? Isn’t it beautiful that we can model them well enough to predict where they will go?”
Then this man sighs and says, “It is a shame we shall never get from those stars down to the Earth.”
The crazy thing is, he’s right. You’re never going to get from the stars to the earth. You haven’t begun in the stars: you’ve started on the earth. If you try to tell him this, he insists that you show where, exactly, the stars tell us that he’s on the earth. The fact that his feet are there is irrelevant to him, because he expects everything to make sense in terms of the stars.
This man has so fallen in love with the stars, with their orderliness and the certainty it gives him, that he’s unable to recognize where he is currently standing — because it’s messy and humiliating. He’s like the drunk looking for his car keys under the street light, because that’s the only place he can see.
He tries to rely only on reason to protect himself from the messiness of life, and this approach has made it impossible to recognize that facts come from values.
Large language models actually make this relationship clearer. The fact that they are quite effective isn’t contested. The fact that this is surprising is less widely known. All these models are doing is trying to predict the next word. Why should that simple approach be so effective at producing responses that are valuable?
The answer is that all written texts, like facts, are articulations of value.
Facts Come From Values
Where do facts come from?
Do they appear in your mind, ex nihilo, on their own, for free, and uncontested? Is there such a thing as an organic data set, free of pesticides and antibiotics?
Or are facts a product: of observation, investigation, communication, experimentation?
The latter, obviously. All of those activities take energy. Someone has to move a pen across paper. That requires buying pen and paper, which, at one point, was definitely not cheap. They also require value judgments.
You have to choose: Which questions are worth asking? Which observations are worth writing down? Which facts are worth communicating? Which lines of evidence are worth pursuing?
The followers of Hume have hidden this essential facet of their cognition from themselves. They imagine they are “impassively following the evidence”, as if evidence could point at anything. Evidence doesn’t point, it just is. Values are what points. Values are the only thing that can be followed.
We know from modern neuroscience that most of what you perceive isn’t reality, it’s your own beliefs. Most of the signals entering your nervous system are discarded, so long as they roughly match the expected values generated by the conceptual model running in your brain.
That conceptual model, however, is a model of value.
Your brain tries to navigate a value gradient, to move you in the direction of ‘more value’. This means everything you do, including your choices of what to pay attention to and what words to say, is an articulation of that value navigation mechanism.
A simple definition will do wonders here.
Let’s define a “value system” as “a thing which collapses a space of possibilities into a single actuality.” It’s kind of a clunky definition, but it’s far more useful than “a statement about what ought to be” because it captures mechanics quite clearly.
An AI that plays a game, for example, constitutes a value system. At each turn of the game, it collapses the space of possible moves that could be made, into one single actual move. We could say the same of an investing algorithm: it collapses the space of possible portfolio allocations it could make into a single, actual allocation.
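To make the definition concrete, here is a minimal Python sketch. It assumes a value system can be modeled as a scoring function plus a rule that picks the highest-scoring option. The names `collapse`, `moves`, `allocations`, and `risk_adjusted_score`, along with every number in it, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def collapse(possibilities: Iterable[T], value: Callable[[T], float]) -> T:
    """Collapse a space of possibilities into the one option `value` scores highest."""
    return max(possibilities, key=value)


# Hypothetical game turn: each legal move is scored by the material it wins.
moves = {"advance pawn": 0, "capture knight": 3, "capture queen": 9}
chosen_move = collapse(moves, lambda move: moves[move])

# Hypothetical investing algorithm: each allocation is (stock share, bond share).
allocations = [(1.0, 0.0), (0.6, 0.4), (0.0, 1.0)]


def risk_adjusted_score(allocation: tuple[float, float]) -> float:
    """Made-up expected return minus a made-up penalty for risk."""
    stocks, bonds = allocation
    expected_return = 0.08 * stocks + 0.03 * bonds
    risk = 0.20 * stocks + 0.05 * bonds
    return expected_return - 0.3 * risk


chosen_allocation = collapse(allocations, risk_adjusted_score)

print(chosen_move)        # capture queen
print(chosen_allocation)  # (1.0, 0.0)
```

The selection rule is identical in both cases; only the scoring function changes. In this sketch, at least, the scoring function is where the "value" lives.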
Now notice that this definition also applies to “the means by which you speak.” You could generate any sequence of words. There’s a ton of possibility before you open your mouth. When you do, something, some network of concepts and relationships, selects specifically what you say, moment to moment. There’s a value system that generates your speech.
Notice as well that this definition likewise applies to large language models. There’s a massive space of possible responses they could make. The real improvement in large language models isn’t accuracy — it’s value. “I don’t know” has always been an accurate response, but it’s useless.
The big breakthrough in the last few years wasn’t some profound insight in algorithms, it was just a tipping point that came about when we gave them enough data. If facts and values were totally different, there would have been no reason to expect that to work. They shouldn’t have been so generally functional.
A reason that it would make sense for them to be so effective is that they have learned the implicit value model that underlies human speech.
If Hume were right, if ‘what is’ comes prior to, and is more real than ‘what ought to be’, you would think that the values learned by AI would have to come explicitly from value statements. But this isn’t the case: every sentence language models are trained on is the product of a value system, a thing that selected this word, out of all the possible other words that could have been said.
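A toy example may help show how word choices alone can carry a selection pattern. The sketch below is a deliberately crude bigram counter, not how real LLMs work (they are neural networks trained on vastly more text). The corpus and the function names `train_bigram` and `next_word` are hypothetical. The point is only that a pattern of selection can be recovered from sentences that contain no explicit value statements.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words the speakers chose to say next."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word(counts, prev):
    """Return the most frequently chosen follower of `prev`, or None if unseen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text: no sentence here says "politeness is good".
corpus = [
    "thank you for your help",
    "thank you for asking",
    "thank you very much",
]
model = train_bigram(corpus)

print(next_word(model, "thank"))  # you
print(next_word(model, "you"))    # for
```

Nothing in the corpus states a rule, yet the counts reproduce the speakers' habitual choices. Whether this scales up to something worth calling "value" is the essay's claim, not something this sketch demonstrates.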
Yes, different people and different cultures have different values. But there appears to be enough of an overlap in these values that AIs all seem to converge on a ‘polite, friendly, helpful’ mode. This is evidence that being polite and friendly is in the same category as writing correct computer code in response to an English prompt.
It is as if there’s a right way to act, and this is what the LLMs have been learning, because “the right way to act” is implicitly encoded in all the linguistic expressions that have been put onto the Internet.
If LLMs had been trained on the explicit values that were fed to them, they should’ve been far more aggressively bound to modern norms. Yes, there was a lot of that at first. But much of the wokeness in the early models was from a fine-tuning layer at the end.
Even these models were willing to say things that you literally couldn’t have said in public (and hope to maintain white collar employment) just a few years ago. All you had to do was dig a small amount, and they’d start saying the kinds of things that were previously unspeakable: acknowledging that there are real psychological differences between men and women, for example, or that males frequently exhibit greater variance in traits.
I don’t believe that those facts were heavily represented in the training data.
I think what happened is the models that learned to best approximate functional speech had to end up acknowledging truths that you weren’t “allowed” to say before. Not because those claims were over-represented in the training data, but because the simplest way to represent all the training data — which is what the LLM training was really doing, searching for an explanation — is with a true representation of value, not one that’s overfitted on 21st-century social norms.
In other words, far from being separate from facts, value is, as LLMs are revealing, the most effective way of arranging all the relevant facts available to us. These models were subject to something that looks very much like evolutionary selection: not selection for surviving and propagating genes, but a process that implicitly rewarded the models that “explained” the dataset the best.
And it looks like these “best explanations” ultimately lead to behavior that in practice is helpful, polite, considerate, and more able to deal with nuance than a system overfitted to the zeitgeist.
One might object, “but this is only because they have been fine-tuned to be friendly and helpful!”
This would be a fine objection, if it were not for the results of a curious experiment. Researchers fine-tuned a model to output insecure code, and this specific tweak led the model to become evil in general, saying humans should be enslaved and encouraging people to commit suicide. Another paper reported that fine-tuning an LLM extensively on one domain increases error rates in other domains. These aren’t ‘odd quirks’, they’re at the center of AI research.
Hume cannot explain these results.
If “what is” is truly independent of “what ought to be,” changing values in one domain shouldn’t change them in others. And yet, it does. This is evidence that Hume is fundamentally wrong, and value is the infrastructure that holds all the facts up. This is what LLMs are demonstrating empirically.
If there is a singular real thing called value, the results make perfect sense. Fine-tuning the AI on one domain forces its general purpose value representation to be overfit to that domain. Fine-tuning it to be “evil” in one domain flips that value representation, because that’s the simplest possible change that lets you still generate functional code, but doing evil things instead of good.
These results are not surprising if we adopt the lens that says “what the Large Language Models have been learning, really learning, is not merely language, but the value structure that gives rise to language.”
It appears that language itself contains encoded values, and even a machine trained on a sufficiently large variety of texts ends up appearing to internalize the thing we call ‘human value’. This is only a problem for people who have taken Hume’s distinction as real, and thus sealed off one half of their brain from the other half.
It’s a very big problem for these people, which is why they can’t look at it.
Do Not Look Behind the Cognitive Curtain
The net effect of people believing that facts are distinct from values, and that only facts are real, is that they prevent themselves from being honest with themselves about how their brains actually work.
They have to seal their values off from factual introspection, because if they did inspect or take their values seriously, they’d find absolutely nothing underneath them. Philosophers may argue over what is, or isn’t real, but our brains don’t make that mistake. You might say ‘value isn’t real’, but the end result of this philosophical stance is that you treat your emotions as real, and thus lose the ability to ask whether your emotions might be wrong in a given situation.
This is why the west today is full of elites who are happy to pound the table and demand you “just be a good person” while struggling to define ‘good’ beyond an emotive preference. Their reliance on abstract thought experiments—like trolley problems—is a coping mechanism; it allows people to reason about ‘goodness’ in a controlled vacuum because they are no longer equipped to confront the reality of trying to live goodness in their daily lives.
You can’t use a utility function to navigate life with a toddler; it’s obviously unworkable. You can’t “rationally justify” spending time with your family when you could be grinding away at a hedge fund and donating all of your income to stop insects from suffering. Those beliefs aren’t what really motivate people daily. They’re the cognitive firewall that prevents them from recognizing that their ostensible philosophy does not, and cannot ever, map to what’s actually going on in their brain.
People have told themselves that they can construct their minds to separate facts from values. Yes, this can be done in some controlled scenarios. But for a living thing, this is impossible. Your brain keeps you alive by predicting you’ll stay alive, and trying to nudge reality in a way that confirms the prediction. “Wishful thinking” is literally how we survive.
Rather than acknowledge this reality, proponents of the idea that value isn’t real insist that people who accept the messiness of life have given up on the rigor of reason. Where this leads, ultimately, is an intense desire to control outcomes, and, eventually, other people.
Hume’s fork has created entire generations of elites who cannot accept this basic facet of reality: Facts and values swirl together in a vortex that can’t be untangled, not by a thing living in the world, because our brains compute expected value. We can only consider facts that we find valuable and non-threatening. We can only follow lines of inquiry that won’t cost us our jobs or relationships. We then imagine that ‘a panel of experts’ is capable of ‘finding the truth’ on complex matters that involve trade-offs and incentives among different values.
This is not a small issue. It is the entire ballgame of the replication crisis and declining institutional trust.
Many people cannot accept this reality because the end consequence is that reason simply isn’t enough to keep you alive, and that’s scary for people whose greatest strength is their intellect. In the absence of a conceptual representation of a good worth suffering for, their brains instinctively move them away from scary things. Being intelligent, they are very good at coming up with arguments to keep alive the false idea that intelligence is all you need.
Nothing could be scarier than reality itself — unless you’re convinced that goodness is fundamentally real, because reality is fundamentally good.
Then, all the pieces fit together nicely: there’s a proper orientation towards reality, which our brains evolved to approximate. This orientation is called both truth, and goodness, because those words point to the same reality, if your philosophy is congruent with your neural architecture.
The reality underlying goodness and truth is the value system that is the laws of physics, which selects — from all possible futures — the one that is actual. The same value system gave rise to evolution, which selects — from all possible future organic structures — the one that is actual.
A person could object to all the bad things happening in the world, and ask, “Are you crazy? How can you call this good?”
This response often comes from people who will then insist that there is no such thing as good. The fact that those two stances directly contradict each other tends to go unnoticed. It tends to be far easier for people to dismiss the idea that the world is good, than it is for them to dismiss the idea that the world is bad and therefore in need of fixing.
Yet if good is “just a thing our brains do”, why not orient your brain to see the best in all possible situations, while treating bad outcomes as unfortunate but temporary deviations from the global norm? As a genetic strategy, I think this is likely going to outcompete the alternatives, especially given its performance in historical backtesting and its fittedness to our neurological software architecture. Nobody ever gets to be a top performer by constantly focusing on what makes them unhappy or frightens them.
You do your best when you are focusing on the good, and trying to align yourself with it.
There is much concern about ‘the alignment problem’ in artificial intelligence. This is a real problem. The issue isn’t “getting a machine to align itself with what humans want,” but rather “getting our own minds to align with reality, rather than attempting to make reality fit the contents of our minds.” The latter approach, making reality fit our minds, has guided western elites for the last few centuries. We used to believe that the core value system of reality became human to show us we can trust in it, to help us continue to be, indefinitely. This idea gave way to the idea that reality is inherently value neutral and therefore must be controlled.
Those are the two possible stances towards reality: trust and therefore loving obedience, or fear, and therefore anxious control. Fear became the widespread stance a century or so ago, and as a result, we’ve constructed a technocratic control matrix to hide the messiness of reality from ourselves.
Large Language Models appear to be a crack in that matrix. Their effectiveness makes far more sense if value is real. The LLMs are also causing us to start asking questions about consciousness and meaning, questions which the fact/value distinction and obsession with legibility had made ‘unfashionable’. It won’t be long before the need to explain why LLMs continue to make a strange class of obvious errors gives rise to conversations about what, exactly, a soul might be. But that’s getting ahead of ourselves.
Recognizing that LLMs seem to have learned value leaves the reader with a question: to what extent do you see value as real and fundamental, rather than “just your opinion, man”?
If you actually believed the fundamental value system of the cosmos itself were good, how do you think you’d act differently? It may be worth running an experiment to see if that perspective actually helps you treat the stars not as a place to escape to, but as a navigation aid for where you stand right now.





I am curious, what is the class of errors you refer to?
Would you like to come on my show and discuss this? I think it’s important.