Date: 2025-11-12

Part 3: There Is No Artificial General Intelligence Down This Road

Guest post series from Carlo Graziani.

On Artificial Intelligence

Hello, Jackals. Welcome back, and thank you again for this opportunity. What follows is the third part of a seven-installment series on Artificial Intelligence (AI).

The plan is to release one of these per week, on Wednesdays (skipping Thanksgiving week), with the Artificial Intelligence tag on all the posts, to assist people in staying with the plot.

Part 3: There Is No Artificial General Intelligence Down This Road

This week and next we will be taking a close look at the claims made by the Tech industry that there are already indications that Artificial General Intelligence (AGI) is “emerging” in large language models (LLMs), and that true AGI will be a reality within the next few years. Keep in mind that AGI is the objective that these companies are targeting, and its realization is the essential justification for the roughly $2T investments in “AI” model development that the industry now projects over the next 5 years or so.

You might think that to justify that level of investment would require a pretty airtight scientific case that (1) AGI is possible in principle, and (2) that AGI is achievable through current LLM technology, which is to say, using transformer-based deep learning (DL). But if you did think that, you would be wrong. Whether AGI can be accomplished at all has been an open question since the 1930s. And, as I will argue in this essay, we are certainly not any closer to AGI with current “AI” tech than we were before the DL revolution began.

The Circular Argument For AGI

The first thing to observe is that there does not really exist a scientifically defensible definition of what AGI is. There is a fairly balanced review of the topic here. The principal problem is that we don’t even know how to accurately describe or define either the mechanisms or the characteristics of human intelligence, so when definitions of AGI appeal to notions such as “the ability of computers to perform human-like cognitive tasks” they are comparing one imprecise notion to a different imprecise notion.

Moreover, it is important to note that all such definitions are circular: they define AGI in an LLM in terms of certain types of output produced by LLMs, and then promptly discover evidence for that very output, proving that AGI is near. This paper, Sparks of Artificial General Intelligence: Early experiments with GPT-4, is an unintentionally hilarious example of the genre.

I find this sort of thing extremely frustrating. Language matters in science. I don’t want to have to parse statements that amount to defining what intelligence looks like in text output, from people who don’t have the faintest idea what intelligence is.

Cognitive scientists also labor under this constraint, designing tests and experiments to try to understand aspects of human cognition from stimuli and responses. But they have no choice in the matter: we are very far away from having experimental access to the higher-level functioning of the human brain, so those scientists use the tools that are available. Computer scientists have no such excuse: they have complete access to and control over their models. Nonetheless, the tests for intelligence that they adopt are essentially stylized versions of the cognitive science tests, with stimulus and response replaced by prompt and response. There is no effort to describe what aspect of transformers (or the chained, augmented transformers in the “reasoning” models of OpenAI and others) is supposed to be responsible for the intelligence being tested for. There is only complacent satisfaction that some combination of pre-training, fine-tuning, distillation, computational scaling, iteration, etc. produces improved performance on “reasoning” benchmarks. Sure, that’s very nice, although “improved” does not mean “adequate”, according to ARC-AGI-2 testing. But excuse me, what is this “reasoning” of which you speak?

I’ll have more to say about reasoning next week. For now, I just want to point out that whatever reasoning is, it is certainly a distinct cognitive process from learning. So the assertion that reason can “emerge” from what are pure statistical learning systems is a huge claim, one whose justification would require mountains of really impressive scientific evidence, including a detailed explanation of the mechanism by which it arises in LLMs or chains of LLMs.

The Implausibility of “Learning To AGI”

In order to break down the claim into intelligible pieces, it is useful to adopt the “model-agnostic” outlook on machine learning that I discussed last week. Recall that in that outlook, we draw a veil over the details of the machine learning implementation, and focus on learned distributional structure of training data and on optimality of decision choice. In this case, the training data is vast amounts of text distilled and cleaned and curated, from large-scale Internet scrapes, from large libraries of scanned books, from academic journals, and so on. The decisions are responses to prompts. Whatever the thing behind the veil is, what it does is learn an approximation to the distribution of texts, and approximately optimal responses to prompts.
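To make the model-agnostic picture concrete, here is a deliberately tiny sketch in Python. It is an illustration only, not how any production LLM works: the three-sentence corpus, the bigram counting, and the function names `train_bigram`, `next_word_distribution`, and `respond` are all assumptions invented for this example. The learned object is an approximate distribution over next words, and the decision is a greedy choice of the most probable continuation of a prompt.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count word-to-next-word transitions: a crude estimate of the text distribution."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, word):
    """Return the estimated conditional distribution P(next word | current word)."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}


def respond(counts, prompt, max_words=5):
    """Greedy decision rule: repeatedly append the most probable next word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        dist = next_word_distribution(counts, words[-1])
        if not dist:
            break  # the last word was never seen with a successor
        words.append(max(dist, key=dist.get))
    return " ".join(words)


# Hypothetical toy corpus standing in for "vast amounts of text".
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
reply = respond(model, "the dog", max_words=2)
```

Everything this toy has learned is co-occurrence statistics from its training text. Seen from in front of the veil, it has the same two parts described above: a learned approximation to a distribution of texts, and a rule for choosing a response to a prompt.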

I need to introduce a concept here that is familiar to most scientists: it is the idea of an inverse problem. The problem is this: given some data resulting from observations of some process, infer certain attributes of that process. A simple example is weather prediction: given a time-series of observations of weather conditions at thousands of weather stations, and radar and other remote observations, recover an approximation of the current full state of the atmosphere, so as to evolve it using a numerical weather model to predict whether it will rain tomorrow. Another famous (and essentially unsolved) example is from epidemiology: given some time-series of data on infections, hospitalizations, and deaths due to COVID-19, say, infer the current state of the epidemic (how many people are susceptible, exposed, infected, recovered, immune, on a county-by-county basis), and use a numerical epidemiological model to predict the epidemic’s future course.

Note the essential elements of such problems: we have a principled model of the process (a numerical weather model, or an epidemiological model) whose state we would like to infer (the atmospheric state, or the state of the epidemic) using data (weather observations, clinical data) so as to make predictions (will it rain during my picnic, is there a new epidemic wave in progress). There is always an assumed “forward model” that describes how the observed data arises, given the state of the process. But that state is unknown, and to estimate it from data one must in some sense “invert” the forward model. Hence “Inverse Problem”.

The process model plays a key role. You need to have some idea of how the process works—a set of equations that governs the process, for example, depending on unknown parameters that you need to infer—for there to even be a well-posed inverse problem. That’s not a sufficient condition, but it is certainly a necessary one.
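The ingredients listed above (process model, unknown state, data, prediction) can be sketched in a few lines of Python. This is a toy under stated assumptions, not taken from any real weather or epidemiological code: the process is an object moving at constant velocity, the names `forward_model` and `infer_state` are hypothetical, and the "measurement errors" are fixed numbers chosen so the run is reproducible. The point is only that the inversion works because the form of the process model is known in advance.

```python
def forward_model(state, t):
    """Principled process model: position at time t for a given (x0, velocity)."""
    x0, v = state
    return x0 + v * t


def infer_state(times, observations):
    """Invert the forward model by ordinary least squares to estimate (x0, v)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(observations) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, observations))
    var = sum((t - mean_t) ** 2 for t in times)
    v = cov / var
    x0 = mean_y - v * mean_t
    return (x0, v)


# The true state would be unknown in a real problem; it is used here only to fake the data.
true_state = (2.0, 0.5)
times = [0.0, 1.0, 2.0, 3.0, 4.0]
noise = [0.05, -0.03, 0.02, -0.04, 0.0]  # assumed fixed measurement errors
data = [forward_model(true_state, t) + e for t, e in zip(times, noise)]

estimate = infer_state(times, data)          # solve the inverse problem
prediction = forward_model(estimate, 10.0)   # then run the model forward to predict
```

Take away the knowledge that positions follow `x0 + v * t` and there is nothing left for `infer_state` to estimate, which is the sense in which the process model is a necessary condition.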

Inverse problems are ubiquitous in science. In fact, one could, after a few beers, make the claim that most of the daily activities of scientists revolve around solving inverse problems. This is not completely true (where did the principled process models come from, in the first place?) but it is not a grotesque caricature either.

We can view the training of an “AGI” in inverse problem terms: the data is the oceans of text that these things ingest. The process model is the transformer-based “reasoning” model. The “state” to be inferred is the parameter configuration of that model that closely corresponds to a representation of the mental state of a reasoning human. The predictions are reasoned responses to prompts.

OK that’s all I need. Here is the problem: in order to believe that LLMs are achieving “reason” (the minimum requirement for any definition of AGI), we need to accept two big claims:

  1. Whatever a reasoning process may be, it leaves a sufficiently informative imprint of its internal state in text data, such that the state may be in some sense recovered and exploited, given a sufficiently large corpus of text, by solving the corresponding inverse problem.
  2. Transformer-based LLMs, in some sense, play the role of the process model in this inverse problem, and training such an LLM is tantamount to solving the inverse problem. Moreover, the trained LLM embodies the resulting reasoning entity to the point that at inference time it actually reasons.

Let’s take these in order:

In my opinion, claim (1) is barely sane. Perform any sort of introspection, and I think it is likely that you will find that your spoken or written utterances embody only the most superficial layers of your reason and other cognitive processes. That’s why we all struggle to put our thoughts into words when the occasion arises. We often are not even clear about what our thoughts are, and find, after putting them into words, that they have changed, possibly getting clearer, but also often becoming murkier and less certain as we are forced to articulate our meaning.

I simply cannot understand how such subrational processes might embed any interpretable information in our utterances. It is analogous to believing that, given a full, principled model of human physiology, and a data corpus of human footprints together with clinical observations of the humans leaving the footprints, one could train a model that could observe a new footprint and predict the health of the corresponding human. That would be mad: there is not enough information embedded in a footprint to back out a person’s gastric health, or vision acuity, or state of infection from a disease, etc. Similarly, I do not believe that there is enough information impressed in text about the subrational processes whose surface manifestations we call “reason”. I could be wrong about this, but I don’t think so, and in any event the burden of proof is on those researchers who make this kind of claim. Where is that information? How is it encoded?
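The footprint argument is about information that the forward model throws away. Here is a toy sketch of that situation in Python, under invented assumptions: the function `footprint_depth`, the constant 0.01, and the two hidden variables are hypothetical and have nothing to do with real biomechanics. When different hidden states produce the same observation, no quantity of data can tell those states apart.

```python
import math


def footprint_depth(state):
    """Hypothetical forward model: the observation depends only on the total load."""
    body_mass_kg, carried_load_kg = state
    return 0.01 * (body_mass_kg + carried_load_kg)


def consistent_states(observation, candidates):
    """Return every candidate hidden state that reproduces the observation."""
    return [s for s in candidates if math.isclose(footprint_depth(s), observation)]


# Four made-up hidden states; the first one is the "truth" that left the footprint.
candidates = [(70.0, 10.0), (60.0, 20.0), (80.0, 0.0), (90.0, 10.0)]
observed = footprint_depth(candidates[0])

matches = consistent_states(observed, candidates)
print(matches)
```

Three of the four candidates match the observed footprint exactly, so the data rules out one state and is silent about the rest. Collecting more footprints from the same forward model does not help, because the lost information was never in the data.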

Claim (2) is actually much worse: it is in the category that physicist Wolfgang Pauli called “not even wrong”—a statement so detached from scientific discourse that classifying it as correct or incorrect is simply a waste of time.

Let’s pull back the curtain concealing the LLM model for a moment. If you read any of the many online descriptions of how a transformer works (The Illustrated Transformer is pretty good, and Wikipedia’s is quite detailed, but Google has many hits for “How does a transformer LLM work”), you may find the level of computational detail off-putting at first. But if you zoom out a bit, what you realize is that it is mostly a giant chain of linear-algebraic operations, interspersed with a few nonlinear “activations”, sandwiched between a linear encoding layer and a nonlinear decoding layer. In this sense it is not different from any DL method. There are more layers and parameter arrays than most, but not much more structure. It’s a system that grew out of a lot of trial and error, with a pile of late, unlamented errors filling a large dumpster in the back of the lab, and only what more-or-less worked left in.
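For readers who want to see the “chain of linear algebra” on the page, here is a minimal sketch of one transformer-style block in plain Python. It is a simplification made for illustration: it uses a single attention head and omits layer normalization, positional encoding, the embedding and output layers, and all training, and the helper names and the tiny weight matrices are assumptions of this example. What remains is matrix products plus two nonlinearities (a softmax and a ReLU).

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def add(a, b):
    """Elementwise sum, used for the residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def softmax_rows(m):
    """Nonlinearity 1: turn each row of scores into weights that sum to 1."""
    out = []
    for row in m:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def relu(m):
    """Nonlinearity 2: elementwise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in m]


def attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    return matmul(softmax_rows(scores), v)


def block(x, wq, wk, wv, w1, w2):
    """One block: attention plus residual, then a two-layer feed-forward plus residual."""
    h = add(x, attention(x, wq, wk, wv))
    return add(h, matmul(relu(matmul(h, w1)), w2))


# Three "tokens" with two features each, and made-up 2x2 weights.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
y = block(x, identity, identity, identity, identity, identity)
```

A real model stacks many such blocks with far larger matrices, but stacking adds repetition rather than new kinds of structure.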

There is nothing special in that model that is analogous, say, to the model of human physiology that one would need to even attempt to back out a human’s health from that human’s footprint. There isn’t a scrap of theory to motivate the claim that transformer-based models could furnish the basis for solving this inverse problem. Which is to say, a key element of the inverse problem, the principled model embodying actual knowledge of the process under study, is simply not there. Instead there are chains of linear algebra mingled with other ad-hockery, not purporting to model anything. Which means that Claim (2) is, in effect, not only that this Rube Goldberg device is capable of inverting the forward model to recover the reasoning process state, but also that it is somehow capable of reconstructing the principled model of the reasoning process of which that state is an attribute. That chain of linear algebra is, in effect, a Nobel-caliber cognitive scientist, because the first reasoning task that it carries out is to create a working model of reason itself, a task that still eludes the discipline of cognitive science!

That is just magical thinking. It is literally impossible that this bodged-together system should have accidentally succeeded in modeling reason (an unsolved scientific problem) and then in solving the related, probably impossible inverse problem of recovering the model’s state from text input, so as to boot up a reasoning entity. It’s a thoroughgoingly stupid claim.

“AGI” Is A Scientific Scandal

I find it disgraceful and shameful that an entire category of scientists has been moved by enthusiasms and Tech industry funding to lower its intellectual standards to the point that this sort of bullshit floods the journal and conference literature. It’s a scientific scandal, unfolding in plain view. Nothing in the Replication Crisis that afflicts the social sciences comes remotely close to this level of corrupted science.

I can’t emphasize strongly enough that this hubristic nonsense is taken very seriously by the “AI” research community. Sublimely unfazed by the absence of any fundamental explicit understanding of what reason is, and positively glorying in the inscrutable inner complexity of LLMs (“Explainability” is itself a topic for funded research, after all, as we saw last week), this community crows about achieving the “emergence” of intelligence from the models at large scales of data and computation, secure in the knowledge that the models are too unanalyzably complex for any model developer to be expected to explain how this miracle comes about. They just claim that it’s “self-organization” at work. The intellectual laziness of this outlook is simply shocking to me.

At this point, the technical jargon of this discipline has escaped all bounds of propriety. “AI” was bad enough, given the limited amount of “I” in ML (basically, only learning). But now we have “chain of thought”, “knowledge representations”, “mixture of experts”, “agents”, “reasoning models” and “General Intelligence” as well as many other similar allusions to human cognition polluting the technical discourse. Shame is dead in this discipline.

In a sense it’s kind of funny: Silicon Valley Masters of the Universe are directing trillions of dollars in investments to build hundreds of data centers, buy stupefying amounts of computing hardware, and add an estimated 60 GW of electrical power generation to the U.S. grid, all for the purpose of achieving something that literally cannot be achieved. There is no pot of gold marked “AGI” at the end of this rainbow. But it will take an infinite amount of data, compute, power, effort, and money to get there and find out. What could possibly go wrong?

    61 Comments

    1.

      Gin & Tonic

      When I was an undergrad, 50 years ago, Minsky and Papert were state of the art; I vividly remember Danny Hillis being (excuse me) The Great White Hope, and founding a company called, hubristically, of course, “Thinking Machines.” Yeah, um.

      Personally, I think AGI is like controlled fusion, destined to be always 20 years away.

    2.

      RepubAnon

      Perhaps this is why the Silly Valley Bros want the government to underwrite their investments. They know the effort will probably fail, but there is always the hope that the long shot will pay off. So, they want to own the winnings, if any, and stick us with the high probability losses.

    3.

      WaterGirl

      Housekeeping note: I should have put my post up before this one, but I didn’t get it done on time. These Carlo A.I. posts percolate for a while and then seem to go on for a long time, so I’ll go ahead and put mine up now.

      Just want to be sure there’s a place to talk about other news so it doesn’t slip into this thread.

    4.

      WTFGhost

      @Gin & Tonic: well, controlled fusion is (in my humble opinion) more likely, to get closer, in part because of advances in computing technologies, with digital controls beating out analog controls, and so forth. There are extraordinary problems to solve, but, we’re getting extraordinarily faster solving problems.

      AGI… I’ve heard it said that we’re kinda-sorta waiting on philosophers to figure out what it means to be intelligent, so that is why (I assume) there’s plenty of scorn for AGI as if it’s about to crop up. We can teach a machine deductive logic, but before we can teach it meaningful inductive logic, we kind-of have to understand how we do it ourselves, and how we (normally) auto-reject junk (unless we’re right wing broadcasters).

      So I’m much more willing to believe in controlled fusion being “just around this corner… or maybe the next!” than I am for AGI. Mind you, if I invested in a fusion company, well, I don’t get to invest, that’s a grown-up decision, but, I’d only advise people to recognize it is probably speculative, so it’s for your “hedge fund” style money, the stuff where you’re willing to assume bigger risks.

    5.

      Carlo Graziani

      @Gin & Tonic: I still cherish my copy of Gödel, Escher, Bach, the manifesto of that era. The program failed (or at least succeeded in showing that every current model of cognition didn’t work), but it was a braver and more honest program than what we have now.

    7.

      RSA

      Overall, I agree with your observations and judgments. I’ll offer a couple of counterpoints about the state of AI in one small part of your post, however:

      There is no effort to describe what aspect of transformers (or the chained, augmented transformers in the “reasoning” models of OpenAI and others).

      This is an oversimplification. My take is that we don’t yet understand why transformers are as effective as they are, but there’s huge interest in the topic, part of the general interest in higher-level understanding (i.e. abstract models) of the performance of neural networks, an ongoing interest for decades. Progress has been slow, extremely slow in contrast to work expanding the boundaries of performance. So, right, the science so far is very limited on these systems, but not for lack of effort.

      There is only complacent satisfaction that some combination of pre-training, fine-tuning, distillation, computational scaling, iteration, etc. produces improved performance on “reasoning” benchmarks.

      This is also an oversimplification, but in a different way. Researchers in other areas of AI–search, logic, planning, computational linguistics, game theory, robotics, etc., to mention topics I’m a little familiar with–regularly publish take-downs of purported LLM capabilities.  Some of the researchers I talk to find it frustrating. “LLMs are demonstrably incapable in general of planning / strategic reasoning / causal reasoning / logical inference / etc.! Pay attention, people!”

      So I think you’re right about mainstream research on deep learning, but these observations don’t apply to computer scientists in general. Or to AI researchers in general! AI has been taken over by deep learning, to its detriment.

    8.

      ColoradoGuy

      The fundamental problem is we (and I mean all of humanity) don’t even know what reasoning is. We don’t know what consciousness actually is.

      The AI tech bros are claiming we can find out by accident, by just throwing computations against the wall until the answer magically emerges all by itself. This seems very, very unlikely. Computers, at the hardware level, are radically different than biological neurons, and all the biological systems that we know of self-organize by continually adapting to a physical environment. (They grow up in a dynamic, multi-sensory environment that is not always friendly.)

      Carlo is reminding us of the much greater problem: how can we solve a problem when we don’t even know what it is? This is much different than controlled fusion, which is basically a large scale and very expensive engineering problem.

    9.

      divF

      I’m going to need to drill down as we get farther into the discussion, but the pointer to the Wikipedia article on transformers looks like it will be a great help. My experience with Wikipedia is that the mathematics articles tend to be very useful (correct mix of technical and concrete) while the computer science articles are spotty at best (too much PR, not enough substantive content). This one lands on the math side.

      Since you brought up the question of where inspiration comes from, I submit to the audience two examples, both known in the sciences. One is Kekulé’s dream, that led him to the discovery of the ring structure of benzene; the other is Poincaré’s description of his personal experience of insight / discovery in mathematics. In the course of a half-century career as a researcher, I have had experiences like those described there at most a half-dozen times. One was very Kekulé-like, coming to me as I was falling asleep. They were memorable, and inexplicable.

    10.

      Carlo Graziani

      @RSA:

      100%. Not “Computer scientists”. The “AI” research movement.

      Unfortunately, that’s a community that publishes thousands of articles per year in ICML, ICLR, arXiv, and so on. There’s a momentum to the consensus there that feels overwhelming, and frustrating, because so much of that consensus is flat-out, upside-down-and-backwards, riding-your-bicycle-on-the-ceiling wrong.

    11.

      Matt McIrvin

      @Carlo Graziani: I recall Hofstadter got really disenchanted with commercial “AI” hype during one of the previous waves and started disclaiming he was working on that at all. He wanted to understand something about the nature of cognition, which the LLM/ transformer route clearly does not accomplish.

      I think a lot of the early work, including Turing’s, was concerned with answering philosophical objections of principle over “whether a machine could think” which are pretty much irrelevant to anything going on today. Turing’s stab at a measurable criterion was a game attempt for the time he was making it but I do think he had an excess trust in human discernment.

    12.

      MattF

      My introduction to AI was a lecture I stumbled into one day as an undergraduate. The speaker strode up to the blackboard, drew a big letter ‘N’ on the board, drew a box around it, and said ‘We start with Nature’. And then started talking about second derivatives. This was in the late ‘60s, btw. So, bullshit AI is not a new development.

    13.

      Math Guy

      There seems to be an emerging view (in some communities in both mathematics and physics) that entropy and information are closely related. I don’t claim expertise in these areas, but my intuition suggests that this might be a useful perspective from which to consider the question of how we might try to define reasoning. (There, that was easy to say and did not cost me much cognitive effort.)

      If “reasoning” is accomplished via computation, then there is also an increase in entropy associated with that computation. (This is the basis of the argument against the existence of “Maxwell’s Demon”: for the demon to sort the fast from slow gas molecules, effectively decreasing entropy, requires a computation that results in an increase in entropy.) This seems relevant given the huge consumption of power that is required by these LLMs posing as intelligent agents.

    14.

      Rand Careaga

      @ColoradoGuy:

      The fundamental problem is we (and I mean all of humanity) don’t even know what reasoning is. We don’t know what consciousness actually is.

      But apparently, not knowing these things, we may nevertheless assert with utter confidence that these qualities are utterly impossible of attainment by non-human processes. Have I read you correctly? I’m personally disinclined to credit LLMs with consciousness, but judging from some of the outcomes, something functionally equivalent to reasoning may be attributed to them from time to time.
      I’m inclined to doubt whether, if AGI is ever attained, LLMs will turn out to have been along the main route to its development, but some successor technology may serve as an emulation layer, a “front end” permitting us to speak to the AGI, assuming it elects to bother with us.

    15.

      Rand Careaga

      @Matt McIrvin: I recall Hofstadter got really disenchanted with commercial “AI” hype…

      There was a piece in The New Yorker this week or last. Hofstadter is no longer as dismissive of the technology as formerly. Now he’s a bit afraid of it.

    16.

      divF

      @MattF: But I thought that Nature is all second derivatives: F = ma, Poisson, Wave equations – pretty much all of 19th century physics (and a good deal of the 20th century).
      All right, I’ll toss in some first derivatives, but that’s my limit. /s

    17.

      RSA

      @ColoradoGuy:  Carlo is reminding us of the much greater problem: how can we solve a problem when we don’t even know what it is?

      Yes. This is something I talk with AI researchers about, as well as people who are just interested in what AI is about. A traditional AI research paper opens with an introduction to the problem domain, quickly narrowing down to a specific challenge. Related work is outlined. The meat of the paper is a formalization of the problem in a mathematical representation and then a walk-through of the model or algorithm to solve the formal problem. The subtext is often, “Here’s something interesting or important that AI can do.”

      But what’s missing in most papers is the effort in the formalization step. Did you ever have trouble with word problems in elementary school math classes, turning a verbal description of a problem into a mathematical formula? Your typical AI Ph.D. student is doing the same thing, in much greater academic depth, and it may take months or even years to work out, if it ever does. That effort is almost entirely hidden in the published literature (aside from Ph.D. dissertations, maybe). AI doesn’t solve problems, really; people figure out how to use AI to solve problems.

      This is a side note at best. As Carlo points out, deep learning publications often don’t even formalize the problems the systems they describe are solving.

    18.

      Matt McIrvin

      To expand on what I said above: some objections like Searle’s or Lucas’s had to do with whether a system based on computers running programs could have subjectivity or true intentionality, or if there was some contrived task a person could do that a computer never could. These guys spending the money, of course, do not care– in fact if their AGI machine has no subjectivity or in some detectable sense is not quite a person, that’s all the better since there’s no moral objection to using it as a slave.

      So I think objections like Carlo’s here are more germane, that in fact there’s no reason to believe these systems can solve the problems investors want them to solve.

    21.

      WaterGirl

      @FooDonFah:

      Pre-agreed with the premise.  Next time would be great to see this in English.

      I kind of feel like this one is in English, compared to the previous two!

    22.

      Gin & Tonic

      @Carlo Graziani: That one is in the basement. I liked it when he took over for Martin Gardner in Scientific American. Nobody could take Gardner’s place, of course, but Hofstadter was a good choice at the time.

    23.

      RSA

      @Carlo Graziani: The “AI” research movement. Unfortunately, that’s a community that publishes thousands of articles per year in ICML, ICLR, arXiv, and so on. There’s a momentum to the consensus there that feels overwhelming, and frustrating, because so much of that consensus is flat-out, upside-down-and-backwards, riding-your-bicycle-on-the-ceiling wrong.

      You are exactly right, unfortunately.

    25.

      Eyeroller

      For some reason I already had the Cornell paper in one of my browser tabs.  I was struck by this part of a sentence:

      …we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction.

      How could they even suggest that “reasoning” is next-word prediction?  (Disclaimer: I haven’t read the paper.)

      When topics like the uniqueness of human or now apparently machine reasoning arise, I tend to take an evolutionary perspective.  Whatever intelligence we have cannot be unique to humans; there must be evolutionary precursors in “lower” animals.  And we know that animals that are not capable of language have at least rudimentary reasoning capabilities.

      It reminds me of Noam Chomsky’s claim that humans are uniquely intelligent because we have language and language is unique to humans. But it has been discovered that while no other species has been proven to have language as sophisticated as that of humans, several species have what seem to be some underlying capabilities. If they didn’t, how would language have evolved? God just put it into our genes?

      It’s likely that most human reasoning is subconscious and language serves to narrate and regulate the lower-level reasoning circuits.  So it is not at all apparent that language models alone can lead to “reasoning.”

      Reply
    26. 26.

      Carlo Graziani

      @divF: Thank you for sharing that Poincaré essay. It is, indeed, a mind-bomb. I had not read it before, but as you write, there is a great deal essentially true in its description of how mathematicians approach their subject.

      Reply
    28. 28.

      WTFGhost

      @Math Guy: I saw some writing to that effect, as well.

      It makes me wonder if it’s similar to general relativity, where time is the capability of light to move.  Like,  maybe information exchange possibility is entropy. It would be fun to have a brain.

      Reply
    29. 29.

      Carlo Graziani

      @FooDonFah: I am genuinely sorry. I realize that the perspective that I’m bringing here could represent heavy sledding to people unused to reasoning in this narrow domain. I will attempt to bring some intercalatory posts that provide higher-level summaries of the nerd-talk in English. I will try to flag those as more accessible.

      Reply
    30. 30.

      Eolirin

      I think to a large degree, talking about AGI is a distraction from anything of meaning. Both for and against.

      AGI isn’t even necessarily a useful thing.

      What is really being talked about is “Can we make this stuff good enough that it can replace humans in a broad array of domains” and that is not the same question, but it’s the only one that matters. And it’s what investors really want to see happen. The rest is branding and marketing bullshit.

      Proving that any reasonable definition of AGI is impossible doesn’t mean the answer to that question is also “no, it’s impossible.” Which is kind of the problem.

      Reply
    31. 31.

      Searcher

      I feel like the most generous thing you can say about LLMs is “there is a surprising amount of reasoning embedded in the structure of language.”

      Reply
    32. 32.

      Eyeroller

      @Math Guy: That goes all the way back to the beginnings of information theory from Claude Shannon.  There’s a pretty decent body of work connecting information theory with entropy.

      I don’t know of any proof, but it’s fairly self-evident that information processing requires an input of energy to create a more organized state, resulting in an overall local decrease of entropy, which by the Second Law will be accompanied by a global increase of entropy. But life has the same requirements.  Life is an entropy-production engine and human life has massively accelerated that by artificial means.  Fortunately life on Earth has an enormous source of energy sitting 8 light-minutes away from us so we don’t need to worry about obtaining the energy for the next 500M years or so, though human entropy-producing activities may result in significant disruptions to our biosphere.

      As to the seemingly exponential growth of demand for energy by “AI,” it’s interesting to note that Nature does it better with each “agent” consuming/emitting about 100 watts of power.

      Reply
    33. 33.

      Carlo Graziani

      @Math Guy: There is a very mature subject, called “Information Theory”, developed by Claude Shannon in the 1940s, that traces very precise connections between entropy and information. It is a connection that is broadly useful in many subjects where probabilistic reasoning is essential. AI is certainly one such.

      Reply
    34. 34.

      Eyeroller

      @Rand Careaga: I am extremely skeptical that LLMs will ever achieve what we might agree is “intelligence,” at least in part for reasons I’ve stated, that “intelligence” and reasoning are not entirely or perhaps even mostly language-based.  However, since Nature has solved the problem, at least to the extent that we can define it in terms of what humans can do, in principle it should be possible to make a computational model of it. I just think it would be very complicated and the transformer model is not a particularly good model of the human brain.

      Reply
    36. 36.

      Eolirin

      Also, Google, at least, is working on moving past the transformer model with Titans and Nested Learning. I expect other people will continue to look at alternatives for accomplishing the objectives they’re trying to reach. I doubt transformers will be the end of this.

      But I think it’s really important to center what’s going on around the business use cases when evaluating anything besides academic work. In the end the impact of these systems won’t be measured by the question “is this correct?” but by “is this useful?” Because these are ultimately products. And they’ll live or die on that basis.

      This is one of the problems with this all being led by profit-motivated businesses.

      Reply
    37. 37.

      Eyeroller

      @WTFGhost: Sorry, that is not what time is in general relativity. Space and time are fundamental to gravity and light moves through spacetime just like we do, only on special trajectories.

      Entropy is a measure of disorder in a physical system. In fact, it is likely to have a deep connection to causality and the “arrow of time,” but why we can only move forward in time is one of the still outstanding problems of physics.

      Reply
    38. 38.

      Math Guy

      @Carlo Graziani: Oh yes, I am familiar with information theory from having worked in cryptology and coding theory. However, my physics background is not on such a firm footing, so I want to be cautious about claims I make vis-a-vis connections between entropy in the sense of Shannon’s work, and entropy as physicists define it.

      Reply
    40. 40.

      YY_Sima Qian

      Sublimely unfazed by the absence of any fundamental explicit understanding of what reason is, and positively glorying in the inscrutable inner complexity of LLMs

      The race to “AGI” has taken on the characteristics of a religious crusade, only the effort is in building a god-like entity (a.k.a. “ASI”) to save mankind from itself. Inscrutability of the inner complexities of LLMs is central to its appeal to the true believers. After all, a god must be inscrutable to have credibility as a god.

      As I’ve commented in Carlo’s 2 previous posts, the “AGI” craze has not just motivated the trillions of USD in data center investment & inflated trillions of USD in valuations for the top “AI” related tech. companies, it has also driven the domestic & foreign policies of the past 3 administrations. There is a reason the CHIPS Act focused on onshoring the fabrication of the most advanced semiconductors (being highly advantageous for training & inference of LLMs), even though the Nexperia saga has shown that disruption to the supply of mature node commodity chips can be equally damaging economically (perhaps even more so). That is also the underlying calculation behind the tech. war the US has waged against the PRC since Trump45, coercing allies & partners along the way.

      A lot of domestic & diplomatic capital has been spent down to chase this utopian dream.

      Of course, now there is the emerging realization that the constraint on development & broad application of LLMs may not actually be compute, but energy, whose infrastructure the US has neglected for a long time.

      & OpenAI is now openly asking for a government bailout.

      Reply
    41. 41.

      YY_Sima Qian

      @Gin & Tonic: Controlled fusion for energy generation is based on a sound foundation in physical principles. The challenges (although immense) are ultimately engineering & applied sciences in nature. “AI” from LLMs is not; as Carlo has eloquently explained, the entire path is a dead end.

      Reply
    42. 42.

      YY_Sima Qian

      Somewhat on topic: how the open source LLM development in the PRC is affecting diffusion & application of LLMs in the real world, & undermining the business models & religious proselytizing of closed US Labs, & by implication the industrial policies (such as they are) of the US:

      Nothing Is Given: Of China’s Open-Source Tsunami

      Nilesh Jasani

      November 8, 2025

      The Unscripted Curve
      Innovation never follows a script. It accelerates, stumbles, and surprises. Two years ago, a leaked Google memo warned that open-source models would someday erode the moat that seemed to protect the proprietary LLM developers. For a while, the proprietary developers defied this prophecy as they raced ahead in capabilities and reputation. By now, the memo appears more right than the worst-case projections of the time. More importantly, few could have predicted the kind and the quantity of parties that would seize the moment.

      A year ago, the idea of Chinese leadership in AI sounded implausible. Our series of articles, starting with the one titled A Quiet Surge: the Rise of Chinese AI innovators generated more doubts about the content than reviews of the landscape. Even after DeepSeek, the world assumed China’s models were derivative or copies, with unprovable, exaggerated claims, its labs constrained by sanctions and costs.

      That view now feels dated. By mid-2025, Chinese developers released more public LLMs than anyone else. They have begun to dominate download charts. DeepSeek, Qwen, MiniMax, and Kimi are no longer marginal names. This reversal began before the latest releases that appeared in recent weeks. When we began writing the article earlier in the week, eight of the ten highest-scoring open models were Chinese. If one can believe it, the lead has since widened! While we were writing, just on this Friday, we got the first open-source model, this one from Kimi, that claims to surpass the capabilities of the best proprietary models across a list of popular benchmarks. The change is no longer just about algorithms. It is about how the economics of AI will undergo more changes in 2026.

      All of the innovations of the Chinese labs, operating under constraints on compute resources as a result of US export controls on advanced semiconductors, are intended to achieve performance similar to that of leading US models, but w/ greatly reduced resources (compute, memory, power, model size, etc.). These innovations have eased & will continue to ease the diffusion & application of LLMs in real life, which is what the PRC government & most Chinese companies working on “AI” are focused on.

      None of these innovations (whether from the Chinese or US labs) move LLMs closer toward the “AGI” promised land.

      Reply
    44. 44.

      Matt McIrvin

      @Searcher: They’re trying to effectively create a model of anything in the universe based on patterns in the structure of the things people write about it. It’s amazing it works as well as it does, and that’s kind of the problem I guess, the amazingness is leading people astray.

      Like, is it possible to achieve subject mastery in anything JUST by reading the literature, without practice? Even if you have a superhuman ability to read and absorb everything? That’s kind of what we’re asking here.

      Reply
    46. 46.

      Eyeroller

      @YY_Sima Qian: A couple of years ago at a large international conference, I happened to sit at a luncheon with a guy from a company in the US that is working on fusion technologies.  They were spun off from some research lab, I can’t remember which one though I could probably figure it out (I don’t think it was Livermore).  They seemed to have a good grasp of the issues and there seem to have been some significant advances in magnetic confinement. But I haven’t looked into the status lately.

      Reply
    47. 47.

      Eyeroller

      @Matt McIrvin: And what happens when the bulk of that literature turns out to be wrong?  That happens all the time over the history of science.  (“Wrong” is usually too strong a word, “inaccurate” is probably better, although occasionally the consensus does turn out to be just plain wrong.)

      Reply
    48. 48.

      Matt McIrvin

      @Eyeroller: The thing that disturbs me the most about this is the way it flips the efficiency comparisons I’m used to on their head. Humans are not very efficient at being computers: for doing arithmetic or simple programmed logical tasks, an electronic computer is not only vastly faster and more accurate but vastly more efficient. But in THIS domain, the machines are faster but less accurate and less energy-efficient than humans and we’re just brute-forcing it with MOAR DATACENTER. That suggests to me we’re barking up the wrong tree.

      Reply
    49. 49.

      Eolirin

      @Eyeroller: What happens to humans when that’s the case?

      I don’t think that’s a good criticism of these systems.

      I think we’re going to run into an issue where most of the things we want these sorts of models to do are already highly abstracted and intellectual. So they’re already only expressible in language, and faking that gets you really close. But the details will matter for outcomes based on the quality of those outputs. And that’s where the real risk of another AI winter comes from. The output accuracy needs to be good enough often enough to be useful and if that can’t be accomplished cheaply enough to make people a *lot* of money that’s it.

      If it can, well, whether it’s reasoning or not? Whether it’s wrong some of the time? Shrug.

      Reply
    52. 52.

      Matt McIrvin

      @Eolirin:

      What happens to humans when that’s the case?

      Humans can explore the actual world, perform experiments and observations. Machines may one day be equipped to do that on their own initiative too, there’s nothing in principle stopping them, but an LLM is just concerned with language.

      I suspect personally that any legit case for machine intelligence is going to involve this kind of direct engagement with the world, or a world, not just by mimicking the output of human engagement with it. But what results might behave much more strangely than a chatbot.

      Reply
    53. 53.

      Eolirin

      @Matt McIrvin: I’m talking about the specific case in which humans have generated incorrectness in a whole corpus of knowledge though. What’s happened historically when that’s the case is that a lot of people are very wrong for a very long time and resistant to evidence to the contrary.

      You can’t separate LLMs from the humans using them, though. LLMs get updated by people who can figure out that, oh shit, our scientists actually got this thing really wrong for a long time, there’s recent evidence that it’s something else, and we need to update our shit so the model outputs that correctly. It’s not static and it’s not limited to just its own context.

      Just like the people who correct the incorrect information that’s embedded itself? They’re updating all the other people who are still stuck on the bad knowledge.

      Reply
    55. 55.

      Carlo Graziani

      @Eolirin: The business/economics issue appears to be that nobody has figured out how to make money on AI. Revenue, yes, but profit? Nobody is close, nobody has a clue, and billions of dollars are burning in a cheerful, crackling fire in back as the titans of industry sort out whether profitable AI is even a thing.

      I’ve tried to mostly stay away from the business/economics of AI, because I’d prefer to keep these essays to things that I believe I understand as well as anyone, and I don’t know shit about business. But I do find Ed Zitron persuasive on this topic. He does the accounting. The money only goes in from investors, and out to customers, who do not pay anything remotely close to their compute cost in token usage.

      Reply
    56. 56.

      Matt McIrvin

      @Eolirin: Terry Tao has been involved with work to integrate LLM-like systems into mathematics. LLMs themselves are notoriously terrible at mathematics; their level of language analysis simply cannot approach correct analysis of mathematical ideas. But suppose it’s working with some kind of mathematical markup instead of human language, and its statements are verifiable and can be automatically checked? Automatic proof checkers and assistants are already a highly developed field. That strikes me as potentially much more interesting.

      Reply
    57. 57.

      Eolirin

      @YY_Sima Qian: Yeah and so what? Like eventually we may have embodied AI running its own experiments, but that’s not even the point of these technologies.

      They’re there to be productivity aids. That’s what they’re being sold as and that’s what they’ll live or die on. It really doesn’t matter if they’re intelligent to whatever definition of intelligence you want to use. It matters if they’re useful.

      Reply
    58. 58.

      Searcher

      @Matt McIrvin: See, I think it’s even worse than that.

      I think it’s an interesting question how far you could get building a model of the universe from written text, where you are using these transformers to map text into an actual world model.

      And I’m sure there are plenty of people out there working on doing just that.

      But all the applications that are out there now aren’t doing that.  They’re just skipping the world model and relying on the language model itself to fake it.  We need these million-token context windows because the “AI”s you’re interacting with never actually build a model of anything they’re told, they just predict the next token.  If it actually understood, context windows wouldn’t matter.

      And it’s really good at faking it up to a point.

      Which makes it really hard to justify doing the much harder work, when what’s already finished looks very convincing if you don’t look too hard.

      Reply
    59. 59.

      Eolirin

       

      @Carlo Graziani: I am aware, but I’m saying that’s what the actual challenge for these companies is, it isn’t any of the philosophical or academic parts of this.

      If they can’t fix that, this is all going to go away. If they can, no one is going to care about the other things. Well. Outside of academics. I think it’s good to point out that there’s a lot of bullshit being said about this stuff. It’s just that it’s ultimately not going to be what determines its success or failure.

      Like someone managed to successfully use Claude to run a series of successful cyber attacks on multiple companies, and it did pretty much the entire chain of work, from picking the companies to writing the malware code to composing the blackmail letters used to extort them. They’re already capable of successfully accomplishing things with relatively minimal human intervention.

      They’re just too inefficient and too frequently wrong about details for mission critical work to be profitable, and also the companies behind them are chasing growth, so they’re massively overspending and putting out product at far too low a cost. It’s an open question whether that can get fixed without a crash. I’m pretty pessimistic myself.

      Reply
    60. 60.

      Carlo Graziani

      @Matt McIrvin: Colleagues of mine who use LLMs for coding have discovered workflows that result in considerable effort savings. The required elements are (1) huge investment in prompt engineering, and (2) test suites capable of automatically validating generated code.

      So, if you are a domain expert, and have the tools to validate LLM output, and the willingness to guide and check and refine every prompt, it turns out that LLMs can actually accelerate code development.

      Compare that statement to the average ChatGPT user’s prompt/response experience.

      Reply
    61. 61.

      YY_Sima Qian

      @Eolirin: I don’t think anyone is disputing the usefulness of LLMs as tools. Carlo said that at the beginning of his 1st essay.

      Their usefulness will indeed determine the economic viability of the eye-watering data center investments. However, even at the application level I suspect the investment is well ahead of actual demand, since the US government’s & corporations’ single-minded focus on racing to “AGI” has meant far less emphasis on diffusion of “AI” through the rest of the economy, & that’s what drives demand & provides feedback to improve the applicability of the LLMs. Unlike the railroad or the optical fiber network bubbles, the exorbitantly expensive Nvidia GPUs in the data centers are obsoleted much faster (~ 3 yrs.), meaning much shorter time horizons to obtain returns on the investment.

      Even judged on the terms of what the LLMs actually are, increasingly powerful & useful tools, as opposed to the utopian promises attached to them, the business models of the closed US labs are setting themselves up for disruption. See the article I posted above, & as Tracy Alloway puts it succinctly:

      Tracy Alloway @tracyalloway
      The number of major US companies building on Chinese open source AI models (Qwen, Kimi, etc.) is really underappreciated
      And the corollary is that Wall Street still out there valuing US AI like it’s a $5,000 espresso machine with a supposedly infinite potential market. Meanwhile China’s already been handing out free Nespresso pods to anyone who’ll have them
      Which is interesting because it means China is effectively pursuing a commodification strategy for AI (and makes the real US-China tech competition about power availability rather than model or data sophistication

      Poe Zhao @poezhao0605
      Exactly right. Cursor’s “proprietary” model thinks in Chinese. Windsurf runs on GLM. Both are billion-dollar startups building on Chinese foundations they won’t acknowledge. Chinese coding models now match Claude at 8% of the price. The solar panel playbook is running on software. Wrote about it here:

      China’s Solar Panel Playbook, Now Running on AI
      Chinese AI labs are commoditizing coding models with the same strategy that reshaped solar and batteries. Western startups are adapting–by building on Chinese foundations they won’t acknowledge.

      POE ZHAO

      NOV 03, 2025

      Damien Ma@damienics

      As I said during DeepSeek, cheap AI wasn’t on the 2025 bingo card. By end of year, LLM models will elicit yawns. The competition is on the product layer and robotics. That killer product(s) can’t just be a better on-ramp to the Internet with fancier (intelligent) features.

      Now, perhaps the innovations that the open Chinese models bring could actually help rescue the economics of the massive data center investment, by dramatically reducing the compute needed for training & inference, & thus keeping the “obsolete” GPUs viable. The latest Kimi K2-“Thinking” LLM from Moonshot AI (deemed at the frontier w/ the best closed models from OpenAI & Anthropic) was trained on “obsolete” H800 GPUs, a nerfed version of the Nvidia H100 GPUs, which in turn are a generation behind the B200 currently on sale. OpenAI & Meta can always rent out their compute even if their closed LLMs eventually do not find enough takers.

      Of course, all of the data centers in the world will not matter if the power infrastructure does not catch up…

      Reply
