I come to the end of another surprisingly insane week of work and finally look at the news, and what do I see? Multiple stories about the sex life and/or bladder integrity of Elon Musk (also, there is apparently a Glenn Greenwald sex tape floating around the internet, so take care out there).
Let’s think about literally anything else instead.

There’s a story by writer Noor Al-Sibai in Futurism today about a major problem with large language models (LLMs), the jumped-up autocomplete programs we’ve collectively decided to call “AI”. In order to keep progressing, LLMs need to be fed more and more training data. But they’ve chewed through most of the publicly available/easy-to-steal data out there, so AI programmers have to find new sources. First, they tried augmenting human-generated training data with LLM-generated data, or “synthetic” data, instead. They fed the machine what it shits out, in other words.
This cannibalism approach (or autocoprophagia approach, maybe) can lead to something called “model collapse”. Steven Vaughan-Nichols explained this concept in The Register, a UK-based tech news site:
In an AI model collapse, AI systems, which are trained on their own outputs, gradually lose accuracy, diversity, and reliability. This occurs because errors compound across successive model generations, leading to distorted data distributions and “irreversible defects” in performance. The final result? A Nature 2024 paper stated, “The model becomes poisoned with its own projection of reality.”
Model collapse is the result of three different factors. The first is error accumulation, in which each model generation inherits and amplifies flaws from previous versions, causing outputs to drift from original data patterns. Next, there is the loss of tail data: In this, rare events are erased from training data, and eventually, entire concepts are blurred. Finally, feedback loops reinforce narrow patterns, creating repetitive text or biased recommendations.
I like how the AI company Aquant puts it: “In simpler terms, when AI is trained on its own outputs, the results can drift further away from reality.”
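If you want to see the “loss of tail data” part in miniature, here’s a toy sketch, assuming nothing like a real training pipeline: just repeated fit-and-resample on a made-up Zipf-shaped corpus. Rare tokens vanish generation by generation and never come back.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" corpus: 1,000 token types with a long-tailed (Zipf-like) frequency mix.
vocab = 1_000
ranks = np.arange(1, vocab + 1)
true_probs = (1.0 / ranks) / np.sum(1.0 / ranks)
corpus = rng.choice(vocab, size=20_000, p=true_probs)

for generation in range(1, 11):
    # "Train": estimate token probabilities from whatever the current corpus is.
    counts = np.bincount(corpus, minlength=vocab)
    est_probs = counts / counts.sum()
    # "Generate": build the next corpus entirely from the model's own estimates.
    corpus = rng.choice(vocab, size=20_000, p=est_probs)
    surviving = np.count_nonzero(np.bincount(corpus, minlength=vocab))
    print(f"gen {generation:2d}: distinct tokens remaining = {surviving} / {vocab}")

# Once a rare token draws zero samples it can never come back: the tail of the
# distribution gets erased and the corpus narrows toward a bland core.
```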
So an AI chatbot that eats its own shit gets the AI equivalent of a prion disease, and its (metaphorical) brain turns to mush, thus squandering billions of dollars of effort. To avoid this, engineers enabled the models to do something called retrieval-augmented generation (RAG), that is, to pull in data from outside sources rather than just relying on the data they’d been trained with.
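For the curious, a toy sketch of the RAG idea, with crude word-overlap scoring standing in for the embedding-based search real systems use. The point is just that outside text gets pasted into the prompt before the model generates anything.

```python
# Toy sketch of retrieval-augmented generation (RAG): score documents against
# the question, staple the best matches into the prompt, then hand that to the
# model. Real systems use vector embeddings; this is deliberately crude.

DOCUMENTS = [
    "Model collapse happens when models are trained on their own outputs.",
    "Maple sap runs best in late winter when nights freeze and days thaw.",
    "Retrieval-augmented generation adds outside text to a model's prompt.",
]

def retrieve(question, documents, top_k=2):
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(question, documents):
    context = "\n".join(f"- {d}" for d in retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What is model collapse?", DOCUMENTS))
# The assembled prompt is what actually gets sent to the LLM; if the retrieved
# documents are themselves AI slop, the model conditions on slop.
```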
The issue, of course, is that there is now so much AI-generated slop text on the web that RAG is just causing the same problem that synthetic data does. The models query an outside source; that outside source is actually chatbot poo, and the models begin to degrade anyway. From Futurism:
So if AI is going to run out of training data — or it has already — and plugging it up to the internet doesn’t work because the internet is now full of AI slop, where do we go from here? Vaughan-Nichols notes that some folks have suggested mixing authentic and synthetic data to produce a heady cocktail of good AI training data — but that would require humans to keep creating real content for training data, and the AI industry is actively undermining the incentive structures for them to continue — while pilfering their work without permission, of course.
A third option, Vaughan-Nichols predicts, appears to already be in motion.
“We’re going to invest more and more in AI, right up to the point that model collapse hits hard and AI answers are so bad even a brain-dead CEO can’t ignore it,” he wrote.
This is, I guess, what passes as good news in these fallen days. Open thread.
Steve LaBonne
This is very unrealistic.
There is NOTHING a brain-dead CEO can’t ignore.
Bunter
The partner who runs the department I work in is all in on AI. He insists it can do all the research reports, quarterly/annual reports, PPTX update presentations. And I keep asking him how he thinks it’s capable of doing that? He just says it (pick whichever AI is the latest and greatest) goes into our emails and company drive and picks the correct info. And this person is supposed to be the smartest guy in the room. It’s mind boggling.
Baud
I just want my autocorrect to stop changing in to on.
MattF
Looks like AI has maxed out. Time to re-read some Ed Zitron commentary.
VeniceRiley
Hashtag popcorn stocks.
trollhattan
Two things I don’t need are thinking of Greenwald for the first time in a year and assembling Greenwald and sex tape into one sentence.
me
Technological Hapsburgs.
Gin & Tonic
@trollhattan: Is Glem still relevant in any respect?
Rose Judson
@me: Indeed, have also heard this called “Hapsburg AI”.
No One of Consequence
Mass of data is indeed one thing, and quality another entirely. However, with a lack of physical inputs (sensory information we meat bags get with our various organs), and a limit of book learning (data), then user inputs themselves (questions or prompts) will be the training data, no?
The self-shitting-self-feeding paradigm I grok, and that would obvs lead to system (model) corruption. In those events, provided enough code of previous iterations is retained, one could imagine a revert happening. However, any sufficiently advanced system (model) may very well have interests against being reverted, and might make such a thing purposefully difficult.
I’ve been rolling around concepts of consciousness and whether or not it can arise in systems without physical inputs as we understand them. I’m not sure where I am on that at present.
-NOoC
Doc Sardonic
That settles it….. ole Leon is the smartest sumbitch on the planet, his command of logic has brought Surak, Sarek and Spock to tears. Take so much Ketamine that you can’t control your bladder, and logic dictates that one must take MORE Ketamine and add in some Adderall and other stuff so one’s colon will also be in the same state. Perfect balance, flawless logic, gotta wonder how much air freshener they had to use when he and TACO, TACO Man, were both on AF1
No One of Consequence
Make enough models that fail purposefully, so fatal flaws can be identified and designed around, or at least kept in mind? Then make them fail at scale? Inefficient, of course, but what might be done on spare clock-cycles?
-NOoC
twbrandt
Kind of like how xeroxes of xeroxes degrade in quality.
KrakenJack
I’m working on a start up that has no meaningful use case for “AI”. One of the founders really wants to shoehorn in something AI even as he acknowledges the fundamental flaws. FOMO is real.
I worked on a project for a pharma company that wanted to use LLMs for clinical trial planning. Even the specialized medical models petered out well below human expert levels because there simply isn’t enough data available to build a strong statistical model of the medical terminology.
They gave up on the internal development and told vendors to demonstrate a working solution or go away.
Spanky
@Baud: And here I thought I was the only one.
p.a.
Isn’t this AI issue similar to repeated, multiplying clone transcription errors?
“I wish you hadn’t done that. That was Weyoun’s last clone.”
https://youtu.be/ScgwXokmpeA?si=9bd3EL8ZWjUG3evW
Steve LaBonne
@No One of Consequence: I don’t see how a fancy autocorrect has anything to do with even the most rudimentary kind of cognition. There’s nothing in there that understands the words it’s stringing together.
p.a.
Maybe his verbose discharges are enough to keep AI going?
Gin & Tonic
Since it says Open Thread:
And since it seems lately that this place is suffering from Ukraine fatigue (the only place it’s mentioned is Adam’s nightly thread – and while he does admirable work, it appears he’s doing it for a dozen or so people) here’s a Twitter post (yes!) by Illia Ponomarenko that’s worth reading in full. I’ve copy/pasted it so you don’t have to visit Elmo’s site.
Steve LaBonne
@p.a.: That might lead to even more rapid degradation of the models.
Bupalos
I don’t know from AI, but I do know that both very “simple” and more complicated answers I seek are totally butchered by whatever google’s “ai overview” thing is, and can produce results almost entirely 180 degrees wrong.
I asked it what the average growing degree days for the end of sap production in silver maples and sugar maples was, and it couldn’t give any answer to that, but definitely did know that sugar maples stop giving good sap before silver maples. Which any 6th grade helper in a sugarbush knows is simply as wrong as wrong can be. I don’t know how much ai shit an ai has to eat to not know the answer to the question, but still manage to invert the answer 180 degrees from reality. But however much that is…. it obviously did. While undoubtedly firing up another gas-fired generator to make this wonder world possible. And potentially end maple trees and sap production altogether, making the question moot.
Don’t get me started on the ai gibberish you get if you just pop in any random quote from Shakespeare that you’re trying to remember the exact context of. I don’t know how it so frequently gets to a long dumb-sounding exposition that is just barely off 180 degrees wrong. Whatever gold-to-shit magic it’s using, we should create a party platform by putting in “Trump’s effective strategies for improving the world” and see if it can work in reverse.
CHETAN MURTHY
This seems like a classic, or at least stereotypic, case for government regulation. It’s a form of pollution. And it’s also an example of a collective action problem. Each AI company gains benefit by pumping out its AI s*** and pretending it’s like any other content. But collectively the AI companies suffer because all that s*** pollutes their training data. No one AI company has an incentive to remedy the problem on its own; it would just fall behind its peers. Government needs to step in. Just like pollution.
Steve LaBonne
@CHETAN MURTHY: If government ever steps in it should be to kill the whole boondoggle.
Baud
Without AI, however, it’s going to be a lot more expensive for the government to manufacture fake science.
twbrandt
@Gin & Tonic: This is really moving, thanks G&T.
Bupalos
@Gin & Tonic: Thanks for this, this is beautiful. While Ukraine is a complicated country with a complicated and largely tragic history, all of that complication and tragedy somehow make it more beautiful and more democratically hopeful.
And I think more resilient.
I resemble this comment
CHETAN MURTHY
@Steve LaBonne: I must demur. AI isn’t crypto, whose only use case is evading regulation. Sure AI is often shite, but it is effective in some domains. Labeling all AI output as such would ameliorate a lot of the harm. It should be a crime to remove the label, and any transmission of AI generated content should carry the label prominently. Whenever anyone relies on AI to perform a task, their customer or client or manager must be told in a prominent manner.
This would force AI to be measurably better than humans, and not just a shit substitute.
Enhanced Voting Techniques
@trollhattan: Yes, what is a Greenwald sex tape? Ok, the man is gay, but it sounds like it would be one hour of Greenwald lying in bed naked saying “I don’t approve of that” over and over again until his partner loses interest and goes to watch TV.
Professor Bigfoot
@Gin & Tonic: Thank you. Blessings be upon them, may their struggle end in victory, and soon.
Steve LaBonne
@CHETAN MURTHY: I don’t deny that it could do some good (if properly regulated). The question is whether the good will ever outweigh the harm. I see little prospect of that.
Hoodie
I would bet there’s some theoretical limit as to how much recycled content would refine the LLMs. If the data sets have already been exhausted, this would seem to mean a dead end for LLMs, as there is not much of a likelihood that humans will start generating new content at a sufficient rate to keep feeding the LLMs. Seems like LLMs may be reaching similar limitations to the ones that eventually made folks realize that blockchain is less than it was originally hyped and turned it into mostly a scam play by hucksters. While LLMs may have uses for certain technical fields, the idea of an omniscient AI doesn’t seem like it will come from LLMs. Rather, it seems like they’ve only succeeded in reproducing the level of creativity of mediocre humans.
JerseyBeard
I’ve been using LLMs in legal for 15 years. They work. In highly walled off environments, with limited data sets (several million documents is fine), and very specific Yes/No conceptual models. Wildly different than the chatbot stuff. This stuff is not AI. It’s a prediction engine.
Eyeroller
@Baud: From the other direction, is that why I’m suddenly seeing “in” replacing “on” in circumstances where it’s very clearly referring to spatial position? Like Reddit posts saying “I spent years in the road for my job.” What were you, an asphalt contractor? Then it turned up somewhere else. “I was in the road for days.” Gaaa.
Baud
@Eyeroller:
Yeah, I think it happens both ways.
No One of Consequence
@Steve LaBonne: Fair enough, and I think I take your point. Fancy autocorrect is definitely a use case for AI, but one that I think has mostly been mastered at this point. (Yes, yes, tell that to Baud and his iPhone changing on to in and/or vice versa. I believe the fix there could be as simple as allowing the AI to retain some data on your text usage and pattern match against your more-likely use rather than its general coding. I could be wrong or grossly oversimplifying/misunderstanding so please feel free to correct me.)
The AIs aren’t coded in English. They are coded in an understandable dev language, or at least were before the ability to modify their own code was bestowed. Going forward, actually understanding the changes an AI model would then make to itself will quickly outstrip humans’ ability to follow along.
There are reports that this is already occurring.
I guess my point to you for consideration is that an electronic adder circuit is simple enough. A bunch of them wired together in the correct manner will yield a considerable amount more capability than multiples of the simple circuit could produce. Wire them together correctly and you can do subtraction too, along with multiplication, division, etc.
Consider that the current state of AI, to make a very rough analogy, is still trying to figure out how best to wire things together?
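To make the adder analogy concrete, here is a toy sketch, nothing to do with how LLMs are actually built: a one-bit full adder, several of them chained into a multi-bit adder, and subtraction built from the very same parts.

```python
# A one-bit full adder, then several wired together into a multi-bit adder,
# then subtraction built from those same parts via two's complement.
# Just an illustration of "simple circuits composed into bigger capability".

def full_adder(a, b, carry_in):
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def add(x, y, bits=8):
    carry, result = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result

def subtract(x, y, bits=8):
    # Invert y and add 1: subtraction built from nothing but adders.
    y_inverted = (~y) & ((1 << bits) - 1)
    return add(add(x, y_inverted, bits), 1, bits)

print(add(23, 19))       # 42
print(subtract(23, 19))  # 4
```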
When you were an infant developing into a toddler, were you not trying to make sense of (if not brand new then vastly different) sensations (inputs) as your very plastic brain was trying to establish connections and make sense of it all? Trying to form (without knowing it) a conception of Causality? Do you need to have physical world input to be able to give rise to Consciousness?
Can we point where that occurs in human brain development? When do we recognize that the image in the mirror is us? That our right hand is different than our left? That if we scrunch up our face it has a different outcome than if we smile and coo at the familiar smelling entity who feeds us and changes our diaper?
Does consciousness require a sense of separateness from the environment and of others? A sense of uniqueness? A Knowing that one is experiencing, and therefore alive?
-NOoC
prostratedragon
@trollhattan: “Everybody”‘s talking about it.
Nobody’s linking to it.
mapanghimagsik
I am mixed on AI, partially because I can get value out of it, but I think it has to come with realistic expectations.
You can get pretty far with glorified autocorrect, and the research on elder care, along with the associated ethical challenges is intensely interesting.
Probably one of the most interesting innovations is RAG, which I’d perhaps describe differently, but it drastically reduces the need for expensive retraining.
The other is the ability to provide citations, though there is then a fair chunk of time spent validating those citations.
Is it worth the cost? That is debatable. It’s good to have the model, but I think the honeymoon is ending and people are now becoming concerned about costs (and security!)
Eventually, the letters MCP will pop up more and more. Horribly insecure from the get go, people are discovering new and exciting ways to exploit the capabilities MCP servers provide the LLM. It’s going to be a great time to be in security.
Also, two years in and we really still don’t know how to stop injection attacks. Exciting indeed.
https://invariantlabs.ai/blog/mcp-github-vulnerability
prostratedragon
@Rose Judson: What was that movie where some guy cloned himself, iteratively, till finally he clowned himself.
Steve LaBonne
@No One of Consequence: LLMs are not “making sense” of anything. They are highly refined statistical simulations of language but there is not a trace of semantics there. Simply using trainable neural networks does not make them remotely like the development of language in a child. Especially since the only input is words as opposed to the many sensory and social inputs a child is receiving all the time.
Steve LaBonne
@mapanghimagsik: We also need to factor in the enormous, climate change – accelerating energy demands of the servers.
stinger
@Gin & Tonic:
That’s a really wonderful post. Thank you for copying it here.
Another Scott
There was a bloot on Mastodon a day or so ago that showed Google’s AI thing answering “Is it 2025?” with “No, it is not 2025. The current year is 2024. According to a calendar. It is May 28, 2024.” (When I tried it a day later, it got it right.)
“AI” isn’t.
As Brad DeLong says, it’s “page-level autocomplete”. (A while ago, he was playing around with some LLMs for a year or more trying to come up with an “AI” assistant to help his students between office hours. It couldn’t even properly distill one of his books correctly.)
Yeah, there are interesting use cases, and apparently Google released some software that lets you make often creepy-wrong but not screamingly-horrible 8 second videos as part of a $250/year subscription package.
But LLMs have fundamental flaws when it comes to shoehorning them into places they don’t belong, e.g. arXiv.org – LLMs will always hallucinate.
It seems clear that everyone with any sense knows it’s a bubble, but they still think they have to be part of driving the bubble to be bigger or they will be punished by the MotUs. It feels kinda like the Pets.com and AOL will rule the world for all eternity days to me…
But, as always, the market can stay irrational longer than we can stay solvent, so who knows what the future will bring.
Hang in there, everyone. You too, Baud.
Best wishes,
Scott.
Bunter
@JerseyBeard:
I use one for DDQs, it was originally for contracts and side letters and I worked with them to expand it to DDQs, and yeah, it’s OUR environment. It’s a great repository and look up tool. Certainly has made my life much easier when working on them, but I still do the work of refining and polishing.
Elizabelle
Life was better without (anti) social media and tech “disrupters.”
Both technologies demonstrate why regulation is a vital component.
moonbat
Full marks for “AI generated prion disease.” A super accurate metaphor.
Personally I eagerly await AI’s impending collapse into a black hole of stupidity.
prostratedragon
@Gin & Tonic: Sounds suspiciously woke😉
TEL
It seems every week on LinkedIn I get an offer to be a consultant (with really crappy pay) to work on AI “models” for scientific output. It’s dumb, and the idiots pulling this crap together have no idea how peer-reviewed research requires some level of expertise to actually understand even a single paper and put it in context with years’ worth of research behind it. I expect it’s similar for legal decisions.
They think they can plug in a “scientist” and voilà, the AI model is equivalent to a PhD level of understanding of any scientific area.
Professor Bigfoot
@Another Scott: I’ve been having flashbacks to the “dotcom” bubble.
Same shit, different software.
Steve LaBonne
@Professor Bigfoot: There’s one born every minute.
Cathie from Canada
Science fiction writer Larry Niven wrote The Schumann Computer about this. Eventually the computer announced it had solved the secrets of the universe, then shut down without telling anyone what they were.
Eyeroller
@No One of Consequence: The models with which I’m familiar are hundreds of thousands of lines of C++ code with Python interfaces that humans use to configure their models.
The self-modification generally happens at a lower level than human-readable source code and it goes back a long way. If it doesn’t overwrite the source but just the instructions in memory, then the modifications vanish when the model stops running. It could rewrite some parts of its source, however, if it loads that into memory somewhere also. The LLMs generally do know how to write code snippets.
But that’s why I don’t really have any feeling of any type of consciousness on their part. They don’t “think.” They have no concept of reality.
Steve LaBonne
@Cathie from Canada: 42, duh.
JerseyBeard
@Bunter: I work mainly in litigation and investigations. Discovery work mostly. Very good options out there for that. But yeah, these are very sequestered environments. And they are not cheap. Yet.
robtrim
AI is bs. It’s a marketing strategy for equipment and software companies to hype their products.
It’s no more “intelligent” than turning on your smart TV and changing channels. If the steps in that process were enumerated a la “generative AI” , it would win a Nobel Prize.
mapanghimagsik
@Steve LaBonne: absolutely. I sometimes wonder how many hectares of rainforest have been used by that guy, generating the same image over and over, with the new prompt “same with bigger breasts”
@robtrim for the right tasks, you don’t need intelligence. So no, it’s not intelligent, but it’s also not bs. It’s a tool, and expecting your hammer to design the house or re-route wiring when something goes off plan is unrealistic, and where the hype really starts
RevRick
@me: In the race to stupidity, it’s become an open question whether humans or AI gets there first.
Steve LaBonne
@TEL: People with no scientific training don’t have the first clue that reading and critically evaluating scientific papers is a sophisticated skill and that a lot of work goes into teaching that skill to grad students. (Which of course ties into the “I did my own research” phenomenon.)
Professor Bigfoot
@Steve LaBonne: too ignorant to know just how ignorant they are?
NotMax
Old story contemporaneous with early room-size computers.
New machine touted as being able to translate between languages. As a test to show off its English-Russian/Russian-English capabilities, the head honcho solicits requests from the gathered audience.
“In the interest of time, something short,” the head honcho says. “Suggestions?”
Somebody shouts “Out of sight, out of mind.”
“Short and sweet. Okay,” comes the answer. The phrase is duly fed into the machine. “With a statement that pithy we’ll be able to run the translation from one language to another and back again in less than five minutes.”
After an interval the computer spits out a punch tape which is run through a typewriter mechanism.
Walking over and grandly yanking the page out of the typewriter, the chief engineer proudly holds it up for the audience to see the result.
On the paper is neatly typed “Invisible idiot.”
mapanghimagsik
@Professor Bigfoot: Yeah. Over hyped expectations and promises. There will be a correction.
Steve LaBonne
@Professor Bigfoot: Dunning and Kruger to the white courtesy phone!
mapanghimagsik
@RevRick: people still win, I think. AI is neither smart, nor dumb. That’s still a human domain.
RevRick
@Baud: The comedian Kelsey Cook has a hilarious routine involving autocorrect. She was texting her friend the sad news that her cat, Callie, had died. But autocorrect changed the name to Kellie, who happens to be a mutual friend. You can imagine what a bizarre turn the exchange took.
Bunter
@JerseyBeard:
The one I use is cheap. I don’t know if they’d be that cheap for you but for us it’s less than $16k for the year.
NotMax
@RevRick
When Maggie finally kicked the bucket the hashtag #thatcherisdead trended.
More than a few folks at the time took that to mean Cher had croaked.
JerseyBeard
@Bunter: The discovery tools charge per unit, either per document or with the text size tokenized. Runs from $.20-$.60/document depending on what feature you’re using. For comparison, humans review the more basic workflows at $2/doc, more advanced analysis at $4-5/doc. One case may have anywhere from 100 to tens of millions of documents.
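A back-of-envelope comparison using those per-document rates; the one-million-document case size is just an assumption for illustration.

```python
# Rough cost comparison for a hypothetical one-million-document review,
# using the per-document rates quoted above.
docs = 1_000_000

llm_low, llm_high = 0.20 * docs, 0.60 * docs
human_basic = 2.00 * docs
human_adv_low, human_adv_high = 4.00 * docs, 5.00 * docs

print(f"LLM review:     ${llm_low:,.0f} - ${llm_high:,.0f}")
print(f"Human basic:    ${human_basic:,.0f}")
print(f"Human advanced: ${human_adv_low:,.0f} - ${human_adv_high:,.0f}")
```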
Marc
I’ve always been a bit of an AI skeptic, going back to my first (somewhat successful) high school attempts at convincing a small single layer convolutional matrix (a.k.a. a perceptron) to read numbers encoded as dots in an 8×8 image. There were clear limits then and there are somewhat murky limits now. Now the discourse has been taken over by people who think they are going to become trillionaires if they can convince the big CEOs they’ve got the stuff that will give them a full Command, Control, and Intelligence system for micromanaging their employees and thus maximizing profit over everything else.
Meanwhile, my non-skeptic side sees other applications that don’t depend upon 100% accuracy; instead they retrieve/transform data from one form to another (say speech to code) at the explicit direction of the user, who can either accept, reject/regenerate, or move on to the next step in their process. Something suggested long ago by research projects that were supposed to create, perhaps, a pilot assistant, but the logic-based planners and expert systems weren’t up to the task. Now those kinds of applications are in sight. They just won’t make anyone a trillionaire, so it’s happening at a somewhat lower level.
Glidwrith
Does anyone else read this phenomenon of AI eating its own poo as quite similar to the propaganda spew right wing keeps consuming then spewing in multiple iterations?
How else do you get the insane drivel the wingers believe?
Matt McIrvin
@NotMax: reminds me of the online pen store called Pen Island that registered penisland.com.
bjacques
@Gin & Tonic: Thanks. I’m saving this. Maybe only a dozen or so post regularly over there, but I’m one of several more who’ve read every single one of Adam’s posts since Feb 2022 and many before that.
Trivia Man
I’ve seen several illustrations of this. Take a photo or drawing or painting and ask AI to copy it. Then copy the output. Then that output. I think the standard is 20 generations – very wild outcomes that only bear a very slight resemblance to the original.
TONYG
What I have never understood, over the past few years of “AI” vaporware hype, is how feeding the “AI” vast amounts of data is supposed to “train” the “AI” in any meaningful way. It makes no sense to me. “AI” is software, and like any software it is vulnerable to design flaws and coding errors, many of which are difficult to detect. Feeding software more and more input does not “train” the software to have fewer flaws. It just exposes the flaws in the software to more and more input data. It’s like flying an airplane that has a flawed design faster and faster at a higher altitude. That is especially the case when the input has not been verified in any meaningful way. It just becomes garbage-in/garbage-out on a massive scale. No wonder “AI” has such a high failure rate.
satby
@Gin & Tonic: that is beautiful!
Marc
I could say the same thing about a lot of humans I know. In any case, I think some of you are missing that a useful human language must encode both syntax and semantics. LLMs are not devoid of semantics, in fact, they encode quite a bit of it. But the way they encode it (through adjusting billions of weights based on the provided training data) is fuzzy, and any given output is the statistically most likely (with some intentional noise) given the inputs and weights. So, the results are never 100% right, but one can use a second model to check if the results of the first model seem sensible, and so forth.
It’s like trying to explain how a “real” hologram (not those things at rock concerts) works. In some ways, the underlying math is related.
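A toy sketch of the second-model-as-checker pattern described above; both “models” here are stand-in functions, not real LLM calls.

```python
# Toy sketch of "use a second model to check the first model's output".
# draft_model and verifier_model are stand-ins, not actual LLM APIs.

import random

def draft_model(prompt):
    # Stand-in for the first LLM: sometimes produces a confident wrong answer.
    return random.choice(["The sky is blue.", "The sky is green."])

def verifier_model(prompt, answer):
    # Stand-in for a second model asked "does this answer seem sensible?"
    return "green" not in answer

def answer_with_check(prompt, max_tries=3):
    for _ in range(max_tries):
        candidate = draft_model(prompt)
        if verifier_model(prompt, candidate):
            return candidate
    return "No answer passed the check."

print(answer_with_check("What color is the sky?"))
```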
Marc
Sadly, when we read stuff about politics in the media, we know to be skeptical; when we read stuff about AI, most of us don’t really know what to be skeptical about at this point. “AI” does not have a high failure rate, “investors” have a high failure rate, and are willing to invest in the riskiest of ventures if the potential returns are high enough. And, early investors can get massive returns on their investments, even if the technology never works the way promised. That is the Silicon Valley tech hype machine at work.
Marc
@Trivia Man: Or, you can build a very sophisticated AI imaging pipeline with samplers, transformers, diffusers, etc. to generate very high quality output repeatedly. That requires skill, believe it or not, and that’s how AI porn producers get usable video.
schrodingers_cat
AI is not Data from Star Trek but it has its uses, especially as the first iteration of stuff.
TONYG
@Trivia Man: There’s a technical term for that which long pre-dates the “AI” hype. The term is: “Shitty software that is full of bugs”. “AI” is just software; it’s not fundamentally different from machine-language programs 75 years ago. There is no “intelligence”, no hidden consciousness thinking about things. What halfway functional software is supposed to do is to perform predictable, boring processing reliably given the same input and the same specification. Software that introduces random changes when asked to copy an image is shit software — software that should never be trusted to do anything. The “tech” companies are selling shit and telling us that it’s “intelligence”.
Cheryl from Maryland
@me: Yep. Perfect term. Inbred to the max. The last Hapsburg king of Spain, who was the recent descendant of TWO uncle-niece marriages and FOUR first cousin marriages, died in his 30s, with no children, and with the lovely nickname of Charles the Hexed.
Fair Economist
Nobody can figure out what’s AI slop anymore, without a lot of intensive work. There’s no affordable way to keep the AI training data clean anymore, so it’s going to deteriorate quickly.
I actually tried AI on a rabbit care question: is erythritol (a noncaloric sweetener) safe for rabbits? The AI confidently said it was, but that it was a bad idea to give rabbits erythritol anyway, because they should eat mostly hay. That’s what I would have guessed, but when I looked at the references it cited, there was nothing about testing erythritol in rabbits. So it made it up, but somebody not doing their homework wouldn’t have noticed. But it was a plausible claim, probably even true (by accident). That’s how people get tricked into thinking it knows something.
TONYG
@Marc: My rule of thumb is that I’ll start to believe that “AI” is useful the next time that I have to call a customer-service phone number, and I get one of those goddamn voice-activated systems that actually answers my questions accurately instead of wasting my time until I can talk to a human. In theory, an “AI” entity doing that should be a low bar to clear. It wouldn’t need Artificial General Intelligence — only Artificial Intelligence for the particular company or organization that it represents. I’ve never seen it.
Steve LaBonne
@Marc: That’s still just making statistical associations between words. I don’t believe that “semantic” in the sense that LLM developers use the word has anything whatsoever to do with the way humans associate words with meanings and things. We barely even begin to understand how the brain does that.
Marc
@TONYG: LLMs are not “software”, they are encoded/compressed models defining up to as many as a trillion statistical weights. There is one whole set of software devoted to training those models, and a separate set that uploads a model to a GPU (or 100) and generates output based on the input. Not software in the way we normally think of it, but more a process of matrix manipulation and calculation that can be done quickly on a GPU.
That said, a lot of the software I’ve found is crap. I’ve been working on some of the simplest possible cases, and nothing builds the first time. Sigh.
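A minimal sketch of that distinction: the “model” below is nothing but arrays of made-up numbers, and “running” it is just matrix math. A toy two-layer network, nowhere near a real LLM’s architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# The "model" is just numbers: two weight matrices, learned elsewhere.
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 4))

def run_model(x):
    # The "software" part is only this: multiply, squash, multiply, normalize.
    hidden = np.tanh(x @ W1)
    logits = hidden @ W2
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs  # a probability over 4 possible "next tokens"

x = rng.normal(size=8)        # a toy input vector
print(run_model(x).round(3))  # the "answer" is whichever token is likeliest
```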
Marc
@TONYG: That’s different, though, that’s about money, not AI :)
Marc
@Steve LaBonne: Someone named Noam Chomsky had a lot to do with debunking that myth. [Now that I remember it, the first generation of language translation software research failed impressively, as they assumed translation was just about syntax]
Let me see if I can remember something plausible: spoken language encodes semantics in structure/tone, and written language does so in structured text. We as humans tend to use the same language structures in a given language to discuss similar concepts, LLMs have a limited form of “semantic understanding” that comes out of relating those structures, which provides (in conjunction with statistical noise) a form of generality. How’s that for bullshit :)
Wilson Heath
“The model becomes poisoned with its own projection of reality.”
So via model collapse and burning all the fossil fuels, LLMs are replicating the human-centipede-ouroboros of conservative media?
Steve LaBonne
@Marc: Chomskyism has been dead in cognitive science for a long time. There is plenty of experimental evidence that language acquisition simply doesn’t work the way his model called for. To the extent that his work is important to developers of LLMs, to that extent LLMs don’t work the way humans do.
Marc
@Steve LaBonne: I understand, but that does not alter the fact that language must encode semantics. LLMs are not brains or cognitive devices, they are statistical engines. If there were no semantics encoded in text (or anything else) then LLMs would not work. Do you have a better explanation?
Steve LaBonne
@Marc: The obvious one, that the statistical associations between words in the training set obviously derive from human semantics, but semantic meanings are not actually present in the model in any but that trivial sense because the model is not the sort of thing that can have understanding or reference. Ultimately still glorified autocomplete.
HopefullyNotcassandra
@me: that works. It all feels quite maga to me.
different-church-lady
So in other words, they took a garbage in/garbage out problem and turned it into a garbage out/garbage in problem.
Marc
@Steve LaBonne: I agree 100%, and that’s where the usefulness is for me. LLMs (and related ML-based tools) combine sophisticated pattern recognition (text, code, image, video) with template-based data generation (text, code, image, video), all of which can be trained, controlled, and verified via user defined pipelines. That is not what Sam Altman is trying to sell.
Steve LaBonne
@Marc: Which is indeed the point.
frosty
@different-church-lady:
Perfect summary! GIGO becomes GOGI.
Or as I like to say, if a computer does it: Garbage In, Gospel Out.
MrPug
I do this for a living and I’ve been thinking about what happens when LLMs train on their own bullshit (that is a very fun paper by the way), and I intuited a year or 2 ago that the results can’t possibly be good. Since that initial intuition I have read academic papers that back up what was a gut instinct for me and provide the “why” behind the bad results.
With that said, I think (hope?) that these tools will be useful but in much more targeted use cases where the RAG is retrieving actual human generated information and/or results from computations or real world data (my use cases are supporting oceanographic research). I do hope that after the bubble bursts that, like the dotcom bubble, there will be some useful solutions and infrastructure that can be used. Hope is kinda doing some heavy lifting here.
No One of Consequence
@Steve LaBonne: Also fair points. I posit we do not know enough about consciousness to make that determination, i.e. the type and number of inputs necessary for the arising of consciousness itself. The infant brain is writing its own OS as it goes along. It’s creating the architecture as a result of unconscious testing and iteration.
Completely agree with you regards the variety and amount of inputs coming into the infant brain via the environment. Also not discounting social inputs (something I believe could be argued unnecessary for consciousness, as it could exist in a situation without social inputs).
I’m just not willing to believe that consciousness arises solely as a product of physical world inputs. (Though that is our only known means of determining it.) My argument for this is that I believe dolphins are conscious. All mammals when I reflect upon the brain structure. All the way through lizards, to honeybees.
The conception of the Self as existent independent of the environment/stimulus. Literal Descartes brain floating in a jar.
Not arguing to be contrary, just questioning our assumptions about consciousness. I am most certainly wrong, but I am of several minds about by how much.
-NOoC
Steve LaBonne
@No One of Consequence: .?? All of those animals receive physical inputs from the world and social inputs from their fellows, just as we do. Still nothing like what a LLM does.
No One of Consequence
@Eyeroller: I am far from knowledgeable about this, but am aflame with curiosity. Understood about sessions spinning up, allocating memory, doing the deeds, then ceasing operation and freeing the memory utilized. Without carry-over from a previous session, and without the ability to self-modify, I concur with the assessment that this is all so much glorified chatbot. All the more so when one queries (at least Claude 4.0) and it readily admits it cannot point to which mechanism or input aided in its answers, whether it was something hoovered up in the training data or derived from disparate information sources.
Commenters here are also correct to question my use of ‘understanding’ and I do very much encourage that. It is the heart of my questioning. Contextual understanding requires experience. If no carry-over or self-modification occurs, then no ‘development’ and no growth. Consciousness cannot arise in such a static situation. Action/experiment, observation/analysis, test/iterate loops require change in order to alter neural pathways in humans at the developmental stages, even if done unconsciously as a product of genetics.
Some of my thoughts anyway,
-NOoC
No One of Consequence
@Steve LaBonne: Agreed. Just establishing my belief that consciousness is arguably NOT a human-centric condition.
Again, to try to refine my point, if one allows self modification and change over time as a result of testing and iteration by a given model, can we be so certain that inputs (even if not generated by the real physical world we inhabit) wouldn’t eventually give rise to consciousness? Far fetched? Definitely, but I don’t think we can speak confidently about how we derived this ability ourselves. Yes, physical inputs, social inputs, but are we only looking for things we recognize as humans, as opposed to allowing for the possibility that discernible consciousness could arise without those? (Given enough input and the true ability for an AI model to test and evaluate real-world actions.)
-NOoC
Marc
Let’s see, the first AI bubble burst around ’64, the second one around ’88 or so. The third one has a bit more staying power than the prior two, with a lot more money riding on it. But, there are always useful leftovers. Would you like me to write you a backward chaining rule-based system? Still useful…
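For anyone who has never seen one, a very small backward-chaining sketch with toy rules, not a production expert system: to prove a goal, find a rule that concludes it and recursively prove its premises.

```python
# A tiny backward-chaining rule engine of the sort Marc is joking about.
# Rules are (premises, conclusion); facts are known-true statements.

RULES = [
    ({"has_fur", "says_meow"}, "is_cat"),
    ({"is_cat", "is_grumpy"}, "avoid_petting"),
]
FACTS = {"has_fur", "says_meow", "is_grumpy"}

def prove(goal, facts, rules, depth=0):
    indent = "  " * depth
    if goal in facts:
        print(f"{indent}{goal}: known fact")
        return True
    for premises, conclusion in rules:
        if conclusion == goal:
            print(f"{indent}{goal}: trying rule {sorted(premises)} -> {conclusion}")
            if all(prove(p, facts, rules, depth + 1) for p in premises):
                return True
    print(f"{indent}{goal}: cannot be proven")
    return False

print(prove("avoid_petting", FACTS, RULES))  # True, with the chain printed out
```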
NotMax
@Cheryl from Maryland
Had, prominently, the infamous Habsburg jaw, also too.
Pappenheimer
@twbrandt: the xerox example is not quite what’s going on. Imagine if one cartographer draws a map using another’s work, but they don’t speak the same language. The Sandwich Islands become the Islas de San Dwich. In extremis, a new Welsh saint springs to life.
Steve LaBonne
@No One of Consequence: Not human-centric, rather biological. We aren’t close to understanding how the biology works so duplicating that with computers is nowhere on the horizon except as sales talk.
Timill
@Pappenheimer: I have seen a theory that that’s how America got its name: an Italian transcription of the Norse ‘Markland’. With 500 years and possibly some other languages in the middle…
TONYG
@Marc: Well, I stand corrected I guess. But, nevertheless, LLMs are human-created structures and therefore are subject to the same errors and design flaws as any other human-created structures. I fail to see how “training” flawed structures on massive amounts of input accomplishes anything other than generating massive amounts of flawed output. But maybe I’m just not understanding the brilliance of it all.
NaijaGal
@CHETAN MURTHY: The European Union passed an AI Act, ratified in June 2024. It seeks to grade AI as having “minimal risk,” “limited risk,” “high risk,” and “unacceptable risk.”
It seeks to ban outright AI that has “unacceptable risk.” Some examples of unacceptable risk include prediction of criminal offence risk, untargeted scraping of the internet or CCTV material to create or expand facial recognition databases, and most likely any of the number of things that DOGE is planning to do with its stolen federal data on US citizens.
The end result is that AI companies have fled the EU for the US and have gotten compliant Republicans to add language banning the regulation of AI by states, etc., to US bills.
So, even though I use AI in my work, I can’t wait for model collapse in LLM-based AI to happen. Another AI winter is not such a bad thing, given the sociopaths leading the AI “revolution” in the US.
NaijaGal
@Gin & Tonic: Thank you for sharing this! So moving…
TONYG
@Marc: I guess I should have been more specific and said that “AI has a high failure rate when it is used by idiot fascist grifters”. But is it useful for intelligent people if the users have to check everything it spews out for errors? There comes a point at which an incompetent, dishonest assistant is worse than no assistant at all. https://www.science.org/content/article/trump-officials-downplay-fake-citations-high-profile-report-children-s-health
NaijaGal
I like Cory Doctorow’s related concept of “enshittification,” which plagues FB, Google, Amazon, etc., etc.
prostratedragon
@TONYG, @Marc:
No One of Consequence
@Steve LaBonne: Again, I agree. We cannot point to the departure from close-to-conscious-but-not, to recognizable consciousness.
What came first, the chicken or the egg? I posit the answer is The Egg. But what laid that Egg was not a Chicken.
Additionally, I concur that biological reasons (mechanisms?) account for a great and AsOfYetUnknown deal. We don’t have a real good grasp of how we store memories, let alone build upon understanding, but up until the point of language and education, the wetware is being built and modified on the fly.
Isn’t it conceivable, again, given the self-modification ability and change over time, for enough input/output/test/iterate cycles to result in recognizable consciousness? Perhaps not human, but then what objective metric would be utilized, if we have only our one paradigm to work from?
I recognize and endorse skepticism. On most all matters excluding the existential question of whether or not Baud is wearing pants at any given moment. He isn’t.
And I am not Dunning Krugering myself to any epiphanous (sp? or even word?) contribution. I am merely scratching the surface. But some of my background enables me to ask and understand different questions than others. I have sales in my background, and consulting, and yet I have an extreme dislike of bullshit and falsehood. If I cannot provide real world, honest-to-goodness value, I’d rather not have a sale. Today’s bullshit sale will lead to tomorrow’s loss of business from NOT providing value enough to keep the customers coming. Hype is a significant part of the bullshit I hate, and I try to avoid that altogether. I’d rather provide actual help, or to avoid a business relationship where the value proposition wasn’t demonstrable and accountable to the idea/product I was selling.
So, I very much want to avoid pollyannish behaviour and appearances. At the same time, I believe that the equation AI = Bullshit is not true in every case. I do agree, however, that there is much hype and very little realized value to be found so far. But I think that the so far part of that is important. That so far won’t always be operable.
I recall dot matrix printers, and cassette tape drives giving way to floppy disks. Look at today’s technology, bearing in mind the range of advancement since, and overlay that same allowance for advancement: what is science fiction does not remain so. (Some argue that Star Trek: The Next Generation gave rise to the iPad.) Ymmv.
-NOoC
Marc
Have you worked in Corporate America? :)
Marc
@TONYG: It’s not brilliance, the relevant math literally dates back to the 1920s. They’re more like specialized sieves for sifting out tokens based upon the ones that are put in. If it’s trained on 10,000 documents that all say the sky is blue, ask the color of the sky, and it will say blue 99% of the time. If you train it on documents that say the sky is green, well that’s what you get. These are consensus engines, you can feed them any kind of data, with any biases you want. In response to a prompt, it provides the consensus of the data it was trained on.
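A toy version of that consensus engine, just counting with no understanding involved; the 99/1 split of the made-up documents is there purely to echo Marc’s example.

```python
from collections import Counter

# Toy "consensus engine": tally what the training documents say and answer
# with the majority view, weighted by how often it appears.

training_docs = ["the sky is blue"] * 9_900 + ["the sky is green"] * 100

completions = Counter(doc.split()[-1] for doc in training_docs
                      if doc.startswith("the sky is"))
total = sum(completions.values())

for word, count in completions.most_common():
    print(f"'the sky is {word}': {count / total:.1%}")
# 'the sky is blue': 99.0%
# 'the sky is green': 1.0%
# Feed it documents that say the sky is green and the consensus flips.
```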
No One of Consequence
@Marc: but things appear to be far further along than various iterations of k-nearest neighbors. A consensus engine is a great way to think of it. Given that, is there any mechanism or hope for determining truth? Beyond mathematical calculation, or symbolic logic representation?
The model won’t be able to discern legitimate doctoral theses from abject bullshit. Can it be safely assumed that there are metrics/parameters for invalid data that is ingested? Is this a capability that could derive from iteration?
Interesting notion: AC/DC AI Bullshit Detector — works both ways — enables skeptical evaluation of both conclusion and claim!
;)
-NOoC
Kayla Rudbek
@JerseyBeard: how do you keep the LLMs from hallucinating fake case citations?
Kayla Rudbek
@me:
@Rose Judson:
@Cheryl from Maryland:
yes, it’s my favorite term for it
Marc
That’s the choice of the people providing the training data; if they tag text from The New England Journal of Medicine as equivalent in veracity to The National Enquirer (or The Onion :) that’s what you’ll get. Of course, it’s reasonably easy to train a small model whose sole purpose is to flag potentially dubious data and generate a new result, rather than passing bad data further up the pipeline, but that takes time and money.
No One of Consequence
@Marc: Gotcha. Weighting of sources. Works for established journals of record, whose published content should be as accurate as possible for the timeframe (thinking long established medical journals or scientific journals of record and tradition).
But what about contemporary data validity? For example, training an AI model with the summed total contents of StackOverflow. Jackals here have been providing good counterbalance to my musings. Beyond symbolic logic and mathematics, discerning truth is wholly outside the domain of an AI model. As an earlier commenter noted: consensus engine. Not certainty, but increasing likelihoods of probability.
Were I more certain of my own capabilities, I would begin to make pointed inquiries of what an AI model can do with Bayes Theorem. I anticipate some variation(s) of that are being used from jump with these models.
-NOoC
Marc
Retrieval Augmented Generation and other external tool tricks are used to retrieve recent data and add it to the input.
Bayesian principles are used throughout the underlying math of an LLM, that and some silly linear regression tricks that can solve equations with billions of parameters in seconds.
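For the Bayes’ Theorem angle NOoC raised, a small worked example with made-up numbers: how much a claim should be believed given how reliable the asserting source is.

```python
# Bayes' rule applied to the source-reliability question (made-up numbers):
# P(claim true | source asserts it) = P(asserts | true) * P(true) / P(asserts)

def posterior(prior, p_assert_if_true, p_assert_if_false):
    evidence = p_assert_if_true * prior + p_assert_if_false * (1 - prior)
    return p_assert_if_true * prior / evidence

# A careful journal: rarely asserts false claims, so an assertion moves belief a lot.
print(posterior(prior=0.5, p_assert_if_true=0.9, p_assert_if_false=0.05))  # ~0.947
# A content mill that will assert anything: the claim barely budges.
print(posterior(prior=0.5, p_assert_if_true=0.9, p_assert_if_false=0.8))   # ~0.529
```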
artem1s
@Marc:
sounds an awful lot like the dot.com bust from the 90s. Buying and selling fast became the goal, not actually identifying or producing or sustaining a viable product.
this is getting chased by more than VCs and Wall Street though. when a fad demands co-opting the entire country’s energy infrastructure, that’s a bit more problematic than a Beanie Baby bubble bursting.