ETA: This was written for Balloon Juice but is crossposted at Inverse Square.
So…
Just this week I learned that through my home institution I have access to a suite of LLMs, including all the usual suspects (Anthropic’s Claude, OpenAI’s ChatGPT, Meta’s Llama, and Google’s Gemini). MIT’s come up with a very nice interface to interact with all this artificial talent, and I’ve spent a couple of hours this weekend taking some of them out for a spin. (I’m avoiding OpenAI’s offerings both as a trivial protest and because I don’t trust anything about that company.)
So far I’ve enjoyed Claude the most (the Haiku 4.5 model, if you’re wondering). Gemini is interesting, if a bit finicky. But what made me howl was the encounter I just had with Llama, Mr. Zuckerberg’s contribution to the genre. Context: over the holidays I had a conversation with a senior person at another Magnificent Seven tech firm who’d just seen a colleague leave to go to Meta. My interlocutor was utterly dismissive of the company and relieved to be rid of anyone dumb enough (in his view) to basically end his career in top-tier tech by grabbing Facebook bucks. Spoiler: after what you’ll read below, I can see where my friend was coming from.
So here’s the setup. I’ve been asking the various models what I hope are zero-consequence questions, queries in which no one could possibly get hurt if the LLMs wing their way to utter bollocks. A typical ask: map out the logical structure Einstein used in his 1905 light quantum paper. That one tended to produce a poor initial answer based on the idea that the paper centers on the then-pressing mystery of the photoelectric effect, a common mistake for people as well as machines. Pressing the models led Claude in particular to a much more sophisticated account of the paper, drawing attention to the way Einstein used arguments from thermodynamics to propose the necessity of understanding light as discrete packets of energy.
The paper was about this new “heuristic” [Einstein’s term] view of light, and the photoelectric effect appears only as one experimental support for that view. (Much more here.)
After three or four trips through the query engineering needed to get a useful result I’d found my way down the list of models at my disposal to Meta’s Llama. I was bored with light quanta and so asked a different, much simpler question:
Respite: At Play in the Fields of the LLMs


