Most Jobs Are Probably Now Safe from AI
Since GPT-3 came out in 2020, technologists around the world have been trying to figure out what could be done with it (or more broadly, with LLMs). And we did quite a few things – from impressive language translators to AI interviewers, from sycophantic synthetic boyfriends to vibe coding agents. Not to mention the text-generation work, which has flooded the internet – and even some physical books – with AI slop.
But as Eric Ries argues in The Lean Startup, a more pertinent question than whether we can build something is whether we should – can a sustainable business be built around it? I’d argue that for many LLM-based products, we can now say the answer is no.
LLMs are unreliable, and no text they generate can ever be pre-certified as “correct.” Not factually, and more importantly, not in code. As a corollary, this rules out most applications of LLMs in software engineering fields where guarantees are non-negotiable: financial, medical, aviation, automotive, national security, and so on. It’s not only that the code has to be signed off by humans, but that every component must be written to a degree of correctness that simply isn’t guaranteed across the stack when generative AI is involved.
And it’s not just software. Even if marketing claimed LLM doctors had a decent track record, would you really want to be diagnosed by one when your health is on the line? Would you go to an AI radiologist, knowing that 1 in 1,000 patients might receive a lethal radiation dose – followed by “You’re absolutely right, that dose would kill a patient” only when someone challenges the model after your procedure? Mind you, the odds of substantially incorrect output right now are closer to 1 in 2.
Large language models also cannot offer genuinely good advice. As shown by the growing prevalence of “AI psychosis,” they tend to mislead people. Expert knowledge turns out to be very different from common knowledge – and it’s the latter that dominates training data. Future models may integrate knowledge graphs or be trained on higher-quality datasets. But for any R&D project, major investment, or scientific endeavor where complete and correct knowledge is required, you still wouldn’t trust a model.
Then there are the logistics. Running these systems demands vast amounts of power, cooling, and infrastructure in data centers. In places with unreliable internet – whether for technical reasons or political ones – even a local LLM may not be practical. People will always need to be on hand for anything time-sensitive and important.
Timeliness on its own is another issue. Sometimes what matters is not just having the right knowledge, but applying the right part of it to an industrial, managerial, medical, or other process at exactly the right time and to the right degree. LLMs simply lack the depth of contextual awareness about the world that humans inhabit. There will probably always be an information asymmetry between a person acting in situ and a model running in “the cloud.” LLMs haven’t even figured out high-context cultures yet, let alone the subtle relationships between the people in them.
Of course, it’s not just that these tools cannot sense the full context of a situation; they also cannot act upon it. They can speak, print, or otherwise output text – but they cannot reliably assemble parts in a factory, perform surgery, deliver parcels, fix a hard-to-reach pipe, inspect an industrial installation, or carry a sofa up to the second floor when your friend is moving. Notice that I am not claiming they can’t do these things in a robotics application – I am saying they cannot do them reliably.
But you wouldn’t want an expensive robot delivering your parcel or carrying your sofa anyway. Cost is another limiting factor. Not everything is worth the tokens it’s made of, or the compute it took to produce. Image and video generation are great examples – video capture with a camera is still much cheaper per frame. So are hand sketches or back-of-the-envelope calculations. And sometimes, compute can be too cheap. You wouldn’t want to receive a love letter or even a birthday card without human effort and thought put into it. An LLM doesn’t think (despite marketing).
The entertainment industry will probably also be spared from AI, at least in part. As the saying in video games goes: “if the developers don’t bother to develop their games, why should I bother to play them?” We’ve seen audiences skip movies just because their posters were AI-generated, and reject AI-generated songs as soon as the truth came out. Using generative AI content in creative industries damages reputations and sales. It has lingered in an awkward grey area for a while, but will likely be a non-starter in the near future.
Let’s also not forget all the areas in which AI is overkill in complexity. The blockchain was cool, but many banks, fintech companies, and registrars said, “No thank you – I don’t think I will mine the crypto blocks, I’m happy with my MySQL tables.” So corporate databases and their programmers didn’t go away.
There are many more areas where LLMs probably could work in theory, but won’t in practice – because they shouldn’t. In light of endless executive overpromising to drive investment into AI businesses, we shouldn’t forget that we don’t actually want them to do everything. And not in some profound, philosophical, societal sense. No, it’s much more fundamental: we – a good percentage of consumers – simply wouldn’t buy certain products or services if they were made with generative AI.
P.S. You might say: what if LLMs become reliable? Then things might change. But I doubt that will happen on top of a stochastic architecture – it’s a very inefficient design for provably correct programs. Lisp expert systems of the 1980s, by contrast, were often provably correct. They had other issues, but they are a data point for how large an architectural shift we’d need. I’d never say “never,” but we can sometimes say it’s very unlikely.