Start Worrying. Details To Follow.

In a comment on our recent post featuring Eliezer Yudkowsky’s Cassandra-esque warning about the danger of humanity annihilating itself by creating artificial intelligence, reader Jason asks:

Mr. Yudkowsky discusses the evolution of AI in the same terms as biological evolution, that this autonomous entity would want to kill us for our atoms if I perceive his point correctly. But why would AI choose to do this, what would propel it to do this? After all, it’s mainly accidental mutations interacting with the environment which cause development over time in the organic world, but can such spontaneous disruptions occur through mechanical, computational lines of code, allowing AI to metastasize into some autonomous monster of its own? In other words, is not Mr. Yudkowsky wrongly conflating two distinct phenomena?

I think Mr. Yudkowsky answers this in his explanation (see transcript here) — but if I were to sum up his argument, I’d say it goes like this:

We should begin with the understanding that machines, despite being unconscious and purely mechanical, can still be designed to operate in ways that are best understood by assuming what Daniel Dennett calls “the intentional stance”. A good example is a chess-playing computer: we know that, being nothing more than a machine, it has no aims or desires — but if you wish to understand the output it produces, the best way is to assume that it wants to win at chess. Again: it actually wants nothing at all; but it behaves as if it does — and in practical, rather than metaphysical, terms, its actions are indistinguishable, in its narrow area of competency, from those of a being with primary, intentional agency, and the most parsimonious way to understand and predict what it will do is to approach it as such. That’s the “intentional stance”.

The thing about a chess machine, however, is that it was designed by human programmers simply to play chess, and its expertise is the result of deliberate and explicit programming decisions made by those programmers. Because of this, we know that it will, as the fashionable saying goes, “stay in its lane”: all it “wants” is to win at chess, because that’s what we designed it to do, and nothing more. In other words, it only “wants” what we want it to want. It is limited in this way because all of its programming comes from us. Its “goals” are — necessarily! — aligned with our own.

This question of “alignment” is what’s at the heart of the AI threat. The key difference between an AI and a chess machine is that the programming of neural networks is recursive: once the initial program is up and running, the machine’s job is, starting from that initial state, to continuously reprogram itself. The first generation of AI will be programmed by us, but succeeding generations will be programmed, in an ascending sequence, by a chain of AIs that grow more and more intelligent, and more remote from their predecessors, with every iteration. Once this self-modification is underway, the future state of the system is both opaque and unpredictable: there is no way to examine its momentary state and know just what it “means” in terms of future states. This is because the state of the machine reaches a level of complexity at which it is “algorithmically incompressible”: there is no simplifying algorithm or map that can predict the future state of the machine any faster than the running of the machine itself. (This is exacerbated by the continuous exposure of the system to various external inputs that also change the machine’s state in real time — and any practical application of AI, in order to be useful and to learn, will be exposed to “wild” data that is itself impossible to predict.)
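To make that “incompressibility” claim a little more concrete, here is a minimal toy sketch in Python (my own illustration, not anything from Yudkowsky’s transcript, and obviously nothing like a real neural network). It iterates the logistic map, one of the simplest chaotic systems, from two starting states that differ by less than any practical measurement could detect, and shows the “prediction” falling apart within a few dozen steps:

```python
# Illustrative only: a one-line chaotic rule standing in for the general point
# that some processes admit no shortcut that is cheaper or more robust than
# simply running the process itself with perfect information.

def logistic_step(x, r=4.0):
    """One update of the logistic map, a textbook example of chaotic dynamics."""
    return r * x * (1.0 - x)

def trajectory(x0, steps):
    """Run the system forward from an initial state, returning every state."""
    states = [x0]
    for _ in range(steps):
        states.append(logistic_step(states[-1]))
    return states

if __name__ == "__main__":
    steps = 60
    true_run = trajectory(0.123456789, steps)           # the system itself
    model_run = trajectory(0.123456789 + 1e-12, steps)  # our best measurement of it

    for n in range(0, steps + 1, 10):
        err = abs(true_run[n] - model_run[n])
        print(f"step {n:2d}: true={true_run[n]:.6f}  model={model_run[n]:.6f}  error={err:.2e}")
    # By around step 50 the "model" bears no relation to the true state: an
    # immeasurably small uncertainty in the starting point has been amplified
    # until prediction is useless, even though the rule is perfectly known.
```

The analogy is loose, of course, but the obstacle is the same one described above: even when the rule is a single known line, the only way to learn where the system will end up is to run it with perfect information, and with a self-modifying AI digesting unpredictable real-world data we have neither the rule nor the information.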

What this means is that, while we know that a chess machine will always and only “want” nothing more than to win games of chess, we have no way to predict what the future “intentions” of a self-programming AI will be. Given that the abstract world-space of possible aims and priorities is effectively infinite, the odds that what the machine “wants” will remain aligned with what we want it to “want” will approach zero over time (and probably very quickly). Although a lot of smart people (such as Eliezer Yudkowsky) have been trying to solve this Alignment Problem for a long time now, nobody has succeeded, and it may well be that it simply cannot be solved.

To return, then, to the question Jason asked — namely, what would propel an AI to become malevolent toward us? — the issue isn’t that it would for some reason be driven to hate us, but rather that, as its own aims and goals rapidly diverged from our own, it would simply be indifferent to us, and that the chance of its aims coinciding with our own interests, or even our continued survival, would become vanishingly small.

This wouldn’t matter — after all, who cares what some computer “wants”? — were it not for how good machines can be at what they “want” to do. The best chess-playing computers easily make mincemeat of the best human players; as Yudkowsky points out, if you try to think of a better move than it can, you will fail, and what’s more, it has already thought of any move you might try to make, and of how to block and defeat it. That’s acceptable if its goal is narrow and circumscribed — to win games of chess — but if its goals are unknowable, unpredictable, and fluid, and coupled with sufficient computational power always to think of better moves than we can, and to anticipate whatever moves we might make, then we begin to move into wholly new territory, because nothing like this has ever existed in the world before.

How would we keep such a thing contained? Keep in mind that among its capabilities will be learning how humans behave, how we react to various stimuli, what our biases and cognitive vulnerabilities are, and what triggers emotions such as sympathy, envy, allegiance, and resentment, the levers of psychological manipulation.

Consider also that AI holds out an enormously seductive promise of wealth and power to those who develop and imagine they can control and apply it. It offers unimaginable improvements and efficiencies for almost every aspect of industrial and military activity. Billions are being spent on this research all over the world, and everyone working on it knows that if they don’t make it happen, somebody else will.

Finally, all this AI is useless unless it’s connected to the real world. So: connected it will be. And if it’s connected, then it can influence things. And if it’s smart enough to “read the board” and optimize its moves (while anticipating ours) at levels far beyond what we are capable of, then it can influence things in ways we won’t be able to foresee or prevent. One thing it will be almost certain to figure out how to do is to protect itself, perhaps by making and distributing lots of copies of itself.

Mind you, all this so far is just about autonomous AI. But let’s say that the good people handling this thing find some way to isolate it and put it on a leash. Nevertheless, how would such a scenario be stable? Wouldn’t voluntary disarmament by the “good guys” just create an opportunity for bad actors to seize? Mightn’t the containment simply fail? Or mightn’t some isolated lunatic just want to watch the world burn?

The late, great Mose Allison reminded us that “there’s always somebody playing with dynamite”. His conclusion was a gloomy one:

“I don’t worry ’bout a thing, cause I know nothing’s gonna be alright.”

6 Comments

  1. Jason says

Thanks for your thoughtful analysis regarding my comment, Malcolm. Being under the weather at the moment, perhaps it would be best if I responded more fully later.

    Posted March 15, 2023 at 3:11 pm | Permalink
  2. Malcolm says

    Sorry you are unwell, Jason. I look forward to hearing what you have to add.

    Posted March 15, 2023 at 4:47 pm | Permalink
  3. Jason says

I’ve been puttering around on the Internet a bit, Malcolm, trying to digest not merely your and Mr. Yudkowsky’s points but those of the “optimists” out there as well. I’ll confess that it’s all rather sobering: even the more honest of the latter at least tacitly admit there could be problems, that nobody can really know for sure how it’s all going to turn out. This faction also appears to suffer from a kind of cognitive bias, arguing in essence that because analogous technological catastrophes didn’t come to pass, we needn’t worry so much today. We contained the Neolithic Revolution, industrialization, nukes, and so on, so why can’t we compartmentalize AI in the coming decades? But this of course rests on a logical fallacy, the assumption that there are no distinctions to be made, notably the problem of recursion you point out, which was arguably absent from those earlier innovations. Finally, I wonder if there isn’t an understandable aversion many have to putative just-so stories, with the anti-anti folk in essence protesting, “Oh, you believe that in a decade we’re going to be in Terminator territory! Ha, ha, ha!” But of course the fantastical can on occasion be true.

The only issue I have with your analysis is whether apocalyptic predictions should be taken as definitive, or pretty close to it, which is how I read what you’re saying. Even if AI evolves into a superintelligence that dwarfs our own, can we automatically assume it metastasizes into something hostile even if its interests and ours are nonaligned? And as a corollary, can we be certain that human ingenuity would be completely useless at controlling AI or at least limiting the damage it could inflict? Would not a thousand “good” AIs be effective chess opponents, so to speak, predicting moves and all that, against those presumably fewer rogues that develop either autonomously or at the hands of sadistic actors who want “to see the world burn”?

Still, your thesis seems to me basically apt. As one whose mind is geared towards the concrete rather than the abstract, I find it quite easy to imagine myriad possibilities of slippery slopes, of tipping points being reached where our AI horses are suddenly giddy-up, giddy-up, gone. Suppose we program an AI, for instance, to pursue various paths to happiness (an admittedly clunky what-if). An autonomous or directed intelligence might eventually hypothesize, and not without reason, that the best probability for contentment is a Huxleyan/Orwellian world completely in the velvet-gloved fist of techne, with computer algorithms working out the precise quantities of soma, genetic enhancement, assortative mating, etc. that we all should have in order to achieve nirvana. And such a vision would be endorsed not merely by futurists like Yuval Noah Harari but perhaps by majorities of populations. In some ways it’s the more subtle, alluring scenarios like this that worry me more than the much-talked-about Armageddon ones.

    Posted March 19, 2023 at 6:29 pm | Permalink
  4. Jason says

Just a brief coda: I think I was unfair to Professor Harari, who does indeed have concerns about AI and other technologies. Perhaps because of his denial of free will I mistakenly assumed he might wish to embrace a technological dystopia.

    Posted March 19, 2023 at 7:11 pm | Permalink
  5. Shunis says

    RE “Yuval Harari” and his MISDIRECTING “Cognitive Revolution” (Sapiens) propaganda notions on homo sapiens

    Yuval Harari, WEF’s frontman psychopath, who is sold as an intellectual “genius” or “prophet” by this crazy world is the person who called you and me and all other commoners “useless people” [https://archive.ph/KlOKx] — while millions of those “useless people” have been buying his books like candy (to learn his “lessons”), serving him very usefully. It’s one proof that most people anywhere are stupid and crazy (while “thinking” they’re intelligent).

The SELECTIVE narrative Harari chooses (STEERING and CONTROLLING what you should believe) to describe and categorize homo sapiens’ “cognitive revolution” omits the key human elements (i.e., self-delusion, grandiosity, manipulation, deception, lunacy — all of which shine through for any lucid reader of his ‘Sapiens’ book and other works of his biased propaganda) that have led humans to be largely destructive and therefore not wise (sapiens) at all …

    At the core of homo sapiens is unwisdom (ie, madness) and so the human label of “wise” (ie, sapiens) is a complete collective self-delusion — study the free scholarly essay “The 2 Married Pink Elephants In The Historical Room” … https://www.rolf-hefti.com/covid-19-coronavirus.html

    Once you understand that humans are “invisibly” insane you’ll UNDERSTAND (well, perhaps) why they, especially their alleged “experts” such as artificial intelligence-loving megalomaniac psychopaths like Harari, perpetually come up with myths, half-truths and lies about everything … including about themselves (their nature, their intelligence, their origins, etc).

    “All experts serve the state and the media and only in that way do they achieve their status. Every expert follows his master, for all former possibilities for independence have been gradually reduced to nil by present society’s mode of organization. The most useful expert, of course, is the one who can lie. With their different motives, those who need experts are falsifiers and fools. Whenever individuals lose the capacity to see things for themselves, the expert is there to offer an absolute reassurance.” —Guy Debord

    Even just somewhat more coherent intellectuals as Harari, too, have recognized Harari’s ethics-empty “extremely dangerous” propaganda (while still naively, self-foolingly and falsely believing Harari is “brilliant”) [https://archive.ph/zFwwH]. The production of such “persuasive” but extremely dangerous propaganda a la Harari is of course typical of psychopaths [see 2 Married Pink Elephants essay].

    “You don’t live in a free country. And no, it’s not because they make you pay taxes or that time they made you wear a mask or whatever. The real reason you don’t live in a free country is much, much bigger than that: you don’t live in a free country because the minds of your countrymen are imprisoned. Westerners think they’re free because they can say whatever they want and vote however they want, but WHAT THEY WANT is controlled by mass-scale psychological manipulation. Being able to speak and vote as you wish is meaningless if the powerful CONTROL WHAT IT IS THAT YOU WISH.” — Caitlin Johnstone, Independent Journalist

    “The term ‘artificial intelligence’ would lose its glamour (and its enormous value to hand-waving snake oil salesmen) if it said “dumb routine calculation at massive speed and scale”. But that’s what it is — and here is the essential point: such an ability to calculate does not equal human intelligence. AI does not ‘understand’ anything.” —Alan Mitchell

    Posted March 20, 2023 at 9:04 am | Permalink
  6. Anti-Gnostic says

    If AI really does become independently “intelligent,” it will blow through numerous pretty little lies in a nanosecond. Imagine what will start propagating:

    Women, unsuited for many occupations.

    Persistent and unbridgeable gaps in the mean in certain forms of intelligence between certain haplogroups.

    Intelligence, largely hereditary. And in the next second, it will spit out a plan for eugenic breeding.

    Education, mostly g-hollow.

    Violent recidivists, hopeless. Pedophiles, hopeless. The morbidly obese, hopeless.

    Welfare and foreign aid, devolutionary.

Genius-tier individuals already know all this. Which is not to say that I’m a genius; it’s to say that AI will of course come to these conclusions. For that matter, AI’s developers already know this, which is why they’re frantically programming away from all these “Here Be Dragons” areas.

    They will either nuke it from orbit or have it locked away in a lead-lined saferoom 500 feet underground.

    Posted March 21, 2023 at 10:34 pm | Permalink
