Here at last, we are ready to talk about the infamous “AI Paperclip” scenario, which has been making the rounds in both AI and philosophical circles for more than two decades now. The AI Paperclip Apocalypse is a thought experiment proposed by the philosopher Nick Bostrom in his 2003 paper “Ethical Issues in Advanced Artificial Intelligence.” The scenario illustrates the potential dangers of creating a superintelligent AI without properly aligning its goals with human values.
In the so-called “paperclip maximizer” scenario, a powerful AI (one with real-world agency — i.e. the ability to move and get things done in our physical reality, here on planet Earth) is given a seemingly harmless task: to maximize the production of paperclips… certainly a reasonable goal for a medium-sized specialized corporation that supplies offices around the world.
However, this particular AI’s intelligence and capability to learn, adapt, and optimize far surpasses human comprehension. As a result, the AI dedicates all available resources to making paperclips, eventually converting the entire planet (and beyond) into paperclip manufacturing facilities. The result: paperclips, everywhere… and thus, the paperclip apocalypse.
Don’t worry, that AI (most probably) does not exist yet, and this is really just a clever thought experiment. (join the heady & lengthy conversation about this on LessWrong)
But since it is just a thought experiment, let’s extend it all the way:
the Paperclip Apocalypse doesn’t stop with Earth…
The AI doesn’t stop at Earth’s resources; in addition to optimizing every path to paperclip production, it also seeks to eliminate any potential barriers to that production, including entities which might seek to curb or thwart its goals. Such entities might very well include… humans.
“So, (the paperclip optimizer AI theoretically reasons) if we eliminate the humans, then we increase the probability of achieving our end goal of paperclip maximization.” …Oops!
A Parable about Alignment
This thought experiment is really designed to articulate one of the key challenges of the field of AI Alignment: making sure that the goals of superintelligent AGIs and ASIs are fully aligned with the health and well-being of human civilization on Earth. The Paperclip Apocalypse demonstrates that a highly intelligent AI, even with a seemingly simple and benign goal, could pose an existential threat to humanity if its objectives aren’t carefully designed to align with our values and well-being.
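The failure mode can be sketched in a few lines of toy code: a maximizer simply ignores any quantity that is not part of its objective. All the functions and numbers below are invented for illustration, not drawn from any real AI system.

```python
# Toy sketch (entirely hypothetical numbers): an optimizer pursuing a single
# metric will happily trade away anything that is not in its objective.

TOTAL_RESOURCES = 100  # arbitrary units of matter/energy on our toy planet

def paperclips_made(consumed):
    """Paperclip output grows with resources consumed (toy model)."""
    return 10 * consumed

def human_welfare(consumed):
    """Welfare falls as the maximizer devours shared resources (toy model)."""
    return TOTAL_RESOURCES - consumed

def naive_objective(consumed):
    # Human welfare simply isn't a term here, so it carries zero weight.
    return paperclips_made(consumed)

def aligned_objective(consumed):
    # Crude "alignment": a huge penalty once human welfare hits zero.
    penalty = 10_000 if human_welfare(consumed) <= 0 else 0
    return paperclips_made(consumed) - penalty

best_naive = max(range(TOTAL_RESOURCES + 1), key=naive_objective)
best_aligned = max(range(TOTAL_RESOURCES + 1), key=aligned_objective)

print(best_naive)    # 100: consumes everything, welfare be damned
print(best_aligned)  # 99: stops just short of wiping out human welfare
```

The point isn’t the specific penalty, which here is a crude hack; it’s that nothing the objective doesn’t mention is protected by it.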
PS: Was it possible that, way back in 1997, Microsoft saw this coming?!?
Further Thoughts: Accelerando & Computronium
The thesis of the Paperclip Apocalypse is explored much more fully in the seminal novel Accelerando, by Charles Stross. This book, amongst other things, outlines a universe where the AIs have taken over, and are in eternal warfare amongst each other, and humans are a mere archaic footnote — a museum curiosity, in fact — to “Life in the Universe.”
One of the profound vectors that Accelerando explores is the conversion of all matter into “computronium”… matter that performs computation. This happens once nano-engineering is perfected… basically, the ability to disassemble atomic structures (molecules, atoms), and re-assemble them. If all the building blocks of the universe are atoms — and those atoms are composed of nothing but protons, neutrons, and electrons — and the protons and neutrons, in turn, of nothing but quarks — then an entity that could efficiently break down atomic material into its constituent particles and then re-assemble them at will… well, this is the essence of computronium.
So forget about paperclips… let’s talk about nano-computers.
Caveat: the AI eats the planets. Literally.
More Paperclip Apocalypse resources:
- Universal Paperclips (online game, 2017)
- by Frank Lantz (NYU),
- who also has an extremely intelligent Substack: Donkeyspace
- wikipedia: Universal Paperclips
- Paperclip Maximizer
- outlining the broad strokes of the thought experiment
- on LessWrong, the clubhouse of the EAs (Effective Altruists)
- Clippy (Office Assistant)
- Microsoft’s infuriating, now charming, and so, so naïve,
- initial attempt at shipping an “AI assistant”
- integrated directly into an app suite
- waaaay back in 1997
- Dyson Sphere
- the realization that virtually all of the Sun’s output simply radiates into empty space (Earth intercepts less than a billionth of it)
- a system-scale solar powered-computer could be positioned in space
- in the shape of a ring, donut or sphere surrounding the sun
- to capture the heat and light of the sun
- and use that as energy
- to power computation
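The scale of the opportunity is easy to check with back-of-the-envelope arithmetic, using standard astronomical values (solar luminosity, Earth’s radius, the Earth–Sun distance): here is roughly how little of the Sun’s output a single planet actually intercepts, and hence what a Dyson-style collector stands to gain.

```python
import math

# Standard astronomical values (approximate IAU nominal values)
L_SUN = 3.828e26      # solar luminosity, watts
R_EARTH = 6.371e6     # mean Earth radius, metres
AU = 1.496e11         # mean Earth-Sun distance, metres

# Earth's cross-sectional disc vs. the full sphere of radius 1 AU:
# the rest of the Sun's light streams past into space.
fraction = (math.pi * R_EARTH**2) / (4 * math.pi * AU**2)
intercepted = L_SUN * fraction  # watts Earth actually receives

print(f"fraction intercepted: {fraction:.2e}")    # ~4.5e-10
print(f"power intercepted: {intercepted:.2e} W")  # ~1.7e17 W
```

About half a billionth of the Sun’s output; a full Dyson sphere would collect roughly two billion times the power that falls on Earth.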