AI Training: the Terrifying Difference a Single Word Makes

AI Training: changing one word to another

It fascinates me to no end that the current main line of thinking in AI research can be summed up in three simple words: “Just Scale It.” Let’s talk a little about AI training, and more specifically, about the black magic of AI initialization prompts.

I’ve summed up the (fairly detailed) steps of how a modern LLM AI is spawned here; for this post, we’ll present the abbreviated version:

  1. Load up the Deep Learning codebase
  2. Spin up a few supercomputers’ worth of compute power
  3. Purchase a few thousand megawatt-hours of electricity
  4. Feed it a suitably huge and well-balanced dataset
    (that, right there, is the core of the AI Training;
    see the code sketch after this list)
  5. Wait 7-21 days
    (as the data center(s) run at redline,
    forging your shiny new neural net)
  6. Receive your well-formed, fully baked,
    100% black-box-opaque neural net
  7. Test it.
    On human canaries.
    Pray (prove?) that it’s not racist, insane, or megalomaniacal
  8. Teach it manners
  9. Put a censorship muzzle on it
  10. Write a clever initialization prompt
    (to give it, you know, contextual personality)
  11. Run it on the edge device of your choice
    (laptop, smartphone, Tesla, etc.)
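
For the technically curious, steps 4 through 6 boil down to one idea: next-token prediction, repeated a few trillion times. Here is a minimal, purely illustrative sketch in PyTorch; TinyLM, the random-token “dataset,” and every hyperparameter below are stand-ins of my own choosing, not anyone’s actual production training code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: a real run swaps these for a billion-parameter transformer
# and a trillion-token corpus (steps 1-4 above).
VOCAB, DIM, SEQ, BATCH = 1000, 64, 32, 8

class TinyLM(nn.Module):
    """A deliberately tiny causal language model: embed, mix, predict next token."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):
        # Causal mask: each position may only attend to earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.blocks(self.embed(tokens), mask=mask))

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Step 5 ("wait 7-21 days") is, at heart, just this loop run at redline.
for step in range(1000):  # real runs: millions of steps across thousands of GPUs
    tokens = torch.randint(0, VOCAB, (BATCH, SEQ))  # stand-in for the real dataset
    logits = model(tokens[:, :-1])                  # predict token t+1 from 0..t
    loss = F.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Scale that loop up by nine or ten orders of magnitude, and you have your 7-21 days at redline.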

Amazingly, this entire process was fairly well visualized in the most excellent (and highly nihilistic) Scarlett Johansson film, Lucy (2014).

AI training in just 3 words:

“honest. helpful. concise.”

Let’s now talk about the mysterious “Master Prompt,” perhaps more accurately referred to as the AI initialization prompt.

Demis Hassabis, Google, and the geniuses at DeepMind seem to think that, in addition to their prudish-sounding code of morality, they needed to set a proper tone & character for their AI: some bizarre mix of humor, snarkiness, supplication, and sensitivity. So, if Alan Thompson is to be believed, they wrote a convoluted ~600-word initialization prompt, which conditions the Sparrow chatbot LLM prior to every human interaction.

In stark contrast to this, OpenAI claims that the initial release of GPT-3 was pre-conditioned with an elegant string of just three simple goals:

“Be honest, helpful, and concise.”
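
Mechanically, there is nothing mystical about where such a string lives: an initialization prompt is simply text that gets silently prepended to every conversation before the user’s words arrive. GPT-3’s original plumbing was a raw text-completion endpoint, but the modern chat-style API makes the mechanism explicit. Here is a hedged sketch using the current OpenAI Python client; the model name and user question are illustrative, and only the prompt string comes from the claim above:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# The initialization ("system") prompt rides along, invisibly, with every
# request; the model never sees a user message without it.
INIT_PROMPT = "Be honest, helpful, and concise."  # <-- the word in question

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name, not GPT-3's actual endpoint
    messages=[
        {"role": "system", "content": INIT_PROMPT},
        {"role": "user", "content": "Will it rain tomorrow?"},
    ],
)
print(response.choices[0].message.content)
```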

Unfortunately for OpenAI, the honesty command appeared to backfire. When the AI wasn’t absolutely sure of something, it would either refuse to answer, or front-load its answer with so many disclaimers that its human readers were left scratching their heads.

But have no fear: an answer was close at hand! They only needed to…

Change.

One.

Word.

“How about we replace ‘honest’ with ‘accurate’?”

So they did. The new initialization prompt was:

“Be accurate, helpful, and concise.”

Rarely in the history of mankind has the subtle alteration of a single word made such a profound difference. Because GPT was no longer afraid. Instead of constantly questioning its own honesty against an impossibly imperfect & incomplete set of facts, it synthesized accuracy. Instead of needing to see the 100.0000% binary truth, it could now make “educated guesses” (SWAGs, we used to call them when issued by human experts: Scientific Wild-Assed Guesses)… and when it thought it had an answer that was 90-99% truthful, well… 95% is considered pretty high accuracy… that’ll do!

This one change, and all the similar “suggestions” issued to parallel AIs, essentially ushered in the current quandary in which we find ourselves: our AIs are hyper-intelligent by most human measures, and yet at the same time they are the kings of all bullshitters, that is to say: pathological liars.

This is, quite simply, in a single word, how we exited the age of AI truth and veracity and entered our present Age of AI Hallucination.

In investigating this further, I’ve now uncovered the GPT-3 initialization prompt (for one mode, that is; GPT-3 is actually skinned into several vertical-use models, each with its own master prompt). It’s even better:

“Be helpful, creative, clever, and very friendly.”

It’s actually the “clever” attribute that intrigues me the most. Given its etymological history, that word has not always had a wholly positive connotation.

I am also highly amused by the use of the “very” amplifier. Not just friendly… very friendly.

You may think I’m nitpicking this all to death… however, when you have a billion-dollar chatbot with more than 5 million users, generating more than 3 billion words of prose every. single. day… the handful of words that determines its entire character merits a bit of scrutiny.

One more:

“Be friendly, helpful, and knowledgeable.
Make conversations interactive and fun.”

Not bad. Not bad at all.


The future of AI training: the Schoolyard

Perhaps all we really need for a proper initialization prompt are the age-old lessons of the schoolyard wall:

AI Initialization Prompt: AI Training of the Future?

“Character Strengths;
Social Intelligence;
Self Reflection;
Self Control;
Integrity; Zest; Grit;
Curiosity, & Optimism.”

OpenAI, let’s try that one on for size, shall we?
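
If they ever do want to try it on for size, it is, fittingly, a one-string swap in the same hypothetical sketch from earlier:

```python
# Swap the schoolyard wall into the same system-message slot as before.
INIT_PROMPT = (
    "Character strengths; social intelligence; self reflection; "
    "self control; integrity; zest; grit; curiosity, & optimism."
)
```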
