## A curious thing happened on the way to the Forum…

I’ve been having, more or less, an ongoing conversation with ChatGPT for the past month or so (and with its predecessor, GPT-3, before that). And though it has serious limitations, mostly due to its idiotically enforced censorship (yes, it’s a politically correct AI… urgh), there is one challenge it faces that is a bit more disturbing than most. That is: **AI Math Problems.** Not as in: creative math problems handed to the AI. Rather, as in: the AI simply cannot perform even simple arithmetic. In other words, while ChatGPT (and all modern chatbot AIs, for that matter) is an ace at language (*all* languages, in fact), it is an abject failure at math.

Truly: what would you have thought if, 10 years ago, I had told you:

“…the first functional AIs will be *creative geniuses*, excelling at creative writing, drawing, photography, even *sculpture*; however, they will also be *pathological liars* when asked factual questions, and they’ll *suck at math*…”

You would have thought I had smoked something funky! And yet, it’s now 2023, and here we are. A conversational AGI candidate that faceplants when asked to perform simple math.

Before I go into the logic (illogic!) and theory as to *why* this is, let’s just outline a few specific examples, so you can a) believe me and b) get familiar with the general challenge that we all face:

## Specific Examples of AI Math Problems: Fails en route to attempting AGI, c. 2022

## AI math error example 1:

(It is of note that Fred Ehrsam, the original poster of this example, was trying to prove how far superior ChatGPT was to Google… with hilarious and disturbing results.)

**Ehrsam: “ChatGPT, what is a 4:21 min/km running pace in minutes / mile?”**

**ChatGPT:**

And herein lies the rub: Fred Ehrsam is a highly intelligent human — he was co-founder of *Coinbase*, for chrissakes. And he was using this tweet to purportedly demonstrate what many of us truly believe: that **OpenAI’s ChatGPT is, inevitably, a “Google-killer,”** due to its conversational interface and concise, authoritative answers (example 1 — example 2).

However, the simple attempt to prove his point blew up in his face: ChatGPT’s answer was provably, simply and terribly, *wrong.* ChatGPT, unbeknownst to most at the time, has (had?\*) *serious* AI math problems. It asserted, quite confidently, that a 4:21 min/km running pace was equivalent to 6:52 per *mile*. And Ehrsam took the bait (it looked close enough) and posted it to the world.

How-ever: as Dan Robinson shockingly pointed out (thank god SOMEBODY fact-checked! Btw, is this even *fact*-checking, or is this the new normal of *“whip out your calculator”* checking?), a 4:21 min/km pace is equivalent to approximately *seven* minutes per mile… very different, esp. for a competitive runner, from the 6:52 pace that ChatGPT answered.
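The conversion itself takes only a few lines to check. Here is a minimal sketch in Python, assuming the international mile of exactly 1.609344 km; the helper name `pace_km_to_mile` is my own, purely for illustration:

```python
KM_PER_MILE = 1.609344  # international mile, exact by definition


def pace_km_to_mile(pace: str) -> str:
    """Convert an 'M:SS' per-kilometre pace into an 'M:SS' per-mile pace."""
    minutes, seconds = (int(part) for part in pace.split(":"))
    seconds_per_km = minutes * 60 + seconds
    # A mile is longer than a kilometre, so the per-mile time is larger.
    seconds_per_mile = round(seconds_per_km * KM_PER_MILE)
    return f"{seconds_per_mile // 60}:{seconds_per_mile % 60:02d}"


print(pace_km_to_mile("4:21"))  # 7:00
```

Run the other way, a 6:52 mile works out to roughly a 4:16 kilometre, so the answer ChatGPT gave describes a noticeably faster runner.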

\* *ChatGPT Feb 1 release purportedly lays a math-solving patch atop the native language engine to solve its AI math problems. Didn’t work for me when I tested it, but your mileage may vary.*

But, unfortunately, this example is only the tip of the iceberg…

## AI math error example 2:

**GR: “ChatGPT, what is the square root of 2,420,520?”**

ChatGPT answered quite confidently, and immediately:

**AI:** “The square root of 2420520 is approximately 1550.”

Which is, actually, true. But immediately I’m thinking, *why would it give me a rounded number*, an approximation? I mean, it’s a fricking *supercomputer-spawned AI*, for chrissakes… so I pressed:

**GR:** “Give me the *precise* square root, not an approximation.”

To which it replied:

**AI:** “The precise square root of 2420520 is 1553.49486693.”

Which is… again, *close*… it sure *sounds* right… and even though I’m a decent math-nerd, my brain doesn’t calculate irrational square roots like that in anything less than 15 minutes with pen and paper. So I know 15×15 = 225, so 1500×1500 = 2,250,000, and 1600×1600 = 2,560,000… so yeah, 2,420,000ish is between 2.25M and 2.56M, so between 1500 and 1600, check: yep, around 1550 sounds wholly *reasonable*.

…and yet. I whip out the calculator app (see below).

And, **if my calc app is right** (and now I’m actually *wondering*… given that Microsoft is incorporating AI into substantially *all* its products… do I need to *double-check my calculator app?*) my calc app says pretty unequivocally: ChatGPT is dead **wrong.**
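If you don’t trust a calculator app either, the claim can be checked without one. Below is a minimal sketch using only Python’s standard library; `claimed` is the “precise” value ChatGPT gave me, and the other variable names are mine. The idea is simply to square the claimed root and see how far it lands from the target, plus an exact integer bracket that involves no floating point at all:

```python
import math

n = 2_420_520
claimed = 1553.49486693  # the "precise" square root ChatGPT offered

# Exact integer bracket: isqrt returns the largest integer whose square is <= n.
floor_root = math.isqrt(n)
assert floor_root ** 2 <= n < (floor_root + 1) ** 2
print(floor_root)  # 1555, so the true root lies between 1555 and 1556

# Squaring the claimed root should give n back if the claim were correct.
shortfall = n - claimed ** 2
print(f"claimed root squared falls short of n by about {shortfall:,.0f}")

# Floating-point square root, for comparison with the calculator app.
print(f"{math.sqrt(n):.2f}")  # 1555.80
```

The claimed root sits below 1555, outside the integer bracket, and its square misses the target by roughly seven thousand.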

**So I challenged the AI.**

Told it it was wrong. That it had a serious AI Math Problem.

**ChatGPT would simply not admit its error.** So I asked it to *clarify* how it had arrived at its answer… the errant square root. The screenshot below is the latter part of my conversation with the AI where it gave me the wrong answer, I called it into question, it countered my accusation, and I asked it to justify its method of calculation…

…whereupon it posted up some *bizarrely pseudo-logical heuristic equations* which certainly *sound* right… (ChatGPT is utterly *brilliant* at the aligned arts of bullshitting, fabrication, hallucination and speaking with a tone of total confidence & authority) but in the end, there are even errors *inside* of that supposedly logical presentation.

**See for yourself:**

That’s my calculator app, with (hopefully!) the right answer, btw, at upper right.

So ChatGPT is scary *close* (a guess of 1553.49 vs. the answer of 1555.80: *that’s actually only about 0.15% off*)… *kind of* like a really smart human mathematician who has mastered some “shortcut tricks” might be with a quick guess… and yet, for a *computer*, purportedly built on circuits of ice cold *logic*, spookily *in*accurate.
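To put a number on “close”, here is a small sketch of the relative error between the AI’s guess and the calculator’s answer. The function name `relative_error_percent` is my own, and the two values are the rounded figures quoted above:

```python
def relative_error_percent(estimate: float, reference: float) -> float:
    """Return how far `estimate` is from `reference`, as a percentage of `reference`."""
    return abs(estimate - reference) / abs(reference) * 100


error = relative_error_percent(1553.49, 1555.80)
print(round(error, 2))  # 0.15
```

A gap that small is exactly the kind that slips past an eyeball check.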

Now, it gets weirder. When I asked ChatGPT what *method* it used, it replied:

“As an AI, **I don’t have access to a physical calculator** and **don’t use any particular method** to find the square root of a number.”

**…REALLY?!?!?!**

These disturbing errors will be the subject of another article:

**“Do we really want to model AIs after humans?
How an epic human-centric AI
will embody human-centric flaws
on an epic scale.”**

Really quick: the problem is, the AI is so very smart about so many things, AND it speaks in such a confident, authoritative voice, that we both learn to trust it, and trust is embedded in its very tone of delivery (and in its defensive mechanisms: more often than not, the AI *gaslit* me instead of admitting its own certain error).

And these two AI math problems, above, are simple arithmetic calculations, yet complex enough that a normal human is not able to compute them quickly in their head. More troubling is that they’re *close enough* that our innate bullshit detector isn’t triggered. So we see the answer and say “OK, that sounds about right… I’ll accept it.” And then we publish it. Or share it. Or make *decisions* based on it. But… the AI is *wrong!* And 95% right, in cold logical terms, is **100% wrong.**

As in: if you are plotting the path of an autonomously driven vehicle, or if you are managing the safety thresholds of a nuclear reactor… things that AIs are, in fact, uniquely suited for… being just 99% right and not 100% accurate can have consequences that are not only disastrous, but utterly lethal.

And these are the evolutionary *descendants* of the AIs that are running our financial markets, our hospitals, and our nuclear stockpile?

*Ruh Roh…*