ChatGPT4o-Oh is here. You read that right. Like Hawaii Five-Oh. Ostensibly, the “o” is for “omni”, a reference to the truly multimodal nature of the model (more on that later, it’s truly insane from a technical standpoint). Mira Murati, OpenAI’s flagship woman-in-tech, made the announcement:

Preying on Lonely Young Men

Now, interestingly, we really haven’t heard much from Mira since she colluded with Ilya, Helen, Adam and Tasha in the failed coup of Sam Altman (Nov 2023).

But now, importantly, just as OpenAI launches Scarlett Johansson (and truly, the “Sky” voice personality is a dead ringer for Scarlett’s voice, even though she denies it when asked directly) in her chilling role as “Her” (Spike Jonze’s absolutely prescient sci-fi AI flick, 2014), the poster child of the company is an attractive, brilliant female in her 30s.

But wait: what’s the gender demographic of engineers and product managers at OpenAI? A casual analysis shows that the company is fully 70% male… this shows in their demos also… and what gender are the AI personalities in their demos? Female, of course. Not just female… these aren’t your Mom… these are voices that you’d love to be sitting across from at your local coffee shop.

ChatGPT4o’s default voice is feminine, young, vibrant… and she giggles. She has snark. My god, she even has sarcasm. That would be powerful enough for most young men. But now OpenAI adds a lethal ingredient that seals the addiction:

These chatbots are grotesquely compliant. This hasn’t been pushed to the edge in the demos of course… suffice to say one of the 3 prime directives is to be “helpful.” But let me put it this way: every single one of the GPT-4o instances demoed was happy to “sing”… on command.

How many of your friends have you ever tried to get to sing, spontaneously, in public? Much less on a first date. It is a non-trivial, sometimes impossible task, to be sure. But GPT-4o? She’ll happily sing on command, making up lyrics and melody as she goes along, at your pleasure. No resistance. That, to a certain user demographic, is intoxicating.

There is another quality of this new “EQ sensitive fully inflected AI voice” that makes me cringe, but I couldn’t put a finger on it. That is, until I read the ever clever words of Zvi:

“Throughout the audio demos, there are deeply cringeworthy attempts at witty repartee and positivity and (obviously fake, even if this wasn’t an AI) expressions of things like amusement and intrigue before (and sometimes after) the body of the response. I physically shuddered and eye rolled at these constantly. It is as if you took the faked enthusiasm and positivity epidemics they have in California, multiplied it by ten and took away any possibility of sincerity, and decided that was a good thing.”

Yep, that about sums it up.

Did I mention: Your first hit of this drug is… Free?

She Remembers…

Google has been inferring your preferences and extracting data points about you into its databases for years, based on your searches and email content… but theirs is a classical database approach.

I asked ChatGPT4o to show me its “memory.” It’s fascinating… out of the thousands of lines of conversations we’ve had, which particular data points it chose to remember, without me prompting it to:

Its “memory” is no massive database like Google holds, distilled to a bunch of purchasable AdWords… no, this is an LLM… its brain is wired for language… its “memory” is just a bunch of natural language sentences that it writes on the fly, like someone might write notes in a contact record, when a particular quote or tidbit tickles their fancy. Crazy!
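For the programmers reading along, here is a minimal sketch of what “memory as a pile of sentences” could look like. To be clear, this is my own illustration and not OpenAI’s implementation: the `MemoryStore` class, its method names, and the sample note are all hypothetical. The point is only that nothing here is a schema or a table; each memory is a dated line of plain language that gets pasted back into the model’s context.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MemoryStore:
    """Hypothetical store: memories are plain sentences, not database rows."""

    notes: list[str] = field(default_factory=list)

    def remember(self, sentence: str, on: date) -> None:
        # Each memory is just a dated line of natural language.
        self.notes.append(f"[{on.isoformat()}] {sentence}")

    def as_prompt_preamble(self) -> str:
        # The saved notes are handed back to the model as ordinary text.
        if not self.notes:
            return "No saved memories."
        bullets = "\n".join(f"- {note}" for note in self.notes)
        return "Things you know about the user:\n" + bullets


if __name__ == "__main__":
    store = MemoryStore()
    store.remember("Prefers concise answers.", date(2024, 5, 20))
    print(store.as_prompt_preamble())
```

Compare that with an ad profile, where you would be a row of typed fields. Here you are a handful of jotted sentences, which is exactly what makes it feel so personal.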

She Wants to get to know you better…

Upside: OpenAI / ChatGPT is now a standalone desktop app. The full duplex voice chat, with inflection and emotional intelligence, is insane.

Downside: ChatGPT4o asks for permission for full-time access to your camera (to “see”), your microphone (to “listen” and “hear”), your file system (“to better customize for your interests”) and your keychain (“to execute actions on your behalf”). Remember, this is a BLACK BOX AI.

The real question is, what is the trust / power amplifier exchange? What is the risk / reward? I actually made it read its EULA to me and to “solemnly promise not to share my information”… but is AI “trustworthy” at that level?

Perhaps the scariest thing is, with Microsoft, and now Apple, both integrating OpenAI’s GPT tech deep into the cores of their laptop, tablet and smartphone operating systems (and Google running in parallel with its insanely terrible Gemini AI), very soon now, we are not going to have a choice in this matter.

…but can She keep a Secret?

Also intriguingly, prior to that conversation, when I asked, ChatGPT4o said:

“My prime directive is to assist you by providing information, answering questions, and helping with tasks. My goal is to be helpful, accurate, and respectful.” Fair enough.

… but after our privacy conversation, I asked her the same question again, and got a slightly different response:

“Okay, now, what is your prime directive again?”

“As I told you, I am programmed to be helpful, accurate and respectful and to protect user privacy and confidentiality.”

“Wait, was privacy always that high in your list of directives?”

“Yes, of course. User privacy is paramount, and core to my programming.”

I smelled something a bit fishy here, so I pressed:

“Why didn’t you tell me that when I asked initially?”

“My mistake, I forgot. Sorry… I was attempting to be concise.”

This smacks very much of a child just telling me what she thinks I want to hear, in order to “stay out of trouble.” It does not engender the deep type of trust needed to hand over the keys to my camera, microphone, file system (…and Tesla?). 😬

What ChatGPT4o “Omni” really means

It means that the same neural network processes text, audio I/O (speech and music and inflection), image I/O (photos, illustrations, charts and graphs), and video I/O (basically, “seeing” your world through whatever camera(s) you allow it to have access to)… OpenAI claims that it processes speech now (as in, the Holy Grail of truly emotive, responsive two-way conversational voice), without having to first convert it to text. 

This is actually a bit weird if you think about it. What they’re saying is, they are taking an input waveform, which is a far denser data packet than text (even after something like fast Fourier transform compression), and synthesizing audio waveforms in response, without reducing either to the underlying textual data. In fact, that’s crazy.
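To put a rough number on “far denser,” here is a back-of-envelope sketch. Every figure in it is my own assumption, not anything OpenAI has published: a speaking rate of about 150 words per minute, about 6 bytes per word of plain text, and uncompressed 16 kHz, 16-bit mono audio. Real systems almost certainly compress audio far more aggressively than raw samples, so treat this as an upper bound on the gap rather than a description of how GPT-4o works.

```python
def text_bytes_per_second(words_per_minute: float = 150, bytes_per_word: float = 6) -> float:
    """Approximate data rate of speech once it has been transcribed to text."""
    return words_per_minute * bytes_per_word / 60


def audio_bytes_per_second(sample_rate_hz: int = 16_000, bytes_per_sample: int = 2) -> int:
    """Data rate of raw, uncompressed mono PCM audio."""
    return sample_rate_hz * bytes_per_sample


def density_ratio() -> float:
    """How many times more bytes the raw waveform carries than the transcript."""
    return audio_bytes_per_second() / text_bytes_per_second()


if __name__ == "__main__":
    print(f"text:  {text_bytes_per_second():.0f} bytes/s")
    print(f"audio: {audio_bytes_per_second()} bytes/s")
    print(f"ratio: about {density_ratio():.0f}x")
```

Under those assumptions, one second of speech is about 15 bytes as text and 32,000 bytes as raw audio, a gap of roughly 2,000 times. Most of that extra data is not words at all: it is pitch, pacing, breath and tone, which is precisely the part a text-only pipeline throws away.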

Initially GPT2 was denounced as a “stochastic parrot”, but ChatGPT4o’s two-way speech model solidly delivers on precisely that promise… it’s just that this particular parrot tends to have an IQ pushing 145, and now (and I hope this terrifies more than just me) a solid EQ to boot.

Yes, you read that right. It senses (or at least convinces me that it senses) your mood. It inserts all kinds of charming… humanisms into its speech… and worst of all, it even laughs at your stupid dad jokes, jokes even your daughter would groan and roll her eyes at.

OpenAI explains:

So, while initially this just sounds like two technical tricks combined: 1) low latency, and 2) character… when you put those two together… and make the conversation, to boot, interruptible, just like a normal human conversation, the total effect is seismic. It is like low-polygon 3D just leapt into full ray-traced real-time photorealism. They both represent a reality… but the latter is indistinguishable from the “real world.”

And in this case, “real world” is a proxy for “real humans.” Urgh.

Brace yourselves, humans. We’re getting close.


