-
Why Tesla Wins : AI FSD : The Future is 100% Electrified, 100% Autonomous
Have you seen a RoboTaxi? I live in Santa Monica. Two robots roam my neighborhood streets freely these days (c. 2024), and one waits in the wings. There are Waymos (Waymo, the robotaxi company owned by Google’s parent, Alphabet), there are Cocos (Coco Robotics, the ultracute food delivery robots), and there is… Tesla Optimus (the humanoid…
-
The Death of the Internet and the Genesis of Language 1.0
The Internet was beginning, around the turn of the century, to be the sum-total repository for all human knowledge. All thoughts, all books, all diaries, all photos, all videos… basically, a document repository of every significant (and insignificant) piece of media that humans had ever produced, from the beginning of history to the present…
-
WebText2, Webtext, OpenWebText : Deep Inside the AI Datasets
Webtext was OpenAI’s attempt to give AI higher-quality input than the mess of Common Crawl. WebText2 upped the ante. We dig into the contents. The Rationale for Webtext & Webtext2: Quality. Prior to GPT-2 (which was really the breakthrough chat AI), Deep Learning LLMs were generally fed on diets composed wholly of Wikipedia, public-domain…
-
AI Training: the Terrifying Difference a Single Word Makes
It fascinates me to no end that the current main line of thinking in AI research can be summed up in three simple words: “Just Scale It.” Let’s talk a little about AI training, and more specifically, the black magic of AI initialization prompts. I’ve summed up the (fairly detailed) steps of how a modern…
-
the Deep Learning Revolution: Why Today’s AI so Radically Transcends the Last 50 Years
The purpose of this post is to enlighten you as to the fundamentals of the present Deep Learning Revolution, and simultaneously to debunk two very common myths which I hear over and over again from normal, intelligent people. Debunking Common AI Myths. Those being: AI is just one more innovation in a long string of…
-
AI Training Datasets: the Books1+Books2 that Big AI eats for breakfast
It’s good to know, when dealing with an AI, just how it obtained its vast knowledge of “the world” (or more particularly, the world as described on the internet… which, while similar to our physical reality, isn’t quite the same thing…). And as you might imagine, not all AIs are created equal. One of the…
-
The Future of AI Training Data 2023: The Untapped Digital Content Well
Scientists say “AI Training Data”… we say: “It ate the Internet.” All the Text, All the Songs, All the Streams, All the Feeds… ALL THE EVERYTHING. The particular entity I’m conversing with these days (c. November 2022) is a 5-year-old bot, very uncreatively named “GPT3,” which was instantiated c. MMXVII [AD 2017]. In general, the…