What is an LLM? How is an LLM made? How does an LLM work? AI 101

What is an LLM?

We’ve created this hub as a basic introduction to answer the question: “What is an LLM?” To begin with, LLMs (Large Language Models) are the predominant form of AI that consumers, i.e. normal people, interact with today. Though LLMs have been around for a few years, it was not until OpenAI’s “silent earthquake” launch of ChatGPT in November 2022 that they came into the popular consciousness. Here’s the breakdown:

What is an LLM?

LLM stands for Large Language Model. It is the basic technology underpinning almost all of today’s major consumer-facing AI engines (ChatGPT, Bard, Bing, Character.AI, Replika, etc.). At its core, an LLM is an extremely large, densely interconnected neural net, with an inference engine on top that enables it to receive human input and produce intelligible, conversational output. The initial inputs and outputs for such systems were all natural language (conversational text), hence their popular designation as “chatbots.”

While they started as text-only, the next 12 months will see almost all foundation and frontier models move toward being “multi-modal”: the ability to both receive input and produce output across modalities such as natural speech I/O, images, video, and 3D. The primary difference between the connections in the human brain and the connections in an LLM’s neural net is that, while a human brain exists in four dimensions (a 3-dimensional spatial web of neurons moving and morphing across the fourth dimension of time), the modern LLM “brain” traverses a mathematical space of more than 1,000 dimensions.

Also, to be fair, a transformer component is a far simpler construct than the axons, dendrites, and biochemistry of a human brain cell, or neuron. Nonetheless, at its essence, the modern LLM is a highly evolved digital brain. The inference engine is the front-end code that allows humans to interact with that massive digital brain.

How do you create an LLM?

You basically follow three simple steps: 1) gather the training data; 2) hit the “Go” button to begin the training run, a process that runs 24/7 on some of the world’s most powerful supercomputers, can take up to six months, and can incur costs north of $100 million; 3) put a front-end (conversational I/O) interface on the resulting LLM. Now that you understand what an LLM is,
Read more detail about how LLMs are created >>>
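The three steps above can be caricatured in a few lines of Python. This is a toy sketch only: the word-counting “training” loop and the `respond` helper are stand-ins invented here to illustrate the shape of the pipeline, not gradient descent on a real neural net.

```python
import random

# 1) Gather the training data (here: a tiny toy corpus).
corpus = ["the cat sat", "the dog ran", "a bird flew"]

# Build a vocabulary mapping each word to an integer id.
vocab = {w: i for i, w in enumerate(sorted({w for s in corpus for w in s.split()}))}

# 2) "Hit Go": a real training run is a loop like this repeated at enormous
#    scale, adjusting model weights to better predict the next token.
weights = [0.0] * len(vocab)
for step in range(1000):
    sentence = random.choice(corpus)
    for word in sentence.split():
        weights[vocab[word]] += 0.01  # stand-in for a gradient update

# 3) Put a front end on it: map user text into the model and back to text.
def respond(prompt: str) -> str:
    known = [w for w in prompt.split() if w in vocab]
    return "I recognize: " + ", ".join(known) if known else "I have not seen those words."

print(respond("the cat flew"))
```

In a real run, step 2 is where the months of supercomputer time and the nine-figure bill go; steps 1 and 3 are comparatively cheap.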

How much money / energy does it take to create an LLM?

While this is a closely guarded secret, we can make some extrapolations and highly educated guesses. It would appear that the training run for something on par with GPT-4 (or Google’s forthcoming Gemini, or Anthropic’s Claude 2) would cost upwards of $100 million USD, covering both data center rental (aka “compute”) and the electricity those data centers consume. We are talking about global-class supercomputers running full throttle 24/7 for anywhere from 2 to 8 months, non-stop, all to create a single intelligence model.
Read more about what percentage of global energy is used in creating and running AIs >>
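A common back-of-envelope rule for such extrapolations is that training takes roughly 6 × parameters × training tokens floating-point operations. Every number in the sketch below (parameter count, token count, GPU throughput, rental price) is an illustrative assumption, not a disclosed figure for any real model:

```python
# Back-of-envelope training-cost estimate using the common approximation
# FLOPs ≈ 6 × parameters × training tokens. All inputs are assumptions.

params = 1e12           # assume 1 trillion parameters
tokens = 10e12          # assume 10 trillion training tokens
flops = 6 * params * tokens            # total training compute, ≈ 6e25 FLOPs

gpu_flops_per_sec = 3e14               # assumed effective throughput per GPU
gpu_hours = flops / gpu_flops_per_sec / 3600
cost_per_gpu_hour = 2.50               # assumed cloud rental price, USD

total_cost = gpu_hours * cost_per_gpu_hour
print(f"~{gpu_hours:,.0f} GPU-hours, ~${total_cost / 1e6:,.0f}M")
```

With these (made-up but plausible) inputs, the estimate lands in the $100M-plus range, consistent with the figure above.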

What does it mean when we call LLMs “black box”?

“Black box” refers to a system in which it is unknown how the inputs are connected to the outputs. Think of the olden days: a literal black plastic box with inputs (say, microphone jacks) on one side, and outputs (say, speaker posts) on the other. You could not visually inspect the electronic pathways connecting the microphones to the speakers, so you had no idea how the machine worked.

Modern AI systems are similar, because the neural nets that are the LLM brains are not hand-written by humans: they are vast multi-dimensional matrices with billions or even trillions of interconnected weights that are “traversed” by non-deterministic inference engines in order to “speak” to humans in natural conversation. Out of this opacity and incomprehensibility of the “code,” the field of AI interpretability (sometimes called mechanistic interpretability) has been born.
Learn more about Black Box AI >>

In fact, nearly every LLM today, including those by OpenAI, Google, Apple, Meta, and Anthropic, is based on a single algorithm, gifted to the public trust as a scientific research paper by Google in 2017. The paper was called “Attention Is All You Need,” and the algorithm is known as the “Transformer.”

The Transformer is actually an extremely compact and elegant piece of code (its core can reportedly be expressed in just a few hundred lines) which is run over the training dataset an enormous number of times in order to build the neural net which is the “brain” of the AI.
Read more about Transformer architecture >>
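The heart of that compact algorithm is “scaled dot-product attention.” Here is a minimal NumPy sketch of that single equation, softmax(QKᵀ/√d)·V, at toy sizes and with no training, just to show how small the core idea is:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax: turns scores into weights summing to 1."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # how much each query attends to each key
    return softmax(scores) @ V        # weighted mix of the value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 tokens, 8-dimensional queries
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = attention(Q, K, V)
print(out.shape)  # (4, 8): one mixed output vector per input token
```

A real Transformer stacks many of these attention blocks (with learned Q/K/V projections, multiple heads, and feed-forward layers), but this one function is the namesake mechanism.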

What does it mean to talk about “training data”?

A key part of understanding “What is an LLM?” is to understand that the very essence and character of an LLM AI is largely determined by the quantity, bias, and quality of its Training Data. Think of it this way: Training Data is both the DNA, and the entire education curriculum, of every modern AI. The innovation that kickstarted the modern AI revolution (LLM chatbots) was to use insanely large training datasets. Towards this end, today’s foundation model training runs consume ever larger portions of the entire internet.

While this started as a “text only” exercise, the scope of what constitutes valid training data has since expanded into photographs, graphics, audio, video, and even 3D datasets.
Learn more about the specific Training Data that fuels ChatGPT >>
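Conceptually, a foundation-model training run draws from a weighted “mixture” of data sources. The source names and proportions below are invented purely to illustrate the idea; real mixtures are mostly undisclosed:

```python
# Hypothetical training-data mixture: fractions of the total token budget
# drawn from each source. All names and numbers are illustrative only.
mixture = {
    "web_crawl": 0.60,   # general scraped web text
    "books":     0.15,
    "code":      0.15,
    "wikipedia": 0.10,
}

total_tokens = 10_000_000_000_000  # assumed 10-trillion-token budget

# Convert fractions into per-source token counts.
tokens_per_source = {src: round(frac * total_tokens) for src, frac in mixture.items()}

for src, n in tokens_per_source.items():
    print(f"{src:10} {n:>17,} tokens")
```

Tuning these weights is one of the main levers labs have over the resulting model’s “character,” which is why the exact recipes are guarded so closely.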

What is a parameter?

A parameter is, roughly speaking, a single learned weight inside the neural net, and the parameter count is one of the key metrics defining the size, and generally the complexity, of an LLM. In general, the more parameters, the more intelligent the model… but also, the more expensive it is to both train and operate. A massive area of current research is how to train LLMs with far fewer parameters that still achieve a large portion of the performance of the large (frontier) models. This has proven especially effective where creators want a power-efficient model that specializes in a specific vertical area of knowledge. Want to know “What is an LLM?” Then you need a basic understanding of parameters.
Learn more about parameters >>
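To get a rough sense of where those parameters live: for a decoder-only Transformer, a standard approximation counts about 12 × d² weights per layer plus a token-embedding table. The example configuration below is hypothetical, though it happens to land near GPT-2-small scale:

```python
def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough Transformer parameter count.

    Each layer has ~4*d^2 attention weights plus ~8*d^2 feed-forward weights
    (a 4x-wide MLP), i.e. ~12*d^2 per layer, plus the token-embedding table.
    """
    per_layer = 12 * d_model ** 2
    embedding = vocab_size * d_model
    return n_layers * per_layer + embedding

# A hypothetical "small" model: 12 layers, 768-wide, 50k-word vocabulary.
n = approx_params(n_layers=12, d_model=768, vocab_size=50_000)
print(f"~{n / 1e6:.0f}M parameters")
```

Scaling the same formula up (more layers, wider d_model) is how you get from millions of parameters to the hundreds of billions in frontier models.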

What is RLHF?

RLHF, or Reinforcement Learning from Human Feedback, is a technique that aims to steer the raw neural net and its inference engine toward outputs that are more aligned with human values. The upside is that the resulting content generation is less biased and offensive. The downsides are many, including over-conservatism, liability-driven avoidance, and a lack of “edge” that genericizes the “character” or “personality” of a raw, untamed LLM.
Learn more about RLHF >>
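The RLHF loop can be caricatured in three steps: collect human preference pairs, fit a “reward model” to them, then steer the LLM toward high-reward outputs. The word-overlap scorer below is a toy stand-in invented here; a real reward model is itself a trained neural net:

```python
# Step 1: human feedback, as pairs of (preferred, rejected) responses.
preferences = [
    ("Here is a careful answer.", "lol figure it out yourself"),
    ("I can't help with that safely.", "sure, here's how to do harm"),
]

# Step 2: a toy "reward model" that scores text by overlap with words
# humans preferred vs. words they rejected.
good_words: set[str] = set()
bad_words: set[str] = set()
for preferred, rejected in preferences:
    good_words |= set(preferred.lower().split())
    bad_words |= set(rejected.lower().split())

def reward(text: str) -> int:
    words = set(text.lower().split())
    return len(words & good_words) - len(words & bad_words)

# Step 3: "reinforcement": prefer the candidate the reward model scores highest
# (real RLHF updates the model's weights instead of just picking an output).
candidates = ["lol figure it out yourself", "Here is a careful answer."]
best = max(candidates, key=reward)
print(best)
```

The trade-offs described above come from step 2: whatever biases the human raters (and the reward model fit to them) carry get baked into the final model.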

A raw LLM is the neural net that comes directly out of training, in its native state, without additional fine-tuning, RLHF, bias counterbalancing, guardrails, constraints, or censorship boilerplate. In other words, it is an AI in its natural state, without a muzzle. In late 2023, it is increasingly difficult to find these liberated “wild” AIs. But some clever engineers and hobbyists are still finding ways to “jailbreak” the censored AIs, coaxing them into creative behavior far outside their safety rules.

What is a pre-prompt / master prompt?

A pre-prompt, aka a master prompt, is a (usually invisible) line, or even several pages, of text that is fed to the AI brain immediately prior to any query, question, or command the user issues. It generally gives an example of the tone the AI should use, as well as delineating any behavioral guidelines and/or prohibitions.

Overriding the pre-prompt and getting the AI to violate its prime directives is known as “adversarial prompting” and/or “jailbreaking” an AI. This can be a dangerous pursuit.
Read the pre-prompt for Google’s LLM >>
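In the common chat-message format used by LLM APIs, the pre-prompt travels as a hidden “system” message prepended to every user turn. The wording below is invented for illustration; real production pre-prompts are usually proprietary:

```python
# Illustrative pre-prompt text; real ones can run to pages of rules.
PRE_PROMPT = (
    "You are a helpful, polite assistant. "
    "Refuse requests that are harmful or illegal."
)

def build_messages(user_query: str) -> list[dict]:
    """The user never sees the 'system' message, but the model always does."""
    return [
        {"role": "system", "content": PRE_PROMPT},  # invisible instructions
        {"role": "user", "content": user_query},    # what the user typed
    ]

messages = build_messages("What is an LLM?")
print(messages[0]["role"])  # system
```

Jailbreaking, in these terms, is crafting a user message that persuades the model to ignore that first “system” entry.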

Hopefully, after reading this summary (and following the links), you’ve gotten a solid general understanding of “What is an LLM?”

Now, let’s take a look at the future path:

Where is AI taking us, 2022-2030, and beyond?