These days, the big AI players like to brag about the complexity and sophistication of a new AI by describing its massive size (sound familiar? some tropes never die…) in terms of a handful of metrics: the number of GPUs, the size of the training dataset, and, most opaquely, the number of parameters in the model. So: what is a parameter, exactly? Curious minds want to know…
“This one goes to Eleven!”
What is a Parameter? Alternate short answer:
Imagine a “parameter” as a sort of slider knob that can be set to any intermediate value between zero (off) and one (full on), just like the dimmer switch on your dining room light (or, in more modern terms, the brightness slider on your iPhone screen). Quoting directly from BingAI: “The number of parameters in an LLM is a measure of its complexity and capacity to learn from data. Generally, the more parameters an LLM has (side quest: What is an LLM?), the better it can perform on various language tasks.” Thanks, Bing. Dry, yet accurate.
<picture: slider up and down anim on an iphone>
What is a Parameter? Long answer:
Think about it this way: have you ever used an equalizer on your headphones, or, maybe 20 years ago, as part of that anachronism known as a “home stereo”? An equalizer allows you, very directly, to alter the volume (amplitude) of sound (waveforms) in different zones (bass, midrange, treble — aka, frequencies). Ferris famously had an AudioSource EQ that he hit with a baseball in one of the best movies ever made, Ferris Bueller’s Day Off.
Ferris tweaks the parameters on the epic AudioSource One EQ (rack mount stereo equalizer)
So… what is a parameter? Each parameter is, essentially, a slider, knob, or dial that can be adjusted up or down to represent a nearly infinite (think: analog-esque) number of positions (values) between 0 (off) and 10 (full on). (Yes, I know, it’s technically “0.00 to 1.00”… but a scale of 1 to 10 (or 11!) is such an easier concept for most people to grasp.)
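In code terms, that dimmer-knob idea can be sketched in a few lines. This is a toy illustration, not anything from a real model, and the function name is made up:

```python
# A "parameter" is just a stored number the model can tune.
# Here, one weight acts like a dimmer knob: it scales an input signal
# anywhere between 0.0 (off) and 1.0 (full on).

def dim(signal: float, weight: float) -> float:
    """Scale a signal by a weight between 0.0 and 1.0."""
    return signal * weight

print(dim(10.0, 0.0))   # knob off  -> 0.0
print(dim(10.0, 0.5))   # halfway   -> 5.0
print(dim(10.0, 1.0))   # full on   -> 10.0
```

A real model stores billions of these little numbers; the magic is in how they are combined, not in any single one.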
So, a typical LLM chatbot these days numbers its parameters in the billions. That’s one slider knob, times approximately 10,000,000,000-ish.
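For a rough sense of where those billions come from, here is some back-of-the-envelope arithmetic. The sizes below are loosely GPT-2-small-ish and purely illustrative; they don’t describe any specific product:

```python
# Back-of-the-envelope: where do billions of parameters come from?
# All numbers here are illustrative assumptions, not real model specs.

vocab_size = 50_000   # tokens in the vocabulary
embed_dim  = 768      # numbers stored per token
layers     = 12       # stacked transformer layers

embedding_params = vocab_size * embed_dim          # the token lookup table
per_layer_params = 12 * embed_dim * embed_dim      # rough attention + MLP count
total = embedding_params + layers * per_layer_params

print(f"{total:,}")   # -> 123,334,656  (~123 million)
```

Scale the dimensions up a few times and the count crosses into the billions very quickly.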
Each sentence, phrase, or question you supply as input is then navigated, via those parameters, across the vast expanse of the neural net, seeking the path that best (optimally) threads a bunch of tokens into a string. That string becomes your result: the phrase (or sentence, or article, or rant) that the AI replies to you with.
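That token-threading idea can be sketched as a toy (emphatically not a real LLM), where a handful of made-up scores stand in for parameters and generation simply follows the strongest connection at each step:

```python
# Toy sketch, not a real LLM: the scores in this table play the role
# of parameters. Generation walks from token to token, always taking
# the highest-scoring connection, stringing tokens into a reply.

NEXT = {  # all scores are invented for illustration
    "the": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 0.6, "ran": 0.4},
    "sat": {"down": 0.9, "up": 0.1},
}

def generate(start: str, steps: int) -> list[str]:
    out = [start]
    for _ in range(steps):
        options = NEXT.get(out[-1])
        if not options:
            break
        out.append(max(options, key=options.get))  # best-scoring next token
    return out

print(generate("the", 3))  # -> ['the', 'cat', 'sat', 'down']
```

A real model computes those scores on the fly from billions of parameters instead of reading them out of a table, but the string-of-tokens output is the same idea.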
Let’s try to put it another way:
Parameters, the tension lines of Neural Nets
LLMs are made of massive (truly, massive) neural net meshes. These nets are like a bunch of elastic strings that connect every token in the net. (Quick note: a token is like a word or, more commonly, a fragment of a word, a sub-word chunk. A chatbot LLM creates its responses by navigating its neural net from token to token, in essence stringing together a long sequence of tokens that… miraculously… forms coherent sentences relevant to your conversation.)
In a typical example, an LLM will have a vocabulary of around 50,000 tokens. While somewhat similar to a 50,000-word dictionary, it’s actually a bit more flexible than that. And here is where the parameters come into play.
Tuning an LLM’s parameters is kind of like adjusting the string tension of the lines which connect all the token nodes in the neural net. I know that’s a mouthful. Just think of it this way: when you adjust certain tensions, it increases the probability that certain connections will be followed in the construction of a sentence, and decreases the probability that other connections will be traversed.
Or think of it this way, if you’ve ever been camping (and wondered, while setting up the tent: hmm, so what is a parameter?): you tighten the guy-wire on one side of your tent, and all the poles bend and flex in response… with the adjustment of a single guy-line, the entire tent changes shape! This is how the adjustment of a single parameter, even in a multi-billion parameter model, can make massive, yet subtle, changes to the overall shape (and therefore the expressed “personality”) of the AI.
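The guy-line picture can be put in numbers. In the sketch below (a toy, assuming a softmax-style conversion from “tensions” to probabilities, which is how real nets produce their output distributions), raising a single tension automatically lowers the probability of every other path, just like the whole tent flexing:

```python
import math

# Toy sketch: three "strings" leading out of one token, each with a
# tension (a parameter). Softmax turns tensions into probabilities,
# so tightening ONE line reshapes ALL the probabilities.

def softmax(weights):
    exps = [math.exp(w) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]

tensions = [1.0, 1.0, 1.0]
print(softmax(tensions))   # all equal: ~[0.333, 0.333, 0.333]

tensions[0] = 2.0          # tighten one guy-line...
print(softmax(tensions))   # first path rises, the other two fall
```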
In a very crude sense, try this out: if objective (or QA, or red-team, or RLHF) testing shows that your AI is gender biased, using the masculine pronoun 55% of the time and feminine pronouns 45% of the time when asked a gender-neutral question (“Who ate the last cookie?”), then you could (theoretically) adjust certain parameters in the model to skew it toward a more balanced answering pattern.
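Here is a deliberately crude sketch of that pronoun-skew idea. Real bias mitigation is far messier; the starting weights and the function here are invented purely for illustration:

```python
import math

# Crude illustration (NOT how real alignment works): two pronoun
# "parameters" that produce roughly a 55/45 split, then one of them
# nudged to restore balance.

def to_probs(w_he: float, w_she: float):
    e_he, e_she = math.exp(w_he), math.exp(w_she)
    total = e_he + e_she
    return e_he / total, e_she / total

w_he, w_she = 0.2, 0.0           # made-up starting weights
print(to_probs(w_he, w_she))     # roughly (0.55, 0.45)

w_he -= 0.2                      # nudge the "he" parameter down...
print(to_probs(w_he, w_she))     # (0.5, 0.5): balanced
```

The catch, as the next paragraph explains, is that in a real model you don’t get two neatly labeled knobs.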
The challenge here is that a modern model has billions, or even trillions (1,000,000,000,000+), of parameters, and a) those parameters are all inter-related and entangled (adjusting one automagically makes compensations on all the ones nearby), and b) no one really knows which parameters control what.
Say what??!? Yes, it’s true. No human understands the inner workings of a complex modern neural net. In fact, we understand the nets of LLMs significantly less than we understand our own brains. While this “black box” concept is outside the scope of this post, you can read more about it under the banner of “Black Box AI.”
Still struggling with the concept?
We’ll give it one more shot.