Computational Neuroscience 3000? AYFKM?

Computational Neuroscience: gather at a round table examining a glowing 3d hologram of a large human brain

There’s a new sheriff in town, kids.

Just the other day, I learned of a new field of study, replete with PhD candidates and its own scholarly journal. It’s called… wait for it:

“computational neuroscience for AI minds.”

The article detailed the bizarre, near-total opacity of the AI brain, and how scientists are working hard to decode it and understand how it works using… well, essentially, the crude toolkits of biological neuroscience. [Read: When the AI dreams of spiders…]

What? Why?

Can’t they just open the source
and take a look at the code?

In short: No.

related article: Black Box AI, Opacity, and What it Really Means

UPDATE: 2024.06.27 [via WIRED]

Leading AI labs make progress in figuring out how their foundation AIs actually work

(do you sense the irony in that headline? that’s the very essence of Black Box AI: We seeded it, we spawned it, we trained it, and we have absolutely no idea how its internal logic works)

Looking into the black box? 

The opacity of neural networks has long limited our ability to understand how AI models really work. Recent years have seen some headway in interpretability research as a means of better understanding how these systems function, but progress has remained modest. Now, however, researchers at Anthropic and OpenAI have both reported successes in identifying features of large models that are interpretable by humans.

  • According to reporting by Wired, Anthropic’s mechanistic interpretability team was able to map clusters of neurons within the neural net to identify features (or concepts) that the clusters would predictably represent, using a technique Anthropic calls “dictionary learning.” Beginning with a simple single-layer neural net and eventually progressing to multilayer nets, the researchers identified clusters that represented features like the Golden Gate Bridge, and used clusters of similar neurons to identify related concepts, like Alcatraz or California governor Gavin Newsom. They were also able to identify clusters that produced negative features, like hate speech or unsafe code, and, by amplifying or suppressing those clusters, could manipulate the behavior and outputs of Claude, Anthropic’s LLM. The researchers believe that conducting this “brain surgery” on models could help AI safety efforts by allowing them to excise dangerous features.
  • Shortly after Wired’s report on Anthropic’s mechanistic interpretability progress, OpenAI released its own post, with a corresponding research paper and GitHub code, about interpreting patterns within its GPT-4 model. OpenAI presented its methodology for using a sparse autoencoder to highlight the connections between firing neural activations in a network and features, like those described by Anthropic’s interpretability team, that are easier for humans to understand (a minimal sketch of the sparse-autoencoder idea follows this list). OpenAI’s team stressed that while this progress is promising, interpretability work is still far from a silver bullet; other researchers have echoed this sentiment, with the scalability of current interpretability methods a likely limitation.
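To make that sparse-autoencoder idea a little more concrete, here is a minimal, hypothetical sketch in Python/PyTorch. Everything in it is invented for illustration (toy dimensions, random stand-in “activations,” made-up hyperparameters); it is not Anthropic’s or OpenAI’s actual code, just the general shape of the technique: learn an overcomplete dictionary of feature directions that reconstructs a model’s internal activations, while an L1 penalty keeps most features switched off for any given input.

```python
import torch
import torch.nn as nn

# Toy sizes, invented for illustration; real work uses activations from an actual LLM layer.
D_MODEL = 64        # width of the hidden activations we want to interpret
D_FEATURES = 512    # overcomplete "dictionary": far more candidate features than dimensions

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_features):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)  # activation -> feature coefficients
        self.decoder = nn.Linear(d_features, d_model)  # feature coefficients -> reconstruction

    def forward(self, x):
        f = torch.relu(self.encoder(x))  # non-negative, mostly-zero feature activations
        return self.decoder(f), f

# Stand-in for activations captured from a real model's hidden layer.
acts = torch.randn(10_000, D_MODEL)

sae = SparseAutoencoder(D_MODEL, D_FEATURES)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coeff = 1e-3  # strength of the sparsity penalty (made-up value)

for step in range(1_000):
    batch = acts[torch.randint(0, len(acts), (256,))]
    recon, f = sae(batch)
    loss = ((recon - batch) ** 2).mean() + l1_coeff * f.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, each column of sae.decoder.weight is a candidate "feature direction";
# researchers then inspect which inputs most strongly activate each feature and try to name it.
```

In the published work the activations come from a real LLM rather than random noise, and the “brain surgery” described above amounts, roughly, to adding or subtracting one of those learned feature directions from the model’s activations at inference time to amplify or suppress the corresponding concept.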

 

Enter the Black Box of AI

It all makes sense when you begin to comprehend how these modern AIs were spawned. And I do mean “spawned,” not “coded” or “created.” Here’s their origin story, as I understand it, in brief:

For a long time, some five decades in fact (c. 1965-2015), AI candidate programs were “hand coded”… that is to say, their code was written by hand, by humans, in human-readable computer languages (C++, Java, Python, etc.), in close consultation with teams of subject matter experts (chess grandmasters, factory automation engineers, linguistics professors, etc.). Sometimes these programs were written on pure theory.

Other times, they were given well-curated datasets to chew on and “draw their own conclusions” from. These included, for instance, optimized process-flow diagrams, or databases containing what humans believed to be the absolute optimal paths through the first 10 moves of a chess game (opening databases) or the final 30 moves (endgame databases). But these approaches were constrained both by the scarcity of well-annotated datasets and by the expense of storage.

1990: The Internet

Then along came the internet… the eventual repository of all public and semi-private data (not to mention ostensibly “secure” intranets and extranets: private, organizational data repositories built on common web data standards and syntax, whose total data footprint potentially dwarfs what exists on the public internet).

2000: EDI

Then along came Electronic Data Interchange (EDI) and XML, which standardized data schemas, database formats, and metadata annotation syntax across almost every field of human endeavor. In short, the quantity, quality, accessibility, and machine-readability of the available training data for any given AI program increased by several orders of magnitude between 1995 and 2005.
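To make “machine-readable” concrete, here’s a tiny, hypothetical illustration in Python (standard library only; the record and field names are invented, not any real EDI or industry XML schema): the same fact written as free text, and then as tagged, structured data a program can parse without guessing.

```python
import xml.etree.ElementTree as ET

# The same fact, twice. Free text needs a human (or a lot of guessing) to interpret:
free_text = "Acme ordered 12 widgets at $3.50 each on March 5th, 2004."

# A structured, tagged record (invented schema, for illustration only):
structured = """
<purchaseOrder date="2004-03-05">
  <buyer>Acme</buyer>
  <item sku="WIDGET-01" quantity="12" unitPrice="3.50"/>
</purchaseOrder>
"""

order = ET.fromstring(structured)
item = order.find("item")
total = int(item.get("quantity")) * float(item.get("unitPrice"))
print(order.get("date"), order.findtext("buyer"), total)  # -> 2004-03-05 Acme 42.0
```

The point of the 1995-2005 shift is that an ever-larger share of the world’s data began arriving in that second form, which is exactly the form a training pipeline can ingest at scale.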

[SIDENOTE: at that same time, many initiatives were launched with the intent of digitizing and annotating the sum total of human creativity — books, music, television, film. The most notable project of this type was Project Gutenberg. Read here for more on how that data ocean feeds the AI.]

All this time, Moore’s law was working its magic on storage systems. $500 of consumer hard-drive capacity exploded from 20MB (1982) to 500MB (1994) to 500GB (2004) to 20TB (2022) (a 20TB drive stores the equivalent of one million 20MB drives).

2010: Big Data

And a bit later, along came the concept of Big Data. As in, really big data. As in, grid-based databases that contained petabytes of structured information. And the code didn’t care about size. Its appetite was insatiable. And all we knew was: the more it digested, for the most part, the smarter it got.

So, this brings us to today.

And to the reason that computational neuroscience was born.

2020: AI

To the Age of Deep Learning.
To the exa-scale models of Neural Nets.

Scientists, realizing that these terabyte-sized neural nets (token nodegraphs, multidimensional matrices, weights & biases) were in fact total black boxes, opaque to any traditional computer-science analysis or error-tracking techniques.

Realizing that they were far more like human wetware (biological brains) than like any linear, hand-coded computer program. With that realization, thinking: perhaps we can learn something from MRIs, from neuroscientists… region detection, trace mapping, n-dimensional analyses… ah, let’s call it… computational neuroscience!
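For a sense of what those “weights & biases” actually are, here’s a minimal, hypothetical sketch in Python/NumPy (toy sizes, with random numbers standing in for values that would normally be learned from data): a two-layer net is nothing but a few matrices of numbers, and nothing in those numbers announces which concept, if any, each one encodes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network. In a real model these numbers are learned, not random,
# and there are hundreds of billions of them.
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)  # layer 1: 8 inputs -> 16 hidden units
W2, b2 = rng.standard_normal((4, 16)), rng.standard_normal(4)   # layer 2: 16 hidden -> 4 outputs

def forward(x):
    """Matrix multiply, add bias, apply a nonlinearity, repeat. That's the whole program."""
    h = np.maximum(0, W1 @ x + b1)  # hidden "neurons" (ReLU)
    return W2 @ h + b2              # output scores

print(forward(rng.standard_normal(8)))

# There is no if/else logic to read and no meaningfully named variable to inspect:
# the entire behavior lives in W1, b1, W2, b2. That is the black box.
```

Interpretability work of the kind described above boils down to watching which of those hidden “neurons” fire together across huge numbers of inputs and trying to put names to the patterns, which really is closer to neuroscience than to reading source code.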

To the Days of the Computational AI Neuroscientists.

At this point, we might wish to ask:

What is a Neural Net & how is it built?

visualization: computational neuroscience: the AI brain
AI: /imagine [prompt] “model of the human brain, stylized, vivid colors”

Introduction to Computational Neuroscience: