Sparse autoencoding is Bayesian program induction
- The sparse code is the program
- Encoding is program synthesis
- Decoding is program interpretation
Training optimizes the system for program synthesis under a prior, imposed through regularization or a Bayesian inductive bias, that favors sparse, well-factored solutions with few distinct parts.
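As a concrete illustration of that prior-as-regularization idea, here's a minimal sketch of a sparse autoencoder with an L1 penalty on the code, in PyTorch. The architecture, names, and hyperparameters are my own assumptions for illustration, not a specific implementation from this note.

```python
# Minimal sparse autoencoder sketch (assumed architecture, for illustration only).
# The L1 penalty on the code plays the role of the prior that favors sparse,
# well-factored "programs" with few active parts.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_input=784, d_code=1024):
        super().__init__()
        self.encoder = nn.Linear(d_input, d_code)   # encoding = program synthesis
        self.decoder = nn.Linear(d_code, d_input)   # decoding = program interpretation

    def forward(self, x):
        code = torch.relu(self.encoder(x))           # the sparse code = the program
        recon = self.decoder(code)
        return recon, code

def loss_fn(x, recon, code, l1_coeff=1e-3):
    # Reconstruction error: how well the "program" explains the observation.
    # L1 penalty: the prior favoring solutions with few distinct parts.
    recon_loss = ((recon - x) ** 2).mean()
    sparsity = code.abs().mean()
    return recon_loss + l1_coeff * sparsity
```

Each gradient step against this combined loss is the neural analogue of the iteration described below: the system keeps searching for the simplest factoring of its observations that still reconstructs them.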
Intelligence is both a sparse representation learning problem and a Bayesian program synthesis problem: both approaches try to grow the simplest possible factoring of observations into features/abstractions, iteratively exploring a design space guided by their priors over the learned features. In neural networks, iteration takes the form of training steps; in Bayesian program induction, DreamCoder-style wake-sleep cycles.
LLM agents, RLHF, MuZero, and evolutionary algorithms are all special cases of this general framework.
Complete delegation
Over the last year, I've changed my mind quite a bit on the role and importance of chat interfaces. I used to think they were the primitive version of rich, creative, more intuitive interfaces that would come in the future; now I think conversational, anthropomorphic interfaces will coexist with richer, more dexterous ones, and both will evolve over time to be more intuitive, capable, and powerful.
I'll say much more on that later, but because of this, I recently built a little chatbot agent that eagerly executes its actions without any human approval. I ask questions and request edits, and the system takes action fully autonomously, sometimes giving me post-hoc updates about what actions it performed.
I intentionally built the interface to be as opaque as possible. I talk to it through iMessage, and I can't see the specific API calls that were made nor the data that was recorded. I can only see the text the LLM decides to show me. There is no human in the loop, except me asking it to do things. The only safety mechanism is that all actions are append-only, so there's no risk of data loss (only of massive data chaos).
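To make the append-only guarantee concrete, here's a tiny hypothetical sketch of what such an action log could look like; the class, file name, and record format are assumptions for illustration, not the actual implementation.

```python
# Hypothetical append-only action log: records are only ever added, never
# updated or deleted, so any earlier state can be reconstructed from history.
import json
import time

class AppendOnlyLog:
    def __init__(self, path="actions.jsonl"):
        self.path = path

    def append(self, action: str, payload: dict) -> None:
        record = {"ts": time.time(), "action": action, "payload": payload}
        with open(self.path, "a") as f:   # append mode only; nothing is overwritten
            f.write(json.dumps(record) + "\n")

    def history(self) -> list:
        # The current state is a pure function of the full record history.
        try:
            with open(self.path) as f:
                return [json.loads(line) for line in f]
        except FileNotFoundError:
            return []
```

Mistakes can still pile up as clutter, but nothing the agent does can destroy what was already there: no data loss, only the possibility of data chaos.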
It felt weird for a bit — I kept checking the database manually after each interaction to see it was indeed updating the right records — but after a few hours of using it, I've basically learned to trust it. I ask it to do things, it tells me it did them, and I don't check anymore. Full delegation.
To be in this mode of collaboration, I've completely given up having specific opinions about the exact state of the underlying system. I can't micro-manage. I care little about the specific data format or details the model generates, only that when I ask it for some information in the future, the system will give me a good response. I only care about the I/O, not the underlying data representation. If the LLM is intelligent and coherent enough over long ranges of actions, I trust that the overall system will also remain coherent.
I think this might be the right way to do delegation to a generative model-powered autonomous system in the long term: you must believe in the big, messy blob, and trust that the system is doing the right thing.
How can I trust it? High task success rate — I interact with it, and observe that it doesn't let me down, over and over again. The price for this degree of delegation is giving up control over exactly how the task is done. It often does things differently from the way I would, but that doesn't matter as long as outputs from the system are useful for me.
A little glimpse into a future of fully autonomous, natural language-commanded systems. Do you see the light? The trick is to not hesitate. Close your eyes and leap.
Unlocking progress
It’s always been true that the bottleneck to human progress is not the pace of computation on the planet, but the fidelity of the interface between humans and their tools. The fastest jet will only take you to the wrong destination quicker if the pilot cannot steer precisely.
This is only going to become truer as the computation and knowledge available to each human accelerate upwards.
Interface is an under-invested common good: innovation in it is hard to create, even harder to defend in the market, and yet progress here is essential to our future.
New representations simplify complex knowledge tasks
Better numeric notation made complex arithmetic trivial. Better data display makes complex statistical reasoning trivial. Better diagrams make reasoning about complex particle physics tractable.
I think what we'll find in retrospect, when we look back at the success of language models from the future, is that neural language models work so well because they learn a new way to represent knowledge in high-dimensional latent spaces. They aren't doing anything particularly remarkable algorithmically, but the representations they use internally to work with knowledge are far superior to the text/token sequences humans use, and that's where the magic comes from: complex thought becomes simple in the right representation.
I also have a speculative sense that as we understand these models better, we are going to continue to be surprised by how much of the model's computational capacity is allocated to simply translating semantics between the human (token) representations and its own, continuous latent representations of meaning.
More in Notational intelligence (2023).
The depth of a romance is directly proportional to the fraction of Taylor Swift's discography lived through.
"Authoring" vs "Assembling"
In the case of generative AI, arguing that these systems "assemble" more than "author" makes for a provocative bit, but as is usually the case with these things, I think these are two ends of a spectrum. Authoring is perhaps just assembly with smaller, more fundamental atoms, or assembly in a different dimension.
This also goes back to composition, a core aspect of any language or semantics that I've been thinking a lot about. Tokens (words; atomic units of text) and features (atomic units of meaning; latent dimensions) compose ("assemble") differently, but I think both can be understood as forms of assembly.
An excerpt from an internal doc on AI:
My belief is that, once the dust settles, humans are still working — still creating, still wanting, still debating. But that our time and energy will take us farther with the new tools and collaborators we'll have created along the way. We hope these tools will help our best work become even better, while letting us delegate away the minutiae of the meta-work that takes us away from it.
I've spent a lot of time over the years desperately trying to think of a "thing" to change the world. I now know why the search was fruitless — things don't change the world. People change the world by using things. The focus must be on the "using", not the "thing". Now that I'm looking through the right end of the binoculars, I can see a lot more clearly, and there are projects and possibilities that genuinely interest me deeply.
— Bret Victor
This is the humanist view of AI that I think should fuel our work. This is what we should try to see, and where we should aim, when we build with AI: extensions of our human agency rather than amplifiers of the mechanics of our labor.
On this, I loved what Maya added:
...the AI should gesture, not instruct, us. "Gesturing" preserves our agency, "instructing" infringes upon it.
I want to build interfaces that let the AI gesture us into a better future without infringing upon our agency.