Too many people talk about how to recreate Bell Labs and Xerox PARC; not enough talk about how to recreate OpenAI c. 2015-2019.
Arguably the most interesting adolescence of the most interesting company on the planet today, OpenAI's model is also very different from its predecessors'. How did they do it when Google, DeepMind, etc. couldn't (or didn't), and without the infinite money fountains of Bell and Xerox?
Those who play to beat their opponents will never beat those who play for the love of the game.
Too many tools for thinking, not enough tools for dreaming. But really, aren't some of our best ideas found in dreams?
Underrated use case of a vision-language model: easily writing detailed, descriptive alt text for images in my blogs.
What are conference talks about?
Something I always think about when I write talks, which (I hope) sets them apart from the average conference talk, is that I'm not there to sell you anything. I don't care if you use the same tools I use. I'm not really there to sell Notion either. It's crazy how so much industry conf content is an ad these days. Ads obfuscate and conflate truth and opinion.
Conferences with no ads disguised as talks are so rare. This is why events like Handmade Seattle or Strange Loop get so much love. They are about technology and people and values, not tools and companies.
When I write a talk, I almost always just want you to walk away thinking about the technology you create as an instrument for advancing your values, and a lens through which to view the world with those values. And if I do my job right, you won't go back and use the library I talked about, or whatever. You'll think about the values you're advancing when you build your technology, and think about the perspective it reveals to its users and audiences. All else is implementation detail.
Of course, there are other topics that can make amazing talks, but the most timeless and valuable talks I've watched from my personal heroes have been about these themes, and whenever organizers bless me with a license to talk about whatever I want to a captive audience (maybe too often), this is what I'll choose to try to communicate.
Everyone seems quite focused on using mech interp for richer ways to interact with models, but I think the real trillion dollar idea is in using mech interp to interact more directly with information. Models are just lenses to information — they are but distributions we explore.
Sparse autoencoding is Bayesian program induction
- The sparse code is the program
- Encoding is program synthesis
- Decoding is program interpretation
Training optimizes the system for the task of program synthesis, under a prior that favors sparse, well-factored solutions with few distinct parts, imposed through regularization or a Bayesian inductive bias.
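To make the mapping concrete, here's a minimal PyTorch sketch of a sparse autoencoder annotated with the analogy. The dimensions, the ReLU encoder, and the L1 weight are illustrative assumptions, not taken from any particular system:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_input=512, d_code=4096):
        super().__init__()
        # Encoding is program synthesis: search for a sparse code
        # (a "program") that explains the observation.
        self.encoder = nn.Linear(d_input, d_code)
        # Decoding is program interpretation: run the code to
        # reconstruct the observation.
        self.decoder = nn.Linear(d_code, d_input)

    def forward(self, x):
        code = torch.relu(self.encoder(x))  # the sparse code is the program
        return self.decoder(code), code

def loss_fn(x, x_hat, code, l1_weight=1e-3):
    # Reconstruction term: how well the "program" explains the data.
    recon = (x_hat - x).pow(2).mean()
    # L1 penalty: the sparsity prior. In MAP terms, this is a Laplace
    # prior over codes, favoring solutions with few distinct parts.
    sparsity = code.abs().mean()
    return recon + l1_weight * sparsity
```

Seen this way, the L1 penalty is exactly the "favor few distinct parts" prior stated above, expressed as a regularizer instead of an explicit Bayesian prior.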
Intelligence is both a sparse representation learning problem and a Bayesian program synthesis problem; both approaches try to grow the simplest possible factoring of observations into features/abstractions, iteratively exploring a design space guided by their priors over the learned features. In neural networks, iteration takes the form of training steps; in Bayesian program induction, of DreamCoder-style wake-sleep cycles.
LLM agents, RLHF, MuZero, and genetic evolution algorithms are all special cases of this general framework.
Complete delegation
Over the last year, I've changed my mind quite a bit on the role and importance of chat interfaces. I used to think they were the primitive version of rich, creative, more intuitive interfaces that would come in the future; now I think conversational, anthropomorphic interfaces will coexist with richer, more dexterous ones, and the two will both evolve over time to be more intuitive, capable, and powerful.
I'll say much more on that later, but because of this, I recently built a little chatbot agent that eagerly executes its actions without any human approval. I ask questions and request edits, and the system takes action fully autonomously, sometimes giving me post-hoc updates about what actions it performed.
I intentionally built the interface to be as opaque as possible. I talk to it through iMessage, and I can't see the specific API calls that were made nor the data that was recorded. I can only see the text the LLM decides to show me. There is no human in the loop, except me asking it to do things. The only safety mechanism is that all actions are append-only, so there's no risk of data loss (only of massive data chaos).
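For a sense of the shape of such a system, here's a minimal sketch of the loop. Every name in it (AppendOnlyStore, run_turn, call_llm) is hypothetical; the real thing talks over iMessage and calls an actual LLM API, but the append-only property is the point:

```python
import json
from dataclasses import dataclass, field

@dataclass
class AppendOnlyStore:
    """Records are only ever appended, never updated or deleted in
    place; an "edit" appends a new version. The worst failure mode
    is clutter, not data loss."""
    records: list = field(default_factory=list)

    def append(self, record: dict) -> None:
        self.records.append(record)

def run_turn(message: str, store: AppendOnlyStore, call_llm) -> str:
    # Ask the model what to do, then execute eagerly: no approval
    # step, no human in the loop.
    plan = json.loads(call_llm(message))
    for action in plan.get("actions", []):
        store.append(action)  # every write is append-only
    # The user sees only the text the model chooses to surface,
    # not the underlying calls or data.
    return plan.get("reply", "Done.")
```

Because the store only ever grows, the worst a bad tool call can do is add a wrong record that a later one supersedes.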
It felt weird for a bit — I kept checking the database manually after each interaction to see that it was indeed updating the right records — but after a few hours of use, I basically learned to trust it. I ask it to do things, it tells me it did them, and I don't check anymore. Full delegation.
To be in this mode of collaboration, I've completely given up having specific opinions about the exact state of the underlying system. I can't micro-manage. I care little about the specific data format or details the model generates, only that when I ask it for some information in the future, the system will give me a good response. I only care about the I/O, not the underlying data representation. If the LLM is intelligent and coherent enough over long ranges of actions, I trust the overall system will also remain coherent.
I think this might be the right way to do delegation to a generative model-powered autonomous system in the long term: you must believe in the big, messy blob, and trust that the system is doing the right thing.
How can I trust it? High task success rate — I interact with it, and observe that it doesn't let me down, over and over again. The price for this degree of delegation is giving up control over exactly how the task is done. It often does things differently from the way I would, but that doesn't matter as long as outputs from the system are useful for me.
A little glimpse into a future of fully autonomous, natural language-commanded systems. Do you see the light? The trick is to not hesitate. Close your eyes and leap.