A good thinking tool shouldn't just hand users answers to their questions, but also guide and enable them to discover and articulate more complex questions.
Asking more complex questions and discovering answers to them feed each other: each answer leads to even more nuanced questions. Without one, the potential of the other becomes limited.
A related thought: While building tools to solve hard problems for humans, we should also strive to improve people's depth of engagement with those complex problems and their solutions, as a way to preserve human agency when working with increasingly capable aids for our work. Otherwise, we risk losing touch with, and therefore losing our understanding of, critical decisions.
Scale xor Explore, a hypothesis.
In innovation ecosystems, for efficient resource allocation, all resources in an organization must go towards only one of two spends:
1. Scale: Taking some working formula for solving a problem or producing something valuable, where there is a "sign of life" and a way to scale production, and single-mindedly scaling it;
2. Explore: Open-ended exploration to discover new signs of life in new regimes or transformative technologies.
These feel like two distinct modes of operating a single group of people. An organization is either doing (1), Scale, or (2), Explore, and any attempt to straddle them by doing something in between will not do what you wish it would.
So, how to blend the benefits of both?
In larger organizations, while each team must be in one mode or the other, the organization as a whole can hold a portfolio of bets that combines both approaches at a sub-team level to trade off risk tolerance against upside. Some teams can be working in Scale mode (1), while others are in Explore mode (2).
Too many people talk about how to recreate Bell Labs and Xerox PARC; not enough talk about how to recreate OpenAI c. 2015-2019.
That period is arguably the most interesting adolescence of the most interesting company on the planet today, and OpenAI's model was also very different from those earlier labs. How did they do it when Google, DeepMind, etc. couldn't (or didn't), and without the infinite money fountains of Bell and Xerox?
Those who play for their opponents will never beat those who play for the love of the game.
Too many tools for thinking, not enough tools for dreaming. But really, aren't some of our best ideas found in dreams?
Underrated use case of a vision-language model: easily writing detailed, descriptive alt text for images in my blogs.
What are conference talks about?
Something I always think about when I write talks, which (I hope) sets them apart from the average conference talk, is that I'm not there to sell you anything. I don't care if you use the same tools I use. I'm not really there to sell Notion either. It's crazy how so much industry conf content is an ad these days. Ads obfuscate and conflate truth and opinion.
Conferences with no ads disguised as talks are so rare. This is why events like Handmade Seattle or Strange Loop get so much love. They are about technology and people and values, not tools and companies.
When I write a talk, I almost always just want you to walk away thinking about the technology you create as an instrument for advancing your values, and a lens through which to view the world with those values. And if I do my job right, you won't go back and use the library I talked about, or whatever. You'll think about the values you're advancing when you build your technology, and think about the perspective it reveals to its users and audiences. All else is implementation detail.
Of course, there are other topics that can make amazing talks, but I've found the most timeless and valuable ones I've watched from my personal heroes to be about these themes, and whenever organizers bless me with a license to talk about whatever I want to a captive audience (maybe too often), this is what I'll choose to try to communicate.
Everyone seems quite focused on using mech interp for richer ways to interact with models, but I think the real trillion-dollar idea is in using mech interp to interact more directly with information. Models are just lenses onto information; they are but distributions we explore.