Updates on 2024/2/3

New representations simplify complex knowledge tasks

Better numeric notation made complex arithmetic trivial; better data displays make complex statistical reasoning easy; better diagrams make reasoning about complex particle physics tractable.
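To make the arithmetic point concrete, here's a toy sketch (my example, not from the original note): positional notation reduces multiplication of arbitrarily large numbers to a mechanical digit-by-digit routine, something notations like Roman numerals never afforded.

```python
# Toy sketch: grade-school multiplication, which positional notation
# reduces to a simple digit-by-digit routine. (Illustrative example only.)

def multiply_digits(a: str, b: str) -> str:
    """Multiply two non-negative integers given as decimal digit strings."""
    result = [0] * (len(a) + len(b))
    # Work from the least significant digit of each operand.
    for i, da in enumerate(reversed(a)):
        for j, db in enumerate(reversed(b)):
            result[i + j] += int(da) * int(db)
            result[i + j + 1] += result[i + j] // 10  # propagate the carry
            result[i + j] %= 10
    digits = "".join(map(str, reversed(result))).lstrip("0")
    return digits or "0"

print(multiply_digits("47", "23"))  # 1081
```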

I think what we'll find when we look back at the success of language models from the future is that neural language models work so well because they learn a new way to represent knowledge in high-dimensional latent spaces. They aren't doing anything particularly remarkable algorithmically; rather, the representations neural models use internally to work with knowledge are far superior to the text/token sequences humans use, and that's where the magic comes from: complex thought becomes simple in the right representation.
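As a minimal sketch of what "simple in the right representation" can mean, here are hand-assigned toy word vectors (the words, dimensions, and values are all invented for illustration; real embeddings are learned and far higher-dimensional). In a good latent space, a relation like gender becomes a vector offset, so analogy reduces to arithmetic:

```python
import numpy as np

# Toy, hand-assigned vectors: dimensions stand for (royalty, gender, plural).
# Real embeddings are learned; this only sketches the idea.
vocab = {
    "king":  np.array([1.0,  1.0, 0.0]),
    "queen": np.array([1.0, -1.0, 0.0]),
    "man":   np.array([0.0,  1.0, 0.0]),
    "woman": np.array([0.0, -1.0, 0.0]),
}

def nearest(v, exclude):
    """Return the vocab word whose vector is closest (by cosine) to v."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    return max((w for w in vocab if w not in exclude), key=lambda w: cos(vocab[w], v))

# "king - man + woman": the analogy becomes plain vector arithmetic.
v = vocab["king"] - vocab["man"] + vocab["woman"]
print(nearest(v, exclude={"king", "man", "woman"}))  # queen
```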

I also have a speculative sense that, as we understand these models better, we'll keep being surprised by how much of a model's computational capacity is allocated to simply translating semantics between human (token) representations and its own continuous latent representations of meaning.
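One way to picture that translation boundary, as a toy sketch with random weights rather than any real model's internals: tokens enter and leave through embedding and unembedding matrices, while everything in between operates on continuous vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 50, 16  # toy sizes, chosen arbitrarily

# Translation layers at the model's boundary:
W_embed = rng.normal(size=(vocab_size, d_model))    # token id -> latent vector
W_unembed = rng.normal(size=(d_model, vocab_size))  # latent vector -> token logits

token_id = 7
h = W_embed[token_id]    # into the continuous latent space
# ... all the interesting computation happens here, on vectors like h ...
logits = h @ W_unembed   # back out to the discrete token space
next_token = int(np.argmax(logits))
print(h.shape, logits.shape, next_token)
```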

More in Notational intelligence (2023).