As a society, we need to evolve beyond reading and writing walls of text.
People need to be more thoughtful about building products on top of LLMs. The fact that they generate text is not the point. LLMs are cheap, infinitely scalable, predictably consistent black boxes for soft, human-like reasoning. That's the headline! The text I/O mode is just the API to this reasoning genie; it's a side effect of the training paradigm (self-supervision on Internet-scale text data).
A vanishingly small slice of knowledge work has the shape of text-in, text-out (copywriting, Jasper). The real alpha is not in generating text, but in taking this new capability and wrapping it into jobs that have other shapes. In the best LLM products, text generation will be an implementation detail, much as backend APIs are for today's SaaS.
Text, like code, is a liability, not an asset. An organization should strive to own only as much text as it needs to express its information and accomplish its tasks. If you don't heed this warning, you end up with a Notion workspace that has 10 copies of every piece of information, 4 of which contradict each other and only 2 of which reliably surface in searches. Willy-nilly spraying the GPT-3 next-token-prediction powder onto your tool or product is a recipe for disaster outside of the narrow workflows where text is the asset being produced. In all other cases, don't ship the API to the user. Text generation is not the product.
Notion's "AI" product is an affront to Doug Engelbart's name. Is there nobody left at the company who's thinking creatively about AI x knowledge work?
Thinking in neighborhoods
As I've been using latent space navigation tools more in my own thinking work, I've noticed that every idea no longer feels independent and singular, but instead feels like one of many (infinite?) possible variations, from which I've simply plucked one version. If I look just to the left and right of any given idea, there are similar ideas from different perspectives and different ideas with shared perspectives, simply waiting to be made visible.
I call this thinking in neighborhoods -- where every thought exists within a neighborhood of other ideas tightly connected to each other, just out of sight.
For example, if I write down a sentence about generation ships traveling across the universe:
Generation ships are spacecraft where multiple generations of people live and die as they travel towards some destination, such as another star system. Because of the general difficulty of space travel and the speed limit of light, these voyages might take centuries to complete.
Rather than thinking of that sentence as, well, just a sentence about some spaceships, I'm beginning to feel the "neighborhood" of ideas that lies next to it.
Some points in this idea neighborhood are just paraphrases, but others retell the story in different styles. Yet others change the setting, cast the idea in a positive or negative light, or change the topic entirely while keeping the tone and structure of the sentence. Of the outputs from the model I used to study this idea, my favorite compared the scale of interstellar travel to the vastness of the ocean, and another speculated that planets themselves may be considered a kind of generation ship.
As I've adjusted to this new feeling, a complementary sensation has drifted into the back of my mind: in a world where ideas exist in this infinitely densely packed fabric of variations, thinking in individual strands of thought, without a visceral awareness of the rich variations around every one, seems like a loss, like trying to take in the night sky with a telescope that only sees one star at a time. My hope is that with some combination of better tools and better ways to represent and communicate ideas, we'll be able to open our minds to the infinite variations present just beside every word and idea we perceive today.
Added a few functions for drawing histograms to the Oak standard library, and now I can do this!
std.range(100000) |>
std.map(random.normal) |>
debug.histo({ bars: 20, label: :start, cols: 50 })
... which renders ...
7 ▏
38 ▏
175 ▌
574 █▋
1740 █████
3951 ███████████▍
7804 ██████████████████████▌
12066 ██████████████████████████████████▉
15913 ██████████████████████████████████████████████
17296 ██████████████████████████████████████████████████
15783 █████████████████████████████████████████████▋
11614 █████████████████████████████████▋
7160 ████████████████████▊
3562 ██████████▎
1527 ████▍
577 █▋
159 ▌
40 ▏
12 ▏
1 ▏
Might our descendants look upon our probable emergence and departure from Earth's gravity well the same way we look upon our ancestors' emergence and departure from the ocean — as a necessary proving ground that we outgrew?
I was trying to develop an intuition for how "hard" masked language modeling and masked text reconstruction (autoencoding) are, so I wrote a little script to "mask" a certain % of words from stdin.
{
println: println
default: default
map: map
stdin: stdin
constantly: constantly
} := import('std')
{
split: split
trim: trim
join: join
} := import('str')
Cli := import('cli').parse()
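// read the mask fraction from --mask-fraction or the short -m flag, falling back to 0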
MaskFraction := float(Cli.opts.'mask-fraction') |>
default(float(Cli.opts.m)) |>
default(0)
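// for each line, mask each space-delimited word independently with probability
// MaskFraction, replacing every character of a masked word with '_'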
stdin() |>
trim() |>
split('\n') |>
map(fn(l) l |> split(' ')) |>
map(fn(l) l |> map(fn(w) if rand() < MaskFraction { true -> w |> map(constantly('_')), _ -> w })) |>
map(fn(l) l |> join(' ')) |>
join('\n') |>
println()
and then e.g.
pbpaste | oak masker.oak --mask-fraction 0.3
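To get a feel for the output, here's an illustrative masking of the generation-ship sentence from earlier at roughly a quarter to a third of the words (a made-up example, not actual script output). Each masked word keeps its length, since the script replaces each of its characters with an underscore:
Generation ships are __________ where ________ generations of people live and ___ as they ______ _______ some destination, such as _______ star system.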
Difficulty shoots through the roof for me around 30% masking. (Coincidentally, this is also the masking rate at which I'm training my current ContraBT5 bottleneck model.)