Linus's stream

In an ideal world, discovering new thoughts and ideas from your own notes would be as addictive and engaging as discovering new videos on YouTube or TikTok, or new people on Twitter.

One way to measure the progress of web search technology is to look at the set of knowledge the average person doesn't bother learning until they need it. The better the commodity search engine, the less effort people expend to "pre-learn" things before they really need them, because they can depend on the knowledge always being quickly accessible.

These days nobody bothers memorizing the population of a country or when the seasons start, or their friends' addresses and phone numbers. But I still find myself wanting to learn more abstract, long-form topics, because those can't simply be looked up "just in time" ... yet.

My Mac has been having increasingly frequent issues where the corespotlightd process starts consuming all the memory on the system, logging me out of iCloud and freezing the entire machine to the point where I can't even reboot.

Since I don't want to debug a macOS built-in process, and I can't turn it off in settings, the only reasonable solution seems to be to continually monitor the process's resident memory usage and kill it when it grows too large. I wrote up a little bash script to do just that.

  1. We need to ensure there's only one copy of this script running on the system at any given time, so I use a lockfile in /tmp.
  2. We filter ps aux to get the PID of corespotlightd, and if it's running, get its resident set size (real memory usage, more or less).
  3. If it's using more memory than $MAXRSS (1GB for now), kill -9 it. Repeat every 30 seconds.
#!/bin/bash

LOCKFILE=/tmp/limit_corespotlightd.lock
if [ -f "$LOCKFILE" ]; then
    exit
else
    touch "$LOCKFILE"
fi

# remove the lockfile on exit, so a stale lockfile
# doesn't block future runs
trap 'rm -f "$LOCKFILE"' EXIT

MAXRSS=1000000 # in kB, as reported by ps; ~1GB

while true; do
    PID=$(ps aux | grep '/corespotlightd$' | awk '{ print $2 }')

    if [ -n "$PID" ]; then
        # -p selects by PID; "rss=" suppresses the header line,
        # and tr strips the padding ps puts around the number
        RSS=$(ps -p "$PID" -o rss= | tr -d ' ')

        if [ "$RSS" -gt "$MAXRSS" ]; then
            echo 'corespotlightd is using' "$RSS" 'kB; killing pid' "$PID"
            kill -9 "$PID"
        fi
    fi

    sleep 30
done

People who were bullish on flowcharts in the 1960s and people who are bullish on "no-code" tools in the 2010s seem mistaken in the same kind of way.

Pseudocode as a tool of thought.

Its unique properties include:

  • it's a programming notation explicitly designed for thinking and communicating
  • a good balance between expressivity and non-ambiguity
  • at every point, the writer can choose between natural language and programming notation, balancing expressivity against precision at a granular level
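To make that last point concrete, here's a bit of pseudocode (my own example, not from any particular source) that mixes the two freely — the set operations are in precise notation, while the output step stays in natural language because its details don't matter to the idea:

```
seen := empty set
for each item in the input list:
    if item is not in seen:
        add item to seen
        print the item, one per line
```

Where precision matters (membership in seen), notation carries the weight; where it doesn't (exact output formatting), a loose phrase is enough.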

Posit — Self-driving as a skill requires natural language understanding as a sub-skill.

Natural language understanding is not just a skill in itself, but might also be a notation that adds to an intelligence's ability to abstract and generalize broadly about the world. This ability to abstract and generalize is key to many kinds of performed intelligence, and self-driving might be a complex enough activity that it requires a level of generalization power that is a superset of natural-language understanding and reasoning.

A related question is, could a very, very smart but non-linguistic animal drive? I'm not so sure.

There's a subtle but firm distinction between augmenting human productivity and augmenting human intelligence. The first is primarily an economic, capitalistic endeavor (though productivity contributes indirectly to collective intelligence); the second is a purer pursuit, I think.

Noteworthy features of Starlark

Starlark is Bazel's configuration language, and is not designed to be general-purpose. Nonetheless, it has some features that seem useful even for a dynamic general-purpose language.

  1. Single, final assignment at the top level. Module-global functions and variables cannot be re-bound. This makes code easier to read and simplifies tooling.
  2. Deterministic iteration order for dictionaries, and in general determinism (a program run twice always produces the same outcome, modulo things like time). Determinism seems like a generally desirable property, for things like testing/reproducible builds.
  3. No mutation of a collection during iteration. Mutating a collection (like a list) while iterating over it fails the program, avoiding iterator-invalidation errors.
  4. No exceptions. Failing the program on any unanticipated error might seem problematic, but it "makes the language simpler and reduces the number of concepts." Exceptions would also become API surface for the language, so omitting them helps the language evolve.
  5. Strings are not iterable. This avoids bugs from passing in a string instead of a length-1 list of strings to APIs expecting a list.
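Several of these properties are visible in a short Starlark fragment (a sketch of the semantics described above; the names are mine, and this is Starlark, not Python):

```
GREETING = "hello"   # top-level bindings are final;
# GREETING = "hi"    # re-binding here would be a static error

def shout(words):
    d = {"a": 1, "b": 2}
    keys = [k for k in d]   # deterministic iteration: always ["a", "b"]

    for w in words:
        # words.append(w) here would fail the program:
        # no mutating a list while iterating it
        print(GREETING + ", " + w + "!")

    # for c in GREETING: ...  # error: strings are not iterable

shout(["world"])   # any unhandled error simply fails the whole program
```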

Starlark, a configuration language for the Bazel build tool, has a tree-walking interpreter implemented in pure Go at google/starlark-go.

It seems like the canonical implementation of Starlark is the Java implementation in the Bazel source tree, but the Go version is used "in production" in web playgrounds, debuggers, and so on.

starlark-go is notable because it's one of a vanishingly small number of production language implementations that are tree-walking interpreters rather than bytecode VMs. The implementation guide says:

The evaluator uses a simple recursive tree walk, returning a value or an error for each expression. We have experimented with just-in-time compilation of syntax trees to bytecode, but two limitations in the current Go compiler prevent this strategy from outperforming the tree-walking evaluator.

The details of why exactly that's the case are interesting, and documented further in the link, but they seem inherent to Go's current compiler design and philosophy. It also supports Oak's current (tree-walking) evaluator design, which is nice.
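As a toy illustration of the strategy (not starlark-go's actual AST or API — the node types here are invented), a tree-walking evaluator is just a recursive function over the syntax tree that returns a value or an error for each node:

```go
package main

import "fmt"

// Expr is any node in the (toy) syntax tree.
type Expr interface{}

// Lit is an integer literal.
type Lit struct{ Val int }

// BinOp is a binary arithmetic expression.
type BinOp struct {
	Op   byte // '+' or '*'
	L, R Expr
}

// eval walks the tree recursively, returning a value or an error
// for each expression — the same shape as the starlark-go doc describes.
func eval(e Expr) (int, error) {
	switch e := e.(type) {
	case Lit:
		return e.Val, nil
	case BinOp:
		l, err := eval(e.L)
		if err != nil {
			return 0, err
		}
		r, err := eval(e.R)
		if err != nil {
			return 0, err
		}
		switch e.Op {
		case '+':
			return l + r, nil
		case '*':
			return l * r, nil
		}
		return 0, fmt.Errorf("unknown operator %q", e.Op)
	}
	return 0, fmt.Errorf("unknown node %T", e)
}

func main() {
	// (2 + 3) * 4
	tree := BinOp{'*', BinOp{'+', Lit{2}, Lit{3}}, Lit{4}}
	v, err := eval(tree)
	fmt.Println(v, err) // 20 <nil>
}
```

The appeal is how little machinery this takes — no bytecode format, no VM loop, no compilation pass — at the cost of interpretation overhead on every visit, which is the tradeoff the starlark-go authors benchmarked.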

Typing on a typewriter for a while and then going back to a shallow laptop keyboard is a trippy experience.