This week we adopted a kitten that I found yowling in one of the olive trees. (I wanted to name him Oliver, for obvious reasons, but the kids chose Ares instead.) Apart from Ares, the week’s theme is sharpening my tools, both literal and figurative.

Literally: my scythe blade arrived (replacing the one that broke just as I was finishing the last mow of last year). I’ve mown one row so far, and been reminded how far my fitness level is from where I would like it to be. But also how satisfying a tool the scythe is to use, and how beautiful a well-laid row of scythe-mown grass can be.

Figuratively: I’ve been thinking a lot about what slows me down or gets in my way, and about building up my toolkit to smooth and speed things along. I’m particularly thinking of yak-shave problems like “I need to parse this file format; what parsing library should I use and how does it work?”

I’ve also been rediscovering the joy of what you might call “algorithmic” programming: solving a problem that’s mostly about manipulating data (as opposed to the UI, UX, and development workflows my day job mostly involves).

One of the recurring speedbumps I run into is wanting to understand the dependency structure of a codebase: I want a visual dependency graph, with decent automatic layout, and also to be able to tweak the layout manually and add item groups as my semantic understanding grows. (Also a pony.) This comes up regularly enough that I’ve started learning the dot language used by Graphviz, so I can make my own diagrams.
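To give a flavour of dot, here’s a toy dependency graph with made-up module names, including one of those item groups (a “cluster” in dot terminology):

digraph deps {
    rankdir=LR;

    subgraph cluster_core {
        label="core";
        parser;
        model;
    }

    cli -> parser;
    parser -> model;
    web -> model;
}

Graphviz lays this out left to right, draws a box around the cluster, and routes the edges automatically, which covers the “decent automatic layout” part of the wishlist.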

The other half of that problem is extracting the dependency structure from a codebase. As a proof of concept I’ve gathered around 100 Python setup.py files from work (courtesy of all-repos), which leads to the next yak-shave: parsing them accurately enough to extract their dependencies is non-trivial, and I don’t have a sharp parsing tool in my toolbox yet.

So I’ve been spending some quality time with the swift-parsing library by Point-Free. The initial experience was a bit rough, with terrible compilation times and segmentation faults when running the compiled parser, but Brandon from Point-Free set me straight. The segfaults seem to come from Swift lazily initialising global parser variables, and the compiler generally struggles to type-check large numbers of globals. Recent versions of the library introduce a parser definition style similar to SwiftUI, where each parser is a struct whose var body introduces a result builder, and the compiler handles that much better. Now that I’m over that stumbling block, and with the extensive examples in their test suites to crib from, I’m pretty confident this tool will find a permanent home in my toolkit.
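To give a feel for that style, here’s a minimal warm-up sketch (my own naming, and it needs recent versions of Swift and the library): a parser for single-quoted Python strings only, with no escape handling.

import Parsing

// A single-quoted Python string like 'requests'.
// Warm-up sketch only: no escapes, no other quoting styles.
struct SingleQuotedString: Parser {
    var body: some Parser<Substring, Substring> {
        "'"
        Prefix { $0 != "'" }
        "'"
    }
}

let dep = try SingleQuotedString().parse("'requests'")
// dep == "requests"

The string literals consume the quotes and produce nothing, so the builder’s output is just the Substring from Prefix: small pieces like this compose into bigger parsers without the pile of global variables that was tripping up the compiler.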

I wouldn’t say I have it sharp yet, but after a few evenings’ hacking I’m parsing enough Python for those 100 or so setup.py files. That covers a surprising amount of messy detail: there are single-, double-, and triple-quoted strings; there are strings concatenated by adjacency and by explicit + operators; there are dict literals and dict() constructors. I even had to build a tiny interpreter with variable assignment and lookup, because some files used inline assignment to avoid duplicating repeated lists of dependencies:

extras_require={
    'core': (zope := [...deps here...]),
    'test': zope + [...more deps here...]
}
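That’s what drove the tiny interpreter: an expression type covering just the forms above, evaluated against an environment mapping names to dependency lists. A simplified sketch, with illustrative names (the real thing has to handle more expression forms):

indirect enum Expr {
    case list([String])            // a list literal of dependency strings
    case assign(String, Expr)      // name := expr
    case name(String)              // a bare name
    case concat(Expr, Expr)        // expr + expr
}

func eval(_ expr: Expr, _ env: inout [String: [String]]) -> [String] {
    switch expr {
    case let .list(deps):
        return deps
    case let .assign(name, value):
        let deps = eval(value, &env)
        env[name] = deps           // bind the name for later lookups...
        return deps                // ...and yield the value, as := does
    case let .name(name):
        return env[name] ?? []     // lookup; a real version should error here
    case let .concat(lhs, rhs):
        return eval(lhs, &env) + eval(rhs, &env)
    }
}

// Evaluating the extras_require above, with placeholder dependencies:
var env: [String: [String]] = [:]
let core = eval(.assign("zope", .list(["dep1", "dep2"])), &env)
let test = eval(.concat(.name("zope"), .list(["dep3"])), &env)
// core == ["dep1", "dep2"], test == ["dep1", "dep2", "dep3"]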

I find it immensely satisfying to tackle this kind of programming task, with a well-defined input/output behaviour and an incremental list of complications to deal with one by one, steadily increasing the number of files that are successfully parsed until the whole list is covered. In a way it’s similar to the satisfaction of steadily mowing row by row until the entire field is clear, but with just enough added intellectual challenge.

Next up: the dependency graph. (Remember the dependency graph? This is a blog post about the dependency graph.)