010 - bug hunting with lincoln

05 Apr, 2024

Also in this post: experimenting with blog structure and documenting code with line art.

Bug Hunting with Lincoln

Test-driven development (TDD) is a widely recommended approach to building software without bugs. The general idea of TDD is that, when building a new feature, you start by writing a test that describes how the feature should work. You then run the test and see that it fails. To fix your code, you make the minimum changes required to make the test pass.

Here's a very silly example:

def add_two_numbers(number_1, number_2):
    pass

def test_can_add_4_and_5():
    assert add_two_numbers(4, 5) == 9

We make this change to make our test pass:

def add_two_numbers(number_1, number_2):
    return 9

Tests pass! Are we done? No. There is a glaring bug here - adding any two numbers that do not sum to 9 will give an incorrect answer. To solve this, we start by adding another failing test:

def test_can_add_5_and_6():
    assert add_two_numbers(5, 6) == 11

Now, unless we are being extremely obtuse, we might end up with a "correct" implementation of add_two_numbers:

def add_two_numbers(number_1, number_2):
    return number_1 + number_2

While this example is very silly, it maps pretty well to implementing a real world feature of reasonable complexity. You start with a very simple test, make it pass with some very simple code, then iteratively add slightly more complicated tests and update your code until you end up with a robust feature.

This week I wrote a new, fairly complex feature in eno. I started by writing a test describing how the feature should work. I then wrote the code to make it work. It took a few hundred lines of code to implement the simplest version of the feature and make the test pass. At this point, I knew of specific bugs that existed. I even knew exactly what test to write next to illustrate the shortcomings of my work. I was about to do this when it occurred to me: "Why don't I get Lincoln to find this bug?"

If you're new to the blog, Lincoln is the deterministic simulator I am writing alongside eno. It simulates how an actual user might interact with eno by writing documents, making mistakes along the way, and going back and correcting them.
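To make "deterministic simulator" a bit more concrete, here is a minimal Python sketch of the idea. It is not Lincoln's actual code: the function names, the typo model (type a wrong letter, then backspace over it) and the `typo_rate` parameter are all assumptions made for illustration. The important part is that every random decision comes from a seeded generator, so the same seed always replays the same session.

```python
import random


def simulate_typing(target: str, seed: int, typo_rate: float = 0.2) -> list[tuple[str, str]]:
    """Return the (action, char) steps a simulated user takes to type `target`.

    Hypothetical model: before each correct character, the user may type a
    wrong letter and then immediately backspace over it.
    """
    rng = random.Random(seed)  # private RNG, so a run depends only on the seed
    steps = []
    for ch in target:
        if rng.random() < typo_rate:
            wrong = rng.choice("abcdefghijklmnopqrstuvwxyz")
            steps.append(("type", wrong))
            steps.append(("backspace", ""))
        steps.append(("type", ch))
    return steps


def replay(steps: list[tuple[str, str]]) -> str:
    """Apply the steps to an empty buffer and return the resulting text."""
    buf = []
    for action, ch in steps:
        if action == "type":
            buf.append(ch)
        else:  # "backspace" removes the most recently typed character
            buf.pop()
    return "".join(buf)
```

Because the steps depend only on the seed, a failing run can be reproduced exactly by passing the same seed again.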

In my first post about Lincoln, I noted its superpower of finding bugs that you can't even think of:

"Until now, my process for finding bugs has been to run eno and try typing stuff into it in various ways and see if it crashes. The issue with this is I have to invent ways in which I think eno might be broken. Any bug I am not creative enough to think of, I will miss. With Lincoln, I just run the program and before I know it, one of the random things it did has uncovered a bug."

If Lincoln can find bugs that I can't think of, surely it can find bugs that I already know about. Conversely, if Lincoln is not good enough to find a bug I know exists, how could I expect it to find a bug I don't know exists?

In this case, Lincoln had no idea this new feature existed, so of course it could not find bugs in it. I switched over to Lincoln and started writing code for it to use the new feature. This meant teaching it to use some new operations in eno. I knew that a bug would occur if the operations happened in a different sequence than they did in my unit test. So, I wrote code to randomize the sequence in which it performed these operations - while making sure that any operation which depends on an earlier one still runs after it. With these changes, I ran Lincoln a few times and smiled as it recreated the bug that I had anticipated.
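One way to randomize the order of operations while respecting dependencies is a randomized topological sort. The Python sketch below shows that technique under my own assumptions. It is not the code in Lincoln, and the function name and the shape of the `deps` mapping are invented for the example.

```python
import random


def random_valid_order(deps: dict[str, set[str]], seed: int) -> list[str]:
    """Return a random ordering of the operations in `deps` in which every
    operation appears after all of the operations it depends on.

    `deps` maps each operation name to the set of operations that must
    happen before it (hypothetical structure, chosen for this sketch).
    """
    rng = random.Random(seed)  # seeded, so a failing order can be replayed
    remaining = {op: set(pre) for op, pre in deps.items()}
    order = []
    while remaining:
        # Operations whose prerequisites have all been performed already.
        ready = sorted(op for op, pre in remaining.items() if not pre)
        if not ready:
            raise ValueError("dependency cycle or unknown prerequisite")
        op = rng.choice(ready)
        order.append(op)
        del remaining[op]
        for pre in remaining.values():
            pre.discard(op)
    return order


# Made-up operation names, purely for illustration.
example_deps = {
    "open": set(),
    "type": {"open"},
    "select": {"type"},
    "delete": {"select"},
    "save": {"open"},
}
```

Calling `random_valid_order(example_deps, seed)` with different seeds moves `save` around relative to the editing steps, but `open` always comes first and `delete` never precedes `select`.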

It's still early days but it seems that Lincoln has a kind of reciprocal relationship with eno. Lincoln needs to know the features of eno and have expectations about how they should work. It then finds bugs which need to be fixed in eno. Programming Lincoln to have "expectations" can be a bit troublesome. Writing the code describing how something should work is not dissimilar from writing code to actually do it. In moments of doubt, I actually start to worry I am just writing the same program twice under two different names. There's definitely lots more for me to explore in this new approach to writing software.
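The "same program twice" worry is easy to see in a toy version. In the Python sketch below, which is entirely hypothetical and unrelated to eno's real data structures, the `Editor` class stands in for the program under test and a plain string plays the role of the simulator's expectations. Both have to know what insert and delete mean, but the model can afford to be naive, because it only needs to be obviously correct.

```python
import random


class Editor:
    """Stand-in for the system under test: text kept as two stacks around a cursor."""

    def __init__(self):
        self.left = []   # characters before the cursor
        self.right = []  # characters after the cursor, nearest the cursor last

    def move_to(self, pos):
        while len(self.left) > pos:
            self.right.append(self.left.pop())
        while len(self.left) < pos and self.right:
            self.left.append(self.right.pop())

    def insert(self, pos, s):
        self.move_to(pos)
        self.left.extend(s)

    def delete(self, pos, n):
        self.move_to(pos)
        for _ in range(min(n, len(self.right))):
            self.right.pop()

    def text(self):
        return "".join(self.left) + "".join(reversed(self.right))


def check(seed, steps=200):
    """Run random edits against both the editor and the model, comparing after each."""
    rng = random.Random(seed)
    editor, model = Editor(), ""  # the model is the "expectation": a plain string
    for _ in range(steps):
        pos = rng.randint(0, len(model))
        if rng.random() < 0.6:
            s = rng.choice(["a", "bc", " ", "xyz"])
            editor.insert(pos, s)
            model = model[:pos] + s + model[pos:]
        else:
            n = rng.randint(0, 3)
            editor.delete(pos, n)
            model = model[:pos] + model[pos + n:]
        assert editor.text() == model, f"seed {seed}: {editor.text()!r} != {model!r}"
    return model
```

The model here is three lines of slicing while the editor is a few dozen lines of cursor bookkeeping. That gap in complexity is what makes the duplication worthwhile in this sketch.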

Note: I am by no means a TDD zealot. In fact, I have written very few tests in my development of eno so far. I think that in an early-stage project, tests can reduce velocity and get in the way of thinking. I start using more TDD as my work becomes less exploratory.

Experimenting with Blog Structure

My original concept for this blog was that each post would have three sections of roughly equal length. The first would talk about startups generally, the second about word processors and the third about something technical. I liked that I could offer a bit of something for different types of readers. It also was a good fit for my typical week - where I would naturally end up spending time thinking about each of these areas.

One issue is that this structure doesn't lend itself well to sharing content. Based on the titles of the posts, it's really not clear what to expect before you click on a link to them. To effectively share one of the posts with someone, you'd have to give extra instructions on which heading to direct attention to.

To address this, I'm going to pick one section as the headline section each week. Based on the headline, the post will appear to just be about that. That section may be about general startup things, word processors or technical details (as it is today). I may add other sections after that if I have thoughts to share on other areas - but I won't force it if nothing occurs to me naturally. In this case I have another technical note to share this week.

Documenting Code with Line Art

There is a lot of code in eno where I manipulate strings. In more complicated cases, this results in a lot of variables that index into various parts of the string for various reasons. I was finding it very hard to make this code scrutable when I stumbled upon a new technique for commenting variables: using line art.

Here's an example from Lincoln (you need more context to truly understand this code but hopefully it's illustrative):

//                 ┌───┬───┬─ +1 to ignore these pipes
//                 │   │   │
// The quick brown |fox|dog| jumped over the dog.
// └─┬────────────┘│└┬┘ └┬┘
//   │             │ │   └ insert_str (21..24)
//   │             │ └ remove_str (17..20)
//   │             └ next_edit_start (16)
//   └ src_idx += 16 (remaining[0..16])

para.state.append(.{
    .start = src_idx,
    .end = src_idx + next_edit_start.?,
    .state = .ref,
    .text = remaining[0..next_edit_start.?],
}) catch u.oom();
src_idx += next_edit_start.?;

// + 1 to skip the first `|`
remaining = remaining[next_edit_start.? + 1 ..];
const remove_str = std.mem.sliceTo(remaining, '|');
// + 1 to skip the second `|`
remaining = remaining[remove_str.len + 1 ..];
const insert_str = std.mem.sliceTo(remaining, '|');
// + 1 to skip the third `|`
remaining = remaining[insert_str.len + 1 ..];

I found that making the line art diagrams really helps me understand the position of each index into the string. In some cases I have made these diagrams before writing any code. When I did that, I found it far easier to do the actual implementation and had far fewer off-by-one errors.
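The numbers in a diagram like this are easy to check mechanically. Here is a rough Python port of the slicing in the snippet above, run against the same example sentence. The function name `split_edit` is made up, and the port differs from the Zig in one respect: `str.index` raises a `ValueError` when a `|` is missing, whereas `std.mem.sliceTo` returns the whole slice.

```python
def split_edit(remaining: str):
    """Split 'prefix|remove|insert|suffix' into its parts.

    Returns (next_edit_start, prefix, remove_str, insert_str, rest).
    """
    next_edit_start = remaining.index("|")
    prefix = remaining[:next_edit_start]

    # + 1 to skip the first `|`
    remaining = remaining[next_edit_start + 1:]
    remove_str = remaining[:remaining.index("|")]
    # + 1 to skip the second `|`
    remaining = remaining[len(remove_str) + 1:]
    insert_str = remaining[:remaining.index("|")]
    # + 1 to skip the third `|`
    remaining = remaining[len(insert_str) + 1:]
    return next_edit_start, prefix, remove_str, insert_str, remaining


line = "The quick brown |fox|dog| jumped over the dog."
start, prefix, remove_str, insert_str, rest = split_edit(line)
# start == 16, line[17:20] == "fox", line[21:24] == "dog"
```

Asserting on those three values gives a cheap guard against the diagram and the code drifting apart.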


#software