Testing, Testing...our Heuristics

19 April 2023
Rebecca Wirfs-Brock

We gather heuristics through storytelling and conversations

Recently Chelsea Troy and I chatted over Zoom about software testing heuristics. I met Chelsea last year at DDD Europe. In this and a couple of snack-sized posts, I will reflect on some highlights of our conversation. Chelsea has also written about our conversation.

A Leading Question Leads to Some Heuristics

I started by asking, “What is important about testing that people should get but don’t?”

Chelsea answered that while Test-Driven Development (TDD) is useful, it doesn’t solve all testing needs. If developers are oversold on the benefits of TDD, they can become jaded on testing in general. They shouldn’t. TDD doesn’t include specific practices that address resilience or reliability, but it is useful for developing and testing deterministic code.

Chelsea shared the experience of learning first-hand that TDD didn’t have all the answers to testing. She worked on a team of TDD enthusiasts developing a mobile app for a client. Although the team thought they knew how to develop quality software, their initial prototype, developed following TDD, didn’t address these challenging requirements: being usable under extreme weather conditions, having a simple UX, and functioning when only intermittently connected to the internet and their backend software. They needed to add more design and testing techniques to their toolbox, along with their TDD testing. Chelsea also said that she learned a lot about testing for these kinds of requirements from their client’s QA team.

Some heuristics we've touched on:

Use TDD to develop and test functionality of deterministic software.
Use other strategies to design and test for software system qualities such as usability, performance, reliability, or resilience.
Match your testing strategies and tactics to your application’s development and execution context.

A Brief Introduction to Heuristics

I have been intrigued by software development heuristics ever since I read Billy Vaughn Koen’s Discussion of the Method: Conducting the Engineer’s Approach to Problem Solving. Koen defines a heuristic as, “anything that provides a plausible aid or direction in the solution of a problem but is in the final analysis unjustified, incapable of justification, and potentially fallible.” Heuristics are never guaranteed. When a heuristic fails, you back up and try another one.

I enjoy hunting for heuristics while designing and coding with others. Open-ended conversations where we swap stories and reflect on our heuristics are another great opportunity. Generally, I look for three kinds of heuristics:

Action heuristics. Things we do to solve our immediate problem. There are many action heuristics. Design patterns are one well-known form of action heuristic. We know these heuristics by name because authors took the time to write them up as named software patterns. But there are many testing and development techniques both smaller and larger than patterns. For example, in Test-Driven Development (TDD), the practice of “write a test, then write code to pass the test” is a heuristic for incrementally designing and implementing tested code.
Value heuristics. Values motivate our actions. Underlying TDD is the value: Testing should be an integral part of design and coding.

Our values determine what actions seem appropriate. Because I value understandable code, I take several actions to make my code more comprehensible: I give methods, functions, and variables meaningful names; keep code in methods short; and write code at the same level of detail in a method, factoring out lower-level details into helper methods.

Values depend on context. As the context shifts, so do our values. This doesn’t mean we are fickle; just pragmatic. Most of the time we aren’t conscious of making these shifts. When cutting and pasting code from Stack Overflow, I don’t value code understandability so much as I do the ability to quickly determine whether that code addresses my current problem. If it does, then I rewrite that code to make it clearer and to fit with the style in my existing codebase. In production code, I do value understandability.
Guiding heuristics. Heuristics that lead to related actions. For example, Chelsea shared one guiding heuristic: Don’t treat test code the same as production code; instead, make each test understandable in isolation. This leads her to write self-contained test methods. She doesn’t like a test where she has to read the code that it calls on before she can understand the test. She also isn’t a fan of applying the DRY (Don’t Repeat Yourself) heuristic to test code.
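
To make that guiding heuristic concrete, here is a minimal Python sketch of self-contained tests. The `shipping_cost` function and its rates are invented purely for illustration and did not come up in our conversation; the point is only that each test states its own inputs and expected result.

```python
import unittest


def shipping_cost(weight_kg: float, express: bool) -> float:
    """Hypothetical production function, invented only for this sketch."""
    base = 5.0 + 2.0 * weight_kg
    return base * 2 if express else base


class ShippingCostTest(unittest.TestCase):
    # Each test spells out its own inputs and expected result, so a reader
    # never has to open a helper or fixture to understand what is checked.

    def test_standard_shipping_charges_base_plus_per_kg_rate(self):
        weight_kg = 3.0
        cost = shipping_cost(weight_kg, express=False)
        self.assertEqual(cost, 11.0)  # 5.0 base + 2.0 * 3 kg

    def test_express_shipping_doubles_the_standard_cost(self):
        weight_kg = 3.0
        cost = shipping_cost(weight_kg, express=True)
        self.assertEqual(cost, 22.0)  # (5.0 base + 2.0 * 3 kg) * 2
```

The two tests repeat the `weight_kg = 3.0` line on purpose. That small duplication is the price of being able to read, modify, and rerun either test without looking anywhere else.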

Comparing Competing Heuristics

Chelsea mentioned that understandable tests can also serve as valuable design documentation and discovery tools. It’s easier to modify test code that is self-contained, rerun it, and explore how the software responds.

I asked Chelsea whether she would put aside her heuristic of keeping tests self-contained if there were compelling reasons. What if the setup for tests took a long time (for example, doing a cut of a database in order to build an in-memory cache of test data)? What if there was complex code that was repeated in similar tests but was slightly altered? Did someone make cut-and-paste-modify-and-reuse errors, or were there valid reasons for these differences?

Factoring common initialization code out of tests into common setup code provides a “standard” execution context for a suite of tests. It also makes it easier to vary that context and rerun the test suite. Factoring out code common to several tests and clearly labeling what it does eliminates having to second-guess reasons for slight variations in test code.

Depending on your situation and personal preferences, you may choose the heuristic, “Keep code in tests so you can understand and easily manipulate it,” or the other, “Factor out expensive or error-prone code into common code shared by tests.” These heuristics compete with each other. Neither is better. They are simply alternative ways to structure your test code.
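
As a rough sketch of the second heuristic, the Python example below factors an expensive setup step into shared setup code using `unittest`'s `setUpClass` hook, which runs once per test class. The `build_test_cache` function and its data are hypothetical stand-ins for something costly, such as building an in-memory cache from a database cut.

```python
import unittest


def build_test_cache() -> dict:
    """Hypothetical stand-in for an expensive setup step."""
    return {"alice": 120, "bob": 0}


class AccountBalanceTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so every test shares one
        # "standard" context instead of rebuilding it each time.
        cls.balances = build_test_cache()

    def test_existing_customer_has_recorded_balance(self):
        self.assertEqual(self.balances["alice"], 120)

    def test_new_customer_starts_at_zero(self):
        self.assertEqual(self.balances["bob"], 0)
```

The trade-off is visible here: the tests are shorter and the setup is paid for once, but a reader has to look at `build_test_cache` to learn why `"alice"` is expected to have a balance of 120.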

The Value of Knowing your Values

If people don’t know your values (and how they differ from their values), they may not understand why you prefer to work the way you do. For example, while I value testing, I don’t practice test-first development.

If you understand TDD to mean strictly writing tests before writing any code, your TDD heuristic is: begin by writing a small test, then write code that proves that the test fails, then rewrite your code to pass that test. Don’t add any more code than necessary to make the test pass. Do this repeatedly until you’ve fully implemented your code.

At the end of a TDD cycle, you have a bunch of tests and fully functioning code that passes those tests. Working this way, you typically implement a single class at a time. You test and implement lower-level functionality, then repeat the process to develop the code that uses that functionality. Your software tends to grow from the “bottom” up.
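
One way to picture a couple of turns of that cycle is the Python sketch below. The `line_total` function is a made-up example, and the comments describe the order in which the pieces would have been written; the file shows only the end state.

```python
# Step 1: write a small test first. Before line_total existed, it failed.
def test_single_item_total():
    assert line_total(quantity=1, unit_price_cents=400) == 400


# Step 2: the least code that passed step 1 simply returned unit_price_cents.
# Step 3: a second test fails against that version and forces the general case.
def test_multiple_items_total():
    assert line_total(quantity=3, unit_price_cents=400) == 1200


# Step 4: just enough code to pass both tests, and nothing more.
def line_total(quantity: int, unit_price_cents: int) -> int:
    return quantity * unit_price_cents
```

Repeating these steps for each new behavior leaves you with one small test per increment, which is how the low-level tests accumulate.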

I value testing, but typically design and implement several classes that work together at the same time. Once I prove to myself that my overall design hangs together (through some sort of simulation), I implement it. When finished, I check in code for several classes along with tests that demonstrate their behavior. My code is tested, but I don’t leave around lots of low-level tests.

For example, I may use a strategy pattern to calculate charges for different items on an invoice. I would initially implement each individual strategy class and check that it worked as I expected. But I’d remove most if not all tests for those individual strategies once I proved to myself that they worked. Their code is simple enough to read at a glance. Once I get low level classes working (especially if they don’t retain any state), I don’t need to keep tests around to ensure that they work. Once implemented, they rarely change. If I do need to revise them, at that point I might reconsider my testing heuristics (and add some tests that reflect these changes). The valuable tests I tend to preserve are those that determine which strategy to use, how to add new kinds of strategies, and different ways to apply discounts and special pricing.
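
Here is a minimal Python sketch of what that might look like. Every name in it (`LineItem`, `product_charge`, `service_charge`, `charge_for`, and the flat call-out fee) is invented for illustration and is not code from a real project. It keeps only tests for strategy selection and for adding a new strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class LineItem:
    kind: str              # hypothetical categories: "product" or "service"
    quantity: int
    unit_price_cents: int  # integer cents avoid floating-point rounding


# Stateless strategies, simple enough to read at a glance. Their own
# low-level tests were temporary scaffolding and are not kept here.
def product_charge(item: LineItem) -> int:
    return item.quantity * item.unit_price_cents


def service_charge(item: LineItem) -> int:
    call_out_fee_cents = 500  # invented flat fee, for illustration only
    return item.quantity * item.unit_price_cents + call_out_fee_cents


ChargeStrategy = Callable[[LineItem], int]
DEFAULT_STRATEGIES: Dict[str, ChargeStrategy] = {
    "product": product_charge,
    "service": service_charge,
}


def charge_for(
    item: LineItem,
    strategies: Dict[str, ChargeStrategy] = DEFAULT_STRATEGIES,
) -> int:
    """Pick the strategy registered for the item's kind and apply it."""
    if item.kind not in strategies:
        raise ValueError(f"no charge strategy for item kind {item.kind!r}")
    return strategies[item.kind](item)


# The tests worth preserving: strategy selection and extension.
def test_selects_strategy_by_item_kind():
    assert charge_for(LineItem("product", 2, 1000)) == 2000
    assert charge_for(LineItem("service", 2, 1000)) == 2500


def test_unknown_kind_is_rejected():
    try:
        charge_for(LineItem("mystery", 1, 1000))
    except ValueError:
        return
    raise AssertionError("expected ValueError for an unregistered kind")


def test_new_strategy_can_be_added():
    strategies = {**DEFAULT_STRATEGIES, "gift": lambda item: 0}
    assert charge_for(LineItem("gift", 3, 1000), strategies) == 0
```

The individual strategy functions are still exercised, but only indirectly through `charge_for`, which is the level at which the design is most likely to change.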

Let’s contrast my testing heuristics with those of test-first TDDers.

We both share this value heuristic: Value code that has tests over code (even if it works) that doesn’t have tests.

Test-first TDDers apply this heuristic: Write tests as you incrementally design code. Interleave testing and coding, repeatedly. Start with the simplest test and the simplest implementation. Only implement enough functionality so that your latest test passes. Build functionality and tests in small increments; each increment moving you closer to your final tested design.

They also have this guiding heuristic: You produce a cleaner design if you write tests first before writing any code.

I don’t share that heuristic.

My heuristic for developing designed, tested code is: Consider the design of one or more classes working together to achieve some functionality. Model your design using some lightweight technique, such as CRC cards (Class-Responsibility-Collaborators) or whiteboard sketches. Once you know what each class’s responsibilities are and how the classes interact, then implement them. Write simple tests and debug as you implement, but remove those tests if they are low level (and other code that has tests exercises their functionality). Keep only a lean set of illustrative tests that demonstrate how the classes work together and ensure that your design will continue to function properly.

At the end of my design/development cycle, I may write additional tests, revise existing ones, or remove insignificant tests. I use this grooming and cleanup step, before committing my code, as one way to double check my work.

Chelsea summarized my TDD heuristic as: Put tests in at the right level of abstraction once you know what your design is about.

Chelsea cautions, however, that if you don’t know what the right level of abstraction is and you follow test-first TDD heuristics by rote, you end up with tests at too low a level. Also, if you don’t have heuristics for pruning them, you end up with too many.

I view most testing I do while I implement my design as temporary scaffolding. Since I’ve already sketched out design ideas before coding, tests are not my primary tool for design. I test to verify my design. If I need to adjust my design as I implement it (and I expect to), that’s OK. I keep tweaking it and my code, and continue testing.

I suspect the biggest difference in our two approaches is that test-first TDDers don’t view their tests as temporary scaffolding, and I don't view the cycle of test-first TDD as the only (or best) way to understand what a design should be. We both value tested, well-designed code.

Bringing to light the different values that underlie competing heuristics can be illuminating. But how can we get others to appreciate and try out our heuristics? How can we approach new-to-us heuristics with an open mind? I’ll touch on these topics and more in my next post.
