Distinguishing between testing and checking

At Agile 2013, Matt Heusser presented a history of how agile testing ideas have evolved in “Twelve Years of Agile Testing: And What Do We Do Now?” The most intellectually challenging idea I came away with from Matt’s talk was the notion that testing and checking are different. I’m still trying to wrap my head around this distinction.

Disclosure: I’m not a testing insider. However, along with effective design and architecture practices, pragmatic testing is a passion of mine. I have presented talks at Agile conferences with my colleague Joe Yoder on pragmatic test-driven design and quality scenarios.

Like most people, I suspect, I have a hard time teasing out a meaningful distinction between checking and testing. When I looked up definitions for testing and checking, I found significant overlap. Consider these two definitions:

Testing: the means by which the presence, quality, or genuineness of anything is determined.

Testing: a particular process or method for trying or assessing.

And these for checking:

Checking: to investigate or verify as to correctness.

Checking: to make an inquiry into, search through, etc.

Using the first definition for testing, I can say, “By testing I determine what my software does.” For example, a test can determine the amount of interest calculated for a late payment or the number of transactions that are processed in an hour. Using the second meaning of testing, I can say, “I perform unit testing by following the test-first cycle of classic TDD,” or, “I write my test code to verify my class’s behavior after I’ve completed a first-cut implementation that compiles.” Both are particular testing processes or methods.
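To make that first example concrete, here is a minimal sketch of such a test in Python (the LateFeeCalculator class and its 1.5% interest rule are hypothetical, invented just for illustration):

```python
import unittest

class LateFeeCalculator:
    """Hypothetical domain class: charges 1.5% interest on an overdue balance."""
    RATE = 0.015

    def interest_for(self, overdue_balance):
        return round(overdue_balance * self.RATE, 2)

class LateFeeTest(unittest.TestCase):
    def test_interest_on_late_payment(self):
        # Determines (first definition of testing) what the software actually does.
        self.assertEqual(LateFeeCalculator().interest_for(1000.00), 15.00)

if __name__ == "__main__":
    unittest.main()
```

In the test-first cycle, this test would be written before LateFeeCalculator existed; in the test-after style, it would verify a first-cut implementation once it compiled.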

I can say, “I check that my software correctly behaves according to some standard or specification” (first meaning). I can also perform a check (using the second definition) by writing code that measures how many transactions can be performed within a time period.
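A throughput check along those lines might look like this sketch (process_payment is a stand-in for whatever transaction your system performs, and the 500-per-second threshold is an arbitrary example standard):

```python
import time

def transactions_per_second(process_one, duration_seconds=1.0):
    """Count how many times the given callable completes within the window."""
    deadline = time.monotonic() + duration_seconds
    count = 0
    while time.monotonic() < deadline:
        process_one()
        count += 1
    return count / duration_seconds

def process_payment():
    pass  # stand-in for a real transaction

# Measuring alone merely assesses; comparing against a standard makes it a check.
assert transactions_per_second(process_payment) >= 500
```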

I can check my software by performing manual procedures and observing results.

I can check my software by writing test code and creating an automated test suite.

I might want to assess how my software works without necessarily verifying its correctness. When tests (or evaluations) are compared against a standard of expected behavior, they are also checks. Testing, in some sense, is a larger concept or category that encompasses checking.

Confused by all this word play? I hope not.

Humans (as native speakers of any language) explore the dimensions and extent of categories by observing and learning from concrete examples. One thing that distinguishes a native speaker from a non-native speaker is that she knows the difference between similar categories and uses the appropriate concept in context. To non-native speakers, the edges and boundaries of categories seem arbitrary and unfathomable (meanings aren’t found by merely reading dictionary definitions).

I’ve been reading about categories and their nuances in Douglas Hofstadter and Emmanuel Sander’s Surfaces and Essences. (Just yesterday I read about the subtle difference between the phrases “letting the cat out of the bag” and “spilling the beans.”)

So what’s the big deal about making a distinction between testing and checking?

Matt pointed us to Michael Bolton’s blog entry, Testing vs. Checking. Along with James Bach, Michael has nudged the testing world to distinguish between automated “checks” that verify expected behaviors and “testing” activities that require human-guided investigation and intellect and aren’t automatable.

In James Bach’s blog post, Testing and Checking Refined, they make these distinctions:

“Testing is the process of evaluating a product by learning about it through experimentation, which includes to some degree: questioning, study, modeling, observation and inference.
(A test is an instance of testing.)

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.
(A check is an instance of checking.)”

My first reaction was to throw up my hands and shout “Enough!” My reaction was that of a non-native speaker trying to understand a foreign idiom! But then I calmed down, let go of my urge to precisely know James and Michael’s meanings, accepted some ambiguity, and looked for deeper insight.

When Michael explained,

“Checking is something that we do with the motivation of confirming existing beliefs” while, “Testing is something that we do with the motivation of finding new information.”

it suddenly became clearer. We might be doing what appears to be the same activity (writing code to probe our software), but if our intentions are different, we could be either checking or testing.

Why is this important?

The first time I write test code and execute it, I learn something new (I also might confirm my expectations). When I repeatedly run that test code as part of a test suite, I am checking that my software continues to work as expected. I’m not really learning anything new. Still, it can be valuable to keep performing those checks, especially when the code base is rapidly changing.

But I only need to execute checks repeatedly on code that has the potential to break. If my code is stable (and unchanging), perhaps I should question the value of (and false confidence gained by) repeatedly executing the same tired old automated tests. Maybe I should write new tests to probe even more corners of my software.

And if tests frequently break (even though the software is still working), perhaps I need to readjust my checks. I’m betting I’ll find test code that verifies details that should be hidden or that aren’t really essential to my software’s behavior. Writing good checks that don’t break so easily makes it easier to change my software design, which lets me evolve my software with greater ease.
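Here is a small illustration of that difference, using a hypothetical Order class: the first check is welded to a private representation detail and breaks whenever the storage format changes; the second verifies only the essential, observable behavior:

```python
class Order:
    """Minimal hypothetical Order class, just enough to make the point."""
    def __init__(self):
        self._line_items = []  # internal detail: (sku, quantity, unit_price) tuples

    def add_item(self, sku, quantity, unit_price):
        self._line_items.append((sku, quantity, unit_price))

    def total(self):
        return sum(quantity * price for _, quantity, price in self._line_items)

def brittle_check(order):
    # Welded to the private storage format; breaks if the representation
    # changes to, say, a dict or a LineItem class, even though behavior is fine.
    assert order._line_items == [("SKU-42", 2, 9.99)]

def sturdy_check(order):
    # Verifies only essential, observable behavior: the order's total.
    assert order.total() == 19.98

order = Order()
order.add_item("SKU-42", 2, 9.99)
brittle_check(order)
sturdy_check(order)
```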

When test code becomes stale, it is precisely because it isn’t buying any new information. It might even be holding me back.

I have a long way to go to become a fluent native testing speaker. And I wish that James and Michael could have chosen different phrases to describe these two categories of “testing” (perhaps exploration and verification?).

But they didn’t.
Fair enough.

Architecture at Agile 2013

What a busy, intense week Agile 2013 was! It was a great opportunity to connect with old friends and meet folks who share common interests and energy. I also had a lot of fun spreading the word/exchanging ideas about two things I’m passionate about: software architecture and quality.

At the conference I presented “Why we need architecture (and architects) on Large-Scale Agile Projects”. I’ve presented this talk a few times. This time I added “Large-Scale” to the title and submitted it to the enterprise agile track. I wanted to expose the audience to several ideas: that there are both small team/project architecture practices and larger project/program architecture practices that can work together and complement each other; what it means to be an architecture steward; some practices (like Landing Zones, Architecture Spikes, and Bounded Experiments/prototyping); and options for making architecture-related tasks visible.

I spoke with several enthusiastic architects after my talk and throughout the week. They shared how they were developing their architectures. They also asked whether I thought what they were doing made sense. In general, it did. But I want to be clear: one size doesn’t fit all. Sometimes, depending on the risks and the business you are in, you need to invest effort in experimenting/noodling/prototyping before committing to certain architectural decisions. Sometimes, it is absolutely a waste of time. It depends on what you perceive as risky/unknown and how you want to deal with it. The key to being successful is to do what works for you and your organization.

Nonetheless, in my talk when I spoke about some decisions that are too important to wait until the last moment, someone interrupted to say that I had gotten it wrong: “It isn’t the last possible moment, but the last responsible moment”. I know that. Yet I’ve seen and heard too many stories about irresponsible technical decision-making at the last possible moment instead of the last responsible moment. People confuse the two. And they use agile epithets to justify their bad behaviors. Surprise, surprise. The “last responsible moment” can be misinterpreted by some to mean, “I don’t want to decide yet (which may be irresponsible)”. People rarely make good decisions when they are panicked, overworked, stressed out, exhausted or time-crunched.

Check out my blog posts on the Last Responsible Moment and decision-making under stress if you want to join in on that conversation.

But I digress. Back to architecture.

I was happy to see two architecture talks on the Development and Software Craftsmanship track. I attended Simon Brown’s “Simple Sketches for Diagramming your Software Architecture” and also had the pleasure of hanging out with Simon to chat about architecture and sketching. Simon showed how to draw views of a system’s structure that are relevant to developers, not too fussy or formal, yet convey vital information. This isn’t hardcore technical stuff, but it is surprising how many rough sketches are confusing and not at all helpful. Simon regularly teaches developers to draw practical, informative architecture sketches. He collects sample sketches from students before and after they receive his sketching advice. Their improvement is remarkable. If you want to learn more, go to Simon’s website, CodingTheArchitecture.com.

I shared with Simon the sketching exercises in my Agile Architecture and Developing and Communicating Software Architecture workshops…and pointed him to two books I’ve drawn on for drawing inspiration: Nancy Duarte’s slide:ology and Matthew Frederick’s 101 Things I Learned in Architecture School. It’s all about becoming better communicators.

Scott Ambler talked about Continuous Architecture & Emergent Design. I was happy to see that he, too, advocated architecture spikes and envisioning (and proving the architecture with evidence/code). In his abstract he states: “Disciplined agile teams will perform architecture envisioning at the beginning of a project to ensure that they get going in the right direction. They will prove the architecture with working code early in construction so that they know their strategy is viable, evolving appropriately based on their learnings. They will explore new or complex technologies with small architecture spikes. They will explore the details of the architecture on a just-in-time (JIT) basis throughout construction, doing JIT modeling as they go and ideally taking a test-driven-development (TDD) approach at the design level.”

There were way too many concurrent sessions and too few hours in the day to get to all the talks I’d have liked to attend. I wish I’d been able to attend Rachel Laycock and Tom Sulston’s talk on the DevOps track, “Architecture and Collaboration: Cornerstones of Continuous Delivery”…but instead I enjoyed Claire Moss’ “Big Visible Testing” experience report. Choices. Decisions.

If you’d like to continue the conversation about architecture on agile projects, I’d love to hear from you.