
When in Rome...


I attended my first XP conference in Rome in May. As they say, “when in Rome, do as the Romans do.” The actual quote attributed to St. Ambrose is, “si fueris Romae, Romano vivito more; si fueris alibi, vivito sicut ibi,” or “if you should be in Rome, live in the Roman manner; if you should be elsewhere, live as they do there.”

As Italians do, I enjoyed good food, good company, and great wine.


I gave a workshop on Understanding Design Complexity (using commonality-variability analysis) and a tutorial on Agile Architecture Values and Practices.

I also sampled research AND non-research sessions in equal measure. Unlike other agile conferences I’ve attended, research is a prominent part of this conference. I listened to several research presentations and volunteered to be a commenteer for one research paper.

The XP 2014 paper acceptance rate was somewhat selective, with over 50% of the submissions rejected. Research topics were wide-ranging, including a case study on UX Design, a survey of user story size and estimation accuracy, another on agile development practices, a case study on visualizing testing, another on agile and lean values, and one comparing scripted with exploratory testing. Short papers touched on agile organizational transformations, Randori Coding Dojos, and how expertise is located on agile projects. In addition, four experience reports were published (in contrast, 27 experience reports will be published and presented this year at the Agile Conference).

If the research papers I sampled are an indicator, PhD students seem to be busy doing empirical studies on agile practices, processes, and values. If you go to the Open University’s website, you’ll see these topics listed under their Empirical Studies PhD program: the emergence of agile software development, the role of physicality and co-location in agile software development, and XP and end user development.

Agile software development is being studied and data are being collected. The paper I commenteered, “Why We Need a Granularity Concept for User Stories” by Olga Liskin and her colleagues, reported on results gleaned from surveying developers (who self-identified as agile developers) working on both private and open source projects on GitHub. I had three short minutes after the presentation to carry on a dialog with Olga about their findings. Fortunately, we also had more time to discuss her work over lunch.

This paper raised as many questions in my mind as it answered. On small projects (10 people or fewer), 55% of the respondents said they did not estimate their stories on a daily basis. Is this an affirmation of the No Estimates movement, or just how people work on certain kinds of projects? Open source projects are quite different from product development. (Not all of the GitHub projects were open source ones, but still...) Depending on your project, you may simply work off a backlog without making estimates to forecast how much you can accomplish. Several developers I know only break down their work into identifiable tasks. Effort estimates (as long as they know how to do the work) aren’t that important. They’re done when they’re done. And if you build shippable, workable software each sprint, well, you always have something potentially useful to deliver.

Here’s just a sampling of what I’d like to know: For those who estimated, how did story size correlate with estimation accuracy? And what happens when stories are split? Are estimates for split stories more accurate? And just how important is estimation accuracy to those who make estimates? Was it just something they did to get a rough idea, was it “required” of them, or did they effectively use estimates to plan and make forecasts? Did people who estimate improve their accuracy over time? It seems that if you are learning how to use a new technology, once you’ve spun up, your estimates should be more accurate. But how long does it take to get up to speed on your estimates?

Not surprisingly, the authors found that the smaller the story size and the better known the technology, the more accurate the estimates were. Research results often confirm the obvious (but it is still nice to have some empirical evidence to back up our intuitions).

In their conclusions, the authors didn’t recommend a “best story size.” I’m happy they didn’t; they didn’t have enough evidence. Conventional wisdom says stories should fit into sprints (comfortably). That makes me want to know more about the accuracy of larger stories. It seems reasonable that the more work involved, the more possibility you’ll miss something that affects the accuracy of your estimate. And you may also fudge your estimates just to make sure a story fits inside a sprint (because you don’t want to split it). But estimates are just estimates. They shouldn’t be expected to be 100% accurate. How do people behave differently when estimation accuracy is rewarded (or worse yet, punished)? You learn to recalibrate your efforts when a task is harder or easier than expected.

The authors cautioned that developers’ views about story size and estimates need to be balanced by others’ concerns. Too many stories can be a burden to a product owner. Developers in dysfunctional organizations might pad estimates just so they have some slack (does anyone knowingly pad estimates? I’d like to hear from you).

So is it better to bundle up small, related stories and estimate them as a single unit? Maybe. Back in the days when I had to estimate work, I didn’t like tasks being too small. If they were, my manager would look at them more closely (I’m not sure why). I remember once telling a manager, “You can ask what we will be doing every day and how long each task will take, or we will guarantee that we will deliver all the features above the cut line in the next two weeks. But you can’t have both daily accuracy and predictability.” She backed off on knowing exactly what we were doing every day (as long as we weren’t stressed out). This was long before the Agile software development movement, but it still seems relevant. Our small team worked off a prioritized list of features. It didn’t matter who did what task; whenever a task was finished, the next one was picked up. And we finished our work on schedule...because we wanted to make our commitment.

Here’s one parting thought about empirical studies. I’m very wary of biases that work their way into them. Those who answered the agile estimation survey might be very different from those who did not. Self-reporting of anything we do (whether it be software estimation, the amount of food we eat, or how much we exercise) is notoriously inaccurate. We underestimate our weight, overestimate our capabilities, and don’t remember accurately. More accurate evidence is obtained through field studies where people are observed working. I wish more empirical software researchers could have opportunities to work directly with agile teams and spend significant time getting to know them and how they work (in addition to just asking them questions).