When in Rome…

I attended my first XP conference in Rome in May. As they say, “when in Rome, do as the Romans do.” The actual quote attributed to St. Ambrose is, “si fueris Romae, Romano vivito more; si fueris alibi, vivito sicut ibi,” or “if you should be in Rome, live in the Roman manner; if you should be elsewhere, live as they do there.”

As Italians do, I enjoyed good food, good company, and great wine.

I gave a workshop on Understanding Design Complexity (using commonality-variability analysis) and a tutorial on Agile Architecture Values and Practices.

I also sampled research AND non-research sessions in equal measure. Unlike at other agile conferences I’ve attended, research is a prominent part of this conference. I listened to several research presentations and volunteered to be a commenteer for one research paper.

The XP 2014 paper acceptance rate was somewhat selective, with over 50% of the submissions rejected. Research topics were wide-ranging, including a case study on UX design, a survey of user story size and estimation accuracy, another on agile development practices, a case study on visualizing testing, another on agile and lean values, and one comparing scripted with exploratory testing. Short papers touched on agile organizational transformations, Randori Coding Dojos, and how expertise is located on agile projects. In addition, four experience reports were published (in contrast, 27 experience reports will be published and presented this year at the Agile Conference).

If the research papers I sampled are an indicator, PhD students seem to be busy doing empirical studies on agile practices, processes, and values. If you go to the Open University’s website, you’ll see these topics listed under their Empirical Studies PhD program: the emergence of agile software development, the role of physicality and co-location in agile software development, and XP and end user development.

Agile software development is being studied and data are being collected. The paper I commenteered, “Why We Need a Granularity Concept for User Stories” by Olga Liskin and her colleagues, reported on results gleaned from surveying developers (who self-identified as agile developers) working on both private and open source projects on GitHub. I had three short minutes after the presentation to carry on a dialog with Olga about their findings. Fortunately, we also had more time to discuss her work over lunch.

This paper raised as many questions in my mind as it answered. On small projects (10 people or fewer), 55% of the respondents said they did not estimate their stories on a daily basis. Is this an affirmation of the No Estimates movement, or just how people work on certain kinds of projects? Open source projects are quite different from product development. (Not all of the GitHub projects were open source ones, but still…). Depending on your project, you may simply work off a backlog and not necessarily do any estimates to forecast how much you can accomplish. Several developers I know only break down their work into identifiable tasks. Effort estimates (as long as they know how to do the work) aren’t that important. They’re done when they are done. And if you build shippable, workable software each sprint, well, you always have something potentially useful to deliver.

Here’s just a sampling of what I’d like to know: For those who estimated, how did story size correlate to estimation accuracy? And what happens when stories are split? Are estimates for split stories more accurate? And just how important is estimation accuracy to those who make estimates? Was it just something they did to get a rough idea, was it “required” of them, or did they effectively use estimates to plan and make forecasts? Did people who estimate improve their accuracy over time? It seems that if you are learning how to use a new technology, once you’ve spun up, your estimates should become more accurate. But how long does it take to get up to speed on your estimates?

Not surprisingly, the authors found that the smaller the story size and the better known the technology, the more accurate the estimates were. Research results often confirm the obvious (but it is still nice to have some empirical evidence to back up our intuitions).
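If you want to put a number on “estimation accuracy,” one common yardstick in the estimation literature is the mean magnitude of relative error (MMRE). Here’s a minimal sketch in Python, using made-up story data (not numbers from the paper), that compares small and large stories this way:

```python
# Hypothetical (estimate_hours, actual_hours) pairs -- illustrative numbers
# only, NOT data from the Liskin et al. study.
small_stories = [(2, 2.5), (3, 3), (1, 1.5), (4, 5)]
large_stories = [(20, 35), (16, 30), (24, 20), (40, 70)]

def mean_relative_error(stories):
    """Mean magnitude of relative error: average of |actual - estimate| / actual."""
    errors = [abs(actual - estimate) / actual for estimate, actual in stories]
    return sum(errors) / len(errors)

print(f"small stories MMRE: {mean_relative_error(small_stories):.2f}")
print(f"large stories MMRE: {mean_relative_error(large_stories):.2f}")
```

A lower MMRE means estimates track actuals more closely; with numbers like these, the small stories come out more accurate, which is the pattern the authors reported.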

In their conclusions, the authors don’t recommend a “best story size”. I’m happy they didn’t; they didn’t have enough evidence. Conventional wisdom says stories should fit into sprints (comfortably). That makes me want to know more about the accuracy of estimates for larger stories. It seems reasonable that the more work involved, the more possibility you’ll miss something that influences the accuracy of your estimate. And you may also fudge your estimates just to make sure a story fits inside a sprint (because you don’t want to split it). But estimates are just estimates; they shouldn’t be expected to be 100% accurate. How do people behave differently when estimation accuracy is rewarded (or worse yet, punished)? You learn to recalibrate your efforts when a task is harder or easier than expected.

The authors cautioned that developers’ views about story size and estimates need to be balanced by others’ concerns. Too many stories can burden a product owner. Developers in dysfunctional organizations might pad estimates just so they have some slack (does anyone knowingly pad estimates? I’d like to hear from you).

So is it better to bundle up small, related stories and estimate them as a single unit? Maybe. Back in the days when I had to estimate work, I didn’t like tasks being too small. If they were, my manager would look at them more closely (I’m not sure why). I remember once telling a manager: you can ask what we will be doing every day and how long each task will take, or we can guarantee that we will deliver all the features above the cut line in the next two weeks. But you can’t have both daily accuracy and predictability. She backed off on knowing exactly what we were doing every day (as long as we weren’t stressed out). This was long before the Agile software development movement, but it still seems relevant. Our small team worked off a prioritized list of features. It didn’t matter who did what task; whenever a task was finished, the next one was picked up. And we finished our work on schedule…because we wanted to make our commitment.

Here’s one parting thought about empirical studies. I’m very wary of biases that work their way into them. Those who answered the agile estimation survey might be very different from those who did not. Self-reporting of anything we do (whether it be software estimation, the amount of food we eat, or how much we exercise) is notoriously inaccurate. We underestimate our weight, overestimate our capabilities, and don’t remember accurately. More accurate evidence is obtained through field studies where people are observed working. I wish more empirical software researchers could have opportunities to work directly with agile teams and spend significant time getting to know them and how they work (in addition to just asking them questions).

Can you really estimate complexity with use cases?

I visited with some folks last week who failed to get as much leverage from writing use cases as they’d hoped. In the spirit of being more agile, when they adopted use cases they also streamlined their other traditional development practices. So they didn’t gather and analyze other requirements as thoroughly as they had in the past. Their use cases were high level (sometimes these are called essential use cases) and lacked technical details or detailed descriptions of process variations or the complex information that needed to be managed by the system. But their problem domain is complex and varied, prickly, and downright difficult to implement in a straightforward way (and use cases written at this level of detail failed to reveal this complexity). Because of this lack of detail, they found it difficult to use these use cases to estimate the work involved to implement them. In short, these use cases didn’t live up to their expectations.

Were these folks hoodwinked by use case zealots with an agile bent? In Writing Effective Use Cases, Alistair Cockburn illustrates a “hub-and-spoke” model of requirements. A figure in his book puts use cases in the center of a “requirements wheel” with other requirements being spokes. Cockburn states that “people seem to consider use cases to be the central element of the requirements or even the central element of the project’s development process.”

Putting use cases in the center of all requirements can lull folks into believing that if they have limited time (or if they are trying to “go agile”), they’ll get a bigger payoff by focusing only on the center. And indeed, if you adopt this view of “use cases as center”, it’s easy to discount other requirements perspectives as being less important. If you only have so much time, why not focus on the center and hope the rest will somehow fall into place? If you’re adopting agile practices, why not rely upon open communications between customers (or product owners or analysts) and the development team to fill in the details? Isn’t this enough? Maybe, maybe not. Don’t expect to get early, accurate estimates by looking only at essential use cases. You’d be just as well off reading tea leaves.

Cockburn proposes that “use cases create value when they are named as user goals and collected into a list that announces what the system will do, revealing the scope of a system and its purpose.” He goes on to state that “an initial list of goals will be examined by user representatives, executives, expert developers, and project managers, who will estimate the cost and complexity of the system starting from it.” But if the real complexities aren’t revealed by essential use cases, naive estimates based on them are bound to be inaccurate. The fault isn’t with use cases. It’s in the hidden complexity (or perhaps naive optimism or dismissal of suspected complexity). A lot of special case handling and a deep, complex information model make high-level use case descriptions a deceptive tool for estimation. That is, unless everyone on the project team is brutally honest that they are just a touchpoint for further discussion and investigation.

If the devil is in the details, the only way to make reasonable estimates is to figure out some of those details and then extrapolate estimates based on what is found. So domain experts who know those details had better be involved in estimating complexity. And if technical details are going to introduce complexity, estimates that don’t take those into account will also be flawed. Realistically, better estimates can be had if you implement a few core use cases (those that are mutually agreed upon as being representative and that prove out the complexities of the system) and extrapolate from there. But if details aren’t explained, or if you don’t do some prototyping to make better estimates, you won’t discover the real complexities until you are further along in development.

I’m sure there are other reasons for their disappointment with use cases, but one big reason was a misguided belief that high-level use cases provide answers instead of just being a good vehicle for exploring and integrating other requirements. In my view, use cases can certainly link to other requirements, but they just represent a usage view of a system: one important requirement for many systems, but not the only one. If they are a center, they are just one of many “centers” and sources of requirements.