Las Vegas….gambling on agile?

OK, I want a catchy title… But I also want to tell you about the upcoming Better Software Conference and Agile Development Practices in Las Vegas June 6-11, where I’ll be presenting a one-day tutorial on Writing Effective Agile Use Cases. No, I am not co-opting Alistair Cockburn’s bestselling Writing Effective Use Cases… this title was suggested by the conference organizers. They think it is catchier than what I proposed: Writing Agile Use Cases.

I hope to share with you effective techniques I’ve picked up for writing use cases in an agile development environment. While I am not a believer in writing for writing’s sake, I don’t find writing documentation on agile projects to be intrinsically evil. I’ve worked with several agile teams to trim down their project documentation, write effectively, and focus on what matters to them. Not every user story should be part of a use case description. And not every user story needs to be documented. But I believe in the power of the written word when it is effective, streamlined, and to the point. And I find that use cases that focus on usability and defining user tasks are well received by agile teams, especially if they are done in context with usability experimentation, Wizard of Oz prototyping, and lightweight user interface specification. So come join me in Las Vegas. Or ask me about agile use case writing workshops… where we hone our writing skills and write project-specific use cases.

Junk faxes, print cartridges and canceling-oh my!

Because two colored ink cartridges are empty (cyan and yellow), my black and white faxes have been piling up in my machine’s buffer. My fax-printer-scanner insisted on having non-empty color cartridges installed. But when there are no color print jobs that doesn’t make technical sense. I suspect other considerations drove this design decision. Print cartridges are where the money is. From a business sense it makes dollars and cents to insist that all cartridges are installed and in good working order before allowing the user to print anything.

But even more annoying was the difficulty I had stopping my printer from printing a month’s worth of buffered faxes! Glancing quickly at the buttons on my Brother MFC-885CW I noticed one labeled Clear/Back and another Stop/Exit. I pushed the Clear/Back button a couple of times to no avail, then decided I’d just let the faxes print. (Mild curiosity led me to receive 3 fax ads for discount health care, 3 for vacation deals in Cancun, an invitation to the Presidential Who’s Who among business and professional achievers, and a Neo-Tech stock market news report.)

There will always be competing values and design goals. Business and users’ goals don’t always match. An ethical usability designer should point out these conflicts and not let them slide. I know that might be pushing it, but someone should have strongly questioned whether it is better to demand all cartridges be installed or not.

But usability concerns don’t stop at defining how to accomplish some task or what constraints exist on initiating one. How to start, stop, pause, quit, and retry should be considered, too. Clear/Back and Stop/Exit? How confusing! I wanted to clear the print buffer and stop all printing. But perhaps I should have stopped printing first. But then what? What I wanted was to stop all printing and clear my printer’s internal buffer all in one easy to do action (I don’t read manuals when faced with an exception in real time). Maybe pressing Stop/Exit would’ve accomplished that. I’m not sure. If I remember, next time I’ll try that.

But what if I wanted to stop one fax job, but continue printing the others? Hm, maybe that Fax Preview button on the other side of the print console could come in handy. Grr. There isn’t just one path through an exceptional case that a user might want to pursue. They all need to be carefully considered. And too many buttons with small and potentially confusing labels don’t help me accomplish an emergency action in a hurry. I think Brother could do better by displaying alternate flow options on the console during printing (did I mention there is a display console on my printer?), but I’d have to get “trained” to look there. Since I don’t stand by my printer and watch it work enough to notice what kinds of informative messages it displays, it might’ve been telling me what my options are, and I just didn’t notice. Now that’s a tough problem to tackle. I’m not sure how to avoid frustrating inattentive users who don’t know to look for advice on how to logically push that missing big red cancel button. (I’ll take a closer look the next time my printer prints a fax to see whether it tells me anything.)

ATM Activation Update

I wanted to report on what happened when I actually called up to activate my really new ATM card (in case anyone cares). I phoned the number as the sticker on the card directed me to do, and after I punched in my new card number I received a message stating, “This card is already activated” and then I got disconnected.

Hm. The call I made to activate my card when I mysteriously received a “replacement” card (not a new card with a new number) obviously went through–and the operator activated my new card (even though the number I gave her was for my old card). Hm. So my new card was already activated before I received it. Someone could’ve stolen it out of my mailbox and used it without having to answer all those pesky questions. I think I’ve uncovered a security issue with how the person on the phone handled my request.

For what it’s worth, it’s those special cases that’ll get you every time.

Really, we’re just trying to help

Last Thursday evening I called my bank to report my bank card had been lost. I answered a bunch of questions and the person said they’d mail me my new card within five to seven business days. Boy, was I surprised when a new card showed up in the next day’s mail. The following day a new PIN code came in another letter. I called up to activate my new card and strangely, the person on the line asked me a whole lot of questions including one that I couldn’t answer: what date did I open my account? I’ve had my bank account so long I didn’t remember. After being placed on hold and asked a few more questions that I could answer, the person said my card had been activated. Boy, this was excellent service!

Except it wasn’t… Sunday the ATM refused my transaction with a cryptic “your card couldn’t help us” message. Today I again tried my card (maybe I’m dense), with no luck. I went inside and told the teller my new card wouldn’t work. She looked me up in the computer after checking my ATM card and my ID and said that this wasn’t my new card, it was my old card. It had been reported lost or stolen so I couldn’t use it. All I could do was sit tight and wait a few days for my new card.

What happened? Why did my card show up early? I think I’ve figured this out. The last time I used my bank card I’m guessing that I left it at the machine. Thinking they’d be helpful, I suspect my bank then initiated a process to send me a “replacement” card with the same card number as my swallowed card, but requiring a new PIN.

If I had held off phoning in my lost card for one more day, that mysterious “replacement” card would have shown up and I would’ve been set (after I received my new PIN code in another mail). But once I reported my card as lost that nixed my old card for good. Bummer.

I have some gripes about my “replacement” card’s arrival. First, there were no clues about why it was being sent. Second, a separate letter came a day later advising me of a new PIN code, again without explanation.

I can see some analyst pondering what to do when a card has been left at a machine. Did they test the replacement procedure on real people (or eager phoning people like me)? How likely is it that someone might report a lost card before the replacement mysteriously showed up? Why should phoning in a lost card invalidate the replacement process? (I think I know the answer to that one: the original lost card and the replacement card, having the same card number, aren’t unique… so how can the bank tell which card I was reporting as lost?)

I would’ve been happier if the person who answered the phone when I called to activate my replacement had told me, “No your card isn’t activated.” But maybe she didn’t know that it wasn’t usable. Or maybe she was just being obscure to throw off the card thief. I can only wonder. After the conversation ended, I knew my card had been activated and thought I could use it. But I couldn’t.

I know my bank was trying to be helpful by sending me a magic replacement card. But after one confusing activation phone call, two unsuccessful ATM episodes, and one helpful conversation with the bank teller, I finally figured it out. I’m crossing my fingers until my new card shows up and I use it for the first time.

Workarounds vs workthroughs

Today I had dental implant surgery. The procedure typically takes an hour. I don’t want to go into great, gory detail, but an implant is a titanium tooth root substitute that is inserted into the jawbone after drilling a hole for the implant. The first part of the procedure involves drilling a hole or, more precisely, a narrow hole is drilled and then, through a succession of six drillings with successively larger drill bits, the hole is widened. Screwing in the implant then completes the procedure.

When the drill machine was powered on in a pre-surgery test, it would work for a couple of seconds then halt with an ERR 04 code (drill overheat fault) on the LED display. The nurse informed me that the machine had just started acting up, but they needed it to fail more frequently so they could give enough information to the repair technicians. Well today was their lucky (and my unlucky) day. After some experimentation and repeated faults, the staff figured out that if they carefully cycled the power and waited long enough, chances are the drill would restart and work for a while. Waiting long enough seemed to clear the fault most of the time. Keeping a foot on the foot pedal and smoothly operating the drill seemed to prevent it from faulting with an ERR 09 (foot pedal fault). They informed the surgeon and he and they experimented with the operation of the drill for several minutes before starting the procedure.

Even though I might have preferred to reschedule my implant, the team went ahead (without conferring with me). What was I thinking??? What would’ve happened if after the third drilling, the machine stopped functioning? Oh, I shouldn’t forget to mention that a technician was charged with power-cycling the machine whenever it failed, cuing the surgeon when to restart drilling.

OK, admit it. I’m sure you’ve operated some machinery which occasionally fails. We all are familiar with rebooting computers to clean things up. And I’ve been driving around my 11-year-old Volvo for several months now, trying to diagnose why it occasionally won’t start. (I’ve finally figured out that if I switch on the ignition while jiggling the shift lever, I can always get it to restart. Now that I know how to reliably correct the problem, my mechanic says he can easily isolate what’s broken and needs fixing.)

I started out my software career as an evaluation engineer. From experience, I know that until you find a way to reliably cause a fault, it is difficult to report a bug that anyone is willing to listen to. Intermittent, apparently random failures are the worst kind. Only when you can reliably produce a failure can you even attempt to isolate the problem. Long-term garbage collection bugs or slow memory leaks are really nasty. But golly! When end users encounter intermittent software failures they typically plunge ahead looking for workarounds. Rarely do users want to isolate a problem if they can find a workaround. They’re on task, and not particularly interested in troubleshooting software. When a physical device acts up, people typically act the same way. In hindsight, I probably should’ve halted the procedure before it started and scheduled my implant for another day. But they (and I) didn’t want to. I was goal oriented. I’ll be damned if I wanted to go in twice!! And they seemed confident that they could finish the procedure and seemed unconcerned about the intermittent drill malfunction. (I’m wondering what their backup plan was.) Maybe today I really was lucky because in spite of faults, there weren’t catastrophic failures.

But back to considering device faults. I’ve always wanted the ability to manually override a device’s fault response behavior when I suspect a faulty sensor. Or at least have a way of running self diagnostics or something instead of being forced to “jigger a solution”. Cycling power seems like such a hack. What if the faulty device doesn’t restart and I’m in the middle of an important task? What if I am willing to take the risk to keep operating the device because the consequences of it not restarting are worse than continuing on with a suspected faulty fault? Shouldn’t a person be allowed to be in the decision loop in this case? Devices shouldn’t just shut off with an ERR code. I’d much prefer a user interface where I’m allowed to initiate a workthrough (e.g. ignoring a suspected fault) instead of being forced to initiate a potentially problematic workaround (cycling power). The faults and fault lights on my car’s dashboard work this way (I can proceed to ignore them at my own peril). Perhaps if the drill had really been overheated, a workthrough should’ve been prevented. But then the determined surgeon would’ve just cycled power anyway. I’m probably not going to change how people design devices by raising these issues. But I’d be interested in reactions to the idea of designing to allow for workthroughs instead of forcing workarounds.
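To make the workthrough idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Device, Fault, and Severity names are hypothetical, and so is the choice to treat ERR 09 as overridable and ERR 04 as not. I have no idea how the real drill classifies its faults. The point is only that the operator gets to decide for some faults, the device refuses for others, and every override is recorded.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Severity(Enum):
    ADVISORY = "advisory"  # possibly a bad sensor; operator may choose to continue
    CRITICAL = "critical"  # continuing is never allowed


@dataclass(frozen=True)
class Fault:
    code: str
    severity: Severity


@dataclass
class Device:
    """Hypothetical device that stops on a fault but can be worked through."""
    running: bool = True
    active_fault: Optional[Fault] = None
    override_log: List[Tuple[str, str]] = field(default_factory=list)

    def raise_fault(self, fault: Fault) -> None:
        """Stop the device and remember why it stopped."""
        self.active_fault = fault
        self.running = False

    def work_through(self, operator: str) -> bool:
        """Resume despite the active fault, if that fault may be overridden.

        Returns True when the device resumed, False when the request is refused.
        """
        fault = self.active_fault
        if fault is None or fault.severity is Severity.CRITICAL:
            return False
        # Record who accepted the risk, then clear the fault and resume.
        self.override_log.append((operator, fault.code))
        self.active_fault = None
        self.running = True
        return True


# Assumed classification, for illustration only.
drill = Device()
drill.raise_fault(Fault("ERR 09", Severity.ADVISORY))
resumed = drill.work_through(operator="surgeon")
```

In this sketch a critical fault leaves the device stopped no matter who asks, which is the "workthrough should've been prevented" case. Whether cycling power should also be blocked in that case is a separate design question.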

Exceptional exceptions

I should have known something interesting would happen today when I read my horoscope*:

Chug along as planned. Circumstances might create a series of minor emergencies that interrupt your routine. Remain fluid about plans.

Today I had a bizarre ATM experience. The machine gobbled my deposit envelope but kept beeping and prompting me to deposit the deposit envelope in the slot. But I had already made my deposit! So after 30 seconds of incessant beeping I pressed the cancel button. The beeping didn’t stop. I calmly walked inside and spoke to a teller (we could hear the beeping inside the bank). She walked outside, looked at the screen, and followed instructions, inserting an empty deposit envelope. The ATM then ejected my card and a receipt that indicated my transaction had been canceled. But the machine still had my deposit envelope.

After a short consultation, a more senior teller took over. She opened the back of the ATM, opened the deposit box, and sorted through the envelopes. My deposit envelope wasn’t there. She closed the machine (meanwhile as she was doing this someone else successfully made a deposit that landed in the deposit box). She went inside, consulted someone in a back office, made a phone call and then spoke with me. She had reached someone who she said was “a little unreasonable”. They suggested I file a dispute and then they would schedule a technician to search for my missing deposit envelope lost inside the machine. Meanwhile she rechecked the deposit box, took my name and phone number and said she would resolve this today. And she did. A technician came out, popped open the front panel of the ATM, and found my deposit envelope clinging to the top of the dispenser in a way that allowed other envelopes to slide over my envelope and drop into the deposit box.

My thanks and gratitude go out to that persistent teller. Without her my deposit would still be in limbo.

What about the ATM? Could it have worked better? How could it have handled this really exceptional case? I am guessing that my envelope never was sensed (otherwise, how could the deposit slot still be waiting to accept an envelope?). It was in limbo. But beeping a long time while displaying the “insert your deposit envelope in the deposit slot” message didn’t help me out. Those instructions didn’t apply to me! The fact that a teller could insert a blank deposit envelope meant the deposit slot was working; the next depositor’s actions confirmed that. But somehow my deposit envelope hadn’t been recognized by the machine.

What about the behavior of cancel? Cancel didn’t mean immediately cancel. The machine kept incessantly beeping, demanding that an envelope be deposited. I was too stunned to know what it wanted. Even if I had figured out what it wanted, I don’t think putting in a blank envelope would’ve been a good thing to do. I can imagine the bank then claiming that I’d deposited an empty envelope and I probably would have had to file a dispute. In hindsight, my actions were the right ones to take. If cancel had immediately ejected my card, things might have been better. There was something disconcerting about cancel not having an immediate effect, even though it led to me going inside to find smart people who eventually tracked down the problem. It seems like a poor system design to have cancel wait until a hardware initiated action was complete. This could’ve been a result of poorly designed hardware. I just don’t know enough to say.

If cancel had been immediate, my card and a canceled deposit transaction would’ve been ejected. I still would’ve had to deal with the missing deposit envelope. But I suspect the story I told the teller wouldn’t have been so compelling.

The teller said that after a certain amount of beeping the ATM would have canceled my transaction anyway and swallowed my card (so this leads me to believe that there is some ability for the system to time out if expected hardware actions don’t occur). I still suspect a software bug. If the ATM had swallowed my deposit envelope this way during non business hours I suspect I would’ve stood at the machine until it stopped beeping, which would’ve led me down the same path but with undoubtedly more trouble. Hm.

As someone who has written many use cases, I would have specified that cancel should be immediate and that user transactions abort as soon as the cancel key is pressed (unless the transaction is beyond canceling, in which case some indication that the cancel failed should be signaled to the user). And then I would’ve written tests to push on all the weird cases I could think of.

How cancel is detected, when it should take effect, and what should happen often fall into that gray unspecified area. Most use case descriptions ignore or say very little about canceling actions. Specifying cancel exception behavior doesn’t fit neatly with tying exceptions to specific use case steps, since cancel can happen at many different places (and across many different use cases). Aren’t use cases best when they describe business-level steps, not lower level implementation details? Sure, but sometimes it is important to specify these pesky interaction details that make your system responsive or make it react predictably.

My advice: when you want to specify these things, start by writing statements that describe when cancel is enabled (and when it is not), when the software should detect it, what should happen, and what should not be allowed to happen. You may need to define invariants that must be preserved, describe detailed cancel actions, or develop state models. These descriptions, in addition to use cases, can specify interaction design. It’s unrealistic to squeeze every little detail into a use case description.
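Here is a rough sketch of what such a description might look like as a small state model in Python. It is not how any real ATM works. The state names, the DepositTransaction class, the messages, and the invariant are all my own assumptions for illustration. It records when cancel is enabled, what happens when it is pressed, what is signaled when the transaction is beyond canceling, and one invariant (the machine holds the card only while a transaction is in progress).

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    AWAITING_ENVELOPE = auto()
    COMMITTING = auto()   # envelope sensed; too late to cancel
    CANCELED = auto()


# Statement 1: the states in which cancel is enabled.
CANCEL_ENABLED = {State.AWAITING_ENVELOPE}
# Invariant: the card is held exactly while a transaction is in progress.
CARD_HELD_STATES = {State.AWAITING_ENVELOPE, State.COMMITTING}


class DepositTransaction:
    """Hypothetical deposit transaction with explicit cancel rules."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.card_held = False
        self.messages = []

    def _check_invariant(self) -> None:
        assert self.card_held == (self.state in CARD_HELD_STATES)

    def start(self) -> None:
        if self.state is not State.IDLE:
            raise RuntimeError("a transaction is already in progress")
        self.state = State.AWAITING_ENVELOPE
        self.card_held = True
        self._check_invariant()

    def envelope_sensed(self) -> None:
        if self.state is State.AWAITING_ENVELOPE:
            self.state = State.COMMITTING
        self._check_invariant()

    def press_cancel(self) -> bool:
        """Cancel takes effect immediately when enabled; otherwise say so."""
        if self.state in CANCEL_ENABLED:
            self.state = State.CANCELED
            self.card_held = False  # eject the card right away
            self.messages.append("Transaction canceled. Please take your card.")
            canceled = True
        else:
            self.messages.append("Cancel is not available right now.")
            canceled = False
        self._check_invariant()
        return canceled
```

With a model like this, the weird cases turn into tests: press cancel while waiting for an envelope and the card comes back at once; press it after the envelope has been sensed and the user is told the cancel failed.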

*Disclaimer: On the rare day that I read my horoscope I forget it by the time I’ve finished reading Dilbert and Doonesbury. But today’s horoscope was strange enough that I remembered it.

Exactly what do you mean?

I spent the past week at the Agile 2005 software conference. What an amazing conference and inspiring group of people! I spent some time with Lynn Miller from Alias, who presented a report about how her company successfully integrated User Centered Design (or UCD) with agile XP (Extreme Programming) practices. Lynn is the lead interaction designer on an innovative product called Alias SketchBook Pro. As an interaction designer, Lynn gathers customer information and defines and refines user features through prototyping and customer feedback. Lynn then feeds her designs to the development team, who develop production quality code in month-long “sprints”. At Alias, interaction designers work in tight collaboration with the development team, feeding them just-in-time interaction designs. According to Lynn, this is pretty unusual. Most companies do interaction design all in one big lump before doing any software development. At Alias, interaction design is done in monthly increments, just like the code. Each coding sprint is fed by features defined by the interaction designers who worked on them during the previous month. Skeptics in Lynn’s field don’t believe that usability design can be done this way without sacrificing quality. Alias’ success story challenges these assumptions.

As an interaction designer, Lynn isn’t up on the latest agile jargon. So for the first couple of days at the conference she was puzzled when she heard stories about how other agile teams had trouble identifying and working with the “customer” who defined “user stories” that the team implements. To Lynn and most people outside the agile community, customers are people who purchase products or services. How could you co-locate the “customer” with a development team? Don’t companies have many customers? What exactly was the problem some XPers were having? Only after Lynn realized that XP defines customer as “an informed expert who clarifies requirements for an XP team” did her confusion evaporate. An XP customer isn’t necessarily an end user or purchaser of a product. Lynn plays the role of “customer” for her XP development team when she designs features. I agree with Lynn: the definition of an XP customer is confusing.

Lesson learned: Agilists who communicate with others outside their own community need to be aware that jargon is confusing. When someone looks puzzled it may be because they don’t share your context. Bridge this gap by asking a newcomer what’s unclear. Then take the time to decode insider jargon for them. You’ll learn something about what they do and how they think in the process.

I discussed with Lynn and Jeff Patton (another talented user-centered designer) what we each mean by “design”. To me, a software designer, design means creating a model of interacting software objects that are implemented in code. Interaction designers use a raft of techniques ranging from contextual inquiry (to understand the users’ work environment), to user-centered design (to cluster tasks and identify user categories), to interaction and user interface design. All these activities to an interaction designer are “design”. It was pretty easy to understand our different views and see how they dovetailed into an overall system design process. I don’t think we should come up with unambiguous definitions for our various activities. Besides being unrealistic, we’d all have to start speaking design Esperanto, which wouldn’t be a good thing. But I learned something. When you don’t understand what I mean, it is my problem, not yours. As a good communicator I should try to bridge my ideas into your context. And when you don’t understand what I’m saying, please ask, “What do you mean by that?”

Whole Systems Thinking and Pesky Details

I wasted an hour today trying to get an email signature line just the way I wanted it. The mail program I use is Eudora. I use it in paid mode. I’m not picking on Eudora so much as I am picking on the state of software tools in general (I’d love to hear your favorite tool horror).

I wanted to create several different signatures for different occasions: one for marketing to local clients, one that is just my “standard tagline”, etc. As part of my signature I wanted a line that included links to my website and blog on a single line:

website: www.wirfs-brock.com blog: www.wirfs-brock.com/rebeccasblog.html

That worked just fine for my standard signature file, where these were the only links. But then I wanted to create another signature which included these links followed by lines with links to upcoming public classes. Each class would be listed on a single line containing a link to the registration page. When I inserted these lines into my file, I encountered problems. The signature looked just fine when created and displayed as I was composing email. But somewhere in the process of sending and receiving the email, that first website link got mangled. I had encountered yet another case of what Scott Meyers refers to as the keyhole problem. I still don’t know if this is a send or receive error, but trying to fix this problem drove me nuts.

Instead of a well-formed link, the link in incoming email was extended with spaces, breaking it. Needless to say, being a software geek, I vowed to tame this problem. I performed twenty or so different experiments over the next hour. I inserted tabs instead of spaces between my website and blog link. This worked, but the formatting was ragged and I don’t like inserting tabs into messages as some people’s mail systems don’t uniformly display text with embedded tabs. I put the website and blog links on separate lines. This worked, but it made my signature longer. I inserted one tab and spaces after the website link. This worked but resulted in a ragged signature line that looked unprofessional. I copied the line with the links that worked from my other signature file and pasted it into the second signature file (of course this didn’t work, what was I thinking?). I tried re-specifying the links (this didn’t work either). I moved the broken-linked lines to after the single link lines in my signature file. This largely worked, too, except the spacing between the website link and the blog link came back with an extra space between them (making it a ragged line).

I then got the bright idea of creating a signature file in a fixed font, instead of Arial. This worked. But I didn’t like how Courier looks. Too clunky. When I changed my signature file back to a font more appealing than Courier, Eudora apparently let me change my signature file, but it refused to pick up the new font information as specified in that file. Even restarting my mail program didn’t correct the problem (apparently it was caching the font style and not really looking at the font specified in my signature file). I was headed even deeper into the weeds… At this point I decided to give up as I had several approaches that would work OK, even if they didn’t let me format the signature file exactly as I wanted.

All the trouble I had making a signature file made me want to chuck Eudora and move to another mailing tool. But I haven’t, just yet. I’m a healthy skeptic. Each software tool I use has its own peculiar quirks and annoying irritations. (But send me some convincing arguments about why I should move to another mail program and I’ll seriously consider it.) Is this because developers are lazy or don’t care about quality? I suspect that most developers do not purposefully go about building quirky software. Yet somehow quirks creep in. There are myriad reasons. For one, most software is developed by teams. Each person has their own piece to implement and the system as a whole isn’t “owned” by anyone. This traditional view of software development is changing with agile teams. Collective ownership, one of XP’s core practices, emphasizes teamwork. The more eyes that look at code, the better. But still, you need to pay attention to the system as a whole, even while paying attention to details.

You cannot eliminate all these bugs, but you can certainly waste time writing dumb little unit tests that don’t add any value. Uncovering quirky system behaviors requires spending your testing time wisely. It isn’t enough to write one simple test and declare, “setting up a signature file seems to work.” My quirky bug spanned multiple contexts: composing a signature file, sending email, and then receiving correctly formatted mail. Exploratory testing is a practice worth considering. It involves spending some time poking around, looking for stuff that just might not work. But developers need to take more responsibility for overall system quality, too. Just checking that your code works isn’t good enough.
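As a small illustration of a test that spans contexts rather than checking one step, here is a sketch using Python’s standard email library. It has nothing to do with Eudora’s internals. The addresses, the helper names (send_and_receive, extract_links), and the serialize-then-parse stand-in for "sending" are assumptions I made up for the example. The check compares the links that went into the message with the links that came out the other end.

```python
import re
from email import message_from_bytes, policy
from email.message import EmailMessage

# The signature line from the post, with links separated by spaces.
SIGNATURE = (
    "website: www.wirfs-brock.com   "
    "blog: www.wirfs-brock.com/rebeccasblog.html"
)


def extract_links(text: str) -> list:
    """Return every run of non-space characters that starts with 'www.'."""
    return re.findall(r"www\.\S+", text)


def send_and_receive(body: str) -> str:
    """Compose a message, serialize it as it would go on the wire, parse it back."""
    outgoing = EmailMessage()
    outgoing["From"] = "sender@example.com"      # placeholder address
    outgoing["To"] = "receiver@example.com"      # placeholder address
    outgoing["Subject"] = "signature round trip"
    outgoing.set_content(body)
    wire_bytes = outgoing.as_bytes()
    incoming = message_from_bytes(wire_bytes, policy=policy.default)
    return incoming.get_content()


received = send_and_receive("Hello,\n\n" + SIGNATURE + "\n")
links_survived = extract_links(received) == extract_links(SIGNATURE)
```

A real mail client adds fonts, rich text, and servers in between, so a test like this is only a starting point for the poking around that exploratory testing calls for.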