An Architect’s Dilemma: Should I Rework or Exploit Legacy Architecture?

I recently spoke with an architect who has been tuning up a legacy system that is built out of a patchwork quilt of technologies. As a consequence of its age and lack of common design approaches, the system is difficult to maintain. Error and event logs are written (in fact, many are), but they are inconsistent and scattered. It is extremely hard to collect data from and troubleshoot the system when things go wrong.

The architect has instigated many architectural improvements to this system, but one that to me was absolutely brilliant was to not insist that the system be reworked to use a single common logging mechanism. Instead, logs were redirected to a NoSQL database that could then be intelligently queried to troubleshoot problems as they arose.

Rather than dive in and “fix” legacy code to be consistent, this was a “splice and intelligently interpret” solution that had minimal impact on working code. Yet this fairly simple fix made the lives of those troubleshooting the system much easier. No longer did they have to dig through various logs by hand. They could stare and compare a stream of correlated event data.
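To make the “splice and intelligently interpret” idea concrete, here is a minimal sketch in Python. It is not the architect’s actual solution: the two log formats, the field names, and the functions `parse_app`, `parse_batch`, and `correlate` are all invented for illustration, and an in-memory list stands in for the NoSQL database. The point is only that each legacy log keeps its own format, while a thin adapter per format maps lines into one common event record that can be sorted and queried.

```python
import re
from datetime import datetime, timezone

# Two hypothetical legacy log formats; a real system would have its own.
APP_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(?P<level>\w+)\] (?P<msg>.*)$"
)
BATCH_LINE = re.compile(r"^(?P<level>\w+)\|(?P<epoch>\d+)\|(?P<msg>.*)$")


def parse_app(line):
    """Parse a line like '2013-05-01 10:15:02 [ERROR] payment failed' (assumed UTC)."""
    m = APP_LINE.match(line)
    if m is None:
        return None  # unrecognized lines are skipped, not fixed at the source
    ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return {"source": "app", "time": ts, "level": m["level"].upper(), "message": m["msg"]}


def parse_batch(line):
    """Parse a line like 'warn|1367403300|retrying job 17' (epoch seconds)."""
    m = BATCH_LINE.match(line)
    if m is None:
        return None
    ts = datetime.fromtimestamp(int(m["epoch"]), tz=timezone.utc)
    return {"source": "batch", "time": ts, "level": m["level"].upper(), "message": m["msg"]}


def correlate(app_lines, batch_lines):
    """Merge both sources into one time-ordered stream of common event records."""
    events = [parse_app(l) for l in app_lines] + [parse_batch(l) for l in batch_lines]
    return sorted((e for e in events if e is not None), key=lambda e: e["time"])
```

Notice that the legacy code that writes the logs is never touched. All of the interpretation lives in the adapters, so adding one more inconsistent log source means adding one more small parser.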

Early in my career I was often frustrated by discrepancies in systems I worked on. I envisioned a better world where the design conventions were consistently followed. I took pride in cleaning up crufty code. And in the spirit of redesigning for that new, improved world, I’d fix any inconsistencies that were under my control.

At a large scale, my individual cleanup efforts would be entirely impractical. Complex software isn’t the byproduct of a single mind. Often, it simply isn’t practical to rework large systems to make things consistent. It is far easier to spot and fix system warts early in their life than later, after myriad cowpaths have been paved and initial good design ideas have become warped and obfuscated. Making significant changes in legacy systems requires skill, tenacity, and courage. But sometimes you can avoid making significant changes if you twist the way you think about the problem.

If your infrastructure causes problems, find ways to fix it. Better yet (and here’s the twist): find ways to avoid or exploit its limitations. Solving a problem by avoiding major rework is just as rewarding as cleaning up cruft, even if it leaves a poor design intact. Such fixes breathe life into systems that by all measures should have been scrapped long ago. Fashioning fixes that don’t force the core of a fragile architecture to be revised is a real engineering accomplishment. In an ideal world I’d like time to clean up crufty systems and make them better. But not if I can get significant improvement with far less effort. Engineering, after all, is the art of making intelligent tradeoffs.

Agile Architecture Myths #4: Because you are agile you can change your system fast!

Agile designers embrace change. But that doesn’t mean change is always easy. Some things are harder to change than others. So it is good to know how to explain this to impatient product stakeholders, program managers, or product owners when they ask you to handle a new requirement that to them appears to be easy but isn’t.

Joe Yoder and Brian Foote, of Big Ball of Mud fame, provide insights into ways systems can change without too much friction. They drew inspiration from Stewart Brand’s How Buildings Learn. Brand explains that buildings are made of components organized into shearing layers. He identifies six layers: the site, the structure, the skin, the services, the space plan, and physical stuff in the building.

Each shearing layer has its own value and speed of change, or pace. According to Brand, buildings are able to adapt because faster changing layers (e.g. the services layers and spaces) are purposefully designed so as not to be obstructed by slower changing layers. If you design your building well, it is fairly easy to change the plumbing. Much easier than revising the foundation. And it is even easier to rearrange the furniture. Sometimes designers go to extra efforts to make a component easier to change. For example, most conference centers are designed so that sliding panels form walls that allow inside space to be quickly modified.

Brand’s ideas shouldn’t be surprising to software developers who follow good design practices that enable us to adapt our software: keep systems modular, remove unnecessary dependencies between components, and hide implementation details behind stable interfaces.

Foote and Yoder’s advice for avoiding tangled, hard-to-change software is to “Factor your system so that artifacts that change at similar rates are together.” They also present a chart of typical layers in a software system and their rates of change:

Frequently, we are asked to support new functionality that requires us to make changes deep in our system. We are asked to tinker with the underlying (supposedly slower changing) layers that the rest of our software relies upon. And often, we do achieve this miraculous feat of engineering because interfaces between layers were stable and performed adequately. We got away with tinkering with the foundations without serious disruption. But sometimes we aren’t so lucky. A new requirement might demand significantly more capabilities of our underlying layers. These types of changes require significant architectural rework. And no matter how agile we are, major rework requires more effort.

Because we are agile, we recognize that change is inevitable. But embracing change doesn’t make it easier, just expected. I’d be interested in hearing your thoughts about Foote and Yoder’s shearing layers and ways you’ve found to ease the pain of making significant software changes.

Re-thinking Thinking and Planning

In the tutorial, Hooray We’re Agile Testers! What’s Next?, Janet Gregory apologized a couple of times for saying upfront thinking or planning. I know Janet wanted to let the audience know that she isn’t a fan of massive test plans or documents written way ahead. But her remarks got me wondering. Why in the agile community is it a taboo to recommend or admit to doing any upfront thinking or planning?

When you incrementally build production code and tests you do come to a deeper understanding about your software’s capabilities and what your stakeholders want. As a consequence, if you are thoughtful and reactive, it’s natural to adjust and adapt to feedback. But it’s also natural to do some upfront thinking [there, I went and said “upfront thinking”, not just thinking or speculation, and I was cringing ever so slightly as I wrote those words] before expending a lot of time and effort. Sometimes you need to think about and discuss what you should be doing so you don’t waste time doing the wrong things.

As someone who embraces agile values, I expect to readjust my ideas and plans as I learn more. I get it that too much upfront anything results in much wasted effort. But there’s a distinction I’d like to make between too much and enough thinking and preparation.

If you have an agile mindset, you recognize that plans have limits. You let go of any illusion that you’re in control of your destiny simply because you have a plan. You are open to change. But being responsive to change doesn’t obviate the benefits of planning. Especially if your project has to mesh together the work of several teams.

I’m tired of having to apologize for upfront thinking, effort expended in creating a project or product roadmap, defining initial product landing zones, or exploring options. Give thinking a chance. And find the right balance.

Who Defines (or Redefines) Landing Zone Criteria?

Who should be in on discussions that set landing zone criteria? Because most landing zones have architectural implications, someone knowledgeable about the system architecture, in addition to the product owner and other key stakeholders, should have a lot to say in vetting a landing zone.

Someone who has depth, breadth, and vision is an ideal candidate for crafting an initial cut. But even if you are brilliant, I suggest you fine-tune your landing zone with a small, informed group. If you have lots of stakeholders who want to chime in, give each stakeholder group a voice in identifying qualities and values they find particularly relevant. And ask a representative from each stakeholder group to join in on a landing zone discussion. At a landing zone review, expect healthy discussion. Experts are usually highly opinionated as well as passionate.

You might even want to facilitate your discussions.

I find it much more effective to have an informed facilitator guide landing zone discussions than a dispassionate, uninformed professional facilitator. An ideal landing zone meeting facilitator should know about the program or product but need not be the “authority” or definitive “expert”. It’s more important that they know the landscape and are good at gaining consensus and getting the best out of individuals who hold strong opinions. Possibilities: chief business architects, quality leads, the program or product manager, yes, even a software architect.

Sometimes a facilitator needs to step out of that role and offer informed opinions. I find this highly desirable, as long as this shift is made clear: “Hang on, do you mind if I take a stab at explaining what I think are more reasonable targets?”

Minimum, target and acceptable values should be agreed upon by the group and it might take some discussion to reach mutual understanding and consensus. For example, someone might initially propose a set of landing zone values based on historical trends and extrapolation. The software architect could push back with values based on prototyping experiments and new benchmark data. The group might end up adjusting targets because that evidence was compelling. Or, they might agree on tentative values that need to be firmed up by an expert. Hammering out numbers just to finish the landing zone isn’t the goal. Instead, you want to shape ideas for what you think will make your product a success based on the best evidence you have, backed up by experience and tempered by group wisdom. To effectively do this, people need to come to the discussion with mutual respect, trust and no hidden agendas.

And if you are agile, recognize that your landing zone can and should be recalibrated once you learn more about what’s possible.

Landing Zone Targets: Precision, Specificity, and Wiggle Room

A landing zone is a set of criteria used to monitor and characterize the “releasability” of a product. Landing zones allow you to take product features and system qualities and trade them off against each other to determine what an acceptable product has to be. Almost always these tradeoffs have architectural implications. If you’ve done something similar in the past, the criteria you should use to define your landing zone may be obvious. But for first-time landing zone builders, I recommend you task someone who knows about the product to take a first cut at establishing landing zone criteria that are then reviewed and vetted by a small, informed group.

A business architect, product owner, or lead engineer might prepare a “proposed landing zone” of reasonable values for landing zone criteria that are questioned, challenged, and then reviewed by a small group. On one program I was involved with, the chief business architect made this initial cut. He was a former techno geek who knew his technical limits. More important, he had deep business knowledge, product vision, and had a keen sense about where to be precise and where there should be a lot of flexibility in the landing zone values.

Some transaction criteria were very precise. Since they were in the business of processing a lot of transactions, they knew their past and knew where they needed to improve (based on projected increases in transaction volumes). For example, the transaction throughput target for a particular business process was based on extrapolations from the existing implementation (taking into account the new architecture and system deployment capabilities). This is a purposefully obfuscated example:

Example Landing Zone Attribute

Characteristic                            Minimum     Target      Outstanding
Payment processing transactions per day   3,250,000   4,000,000   5,500,000
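One way to picture how such an attribute gets used is a small Python sketch that checks a measured value against the three thresholds. This is only an illustration under stated assumptions: the class `LandingZoneCriterion` and its `assess` method are hypothetical names, the numbers are the obfuscated ones from the example above, and the code assumes that higher values are better (true for throughput, not for every criterion).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LandingZoneCriterion:
    """One landing zone attribute; assumes a higher measured value is better."""
    name: str
    minimum: float
    target: float
    outstanding: float

    def assess(self, measured):
        """Return the highest landing zone level the measured value reaches."""
        if measured >= self.outstanding:
            return "outstanding"
        if measured >= self.target:
            return "target"
        if measured >= self.minimum:
            return "minimum"
        return "below minimum"


# Values taken from the obfuscated example attribute.
payments = LandingZoneCriterion(
    "Payment processing transactions per day", 3_250_000, 4_000_000, 5_500_000
)

print(payments.assess(4_200_000))
```

A benchmark run that measured 4,200,000 transactions per day would land in the “target” band: inside the zone, with room left before “outstanding”.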

Some targets for explicit user tasks were very specific (one had a target of less than 4 hours with no errors, and an outstanding goal of 1 business day). On the other hand, many other landing zone criteria were only generally categorized as requiring either a patch, a new system release, or online update support. The definitions for what was a patch, a release or an online update were nailed down so that there was no ambiguity in what they meant.

For example, a patch was defined as a localized solution that took a month or less to implement and deploy. The goal was eventually to get closer to a week than a month, but they started out modestly. On the other hand, a release required coordination among several teams and an entire system redeployment. An online update was something a user could accomplish via an appropriate tool.

So, for example, the landing zone criteria for reconfiguring a workflow associated with a specific data update stream had minimal and target values of “release” and an outstanding value of “online update”.
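Categorical criteria like this can be checked the same way as numeric ones if the categories are ordered. The Python sketch below is an illustration, not something the program actually built: the `ChangeMechanism` enum, the `assess` function, and especially the ordering (release as the most costly way to make a change, then patch, then online update) are my assumptions, inferred from the definitions above.

```python
from enum import IntEnum


class ChangeMechanism(IntEnum):
    """Ways to deliver a change, ordered from most to least effort (assumed order)."""
    RELEASE = 1        # coordination among several teams, full redeployment
    PATCH = 2          # localized solution, a month or less to deploy
    ONLINE_UPDATE = 3  # a user accomplishes it via an appropriate tool


def assess(achieved, minimum, target, outstanding):
    """Return the highest landing zone level the achieved mechanism reaches."""
    if achieved >= outstanding:
        return "outstanding"
    if achieved >= target:
        return "target"
    if achieved >= minimum:
        return "minimum"
    return "below minimum"


# Landing zone for reconfiguring a workflow tied to a data update stream:
# minimum and target are "release", outstanding is "online update".
workflow_zone = (
    ChangeMechanism.RELEASE,
    ChangeMechanism.RELEASE,
    ChangeMechanism.ONLINE_UPDATE,
)

print(assess(ChangeMechanism.PATCH, *workflow_zone))
```

With that ordering, being able to reconfigure the workflow with a patch already meets the target, and only an online update counts as outstanding.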

When defining a landing zone for an agile product or program, carefully consider how precise you need to be and how many criteria are in your zone. Less precision allows for more wiggle room. Without enough constraints, however, it’s hard to know what is good enough. The more precise landing zone criteria are, the easier it is to tell whether you are on track to meet them. But if those landing zone criteria are too narrowly defined, there’s a danger of ignoring broader architecture and design concerns in order to focus only on specifically achieving targets.

We live in a world where there needs to be a balance. I’ll write more about who might be best suited to defining and redefining landing zones in another post.