What’s going on here?

23 June 2016
Rebecca Wirfs-Brock

Question: What is more exasperating than reading design documentation that doesn’t synch up with the code?
One answer: Writing such useless design documentation.
Another answer: Writing documents that don’t stand a chance of being aligned with the code.
My answer: Reading design documentation that is at the wrong level of abstraction or detail to help me do the task at hand.

Courtesy xkcd https://xkcd.com/license.html

A few years ago I worked with someone, who, when asked for a high level overview of some complex bits of code we were going to refactor, produced a detailed Visio diagram that went on for 40+ pages! In that document were statements directly lifted from his code. The intent was to represent the control logic and branching structure of some very complicated algorithms (he copied conditional statements into decision diamonds and copied code that performed steps of the algorithm directly into action blocks). To him, the exercise of creating this documentation had really clarified his design. I was amazed he went to such an effort. To me, the control structure was evident by simply reading the code. Moreover, what I was looking for, and obviously didn’t communicate very well, was that I would appreciate some discussion at a higher level, explaining what algorithm variations there were and how code and data were structured in order to handle myriad typical and special cases.

I wanted to be oriented to the code and data and parts that were configurable and parameterized.

Later, I learned from one of his colleagues that “Bob” always explained things in great detail and wasn’t very good at “boiling things down.” If you wanted to ask “Bob” a question about the code you’d better plan on at least spending half an hour with him.

Another time a colleague created a sequence diagram explaining how data was created that was used by a nightly cron job. What was missing was any description of what that cron job did. Maybe it was so obvious to him that he didn’t think it was important. But I lacked the context to get the full picture. So I had to dig into the cron job script to understand what was really going on.

When I want to get oriented to a large body of code, I like to see a depiction of how it is structured and organized and the major responsibilities of each significant part. Not the package or file structure (that will be useful, eventually), but how it is organized into significant modules, functions, classes and/or call structures. I want to know what the major parts of the system are and why they are there and relationships between them. And at a high level, how they interact. And then I want to understand key aspects of the system dynamics. Just by staring at code I’ll never fathom all the ins and outs of its execution model (e.g. what threads or processes there are, what important information is consumed, produced, or passed around).

Simon Brown gave a talk at XP2016 on The Art of Visualizing Software Architecture. Simon travels the world giving advice and training and talks at conferences about how to do this. He has also build a tool called Structerizr which:

“ blend[s] together the best parts of the various approaches and allow software developers to easily create software architecture diagrams that remain up to date when the code changes. Structurizr allows you to take a hybrid "extract and supplement" approach. It provides you with a way to create software architecture models using code. This means you can extract information from your code using static analysis and reflection, supplementing the model where information isn't readily available. The resulting diagrams can be visualized and shared via the web.”

I had the opportunity to spend an hour with Simon getting a personalized demo of Structurizr, and discussing the ins and outs of his tool works, how it has been used, and some aspirations for the future.

I was initially surprised about how you specify what your “model elements and relationships” are: you write code specifications for this. To me, it seemed that this could have more easily be accomplished by writing some parseable external description or using a diagramming tool which enabled me to link to the relevant code.

However, I know why Simon’s made this choice: He deeply believes that code should be the reference point for all structural architecture information. So he thinks it a natural extension that developers write a bit more code to specify the important elements of the models they want to see and how to depict them. To Simon, it isn’t sensical to just “add an element” to a diagram unless there is some attachment to actual code. For example, you may define a Hibernate specification to represent data access to a repository. Then that can be specified as a data source. With external descriptions that are not based on code, you have a greater chance of it becoming disconnected from what code actually does. So be it. This is where my values diverge a bit from Simon’s. I favor a richer or higher level documentation that isn’t code based if it tells me something I need to know to do a task (or something I shouldn’t do) or that gives me insights that I wouldn't find otherwise. I also like the freedom to embellish my diagrams with extra annotations, colors, highlights, etc., etc. so I can focus viewers’ attention. And consequently, I also like to create a key that describes the elements of any diagram so casual readers can understand my notation, too.

After processing these model descriptions, Structurizr spits out JSON descriptions that represent aspects of a C4 model: Context, Containers, Components, and Class diagrams. These are then submitted to a service, which then create various model visualizations that can directly link to parts of your code source. There’s also a rudimentary ability to add additional bits of information to textually describe your architecture and architecture requirements. I’m glad that, recognizing the limits of what code can tell about architecture, the tool provides an easy entry way into developers telling more about the architecture in text.

In our conversation, Simon emphasized that one important benefit of linking architecture document to code is that it never gets out of synch. The code for your architecture documentation can be versioned, right along with your production code.

Fair enough. Useful even. But how helpful is the documentation that is generated in understanding what’s going on in your system?

I think this is one way (certainly not the only way, however) to provide an orientation into the overall structure of your system and major relationships between its parts. But there are definite limits to what you can describe by models generated from static analysis of code. Such models don’t tell me the responsibilities of a component. I have to glean that from supplementary text (which I didn’t see an easy way to attach to the model descriptions) or to abstract that by reading code. And they certainly won’t explain any dynamic system behavior or system qualities.

While I find Structurizr’s direct connection to the code intriguing, it is also its biggest limitation. Structurizr is intended to be used by developers who want to create some architecture documentation, which stays in synch with their code and that also can be shared via websites or wikis or embedding it into other documents. Structurizr has succeeded with this fairly modest goal: assist developers who don’t ordinarily generate any useful architecture diagrams whatsoever to do something.

But this isn’t enough. As designers or architects, only you can tell me answers to my larger questions about why your code is structured the way it is, and what are the important bits about the system and runtime execution that are intentional and crucial to understand and preserve. No diagram generated by a tool will ever replace your insights and wisdom. No amount of code comments can convey all that either. I need to hear (and read) and see more of this kind of stuff from you.