Philippe Kruchten

A Theory of Architectural Technical Debt

March 8, 2021

Our paper on a little theory (=conceptual model) of architectural technical debt is finally out. It’s open access.

https://doi.org/10.1016/j.jss.2021.110925

Roberto Verdecchia, Philippe Kruchten, Patricia Lago, Ivano Malavolta:
Building and evaluating a theory of architectural technical debt in software-intensive systems,
Journal of Systems and Software, Volume 176, 2021. DOI: 10.1016/j.jss.2021.110925.
(https://www.sciencedirect.com/science/article/pii/S0164121221000224)

Abstract: Architectural technical debt in software-intensive systems is a metaphor used to describe the “big” design decisions (e.g., choices regarding structure, frameworks, technologies, languages, etc.) that, while being suitable or even optimal when made, significantly hinder progress in the future. While other types of debt, such as code-level technical debt, can be readily detected by static analyzers, and often be refactored with minimal or only incremental efforts, architectural debt is hard to be identified, of wide-ranging remediation cost, daunting, and often avoided. In this study, we aim at developing a better understanding of how software development organizations conceptualize architectural debt, and how they deal with it. In order to do so, in this investigation we apply a mixed empirical method, constituted by a grounded theory study followed by focus groups. With the grounded theory method we construct a theory on architectural technical debt by eliciting qualitative data from software architects and senior technical staff from a wide range of heterogeneous software development organizations. We applied the focus group method to evaluate the emerging theory and refine it according to the new data collected. The result of the study, i.e., a theory emerging from the gathered data, constitutes an encompassing conceptual model of architectural technical debt, identifying and relating concepts such as its symptoms, causes, consequences, management strategies, and communication problems. From the conducted focus groups, we assessed that the theory adheres to the four evaluation criteria of classic grounded theory, i.e., the theory fits its underlying data, is able to work, has relevance, and is modifiable as new data appears. By grounding the findings in empirical evidence, the theory provides researchers and practitioners with novel knowledge on the crucial factors of architectural technical debt experienced in industrial contexts.

Appendix to June 2, 2020 talk on Software architecture

June 2, 2020

On June 2, 2020, I was invited by the SEI to make a presentation. Slides are posted here.

A recording of the talk is here, but be aware of a possible gap around minute 34; just go “fast forward”.

You will find here pointers to various papers and books I referred to in the talk.

For my own papers on Software architecture, see this other tab here.

References I made in the talk:

Butler W. Lampson. 1983. Hints for computer system design. SIGOPS Oper. Syst. Rev. 17, 5 (October 1983), 33–48. DOI: 10.1145/773379.806614
John A. Mills. 1985. A pragmatic view of the system architect. Commun. ACM 28, 7 (July 1985), 708–717. DOI: 10.1145/3894.3897
Dewayne E. Perry and Alexander L. Wolf. 1992. Foundations for the study of software architecture. SIGSOFT Softw. Eng. Notes 17, 4 (Oct. 1992), 40–52. DOI: 10.1145/141874.141884
Mary Shaw and David Garlan. 1996. Software architecture: perspectives on an emerging discipline. Prentice-Hall, Inc., USA.
Mary Shaw and Paul C. Clements. 1997. A Field Guide to Boxology: Preliminary Classification of Architectural Styles for Software Systems. In Proc. of COMPSAC ’97. IEEE Computer Society, USA, 6–13.
M. Fowler, “Design – Who needs an architect?,” in IEEE Software, vol. 20, no. 5, pp. 11-13, Sept.-Oct. 2003, doi: 10.1109/MS.2003.1231144.
Simon Brown: Are you a software architect, InfoQ Feb. 2010 https://www.infoq.com/articles/brown-are-you-a-software-architect/
Simon Brown, The C4 model for visualising software architecture, https://c4model.com
S. Redwine and W. Riddle, “Software Technology Maturation,” Proc. 8th Int’l Conf. Software Eng., IEEE CS Press, 1985, pp. 189–200.
David Garlan. 2000. Software architecture: a roadmap. In Proceedings of the Conference on The Future of Software Engineering (ICSE ’00). ACM, New York, NY, USA, 91–101. DOI: 10.1145/336512.336537
M. Shaw, “The coming-of-age of software architecture research,” in Proc. IEEE/ACM ICSE ’01, Toronto, Canada, 2001 pp. 657-664a. doi: 10.1109/ICSE.2001.919147
M. Shaw and P. Clements, “The golden age of software architecture” in IEEE Software, vol. 23, no. 02, pp. 31-39, 2006. doi: 10.1109/MS.2006.58
P. Kruchten, H. Obbink, J. Stafford, “The Past, Present, and Future for Software Architecture” in IEEE Software, vol. 23, no. 2, pp. 22-30, 2006. doi: 10.1109/MS.2006.59
P. Clements and M. Shaw, “The Golden Age of Software Architecture” Revisited. in IEEE Software, vol. 26, no. 04, pp. 70-72, 2009. doi: 10.1109/MS.2009.83
David Garlan and Mary Shaw. Software architecture: reflections on an evolving discipline. In Proc. of ESEC/FSE ’11. ACM, New York, 2011. DOI: 10.1145/2025113.2025116
SEI (2017) What is your definition of software architecture, https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=513807
SAFe & agile architecture: https://www.scaledagileframework.com/agile-architecture/
LeSS & architectural design: https://less.works/less/technical-excellence/architecture-design.html

Useful books on Software architecture:

L. Bass, P. Clements, R. Kazman, Software Architecture in Practice (3rd Ed.), Addison-Wesley (2012)
Ian Gorton, Essential software architecture (2nd ed), Springer (2011)
Simon Brown, Software architecture for developers (vol 1 & 2), LeanPub (2018) https://leanpub.com/software-architecture-for-developers
N. Rozanski and E. Woods, Software Systems Architecture: Working With Stakeholders Using Viewpoints and Perspectives (2nd ed) Addison Wesley (2011)
George Fairbanks, Just Enough Software Architecture: A Risk-Driven Approach , Marshall & Brainerd (2010)
F. Buschmann, M. Stal & al., Pattern-Oriented Software Architecture (vol 1 & 2), Wiley (1996, 2000)
M. Maier & E. Rechtin, The Art of Systems Architecting, 3rd ed, CRC Press (2009)
G. Hohpe, 37 Things One Architect Knows About IT Transformation: A Chief Architect’s Journey, LeanPub (2016).
M. Richards & N. Ford, Fundamentals of Software Architecture, O’Reilly (2020). (Just arrived on my shelf!)

Small update to the Mission to Mars game

May 24, 2018

Following the Agile Vancouver Meetup on May 22nd… see https://philippe.kruchten.com/articles/mtm/

If you’re in Vancouver and want to borrow a few games, just email me.

Concrete things you can do about your technical debt

February 14, 2017

“Technical debt, yes we have some of this, but what can we do…?” Here are a few ideas to get you started:

Organize a lunch-and-learn with your team to introduce the concept of technical debt. Illustrate it with examples from your own projects, if possible.
Create a category “TechDebt” in your issue tracking system, distinct from defects, or new features. Point at the specific artifacts involved.
Standardize on one single form of “Fix me” or “Fix me later” comment in the source code to mark places that should be revised and improved later. They will be easier to spot with a tool.
Acquire and deploy in your development environment a static code analyser to detect code-level “code smells”. (Do not panic at the large number of warnings it reports.)
Prioritize technical debt items to fix or refactor, by doing them first in the parts of your code that are the most actively modified, leaving aside or for later the parts that are never touched.
Organize small 1-hour brainstorming sessions around the question: “What design decision did we make in the past that we regret now because it is costing us so much?” or “If we had to do it again, what should we have done?” This is not a blame game, or a whining session; just identify high-level structural issues, the key design decisions from the past that have turned into technical debt today.
For identified tech debt items, give not only an estimate of the cost to “reimburse” or refactor them (in staff effort), but also an estimate of the cost of not reimbursing them: how much they drag on progress now. At least describe qualitatively the impact on productivity or quality. Tools in your development environment that report code churn and effort spent can help here.
At each development cycle, try to constantly reduce some of the technical debt by explicitly bringing some tech debt items into your iteration or sprint backlog.
Refine in your issue tracker the TechDebt category into at least 2 subcategories: simple, localized, code-level debt, and wide-ranging, structural or architectural debt.
For your major kinds of technical debt, identify the root cause (schedule pressure, process or lack of process, people availability or turnover, knowledge or lack of knowledge, tool or lack of tool, change of strategy or objectives) and plan specific actions to address these root causes, or mitigate their effect.
Acquire and deploy a tool that will give you hints about structural issues in your code: dependency analysis.
Develop an approach for systematic regression testing, so that fixing technical debt items does not put you at risk of breaking the code. (Counter the “It is not really broken, so I won’t fix it.”)
If you are actively managing risks, consider bringing some major tech debt items in your list of risks.
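
As an illustration of the advice above about standardizing on one “Fix me later” comment so that a tool can spot it, here is a minimal Python sketch. The marker text `FIXME-LATER:` and the function names `find_markers` and `scan_tree` are assumptions made for this example, not an established convention or tool; substitute whatever form your team agrees on.

```python
import re
from pathlib import Path

# Assumed team convention (hypothetical): every deferred fix is written
# as "FIXME-LATER: <short note>" in a comment.
MARKER = re.compile(r"FIXME-LATER:\s*(.*)")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line number, note) for each marker found in a source text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            hits.append((number, match.group(1).strip()))
    return hits


def scan_tree(root: str, pattern: str = "*.py") -> dict[str, list[tuple[int, str]]]:
    """Scan every matching file under root; files without markers are omitted."""
    report = {}
    for path in Path(root).rglob(pattern):
        hits = find_markers(path.read_text(encoding="utf-8", errors="ignore"))
        if hits:
            report[str(path)] = hits
    return report
```

Calling `scan_tree("src")` on a code base would give a per-file list of marked places, which could then be filed under the TechDebt category of the issue tracker. Because there is only one marker form, a single regular expression is enough.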

More advice here:
Ambler, S. (2017). 11 Strategies for Dealing With Technical Debt, Blog entry at http://www.disciplinedagiledelivery.com/technical-debt/

Refining the definition of technical debt

April 22, 2016

At a Dagstuhl seminar, sponsored by the Leibniz society, a group of experts tried to refine and agree on a common definition of the elusive concept of “technical debt”:

In software-intensive systems, technical debt consists of design or implementation constructs that are expedient in the short term, but set up a technical context that can make a future change more costly or impossible. Technical debt is a contingent liability whose impact is limited to internal system qualities, primarily maintainability and evolvability.

(starting from Steve McConnell’s definition)

Three “-tures”: architecture, infrastructure, and team structure

October 8, 2014

At a recent workshop, at XP 2014, we looked into practices that support scaling up agile, and in particular the role of architecture.

One conjecture we arrived at is that architects typically work on three distinct but interdependent structures:

The Architecture (A) of the system under design, development, or refinement, what we have called the traditional system or software architecture.
The Structure (S) of the organization: teams, partners, subcontractors, and others.
The Production infrastructure (P) used to develop and deploy the system, especially important in contexts where the development and operations are combined and the system is deployed more or less continuously.

System architecture (A), Organizational structure (S), and Production infrastructure (P)

These three structures must be kept aligned over time, especially to support an agile development style. We can examine the alignment of these structures from the perspective of A and the role of the architect in an agile software-development organization.

The relationship of A to S is also known as (socio-technical) congruence and has been extensively studied, especially in the context of global, distributed software development. It is akin to the good old Conway’s law. It is very pertinent at the level of the static architectural structure (development view), where a development team wants to avoid conflicts of access to the code between teams and between individuals, while having clear ownership or responsibility over large chunks of code. When A is lagging, we face a situation of technical debt; when S is lagging, we have a phenomenon called “social debt” akin to technical debt, which slows down development. See what Ruth Malan wrote on the topic; in particular: “if the architecture of the system and the architecture of the organization are at odds, the architecture of the organization wins”.
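
One crude way to look at the alignment of A and S is to check how many code dependencies cross team boundaries, since those are the places where teams must coordinate. The Python sketch below is only an illustration under stated assumptions: the module names, team names, and the functions `cross_team_dependencies` and `within_team_ratio` are hypothetical, and the ratio is a simple proxy, not the congruence measure used in the research literature.

```python
def cross_team_dependencies(owner, depends_on):
    """List (module, dependency) pairs whose two ends are owned by different teams."""
    return sorted(
        (module, dep)
        for module, deps in depends_on.items()
        for dep in deps
        if owner[module] != owner[dep]
    )


def within_team_ratio(owner, depends_on):
    """Fraction of dependencies that stay inside one team (1.0 if there are none)."""
    total = sum(len(deps) for deps in depends_on.values())
    if total == 0:
        return 1.0
    crossing = len(cross_team_dependencies(owner, depends_on))
    return (total - crossing) / total


# Hypothetical example: which team owns which module (S) ...
owner = {"ui": "web", "api": "web", "billing": "payments", "ledger": "payments"}
# ... and which modules depend on which (A).
depends_on = {"ui": ["api"], "api": ["billing"], "billing": ["ledger"], "ledger": []}

crossing = cross_team_dependencies(owner, depends_on)
ratio = within_team_ratio(owner, depends_on)
```

In this made-up example only the dependency from `api` to `billing` crosses a team boundary. A ratio that drops over time would be one hint that A and S are drifting apart.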

The alignment of A with P is seeing renewed interest with increased focus on continuous integration and deployment and the concept of “DevOps” combining the development organization with the operations organization, and having the tools in place to ensure continuous delivery or deployment, even in the case of very large on-line, mission-critical systems (e.g., Netflix, Facebook, Amazon). When P is lagging, we witness a case of “infrastructure debt” as described by Shafer in the Cutter IT journal, which is another source of friction. It is also explored in M. Erder and P. Pureur’s Continuous Architecture.

A, S, and P must be “refactored” regularly to be kept in sync so that they can keep supporting each other. Too much early design in any of the three will potentially result in excessive delays, which will increase friction (by increased debt), reduce quality, and lead to overall product delivery delays.

Congruence, DevOps

With my colleagues I. Ozkaya and R. Nord from the SEI, we’ve examined some of this conjecture in a paper to be published by Springer-Verlag later in the fall. I’ll add a DOI when it happens.

Herbsleb & Grinter (1999) Architectures, Coordination and Distance: Conway’s Law and Beyond, IEEE Software

Managing Technical Debt event in Zürich (Technische Schuld)

December 18, 2013

The Software Engineering division of the Swiss Informatics Society is organizing a 2-day event on Technical Debt on January 23-24, 2014, at the University of Zürich.

Tutorials on Thursday, presentations on Friday. Pick one, or both.
I will participate on both days. Note that most of the presentations will be in English, even though the site is in German.

For details and registration, go here.

Agile architecture

December 11, 2013

tags: agile, Architecture

Image from http://libarynth.org/project_txoom_design

The phrase “agile architecture” evokes two things:

a system or software architecture that is versatile, easy to evolve, to modify, flexible in a way, while still resilient to changes
an agile way to define an architecture, using an iterative lifecycle, allowing the architectural design to tactically evolve gradually, as the problem and the constraints are better understood

The two are not the same: you can have a non-agile development process leading to a flexible, adaptable architecture, and vice versa, an agile process may lead to a rather rigid and inflexible architecture. One does not imply the other. But for obvious reasons, in the best of worlds, we’d like to have an agile process, leading to a flexible architecture.

There is a naïve thinking that just by being agile, an architecture will gradually emerge, out of bi-weekly refactorings. This belief was amplified by a rather poorly worded principle #11 in the agile manifesto [1], which states that:

“The best architectures, requirements, and designs emerge from self-organizing teams.”

and cemented by a profusion of repeated mantras like: YAGNI (You Ain’t Gonna Need It) or No BUFD (No Big Up-Front Design), or “Defer decision to the last responsible moment”. (This principle is neither prescriptive, nor can it be tested, as Séguin et al. showed in [2], so it is probably not a principle, but merely an observation or a wish.)

This naïve thinking about the spontaneous emergence of architecture is reinforced by the fact that most software endeavors nowadays do not require a significant amount of bold new architectural design: the most important design decisions have been made months earlier, or are fixed by current pre-existing conditions, or are a de facto architectural standard set up in this industry. Choices of operating system, servers, programming language, database, middleware, and so on are pre-determined in the vast majority of software development projects or have a very narrow range of possible choices. There is in fact little architectural work left to be done.

Architectural design, when it is really needed because of the project novelty, has an uneasy relationship with the traditional agile practices. Unlike the functionality of the system, it cannot easily be decomposed into small chunks of work, user stories or “technical stories”. Most of the difficult aspects of architectural design are driven by non-functional requirements, or quality attributes: security, high availability, fault tolerance, interoperability, scalability, etc., or development-related ones (testability, certification, and maintainability), which cannot be parceled out, and for which tests are difficult to produce up-front. Key architectural choices cannot easily be retrofitted onto an existing system by means of simple refactorings. Some late decisions may gut large chunks of the code, and therefore, whether the agilists like it or not, many of the architectural decisions have to be taken early, although not all at once up front.

Many have grappled with the issue of marrying an agile approach and a need for having a solid architecture: for example, Alistair Cockburn with his “walking skeleton” [4], or Dean Leffingwell & co. with the Scaled Agile Framework (SAFe) [3].

The most common thinking nowadays is that architectural design and the gradual building of the system (i.e., its user-visible functionality) must go hand-in-hand, in subsequent iterations, and the delicate issue is actually: how do we pace ourselves, how do we address architectural issues and make decisions over time in a way that will lead to a flexible architecture and enable developers to proceed? In which order do we pick the quality attribute aspects and address them?

As for an agile architecture, the concept is not new: evolvability, software evolution, re-engineering of existing systems have been studied and understood for a long time. Manny Lehman started this circa 1980 [5]. The word agile here is just new paint on an old concept.

The book by Simon Brown, Software Architecture for Developers [6], is a nice example of agile architecting, while the book by Jason Bloomberg, The agile architecture revolution [7], is a good example of agile architecture.

Note: the full story, in collaboration with my colleagues of the Software Engineering Institute, will appear in the Cutter IT Journal in February 2014 [8], under the tongue in cheek title “How to agilely architect an agile architecture”. Beyond YAGNI…

References

[1] Agile Alliance, Manifesto for Agile Software Development, June 2001 http://agilemanifesto.org/.
[2] N. Séguin, G. Tremblay, and H. Bagane, “Agile Principles as Software Engineering Principles: An Analysis,” vol. 111, Lecture Notes in Business Information Processing, C. Wohlin, Ed. Berlin Heidelberg: Springer, 2012, pp. 1-15.
[3] Scaled Agile Framework, http://scaledagileframework.com/
[4] A. Cockburn, “Walking Skeleton,” http://alistair.cockburn.us/Walking+skeleton
[5] M. M. Lehman, “Programs, lifecycles, and laws of software evolution,” Proceedings of the IEEE – Special issue on software engineering, vol. 68(9), pp. 1060-1076, 1980.
[6] S. Brown, Software Architecture for Developers, LeanPub, 2013.
[7] J. Bloomberg, The agile architecture revolution, Wiley CIO, 2013.
[8] S. Bellomo, P. Kruchten, R. L. Nord, and I. Ozkaya, “How to agilely architect an agile architecture?,” Cutter IT Journal, vol. 27(2), pp. 12-17, Feb. 2014.

The (missing) value of software architecture

December 11, 2013

tags: Architecture, TechnicalDebt

One of the most critical questions about software architecture is: what is its actual value? As software development processes focus more and more on value delivered to end-users and time-to-market, the difficulty in assigning an actual value to the effort spent working on the architecture makes it much harder for software architects to convince project managers or product owners (or whoever represents the customer) to spend much effort on architecture-related activities, often leading rapidly to a large amount of technical debt [1].

At any point in time, a software development team is faced with a choice: what do we focus on in the next release cycle, or simply the next iteration or sprint? In the “backlog” of things not done yet, there are 4 kinds of elements (Figure from [1]):

Items that have visible value:

The green stuff: the most obvious items are new features (services, functionalities, capabilities) to be added to the system; sometimes visible improvements in some quality attribute (capacity, response time, interoperability).

The red stuff: if the software product is already released, then it is likely to have defects, hurting some customers directly, or indirectly by giving you negative press.

Both have a cost to implement, and some tradeoff already has to be negotiated between parties here: how much defect fixing relative to new features can we afford to do?

But there are also items that are completely invisible to the outside world:

The yellow stuff: architectural elements, infrastructure, frameworks, deployment tools, etc. Known to the internal development team and architects, they are often deferred in favour of more green or red stuff. Their cost is often very lumpy: they are hard to break down into small increments. We do know that they add value, in the long term, by increasing future productivity and often key quality attributes. But this value is hard to define.

Finally, the black stuff: there are elements that both have a negative value and are invisible: big lumps of technical debt. They are the result of earlier architectural decisions that were wise and optimal at the time, but which in the current context are clearly suboptimal and hurt the project in several ways: usually by reduced productivity, or impact on the evolution of the system. The black stuff is known by the development team, but rarely expressed visibly at the level of the key decision makers who decide the future release roadmap. Shortcuts or omissions in developing the yellow stuff increase the amount of black stuff, further preventing progress.

A compounding factor is that the various elements have many dependencies between them, especially dependencies of the green and red stuff on the yellow or black stuff. The tradeoffs between the various colours are now much more complicated and require diverse expertise, not just knowledge of market value. Time plays a crucial role, too: the value of delivering a new feature is immediate, whereas the value of developing a good architecture may be reaped only over a long period of time.

The key issue is how much value is there in the yellow and black stuff? What is really the value of software architecture?

Some economic concepts, such as Net Present Value, combined with dependency analysis, should be able to reconcile all four colours in making development choices for the future, short- or long-term. The Incremental Funding Method by Mark Denne and Jane Cleland-Huang is a step in this direction [2]. Real Options could be another one [3].
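
To make the Net Present Value idea concrete, here is a minimal Python sketch. It only shows the standard discounting formula; the function name `net_present_value`, the cash-flow figures, and the 10% rate per period are assumptions chosen for illustration, and the dependency analysis between backlog items is not covered.

```python
def net_present_value(cash_flows, rate):
    """Discount per-period cash flows (period 0 first) back to today's value."""
    return sum(flow / (1 + rate) ** period for period, flow in enumerate(cash_flows))


# Hypothetical figures, in arbitrary units of value per release cycle.
# A "green" feature: costs 10 now, pays back 15 in the next cycle.
feature = [-10, 15]
# A "yellow" architectural item: same cost and same payback, but two cycles later.
architecture = [-10, 0, 0, 15]

rate = 0.10  # assumed discount rate per release cycle
feature_npv = net_present_value(feature, rate)
architecture_npv = net_present_value(architecture, rate)
```

With the same cost and the same nominal payback, the item whose benefit arrives later always has the lower NPV, which is one way to see why the yellow stuff tends to lose out when only near-term value is compared.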

To read more about the 4 colours idea, look in my Talks page for a 2009 presentation (file here), or a more recent one from 2013 here [4]. A video of the talk is here.

References:

[1] P. Kruchten, R. Nord, and I. Ozkaya, “Technical debt: from metaphor to theory and practice,” IEEE Software, vol. 29(6), pp. 18-21, 2012.
[2] M. Denne and J. Cleland-Huang, “The Incremental Funding Method: Data-Driven Software Development,” IEEE Software, vol. 21(3), pp. 39-47, 2004.
[3] K. J. Sullivan, P. Chalasani, S. Jha, and V. Sazawal, “Software Design as an Investment Activity: A Real Options Perspective,” in Real Options and Business Strategy: Applications to Decision Making, L. Trigeorgis (ed.), Risk Books, 1999.
[4] P. Kruchten, “What colour is your backlog? (slides),” 2009 (updated 2013). Video here.

Friction

November 24, 2013

In his 2000 ICSE Keynote in Limerick, Ireland, my colleague Grady Booch said: “There is still much friction in the process of crafting complex software; the goal of creating quality software in a repeatable and sustainable manner remains elusive to many organizations, especially those who are driven to develop in Internet time.” Friction?

“Friction: the resistance that one surface or object encounters when moving over another.” [Merriam-Webster dict.]

By analogy, in software development, friction is the set of phenomena that limit or constrain our progress, and therefore reduce our velocity (or productivity). An element of friction that we have been looking at more closely in the last few years is the result of technical debt: the accumulation of design or coding decisions that looked expedient at the time we made them, but are in retrospect suboptimal, and a hindrance now.

But there is another aspect of friction that is not related to the state of the code, but resides at the organizational and social level. Damian Tamburri, from VU in Amsterdam, has introduced the notion of social debt, as a counterpart of technical debt [ICSE2013 workshop]. Social debt is a state of a development project which is the result of the accumulation over time of decisions about the way the development team (or community) communicates, collaborates and coordinates; in other words, decisions about the organizational structure, the process, the governance, the social interactions, or some elements inherited through the people: their knowledge, personality, working style, etc.

Social debt + Technical debt => Friction => delays, unpredictable schedule, and/or poor quality.

To reduce friction, we have to work in parallel on both aspects, technical and social. An ideal, frictionless project would have zero technical friction (i.e., a perfect design, and perfect code), and zero social friction: a team that collaborates, communicates and coordinates at zero cost, without error. As in physics, it is impossible to reduce friction to nothing, to eliminate it, to have a completely frictionless development. But at least it gives us something to aspire to.

Friction in everyday language also means “conflict or animosity caused by a clash of wills, temperaments, or opinions” [Merriam-Webster], and for sure we can witness these often in the social relationships of software development teams, but much of the social debt is more subtle in nature.

As we have defects and code smells on the technical side, we can observe on the social side defects and “smells”: not problems in themselves, but a potential source of a series of concrete problems if left unaddressed. Examples of social and organizational smells Tamburri identified and studied are organizational silos and prima donnas [paper submitted to ICSE 2014]: maybe not issues in themselves in some circumstances, but certainly a potential for many ills.

Friction resulting from social debt is highly dependent on the context, and therefore will vary greatly in form and intensity based on [see Octopus]:

size of the project (in whatever unit of measure: SLOC, function points, person-months, or staff);
geographic distribution of the development team (compounded by cultural differences);
governance rules imposed externally (Sarbanes-Oxley, Basel III);
age of the system (dragging old habits or processes from last century);
stability of the environment (commercial/contractual environment, human resources, etc.);
business model (internal development, software product, open-source community…).

There may be other aspects of the physics of friction, both static friction and dynamic friction, that could be exploited, making the size of the project an analog of mass, and linear speed an analog of project velocity, and defining a concept of coefficient of friction in the presence of various kinds of lubricants.
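
As a starting point for that analogy, the classical dry-friction model from physics can be written down as follows. The project reading given in the comments is only one speculative mapping, stated as an assumption, and not a validated model.

```latex
% Classical dry friction: N is the normal force, m the mass, g gravity,
% \mu_s and \mu_k the static and kinetic coefficients of friction.
\[
  F_{\text{static}} \le \mu_s N, \qquad
  F_{\text{kinetic}} = \mu_k N, \qquad
  N = m g
\]
% Speculative project reading (assumption):
%   m      ~ size of the project
%   \mu    ~ a coefficient driven by technical and social debt
%   lubricants lower \mu (practices that reduce debt)
```

In the physical model, friction grows with mass for a given coefficient, so under this reading a larger project would pay more for the same level of debt.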

Your thoughts?

“Friktion ist der einzige Begriff, welcher dem ziemlich allgemein entspricht, was den wirklichen Krieg von dem auf dem Papier unterscheidet.” Carl von Clausewitz, Vom Kriege, 1832.
(Friction is the one concept that separates real war from a mere paper exercise. My translation)
