At a Dagstuhl seminar, sponsored by the Leibniz society, a group of experts tried to refine and agree on a common definition of the elusive concept of “technical debt”:
In software-intensive systems, technical debt consists of design or implementation constructs that are expedient in the short term, but set up a technical context that can make a future change more costly or impossible. Technical debt is a contingent liability whose impact is limited to internal system qualities, primarily maintainability and evolvability.
(starting from Steve McConnell’s definition)
At a recent workshop, at XP 2014, we looked into practices that support scaling up agile, and in particular the role of architecture.
One conjecture we arrived at is that architects typically work on three distinct but interdependent structures:
- The Architecture (A) of the system under design, development, or refinement, what we have called the traditional system or software architecture.
- The Structure (S) of the organization: teams, partners, subcontractors, and others.
- The Production infrastructure (P) used to develop and deploy the system, especially important in contexts where the development and operations are combined and the system is deployed more or less continuously.
These three structures must be kept aligned over time, especially to support an agile development style. We can examine the alignment of these structures from the perspective of A and the role of the architect in an agile software-development organization.
The relationship of A to S is also known as (socio-technical) congruence and has been extensively studied, especially in the context of global, distributed software development. It is akin to the good old Conway’s law. It is very pertinent at the level of the static architectural structure (development view), where a development team wants to avoid conflicts of access to the code between teams and between individuals, while having clear ownership or responsibility over large chunks of code. When A is lagging, we face a situation of technical debt; when S is lagging, we have a phenomenon called “social debt”, akin to technical debt, which slows down development. See what Ruth Malan wrote on the topic; in particular: “if the architecture of the system and the architecture of the organization are at odds, the architecture of the organization wins.”
The alignment of A with P is seeing renewed interest with the increased focus on continuous integration and deployment and the concept of “DevOps”, which combines the development organization with the operations organization and puts tools in place to ensure continuous delivery or deployment, even in the case of very large on-line, mission-critical systems (e.g., Netflix, Facebook, Amazon). When P is lagging, we witness a case of “infrastructure debt”, as described by Shafer in the Cutter IT Journal, which is another source of friction. It is also explored in M. Erder and P. Pureur’s Continuous Architecture.
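To make the idea of A-to-S congruence concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the measure used in the studies mentioned above: the function name `congruence`, the module names, the team names, and the scoring rule (share of cross-team code dependencies that are matched by a communication link between the owning teams) are all hypothetical.

```python
def congruence(module_deps, owners, team_links):
    """Share of cross-team module dependencies backed by a team-to-team link.

    module_deps: pairs of modules where one depends on the other (structure A)
    owners:      module name -> owning team (structure S)
    team_links:  pairs of teams that actually coordinate with each other
    """
    # Coordination needs: every dependency that crosses a team boundary.
    needed = set()
    for a, b in module_deps:
        team_a, team_b = owners[a], owners[b]
        if team_a != team_b:
            needed.add(frozenset((team_a, team_b)))
    if not needed:
        return 1.0  # nothing to coordinate, so A and S trivially agree
    actual = {frozenset(link) for link in team_links}
    return len(needed & actual) / len(needed)


# Hypothetical system: three dependencies, three teams, one working link.
module_deps = [("ui", "billing"), ("billing", "ledger"), ("ui", "search")]
owners = {"ui": "web", "billing": "payments",
          "ledger": "payments", "search": "platform"}
team_links = [("web", "payments")]

score = congruence(module_deps, owners, team_links)
print(f"congruence = {score:.2f}")
```

In this made-up example, two dependencies cross team boundaries but only one pair of teams talks to each other, so half of the coordination needs are unmet. A low score can be read either way: A could be restructured to fit S, or S reorganized to fit A.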
A, S, and P must be “refactored” regularly to be kept in sync so that they can keep supporting each other. Too much early design in any of the three will potentially result in excessive delays, which will increase friction (by increased debt), reduce quality, and lead to overall product delivery delays.
With my colleagues I. Ozkaya and R. Nord from the SEI, we’ve examined some of this conjecture in a paper to be published by Springer-Verlag later in the fall. I’ll add a DOI when it happens.
The Software Engineering division of the Swiss Informatics Society is organizing a 2-day event on Technical Debt on January 23-24, 2014, at the University of Zürich.
Tutorials on Thursday, presentations on Friday. Pick one, or both.
I will participate on both days. Note that most of the presentations will be in English, even though the site is in German.
For details and registration, go here.
The phrase “agile architecture” evokes two things:
- a system or software architecture that is versatile, easy to evolve, to modify, flexible in a way, while still resilient to changes
- an agile way to define an architecture, using an iterative lifecycle, allowing the architectural design to tactically evolve gradually, as the problem and the constraints are better understood
The two are not the same: you can have a non-agile development process leading to a flexible, adaptable architecture, and vice versa, an agile process may lead to a rather rigid and inflexible architecture. One does not imply the other. But for obvious reasons, in the best of worlds, we’d like to have an agile process, leading to a flexible architecture.
There is a naïve line of thinking that, just by being agile, an architecture will gradually emerge out of bi-weekly refactorings. This belief was amplified by the rather poorly worded principle #11 in the agile manifesto, which states that:
“The best architectures, requirements, and designs emerge from self-organizing teams.”
and cemented by a profusion of repeated mantras like YAGNI (You Ain’t Gonna Need It), No BUFD (No Big Up-Front Design), or “Defer decisions to the last responsible moment”. (This principle is neither prescriptive, nor can it be tested, as Séguin et al. showed, so it is probably not a principle, but merely an observation or a wish.)
This naïve thinking about the spontaneous emergence of architecture is reinforced by the fact that most software endeavors nowadays do not require a significant amount of bold new architectural design: the most important design decisions have been made months earlier, or are fixed by pre-existing conditions, or are a de facto architectural standard set up in this industry. Choices of operating system, servers, programming language, database, middleware, and so on are pre-determined in the vast majority of software development projects, or have a very narrow range of possible choices. There is in fact little architectural work left to be done.
Architectural design, when it is really needed because of the project’s novelty, has an uneasy relationship with the traditional agile practices. Unlike the functionality of the system, it cannot easily be decomposed into small chunks of work, user stories or “technical stories”. Most of the difficult aspects of architectural design are driven by non-functional requirements, or quality attributes: security, high availability, fault tolerance, interoperability, scalability, etc., or development-related ones (testability, certification, and maintainability), which cannot be parceled out and for which tests are difficult to produce up front. Key architectural choices cannot easily be retrofitted onto an existing system by means of simple refactorings. Some late decisions may gut large chunks of the code, and therefore, whether the agilists like it or not, many of the architectural decisions have to be taken early, although not all at once up front.
Many have grappled with the issue of marrying an agile approach with the need for a solid architecture. Examples include Alistair Cockburn’s “walking skeleton” and the Scaled Agile Framework (SAFe) by Dean Leffingwell & co.
The most common thinking nowadays is that architectural design and the gradual building of the system (i.e., its user-visible functionality) must go hand in hand, in subsequent iterations. The delicate issue is actually: how do we pace ourselves, how do we address architectural issues and make decisions over time in a way that will lead to a flexible architecture and enable developers to proceed? In which order do we pick the quality attribute aspects and address them?
As for an agile architecture, the concept is not new: evolvability, software evolution, and re-engineering of existing systems have been studied and understood for a long time. Manny Lehman started this circa 1980. The word agile here is just new paint on an old concept.
The book by Simon Brown, Software Architecture for Developers, is a nice example of agile architecting, while the book by Jason Bloomberg, The agile architecture revolution, is a good example of agile architecture.
Note: the full story, in collaboration with my colleagues of the Software Engineering Institute, will appear in the Cutter IT Journal in February 2014, under the tongue-in-cheek title “How to agilely architect an agile architecture”. Beyond YAGNI…
- Agile Alliance, Manifesto for Agile Software Development, June 2001 http://agilemanifesto.org/.
- N. Séguin, G. Tremblay, and H. Bagane, “Agile Principles as Software Engineering Principles: An Analysis,” vol. 111, Lecture Notes in Business Information Processing, C. Wohlin, Ed. Berlin Heidelberg: Springer, 2012, pp. 1-15.
- Scaled Agile Framework, http://scaledagileframework.com/
- A. Cockburn, “Walking Skeleton,” http://alistair.cockburn.us/Walking+skeleton
- M. M. Lehman, “Programs, lifecycles, and laws of software evolution,” Proceedings of the IEEE – Special issue on software engineering, vol. 68(9), pp. 1060-1076, 1980.
- S. Brown, Software Architecture for Developers, LeanPub, 2013.
- J. Bloomberg, The agile architecture revolution, Wiley CIO, 2013.
- S. Bellomo, P. Kruchten, R. L. Nord, and I. Ozkaya, “How to agilely architect an agile architecture?,” Cutter IT Journal, vol. 27(2), pp. 12-17, Feb. 2014.
One of the most critical questions about software architecture is: what is its actual value? As software development processes focus more and more on value delivered to end-users and on time-to-market, the difficulty of assigning an actual value to the effort spent working on the architecture makes it much harder for software architects to convince project managers or product owners (or whoever represents the customer) to spend much effort on architecture-related activities, often leading rapidly to a large amount of technical debt.
At any point in time, a software development team is faced with a choice: what do we focus on in the next release cycle, or simply the next iteration or sprint? In the “backlog” of things not done yet, there are four kinds of elements (figure from the “What colours is your backlog?” slides):
Items that have visible value:
- The green stuff: the most obvious items are new features (services, functionalities, capabilities) to be added to the system; sometimes visible improvements in some quality attribute (capacity, response time, interoperability).
- The red stuff: if the software product is already released, then it is likely to have defects, hurting some customers directly, or hurting you indirectly through negative press.
Both have a cost to implement, and some tradeoff already has to be negotiated between the parties here: how much defect fixing relative to new features can we afford to do?
But there are also items that are completely invisible to the outside world:
- The yellow stuff: architectural elements, infrastructure, frameworks, deployment tools, etc. Known to the internal development team, and architects, they often are deferred in favour of more green or red stuff. Their cost is often very lumpy: they are hard to break down to small increments. We do know that they add value, in the long term, by increasing future productivity and often key quality attributes. But this value is hard to define.
- Finally, the black stuff: there are elements that both have a negative value and are invisible: big lumps of technical debt. They are the result of earlier architectural decisions that were wise and optimal at the time, but which in the current context are clearly suboptimal and hurt the project in several ways, usually by reducing productivity or by hampering the evolution of the system. The black stuff is known to the development team, but rarely made visible to the key decision makers who decide the future release roadmap. Shortcuts, or failing to develop the yellow stuff, increase the amount of black stuff, further preventing progress.
A compounding factor is that the various elements have many dependencies between them, especially dependencies of the green and red stuff on the yellow or black stuff. The tradeoffs between the various colours are now much more complicated and require diverse expertise, not just a sense of market value. Time plays a crucial role, too: the value of delivering a new feature is immediate, whereas the value of developing a good architecture may be reaped only over a long period of time.
The key issue is how much value is there in the yellow and black stuff? What is really the value of software architecture?
Some economic concepts such as Net Present Value, combined with dependency analysis, should be able to reconcile all four colours in making development choices for the future, short- or long-term. The Incremental Funding Method by Mark Denne and Jane Cleland-Huang is a step in this direction. Real Options could be another one.
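As a rough illustration of how Net Present Value can put green and yellow items on the same scale, here is a minimal Python sketch. Everything in it is an assumption made up for the example: the two plans, their per-release cash flows (in arbitrary units), and the discount rates. The dependency of the feature on the architectural work is simply folded into the second plan's cash flows, which is a simplification and not the Incremental Funding Method itself.

```python
def npv(cash_flows, rate):
    """Net present value of per-period cash flows.

    cash_flows[0] occurs now (undiscounted); cash_flows[t] occurs t periods later.
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical plans, one entry per release cycle.
# Plan 1: build the green feature directly on the existing structure.
feature_only = [-20, 10, 10, 10, 10]
# Plan 2: pay for the yellow (architectural) item first; returns start one
# cycle later but are higher because later features get cheaper to build.
architecture_first = [-30, 0, 20, 20, 20]

for rate in (0.10, 0.30):
    a = npv(feature_only, rate)
    b = npv(architecture_first, rate)
    better = "architecture first" if b > a else "feature only"
    print(f"rate={rate:.0%}: feature only={a:.2f}, "
          f"architecture first={b:.2f} -> {better}")
```

With these invented numbers the architectural investment wins at a 10% discount rate and loses at 30%: the more heavily the organization discounts the future, the harder it is to justify yellow work whose payoff comes late.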
- P. Kruchten, R. Nord, and I. Ozkaya, “Technical debt: from metaphor to theory and practice,” IEEE Software, vol. 29(6), pp. 18-21, 2012.
- M. Denne and J. Cleland-Huang, “The Incremental Funding Method: Data-Driven Software Development,” IEEE Software, vol. 21(3), pp. 39-47, 2004.
- K. J. Sullivan, P. Chalasani, S. Jha, and V. Sazawal, “Software Design as an Investment Activity: A Real Options Perspective,” in Real Options and Business Strategy: Applications to Decision Making, L. Trigeorgis (ed.), Risk Books, 1999.
- P. Kruchten, “What colours is your backlog? (slides),” 2009 (updated 2013). Video here.
In his 2000 ICSE Keynote in Limerick, Ireland, my colleague Grady Booch said: “There is still much friction in the process of crafting complex software; the goal of creating quality software in a repeatable and sustainable manner remains elusive to many organizations, especially those who are driven to develop in Internet time.” Friction?
“Friction: the resistance that one surface or object encounters when moving over another.” [Merriam-Webster dict.]
By analogy, in software development, friction is the set of phenomena that limit or constrain our progress, and therefore reduce our velocity (or productivity). An element of friction that we have been looking at more closely in the last few years is the result of technical debt: the accumulation of design or coding decisions that looked expedient at the time we made them, but are in retrospect suboptimal, and a hindrance now.
But there is another aspect of friction that is not related to the state of the code, but resides at the organizational and social level. Damian Tamburri, from VU in Amsterdam, has introduced the notion of social debt, as a counterpart of technical debt [ICSE2013 workshop]. Social debt is a state of a development project that results from the accumulation over time of decisions about the way the development team (or community) communicates, collaborates, and coordinates; in other words, decisions about the organizational structure, the process, the governance, the social interactions, or some elements inherited through the people: their knowledge, personality, working style, etc.
Social debt + Technical debt => Friction => delays, unpredictable schedule, and/or poor quality.
To reduce friction, we have to work in parallel on both aspects, technical and social. An ideal, frictionless project would have zero technical friction (i.e., a perfect design and perfect code) and zero social friction: a team that collaborates, communicates, and coordinates at zero cost, without error. As in physics, it is impossible to reduce friction to nothing, to eliminate it, to have a completely frictionless development. But at least it gives us something to aspire to.
Friction in everyday language also means “conflict or animosity caused by a clash of wills, temperaments, or opinions” [Merriam-Webster], and for sure we can often witness these in the social relationships of software development teams, but much of the social debt is more subtle in nature.
Just as we have defects and code smells on the technical side, we can observe defects and “smells” on the social side: not problems in themselves, but potential sources of a series of concrete problems if left unaddressed. Examples of social and organizational smells that Tamburri identified and studied are organizational silos and prima donnas [paper submitted to ICSE 2014]: maybe not issues in themselves in some circumstances, but certainly a potential for many ills.
Friction resulting from social debt is highly dependent on the context, and therefore will vary greatly in form and intensity based on [see Octopus]:
- size of the project (in whatever unit of measure: SLOC, function points, person-months, or staff);
- geographic distribution of the development team (compounded by cultural differences)
- governance rules imposed externally (Sarbanes-Oxley, Basel III)
- age of the system (dragging old habits or processes from the last century)
- stability of the environment (commercial/contractual environment, human resources, etc.)
- business model (internal development, software product, open-source community…)
There may be other aspects of the physics of friction, both static friction and dynamic friction, that could be exploited, making the size of the project an analog of mass and project velocity an analog of linear speed, and defining a concept of coefficient of friction in the presence of various kinds of lubricants.
“Friktion ist der einzige Begriff, welcher dem ziemlich allgemein entspricht, was den wirklichen Krieg von dem auf dem Papier unterscheidet.” Carl von Clausewitz, Vom Kriege, 1832.
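For reference, the standard (Coulomb) model of dry friction that this analogy would borrow from can be written as follows. The physics is textbook material; the mapping onto software projects suggested afterwards is only a speculative reading, not an established model.

```latex
\begin{align*}
N   &= m g       && \text{(normal force on a horizontal surface)} \\
F_s &\le \mu_s N && \text{(static friction: the threshold to overcome before motion starts)} \\
F_k &= \mu_k N   && \text{(kinetic friction: the force resisting motion once under way)}
\end{align*}
```

In this model the kinetic coefficient is typically smaller than the static one, and a lubricant lowers both. Read as an analogy, with project size playing the role of the mass m, a larger project would face proportionally more friction, getting a stalled project moving would cost more than keeping it moving, and better practices or tools would act as the lubricant that lowers the coefficients.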
(Friction is the one concept that separates real war from a mere paper exercise. My translation)