What is technical debt (and why should I care)?

Preface: There are many excellent treatises on technical debt. Code Clear Labs specializes in software development in research and commercialization settings, and this post, while neither complete nor exhaustive, is aimed at that audience.

Ever experienced a home inspection? I recently walked along for the examination of a century-old structure. Although I am one of the least “handy” people you'll meet, the inspector's findings reminded me of a familiar topic in my own trade of software engineering. It became clear, as we uncovered old “knob & tube” wiring, missing insulation, aging brick, and some plainly shoddy work, that we were tabulating the “construction debt” of the house.

This analogy helps introduce the term technical debt, which is well understood within software development but rarely encountered outside it. Software, like buildings, can decay with age, rely on design shortcuts made in haste, and have other faults introduced during construction. These issues, collectively, are referred to as cruft.

Whether debt is of the technical or construction variety, it shares a common trait:

The most-impacted parties are typically unaware of their “debt” and how it affects them.

People who commission a home renovation depend on its functional longevity but often have little to no construction expertise or awareness of what lies behind the walls. In software, the business or research leaders who drive development forward to solve real problems are typically not in tune with the health of the codebase and how it affects that progress.

Debt vs. Defect

It's important to distinguish issues that impact usage, as in living in the house or using the software, from those that hinder further development. While the latter is debt, the former are defects, which in software are often referred to as bugs. A leaking roof or software that crashes is a defect. A wall insulated with asbestos that is safe as long as it remains undisturbed is debt, as is a non-modular software design with lots of repetitive code. While interest on debt is typically thought of as increased development cost, it can also manifest as more defects when owners are unwilling or unable to pay the full interest and continue taking shortcuts.

The impact of a defect is generally more direct and detectable by a user or owner, while debt interest (slowed development or resulting defects) is indirect and can often be misdiagnosed as a process or personnel issue.

Paying with interest

Now I'm really going to confuse things by bringing in a second analogy. “Technical debt” was originally coined by Ward Cunningham using financial debt as a metaphor to help communicate his views on software development to management. Just as borrowing money can be a good strategy to achieve a specific goal, accumulating cruft in the business interest of faster product iteration is fine, Cunningham argued. Just be aware:

The interest on the debt is your development velocity slowing over time and the debt grows at a compound rate.
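To make the compounding concrete, here is a minimal sketch in Python. The 5% per-iteration “interest” and the 10-day baseline are made-up numbers chosen for illustration, not measurements, and real codebases do not follow a tidy formula.

```python
def effort_per_iteration(base_days: float, interest_rate: float, iterations: int) -> list[float]:
    """Days needed to build a same-sized feature in each iteration,
    assuming unpaid debt inflates effort at a compound rate (illustrative model only)."""
    return [round(base_days * (1 + interest_rate) ** n, 1) for n in range(iterations)]


# Hypothetical inputs: a feature costs 10 days on a clean codebase,
# and each iteration of unpaid debt adds 5% to the cost of the next one.
schedule = effort_per_iteration(base_days=10, interest_rate=0.05, iterations=24)
print(schedule[0], schedule[-1])
```

Under these assumed numbers, the same feature takes roughly three times as long by the 24th iteration, even though no single iteration looked alarming on its own.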

This financial metaphor is ideal for framing the correct attitude to take toward managing cruft. However, I always found it lacking when trying to explain technical debt to a non-technical person, who cannot see or sense it in any other way. Anecdotally speaking, I've had more success with the construction analogy.

Good debt vs. bad debt

In software circles, technical debt is often viewed as wholly negative and to be entirely avoided. This was not Ward Cunningham's intent in creating the term. His financial metaphor is perfect in this respect, illustrating that you can have good debt, like a mortgage, that helps move you toward a sound business objective, and bad debt, like an over-used consumer credit card with high interest payments for short-lived gains. Just remember: even “good” debt is a tool to reach an objective and is not attractive in the absence of a goal nor something that you want to collect.

Good and bad debt exist in software too and, one could easily argue, all software has technical debt.

Because debt is defined as hindering development, which can be thought of as moving the software in a desired direction, design choices that were previously not considered as debt can become so and vice-versa, simply with a change in direction of the product roadmap. In other words, all software is the sum of many design decisions and, one way or another, each decision is imperfect when considering the myriad of ways in which the codebase may need to evolve.

Origins of debt

Besides being considered good or bad, there are several ways to categorize technical debt. One method, based on intent, says debt is either:

Planned: “We know there's a better way, but there's no time for it”
Unplanned: “Anyone know the best way to do this?”
Unavoidable: “Did you hear we now have to support multiple languages?”

Software expert Martin Fowler took this a step further, categorizing debt into quadrants along two axes: deliberate vs. inadvertent and reckless vs. prudent.

Technical debt quadrant

This circles back to Cunningham's reason for coining the term in the first place: awareness of debt and the attitude taken towards it are going to determine whether you sink or swim. That said, regardless of how we categorize its origins, debt is debt; it doesn't care about good intentions and it will impact development all the same. Bottom line: technical debt needs to be managed.

Managing your debt

There are two necessary strategies for managing technical debt and they are well-described by an acronym from the world of medical devices: CAPA, or Corrective Action, Preventative Action. In that strictly-regulated domain, if a device or software exhibits a defect in the field, a full investigation must be performed and documented. The investigation should result in a fix, such that the “bug” does not re-occur and any resultant harm is mitigated. This is the corrective action. Furthermore, development, testing, and review processes are examined to see what “cracks” the defect slipped through. Mending those cracks is the preventative action.

Correction

Correcting debt you've already incurred is a matter of re-writing the code in question, a process called refactoring. By definition, it means spending time writing code that does not affect the functionality of the software, so you can imagine why developers are often left as the only advocates for it. That said, a software development process with a healthy attitude towards technical debt allocates time for refactoring.
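As a small illustration, the Python sketch below shows copy-pasted logic before and after refactoring. The function names and the measurement scenario are hypothetical; the point is that the output stays identical while future changes have only one place to touch.

```python
# Before: the same cleaning-and-averaging logic is copy-pasted per signal.
def mean_heart_rate_before(samples):
    valid = [s for s in samples if s is not None and s >= 0]
    return sum(valid) / len(valid) if valid else 0.0


def mean_blood_pressure_before(samples):
    valid = [s for s in samples if s is not None and s >= 0]
    return sum(valid) / len(valid) if valid else 0.0


# After: one shared helper; what the software does is unchanged.
def mean_of_valid(samples):
    """Average of the non-missing, non-negative samples (0.0 if there are none)."""
    valid = [s for s in samples if s is not None and s >= 0]
    return sum(valid) / len(valid) if valid else 0.0


def mean_heart_rate(samples):
    return mean_of_valid(samples)


def mean_blood_pressure(samples):
    return mean_of_valid(samples)


# The refactored functions return exactly what the originals did.
readings = [60, None, 62, -1, 64]
assert mean_heart_rate(readings) == mean_heart_rate_before(readings)
assert mean_blood_pressure(readings) == mean_blood_pressure_before(readings)
```

A user of the software sees no difference, which is exactly why this kind of work is hard to justify without an understanding of debt. The payoff comes later, when the validity rule changes and only `mean_of_valid` needs editing.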

Prevention

Preventing additional, unintentional debt from being created is achieved through process (relying on skill is foolhardy). Proper mitigation (not actually prevention, because 100% effectiveness is not practically achievable) basically comes down to two things: time and attention.

Good processes allocate enough time, not only for the proper development to occur, but for important development-enabling activities that are often misunderstood as optional or trivial:

Requirements gathering: incomplete or inaccurate requirements lead to designs built on bad assumptions.
Architecture & design: start at the whiteboard (not the keyboard) and consider all options before proceeding.

*Notable absence: testing. Highly important, but at this stage, the debt is already there and you're looking for defects.

These activities ideally occur throughout development (not just at the start) and as much as they need time, they also need attention.

Debt creeps in when no one is watching. Put more than one set of eyes on design-related activities.

How do you implement attention as a process? Reviews. Gathered requirements? Review them. Chosen a high-level design? Review it. Implemented the design in 2,000 lines of code? Review each line. The more reviewers, the better. Many think of code review as a place to find defects, but in fact 75% of review findings pertain to debt.

Ignore at own peril

In many well-run software companies and open source projects, code-bases have existed for long periods of time and yet their debt levels remain quite reasonable. This is inevitably due to an understanding and acknowledgement of the debt phenomenon and having proactive measures in place to manage it. Intentionally-incurred debt is documented and later corrected (refactoring). Unintentional debt is minimized through proper prevention (process & reviews).
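Documenting debt does not require heavy tooling. The following is a minimal Python sketch of a debt register, assuming the three intent-based origins listed earlier; the class names, fields, and file paths are all hypothetical, and a shared spreadsheet or issue tracker label would serve the same purpose.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    PLANNED = "planned"
    UNPLANNED = "unplanned"
    UNAVOIDABLE = "unavoidable"


@dataclass
class DebtItem:
    location: str       # where the debt lives (hypothetical file paths below)
    description: str    # what the shortcut or design flaw is
    origin: Origin      # how it was incurred
    resolved: bool = False


def outstanding(register):
    """Return the items that still need refactoring."""
    return [item for item in register if not item.resolved]


register = [
    DebtItem("io/loader.py", "Hard-coded file paths", Origin.PLANNED),
    DebtItem("core/filter.py", "Duplicated smoothing code", Origin.UNPLANNED, resolved=True),
]
print([item.location for item in outstanding(register)])  # prints ['io/loader.py']
```

Even a list this simple makes the debt visible to the people who set priorities, which is the first step toward scheduling its repayment.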

Research environments are different. Their objectives correctly prioritize speed and experimentation over robustness and stability. Accordingly, both their debt threshold and the amount of debt that can be considered good are higher than the norm. This, sometimes combined with a lack of software development experience, often means there is no debt mitigation whatsoever. It is more than a bit ironic that, in a world where peer-reviewed literature reigns supreme, structured and planned review of software is exceedingly rare.

Whether intentional or not, technical debt in research usually piles up and interest is paid continually and indirectly with a seemingly plentiful resource: time.

Buyer beware

If debt accumulation and slow research progress are not a concern, then hopefully there is also little ambition for the software to “leave the building”.

Even if you're comfortable ignoring the debt, it's unlikely that an acquirer will be.

Just as potential buyers of a house will look elsewhere or reduce their valuation based on significant inspection findings, a software acquirer, licensee, or investor will take technical due diligence into account. As the engineer at a company in IP-acquisition mode, often tasked with evaluating whether research code was worth integrating, I lost count of the number of good ideas sunk by terrible execution.

Not to worry, having your software acquired directly out of the lab is a rare occurrence anyway. More than likely, you'll have to take it to market yourself and prove out demand with a commercialization effort. You'll find, however, that just like the financial kind, collection of technical debt is hard to avoid.

Hidden cost

Unless a codebase and the innovation that it represents are entirely discarded due to a lack of promise (the software equivalent of filing bankruptcy),

technical debt is always repaid during commercialization and, usually, unwittingly.

Having developed a promising algorithm or software application that they wish to take to market, researchers and entrepreneurs will learn that building the commercial product involves a complete re-write of the software, like building a brand new house beside the old one. This is usually neither the wrong course of action nor a case of the entrepreneur being taken advantage of.

Lost in translation

In many situations, a “re-write” is necessitated by the need to switch to a more suitable platform or technology (e.g., from MATLAB to C#, or from offline to cloud), a process referred to as porting the software. Software “ports”, however, like translations of a piece of literature into another language, can have widely varying levels of effort and cost that depend highly on the source material. Porting software of low vs. high debt is akin to commissioning the translation of a French children's book vs. an ancient, partially damaged Paleo-Hebrew script.

I've been part of enough commercialization efforts to see a founder/inventor/scientist reach the same unpleasant revelations:

The existing codebase they assumed was a springboard has no reusability.
The effort they thought to be simple & quick is large & expensive.

This is the technical debt coming due.

(and at too late a stage for an explanation of how this has accumulated over months or years of development to be useful or welcome)

So what should I do?

Whether you've been battling technical debt for years (perhaps without knowing what to call it) or you just realized that you may have this problem, the most obvious question is: “What can I do?”. The aim is not to turn your lab into a well-oiled, production-ready software machine, because that doesn't align with the primary characteristics of efficient research (speed, flexibility, innovation). But it also doesn't justify research being the “wild west” of software development. When it comes to managing technical debt, there is enough low-hanging fruit worth picking.

That said, it doesn't make sense to add much, if any, overhead to research software in its infancy.

The time & effort spent on software quality should be relative to its maturity and commercialization “promise”.

In my next post, we'll group research software into stages and suggest what processes and activities are warranted at each stage, to ensure you don't end up drowning in technical debt.