legacy software and the big rewrite

posted on October 27, 2012 - tagged as: software

A few years ago I was helping to write a job posting and inserted a requirement for a “willingness to work on legacy software.” At the time this was meant as a nod to the fact that we had some code that was crufty and generally unpleasant to maintain, but we’d be asking some of the new hires to maintain it anyway.

Since helping to write that job posting, I’ve spent a lot of time replacing what could charitably be called a legacy system. We concluded that the old software needed to be rewritten largely because of its inherently poor design. The system was extremely tightly coupled and, despite the herculean efforts of one developer to dramatically increase test coverage, still very difficult to test. From an operations point of view, the performance was unpredictable and tended to impact other systems. Keeping the system running meant buying more hardware, getting developers up to speed (the original author was no longer with the company), and then placating them throughout the demoralizing experience of maintaining the beast. To call this software, which at the time was just over a year old, “legacy” was simply a euphemism for “piece of shit.”

This isn’t to say, however, that all systems we call legacy are poorly designed. The phrase is much more loaded. Software and the businesses it supports move quickly. The system that performed admirably last quarter might not meet the demands of an evolving business model. Good design can certainly help to future-proof a system, but whether we are cognizant of them or not, fundamental assumptions are often baked into systems early on. When these assumptions no longer hold, pushing the system forward can be more trouble than it’s worth.

External factors, too, play a huge role in making a project feel like it’s attaining legacy status. For example, software that is closely tied to hardware is always going to be viewed as legacy when that hardware is no longer supported and cannot be upgraded. Again, the software might be designed well, but if it has no place to run, we’re going to refer to that as a legacy system. Likewise, systems written in languages that the market as a whole has passed by are going to be considered legacy if modifications are required on a regular basis. If the prospects for pushing the system forward grow dimmer by the day, we’re also going to call that a legacy system.

So how do we come to the conclusion to replace a “legacy” system? Obviously the big rewrite is terrifying, generally a bad idea, and hard to sell to management. However, if the code isn’t flexible enough to meet the demands of the business, the decision is much easier and might be unavoidable. Assuming we’ve done our homework, tracked the time and cost of maintaining the system, and have faith that the engineers can reasonably predict the time required for a rewrite, we might even be able to project a cost savings from such an endeavor. In this case it really isn’t appropriate to refer to the situation as a rewrite. The original system fails to meet the demands of the business, and so the rewrite is not strictly technical. Instead, it’s a time to start fresh and re-explore the problem space of the original system. In these cases, we hope that the end result is a system that meets the demands of the business. We might even end up with something smaller and more flexible, or even break things apart into separate pieces so as to avoid (or more realistically delay) ending up in the same situation farther down the line.
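As a rough illustration of the cost projection mentioned above, here is a minimal sketch of the arithmetic. The function names and every figure in it are hypothetical, invented for the example rather than taken from any real project, and it assumes maintenance costs stay flat from month to month, which they rarely do.

```python
import math


def break_even_month(monthly_maintenance, rewrite_cost, monthly_after_rewrite):
    """Months until the rewrite has paid for itself, or None if it never does.

    All inputs are assumed estimates in the same currency unit.
    """
    monthly_savings = monthly_maintenance - monthly_after_rewrite
    if monthly_savings <= 0:
        # The new system costs at least as much to run, so there is no payback.
        return None
    return math.ceil(rewrite_cost / monthly_savings)


def projected_savings(monthly_maintenance, rewrite_cost, monthly_after_rewrite, months):
    """Net savings over a horizon of `months`; a negative result means a net loss."""
    monthly_savings = monthly_maintenance - monthly_after_rewrite
    return monthly_savings * months - rewrite_cost


# Hypothetical figures: 20,000/month to maintain the old system,
# a 120,000 rewrite, and 5,000/month to maintain the replacement.
print(break_even_month(20_000, 120_000, 5_000))       # 8
print(projected_savings(20_000, 120_000, 5_000, 24))  # 240000
```

The numbers are only as good as the tracking behind them, and the rewrite estimate is usually the shakiest input, so it is worth running the projection with a pessimistic rewrite cost as well.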

When considering an actual rewrite (i.e., building a drop-in replacement for an existing system), the side-effects of the original system are the primary issue. For example, if hiring engineers experienced in the implementation language of the original system is a problem, or, even worse, training engineers in this language is an unattractive proposition both in terms of expense and their career paths, a rewrite might be an acceptable course of action. This is probably a tougher sell, but no less important. Human factors are the biggest expense in software, and keeping a project attractive to talented developers is crucial. Other side-effects such as performance are also factors. I’ve seen one case where a strict port from Python to Java was done for performance reasons and was a success. Obviously this kind of rewrite is much riskier, but, again, sometimes necessary.

Yes, rewriting a system is a daunting process, but software ages, requirements change, people move on, and sometimes it’s time to start anew. Sometimes this is as simple as rewriting or refactoring a component inside a larger system. Other times, however, a system is so poorly designed or so pushed by external factors as to warrant a complete re-imagining and re-implementation. In these cases we have to be bold, explore the problem space, solve the right problems, and build the best system we can.
