On July 6, 1988, oil workers were evacuated from the Piper Alpha oil rig in the North Sea after an explosion triggered by a chain of events during a routine check, while a major reconstruction of the platform was underway.
The fault was triggered when inspectors removed and replaced the safety valves, except for one, which was never put back. Unaware that the safety valve was missing, a worker pushed the start button, and gas began to leak out.
As is so often true in these cases, by that point the tragedy was impossible to avert.
This is Part 1 of a three-part series on killing legacy software.
Just as with the Piper Alpha oil rig, introducing changes to software without considering the entire system can lead to disastrous outcomes.
Of course, software works very differently from an oil rig, but both depend on ongoing processes of optimization, maintenance, and security.
There is always a process, created and refined to guarantee system stability, in which the tools, techniques, people, and technology need to fit together well. Maintaining an ageing infrastructure makes that a challenge.
This is a problem that startups or young software companies, four to seven years old, are unlikely to have. But when your winning formula contains elements that were appropriate 16 or 20 years ago, things get tricky really fast.
Suddenly we find ourselves in the opening story, where a routine procedure can push many people and processes to the verge of collapse.
The problem arises because the main structure is old and trusted, yet it must balance maintenance against the new services being demanded and added on top.
In the case of the oil rig, it had been operating since 1976, and by 1988 six major reconstruction projects were underway. It had begun as an oil-only production platform but had been converted to produce gas as well.
The same thing happens in software, when new features and processes begin creating new problems and maintenance challenges. When planned maintenance threatens stability, more resources are needed to avoid a catastrophic chain of events. Just how many more resources are required is, unfortunately, something we often learn through mistakes.
Changes in software usually involve new tools and frameworks, often introduced by new people who are forced to salute the flag of legacy, and these might not be fully compatible with the vision of the past.
There are many examples of big companies that combine legacy solutions with a high rate of innovation and have suffered disastrous outcomes. The most recent one that comes to mind is the Microsoft customer log incident, which exposed roughly 250 million customer service records spanning 14 years of private conversations.
In their official response, dated January 22nd, 2020, Microsoft explained that the internal support database in question was not breached but rather "misconfigured". In other words, there was no hacking involved; someone made a change request that resulted in the data being exposed by accident.
This sounds a lot like the causes of the Piper Alpha tragedy, and like the idea behind our Legacy Software Trap theory, because Microsoft was building on top of internal legacy processes when this happened. Oftentimes the creators or maintainers of a legacy codebase leave, and the new team members have to manage the transition. That is the most dangerous time for accidents stemming from changes.
We wanted to cover the topic of legacy software at HyperConnect because we have dealt with the issue directly, and we feel we are now close to a happy ending. The problem, of course, is that nobody in the software sector seems eager to talk about this, which is understandable.
It is a touchy subject, one tied up with the professional coming of age of many software developers and managers, who may be afraid to speak honestly about the shortcomings and challenges faced by their companies or former colleagues.
The interesting thing is that this happens at every level, and however much we prepare ourselves to tackle the problem, human factors ultimately decide just how big the transition phase will be. What we want is to help reduce risk exposure during that transition.
Usually, this is dictated by how well the new team is able to do the following:
- Create and establish a new architecture for a better performing replacement.
- Refine and improve the usability of the new system.
- Run countless demos to validate the new services, keeping in mind that the scope can change.
In Part 2 of our Killing Legacy Software series, we will take a look at the first of these, because it is genuinely difficult to realize that:
A) It is necessary to replace the legacy software altogether.
B) It has to be done at the right time and in the proper manner for legacy customers.
C) It is a must to bring in new people, often at the expense of letting go of valuable team members.
Stay tuned for our next chapter!
If you want to read more about the Piper Alpha catastrophe, you can get started here.