Redundancy in Software

An experimental evaluation of the assumption of independence in multiversion programming – John Knight and Nancy Leveson

Behind this dry title lies something very interesting. I first heard about this paper from Ralph Johnson in a newsgroup discussion about program correctness. It turns out that redundancy, one of the avenues that engineers in other disciplines take to make their products stronger, doesn’t really work in software. Multi-version programming was the idea that you could decrease faults in critical systems by handing the spec to several teams, having them develop the software independently, and then running the resulting systems in parallel. A monitoring process compares their results, and if there is any discrepancy it picks the most common one. Sounds like it should work, right? Well...
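
To make the voting scheme concrete, here is a minimal sketch of a monitor that runs several versions and picks the most common result. Everything in it is hypothetical: the `vote` helper and the three leap-year "versions" are invented for illustration and are not taken from the paper. The toy case shows how majority voting goes wrong when two independently written versions happen to share the same mistake.

```python
from collections import Counter

def vote(versions, value):
    """Run every version on the same input and return (winner, unanimous).

    Results must be hashable so they can be counted.
    """
    results = [version(value) for version in versions]
    winner, count = Counter(results).most_common(1)[0]
    return winner, count == len(results)

# Three hypothetical "independently developed" versions of one spec.
def leap_year_a(year):
    # Correct: handles the century rule and the 400-year exception.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def leap_year_b(year):
    # Bug: forgets that years divisible by 400 are leap years.
    return year % 4 == 0 and year % 100 != 0

def leap_year_c(year):
    # Same bug as version b, written differently by another "team".
    if year % 100 == 0:
        return False
    return year % 4 == 0

versions = [leap_year_a, leap_year_b, leap_year_c]

for year in (2024, 1900, 2000):
    winner, unanimous = vote(versions, year)
    print(year, winner, "unanimous" if unanimous else "disagreement")
```

For 2000 the monitor does notice a disagreement, but the vote settles on the wrong answer, because the two faulty versions outnumber the correct one. Voting only helps if the versions fail on different inputs.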

I've noticed that when I spend time adding redundancy to my codebases, I build defenses on "the wrong side of the house". I (and, I imagine, most devs) write tests that cover the safe path, and the thing that ends up breaking is hardly ever what was expected. Perhaps that's the curse of not operating in the world of physics, where the rules are deadly but clear.