The question of ‘should I add a null pointer check?‘ is a very simple and obvious ‘YES’ to the majority of software developers. Their reasoning is equally simple.
- It can’t do any harm
- It will help prevent a crash
These are valid statements, but the answer is not that simple. Though a crash indicates a poor quality software, an absence of it is no guarantee for good quality. The primary goal of any software is to provide functionality in a reliable and efficient manner. Not crashing, though good, is useless (and often detrimental in engineering applications) if the behaviour is incorrect.
This perspective comes from my experience in building software for engineers. It can simulate assembling and analyzing complex designs with hundreds and thousands of parts and assemblies. These component sizes could range from a large part to small hidden nuts and bolts, and it is visually impossible to confirm the accuracy of the model. There is no room for ‘possibly unknown’ error, as these components will eventually be manufactured and assembled. The cost of a manufacturing error (because of an inaccurate and unreliable software, though it never crashed) is much too great compared to a software crash and reworking the model.
Defensive programming (to prevent a crash) can easily lead to bad software development.
- Hides the root cause
The basic principle of causal analysis is to find and fix the cause rather than treating symptoms (which seldom produces a lasting solution). A root cause is the basic reason why something happens and can be quite distant from the original effect. By addressing only the symptom, the true cause is hidden. It is bound to manifest itself in some other workflow at which point it will be impossible to trace back to the cause.
- Masks other potential issues
In a large software, subsystems will interact with each other, and one sub-system may be incorrectly using another. By being tolerant to bad behaviour, we not only hide the problem but in fact encourage it because the consequences are not fatal.
- Code bloat
It may not be looked upon as code bloat because it’s just a couple of lines. But consider these additional lines in every function and it certainly isn’t negligible. This is a catch 22 situation. There is a need for this null check to prevent a potential crash but the condition should never be hit as it should never happen. In fact you won’t be able to get any code coverage hit on these lines, and if you do, the cause should be fixed which essentially renders these checks as “dead” code.
- Artificially reduced severity
Typically problems are reported based on their severity. So if you morph a crash with a less severe consequence, the user will not realize the significance and will either not report it or try to work around it which simply defers the problem to a later stage and can easily lead to data corruption.
- Sign of poor development
Adding a line of code without knowing when and why it will be executed reflects poorly on the developer.
Some alternatives are generally suggested as it is very hard to accept a fatal error.
Asserts are used to test pre and post conditions but they are not enabled in release builds. No testing is ever done on debug builds (and for a good reason) except by developers (which is limited to a little more than unit testing). So it doesn’t serve as an alternate approach.
- Report as errors and exceptions
Error reporting is a means to inform the user of a problem so that it can be corrected. In these cases, the problem cannot be corrected because we don’t know the cause, and if we did, it should have been fixed in the first place. And because we don’t know the cause, it will have to be a generic error, which is telling the user that we don’t know what is happening in our own code. Error reporting should not be misused to fix coding mistakes.
NULL checks out of paranoia should be avoided. However, there are some legitimate uses of it.
- A NULL can be used to indicate the initial state of an object or the termination of a linked list. In such cases, a null check is of functional value.
- A dynamic cast or query interface used to check for an object type uses NULL as the return type. But it should not be misused to have excessive null check when just getting a interface pointer.
- A heavily used core sub-sysem must be less tolerant to ‘unknown errors’ than one that is lightly exercised. So the core sub-system should have fewer or no paranoid null checks. This may seem very counter intuitive but the idea is that the core functionality should be stable, robust and reliable.
For the faint hearted who feel this approach as too radical, there is sort of a middle ground.
- Don’t add any null checks in the initial implementation until the quality team finishes thorough testing. This way, any issues would get reported and fixed promptly.
- Add any necessary defensive tactics at the very end of the release cycle. But if you are still seeing fatal errors at the end, unfortunately it is just poor quality software.
My recommendation is not to do defensive programming without a reason (which is usually rare). Keep in mind that every line of code is supposed to be hit at some point otherwise it is dead code. The bottom line… don’t fear the crash but leverage it.