
C++ Then and Now

I’ve been working with C++ in large-scale software development for a long time and feel very comfortable with its concepts, syntax and constructs. The much-awaited C++0x standard was finally published in 2011 and formally named C++11. I glanced through the changes and used some of the new features as and when needed. I never realized the magnitude of the changes until I had to debug a third-party math library. There were constructs I didn’t understand, and I was clueless about much of the new syntax; it felt like an unfamiliar language.

What do you think of C++11?
It may be the most frequently asked question. Surprisingly, C++11 feels like a new language: The pieces just fit together better than they used to and I find a “higher-level style of programming” more natural than before and as efficient as ever. If you timidly approach C++ as just a better C or as an object-oriented language, you are going to miss the point. The abstractions are simply more flexible and affordable than before.
– Bjarne Stroustrup –

Below is a contrived code sample that uses many of the new language and library features. If you are able to understand the code, that’s good. If you know how to use it, that’s even better. But if you can justify when and why it should be used, you’ve probably crossed the Modern C++ barrier.

#include <algorithm>
#include <array>
#include <iostream>
#include <optional>
#include <string>
#include <unordered_map>

using UnorderedMap = std::unordered_map<std::string, std::string>;

auto modernCpp1() -> void
{
    // Note: the u8 prefix yields a char8_t literal in C++20, so this line compiles as shown only up to C++17.
    UnorderedMap umap{ {u8"Linux", "path/to/dir"}, {"Windows", R"(path\to\dir)"} };
    for (auto&& [first, second] : umap)
        std::cout << second << "\t" << first << std::endl;
}

[[nodiscard]] auto modernCpp2()
{
    constexpr int len = 2*3;
    std::array<int, len> a = { 0, 1'000, 2'000, 3'000, 4'000, 5'000 };

    auto sum { 0 };
    std::for_each( std::begin(a), std::end(a), [&sum](int val)->void { sum += val; } );

    return sum > 0 ? std::optional<int>(sum) : std::nullopt;
}

int main()
{
    modernCpp1();

    if (auto ret = modernCpp2(); ret.has_value())
        std::cout << "sum = " << ret.value() << std::endl;

    return 0;
}

For the uninitiated, observe the following.

  • using keyword, instead of typedef
  • Trailing return type
  • unordered_map, a new container type
  • String literal enhancements with u8 and R prefix
  • Initialization with braces
  • RValue reference
  • auto type inference
  • Structured binding
  • Range based for loops
  • Attributes
  • constexpr
  • array, a new container type
  • Digit separator
  • Lambda function
  • optional data structure
  • Selection statement with initializer

To have the right context, these are the published C++ standards.

C++98: Major update
C++03: Minor update
C++11: Major update
C++14: Minor update
C++17: Minor update
C++20: Likely a major update
  • The differences between C++98 and C++03 are so few and so technical that they ought not concern users.
  • Modern C++ means C++11 and later, including the draft C++20.
  • References to C++11 typically include C++14 and C++17 as well.

The following are some of the guiding principles of the C++ committee in the development of the new standard. For more details, see the general and specific design goals of C++11.

  • Preferring standard library additions over changes to the language
  • Improving abstraction mechanisms rather than solving narrow use cases
  • Increasing type safety
  • Improving performance
  • Maintaining the zero overhead principle, which means no overhead from unused features
  • Maintaining backwards compatibility

Modern C++ does feel like a new language. The changes are not all incremental; some of the new concepts require a ground-up understanding of the basics. It adds much-needed functionality and addresses many of the “shortcomings” perceived in comparison with other programming languages.

“Then” was abstraction and performance; “Now” is higher-level abstraction and better performance. There are plenty of features that make the code safer and developers more productive. It provides full backward compatibility, so there is no need to panic; adopt it incrementally. Even if you choose not to actively modernize your code base, you may still have to read third-party code. As time goes by, new developers will prefer the modern techniques. Like the transition from C to C++, it will now be Classic C++ to Modern C++. Just don’t get left behind!

Here’s a valid line of code in Modern C++: it defines an empty lambda and invokes it immediately.

[](){}();
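The same pattern is more than a curiosity. An immediately invoked lambda is a common Modern C++ idiom for initializing a const variable with non-trivial logic; here is a minimal sketch (the defaultColors variable is purely illustrative):

#include <string>
#include <vector>

// An immediately invoked lambda (C++14 or later for the deduced return type)
// lets a const variable be initialized with setup logic in a single expression.
const std::vector<std::string> defaultColors = [] {
    std::vector<std::string> colors{ "red", "green", "blue" };
    colors.push_back("alpha");   // any amount of setup logic can go here
    return colors;
}();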

See Modern C++ features for a comprehensive list of all the new language and library changes.

Header file dependencies

A brute force method of reducing superfluous header includes in a project

Header files are part and parcel of the C/C++ programming language. However, the number of header file includes gets out of control very quickly. In most C/C++ based projects, maintaining minimal or optimal header includes is a challenge. Sooner or later, you will find many unnecessary header includes in the source files.

This causes a few problems.

  • It multiplies the time taken to compile the header by the number of translation units that include it.
  • For an incremental (minimal) build, every unnecessary include potentially increases the number of files that get recompiled.
  • Refactoring and reorganizing code becomes difficult.

There are free tools to identify dependencies, but reducing the superfluous dependencies is a painful manual task. Then there are some expensive heavy-duty tools. This, however, is a simple and free alternative; not perfect, but quite effective.

It’s a brute force method which leverages the compiler to identify true dependencies. For each file, it comments out an include and builds the project. If the build succeeds, it is assumed that the header is not required. If the build fails, the include is uncommented (and built again as a sanity check). It is recommended to run it on all the .h files first and then on the .cxx files.
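The core loop is easy to sketch. The following hypothetical C++ outline illustrates the idea (the actual scripts in the repository are Windows batch files; projectBuilds() and its msbuild command line are assumptions for illustration only):

#include <cstdlib>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical build step; the real scripts drive Visual Studio from the command line.
bool projectBuilds()
{
    return std::system("msbuild MyProject.sln /t:Build") == 0;   // assumed command line
}

// Read a file into memory, line by line.
std::vector<std::string> readLines(const std::string& path)
{
    std::ifstream in(path);
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);
    return lines;
}

// Overwrite the file with the given lines.
void writeLines(const std::string& path, const std::vector<std::string>& lines)
{
    std::ofstream out(path);
    for (const auto& line : lines)
        out << line << '\n';
}

void checkHeaderIncludes(const std::string& path)
{
    auto lines = readLines(path);
    for (auto& line : lines)
    {
        if (line.find("#include") == std::string::npos)
            continue;

        const std::string original = line;
        line = "// " + original;                 // comment out this include
        writeLines(path, lines);

        if (projectBuilds())
            std::cout << path << ": '" << original << "' looks unnecessary\n";
        else
            line = original;                     // build broke, so the include stays
    }
    writeLines(path, lines);                     // write the final state; rebuild as a sanity check
}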

The script files can be accessed from GitHub at https://github.com/cognitivewaves/misc/tree/master/check-header-includes.

As mentioned earlier, it’s not perfect, as it does not identify changes in behavior caused by the order of includes when an “unnecessary” header is removed. For example, there may be subtle changes in behavior if a macro is redefined differently depending on #ifdefs from a previous header identified as “not required”. However, it is nice to have a tool which gets rid of the “obvious” and “silly” superfluous dependencies. So it is best to review the identified unnecessary headers before committing the changes.

Currently, the script works only on Windows using Visual Studio projects. But it is easy enough to replicate it on Linux and other compilers.

vi Editor

Why learn Vi?

If you ever have to work on a Linux system, it is well advised to have a basic knowledge of the Vi editor. As anecdotal evidence, read the blog post Stack Overflow: Helping One Million Developers Exit Vim. This is not to be confused with the editor war; here I’m only highlighting the practical benefits of being familiar with Vi.

  • It is installed by default (seen as a standard system utility) and is available on all Linux distributions since it is part of the POSIX standard as well as the Single UNIX Specification. All other editors (including nano, emacs) are optional or additional installations.
  • It is a lightweight application and can work in stripped down versions of Linux.
  • It is a console based text editor which works without a Graphical User Interface. This comes in handy especially when logging into a machine remotely, which is quite common on Linux.
  • It, or at least its key bindings, gets invoked by default by a number of shell commands and tools like man, less, git, etc.

As much as new users find it painful, some users get along fine with vi in small doses. For those coming from a Windows background, learning vi/Vim by comparison with a typical GUI text editor is recommended.

Note that Vi and Vim (Vi IMproved) are not the same. Vim is based on the Vi editor, and is an extended version with many additional features. Vim has nevertheless been described as “very much compatible with Vi“. When possible, install Vim which is an additional package. It is more “user friendly” than standard Vi.

OpenGL – Then and Now

I had spent a fair amount of time on OpenGL about 10 years back, though I wouldn’t call myself an expert. Over those 10 years, I noticed OpenGL evolving and kept pace with it from the outside. Then came WebGL, and I wanted to get my hands dirty. That’s when I realized that I was way out of touch. As they say, the devil is in the details. All the terminology and jargon just wasn’t adding up, so I went back to basics.

Here is an attempt to summarize the evolution and status of OpenGL. It’s not meant to be an introduction to OpenGL but more for those who want to go from “then” to “now” in one page. For more details, see OpenGL – VBO, Shader, VAO.

Background

Traditionally, all graphics processing was done on the CPU, which generated a bitmap (pixel image) in the frame buffer (a portion of RAM) and pushed it to the video display. The Graphics Processing Unit (GPU) changed that paradigm: it is specialized hardware that does the heavy graphics computations. The GPU provided a set of “fixed” functions to perform standard operations on the graphics data, referred to as the Fixed Function Pipeline. Though the Fixed Function Pipeline was fast and efficient, it lacked flexibility. So GPUs introduced the Programmable Pipeline, the programmable alternative to the “hard coded” approach.

Programmable Pipeline, Shaders and GLSL

The Programmable Pipeline requires a program that is “equivalent” to the functions provided by the Fixed Function Pipeline. These programs are called Shaders. The programming language for shaders used to be assembly, but as complexity increased, high-level languages for GPU programming emerged, one of which is the OpenGL Shading Language (GLSL). Like any program, a shader needs to be compiled and linked. However, the shader source is loaded to the GPU, compiled and linked at runtime using APIs provided by OpenGL.
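To make that runtime step concrete, here is a minimal C++ sketch of compiling and linking a shader program with the OpenGL API. It assumes a valid OpenGL context and a function loader (GLEW is used here only as an example), and it omits the info-log queries you would add in real code:

#include <GL/glew.h>   // any loader exposing the OpenGL 2.0+ entry points will do

// Compile a single shader stage from GLSL source text.
GLuint compileShader(GLenum type, const char* source)
{
    GLuint shader = glCreateShader(type);        // e.g. GL_VERTEX_SHADER
    glShaderSource(shader, 1, &source, nullptr); // hand the GLSL text to the driver
    glCompileShader(shader);                     // compiled by the driver at runtime

    GLint status = GL_FALSE;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &status);
    // In real code, fetch glGetShaderInfoLog() here when status == GL_FALSE.
    return shader;
}

// Link a vertex and a fragment shader into a program object.
GLuint buildProgram(const char* vertexSrc, const char* fragmentSrc)
{
    GLuint program = glCreateProgram();
    glAttachShader(program, compileShader(GL_VERTEX_SHADER, vertexSrc));
    glAttachShader(program, compileShader(GL_FRAGMENT_SHADER, fragmentSrc));
    glLinkProgram(program);
    return program;   // activate later with glUseProgram(program)
}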

OpenGL

Desktop OpenGL is generally simply referred to as OpenGL.

OpenGL 1.x provided libraries to compute on the CPU and interfaced with the Fixed Function Pipeline.
OpenGL 2.x added the Programmable Pipeline API.
OpenGL 3.x (and higher) deprecated and then removed the Fixed Function Pipeline.

OpenGL-ES (GLES)

OpenGL ES is OpenGL for Embedded Systems: mobile phones, PDAs and video game consoles; basically, devices with limited computation capability. It consists of well-defined subsets of desktop OpenGL. Desktop graphics card drivers typically did not support the OpenGL ES API directly. However, around 2010 graphics card manufacturers began adding ES support to their desktop drivers, which makes the “ES” in the name somewhat confusing.

OpenGL ES 1.x was defined relative to the OpenGL 1.5 specification, providing fixed function graphics.
OpenGL ES 2.x (and higher) is based on OpenGL 2.0 with the Fixed Function API removed.

WebGL

WebGL is a Programmable Pipeline API, with constructs that are semantically similar to those of the underlying OpenGL ES 2.0 API. It stays very close to the OpenGL ES 2.0 specification, with some concessions made for what developers expect with memory-managed languages such as JavaScript. See WebGL and OpenGL.

Conclusion

So, modern OpenGL is great, except that it makes learning graphics programming harder (much harder). It is generally easier to teach new graphics programmers using the Fixed Function Pipeline. The ideas behind shaders are fairly complicated, and the minimum knowledge required for basic 3D programming (vertex creation, transformation matrices, lighting, etc.) is substantial. There is a lot more code to write and many more places to mess up. A classic fixed-function OpenGL programmer was oblivious to most of these nasty details.

“Then” was the Fixed Function Pipeline and “Now” is the Programmable Pipeline. Much of what was learned then must be abandoned now; programmability wipes out almost all of the fixed function pipeline, so the knowledge does not transfer well. To make matters worse, OpenGL has started to deprecate fixed functionality: in OpenGL 3.2, the Core Profile lacks these fixed-function concepts, while the Compatibility Profile keeps them around.

The transition in terms of code and philosophy is detailed in OpenGL – VBO, Shader, VAO.

The world needs open source

Shared Responsibility

You buy “stuff” every day. Some of it is essential, but much of it you probably don’t need. In any case, imagine two scenarios.

Scenario 1

You go to a store, like the look and feel of something, pick it up and head out of the store. No payment, no receipt, no credit card swipe. Then one day, when you have used it enough and feel that it was worth it, you pay for it, and you pay what you think it was worth. No price tag, no time limits, no collection calls, just your moral obligation.

Scenario 2

You are enticed, cajoled, convinced or fooled into buying it. You pay for it upfront, with a limited warranty on the product, no guarantee of satisfaction and very few options for getting your money back.

Which one would you choose? Obviously scenario 1, isn’t it?
Not just because it is free until you decide to pay for it, but also because YOU are always in control.

Does it sound too idealistic? Are there even such products and services?
Yes, and many that you are likely using quite regularly too, but may not even be aware of it.

Open source software is modeled exactly on the first scenario. Furthermore, many of these are ad-free. Do you rely on Wikipedia, or use Mozilla Firefox or prefer Linux (more accurately GNU/Linux) or any of the thousands of “free” software out there?

Most people, including myself, agree with this idea of our shared responsibility towards the systems and software that are made available to us for “free”. We all understand that there is a cost (monetary, manpower, administration, etc.), and hence it is not free in the true sense. Someone, somewhere, is paying for it. Someone has taken up the burden of our missing contribution, however minuscule it may be.

Yet, when it comes to acting on it, we defer, procrastinate and finally pass on it, expecting and hoping that someone else will sustain it. I was no exception. I would go places and spend on food and drinks that were more expensive than they were worth, but didn’t make the much-needed contribution. It is not that the monetary contribution has to be much, and yet we don’t. This is bystander apathy, a very regressive attitude for a society.

Finally, in November 2013, I committed myself to contributing as little as USD 10 to a few of the software projects that I use regularly. I did not go bankrupt (obviously) and life is better now that I have fulfilled my shared responsibility. Having taken that first step, I am now committed to contributing every year.

If everyone reading this chipped in $3, we would be supported for another year – Mozilla Firefox

If all our past donors simply gave again today, we wouldn’t have to worry about fundraising for the rest of the year – Jimmy Wales, Wikipedia

Such ecosystems can only exist and sustain with voluntary collective contributions. There are many ways to participate but financial contribution is important. The ball is in your court. Participate in any way possible and fulfill your shared responsibility. I promise you, take that first step and make that contribution. It will give you a sense of satisfaction.

Null Pointer

NULL pointer – to check or not to check is the question

The question ‘should I add a null pointer check?‘ gets a very simple and obvious ‘YES’ from the majority of software developers. Their reasoning is equally simple.

  • It can’t do any harm
  • It will help prevent a crash

These are valid statements, but the answer is not that simple. Though a crash indicates poor quality software, its absence is no guarantee of good quality. The primary goal of any software is to provide functionality in a reliable and efficient manner. Not crashing, though good, is useless (and often detrimental in engineering applications) if the behaviour is incorrect.

This perspective comes from my experience building software for engineers, which simulates assembling and analyzing complex designs with hundreds or thousands of parts and assemblies. Component sizes can range from large parts to small hidden nuts and bolts, and it is visually impossible to confirm the accuracy of the model. There is no room for ‘possibly unknown’ errors, as these components will eventually be manufactured and assembled. The cost of a manufacturing error (caused by inaccurate, unreliable software that never crashed) is much too great compared to a software crash and reworking the model.

Defensive programming (to prevent a crash) can easily lead to bad software development.

  1. Hides the root cause
    The basic principle of causal analysis is to find and fix the cause rather than treating symptoms (which seldom produces a lasting solution). A root cause is the basic reason why something happens and can be quite distant from the original effect. By addressing only the symptom, the true cause is hidden. It is bound to manifest itself in some other workflow at which point it will be impossible to trace back to the cause.
  2. Masks other potential issues
    In a large software system, subsystems interact with each other, and one sub-system may be incorrectly using another. By being tolerant of bad behaviour, we not only hide the problem but in fact encourage it, because the consequences are not fatal.
  3. Code bloat
    It may not look like code bloat because it’s just a couple of lines. But consider these additional lines in every function and it certainly isn’t negligible. This is a catch-22 situation: the null check exists to prevent a potential crash, but the condition should never be hit because it should never happen. In fact, you won’t get any code coverage on these lines, and if you do, the cause should be fixed, which essentially renders these checks “dead” code.
  4. Artificially reduced severity
    Typically, problems are reported based on their severity. So if you morph a crash into a less severe consequence, the user will not realize its significance and will either not report it or try to work around it, which simply defers the problem to a later stage and can easily lead to data corruption.
  5. Sign of poor development
    Adding a line of code without knowing when and why it will be executed reflects poorly on the developer.

Some alternatives are generally suggested as it is very hard to accept a fatal error.

  • Asserts
    Asserts are used to test pre- and post-conditions, but they are not enabled in release builds. No testing is ever done on debug builds (and for good reason) except by developers (which amounts to little more than unit testing). So asserts don’t serve as an alternative approach.
  • Report as errors and exceptions
    Error reporting is a means to inform the user of a problem so that it can be corrected. In these cases, the problem cannot be corrected because we don’t know the cause, and if we did, it should have been fixed in the first place. And because we don’t know the cause, it will have to be a generic error, which amounts to telling the user that we don’t know what is happening in our own code. Error reporting should not be misused to compensate for coding mistakes.

NULL checks out of paranoia should be avoided. However, there are some legitimate uses.

  • A NULL can be used to indicate the initial state of an object or the termination of a linked list. In such cases, a null check is of functional value.
  • A dynamic cast or query interface used to check for an object type uses NULL as the return value. But it should not be misused to add excessive null checks when simply getting an interface pointer (see the sketch after this list).
  • A heavily used core sub-system must be less tolerant of ‘unknown errors’ than one that is lightly exercised. So the core sub-system should have fewer or no paranoid null checks. This may seem very counter-intuitive, but the idea is that the core functionality should be stable, robust and reliable.
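To make the distinction concrete, here is a small illustrative C++ sketch (the Shape and Circle types are made up for the example). The first function’s null check answers a genuine question and has functional value; the second expresses its contract with a reference instead of hiding a caller’s bug behind a paranoid check.

struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };

// Legitimate: dynamic_cast returns nullptr to answer a real question
// ("is this shape a circle?"), so the check carries functional value.
double circleAreaOrZero(const Shape* shape)
{
    if (auto* circle = dynamic_cast<const Circle*>(shape))
        return 3.14159265 * circle->radius * circle->radius;
    return 0.0;
}

// Paranoid check avoided: the caller is required to pass a valid object,
// so the contract is expressed with a reference. A silent early return on
// null here would only hide the caller's bug.
void scaleInPlace(Circle& circle, double factor)
{
    circle.radius *= factor;
}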

For the faint-hearted who feel this approach is too radical, there is a middle ground of sorts.

  • Don’t add any null checks in the initial implementation until the quality team finishes thorough testing. This way, any issues would get reported and fixed promptly.
  • Add any necessary defensive tactics at the very end of the release cycle. But if you are still seeing fatal errors at the end, unfortunately it is just poor quality software.

My recommendation is not to do defensive programming without a reason (and legitimate reasons are rare). Keep in mind that every line of code is supposed to be hit at some point; otherwise it is dead code. The bottom line… don’t fear the crash but leverage it.

Cognitive traps

Cognitive Traps

A very interesting perspective into our typical thought process.

From “The Ascent of Money” by Niall Ferguson

Availability bias, which causes us to base decisions on information that is more readily available in our memories, rather than the data we really need.

Hindsight bias, which causes us to attach higher probabilities to events after they have happened (ex post) than we did before they happened (ex ante).

The problem of induction, which leads us to formulate general rules on the basis of insufficient information.

The fallacy of conjunction (or disjunction), which means we tend to overestimate the probability that seven events of 90 percent probability will all occur, while underestimating the probability that at least one of seven events of 10 percent probability will occur.

Confirmation bias, which inclines us to look for confirming evidence of an initial hypothesis, rather than falsifying evidence that would disprove it.

Contamination effects, whereby we allow irrelevant but proximate information to influence a decision.

The affect heuristic, whereby preconceived value judgements interfere with our assessment of costs and benefits.

Scope neglect, which prevents us from proportionately adjusting what we should be willing to sacrifice to avoid harms of different orders of magnitude.

Overconfidence in calibration, which leads us to underestimate the confidence intervals within which our estimates will be robust (e.g. to conflate the ‘best case’ scenario with the ‘most probable’).

Bystander apathy, which inclines us to abdicate individual responsibility when in a crowd.

CMake for Visual Studio Developers

Visual Studio is a good Integrated Development Environment (IDE) for C++. However, its build system with projects and solutions is rather clunky.

  • The layout and content are not readable because of the excessive XML tags; the format is more machine friendly than human friendly. Trying to figure out the changes made as part of a commit is often a nightmare.
  • The UI to modify the project file (Project -> Properties) is cluttered. Finding options in the various tabs and tree items can be a challenge.
  • Compatibility between different versions of Visual Studio is quite messy too. It’s high maintenance if you have to support multiple versions of Visual Studio.
  • Projects don’t transfer well when there is a change in working directory path.

CMake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice.

Its configuration files (CMakeLists.txt) are simpler and cleaner, a good alternative to native Visual Studio projects and solutions. Except for the generation step, there is no impact on your development process and workflow.
Step 1: Generate Visual Studio workspace
Step 2: Open the generated Visual Studio solution and work in the IDE as usual.
Only when you have to make project changes do you have to edit the CMakeLists.txt and repeat step 1 to regenerate the Visual Studio workspace.

  • Being plain text, it is readable. Viewing commit differences is just like viewing source code.
  • It gives more flexibility to organize and structure your code and dependencies.
  • It provides a neutral interface, independent of Visual Studio nuances between versions, making it very easy to support multiple versions of Visual Studio.
  • During generation, CMake adjusts to any working directory path.
  • It’s a good toolchain to manage a cross platform build system.

However, for someone from a Windows-only Visual Studio development environment (without much experience with Linux and classic Makefiles), CMake can be a little intimidating. The language is much more “friendly” to those familiar with Linux-style scripting. The documentation, though quite detailed, is difficult to parse until you know the basics and know what to look for. Being able to map the CMake commands to the Visual Studio IDE makes learning much easier. See CMake and Visual Studio for an introduction with an example.

So with CMake you have a better build system and the comfort of working in the familiar Visual Studio environment. It was all good, until Microsoft decided to obfuscate the CMake philosophy.

Visual Studio 2017 introduces built-in support for handling CMake projects. This makes it a lot simpler to develop C++ projects built with CMake without the need to generate VS projects and solutions from the command line. – Microsoft

Built-in support to invoke one command-line statement to generate VS projects? There was nothing complicated there to simplify.
Anyway, I decided to give it a try, only to find out that they have opened a can of worms that didn’t exist before.

  • It works in the context of Visual Studio’s new “Open Folder” feature rather than their established project/solution workspace. So you will have to learn the new environment.
  • There are additional settings files to override CMake configuration which will be a source of duplication and confusion.
  • It uses the Ninja build system by default. To change it to VS projects, you have to adjust a settings file.
  • To launch the debugger for the startup executable, you have to introduce a launch file.
  • It uses an MS version of CMake by default, but this can be changed in a settings file.
  • CTest is run from a new Test menu.

All these extra complexities were not required. All they had to do was provide syntax highlighting for CMakeLists.txt, which was missing. I did not find any reason to use their new built-in support.

CMake’s design is to be non-intrusive. Microsoft being Microsoft, they have decided to blur those lines and make you dependent on their ecosystem. Use their built-in support at your own risk.

Software Architecture – Art, Science and Engineering

Structural architecture defines how a building presents itself to the world; software architecture does the same for an application. It has a role from the very early stages of conception to the end of the development cycle.

It is like a painting that comes to life from a blank canvas. It begins with a concept on an empty page. Iterative changes and collaboration with other experts give it an outline. Technological advances provide opportunities for exploration, while the same technology can limit the boundaries. The outline then evolves into a well-defined structure with tangible components. All this requires the skills of art, science and engineering.

Art is rather abstract and fluid. Such a mindset is critical in the early stages of defining a software architecture.

Science is often experimental and forward looking. These traits are essential to explore new frontiers in technology.

Engineering is a disciplined approach to adapting proven concepts. It is necessary to guarantee practical success.