Is static code analysis possible without false positives?


    Today on one of the forums, during a discussion of the PVS-Studio static analyzer, someone asked the following question:

    Tell me, do you have a mode that guarantees the absence of false positives? Let there be fewer checks in this mode, but no false positives at all. The fact is that when I was looking for an analyzer for our projects and was going to make the analysis part of CI/CD, all the commercial analyzers I tested were rejected precisely because of such warnings. In addition, the C++ team was weak and I couldn't spend my time digging into each warning. By the way, at that point the budget could go very far, the price was no object at all.

    This question can be answered both yes and no. Let's try to figure it out.

    If a static analyzer creator claims that the tool they've developed doesn't issue false positives, this means one of two things:

    1. This is a lie;
    2. The tool is limited to a narrow set of diagnostics that really never produce false positives. Unfortunately, almost all of them serve only as guidelines on code formatting.

    Any static code analyzer that performs in-depth analysis is, by its very nature, subject to false positives. There is nothing to be done about it. It is worth mentioning Rice's theorem, which states that every non-trivial semantic property of programs is undecidable. In other words, in the general case one program cannot determine whether another program behaves correctly.

    However, it's true that a number of diagnostics work without false positives. Many such diagnostics are described, for example, in the MISRA standard. Indeed, if you enable the MISRA warnings in PVS-Studio and disable the rest, all the warnings will be relevant. Well, maybe not all, but 99% for sure. But what kind of warnings are these?

    These are messages that, for example, report the use of octal numbers (V2501) or warn that the goto statement shouldn't be used (V2502). Everything is clear and honest. If the code contains an octal number or a goto, you will see the warning. Not a single false positive!
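    To make these two rules concrete, here is a small C++ sketch of the constructs they flag. The names kOctalLooking and sum_until_negative are made up for illustration and come from neither MISRA nor PVS-Studio. Both rules are purely syntactic: the construct is either present in the source or it is not, so the analyzer has nothing to guess.

```cpp
#include <cassert>

// A leading zero makes an integer literal octal in C and C++.
// An "octal literal" rule flags this line every time, with no guessing.
constexpr int kOctalLooking = 010;  // the value is 8, not 10
static_assert(kOctalLooking == 8, "010 is an octal literal");

// Hypothetical helper: sums the elements up to the first negative one.
int sum_until_negative(const int* data, int size) {
    assert(size >= 0);
    int sum = 0;
    for (int i = 0; i < size; ++i) {
        if (data[i] < 0) {
            goto done;  // a "no goto" rule flags this line unconditionally
        }
        sum += data[i];
    }
done:
    return sum;
}
```

    Whether the octal literal or the goto is actually a bug here is a separate question. The rule only promises that the construct is present, which is why it never produces a false positive.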

    However, is this what you want to get from an advanced static code analyzer? I don't think so.

    I do not criticize the MISRA coding standard. It has its own field of application, where it is important and useful. However, in ordinary application development, warnings such as "do not use the new operator" (V2511) are of little interest. What you want instead is, for example, warnings about typos. When it comes to typos, we can never be 100% sure that an error was found. Perhaps the author deliberately wrote the suspicious expression. False positives are inevitable here.
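    A hedged illustration of why typo detection cannot be free of false positives: the two functions below are hypothetical and have identical bodies, yet in the first one the repeated b.x is a typo and in the second one it is exactly what the author meant. This sketch does not claim that any particular PVS-Studio diagnostic fires on it. It only shows that the expression alone does not reveal the intent.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
};

// Intent: "are the two points equal?"
// The second b.x is a typo for b.y, so this is a real bug.
constexpr bool points_equal(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.x;
}

// Intent: "does a lie at (v, v), where v is b.x?"
// The very same expression, but here it is deliberate.
constexpr bool on_diagonal_at(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.x;
}

// The typo makes two identical points compare as different.
static_assert(!points_equal(Point{1, 2}, Point{1, 2}),
              "bug: equal points are reported as unequal");
```

    An analyzer that flags the repeated operand finds a real bug in the first function and produces a false positive in the second.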

    I hope this clarifies the situation. But now the question arises: what to do next? How do you introduce a static analyzer into CI/CD without suffering from false positives? There are answers to that too.

    I will quote a fragment of the article "Reasons to introduce the PVS-Studio static code analyzer into the development process".

    It's not so scary. There are at least three approaches that allow you to introduce static analysis smoothly even into large old projects.

    First approach. The «Ratchet Method», covered by Ivan Ponomarev in his thorough article "Introduce Static Analysis in the Process, Don't Just Search for Bugs with It".
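    A minimal sketch of the ratchet idea as it is commonly described: CI stores the number of warnings from the last accepted run, rejects any commit that raises the number, and lowers the stored limit whenever the number drops. The RatchetResult type and the apply_ratchet function are hypothetical names used only for this illustration. The cited article remains the authoritative description of the method.

```cpp
#include <cassert>

// Outcome of one CI run under the ratchet rule.
struct RatchetResult {
    bool build_passes;  // false if the warning count grew
    int new_limit;      // limit to store for the next run
};

// stored_limit: warning count accepted by the previous successful run.
// current_warnings: warning count reported for the current commit.
RatchetResult apply_ratchet(int stored_limit, int current_warnings) {
    assert(stored_limit >= 0 && current_warnings >= 0);
    if (current_warnings > stored_limit) {
        return {false, stored_limit};  // regression: fail and keep the old limit
    }
    return {true, current_warnings};   // the limit only ever moves down
}
```

    For example, with a stored limit of 120, a commit that produces 123 warnings fails the build, while a commit that produces 115 passes and makes 115 the new limit.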

    Second approach. To help customers start using static analysis quickly, we suggest that PVS-Studio users take advantage of the "markup base". The general idea is as follows. Imagine that a user has run the analyzer and received many warnings. If the project has been in development for many years, is alive, is still evolving and making money, then most likely the report will not contain many warnings that point to critical defects. In other words, the critical bugs have already been fixed by more expensive means or thanks to customer feedback. Thus, everything the analyzer finds now can be considered technical debt, which is impractical to try to eliminate immediately.

    You can tell PVS-Studio to treat all these warnings as irrelevant for now (postponing the technical debt) and not to show them any more. The analyzer creates a special file where it stores information about these uninteresting warnings. From then on, PVS-Studio issues warnings only for new or modified code. All of this is implemented quite smartly: if an empty line is added at the beginning of a .cpp file, the analyzer correctly concludes that nothing has really changed and stays quiet. You can put the markup file under version control. The file is large, but that is not a problem, because there is no need to commit it very often.

    From this point, developers will see only warnings related to newly written or modified code. So you can start using the analyzer, as they say, from the next day. You can get back to technical debt later and gradually correct errors and tweak the analyzer.
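    One way such a baseline could stay stable when lines shift is to identify each warning by the text around it rather than by its line number. The sketch below illustrates that general idea only. It is not PVS-Studio's actual file format or algorithm, and Warning, Baseline, fingerprint, make_baseline and new_warnings are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical warning record: diagnostic code plus a 0-based line index.
struct Warning {
    std::string code;
    std::size_t line;
};

using Baseline = std::set<std::string>;

// Identify a warning by its code and by the text of the flagged line and
// its two neighbours. The line number is deliberately left out.
std::string fingerprint(const std::vector<std::string>& src, const Warning& w) {
    assert(w.line < src.size());
    const std::string prev = w.line > 0 ? src[w.line - 1] : std::string();
    const std::string next = w.line + 1 < src.size() ? src[w.line + 1] : std::string();
    return w.code + '\n' + prev + '\n' + src[w.line] + '\n' + next;
}

// Record every warning of the first run as postponed technical debt.
Baseline make_baseline(const std::vector<std::string>& src,
                       const std::vector<Warning>& warnings) {
    Baseline base;
    for (const Warning& w : warnings) {
        base.insert(fingerprint(src, w));
    }
    return base;
}

// Keep only the warnings whose fingerprint is not in the baseline.
std::vector<Warning> new_warnings(const Baseline& base,
                                  const std::vector<std::string>& src,
                                  const std::vector<Warning>& warnings) {
    std::vector<Warning> fresh;
    for (const Warning& w : warnings) {
        if (base.count(fingerprint(src, w)) == 0) {
            fresh.push_back(w);
        }
    }
    return fresh;
}
```

    With this scheme, an empty line added at the top of a file shifts every line number but leaves the fingerprints of the old warnings unchanged, so they stay suppressed. The sketch is deliberately naive: two identical lines with identical neighbours share one fingerprint, and editing a neighbouring line makes an old warning look new.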

    Third approach. You can sign a contract with us and delegate the work of setting up and integrating static analysis to our team. An example of this practice: "How the PVS-Studio Team Improved Unreal Engine's Code". :)

    Thank you for your attention.
    PVS-Studio
    Static Code Analysis for C, C++, C# and Java

    Comments 4

      By the way, at that point the budget could go very far, the price was no object at all.
      I think you might consider creating an online service that validates warnings.

      For a small price, your support team would validate a new client's warnings based on some criteria (for example, only the high-probability ones) and suggest how to fix them.

      It is very similar to the service offered by some companies that do security scans.

      I bet that after some time your support team will become very efficient and maybe even come up with ideas on how to automate most of the validations and suggestions.

        We advise our clients, help them configure the analyzer and explain unclear warnings either way. A more advanced form of cooperation is when we fix the errors ourselves. As I understand it, you are talking about an intermediate option. I'm not sure this service would be in demand. We'll consider it. Thanks.
          Based on your explanation, you have it pretty much covered.
          explain unclear warnings either way
          I was wondering, do you have something like decision-making logs from the analyzer for a warning, i.e. detailed steps explaining why something was considered a warning? It might be easier for a developer to find an issue if they had more info.
            The analyzer warnings are divided into levels of certainty. The article "The way static analyzers fight against false positives, and why they do it" covers in detail how a level is assigned. Each diagnostic comes with a description that includes examples of correct and incorrect code. The description also provides a reference to CWE, which lets you look at the warning from another angle. In addition, the description links to examples of real errors found in open source projects.

            In most cases, all this is enough for users to get through the warnings. However, if something really strange is happening, a user can always reach out to us. Example: "False Positives in PVS-Studio: How Deep the Rabbit Hole Goes".
