The primary purpose of a static analyzer is to find errors that developers miss. Recently, the PVS-Studio team came across another interesting example that shows the power of static analysis.
You have to be very attentive when working with static analysis tools. Often the code that triggered the analyzer seems to be correct, so you are tempted to mark the warning as a false positive. The other day, we fell into such a trap. Here's how it turned out.
Recently, we enhanced the analyzer core. When reviewing the new warnings, my colleague found one that looked false. He flagged the warning for the team leader, who glanced through the code and created a task. I took the task. That's how three programmers ended up looking at the same code.
The analyzer warning: V645 The 'strncat' function call could lead to the 'a.consoleText' buffer overflow. The bounds should not contain the size of the buffer, but a number of characters it can hold.
The code fragment:
struct A
{
  char consoleText[512];
};
void foo(A a)
{
  char inputBuffer[1024];
  ....
  strncat(a.consoleText, inputBuffer, sizeof(a.consoleText) -
                                      strlen(a.consoleText) - 5);
  ....
}
Before we take a look at the example, let's recall what the strncat function does:
char *strncat(
  char *strDest,
  const char *strSource,
  size_t count
);
where:
'strDest': pointer to the null-terminated string to append to;
'strSource': pointer to the string to copy from;
'count': maximum number of characters to copy; the terminating null character is written in addition to them.
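To make the meaning of the third argument concrete, here is a minimal sketch. The helper appendedLength and the 16-byte buffer are made up for illustration and are not part of the analyzed project. It shows that strncat stops after count characters or at the end of the source string, whichever comes first, and then writes a terminating null. So the largest safe count is the buffer size minus the current length minus one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper for illustration only: appends `src` to a 16-byte
// buffer that already holds "abc" and returns the resulting string length.
std::size_t appendedLength(const char *src, std::size_t count)
{
    char dest[16] = "abc";

    // The largest safe count leaves one byte for the terminating null,
    // which strncat always writes after the copied characters.
    assert(count <= sizeof(dest) - std::strlen(dest) - 1);

    std::strncat(dest, src, count);
    return std::strlen(dest);
}
```

For instance, appendedLength("0123456789", 4) copies only four characters, while appendedLength("xy", 12) stops at the end of the source string.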
At first glance, the code looks fine. It calculates the amount of free space in the buffer, and it even seems to leave 4 extra bytes... We thought the code was written correctly, so we noted it as an example of a false warning.
Let's see if this is really the case. In the expression:
sizeof(a.consoleText) - strlen(a.consoleText) - 5
the maximum value can be reached with the minimum value of the second operand:
strlen(a.consoleText) = 0
Then the result is 507, and no overflow happens. Why does PVS-Studio issue the warning? Let's delve into the analyzer's internal mechanics and try to figure it out.
Static analyzers use data-flow analysis to evaluate such expressions. In most cases, if an expression consists of compile-time constants, data flow returns its exact value. Otherwise, as in the case of this warning, data flow returns only a range of possible values.
In this case, the value of the strlen(a.consoleText) operand is unknown at compile time. Let's look at the range.
After a few minutes of debugging, we get not one range but two:
[0, 507] U [0xFFFFFFFFFFFFFFFC, 0xFFFFFFFFFFFFFFFF]
The second range seems redundant. However, that's not so. We forgot that the mathematical result of the expression can be negative. For example, this happens if strlen(a.consoleText) = 508: the result would be -1. Since the operands have the unsigned type size_t, the value wraps around, and the expression evaluates to the maximum value of size_t. String lengths from 508 to 511 produce the four values of the second range.
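The boundary values of both ranges can be checked with a few lines of code. This is only a sketch: kBufferSize, countFor, and checkBoundaryValues are names invented here, and the constant 512 stands in for sizeof(a.consoleText).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stands in for sizeof(a.consoleText) from the fragment.
constexpr std::size_t kBufferSize = 512;

// The third strncat argument from the fragment for a given string length.
// All operands are unsigned, so a result below zero wraps around.
constexpr std::size_t countFor(std::size_t length)
{
    return kBufferSize - length - 5;
}

// Checks the ends of both ranges reported by the data-flow analysis.
void checkBoundaryValues()
{
    assert(countFor(0) == 507);              // upper end of the first range
    assert(countFor(507) == 0);              // lower end of the first range
    assert(countFor(508) == SIZE_MAX);       // -1 wrapped around
    assert(countFor(511) == SIZE_MAX - 3);   // -4; 0xFFFFFFFFFFFFFFFC if size_t is 64-bit
}
```

A length of 511 is the largest one a null-terminated string in a 512-byte buffer can have, which is why the second range contains exactly four values.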
It turns out that the analyzer is right! With such a count, strncat may write far more characters into the consoleText field than it can store. This leads to a buffer overflow and to undefined behavior. So the warning that surprised us is not a false positive after all!
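One possible way to avoid the wraparound is to check the free space before subtracting. The sketch below is not the fix applied in the analyzed project; appendBounded and its parameters are hypothetical, and it assumes that dest already holds a null-terminated string inside a buffer of destSize bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not from the article. Appends `src` to the
// null-terminated string in `dest` (a buffer of `destSize` bytes) while
// keeping `reserve` bytes free. Returns the number of characters appended.
std::size_t appendBounded(char *dest, std::size_t destSize,
                          const char *src, std::size_t reserve)
{
    assert(dest != nullptr && src != nullptr);

    const std::size_t used = std::strlen(dest);

    // Bytes that are not available for new characters:
    // the current text, the terminating null, and the reserve.
    const std::size_t needed = used + 1 + reserve;
    if (needed >= destSize)
        return 0;  // no room: skip the call instead of letting the count wrap

    // Cannot wrap here because needed < destSize.
    const std::size_t count = destSize - needed;
    std::strncat(dest, src, count);
    return std::strlen(dest) - used;
}
```

Under these assumptions, the call from the fragment would become appendBounded(a.consoleText, sizeof(a.consoleText), inputBuffer, 4), where the reserve of 4 plus the terminating null matches the original constant 5.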
That's how we got one more reminder of the key advantage of static analysis: the tool is much more attentive than a person. A thoughtful review of the analyzer's warnings saves developers time and effort in debugging. It also protects against errors and snap judgments.