COVID-19 Research and Uninitialized Variable

    0796_covid_sim/image1.png
    There is an open project COVID-19 CovidSim Model, written in C++. There is also a PVS-Studio static code analyzer that detects errors very well. One day they met. Embrace the fragility of mathematical modeling algorithms and why you need to make every effort to enhance the code quality.


    This little story begins with my ordinary search on GitHub. While looking through the search results, I accidentally came across the COVID-19 CovidSim Model project. Without thinking twice, I decided to check it using the PVS-Studio analyzer.


    The project turned out to be tiny. It contains only 13,000 lines of code, not counting empty lines and comments. And there are almost no errors there either. But one mistake is so simple and beautiful that I couldn't pass it by!


    void CalcLikelihood(int run, std::string const& DataFile,
                        std::string const& OutFileBase)
    {
      ....
      double m = Data[row][col]; // numerator
      double N = Data[row][col + 1]; // denominator
      double ModelValue;
      // loop over all days of infection up to day of sample
      for (int k = offset; k < day; k++)
      {
        // add P1 to P2 to prevent degeneracy
        double prob_seroconvert = P.SeroConvMaxSens *
          (1.0 - 0.5 * ((exp(-((double)(_I64(day) - k)) * P.SeroConvP1) + 1.0) *
          exp(-((double)(_I64(day) - k)) * P.SeroConvP2)));
        ModelValue += c * TimeSeries[k - offset].incI * prob_seroconvert;
      }
      ModelValue += c * TimeSeries[day - offset].S * (1.0 - P.SeroConvSpec);
      ModelValue /= ((double)P.PopSize);
      // subtract saturated likelihood
      LL += m * log((ModelValue + 1e-20) / (m / N + 1e-20)) +
            (N - m) * log((1.0 - ModelValue + 1e-20) / (1.0 - m / N + 1e-20));
      ....
    }

    Serious scientific code. Something is calculated. Formulas. Everything looks smart and detailed.


    But all these calculations shattered into pieces by human inattention. It's good that the PVS-Studio code analyzer can come to the rescue and point out the bug: V614 [CWE-457] Uninitialized variable 'modelValue' used. CovidSim.cpp 5412


    Indeed, let's take a closer look at it:


    double ModelValue;
    for (int k = offset; k < day; k++)
    {
      double prob_seroconvert = ....;
      ModelValue += c * TimeSeries[k - offset].incI * prob_seroconvert;
    }

    We are facing a simple and at the same time terrible error: an uninitialized variable. This algorithm can calculate anything.


    Well, that's it. There is nothing to explain here. It only remains to remind again that developers of scientific libraries and scientific applications should make additional efforts to ensure the code quality. Crash of an ordinary application is likely to cost much less than the use of incorrect results for scientific, medical, and other calculations.


    This is not our first article on this topic:



    Use the PVS-Studio static code analyzer! When errors are timely detected you can expect enormous benefits. Thanks for your attention!

    PVS-Studio
    Static Code Analysis for C, C++, C# and Java

    Comments 4

      0
      «It's gonna be Ok!», it should be something near zero, with some probability (near zero).
        0
        By the way, there is also such a funny fragment:
        void DigitalContactTracingSweep(double t)
        {
          ....
          //check every day to see if contacts become index cases - but they have to be infectious. Otherwise we could set the trigger time and cause their contacts to be traced when they are not being traced themselves.
          if ((Hosts[contact].index_case_dct == 0) && (
            Hosts[contact].is_infectious_almost_symptomatic() ||
            Hosts[contact].is_case() ||
            Hosts[contact].is_infectious_almost_symptomatic()))
          ....
        }
        

        The PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions 'Hosts[contact].is_infectious_almost_symptomatic()' to the left and to the right of the '| | ' operator. Sweep.cpp 1354
        The is_infectious_almost_symptomatic function is constant and does not change anything:
        bool Person::is_infectious_almost_symptomatic() const
        {
          return this->inf == InfStat::InfectiousAlmostSymptomatic;
        }
        

        So, there's no point in calling it twice for sure. Another matter is that it may not be the error but just an accidentally duplicated check. But maybe it's the error. Nearby there is the following code:
        if (Hosts[contact].is_infectious_asymptomatic_not_case() ||
          Hosts[contact].is_case() ||
          Hosts[contact].is_infectious_almost_symptomatic())
        

        To tell more precisely, I need, for sure, to perceive the project. I can't understand it superficially.
          0
          The story turned out to have another continuation: www.viva64.com/en/b/0802

        Only users with full accounts can post comments. Log in, please.