Pull to refresh
404.16
PVS-Studio
Static Code Analysis for C, C++, C# and Java

PVS-Studio checks the code quality in the .NET Foundation projects: LINQ to DB

Reading time11 min
Views591

The .NET Foundation is an independent organization, created by Microsoft, to support open-source projects around the DotNet platform. Currently, the organization gathered many libraries under its wing. We have already tested some of these libraries with the help of PVS-Studio. The next project to check with the analyzer - LINQ to DB.

Introduction

LINQ to DB is a database access framework based on LINQ. LINQ to DB has collected the best of its predecessors. It allows you to work with various DBMS, whereas LINQ to SQL back in the day allowed you to work only with MS SQL. It's not as heavy and complicated as LINQ to SQL or Entity Framework. LINQ to DB provides more control and quick access to data. The framework is not that big: it's written in C# and contains more than 40,000 lines of code.

LINQ to DB is also one of the .NET Foundation projects. We have previously checked the projects of this organization: Windows Forms, Xamarin.Forms, Teleric UI for UWP, etc.

A little less conversation, a little more action! Let's check the LINQ to DB code taken from the official repository on GitHub. With the help of our PVS-Studio static analyzer, we will see if everything is fine with the LINQ's successor.

Deja Vu

Let me start, probably, with the most common cases that every developer encountered at least once: duplicate code.

V3001 There are identical sub-expressions 'genericDefinition == typeof(Tuple<,,,,,,,>)' to the left and to the right of the '||' operator. TypeExtensions.cs 230

public static bool IsTupleType(this Type type)
{
  ....
  if (genericDefinition    == typeof(Tuple<>)
        || genericDefinition == typeof(Tuple<;,>)
        || genericDefinition == typeof(Tuple<,,>)
        || genericDefinition == typeof(Tuple<,,,>)
        || genericDefinition == typeof(Tuple<,,,,>)
        || genericDefinition == typeof(Tuple<,,,,,>)
        || genericDefinition == typeof(Tuple<,,,,,,>)
        || genericDefinition == typeof(Tuple<,,,,,,,>)
        || genericDefinition == typeof(Tuple<,,,,,,,>))
  {
    return true;
  }
  ....
}

The first message of the analyzer caught my eye. Those who use tuples infrequently may think that this is a common consequence of copy-paste. Without hesitation, we can assume that a developer missed a comma in the last line of the Tuple<,,,,,,,> condition. However, even the Visual Studio's functionality showed me I was wrong.

Tuples in C# are divided into 8 types according to the number of elements. 7 of them differ only in a different number of elements, from 1 to 7, respectively. In this case, they correspond to the first seven lines in the condition. And the last one, Tuple<,,,,,,,>, includes 8 or more elements.

As a result, when trying to write Tuple<,,,,,,,,>, Visual Studio tells that there is no such tuple. Turns out that in the example above, there is an extra check for the variable correspondence with the Tuple<,,,,,,,> type, and not the missing comma, as it seemed initially.

But the next analyzer warning that caught my eye, has already raised a couple of questions.

V3003 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 256, 273. SqlPredicate.cs 256

public ISqlPredicate Reduce(EvaluationContext context)
{
  ....
  if (Operator == Operator.Equal)
  {
    ....
  }
  else
  if (Operator == Operator.NotEqual)
  {
    search.Conditions.Add(
      new SqlCondition(false, predicate, true));
    search.Conditions.Add(
      new SqlCondition(false, new IsNull(Expr1, false), false));
    search.Conditions.Add(
      new SqlCondition(false, new IsNull(Expr2, true), true));
    search.Conditions.Add(
      new SqlCondition(false, new IsNull(Expr1, true), false));
    search.Conditions.Add(
      new SqlCondition(false, new IsNull(Expr2, false), false));
  }
  else
  if (Operator == Operator.LessOrEqual || 
      Operator == Operator.GreaterOrEqual)
  {
    ....
  }
  else if (Operator == Operator.NotEqual)
  {
    search.Conditions.Add(
      new SqlCondition(false, predicate, true));
    search.Conditions.Add(
      new SqlCondition(false, new IsNull(Expr1, false), false));
    search.Conditions.Add(
      new SqlCondition(false, new IsNull(Expr2, false), false));
  }
  else
  {
    ....
  }
  ....
}

According to the analyzer, there are two branches with the same conditions in the fragment. That's why the second condition is always false. By the way, this is also indirectly indicated by another analyzer message: V3022 Expression 'Operator == Operator.NotEqual' is always false. SqlPredicate.cs 273.

In the example, we see the repetition of the Operator == Operator.NotEqual condition. These two condition branches perform slightly different operations. So, the question is - which of the branches do the developers really need? After a little analysis of the Reduce function I assume that most likely the developers need exactly the first branch. The one that has comparison with Operator.NotEqual. Its functionality is more similar to the Equal and LessOrEqual. Unlike its twin, the second branch with NotEqual has absolutely identical functionality with the else branch. Here is a link to the original file for comparison, pay attention to 245-284 lines.

V3008 The 'newElement' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1320, 1315. ConvertVisitor.cs 1320

internal IQueryElement? ConvertInternal(IQueryElement? element)
{
  ....
  switch (element.ElementType)
  {
    ....
    case QueryElementType.WithClause:
    {
      var with = (SqlWithClause)element;

      var clauses = ConvertSafe(with.Clauses);

      if (clauses != null && !ReferenceEquals(with.Clauses, clauses))
      {
        newElement = new SqlWithClause()
        {
          Clauses = clauses
        };

        newElement = new SqlWithClause() { Clauses = clauses };
      }
      break;
    }
    ....
  }
  ....
}

In this code fragment, the author, apparently, could not decide on the style. They couldn't choose the one and left both options. That's exactly what the analyzer detected. I would recommend picking one and remove the unnecessary assignment. The analyzer issued the same message one more time:

V3008 The 'Stop' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 25, 24. TransformInfo.cs 25

public TransformInfo(Expression expression, bool stop, bool @continue)
{
  Expression = expression;
  Stop       = false;
  Stop       = stop;
  Continue   = @continue;
}

Now it's a different story. Here the *Stop *variable is first assigned with the *false *value and immediately after in the next line - with the stop value of parameter. Logically, in this case it is necessary to remove the first assignment since it is not used and is instantly overwritten by the argument value.

Where did the variable go?

V3010 The return value of function 'ToDictionary' is required to be utilized. ReflectionExtensions.cs 34

public static MemberInfo[] GetPublicInstanceValueMembers(this Type type)
{
  if (type.IsAnonymous())
  {
    type.GetConstructors().Single()
                                   .GetParameters()
                                   .Select((p, i) => new { p.Name, i })
                                   .ToDictionary(_ => _.Name, _ => _.i);
  }
  ....
}

What was the developer's intent with this fragment? It seems that there's a variable missing, to which you need to assign the result of this expression execution. Otherwise, the logic of action is unclear. During further execution of the GetPublicInstanceValueMembers function, there is no call of such expression. The developer's intent is unknown. Maybe this code fragment is in progress, so we need to wait for its further development.

V3025 Incorrect format. A different number of format items is expected while calling 'AppendFormat' function. Arguments not used: 1st. ExpressionTestGenerator.cs 663

void BuildType(Type type, MappingSchema mappingSchema)
{
  ....
  _typeBuilder.AppendFormat(
    type.IsGenericType ?
@"
{8} {6}{7}{1} {2}<{3}>{5}
  {{{4}{9}
  }}
"
:
@"
{8} {6}{7}{1} {2}{5}
  {{{4}{9}
  }}
",
    MangleName(isUserName, type.Namespace, "T"),
    type.IsInterface ? "interface" 
                     : type.IsClass ? "class" 
                                    : "struct",
    name,
    type.IsGenericType ? GetTypeNames(type.GetGenericArguments(), ",") 
                       : null,
    string.Join("\r\n", ctors),
    baseClasses.Length == 0 ? "" 
                            : " : " + GetTypeNames(baseClasses),
    type.IsPublic ? "public " 
                  : "",
    type.IsAbstract && !type.IsInterface ? "abstract " 
                                         : "",
    attr,
    members.Length > 0 ? (ctors.Count != 0 ? "\r\n" : "") + 
                         string.Join("\r\n", members) 
                       : string.Empty);
}

In this fragment we see the string formatting. The question is, where did the first argument call go? In the first formatted line, a developer used indexes from 1 to 9. But either the developer didn't need an argument with the index 0, or they forgot about it.

V3137 The 'version' variable is assigned but is not used by the end of the function. Query.cs 408

public void TryAdd(IDataContext dataContext, Query query, QueryFlags flags)
{
  QueryCacheEntry[] cache;
  int version;
  lock (_syncCache)
  {
    cache   = _cache;
    version = _version;
  }
  ....
  lock(_syncCashe)
  {
    ....
    var versionsDiff = _version - version;
    ....
    _cache   = newCache;
    _indexes = newPriorities;
    version  = _version;
  } 
}

We're in a tricky situation here. According to the diagnostic message, a value is assigned to the local version variable without ever using this value by end of the function. Well, one thing at a time.

At the very beginning, the value from _version is assigned to the version variable. During code execution, the version value does not change. It's only called once to calculate the difference with _version. And at the end, _version is assigned to the version again. The presence of lock statements implies that during the execution of a code fragment, outside the block with the _version variable, changes can occur in parallel from outside the function.

In this case, it is logical to assume that at the end it was necessary to swap version with _version. Still, it seems strange to assign a global value to a local variable at the end of a function. The analyzer issued similar message one more time: V3137 The 'leftcontext' variable is assigned but is not used by the end of the function. ExpressionBuilder.SqlBuilder.cs 1989

One loop iteration.

V3020 An unconditional 'return' within a loop. QueryRunner.cs 751

static T ExecuteElement(
  Query          query,
  IDataContext   dataContext,
  Mapper      mapper,
  Expression     expression,
  object?[]?     ps,
  object?[]?     preambles)
{
  using (var runner = dataContext.GetQueryRunner(query, 0, expression, ps,
    preambles))
  {
    using (var dr = runner.ExecuteReader())
    {
      while (dr.Read())
      {
        var value = mapper.Map(dataContext, runner, dr);
        runner.RowsCount++;
        return value;
      }
    }

    return Array.Empty.First();
  }
}

It's natural to use the while (reader.Read()) construct if you need to select the multiple rows from database. But here in the loop, we see the return without any conditions, which means that only one row is needed. Then the question is - why use a loop? In our case, there is no need for the while loop. If you need only the first element from the database, you can use a simple if.

Repeat actions make perfection

The cases with repeated checks are still present.

V3022 Expression 'version > 15' is always true. SqlServerTools.cs 250

internal static IDataProvider? ProviderDetector(IConnectionStringSettings css,
  string connectionString)
{
  ....
  if (int.TryParse(conn.ServerVersion.Split('.')[0], out var version))
  {
    if (version <= 8)
      return GetDataProvider(SqlServerVersion.v2000, provider);

    using (var cmd = conn.CreateCommand())
    {
      ....
      switch (version)
      {
        case  8 : return GetDataProvider(SqlServerVersion.v2000, provider);
        case  9 : return GetDataProvider(SqlServerVersion.v2005, provider);
        case 10 : return GetDataProvider(SqlServerVersion.v2008, provider);
        case 11 :
        case 12 : return GetDataProvider(SqlServerVersion.v2012, provider);
        case 13 : return GetDataProvider(SqlServerVersion.v2016, provider);
        case 14 :
        case 15 : return GetDataProvider(SqlServerVersion.v2017, provider);
        default :
          if (version > 15)
            return GetDataProvider(SqlServerVersion.v2017, provider);
          return GetDataProvider(SqlServerVersion.v2008, provider);
      }
    }
  }
  ....
}

You saw a code fragment. Did you notice an error? The analyzer says that in this example, the *version > 15 *condition is always true, which is why the return GetDataProvider(SqlServerVersion.v2008, provider) string is unreachable code. But let's take a closer look at the ProviderDetector function.

Firstly, I suggest paying attention to the version <= 8 condition. It means that further code cannot be executed if the version of SQLServer is 8 or earlier. But if we look down, we see the case 8 branch in the switch statement. This branch executes identical code. The fragment is an unreachable code, because the 8th version can no longer be used due to the condition above. And since it still executes the same code, then you can safely remove this branch from switch.

Secondly, let's talk about the analyzer's message. As we already said, all versions earlier than or equal to 8th will not go beyond the first condition. Versions from 9th to 15th are caught in the switch branches. In this case, we get into the default branch when the condition version > 15 is met. It makes the check of the same condition inside the default branch meaningless.

But the question remains: what do we need to write in GetDataProvider - v2017 or v2008? If we look at the rest of the switch branches, we can assume the following: the older the version, the SQLServer's release year is higher as well. In this case, let's use SQLServerVersion.V2017. The correct version of this code should look like this:

internal static IDataProvider? ProviderDetector(IConnectionStringSettings css,
  string connectionString)
{
  ....
  if (int.TryParse(conn.ServerVersion.Split('.')[0], out var version))
  {
    if (version <= 8)
      return GetDataProvider(SqlServerVersion.v2000, provider);

    using (var cmd = conn.CreateCommand())
    {
      ....
      switch (version)
      {
        case  9 : return GetDataProvider(SqlServerVersion.v2005, provider);
        case 10 : return GetDataProvider(SqlServerVersion.v2008, provider);
        case 11 :
        case 12 : return GetDataProvider(SqlServerVersion.v2012, provider);
        case 13 : return GetDataProvider(SqlServerVersion.v2016, provider);
        case 14 :
        case 15 : return GetDataProvider(SqlServerVersion.v2017, provider);
        default : return GetDataProvider(SqlServerVersion.v2017, provider);
      }
    }
  }
  ....
}

Now let's take a look at a simpler example of the V3022 diagnostic's triggering in this project.

V3022 Expression 'table == null' is always true. LoadWithBuilder.cs 113

TableBuilder.TableContext GetTableContext(IBuildContext ctx, Expression path, 
  out Expression? stopExpression)
{
  stopExpression = null;

  var table = ctx as TableBuilder.TableContext;

  if (table != null)
    return table;

  if (ctx is LoadWithContext lwCtx)
    return lwCtx.TableContext;

  if (table == null)
  {
    ....
  }
  ....
}

What do we have here? The table variable is compared to null twice. The first time, the condition checks the variable for an inequality with null. When the condition is met, the exit from a function takes place. This means that the code below the branch of the condition is executed only when table = null. No actions are performed on the variable until the next check. As a result, when the code reaches the table == null condition, this check always returns true.

Diagnostics of V3022 issued a few more useful warnings. We will not review them all in the article, but we encourage authors to check the project themselves and see all the warnings of the PVS-Studio analyzer.

V3063 A part of conditional expression is always true if it is evaluated: field.Field.CreateFormat != null. BasicSqlBuilder.cs 1255

protected virtual void BuildCreateTableStatement(....)
{
  ....
  if (field.Field.CreateFormat != null)
  {
    if (field.Field.CreateFormat != null && field.Identity.Length == 0)
    {
      ....
    }
  }
  ....
}

In the code snippet above, you can see that field.Field.CreateFormat is checked twice for null. But in this case, the second check is performed directly in the branch of the first check. Since the first check is a success, so when the checked value has not changed, it is not necessary to compare the field.Field.CreateFormat value with null for the second time.

null as something to die for

V3022 Expression 'rows' is always not null. The operator '?.' is excessive. SQLiteSqlBuilder.cs 214

protected override void BuildSqlValuesTable(
  SqlValuesTable valuesTable,
  string alias,
  out bool aliasBuilt)
{
  valuesTable = ConvertElement(valuesTable);
  var rows = valuesTable.BuildRows(OptimizationContext.Context);

  if (rows.Count == 0)
  {
    ....
  }
  else
  {
    ....

    if (rows?.Count> 0)
    {
     ....
    }

    ....
  }
  aliasBuilt = false;
}

According to the analyzer, in the line of this code fragment, the *if (rows?.Count > 0) *check for null is unnecessary, since rows cannot be null at that moment. Let's figure it out why. The result of the BuildRows function is assigned to the rows variable. Here is the code fragment of the function:

internal IReadOnlyList BuildRows(EvaluationContext context)
{
  if (Rows != null)
    return Rows;
  ....
  var rows = new List();
  if (ValueBuilders != null)
  {
    foreach (var record in source)
    {
      ....

      var row = new ISqlExpression[ValueBuilders!.Count];
      var idx = 0;
      rows.Add(row);

      ....
    }
  }
  return rows;
}

Since BuildRows cannot return null, then, according to the analyzer, check for null is redundant. But if BuildRows had returned null - what is meant by rows rows?.Count > 0 condition - then at the moment of the rows.Count == 0 condition check, the NullReferenceException would have been thrown. In the of such a condition, you would also need to do a null check to avoid an error. Until then, the current code looks suspicious and checking for null is redundant.

We got to the message, which made me think hard and do a couple of checks.

V3042 Possible NullReferenceException. The '?.' and '.' operators are used for accessing members of the '_update' object SqlUpdateStatement.cs 60

public override ISqlTableSource? GetTableSource(ISqlTableSource table)
{
  ....
  if (table == _update?.Table)
    return _update.Table;
  ....
}

A small fragment, a condition and exit from the function.

So, the analyzer has detected that update is accessed in two ways - with the null-conditional operator and without it. You might think that the condition is met only if _update doesn't equal null and both parts of the equality are the same. But. Big fat BUT.

In the case when table and _update equal null, then _update?.Table returns null. That meets the condition. Then when trying to call _update.Table you will get NullReferenceException. If we can return null, as ISqlTableSource? tells us in the function declaration, then we should write return _update?.Table to avoid an error.

Conclusion

The LINQ to DB project is large and complex, which makes it more exciting to check it. The project has a very large community, and we were lucky to get some interesting warnings.

If you want to know whether your code base have similar errors, you can try PVS-Studio on your project.

Tags:
Hubs:
Rating0
Comments0

Articles

Information

Website
pvs-studio.com
Registered
Founded
2008
Employees
31–50 employees