Как стать автором
Обновить
288.76
PVS-Studio
Статический анализ кода для C, C++, C# и Java

A Spy Undercover: PVS-Studio to Check ILSpy Source Code

Время на прочтение15 мин
Количество просмотров868

In PVS-Studio, we often check various compilers' code and post the results in our blog. Decompiler programs, however, seem to be a bit neglected. To restore justice in this world, we analyzed the ILSpy decompiler's source code. Let's take a look at the peculiar things PVS-Studio found.

Introduction

Probably almost every programmer used a decompiler at least once. The reasons could vary: to see how a method is implemented, to check if there is an error inside a library used, or to satisfy curiosity and look up some source code. At the mention of a decompiler, most .NET programmers will think of dotPeek or ILSpy. .NET Reflector is not as popular anymore. I remember when I first learned about these utilities and decompiled someone else's library - a thought of espionage ran through my head. I was obviously not the only one thinking along these lines - I am sure ILSpy's name is not accidental. I was curious what the spy is made of and wanted to reinforce it with a static analyzer. So I used the PVS-Studio analyzer on ILSpy's source code and put together an article based on the most interesting and suspicious code fragments I found.

To be honest, this article on ILSpy just sort of happened. Some of our clients are game development studios. This is one of the reasons why we try to make our tool as helpful and handy as possible for game developers, especially for those who employ Unity and Unreal Engine.

While I know many clients who work with Unreal Engine, I don't encounter that many Unity developers who use our analyzer. I want to encourage them to try the PVS-Studio analyzer, because I believe the Unity community can benefit from it. A cool way to demonstrate it would be to analyze a Unity-based open-source game and present the results. But the problem is - I could not find such a game! So please let me know of any ideas you have for such games I could analyze with PVS-Studio. When I did try to look for a Unity-based open-source game, my search yielded unexpected results. On one website, I found a list of Unity projects that for some mysterious reason included ILSpy. In PVS-Studio, we use a pool of projects to test our C# analyzer. That group includes ILSpy, so it's odd that we do not have an article on this project yet. But since I failed to find a Unity project for analysis, let's take a look at ILSpy.

Here's the project's description on GitHub: ILSpy is the open-source .NET assembly browser and decompiler.

Since there was no information on whether ILSpy's developers use a static analyzer, I am going to assume PVS-Studio is the first one. This makes my tests and research even more interesting. Now, without further discussion, let's move on to analysis results.

Replacement That Did Not Work

V3038 The '"'"' argument was passed to 'Replace' method several times. It is possible that other argument should be passed instead. ICSharpCode.Decompiler ReflectionDisassembler.cs 772

private static void WriteSimpleValue(ITextOutput output,
                                     object value, string typeName)
{
  switch (typeName)
  {
    case "string":
      output.Write(  "'"
                   + DisassemblerHelpers
                      .EscapeString(value.ToString())
                      .Replace("'", "\'")                   // <=
                   + "'");
      break;
    case "type":
    ....
  }
  ....
}

The author seems to be replacing all single quote character occurrences with a string consisting of two characters: a backslash and a single quote character. However, the developer missed a beat and by accident replaced the "'" with itself, thus performing a meaningless operation. There is no difference between assigning a string variable a value of "'" or "\'" - either way the string is initialized with a single quote character. To include "\'" in a string, use escape characters: "\\'" or @"\'". Thus, one can change the Replace method call in the following way:

Replace("'", @"\'")

Truth and Nothing but the Truth

Warning 1

V3022 Expression 'negatedOp == BinaryOperatorType.Any' is always true. ICSharpCode.Decompiler CSharpUtil.cs

static Expression InvertConditionInternal(Expression condition)
{
  var bOp = (BinaryOperatorExpression)condition;

  if (   (bOp.Operator == BinaryOperatorType.ConditionalAnd)
      || (bOp.Operator == BinaryOperatorType.ConditionalOr))
  {
    ....
  }
  else if (   (bOp.Operator == BinaryOperatorType.Equality)
           || (bOp.Operator == BinaryOperatorType.InEquality) 
           || (bOp.Operator == BinaryOperatorType.GreaterThan)
           || (bOp.Operator == BinaryOperatorType.GreaterThanOrEqual)
           || (bOp.Operator == BinaryOperatorType.LessThan) 
           || (bOp.Operator == BinaryOperatorType.LessThanOrEqual))
  {
    ....
  }
  else
  {
    var negatedOp = NegateRelationalOperator(bOp.Operator);
    if (negatedOp == BinaryOperatorType.Any)                  // <=
      return new UnaryOperatorExpression(....);
    bOp = (BinaryOperatorExpression)bOp.Clone();
    bOp.Operator = negatedOp;
    return bOp;
  }
}

The analyzer warns that the negatedOp variable always equals to the value of Any from the BinaryOperatorType enumeration. To verify this, let us take a look at the NegateRelationalOperator method code that provides a value for the negatedOp variable.

public static BinaryOperatorType NegateRelationalOperator(BinaryOperatorType op)
{
  switch (op)
  {
    case BinaryOperatorType.GreaterThan:
      return BinaryOperatorType.LessThanOrEqual;
    case BinaryOperatorType.GreaterThanOrEqual:
      return BinaryOperatorType.LessThan;
    case BinaryOperatorType.Equality:
      return BinaryOperatorType.InEquality;
    case BinaryOperatorType.InEquality:
      return BinaryOperatorType.Equality;
    case BinaryOperatorType.LessThan:
      return BinaryOperatorType.GreaterThanOrEqual;
    case BinaryOperatorType.LessThanOrEqual:
      return BinaryOperatorType.GreaterThan;
    case BinaryOperatorType.ConditionalOr:
      return BinaryOperatorType.ConditionalAnd;
    case BinaryOperatorType.ConditionalAnd:
      return BinaryOperatorType.ConditionalOr;
  }
  return BinaryOperatorType.Any;
}

If by the NegateRelationalOperator method call, the bOp.Operator's value does not match any of the case labels, the method returns BinaryOperatorType.Any. You can see that the NegateRelationalOperator method is called only when if and if else statements above the method are evaluated to false. Moreover, if you look closely, you can notice that the if and if else statements cover all case labels the NegateRelationalOperator method contains. By the time the NegateRelationalOperator method is called, the bOp.Operator does not satisfy any of the case labels and the method returns the BinaryOperatorType.Any value. As a result, negatedOp == BinaryOperatorType.Any always evaluates to true, and the next line returns the value from the method. In addition, we get unreachable code:

bOp = (BinaryOperatorExpression)bOp.Clone();
bOp.Operator = negatedOp;
return bOp;

By the way, the analyzer kindly issued a warning for this as well: V3142 Unreachable code detected. It is possible that an error is present. ICSharpCode.Decompiler CSharpUtil.cs 81

Warning 2

V3022 Expression 'pt != null' is always true. ICSharpCode.Decompiler FunctionPointerType.cs 168

public override IType VisitChildren(TypeVisitor visitor)
{
  ....
  IType[] pt = (r != ReturnType) ? new IType[ParameterTypes.Length] : null;
  ....
  if (pt == null)
    return this;
  else
    return new FunctionPointerType(
      module, CallingConvention, CustomCallingConventions,
      r, ReturnIsRefReadOnly,
      pt != null ? pt.ToImmutableArray() : ParameterTypes,    // <=
      ParameterReferenceKinds);
}

Here everything is straightforward - the else branch is executed if the pt variable is not null. So I don't see the need in a ternary operator that checks the pt variable for null. I suspect that in the past the code did not contain the if-else statement and the first return operator - then this check would have made sense. Right now it's a good idea to remove the extra ternary operator:

public override IType VisitChildren(TypeVisitor visitor)
{
  ....
  IType[] pt = (r != ReturnType) ? new IType[ParameterTypes.Length] : null;
  ....
  if (pt == null)
    return this;
  else
    return new FunctionPointerType(
      module, CallingConvention, CustomCallingConventions,
      r, ReturnIsRefReadOnly,
      pt.ToImmutableArray(), ParameterReferenceKinds);
}

Warning 3

V3022 Expression 'settings.LoadInMemory' is always true. ICSharpCode.Decompiler CSharpDecompiler.cs 394

static PEFile LoadPEFile(string fileName, DecompilerSettings settings)
{
  settings.LoadInMemory = true;
  return new PEFile(
    fileName,
    new FileStream(fileName, FileMode.Open, FileAccess.Read),
    streamOptions: settings.LoadInMemory ?                           // <=
      PEStreamOptions.PrefetchEntireImage : PEStreamOptions.Default,
    metadataOptions: settings.ApplyWindowsRuntimeProjections ? 
        MetadataReaderOptions.ApplyWindowsRuntimeProjections :
        MetadataReaderOptions.None
  );
}

This case is similar to the previous one - we get an unnecessary ternary operator. The settings.LoadInMemory property is set to true and this value does not change until the ternary operator checks the value. Here's the code for the property's getter and setter:

public bool LoadInMemory {
  get { return loadInMemory; }
  set {
      if (loadInMemory != value)
      {
        loadInMemory = value;
        OnPropertyChanged();
      }
  }
}

It's easy to exclude the unnecessary ternary operator and fix this code. There is probably no need to provide it here.

Warning 4

V3022 Expression 'ta' is always not null. The operator '??' is excessive. ICSharpCode.Decompiler ParameterizedType.cs 354

public IType VisitChildren(TypeVisitor visitor)
{
  ....
  if (ta == null)
      return this;
  else
      return new ParameterizedType(g, ta ?? typeArguments);     // <=
}

We can see the unnecessary null coalescing operator right away. When the ta variable gets to the else branch, it always has a value that is not null. Consequently, the ?? operator is excessive.

I got a total of 31 warnings under the number of V3022.

You Don't Belong Here

Warning 1

V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Arguments not used: End. ICSharpCode.Decompiler Interval.cs 269

public override string ToString()
{
  if (End == long.MinValue)
  {
    if (Start == long.MinValue)
      return string.Format("[long.MinValue..long.MaxValue]", End); // <=
    else
      return string.Format("[{0}..long.MaxValue]", Start);
  }
  else if (Start == long.MinValue)
  {
    return string.Format("[long.MinValue..{0})", End);
  }
  else
  {
    return string.Format("[{0}..{1})", Start, End);
  }
}

In the first string.Format method call, the format string does not match the arguments the method receives. The End variable's value, passed as an argument, cannot be inserted into the format string, because the string lacks the {0} format element. Following the method's logic, this is not an error and the return operator returns the string the code authors intended. This, of course, does not cancel the fact, that the code includes a useless string.Format method call with an unused argument. It's a good idea to fix this to make the code clean and easy to read.

Warning 2

V3025 Incorrect format. A different number of format items is expected while calling 'AppendFormat' function. Arguments not used: angle. ILSpy.BamlDecompiler XamlPathDeserializer.cs 177

public static string Deserialize(BinaryReader reader)
{
  ....
  var sb = new StringBuilder();
  ....
  sb.AppendFormat(CultureInfo.InvariantCulture,
                  "A{0} {2:R} {2} {3} {4}",
                  size, angle, largeArc ? '1' : '0',
                  sweepDirection ? '1' : '0', pt1);
  ....
}

In this case the angle variable was left out. Though the developer passed the variable to the AppendFormat method, the variable remains unused, because the format string contains two of {2} format elements and lacks the {1} format element. The authors probably intended to produce the following string:"A{0} {1:R} {2} {3} {4}".

Double Standards

Warning 1

V3095 The 'roslynProject' object was used before it was verified against null. Check lines: 96, 97. ILSpy.AddIn OpenILSpyCommand.cs 96

protected Dictionary<string, detectedreference=""> GetReferences(....)
{
  ....
  var roslynProject =  owner.Workspace
                            .CurrentSolution
                            .GetProject(projectReference.ProjectId);
  var project = FindProject(owner.DTE.Solution
                                 .Projects.OfType<envdte.project>(),
                            roslynProject.FilePath);              // <=

  if (roslynProject != null && project != null)                   // <=
  ....
}

First we get a roslynProject object's FilePath property with no worry that the roslynProject value may be null, and in the next line we check roslynProject for null. Such code does not look safe and may produce a NullReferenceException exception. To fix this code, one can use the FilePath property along with a null-conditional operator. The second step is to plan for the FindProject method to potentially get a null value as the last parameter.

Warning 2

V3095 The 'listBox' object was used before it was verified against null. Check lines: 46, 52. ILSpy FlagsFilterControl.xaml.cs 46

public override void OnApplyTemplate()
{
  base.OnApplyTemplate();

  listBox = Template.FindName("ListBox", this) as ListBox;
  listBox.ItemsSource = FlagGroup.GetFlags(....);         // <=

  var filter = Filter;

  if (filter == null || filter.Mask == -1)
  {
    listBox?.SelectAll();                                 // <=
  }
}

This case is similar to the previous example. First, we assign a value to the ItemsSource property and do not check whether the listBox variable contains null. Then, several lines later, I can see the listBox variable with the null-conditional operator. Note that between these two calls the listBox variable did not get a new value.

Our analyzer displayed 10 warnings with number V3095. Here is a list of those warnings:

  • V3095 The 'pV' object was used before it was verified against null. Check lines: 761, 765. ICSharpCode.Decompiler TypeInference.cs 761

  • V3095 The 'pU' object was used before it was verified against null. Check lines: 882, 886. ICSharpCode.Decompiler TypeInference.cs 882

  • V3095 The 'finalStore' object was used before it was verified against null. Check lines: 261, 262. ICSharpCode.Decompiler TransformArrayInitializers.cs 261

  • V3095 The 'definitionDeclaringType' object was used before it was verified against null. Check lines: 93, 104. ICSharpCode.Decompiler SpecializedMember.cs 93

  • V3095 The 'TypeNamespace' object was used before it was verified against null. Check lines: 84, 88. ILSpy.BamlDecompiler XamlType.cs 84

  • V3095 The 'property.Getter' object was used before it was verified against null. Check lines: 1676, 1684. ICSharpCode.Decompiler CSharpDecompiler.cs 1676

  • V3095 The 'ev.AddAccessor' object was used before it was verified against null. Check lines: 1709, 1717. ICSharpCode.Decompiler CSharpDecompiler.cs 1709

  • V3095 The 'targetType' object was used before it was verified against null. Check lines: 1614, 1657. ICSharpCode.Decompiler CallBuilder.cs 1614

By the way, if you want to check your own project with the PVS-Studio analyzer or recheck ILSpy to see all warnings by yourself, you can try the analyzer. On the PVS-Studio website, you can both download the analyzer and request a trial license.

All Roads Lead to One Place

Warning 1

V3139 Two or more case-branches perform the same actions. ILSpy Images.cs 251

protected override ImageSource GetBaseImage(MemberIcon icon)
{
  ImageSource baseImage;
  switch (icon)
  {
    case MemberIcon.Field:
      baseImage = Images.Field;
      break;
    case MemberIcon.FieldReadOnly:
      baseImage = Images.FieldReadOnly;
      break;
    case MemberIcon.Literal:
      baseImage = Images.Literal;             // <=
      break;
    case MemberIcon.EnumValue:
      baseImage = Images.Literal;             // <=
      break;
    case MemberIcon.Property:
      baseImage = Images.Property;
      break;
    case MemberIcon.Indexer:
      baseImage = Images.Indexer;
      break;
    case MemberIcon.Method:
      baseImage = Images.Method;
      break;
    case MemberIcon.Constructor:
      baseImage = Images.Constructor;
      break;
    case MemberIcon.VirtualMethod:
      baseImage = Images.VirtualMethod;
      break;
    case MemberIcon.Operator:
      baseImage = Images.Operator;
      break;
    case MemberIcon.ExtensionMethod:
      baseImage = Images.ExtensionMethod;
      break;
    case MemberIcon.PInvokeMethod:
      baseImage = Images.PInvokeMethod;
      break;
    case MemberIcon.Event:
      baseImage = Images.Event;
      break;
    default:
      throw new ArgumentOutOfRangeException(nameof(icon), 
                 $"MemberIcon.{icon} is not supported!");
  }

  return baseImage;
}

As I see it, this is clearly a mistake. If the icon variable's value equals MemberIcon.EnumValue, then the baseImage variable in the case branch must get the value of Images.EnumValue. This is a good example of an error that a static analyzer easily detects and a human eye easily misses when looking through code.

Warning 2

V3139 Two or more case-branches perform the same actions. ICSharpCode.Decompiler CSharpConversions.cs 829

bool ImplicitConstantExpressionConversion(ResolveResult rr, IType toType)
{
  ....
  switch (toTypeCode)
  {
    case TypeCode.SByte:
      return val >= SByte.MinValue && val <= SByte.MaxValue;
    case TypeCode.Byte:
      return val >= Byte.MinValue && val <= Byte.MaxValue;
    case TypeCode.Int16:
      return val >= Int16.MinValue && val <= Int16.MaxValue;
    case TypeCode.UInt16:
      return val >= UInt16.MinValue && val <= UInt16.MaxValue;
    case TypeCode.UInt32:
      return val >= 0;                 // <=
    case TypeCode.UInt64:
      return val >= 0;                 // <=
  }
  ....
}

I won't claim that the analyzer found here an obvious mistake, but the warning definitely makes sense. If the case labels for the TypeCode.UInt32 and TypeCode.UInt64 perform the same set of actions, why not write shorter code:

bool ImplicitConstantExpressionConversion(ResolveResult rr, IType toType)
{
  switch (toTypeCode)
  {
      ....
      case TypeCode.UInt32:
      case TypeCode.UInt64:
        return val >= 0;
  }
  ....
}

The analyzer issued 2 more warnings with the number V3139:

  • V3139 Two or more case-branches perform the same actions. ICSharpCode.Decompiler EscapeInvalidIdentifiers.cs 85

  • V3139 Two or more case-branches perform the same actions. ICSharpCode.Decompiler TransformExpressionTrees.cs 370

Safety Comes First

V3083 Unsafe invocation of event, NullReferenceException is possible. Consider assigning event to a local variable before invoking it. ILSpy MainWindow.xaml.cs 787class ResXResourceWriter : IDisposable

void assemblyList_Assemblies_CollectionChanged(....)
{
  ....
  if (CurrentAssemblyListChanged != null)
    CurrentAssemblyListChanged(this, e);      // <=
}

This way to raise events is fairly common, but the fact that we see this pattern in many projects is not an excuse to use it. Of course, this is not a critical error, but, as the analyzer's warning says - this event invocation is not safe and a NullReferenceException exception is possible. If all handlers unsubscribe from the event after *CurrentAssemblyListChanged *is checked for null and before the event is raised (for example, in a different thread), then a NullReferenceException exception is thrown. One can fix this and write the following safe code instead:

void assemblyList_Assemblies_CollectionChanged(....)
{
  ....
  CurrentAssemblyListChanged?.Invoke(this, e);
}

PVS-Studio found 8 more similar cases, they can all be fixed with the approach above.

Confident Uncertainty

V3146 Possible null dereference. The 'FirstOrDefault' can return default null value. ILSpy.BamlDecompiler BamlResourceEntryNode.cs 76

bool LoadBaml(AvalonEditTextOutput output, CancellationToken cancellationToken)
{
  var asm = this.Ancestors().OfType<assemblytreenode>()
                            .FirstOrDefault().LoadedAssembly;       // <=
  ....
  return true;
}

Here the developer calls the FirstOrDefault method to get the first available AssemblyTreeNode type element from the collection the OfType method returns. If the collection is empty or does not contain any elements that meet the search criteria, the FirstOrDefault method returns the default value - in our case it's null. A further attempt to access the the LoadedAssembly property means using a null reference and yields a NullReferenceException exception. To avoid this situation, it's a good idea to use a null-conditional operator:

bool LoadBaml(AvalonEditTextOutput output, CancellationToken cancellationToken)
{
  var asm = this.Ancestors().OfType<assemblytreenode>()
                            .FirstOrDefault()?.LoadedAssembly;     // <=
  ....
  return true;
}

We can assume the developer intended for the FirstOrDefault method to never return null in this particular place. If this is really the case, then it's a good idea to call the First method instead of FirstOrDefault, because it is a way to stress the developer's assurance that the method is always able to retrieve the required element from the collection. Moreover, if the element is not found in the collection, the developer gets the InvalidOperationException exception, which displays the following message: "Sequence contains no elements". This is more informative than a NullReferenceException exception that is thrown after the code refers to a null value the FirstOrDefault method returns.

Unsafe Scanning

V3105 The 'm' variable was used after it was assigned through null-conditional operator. NullReferenceException is possible. ILSpy MethodVirtualUsedByAnalyzer.cs 137

static bool ScanMethodBody(IMethod analyzedMethod, 
                           IMethod method, MethodBodyBlock methodBody)
{
  ....
  var mainModule = (MetadataModule)method.ParentModule;
  ....
  switch (member.Kind)
  {
    case HandleKind.MethodDefinition:
    case HandleKind.MethodSpecification:
    case HandleKind.MemberReference:
      var m = (mainModule.ResolveEntity(member, genericContext) as IMember)
              ?.MemberDefinition;
      if (   m.MetadataToken == analyzedMethod.MetadataToken               // <=
          && m.ParentModule.PEFile == analyzedMethod.ParentModule.PEFile)  // <=
      {
        return true;
      }
      break;
  }
  ....
}

In the code above, the developers used the null conditional operator to initialize the m variable. They anticipated that m could be assigned a null value. Interestingly, in the next line the developers get the m variable's properties and do not use the null conditional operator. This may lead to NullReferenceException type exceptions. As in some other examples we've reviewed so far, let's fix the problem by introducing the null-conditional operator:

static bool ScanMethodBody(IMethod analyzedMethod, 
                           IMethod method, MethodBodyBlock methodBody)
{
  ....
  var mainModule = (MetadataModule)method.ParentModule;
  ....
  switch (member.Kind)
  {
    case HandleKind.MethodDefinition:
    case HandleKind.MethodSpecification:
    case HandleKind.MemberReference:
      var m = (mainModule.ResolveEntity(member, genericContext) as IMember)
              ?.MemberDefinition;
      if (   m?.MetadataToken == analyzedMethod.MetadataToken
          && m?.ParentModule.PEFile == analyzedMethod.ParentModule.PEFile)
      {
        return true;
      }
      break;
  }
  ....
}

Good Old Friends

V3070 Uninitialized variable 'schema' is used when initializing the 'ResourceSchema' variable. ICSharpCode.Decompiler ResXResourceWriter.cs 63

class ResXResourceWriter : IDisposable
{
  ....
  public static readonly string ResourceSchema = schema;
  ....
  static string schema = ....;
  ....
}

At first I did not plan to list this warning, because about five years ago we found an identical error in the Mono project. But then I talked to a colleague and we decided the error is worth mentioning. As the article dedicated to analyzing Mono describes, by the time the schema static field initializes the ResourceSchema static field, the schema static field has not been initialized yet and evaluates to its default value - null. The ResXResourceWriter.cs file where we found the error, was kindly borrowed with copyright preservation from the Mono project. Then developers expanded the file with unique features for the ILSpy project. This is how bugs from projects spread across the internet and migrate from one project to another. By the way, the original developers have not yet fixed the bug in the original file.

Conclusion

Ultimately, the ILSpy decompiler's code analysis demonstrated that the project would benefit from a static code analyzer. Some code fragments we described are not errors, but refactoring them will clean up the code. Other code snippets are clearly incorrect. It is obvious that the authors expect a different result - for example the Replace method's behavior that has the same arguments. Regular use of static analysis allows developers to find and fix incorrect, ambiguous or excessive code. It is always quicker and cheaper to fix a bug at the stage of writing or testing code, than after the product is released with a bug and the users come and tell you "Hey, there's a bug here" - and you're lucky if they use these words. It's always better if the static analyzer tells you this. Thank you for reading.

A Note for Those Looking to Test ILSpy on Their Own

When analyzing the ILSpy project, we found a few problems related to the analyzer itself - yes, things like this happen. We fixed the issues, but the changes were not included in the 7.11 release. They will be available in the next version. Also note that ILSpy is compiled slightly differently from what most developers are used to. This peculiarity requires additional analyzer settings. So if you want to check ILSpy by yrself - let us know. We will provide you with the analyzer's beta and explain how to set up the analys

Теги:
Хабы:
Рейтинг0
Комментарии0

Публикации

Информация

Сайт
pvs-studio.ru
Дата регистрации
Дата основания
2008
Численность
31–50 человек
Местоположение
Россия