IchNikola Oct 6 2020 at 08:59

Nullable Reference will not protect you, and here is the proof

13 min

1.5K

PVS-Studio corporate blog.NET*C#*Visual Studio*

Have you ever wanted to get rid of the problem with dereferencing null references? If so, using Nullable Reference types is not your choice. Do you want to know why? This will be our topic today.

We warned you, and it happened. About a year ago, my colleagues wrote an article in which they warned that the introduction of Nullable Reference types will not protect against dereferencing null references. Now we have an indisputable proof of what we were saying found in the depths of Roslyn.

Nullable Reference types

The idea itself of adding Nullable Reference (further as NR) types seems noteworthy to me, since the problem related to dereferencing null references is still relevant to this day. Nevertheless, the implementation of protection against dereferencing turned out to be extremely unreliable. According to the idea of creators, only those variables whose type is marked with the "?" symbol can accept the null value. For example, a variable of the string? type indicates that it might contain null, and a variable of the string type might imply the opposite

However, nobody's stopping us from passing null to non-nullable reference variables (further as — NNR) of types, because they are not implemented at the IL code level. The compiler's built-in static analyzer is responsible for this limitation. Therefore, this new feature is more of a recommendation. Here is a simple example showing how it works:

#nullable enable
object? nullable = null;
object nonNullable = nullable;
var deref = nonNullable.ToString();

As we can see, the nonNullable type is specified as NNR, but we can safely pass null there. Of course, we will get a warning about converting "Converting null literal or possible null value to non-nullable type". However, we can get round it a bit more aggressively:

#nullable enable
object? nullable = null;
object nonNullable = nullable!; // <=
var deref = nonNullable.ToString();

One exclamation mark and there are no warnings. If you're a nitpicker, the following option is also available:

#nullable enable
object nonNullable = null!;
var deref = nonNullable.ToString();

Here's another example. Let's create two simple console projects. In the first we write:

namespace NullableTests
{
    public static class Tester
    {
        public static string RetNull() => null;
    }
}

In the second one we write:

#nullable enable 

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string? nullOrNotNull = NullableTests.Tester.RetNull();
            System.Console.WriteLine(nullOrNotNull.Length);
        }
    }
}

Hover the cursor over nullOrNotNull and see this message:

It's a hint that the string here can't be null. But we already know that it will be null right here. Run the project and get the exception:

Sure, these are just synthetic examples that demonstrate that this feature doesn't guarantee you protection from dereferencing a null reference. If you consider synthetic examples to be boring and you're wondering where real examples are, don't worry — they will be further in the article.

NR types also have another problem — it's not clear whether they are enabled or not. For example, the solution has two projects. One is marked up using this syntax, and the other is not. When you go to the project with NR types, you can decide that if one is marked up, then all are marked up. However, this will not be the case. It turns out that you need to check every time whether nullable context is enabled in a project or file. Otherwise, you might mistakenly assume that the normal reference type is NNR.

How we found proofs

When developing new diagnostics in the PVS-Studio analyzer, we always test them on our database of real projects. This helps for several reasons. For example, we can:

watch "live" the quality of received warnings;
get rid of some false positives;
find interesting fragments in the code which you can tell someone about;
etc.

One of the new diagnostics — V3156 found places where exceptions can occur due to potential null. The diagnostic message is as follows: "The argument of the method is not expected to be null". Its main point is that a null value can be passed as an argument to a method that does not expect null. This can lead, for example, to an exception or incorrect execution of the called method. You can read more about this diagnostic rule here.

Proofs are here

So here we are in the main part of this article. get ready to see real code fragments from the Roslyn project which the diagnostic issued warnings for. Their underlying idea is that either the NNR type is passed null, or there is no checking of the NR type value. All of this can result in an exception.

Example 1

private static Dictionary<object, SourceLabelSymbol>
BuildLabelsByValue(ImmutableArray<LabelSymbol> labels)
{
  ....
  object key;
  var constantValue = label.SwitchCaseLabelConstant;
  if ((object)constantValue != null && !constantValue.IsBad)
  {
    key = KeyForConstant(constantValue);
  }
  else if (labelKind == SyntaxKind.DefaultSwitchLabel)
  {
    key = s_defaultKey;
  }
  else
  {
    key = label.IdentifierNodeOrToken.AsNode();
  }

  if (!map.ContainsKey(key))                // <=
  {
    map.Add(key, label);
  } 
  ....
}

V3156 The first argument of the 'ContainsKey' method is not expected to be null. Potential null value: key. SwitchBinder.cs 121

The message states that key is potential null. Let's see where this variable can get this value. Let's check the KeyForConstant method first:

protected static object KeyForConstant(ConstantValue constantValue)
{
  Debug.Assert((object)constantValue != null);
  return constantValue.IsNull ? s_nullKey : constantValue.Value;
}
private static readonly object s_nullKey = new object();

Since s_nullKey is not null, see what constantValue.Value returns:

public object? Value
{
  get
  {
    switch (this.Discriminator)
    {
      case ConstantValueTypeDiscriminator.Bad: return null;  // <=
      case ConstantValueTypeDiscriminator.Null: return null; // <=
      case ConstantValueTypeDiscriminator.SByte: return Boxes.Box(SByteValue);
      case ConstantValueTypeDiscriminator.Byte: return Boxes.Box(ByteValue);
      case ConstantValueTypeDiscriminator.Int16: return Boxes.Box(Int16Value);
      ....
      default: throw ExceptionUtilities.UnexpectedValue(this.Discriminator);
    }
  }
}

There are two null literals here, but in this case, we won't go into any case with them. This is due to IsBad and IsNull checks. However, I would like to draw your attention to the return type of this property. It is an NR type, but the KeyForConstant method already returns the NNR type. It turns out that normally the KeyForConstant method can return null.

Another source that can return null is the AsNode method:

public SyntaxNode? AsNode()
{
  if (_token != null)
  {
    return null;
  }

  return _nodeOrParent;
}

Again, please note the return type of the method — it is NR. It turns out that when we say that a method can return null, it doesn't affect anything. What's interesting here is the fact that the compiler here does not complain about the conversion from NR to NNR:

Example 2

private SyntaxNode CopyAnnotationsTo(SyntaxNode sourceTreeRoot, 
                                     SyntaxNode destTreeRoot)
{  
  var nodeOrTokenMap = new Dictionary<SyntaxNodeOrToken, 
                                      SyntaxNodeOrToken>();
  ....
  if (sourceTreeNodeOrTokenEnumerator.Current.IsNode)
  {
    var oldNode = destTreeNodeOrTokenEnumerator.Current.AsNode();
    var newNode = sourceTreeNodeOrTokenEnumerator.Current.AsNode()
                                       .CopyAnnotationsTo(oldNode);
        
    nodeOrTokenMap.Add(oldNode, newNode); // <=
  }
  ....
}

V3156 The first argument of the 'Add' method is not expected to be null. Potential null value: oldNode. SyntaxAnnotationTests.cs 439

Another example with the AsNode function, which was described above. Only this time oldNode will have the NR type. While the key described above had the NNR type.

By the way, I can't help but share an interesting finding with you. As I described above, when developing diagnostics, we check them on different projects. When checking the warnings of this rule, I noticed a curious thing. About 70% of all warnings were issued for methods of the Dictionary class. In which most of them fell on the TryGetValue method. This may be because we subconsciously do not expect exceptions from a method that contains the word try. So, check your code for this pattern, you might find something similar.

Example 3

private static SymbolTreeInfo TryReadSymbolTreeInfo(
    ObjectReader reader,
    Checksum checksum,
    Func<string, ImmutableArray<Node>, 
    Task<SpellChecker>> createSpellCheckerTask)
{
  ....
  var typeName = reader.ReadString();
  var valueCount = reader.ReadInt32();

  for (var j = 0; j < valueCount; j++)
  {
    var containerName = reader.ReadString();
    var name = reader.ReadString();

    simpleTypeNameToExtensionMethodMap.Add(typeName, // <=
                            new ExtensionMethodInfo(containerName, name)); 
  }
  ....
}

V3156 The first argument of the 'Add' method is passed as an argument to the 'TryGetValue' method and is not expected to be null. Potential null value: typeName. SymbolTreeInfo_Serialization.cs 255

The analyzer says that the problem is in typeName. Let's first make sure that this argument is indeed a potential null. Now look at ReadString:

public string ReadString() => ReadStringValue();

Ok, check out ReadStringValue:


private string ReadStringValue()
{
  var kind = (EncodingKind)_reader.ReadByte();
  return kind == EncodingKind.Null ? null : ReadStringValue(kind);
}

Great, now let's recall where our variable was passed to:

simpleTypeNameToExtensionMethodMap.Add(typeName, // <=
                              new ExtensionMethodInfo(containerName,
                                                      name));

I think it's high time we took a peek inside the Add method:

public bool Add(K k, V v)
{
  ValueSet updated;

  if (_dictionary.TryGetValue(k, out ValueSet set)) // <=
  {
    ....
  }
  ....
}

Indeed, if we pass null as the first argument to the Add method, we will get the ArgumentNullException.

By the way, here's what's interesting — what if we hover the cursor over typeName in Visual Studio, will we see that its type is string?:

The return type of the method is simply string:

In addition, if we create an NNR variable and assign it typeName, no error will be output.

Let's crash Roslyn

Doing this not out of spite, but for fun, I suggest trying to reproduce one of the examples shown.

Test 1

Let's take the example described under number 3:

private static SymbolTreeInfo TryReadSymbolTreeInfo(
    ObjectReader reader,
    Checksum checksum,
    Func<string, ImmutableArray<Node>, 
    Task<SpellChecker>> createSpellCheckerTask)
{
  ....
  var typeName = reader.ReadString();
  var valueCount = reader.ReadInt32();

  for (var j = 0; j < valueCount; j++)
  {
    var containerName = reader.ReadString();
    var name = reader.ReadString();

    simpleTypeNameToExtensionMethodMap.Add(typeName, // <=
                            new ExtensionMethodInfo(containerName, name)); 
  }
  ....
}

To reproduce it, we will need to call the TryReadSymbolTreeInfo method, but it is private. The good thing is that the class with it has the ReadSymbolTreeInfo_ForTestingPurposesOnly method, which is already internal:

internal static SymbolTreeInfo ReadSymbolTreeInfo_ForTestingPurposesOnly(
    ObjectReader reader, 
    Checksum checksum)
{
  return TryReadSymbolTreeInfo(reader, checksum,
          (names, nodes) => Task.FromResult(
            new SpellChecker(checksum, 
                             nodes.Select(n => new StringSlice(names, 
                                                               n.NameSpan)))));
}

It is very nice that we are simply offered to test the TryReadSymbolTreeInfo method. So, let's create our own class right here and write the following code:

public class CheckNNR
{
  public static void Start()
  {
    using var stream = new MemoryStream();
    using var writer = new BinaryWriter(stream);
    writer.Write((byte)170);
    writer.Write((byte)9);
    writer.Write((byte)0);
    writer.Write(0);
    writer.Write(0);
    writer.Write(1);
    writer.Write((byte)0);
    writer.Write(1);
    writer.Write((byte)0);
    writer.Write((byte)0);
    stream.Position = 0;

    using var reader = ObjectReader.TryGetReader(stream);
    var checksum = Checksum.Create("val");

    SymbolTreeInfo.ReadSymbolTreeInfo_ForTestingPurposesOnly(reader, checksum);
  }
}

Now we build Roslyn, create a simple console application, include all the necessary dll files, and write this code:

static void Main(string[] args)
{
  CheckNNR.Start();
}

Run, reach the desired point and see:

Next, go to the Add method and get the expected exception:

Let me remind you that the ReadString method returns an NNR type that cannot contain null as intended. This example once again confirms the relevance of the PVS-Studio diagnostic rules for searching for dereferencing null links.

Test 2

Well, since we have already started reproducing examples, why not reproduce another one. This example will not relate to NR types. However, the same V3156 diagnostic found it, and I wanted to tell you about it. Here's the code:

public SyntaxToken GenerateUniqueName(SemanticModel semanticModel, 
                                      SyntaxNode location, 
                                      SyntaxNode containerOpt, 
                                      string baseName, 
                                      CancellationToken cancellationToken)
{
  return GenerateUniqueName(semanticModel, 
                            location, 
                            containerOpt, 
                            baseName, 
                            filter: null, 
                            usedNames: null,    // <=
                            cancellationToken);
}

V3156 The sixth argument of the 'GenerateUniqueName' method is passed as an argument to the 'Concat' method and is not expected to be null. Potential null value: null. AbstractSemanticFactsService.cs 24

I'll be honest: when making this diagnostic, I didn't really expect triggering warnings for simple null. After all, it is quite strange to pass null to a method that throws an exception because of it. Although, I have seen places where this was justified (for example, with the Expression class), but that's not the point now.

So, I was very intrigued when I saw this warning. Let's see what is happening in the GenerateUniqueName method.

public SyntaxToken GenerateUniqueName(SemanticModel semanticModel,
                                      SyntaxNode location, 
                                      SyntaxNode containerOpt,
                                      string baseName, 
                                      Func<ISymbol, bool> filter,
                                      IEnumerable<string> usedNames, 
                                      CancellationToken cancellationToken)
{
  var container = containerOpt ?? location
                       .AncestorsAndSelf()
                       .FirstOrDefault(a => SyntaxFacts.IsExecutableBlock(a) 
                                         || SyntaxFacts.IsMethodBody(a));

  var candidates = GetCollidableSymbols(semanticModel, 
                                        location, 
                                        container, 
                                        cancellationToken);

  var filteredCandidates = filter != null ? candidates.Where(filter) 
                                          : candidates;

  return GenerateUniqueName(baseName, 
                            filteredCandidates.Select(s => s.Name)
                                              .Concat(usedNames));     // <=
}

As we can see, there is only one exit point in the method, no exceptions are thrown and there is no goto. In other words, nothing prevents us from passing usedNames to the Concat method and getting the ArgumentNullException.

But talk is cheap, so let's just do it. First, we have to find out where we can call this method from. The method itself is in the AbstractSemanticFactsService class. The class is abstract, so for convenience, let's take the CSharpSemanticFactsService class, which is inherited from it. In the file of this class, we'll create our own one, which will call the GenerateUniqueName method. It looks like this:

public class DropRoslyn
{
  private const string ProgramText = 
    @"using System;
    using System.Collections.Generic;
    using System.Text
    namespace HelloWorld
    {
      class Program
      {
        static void Main(string[] args)
        {
          Console.WriteLine(""Hello, World!"");
        }
      }
    }";
  
  public void Drop()
  {
    var tree = CSharpSyntaxTree.ParseText(ProgramText);
    var instance = CSharpSemanticFactsService.Instance;
    var compilation = CSharpCompilation
                      .Create("Hello World")
                      .AddReferences(MetadataReference
                                     .CreateFromFile(typeof(string)
                                                     .Assembly
                                                     .Location))
                      .AddSyntaxTrees(tree);
    
    var semanticModel = compilation.GetSemanticModel(tree);
    var syntaxNode1 = tree.GetRoot();
    var syntaxNode2 = tree.GetRoot();
    
    var baseName = "baseName";
    var cancellationToken = new CancellationToken();
    
    instance.GenerateUniqueName(semanticModel, 
                                syntaxNode1, 
                                syntaxNode2, 
                                baseName, 
                                cancellationToken);
  }
}

Now we build Roslyn, create a simple console application, include all the necessary dll files, and write this code:

class Program
{
  static void Main(string[] args)
  {
    DropRoslyn dropRoslyn = new DropRoslyn();
    dropRoslyn.Drop();
  }
}

Run the app and get the following:

This is confusing

Let's say we agree with the nullable concept. It turns out that if we see the NR type, we assume that it may contain a potential null. However, sometimes we can stumble upon cases when the compiler tells us the opposite. Therefore, we'll walk through several cases where the use of this concept is not intuitive.

Case 1

internal override IEnumerable<SyntaxToken>? TryGetActiveTokens(SyntaxNode node)
{
  ....
  var bodyTokens = SyntaxUtilities
                   .TryGetMethodDeclarationBody(node)
                   ?.DescendantTokens();

  if (node.IsKind(SyntaxKind.ConstructorDeclaration, 
                  out ConstructorDeclarationSyntax? ctor))
  {
    if (ctor.Initializer != null)
    {
      bodyTokens = ctor.Initializer
                       .DescendantTokens()
                       .Concat(bodyTokens); // <=
    }
  }
  return bodyTokens;
}

V3156 The first argument of the 'Concat' method is not expected to be null. Potential null value: bodyTokens. CSharpEditAndContinueAnalyzer.cs 219

First of all, we check out why bodyTokens is a potential null and notice the null conditional statement:

var bodyTokens = SyntaxUtilities
                 .TryGetMethodDeclarationBody(node)
                 ?.DescendantTokens();              // <=

If we go inside the TryGetMethodDeclarationBody method, we will see that it can return null. However, it is relatively large, so I'm giving a link for you to see it for yourself. So, it's all clear with bodyTokens, but I'd like to point out the ctor argument:

if (node.IsKind(SyntaxKind.ConstructorDeclaration, 
                out ConstructorDeclarationSyntax? ctor))

As we can see, its type is set as NR. At the same time, here's dereference in the line below:

if (ctor.Initializer != null)

This combination is a bit ominous. Nonetheless, you will say that, most likely, if IsKind returns true, then ctor is definitely not null. So it is:

public static bool IsKind<TNode>(
    [NotNullWhen(returnValue: true)] this SyntaxNode? node, // <=
    SyntaxKind kind,
    [NotNullWhen(returnValue: true)] out TNode? result)     // <=
    where TNode : SyntaxNode 
{
  if (node.IsKind(kind))
  {
    result = (TNode)node;
    return true;
  }

  result = null;
  return false;
}

Special attributes used here indicate at which output value the parameters will not be null. We can make sure of it by looking at the logic of the IsKind method. It turns out that the ctor type must be NNR inside the condition. The compiler is aware of it and says that ctor inside the condition will not be null. But if we want to get it ourselves, we have to go inside the IsKind method and notice the attribute there. Otherwise, it looks like dereferencing the NR variable without checking for null. We can try making this a bit more visible as follows:

if (node.IsKind(SyntaxKind.ConstructorDeclaration, 
                out ConstructorDeclarationSyntax? ctor))
{
    if (ctor!.Initializer != null) // <=
    {
      ....
    }
}

Case 2

public TextSpan GetReferenceEditSpan(InlineRenameLocation location, 
                                     string triggerText, 
                                     CancellationToken cancellationToken)
{
  var searchName = this.RenameSymbol.Name;
  if (_isRenamingAttributePrefix)
  {
    searchName = GetWithoutAttributeSuffix(this.RenameSymbol.Name);
  }

  var index = triggerText.LastIndexOf(searchName,            // <=
                                      StringComparison.Ordinal);
  ....
}

V3156 The first argument of the 'LastIndexOf' method is not expected to be null. Potential null value: searchName. AbstractEditorInlineRenameService.SymbolRenameInfo.cs 126

We are interested in the searchName variable. null can be written into it after calling the GetWithoutAttributeSuffix method, but it's not that simple. Let's see what happens in it:

private string GetWithoutAttributeSuffix(string value)
    => value.GetWithoutAttributeSuffix(isCaseSensitive:
                _document.GetRequiredLanguageService<ISyntaxFactsService>()
                         .IsCaseSensitive)!;

Let's dig a bit deeper:

internal static string? GetWithoutAttributeSuffix(
            this string name,
            bool isCaseSensitive)
{
  return TryGetWithoutAttributeSuffix(name, isCaseSensitive, out var result) 
         ? result : null;
}

It turns out that the TryGetWithoutAttributeSuffix method will return either result or null. And the method returns the NR type. However, when we go back a step, we notice that the method type has suddenly changed to NNR. This is due to the hidden sign "!":

_document.GetRequiredLanguageService<ISyntaxFactsService>()
         .IsCaseSensitive)!; // <=

By the way, it is quite tricky to notice it in Visual Studio:

By setting it, the developer tells us that the method will never return null. Although, looking at the previous examples and going into the TryGetWithoutAttributeSuffix method, I personally can't be sure:

internal static bool TryGetWithoutAttributeSuffix(
            this string name,
            bool isCaseSensitive,
            [NotNullWhen(returnValue: true)] out string? result)
{
  if (name.HasAttributeSuffix(isCaseSensitive))
  {
    result = name.Substring(0, name.Length - AttributeSuffix.Length);
    return true;
  }

  result = null;
  return false;
}

Conclusion

In conclusion, I would like to note that the attempt to save us from unnecessary null checks is a great idea. However, NR types are rather advisory in nature, because no one strictly forbids us to pass null to the NNR type. Therefore, the corresponding PVS-Studio rules remain relevant. For example, such as V3080 or V3156.

All the best to you and thank you for your attention.

Hubs: