szKarlen Jan 25 2015 at 11:50

Age of JIT compiling. Part I. Genesis

Programming, .NET, C#

Comments 18

a553 Jan 25 2015 at 13:37

It can be done more simply:

[StructLayout(LayoutKind.Explicit)]
struct Magic
{
    [FieldOffset(0)]
    public object Obj;

    [FieldOffset(0)]
    public string Str;
}

static string ItIsString(object obj)
{
    Magic m = new Magic();
    m.Obj = obj;
    return m.Str;
}

static void Main(string[] args)
{
    Console.WriteLine(ItIsString(10).Length); // 10
    Console.WriteLine(ItIsString((long)'p' << 32 | 1)[0]); // p
}

szKarlen Jan 25 2015 at 13:47

Well, that's a well-known example :)

kekekeks Jan 25 2015 at 17:35

I wonder whether the GC will choke later on if a reference to such a "string" ends up stored somewhere.

szKarlen Jan 26 2015 at 07:54

I think that's an open question. It needs investigating )

sidristij Mar 15 2015 at 19:13

The GC couldn't care less which objects get passed where. Things go wrong once a method is called on the object.

qw1 Jan 25 2015 at 19:46

I hadn't heard of this before:

For value types: ReturnType MethodName(ref Type this, …arguments…)
This is done to support mutability of structs, i.e. so that we can modify this.

Could you give an example of C# code that makes use of this feature?

dordzhiev Jan 25 2015 at 22:22

Value types are copied by value when passed as method arguments.

struct Foo
{
    // Defaults to 0 (struct field initializers are a compile error before C# 10)
    public int Bar;
}

void FooBar1(Foo foo, int bar)
{
    // Works on a copy
    foo.Bar = bar;
}

void FooBar2(ref Foo foo, int bar)
{
    // Works on the very object that was passed in
    foo.Bar = bar;
}

void Main()
{
    var foo = new Foo();
    Console.WriteLine(foo.Bar); // Prints 0

    // A copy of foo is passed
    FooBar1(foo, 1);
    Console.WriteLine(foo.Bar); // Prints 0

    // The address of foo is passed
    FooBar2(ref foo, 2);
    Console.WriteLine(foo.Bar); // Prints 2
}

qw1 Jan 26 2015 at 08:06

This example is on a completely different topic: it illustrates a struct being copied when it is passed as a method argument. It contains no call to a method of the struct itself.
I'm interested in an example where a method is called on a struct and modifies the this pointer.

szKarlen Jan 26 2015 at 08:19

No, no! The pointer itself is not modified. It's just that if this were passed by value (as it is with references), the struct would be copied, and so calling an instance method on the struct would have no effect.

This is done so that the JIT doesn't have to create extra trampolines, and to avoid boxing (if we look at it as an implementation strategy).

Could you give an example of C# code that makes use of this feature?

In C# you can't observe anything, because the first argument is always omitted at the method-signature level for all instance methods.

Other than that, the article already gives an example with unbound delegates for structs.
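
One place where the hidden first argument does become visible in C# is an open (unbound) instance delegate. The following is a minimal sketch, not the article's own example: Counter, AddDelegate and OpenDelegateDemo are hypothetical names introduced here for illustration. For an instance method of a value type, the delegate's first parameter has to be declared as ref.

```csharp
using System;
using System.Reflection;

// Hypothetical mutable struct used only for this illustration.
struct Counter
{
    public int Value;
    public void Add(int amount) { Value += amount; }
}

// The implicit 'this' of Counter.Add appears as an explicit ref parameter.
delegate void AddDelegate(ref Counter self, int amount);

static class OpenDelegateDemo
{
    public static int Run()
    {
        MethodInfo add = typeof(Counter).GetMethod("Add");

        // No target object is supplied, so the delegate is "open":
        // the instance is passed as the first argument on every call.
        var open = (AddDelegate)Delegate.CreateDelegate(typeof(AddDelegate), add);

        var c = new Counter();
        open(ref c, 5);
        open(ref c, 7);
        return c.Value; // both calls mutated the same local
    }
}
```

Calling OpenDelegateDemo.Run() from Main should return 12, since both calls operate on the address of the same local rather than on copies.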

qw1 Jan 26 2015 at 08:58

At the x86 level there is no difference, i.e. this is passed in the ecx register in every case.

namespace ConsoleApplication1
{
    struct MyStruct
    {
        public int x;
        public int GetX() { return x; }
    }

    class MyClass
    {
        public int x;
        public int GetX() { return x; }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var s = new MyStruct();
            s.GetX();

            var c = new MyClass();
            c.GetX();
        }
    }
}

The code of GetX() is identical for the struct and the class (the only difference is that the field offset is +4 in the class and +0 in the struct):

szKarlen Jan 26 2015 at 09:39

Well then, for the sake of a clean experiment:

Your code, modified to use MethodImplOptions.NoInlining:

using System.Runtime.CompilerServices; // needed for MethodImpl

namespace ConsoleApplication1
{
    struct MyStruct
    {
        public int x;

        [MethodImpl(MethodImplOptions.NoInlining)]
        public int GetX()
        {
            return x;
        }
    }

    class MyClass
    {
        public int x;

        [MethodImpl(MethodImplOptions.NoInlining)]
        public int GetX()
        {
            return x;
        }
    }

    class Program
    {
        unsafe static void Main(string[] args)
        {
            var s = new MyStruct();
            s.GetX(); // breakpoint

            var c = new MyClass();
            c.GetX(); // breakpoint
        }
    }
}

This produces the following code:

            var s = new MyStruct();
00000000  push        ebp 
00000001  mov         ebp,esp 
00000003  push        eax 
00000004  xor         eax,eax 
00000006  mov         dword ptr [ebp-4],eax 
00000009  xor         edx,edx 
0000000b  mov         dword ptr [ebp-4],edx 
            s.GetX(); // breakpoint
0000000e  lea         ecx,[ebp-4] 
00000011  call        dword ptr ds:[00193828h] 

            var c = new MyClass();
00000017  mov         ecx,1938A0h 
0000001c  call        FFF820B0 
00000021  mov         dword ptr [eax+4],7Bh 
            c.GetX(); // breakpoint
00000028  mov         ecx,eax 
0000002a  call        dword ptr ds:[00193894h] 
00000030  mov         esp,ebp 
        }
00000032  pop         ebp 
00000033  ret

Doesn't this line give you pause?
0000000e lea ecx,[ebp-4]
It is nothing other than loading the address of MyStruct s.

P.S.
I see that comrade kekekeks has already answered and that it is clear to you now.

kekekeks Jan 26 2015 at 08:55

var point = new System.Drawing.Point();
point.Offset(100, 500);

Accordingly, Offset has the signature void (ref Point this, int dx, int dy)

And all property setters of structs too, yes.
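
The effect of passing this by address can be seen without System.Drawing. Below is a small sketch under stated assumptions: Vec, Holder and ThisByRefDemo are hypothetical names, and Vec.Offset merely imitates Point.Offset. When the receiver is a writable field, the method mutates it in place; when the field is readonly, the C# compiler calls the method on a temporary copy, so the change is lost.

```csharp
using System;

// Hypothetical stand-in for System.Drawing.Point.
struct Vec
{
    public int X, Y;
    public void Offset(int dx, int dy) { X += dx; Y += dy; }
}

class Holder
{
    public readonly Vec Fixed = new Vec();
    public Vec Mutable = new Vec();
}

static class ThisByRefDemo
{
    public static int[] Run()
    {
        var h = new Holder();

        // 'this' is the address of the field itself: mutated in place.
        h.Mutable.Offset(100, 500);

        // readonly field: Offset runs against a temporary copy of the struct.
        h.Fixed.Offset(100, 500);

        return new[] { h.Mutable.X, h.Mutable.Y, h.Fixed.X, h.Fixed.Y };
    }
}
```

ThisByRefDemo.Run() should return { 100, 500, 0, 0 }: the readonly case behaves exactly as if the signature were void (Vec this, int dx, int dy).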

qw1 Jan 26 2015 at 09:00

Aha, now I get it. It's simply that, syntactically, the compiler would be obliged to copy the struct if it saw (Point this); this is not about passing a reference to the this pointer.

PsyHaSTe Jan 27 2015 at 11:26

And yes, the callvirt instruction does not check the object for “correctness”.

Could you elaborate? As far as I know, it does check, and that is exactly why it is used to call methods even when they are not virtual. Sorry for the wall of text, but for completeness I'll quote the whole thing:

We can use a similar dispatch sequence to call non-virtual methods as well. However, for non-virtual methods,
there is no need to use the method table for method dispatch: the code address of the invoked method (or at least
its pre-JIT stub) is known when the JIT compiles the method dispatch. For example, if the stack location EBP-64
contains the address of an Employee object, as before, then the following instruction sequence will call the
TakeVacation method with the parameter 5:
mov edx, 5 ;parameter passing through register – custom calling convention
mov ecx, dword ptr [ebp-64] ;still required because ECX contains ‘this’ by convention
call dword ptr [0x004a1260]
It is still required to load the object’s address into the ECX register – all instance methods expect to receive
in ECX the implicit this parameter. However, there’s no longer any need to dereference the method table pointer
and obtain the address from the method table. The JIT compiler still needs to be able to update the call site after
performing the call; this is obtained by performing an indirect call through a memory location (0x004a1260 in
this example) that initially points to the pre-JIT stub and is updated by the JIT compiler as soon as the method is
compiled.
Unfortunately, the method dispatch sequence above suffers from a significant problem. It allows method
calls on null object references to be dispatched successfully and possibly remain undetected until the instance
method attempts to access an instance field or a virtual method, which would cause an access violation. In fact,
this is the behavior for C++ instance method calls – the following code would execute unharmed in most C++
environments, but would certainly make C# developers shift uneasily in their chairs:
class Employee {
public: void Work() { } //empty non-virtual method
};
Employee* pEmployee = NULL;
pEmployee->Work(); //runs to completion
If you inspect the actual sequence used by the JIT compiler to invoke non-virtual instance methods, it would
contain an additional instruction:
mov edx, 5 ;parameter passing through register – custom calling convention
mov ecx, dword ptr [ebp-64] ;still required because ECX contains ‘this’ by convention
cmp ecx, dword ptr [ecx]
call dword ptr [0x004a1260]
Recall that the CMP instruction subtracts its second operand from the first and sets CPU flags according to the
result of the operation. The code above does not use the comparison result stored in the CPU flags, so how would
the CMP instruction help prevent calling a method using a null object reference? Well, the CMP instruction attempts
to access the memory address in the ECX register, which contains the object reference. If the object reference is
null, this memory access would fail with an access violation, because accessing the address 0 is always illegal in
Windows processes. This access violation is converted by the CLR to a NullReferenceException which is thrown
at the invocation point; a much better choice than emitting a null check inside the method after it has already
been called. Furthermore, the CMP instruction occupies only two bytes in memory, and has the advantage of
being able to check for invalid addresses other than null.
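
The null check described in the quote can be observed from C# itself. This is a minimal sketch: Employee mirrors the C++ class from the quote, while NullCheckDemo and ThrowsOnNull are hypothetical names added for illustration. The method body never touches this, yet the call on a null reference still fails at the call site.

```csharp
using System;

class Employee
{
    public void Work() { } // empty non-virtual method, never touches 'this'
}

static class NullCheckDemo
{
    // Returns true if calling Work() on the given reference throws
    // NullReferenceException at the call site.
    public static bool ThrowsOnNull(Employee e)
    {
        try
        {
            e.Work();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

ThrowsOnNull(null) should return true and ThrowsOnNull(new Employee()) should return false, in contrast to the C++ snippet in the quote, which runs to completion.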

a553 Jan 27 2015 at 17:13

What is meant is that callvirt does not check the object's type.

szKarlen Jan 27 2015 at 18:14

OK, though comrade a553 got the answer in first :)

sidristij Mar 16 2015 at 19:22

Nice to see that someone else is still poking around in the guts :))

sidristij Mar 16 2015 at 19:25

By the way, this is how I get an object's address, and how I get a .NET reference to an object from an address: I put a number on the stack and take out a reference of the required type.