vkhanieva Mar 29 2021 at 08:41

MacOS Kernel, How Good Is This Apple?

19 min

1.6K

PVS-Studio corporate blogC++*Open source*Development for MacOS*

0818_XNU_MacOS_Kernel/image1.png

At the very beginning of this year, Apple released the source code for macOS – Big Sur. It includes XNU, the kernel of the macOS operating system. A few years ago, PVS-Studio has already checked the kernel source code. It coincided with the analyzer release on macOS. It's been a while since then. The new kernel source code has been released. A second check? Why not?

What kind of project is it: Apple and open-source?

XNU – X is Not Unix – is developed by Apple for use in Mac OS X operating system. Source code of this kernel was published 20 years ago under APSL (the Apple Public Source License) together with OC Darwin. Previously, you could even install Darwin as a full-fledged operating system. However, it's no longer possible. The source code was largely based on other open-source projects. That's why, it was published.

You can find source code of components here. I used the mirror on GitHub to check the project.

Previous check

As I mentioned, we've already checked this project with PVS-Studio. You can find out more in the article: PVS-Studio is now available on macOS: 64 weaknesses in the Apple's XNU Kernel. After it was published, my colleague, Svyatoslav also sent the article to the developers by email. However, they didn't answer. So, I assume, our check has nothing to do with further described fixes. The developers had to look for them in a different way. Although they could just run PVS-Studio :). Nowadays, after publishing an article, we usually write about it in the project's GitHub repository.

I wondered whether the errors described in the previous article had been fixed. Well, most of the errors were actually fixed. It means that the selected analyzer warnings were correct. The person who worked with the review to write this article is not involved in the development of XNU. That is, he or she is not familiar with this source code.

I'll show you some fixes here. However, to shorten the article, I won't fully explain the errors. If you can't understand the problem from the fix, refer to the first article on the project check. I won't explain all the fixed fragments. Most of them were corrected after all. There were no less than 64 fragments in the previous article!

Let's go on to the examples from the previous article.

Fragment N1, where a class member is compared to itself:

int
key_parse(
      struct mbuf *m,
      struct socket *so)
{
  ....
  if ((m->m_flags & M_PKTHDR) == 0 ||
      m->m_pkthdr.len != m->m_pkthdr.len) {
    ....
    goto senderror;
  }
  ....
}

It was fixed as follows:

0818_XNU_MacOS_Kernel/image2.png

Here, the macro from which the orglen variable is derived looks like this:

#define PFKEY_UNUNIT64(a) ((a) << 3)

It turns out that the analyzer was right. The comparison was incorrect. It should have been performed with the orglen variable, which had been in the code even before it was fixed.

Also, I want to mention fragment N5 as another example. Here the equal sign was finally changed to the equality check.

0818_XNU_MacOS_Kernel/image3.png

To mess up in the assertf condition is one thing, but to overwrite the variable for the debug version… It's definitely worth fixing.

Fragments 6 and 7 were fixed in the same way. It turned out that the enumerator value for comparison was mixed up in the nested check. In the internal check, the PBUF_TYPE_MEMORY element should be used instead of PBUF_TYPE_MBUF in both cases.

0818_XNU_MacOS_Kernel/image4.png

In case of fragments N8, 9, 10, the fix was as follows:

0818_XNU_MacOS_Kernel/image5.png

Why did I pay attention to this? Because a big part of the commit as a whole (the repository update to xnu-4903.270.47 from January 11) contains, among other things, many code-style edits. This may indicate that the codebase version has been cleaned up with various code quality tools. That will make this PVS-Studio check more exciting. After all, it's clear that the quality of the code base has already been improved by other tools.

As for fragments 11, 12, 13, 14, only fragment 11 was fixed:

0818_XNU_MacOS_Kernel/image6.png

The rest of the fragments are still the same. It seems like someone carelessly read our report ;) (or the analyzer report used to improve the code quality in the commit). To prove that the code has the same error, I'll show you the code for which the analyzer issued one of the warnings:

static int
kauth_resolver_getwork(user_addr_t message)
{
  struct kauth_resolver_work *workp;
  int error;

  KAUTH_RESOLVER_LOCK();
  error = 0;
  while ((workp = TAILQ_FIRST(....)) == NULL) { // <=
    thread_t thread = current_thread();
    struct uthread *ut = get_bsdthread_info(thread);

    ut->uu_save.uus_kauth.message = message;
    error = msleep0(....);
    KAUTH_RESOLVER_UNLOCK();
    /*
     * If this is a wakeup from another thread in the resolver
     * deregistering it, error out the request-for-work thread
     */
    if (!kauth_resolver_identity) {
      printf("external resolver died");
      error = KAUTH_RESOLVER_FAILED_ERRCODE;
    }
    return error; //<=
  }
  return kauth_resolver_getwork2(message);
}

PVS-Studio warning: V612 An unconditional 'return' within a loop. kern_credential.c 951

I cite the code almost fully to give you a general idea of this function. In the case of the labeled loop, when the entry condition is met, only one pass through the loop body is made. It ends with the return error. Apparently, it was meant that if the condition (workp = TAILQ_FIRST(....)) == NULL is met, then you need to find the cause of the error and end the function by returning information about it. However, for some reason, while was written instead of if, as in the fragment from the previous article. The error = msleep0(....) line looks in the code as follows:


error = msleep0(&kauth_resolver_unsubmitted,
                kauth_resolver_mtx,
                PCATCH,
                "GRGetWork",
                0, 
                kauth_resolver_getwork_continue);

Here, the last argument is a pointer to the kauth_resolver_getwork_continue function. In the body of the function, there's a condition similar to the condition of the loop. The analyzer points it out to us. However, here while was corrected to if.

static int
kauth_resolver_getwork_continue(int result)
{
  ....
  if (TAILQ_FIRST(&kauth_resolver_unsubmitted) == NULL) {
    ....
    return error;
  }
  ....
}

Actually, this code works a little more complicated than I described. It has recursion (in the kauth_resolver_getwork_continue method). As far as I understand, it was aimed at finding threads that can be reloaded. But I did not go into details. There's no doubt that while is redundant. Perhaps, it has stayed here since the source code performed the same task, but without using recursion.

These are the examples from the beginning of the article. Let's move on and take a look at the fragment N40. Here, the same element is assigned the same value twice:

PVS-Studio warning: V519 CWE-563 The 'wrap.Seal_Alg[0]' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 2070, 2071. gss_krb5_mech.c 2071

Of course, this error is also fixed:

0818_XNU_MacOS_Kernel/image7.png

Well, near the end of the article, fragment 62 is fixed as the previous article suggests. In fact, that's the only edit in the file.

0818_XNU_MacOS_Kernel/image8.png

Fragments 63 and 64 were also fixed. However, in this case, the code itself was changed completely. Therefore, it is difficult to understand what was fixed for the corresponding warning.

New findings

This was a long introduction. Now, let's move on to errors that caught my attention. I found them when I last checked the XNU source code with the PVS-Studio static analyzer. To be honest, it was hard to work with the report. The project has complex code, and I have no experience working with such a codebase. However, the PVS-Studio warnings are quite detailed. There's a link to the documentation with correct and incorrect code examples. It also gives a description of a possible problem, which helped me out a lot.

For this check, cloc counted 1346 *.c files, 1822 C/C++ headers, and 225 *.cpp files in the project.

Well, let's take a look at those interesting cases.

Fragment N1

void
pe_identify_machine(__unused boot_args *args)
{
  ....
  // Start with default values.
  gPEClockFrequencyInfo.timebase_frequency_hz = 1000000000;
  gPEClockFrequencyInfo.bus_frequency_hz      =  100000000;
  ....
  gPEClockFrequencyInfo.dec_clock_rate_hz = 
    gPEClockFrequencyInfo.timebase_frequency_hz;
  gPEClockFrequencyInfo.bus_clock_rate_hz =
   gPEClockFrequencyInfo.bus_frequency_hz;
  .... 
  gPEClockFrequencyInfo.bus_to_dec_rate_den =
    gPEClockFrequencyInfo.bus_clock_rate_hz /
    gPEClockFrequencyInfo.dec_clock_rate_hz;
}

PVS-Studio warning: V1064 The 'gPEClockFrequencyInfo.bus_clock_rate_hz'' operand of integer division is less than the 'gPEClockFrequencyInfo.dec_clock_rate_hz' one. The result will always be zero. pe_identify_machine.c 72

All fields used here are of an integer type:

extern clock_frequency_info_t gPEClockFrequencyInfo;

struct clock_frequency_info_t {
  unsigned long bus_clock_rate_hz;
  unsigned long dec_clock_rate_hz;
  unsigned long bus_to_dec_rate_den;
  unsigned long long bus_frequency_hz;
  unsigned long timebase_frequency_hz;
  ....
};

Through intermediate assignments, the divident field gPEClockFrequencyInfo.bus_clock_rate_hz is assigned the value 100000000, and the divisor field gPEClockFrequencyInfo.dec_clock_rate_hz is assigned the value 1000000000. In this case the divisor is ten times greater than the divident. Since all the fields here are integers, the gPEClockFrequencyInfo.bus_to_dec_rate_den field is 0.

Judging by the name of the resulting bus_to_dec_rate_den field, the divisor and the divident are mixed up. Probably, the code's author thought that the initial values will change, so the result will no longer be equal to 0. However, this code still seems very suspicious to me.

Fragment N2

void
sdt_early_init( void )
{
  ....
  if (MH_MAGIC_KERNEL != _mh_execute_header.magic) {
  ....
  } else {
    ....
    for (....) {
    const char *funcname;
    unsigned long best;                           //<=
    ....
    funcname = "<unknown>";
    for (i = 0; i < orig_st->nsyms; i++) {
      char *jname = strings + sym[i].n_un.n_strx;
      ....
      if ((unsigned long)sym[i].n_value > best) { //<=
        best = (unsigned long)sym[i].n_value;
        funcname = jname;
      }
    }
    .....
  }
}

PVS-Studio warning: V614 Uninitialized variable 'best' used. sdt.c 572

I assume this method is looking for the name of a certain function. The algorithm uses the best variable. It probably takes the position of the best candidate for the result. However, initially this variable is only declared without initialization. The next use checks the value of a certain element with the best variable, which will be uninitialized at that time. Even stranger is the fact that it is initialized only inside the condition that uses its own value.

Uninitialized variables can lead to unpredictable results. Although, this error may seem quite trivial, it is still common when checking different projects using PVS-Studio. For example, just recently my colleague, Andrey described an interesting case of such an error.

Fragment N3

int
cdevsw_isfree(int index)
{
  struct cdevsw * devsw;

  if (index < 0) {
    if (index == -1) {
      index = 0;
    } else {
      index = -index; 
    }
    devsw = &cdevsw[index];
    for (; index < nchrdev; index++, devsw++) {
      if (memcmp(....) == 0) {
        break;
      }
    }
  }

  if (index < 0 || index >= nchrdev) {
    return -1;
  }
  ....
  return index;
}

PVS-Studio warning: V560 A part of conditional expression is always false: index < 0. bsd_stubs.c:236

It's a small example of how the analyzer tracks possible values of variables. At the beginning of the function, the index variable is compared to zero. If it is less than zero, then the variable will be assigned a value not less than zero in the inner block. So, the next external if checks again whether the index variable has a value less than zero. However, it's impossible.

This does not change the program's logic. Although, there's a possibility that some other condition was implied. Well, in any case, extra checks don't make the code more readable and understandable.

Fragment N4

int
nfs_vinvalbuf_internal(....)
{
  struct nfsbuf *bp;
  ....
  off_t end = ....;

  /* check for any dirty data before the EOF */
  if ((bp->nb_dirtyend > 0) && (bp->nb_dirtyoff < end))
  {
    /* clip dirty range to EOF */
    if (bp->nb_dirtyend > end)
    {
      bp->nb_dirtyend = end;

      if (bp->nb_dirtyoff >= bp->nb_dirtyend)             //<=
      {
        bp->nb_dirtyoff = bp->nb_dirtyend = 0;
      }
    }

    if ((bp->nb_dirtyend > 0) && (bp->nb_dirtyoff < end)) //<=
    {
      ....
    }
  }
  ....
}

PVS-Studio warnings:

V547 Expression 'bp->nb_dirtyoff >= bp->nb_dirtyend' is always false. nfs_bio.c 3858
V560 A part of conditional expression is always true: (bp->nb_dirtyoff < end). nfs_bio.c 3862

As for this fragment, the analyzer warnings are crucial. So, it's better to simplify the code. Please, note that it's not the code's full form.

We'll start with the first warning. The analyzer decided that nb_dirtyoff can't be greater than or equal to nb_dirtyend. Let's sort it out. Before the suspicious check, there are two more if's with (bp->nb_dirtyend > 0) && (bp->nb_dirtyoff < end) and bp->nb_dirtyend > end checks. Also, the bp->nb_dirtyend = end assignment is performed.

Why is the third bp->nb_dirtyoff >= bp->nb_dirtyend check is always false?

0818_XNU_MacOS_Kernel/image9.png

It's so simple. From the conditions, it appears that nb_dirtyoff is less than end, and nb_dirtyend is equal to end. As a result, nb_dirtyend is certainly greater than nb_dirtyoff. bp->nb_dirtyoff = bp->nb_dirtyend = 0 assignment will never be executed.

Eventually, we have the following section of the code:

if ((bp->nb_dirtyend > 0) && (bp->nb_dirtyoff < end)) {
  /* clip dirty range to EOF */
  if (bp->nb_dirtyend > end) {
    bp->nb_dirtyend = end;
    if (bp->nb_dirtyoff >= bp->nb_dirtyend) {  //<=
      bp->nb_dirtyoff = bp->nb_dirtyend = 0;
    }
  }
}

We can simplify it at least to this:

if ((bp->nb_dirtyend > 0) && (bp->nb_dirtyoff < end)) {
  if (bp->nb_dirtyend > end) {
    bp->nb_dirtyend = end;
  }
}

But only if this algorithm is working correctly at this point.

The second warning indicates the fourth if nested in the first.

if ((bp->nb_dirtyend > 0) && (bp->nb_dirtyoff < end))

Here, the analyzer issues the warning based on the fact that the assignment of zero will never be performed. As a result, the external condition has already had the bp->nb_dirtyoff < end check. Thus, the internal check is meaningless due to the error in the condition above.

Fragment N5

tcp_output(struct tcpcb *tp)
{
  ....
  if (isipv6) {
    ....
    if (len + optlen) {
      ....
    }
  } else {
    ....
    if (len + optlen) {
      ....
    }
  }
  ....
}

PVS-Studio warning: V793 It is odd that the result of the 'len + optlen' statement is a part of the condition. Perhaps, this statement should have been compared with something else.

This is a fairly simple flaw. In the condition, instead of a boolean expression, two variables are simply added together. Eventually, the expression will be false only if the sum is equal to zero. If this is implied, then it may be worth making the comparison with 0 explicit. Then, the question of the condition correctness will not bother us.

Perhaps, it was done on purpose. However, a little higher in the code, there is this check:

if (len + optlen + ipoptlen > tp->t_maxopd) {
  ....
}

This suggests that the comparison should also have happened in two if's, pointed by the analyzer.

What is more, this function, reduced here to 16 lines, takes up 2268 lines in the original form! It's another possible reason for refactoring ;)

Here's the second warning for the same section:

V793 It is odd that the result of the 'len + optlen' statement is a part of the condition. Perhaps, this statement should have been compared with something else.

Fragment N6

int
ttyinput(int c, struct tty *tp)
{
  ....
  if (tp->t_rawq.c_cc + tp->t_canq.c_cc) {
  ....
}

PVS-Studio warning: V793 It is odd that the result of the 'tp->t_rawq.c_cc + tp->t_canq.c_cc' statement is a part of the condition. Perhaps, this statement should have been compared with something else. tty.c 568

This is a similar case. A little higher in the code, there's one more check. It uses the sum, and also compares the result with another variable:

if (   tp->t_rawq.c_cc + tp->t_canq.c_cc > I_HIGH_WATER – 3 // <=
    && ....) {
  ....
}

In the simplified code, the condition pointed by the analyzer is noticeable. However, in the initial code, it was nested in several if's. So, it's easy to miss it during a code review. Though, the analyzer won't miss it ;)

Fragment N7

errno_t
mbuf_adjustlen(mbuf_t m, int amount)
{
  /* Verify m_len will be valid after adding amount */
  if (amount > 0) {
    int used =  (size_t)mbuf_data(m)
              - (size_t)mbuf_datastart(m)
              + m->m_len;

    if ((size_t)(amount + used) > mbuf_maxlen(m)) {
      ....
    }
  ....
  return 0;
}

PVS-Studio warning: V1028 Possible overflow. Consider casting operands of the 'amount + used' operator to the 'size_t' type, not the result. kpi_mbuf.c

Again, we have an error in the condition, but it's completely different. The addition result is cast to size_t. Here the addition operands must be cast to size_t so that the result fits exactly into the numeric type. If an overflow occurs as a result of the addition, then the meaningless value reduced to size_t will be compared with the result of mbuf_maxlen(m). Since the programmer wanted to prevent an overflow, it must be done correctly:

if ((size_t)amount + used > mbuf_maxlen(m))

There were several warnings of such type. It's better to pay attention to this point.

V1028 Possible overflow. Consider casting operands, not the result. vm_compressor_pager.c 1165
V1028 Possible overflow. Consider casting operands, not the result. vm_compressor_pager.c 1131
V1028 Possible overflow. Consider casting operands, not the result. audit_worker.c 241
V1028 Possible overflow. Consider casting operands of the '((u_int32_t) slp * hz) + 999999' operator to the 'long' type, not the result. tty.c 2199

Fragment N8

int
fdavail(proc_t p, int n)
{
  ....
  char *flags;
  int i;
  int lim;
  ....
  lim = (int)MIN(....);
  if ((i = lim - fdp->fd_nfiles) > 0 && (n -= i) <= 0) //<=
  {
    return 1;
  }
  ....
  for (....)
  {
    if (*fpp == NULL && !(*flags & UF_RESERVED) && --n <= 0)
    {
      return 1;
    }
  }
  return 0;
}

PVS-Studio warning: V1019 Compound assignment expression 'n -= i' is used inside condition. kern_descrip.c_99 3916

To my mind, this code is very difficult to read. Perhaps the condition that the analyzer pointed out should be simplified:

i = lim - fdp->fd_nfiles;
if (i > 0)
{
  n -= i;
  if(n <= 0)
    return 1;
}

This code seems to be less effective, but it certainly is more understandable. To quickly check the equivalence of code's effectiveness, go to Godbolt (Compiler Explorer). By the way, there you can test the work of PVS-Studio diagnostics. The analyzer is easy to find among the tools of this service.

If you do not enable optimization, then the assembly code will be a couple of lines long. Although, there is no difference at all with optimizations. So, it makes no sense to write tricky code here. The compiler will make things right.

However, pay attention to the body of the if. The new n value is not used in it. That is, it's quite possible that no assignment is needed here. Then you can make it like this:

i = lim - fdp->fd_nfiles;
if (i > 0) {
  if(n – i <= 0)
    return 1;
}

Moreover, the source code can lead to an error when the n variable is used further. If the expression (n -= i) < = 0 is false, then the new value of n will be used. Since I haven't worked closely with the source code, it's hard for me to tell which behavior is right.

Fragment N9

static errno_t
vsock_put_message_listening(struct vsockpcb *pcb, 
                            enum vsock_operation op,
                            struct vsock_address src, 
                            struct vsock_address dst)
{
  switch (op)
  {
    case VSOCK_REQUEST:
      ....
      if (....)
      {
        vsock_pcb_safe_reset_address(pcb, dst, src);
        ....
      }
      ....
      done:
        ....
        break;
    case VSOCK_RESET:
      error = vsock_pcb_safe_reset_address(pcb, dst, src);
      break;
    default:
      vsock_pcb_safe_reset_address(pcb, dst, src);
      ....
      break;
  }
  return error;
}

PVS-Studio warning: V764 Possible incorrect order of arguments passed to 'vsock_pcb_safe_reset_address' function: 'dst' and 'src'. vsock_domain.c 549

Maybe it's not an error. However, it is extremely suspicious that the signature of the function called in this fragment looks like this:

static errno_t
vsock_pcb_safe_reset_address(struct vsockpcb *pcb, 
                             struct vsock_address src, 
                             struct vsock_address dst)

When using this function in this fragment, the last two arguments with similar names are passed in a different order.

Here are the warnings on the same fragment:

V764 Possible incorrect order of arguments passed to 'vsock_pcb_safe_reset_address' function: 'dst' and 'src'. vsock_domain.c 587
V764 Possible incorrect order of arguments passed to 'vsock_pcb_safe_reset_address' function: 'dst' and 'src'. vsock_domain.c 590

Fragment N10

int
ifclassq_tbr_set(struct ifclassq *ifq, ....)
{
  struct tb_regulator *tbr;
  ....

  tbr = &ifq->ifcq_tbr;
  ....
  tbr->tbr_rate = TBR_SCALE(rate / 8) / machclk_freq;
  ....
  tbr->tbr_last = read_machclk();

  if (   tbr->tbr_rate > 0               //<=
      && (ifp->if_flags & IFF_UP))
  { 
    ....
  } else {
    ....
  }
  ....
  return 0;
}

PVS-Studio warning: V1051 Consider checking for misprints. It's possible that the 'tbr->tbr_last' should be checked here. classq_subr.c_178.html#ln685

In the project, this diagnostic did not work in the best way. It happened because outside variables were constantly initialized in the code over the body of the condition or loop. These variables had names similar to those used in the condition. Therefore, this time the diagnostic issued several obviously false warnings. The checked tbr_rate field was not used in the condition body. It was initialized 35 lines higher than this check. That's why the warning in question still seems suspicious to me. However, the tbr_last field, initialized right before this check, is not used anywhere else. We can assume that it must be checked instead of the tbr_rate field.

Fragment N11

void
audit_arg_mac_string(struct kaudit_record *ar, ....)
{
  if (ar->k_ar.ar_arg_mac_string == NULL)
  {
    ar->k_ar.ar_arg_mac_string = kheap_alloc(....);
  }
  ....
  if (ar->k_ar.ar_arg_mac_string == NULL)
  {
    if (ar->k_ar.ar_arg_mac_string == NULL) // <=
    {
      return;
    }
  }
  ....
}

PVS-Studio warning: V571 Recurring check. The 'if (ar->k_ar.ar_arg_mac_string == NULL)' condition was already verified in line 245. audit_mac.c 246

PVS-Studio warning: V547 Expression 'ar->k_ar.ar_arg_mac_string == NULL' is always true. audit_mac.c 246

The analyzer issued two warnings for this code at once.

At first, you may notice that the check in the very first if is the same as the check in the second one. Although here everything is correct. Memory is allocated inside the body of the first check. Also, there is an explanation for the second check:

/*
 * XXX This should be a rare event.
 * If kheap_alloc() returns NULL,
 * the system is low on kernel virtual memory. To be
 * consistent with the rest of audit, just return
 * (may need to panic if required to for audit).
 */

Thus, there shouldn't be any internal validation in the second check. We just need to exit the method. So, most likely, the internal check is duplicated accidentally and does not make any sense.

Though maybe some other field should have been checked in the internal check. However, a copy-paste error crept in here. The developer forgot to correct the field name.

Fragment N12

int
utf8_encodestr(....)
{
  u_int16_t ucs_ch;
  int swapbytes = ....;
  ....
  ucs_ch = swapbytes ? OSSwapInt16(*ucsp++) : *ucsp++;
  ....
}

PVS-Studio warning: V567 Undefined behavior. The 'ucsp' variable is modified while being used twice between sequence points. vfs_utfconv.c 298

Macros are very tricky. Perhaps, you've already seen our article "Macro Evil in C++ Code". I usually do not write about warnings on macros. It's difficult to work with them without knowing the project's codebase.

However, this case turned out to be a little bit easier. Although to find the reason for this error and expand the macros chain, I had to fall down the rabbit hole. Actually, the chain begins with the OSSwapInt16(*ucsp++) expression.

0818_XNU_MacOS_Kernel/image10.png

Then, I realized that there was an easier way. I just opened the .i file that remained after the project's check. So, the line with this macro unfolded as follows:

ucs_ch = swapbytes
? ( (__uint16_t)(__builtin_constant_p(*ucsp++)
   ? ((__uint16_t)(  (((__uint16_t)(*ucsp++) & 0xff00U) >> 8)
                   | (((__uint16_t)(*ucsp++) & 0x00ffU) << 8)))
   : _OSSwapInt16(*ucsp++)))
: *ucsp++;

Above all, this section of the expression draws our attention:

  (((__uint16_t)(*ucsp++) & 0xff00U) >> 8)
| (((__uint16_t)(*ucsp++) & 0x00ffU) << 8)

None of the operators in the expression is a sequence point. As we don't know exactly which of the arguments of the | operator will be evaluated first, the value of *uscp is undefined.

For V567 diagnostic, PVS-Studio provides highly detailed documentation. If you wonder why such code can lead to undefined behavior, start with the documentation to explore the problem.

It's not over yet! There's a curious and important point. I bet that the programmer planned to increase the value of *ucsp only once. In fact, the value will increase twice. This process is invisible and not clear. That's why macros are extremely dangerous. In many cases, it's better to write an ordinary function. The compiler is most likely to perform the substitution automatically. So, no performance degradation will occur.

Fragment N13

struct pf_status pf_status;

int
pf_insert_state(struct pf_state *s, ....)
{
  ....
  if (....) {
    s->id = htobe64(pf_status.stateid++);
    ....
  }
  ....
}

PVS-Studio warning: V567 Undefined behavior. The 'pf_status.stateid' variable is modified while being used twice between sequence points. pf.c 1440

Once again, tricky macros stirred things up for increment. Let's take a look at the line with the htobe64 call. After preprocessing, the analyzer found the line suspicious:

s->id = (__builtin_constant_p(pf_status.stateid++) ? 
((__uint64_t)((((__uint64_t)(pf_status.stateid++) &
0xff00000000000000ULL) >> 56) | (((__uint64_t)(pf_status.stateid++) &
0x00ff000000000000ULL) >> 40) | (((__uint64_t)(pf_status.stateid++) &
0x0000ff0000000000ULL) >> 24) | (((__uint64_t)(pf_status.stateid++) &
0x000000ff00000000ULL) >> 8)  | (((__uint64_t)(pf_status.stateid++) &
0x00000000ff000000ULL) << 8)  | (((__uint64_t)(pf_status.stateid++) &
0x0000000000ff0000ULL) << 24) | (((__uint64_t)(pf_status.stateid++) &
0x000000000000ff00ULL) << 40) | (((__uint64_t)(pf_status.stateid++) &
0x00000000000000ffULL) << 56))) : _OSSwapInt64(pf_status.stateid++));

0818_XNU_MacOS_Kernel/image11.png

The problem is actually the same as in the previous example. There are no sequence points in the inner chain with the | and & operands. Therefore, it's unknown what value the pf_status.stateid will take during each operation. The result is also uncertain.

Again, the variable is incremented several times in a row. This process is an unpleasant surprise from the macro :).

Here are the remaining warnings of this project's diagnostic:

V567 Undefined behavior. The 'ip_id' variable is modified while being used twice between sequence points. ip_id.c 186
V567 Undefined behavior. The 'lp' variable is modified while being used twice between sequence points. nfs_boot.c 505
V567 Undefined behavior. The 'lp' variable is modified while being used twice between sequence points. nfs_boot.c 497
V567 Undefined behavior. The 'ip_id' variable is modified while being used twice between sequence points. kdp_udp.c 588
V567 Undefined behavior. The 'ip_id' variable is modified while being used twice between sequence points. kdp_udp.c 665
V567 Undefined behavior. The 'ip_id' variable is modified while being used twice between sequence points. kdp_udp.c 1543

Fragment N14

__private_extern__ boolean_t
ipsec_send_natt_keepalive(....)
{
  ....
  struct udphdr *uh = (__typeof__(uh))(void *)(  (char *)m_mtod(m)
                                                + sizeof(*ip));
  ....
  if (....)
  {
    uh->uh_sport = (u_short)sav->natt_encapsulated_src_port;
  } else {
    uh->uh_sport = htons((u_short)esp_udp_encap_port);
  }
  uh->uh_sport = htons((u_short)esp_udp_encap_port);
  ....
}

PVS-Studio warning: V519 The 'uh->uh_sport' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 4866, 4870. ipsec.c 4870

A suspicious situation occurred in this fragment: the uh_sport field is assigned different values depending on a certain condition. However, immediately after the if-else, the same field is again assigned the same value as in the else branch. As a result, the if-else block loses its meaning since the field value will still be overwritten.

Fragment N15

static kern_return_t
vm_shared_region_slide_page_v3(vm_offset_t vaddr, ....)
{
  ....
  uint8_t *page_content = (uint8_t *)vaddr;
  uint16_t page_entry;
  ....
  uint8_t* rebaseLocation = page_content;
  uint64_t delta = page_entry;
  do {
    rebaseLocation += delta;
    uint64_t value;
    memcpy(&value, rebaseLocation, sizeof(value));
    ....
    bool isBind = (value & (1ULL << 62)) == 1;   // <=
    if (isBind) {
      return KERN_FAILURE;
    }
    ....
  } while (delta != 0);
  ....
}

PVS-Studio warning: V547 Expression '(value & (1ULL << 62)) == 1' is always false. vm_shared_region.c 2820

The code is long because there are lots of variables. However, we are interested in the line with the isBind initialization, which I marked. Let's view this expression step by step.

The bitwise shift results in the mask created with a single unit in the 63rd bit. The result of the bitwise & with the value variable can only take the values 0 or 0x4000000000000000. So, none of these values is equal to 1. Therefore, the condition will always be false.

This condition must make the function return KERN_FAILURE. Thus, we may assume that the value 0x4000000000000000 is the most exceptional case, after which we need to exit the function. Then, the result of bitwise operations had to be compared with this number, not with 1. Well, it may be written as follows:

bool isBind = (value & (1ULL << 62)) != 0;

Fragment N16

int
vn_path_package_check(char *path, int pathlen, ....)
{
  char *ptr, *end;
  int comp = 0;
  ....
  end = path + 1;
  while (end < path + pathlen && *end != '\0') {
    while (end < path + pathlen && *end == '/' && *end != '\0') {
      end++;
    }
    ptr = end;

    while (end < path + pathlen && *end != '/' && *end != '\0') {
      end++;
    }
    ....
  }
  ....
}

PVS-Studio warning: V590 Consider inspecting this expression. The expression is excessive or contains a misprint. vfs_subr.c 3589

This diagnostic always indicates redundant code. Sometimes it hides a more serious error. However, here it's most likely to be just a flaw. The warning was issued on the first internal while. It makes no sense to check that the character is both equal to '/' and not equal to '\0'. Only the first check is enough, since if *end is equal to '/', then it can't exactly be '\0'.

The next while contains the same number of checks. Although, the inequality is checked in both cases. These checks can work together. Perhaps, the second while was written first, and the first was copied with a modified check for '/'. Then, we have the flaw that arose because of the copy paste.

Conclusion

This time the analyzer found fewer errors than in the previous check. It is highly likely that static analysis and other code quality control tools were used into the XNU development process. Almost certainly the project uses Clang Static Analyzer. However, PVS-Studio found errors and flaws anyway. Well, I didn't cite all the warnings for suspicious places. Because the conclusion on some of them can be made only based on a greater understanding of the code base.

However, even these fragments show that such an extremely important project, undoubtedly developed by professionals, needs tools for code quality control.

In case you are wondering what errors may be found by static analysis in general and by PVS-Studio specifically, check our articles describing errors found in open-source projects. There are code checks not only for operating systems, but also, for compilers and other programming tools that you may use on a daily basis. For example, recently we've published an article about defects in Qt 6.

Hubs:

MacOS Kernel, How Good Is This Apple?

What kind of project is it: Apple and open-source?

Previous check

New findings

Conclusion

Articles

Information