Algorithms in Go: Dutch National Flag / Habr

Most solutions to algorithmic problems can be grouped into a rather small number of patterns. When we start to solve some problem, we need to think about how we would classify them. For example, can we apply fast and slowalgorithmic pattern or do we need to use cyclic sortpattern? Some of the problems have several solutions with different patterns. In this article of series Algorithms in Go we consider an algorithmic pattern that solves an entire class of problems related to array processing.

The flag of the Netherlands consists of three colors: red, white and blue. Given balls of these three colors arranged randomly in a line (it does not matter how many balls there are), the task is to arrange them such that all balls of the same color are together and their collective color groups are in the correct order.

For simplicity instead of colors red, white, and blue we will be dealing with ones, twos and zeroes.

Let's start with our intuition. We have an array of zeroth, ones, and twos. How would we sort it? Well, we could put aside all zeroesinto some bucket, all onesinto another bucket, and all twosinto the third. Then we can fetch all items from the first bucket, then from the second, and from the last bucket, and restore all the items. This approach is perfectly fine and has a great performance. We touch all the elements when we iterate through the array, and then we iterate through all the elements once more when we "reassamble" the array. So, the overall time complexity is O(n) + O(n) ~= O(n). The space complexity is also O(n) as we need to store all items in the buckets.

Can we do better than that? There is no way to improve our time complexity. However, we can think of a more efficient algorithm in regard to space complexity. How would we solve the problem without the additional buckets, i.e in-place.

Let's make a leap of faith and pretend that somehow we were able to process a part of the array. We iterate through part of the array and put encountered zeroes and ones at the beginning of the array, and twos at the end of the array. Now, we switched to the next index i with some unprocessed value x. What should we do there?

In this case, there are three possibilities: x could be equal to 0, 1 or 2.

Let's say x is equal to 2. Where should we put it? We already have some twos at the end of the array, so we need to put current two right before the start of the known twos. We don't know which value is located before the start of known twos. Let's call this value y.

We swap our current value and y. As we don't know the value y at our current index i, it is possible that we need to swap it once more. So we don't proceed to the next value, i.e. we don't increase our running pointer i. After this operation we have:

We note that we now have one more two at the end, so we need to shift the start of the known twos. This fact tells us that we need to have one more pointer that points to the start of the processed twos , let's denote this pointer as high. When we encounter another two we put it right before this pointer.

Now, let's consider the case when x is equal to zero. We need to put this zeroright after the last processed zero

After this operation, we need to shift the end of known zeroes to the right. Therefore, we will have another pointer low that points to the end of known zeroes. When we encounter another zero we put it to the right of this pointer.

We swap the items at indices i and low+1. As index low+1 goes directly after index low that denotes the end of zeroes, the value at index low+1 will always be equal to one.

Now, what do we do with ones? That's a tricky part because, actually, we do nothing. The item is already in place, so we are free to proceed to the next item.

Ok, this could work when we are in the middle of the processing. But how do we start?

The running pointer is initialized with a value equal to zero. We don't have any known twos yet, so the start of twos is equal to the length of the array. In this way, the first encountered two will be put at the very end of the array which seems logical. We don't have any known zeroes either, so the end of zeroes is equal to minus one. Therefore, the first encountered zero will be put at the very beginning of the array and that's exactly what we want.

Right, how do we know where should we stop? We could iterate through the whole array, but at some point the movement becomes redundant. When our running pointer becomes equal to high, we know for sure that the value at that index is equal to two and we have already processed it. Therefore, we can stop there. If we don't have any twos then the value of highis equal to the length of the array, and we stop at the last item.

Finally, we can implement the algorithm in Go:

func Sort(arr []int) {
  low := -1
  high := len(arr)
  
  for i := 0; i < high; {
    val := arr[i]
    switch val {
      case 0:
      	arr[i], arr[low+1] = arr[low+1], arr[i]
        low++
        i++
      case 1:
        i++
      case 2:
        arr[i], arr[high-1] = arr[high-1], arr[i]
        high--
    }
}

In this post, we implemented a solution for Dutch National Flag problem. This is one of the most popular algorithmic patterns that solve an entire class of problems. More algorithmic patterns such as Sliding Window or Iterative Postorder Traversal can be found in the series Algorithms in Go.