Improving priority queues: Analysis of branching factor

Now that we know how d-way heaps work, the next question we need to ask is this: Wouldn’t we be just fine with a regular, binary heap? Is there an advantage in a higher branching factor?

1. Do we need d-ary heaps?

Usually binary heaps are enough for all our programming needs. The main advantage of this data structure is that it guarantees a logarithmic running time for each of the common operations. In particular, being a balanced binary tree, its main operations are guaranteed to require a number of comparisons proportional, in the worst case, to log2(N). As we discuss in appendix B, this guarantees that we can run these methods on much larger containers than if the running time were linear. Consider that even with a billion elements, log2(N) evaluates to just about 30.

As we have seen in the introduction, constant factors are irrelevant in the asymptotic analysis of running time, that is, O(c*N) = O(N), and we know from algebra that two logarithms with different bases only differ by a constant factor; in particular

logb(N) = log2(N) / log2(b)

So, in conclusion, we have

O(log2(N)) = O(log3(N)) = O(log(N))
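To see this constant factor concretely, here is a quick check in Python (not from the book, just an illustration) comparing base-2 and base-4 logarithms of a billion elements:

    import math

    N = 1_000_000_000  # a billion elements

    log2_N = math.log2(N)       # ~29.9
    log4_N = math.log(N, 4)     # ~14.95
    ratio = log2_N / log4_N     # equals log2(4) = 2, independent of N

    print(f"log2(N) = {log2_N:.2f}")
    print(f"log4(N) = {log4_N:.2f}")
    print(f"ratio   = {ratio:.2f}")  # always 2, whatever N is

The ratio between the two is always exactly log2(4) = 2, no matter how large N grows, which is precisely why the two bounds are the same in big-O notation.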

When we move to the implementation, however, constant factors matter. They matter so much that, in some edge cases, algorithms that would be better according to the running time analysis are actually slower than a simpler algorithm with a worse running time, at least for any practical input (for instance, if you compare 2^n and n*100^100000, then for the constant factor not to matter anymore, the input would have to be huge).

A prominent example of this behavior is given by Fibonacci heaps[1]: in theory they provide amortized constant time for some crucial operations such as insert or priority update, but in practice they are both complicated to implement and slow for any viable input size.

The constant factors, in general, are due to several different reasons that include

  • Lag for reading/writing to memory (scattered vs. localized reads)
  • The cost of maintaining counters and iterating over loops
  • The cost of recursion
  • Nitty-gritty coding details that are abstracted away in the asymptotic analysis (for instance, as we have seen, static vs. dynamic arrays)

So, at this point it should be clear that in any implementation we should strive to keep these constant multipliers as small as possible.

Consider this formula:

logb(N) = log2(N) / log2(b)

If b > 2, it’s apparent that logb(N) < log2(N). Therefore, if we have a logarithmic factor in our algorithm’s running time, and we manage to provide an implementation that requires logb(N) steps instead of log2(N), while all other factors stay unchanged, then we will have obtained a constant-factor speed-up.

In section 2.10 we will further investigate how this applies to d-ary heaps.

2. Running time

So the answer is yes, there is an advantage in tuning the heap’s branching factor, but compromise is the key.

Insertion will always be quicker with larger branching factors: at worst we need to bubble up the new element all the way to the root, with O(logD(n)) comparisons and swaps.

The branching factor will, instead, affect deletion and priority update. If you recall the algorithm for popping elements, at each node we first need to find the highest priority among all its children, and then compare it to the element we are pushing down.

The larger the branching factor, the smaller the height of the tree (it shrinks logarithmically with the branching factor). But, on the other hand, the number of children to compare at each level also grows linearly with the branching factor. As you can imagine, a branching factor of 1000 wouldn’t work very well (for fewer than 1001 elements, it would translate into a linear search!).
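To make the tradeoff concrete, here is a minimal sketch of a d-ary max-heap over a plain Python list. The class and method names (DaryHeap, _bubble_up, _push_down) are mine, not the book’s, and priorities are compared directly; it only illustrates where the branching factor D enters the two operations.

    class DaryHeap:
        """Minimal d-ary max-heap sketch; elements are compared directly."""

        def __init__(self, branching_factor=4):
            self.d = branching_factor
            self.data = []

        def _parent(self, i):
            return (i - 1) // self.d

        def _children(self, i):
            # Children of node i sit in one contiguous block of the array.
            first = self.d * i + 1
            return range(first, min(first + self.d, len(self.data)))

        def push(self, value):
            self.data.append(value)
            self._bubble_up(len(self.data) - 1)

        def pop(self):
            # Assumes a non-empty heap.
            top = self.data[0]
            last = self.data.pop()
            if self.data:
                self.data[0] = last
                self._push_down(0)
            return top

        def _bubble_up(self, i):
            # At most one comparison per level: O(log_D(n)) overall.
            while i > 0 and self.data[i] > self.data[self._parent(i)]:
                p = self._parent(i)
                self.data[i], self.data[p] = self.data[p], self.data[i]
                i = p

        def _push_down(self, i):
            # Up to D comparisons per level to find the largest child:
            # O(D * log_D(n)) overall.
            while True:
                children = self._children(i)
                if not children:
                    return
                largest = max(children, key=lambda c: self.data[c])
                if self.data[largest] <= self.data[i]:
                    return
                self.data[i], self.data[largest] = self.data[largest], self.data[i]
                i = largest

    if __name__ == "__main__":
        h = DaryHeap(branching_factor=4)
        for x in [3, 10, 7, 1, 8]:
            h.push(x)
        print([h.pop() for _ in range(5)])  # [10, 8, 7, 3, 1]

Notice how _bubble_up does a single comparison per level while _push_down scans up to D children per level: this is exactly the tension between a shorter tree and more work per node.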

In practice, profiling and performance tests have led to the conclusion that in most situations D=4 is the best compromise.

3. Finding the optimal branching factor

If you are looking for an optimal value for D that works in every situation, then you are going to be disappointed. To a certain extent, theory comes to the rescue by showing us a range of optimal values. It can be shown that the optimal value can’t be greater than 5. Or, to put it another way, it can be mathematically proven that

  • The tradeoff between insert and delete is best balanced with 2 <= D <= 5.
  • A 3-way heap is in theory faster than a 2-way heap.
  • 4-way heaps and 3-way heaps have similar performance.
  • 5-way heaps are a bit slower.

In practice, the best value for D depends on the details of the implementation and of the data you are going to store in the heap. There is no overall optimal branching factor: it can only be determined empirically, case by case, and it depends on the actual data, on the ratio of insertions to deletions, and, for instance, on how expensive it is to compute priority versus copying elements, among other things.

In common experience, binary heaps are never the fastest, 5-way heaps are only seldom faster (in narrow domains), and the best choice usually falls between 3 and 4, depending on the details.

So, while I feel safe suggesting you start with a branching factor of 4, if this data structure is used in a key section of your application and a small performance improvement can make a relevant difference, then you need to tune the branching factor as a parameter.
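As a starting point for that kind of tuning, a rough benchmark along these lines can already reveal which branching factor wins on your data. This sketch reuses the DaryHeap class from the earlier listing; the 50/50 mix of random pushes and pops and the operation count are illustrative assumptions, and a real test should mirror your actual workload.

    import random
    import time

    def benchmark(branching_factor, n_ops=200_000, seed=42):
        """Time a mixed workload of pushes and pops for a given branching factor."""
        random.seed(seed)
        heap = DaryHeap(branching_factor)   # the sketch class defined earlier
        start = time.perf_counter()
        for _ in range(n_ops):
            if heap.data and random.random() < 0.5:
                heap.pop()
            else:
                heap.push(random.random())
        return time.perf_counter() - start

    for d in (2, 3, 4, 5, 8):
        print(f"D={d}: {benchmark(d):.3f}s")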

4. Branching factor vs memory

Notice that I suggested the larger of the two best-performing branching factors for a reason. It turns out there is, in fact, another consideration to be made when choosing the optimal branching factor for a heap: locality of reference.

When the size of the heap is larger than the available cache or the available memory, or in any case where caching and multiple levels of storage are involved, a binary heap on average incurs more cache misses or page faults than a d-ary heap. Intuitively, this is because children are stored in clusters, and on update or deletion, all the children of every node reached will have to be examined. The larger the branching factor, the shorter and wider the heap becomes, and the more the principle of locality applies.
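A back-of-the-envelope illustration (not from the book): during a push-down we visit roughly one node per level, and at each node we scan one contiguous block of D children, so the number of scattered memory regions touched is roughly the height of the heap. The snippet below only counts levels for a billion elements; the exact cache behavior of course depends on element size and hardware.

    import math

    N = 1_000_000_000

    for d in (2, 4, 8, 16):
        height = math.ceil(math.log(N, d))
        print(f"D={d:2d}: ~{height} levels, each reading one contiguous block of {d} children")

A binary heap jumps between about 30 widely separated locations per push-down, while a shorter, wider heap concentrates the same work into far fewer, more localized blocks.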

The d-way heap appears to be the best traditional data structure for reducing page faults. New alternatives focused on reducing cache misses and page swaps have been proposed over the years; for instance, you can check out splay trees.

While these alternatives don’t, in general, achieve the same balance of practical and theoretical performance as heaps, when the cost of page faults or disk access dominates it might be better to resort to a linearithmic algorithm with higher locality than to stick with a linear algorithm with poor locality.

Source: La Rocca, Marcello (2021), Advanced Algorithms and Data Structures, Manning Publications.
