19.1 Introduction to Average-Case Running Time

\( \newcommand{\NOT}{\neg} \newcommand{\AND}{\wedge} \newcommand{\OR}{\vee} \newcommand{\XOR}{\oplus} \newcommand{\IMP}{\Rightarrow} \newcommand{\IFF}{\Leftrightarrow} \newcommand{\TRUE}{\text{True}\xspace} \newcommand{\FALSE}{\text{False}\xspace} \newcommand{\IN}{\,{\in}\,} \newcommand{\NOTIN}{\,{\notin}\,} \newcommand{\TO}{\rightarrow} \newcommand{\DIV}{\mid} \newcommand{\NDIV}{\nmid} \newcommand{\MOD}[1]{\pmod{#1}} \newcommand{\MODS}[1]{\ (\text{mod}\ #1)} \newcommand{\N}{\mathbb N} \newcommand{\Z}{\mathbb Z} \newcommand{\Q}{\mathbb Q} \newcommand{\R}{\mathbb R} \newcommand{\C}{\mathbb C} \newcommand{\cA}{\mathcal A} \newcommand{\cB}{\mathcal B} \newcommand{\cC}{\mathcal C} \newcommand{\cD}{\mathcal D} \newcommand{\cE}{\mathcal E} \newcommand{\cF}{\mathcal F} \newcommand{\cG}{\mathcal G} \newcommand{\cH}{\mathcal H} \newcommand{\cI}{\mathcal I} \newcommand{\cJ}{\mathcal J} \newcommand{\cL}{\mathcal L} \newcommand{\cK}{\mathcal K} \newcommand{\cN}{\mathcal N} \newcommand{\cO}{\mathcal O} \newcommand{\cP}{\mathcal P} \newcommand{\cQ}{\mathcal Q} \newcommand{\cS}{\mathcal S} \newcommand{\cT}{\mathcal T} \newcommand{\cV}{\mathcal V} \newcommand{\cW}{\mathcal W} \newcommand{\cZ}{\mathcal Z} \newcommand{\emp}{\emptyset} \newcommand{\bs}{\backslash} \newcommand{\floor}[1]{\left \lfloor #1 \right \rfloor} \newcommand{\ceil}[1]{\left \lceil #1 \right \rceil} \newcommand{\abs}[1]{\left | #1 \right |} \newcommand{\xspace}{} \newcommand{\proofheader}[1]{\underline{\textbf{#1}}} \)

In our study of running-time analysis so far, we’ve divided up algorithms based on whether their running time depends on only the size of their inputs, or whether their running time depends on the actual value of the inputs as well, causing a spread of running times for any fixed input size. In the latter case, we’ve been concerned with the maximum boundary of this spread, which we call the worst-case running time.

However, in practice worst-case running time analysis can be misleading, with a variety of algorithms having a poor worst-case performance still yet performing well on the vast majority of inputs. Some reflection makes this not too surprising. Focusing on the maximum of a set of numbers says very little about the “typical” number in that set, or, more precisely, nothing about the distribution of numbers in that set.

One famous example we touched on in the previous chapter is the quicksort algorithm. Our implementation (and most implementations in practice) has a worst-case running time of \(\Theta(n^2)\), but in practice this algorithm runs faster than mergesort on most inputs, even though the latter has a running time of \(\Theta(n \log n)\). We said that the average-case running time of quicksort is \(\Theta(n \log n)\), and in this chapter, we’ll see precisely what this means. Though we won’t be looking at quicksort, but instead restricting ourselves to simpler examples. You’ll study the average-case running time of quicksort in CSC263/CSC265!

In this section, we’ll introduce a formal definition of average-case running time, and then in the next section we’ll look at a few examples of analyzing the average-case running time of a function.

Running time sets, worst-case, and average-case

First, let’s recall some definitions from Chapter 9 Analyzing Algorithm Running Time. Let func be a program. For each \(n \in \N\), we can define the set \(\cI_{\texttt{func},n}\) to be the set of all inputs to func of size \(n\). From this set, we can define the worst-case running time of func as the function

\[ WC_{\texttt{func}}(n) = \max \big\{ \text{running time of executing } \texttt{func}(x) \mid x \in \cI_{\texttt{func}, n} \big\} \]

In English, we can say that \(WC_{\texttt{func}}(n)\) is the maximum running time of func on an input of size \(n\).

We define the average-case running time of func in a similar way, except now taking the average rather than maximum of the set of running times. Formally, the average-case running time of func is defined as the function \(Avg_{\texttt{func}} : \N \to \R^+\), where

\[ Avg_{\texttt{func}}(n) = \frac{1}{|\cI_{\texttt{func}, n}|} \times \sum_{x \in \cI_{\texttt{func}, n}} \text{running time of } \texttt{func}(x) \]

Notice that when there is only one input per size, or when all inputs of a given size have the same running time, the average-case running time is the same as the worst-case running time, as there is no spread. However, as we’ll see in the next section, where there is a spread of running times, we need to take them all into account when calculating this average.