1 Combinatorics

1.1 Some basic mathematical notation

Summation: \[ \sum_{i=1}^n x_i = x_1 + x_2 + \ldots + x_n \]

Product: \[ \begin{split} \prod_{i=1}^n x_i &= x_1 \times x_2 \times \ldots \times x_n\\ &= x_1 x_2 \ldots x_n \end{split} \] The multiplication sign $\times$ between the factors is usually omitted unless it is needed for clarity.

Indicator function (in Iverson bracket notation): \[ [A] = \begin{cases} 1 & \text{if $A$ is true}\\ 0 & \text{if $A$ is not true}\\ \end{cases} \]
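
As a small illustration, the three notations above map directly onto short Python expressions. This is only a sketch: the list `xs` and the helper name `iverson` are made up for this example and are not part of the notes.

```python
import math

# Hypothetical example data x_1, ..., x_n (here n = 5).
xs = [3, 1, 4, 1, 5]

# Summation: x_1 + x_2 + ... + x_n
total = sum(xs)

# Product: x_1 * x_2 * ... * x_n (math.prod requires Python 3.8+)
product = math.prod(xs)

def iverson(statement):
    """Iverson bracket [A]: 1 if the statement A is true, 0 otherwise."""
    return 1 if statement else 0

# Summing indicators counts how often a condition holds,
# here the number of odd entries in xs.
num_odd = sum(iverson(x % 2 == 1) for x in xs)

print(total, product, num_odd)
```

Summing indicator values, as in the last step, is a common way to write a count as a formula.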

Scalar quantity: plain font, typically lower case ($x$, $\theta$, $n$), sometimes upper case ($K$, $R^2$, distribution functions $F$, $P$, $Q$).

Vector quantity: bold font, lower case ($\boldsymbol x$, $\boldsymbol \theta$).

Matrix quantity: bold font, upper case ($\boldsymbol X$, $\boldsymbol \Sigma$).

Sets: plain font, upper case ($\Omega, \mathcal{F}$).

1.2 Number of permutations

The number of possible orderings, or permutations, of $n$ distinct items is the number of ways to put $n$ items in $n$ bins with exactly one item in each bin. It is given by the factorial \[ n! = \prod_{i=1}^n i = 1 \times 2 \times \ldots \times n \] where $n$ is a positive integer. For $n=0$ the factorial is defined as \[ 0! = 1 \] as there is exactly one permutation of zero objects.
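
For small $n$ the count can be checked by brute force. The sketch below enumerates all orderings with the standard library and compares the count with $n!$; the function name `count_orderings` is a hypothetical helper chosen for this example.

```python
import math
from itertools import permutations

def count_orderings(items):
    """Count the orderings of distinct items by explicit enumeration."""
    return sum(1 for _ in permutations(items))

# Compare the brute-force count with the factorial for n = 0, ..., 5.
# For n = 0 the only ordering is the empty one, matching 0! = 1.
for n in range(6):
    print(n, count_orderings(range(n)), math.factorial(n))
```

Enumeration is only feasible for small $n$ because the number of orderings grows as fast as the factorial itself.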

The factorial can also be obtained using the gamma function \[ \Gamma(x) = \int_0^\infty t^{x-1} e^{-t} dt \] which can be viewed as a continuous version of the factorial with $\Gamma(x) = (x-1)!$ for any positive integer $x$.
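
The relation $\Gamma(x) = (x-1)!$ can be checked numerically. The sketch below compares Python's built-in `math.gamma` with the factorial and also evaluates the integral directly with a crude midpoint rule. The helper `gamma_integral`, the truncation point `upper` and the number of `steps` are assumptions made for this illustration, and the rule is only intended for $x \geq 1$, where the integrand is bounded at $t=0$.

```python
import math

def gamma_integral(x, upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the gamma integral, truncated at `upper`.

    Illustrative only: assumes x >= 1 so the integrand is bounded at t = 0.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += t ** (x - 1) * math.exp(-t)
    return total * h

# Gamma(x) versus (x - 1)! for small positive integers x.
for x in range(1, 7):
    print(x, math.gamma(x), gamma_integral(x), math.factorial(x - 1))
```

Unlike the factorial, the gamma function is also defined for non-integer arguments, for example `math.gamma(0.5)` equals $\sqrt{\pi}$.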

1.3 De Moivre-Stirling approximation of the factorial

The factorial is frequently approximated by the following formula derived by Abraham de Moivre (1667–1754) and James Stirling (1692–1770) \[ n! \approx \sqrt{2 \pi} n^{n+\frac{1}{2}} e^{-n} \] or equivalently on the logarithmic scale \[ \log n! \approx \left(n+\frac{1}{2}\right) \log n -n + \frac{1}{2}\log \left( 2 \pi\right) \] The approximation is already good for small $n$ (the relative error in $n!$ is about 8% at $n=1$ and below 1% for $n \geq 10$), although it fails for $n=0$, and it becomes more and more accurate with increasing $n$. For large $n$ the approximation can be simplified to \[ \log n! \approx n \log n -n \]
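
The quality of both versions of the approximation can be inspected numerically. The following sketch compares them with the exact value of $\log n!$, computed as `math.lgamma(n + 1)`; the function names `log_factorial_stirling` and `log_factorial_crude` are hypothetical and introduced only for this example.

```python
import math

def log_factorial_stirling(n):
    """log n! by the full approximation (n + 1/2) log n - n + (1/2) log(2 pi)."""
    return (n + 0.5) * math.log(n) - n + 0.5 * math.log(2 * math.pi)

def log_factorial_crude(n):
    """log n! by the simplified large-n approximation n log n - n."""
    return n * math.log(n) - n

# Absolute errors on the log scale for increasing n.
for n in (1, 2, 5, 10, 100, 1000):
    exact = math.lgamma(n + 1)     # exact log n! via the log-gamma function
    err_full = exact - log_factorial_stirling(n)
    err_crude = exact - log_factorial_crude(n)
    print(n, exact, err_full, err_crude)
```

On the log scale the error of the full formula shrinks with $n$, whereas the absolute error of the simplified formula grows like $\frac{1}{2}\log n$. The simplified version is therefore accurate only in relative terms, which is why it is reserved for large $n$.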

1.4 Multinomial and binomial coefficient

The number of possible permutations of $n$ items of $K$ distinct types, with $n_1$ of type 1, $n_2$ of type 2 and so on, equals the number of ways to put $n$ items into $K$ bins with $n_1$ items in the first bin, $n_2$ in the second and so on. It is given by the multinomial coefficient \[ \binom{n}{n_1, \ldots, n_K} = \frac {n!}{n_1! \, n_2! \, \ldots \, n_K! } \] with $\sum_{k=1}^K n_k = n$ and $K \leq n$. Note that it equals the number of permutations of all items divided by the number of permutations of the items in each bin (or of each type).

If all $n_k=1$ and hence $K=n$ the multinomial coefficient reduces to the factorial.

If there are only two bins / types ($K=2$) the multinomial coefficient becomes the binomial coefficient \[ \binom{n}{n_1} = \binom{n}{n_1, n-n_1} = \frac {n!}{n_1! \, (n - n_1)!} \] which counts the number of ways to choose $n_1$ elements from a set of $n$ elements.
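
The multinomial coefficient and its two special cases can be verified on small examples. In the sketch below, `multinomial` is a hypothetical helper written for this illustration (the Python standard library provides `math.comb` only for the binomial case), and the string `"AABBC"` is made-up example data with $n_1=2$, $n_2=2$ and $n_3=1$.

```python
import math
from itertools import permutations

def multinomial(counts):
    """Multinomial coefficient n! / (n_1! n_2! ... n_K!) with n = sum(counts)."""
    result = math.factorial(sum(counts))
    for c in counts:
        # Each intermediate quotient is itself an integer, so // is exact.
        result //= math.factorial(c)
    return result

# Brute-force check: distinct orderings of 5 items of 3 types (2 A, 2 B, 1 C).
distinct = len(set(permutations("AABBC")))
print(multinomial([2, 2, 1]), distinct)

# All n_k = 1 (K = n): the coefficient reduces to the factorial.
print(multinomial([1, 1, 1, 1]), math.factorial(4))

# K = 2: the coefficient reduces to the binomial coefficient.
print(multinomial([3, 2]), math.comb(5, 3))
```

Collecting the enumerated orderings in a `set` removes the duplicates that arise from swapping items of the same type, which is exactly the division by $n_1! \, n_2! \, \ldots \, n_K!$ in the formula.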

For large $n$ and $n_k$ we can apply the simplified De Moivre-Stirling approximation $\log n! \approx n \log n - n$ to the multinomial coefficient, yielding \[ \log\binom{n}{n_1, \ldots, n_K} \approx - n \sum_{k=1}^K \frac{n_k}{n} \log\left( \frac{n_k}{n} \right) \] Note that this is $n$ times the Shannon entropy of a categorical distribution with $n_k/n$ as class probabilities.
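
To see where the entropy form comes from, substitute $\log n! \approx n \log n - n$ for every factorial: the linear terms cancel because $\sum_{k=1}^K n_k = n$, leaving $n \log n - \sum_{k=1}^K n_k \log n_k = -\sum_{k=1}^K n_k \log(n_k/n)$. The sketch below compares this with the exact log multinomial coefficient. The function names and the example counts are assumptions made for this illustration, and terms with $n_k = 0$ are skipped using the usual convention $0 \log 0 = 0$.

```python
import math

def log_multinomial(counts):
    """Exact log multinomial coefficient, computed via the log-gamma function."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)

def n_times_entropy(counts):
    """n times the Shannon entropy of the class frequencies n_k / n.

    Terms with n_k = 0 are skipped (convention 0 log 0 = 0).
    """
    n = sum(counts)
    return -sum(c * math.log(c / n) for c in counts if c > 0)

# Same class proportions (0.5, 0.3, 0.2) at increasing sample sizes.
for scale in (1, 10, 100, 1000):
    counts = [5 * scale, 3 * scale, 2 * scale]
    exact = log_multinomial(counts)
    approx = n_times_entropy(counts)
    print(sum(counts), exact, approx, approx / exact)
```

The ratio of the two quantities approaches 1 as the counts grow, in line with the requirement that both $n$ and all $n_k$ be large.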