1 Combinatorics
1.1 Some basic mathematical notation
Scalar quantity: plain font, typically lower case (\(x\), \(\theta\), n), sometimes upper case (\(K\), \(R^2\), distribution functions \(F\), \(P\), \(Q\)).
Sets: plain font, upper case (\(\Omega, \mathcal{F}\))
Vector quantity: bold font, lower case (\(\boldsymbol x\), \(\boldsymbol \theta\)).
Matrix quantity: bold font, upper case (\(\boldsymbol X\), \(\boldsymbol \Sigma\)).
Summation: \[ \sum_{i=1}^n x_i = x_1 + x_2 + \ldots + x_n \]
Product: \[ \begin{split} \prod_{i=1}^n x_i &= x_1 \times x_2 \times \ldots \times x_n\\ &= x_1 x_2 \ldots x_n \end{split} \] The multiplication sign \(\times\) between the factors is usually omitted unless it is needed for clarity.
Indicator function (in Iverson bracket notation): \[ [A] = \begin{cases} 1 & \text{if $A$ is true}\\ 0 & \text{if $A$ is not true}\\ \end{cases} \]
1.2 Number of permutations
A permutation or ordering is a specific arrangement of items in a sequence, or equivalently, a specific assignment to labelled positions.
The factorial \[ n! = \prod_{i=1}^n i = 1 \times 2 \times \ldots \times n \] is the number of permutations of \(n\) distinct items, where \(n\) is a positive integer.
For \(n=0\) the factorial is defined as \[ 0! = 1 \]
Thus, the factorial \(n!\) equals the number of ways to place \(n\) distinct items into \(n\) labelled boxes so that each box contains exactly one item.
1.3 Multinomial and binomial coefficient
The multinomial coefficient for \(K\) groups \[ \begin{split} W_K &= \binom{n}{n_1, \ldots, n_K} \\ &= \frac {n!}{n_1! \, n_2! \, \ldots \, n_K! } \end{split} \] is the number of permutations of \(n\) distinct items allocated to \(K\leq n\) groups, with \(n_k\) unordered items in group \(k\) and \(\sum_{k=1}^K n_k=n\).
Thus, the multinomial coefficient \(W_K\) equals the number of ways to place \(n\) distinct items into \(K\) labelled boxes so that box \(k\) contains exactly \(n_k\) unordered items, with \(\sum_{k=1}^K n_k=n\).
For \(n_k=1\) (and thus \(K=n\)) the multinomial coefficient reduces to the factorial.
For two groups (\(K=2\)) the multinomial coefficient becomes the binomial coefficient \[ \begin{split} W_2 &= \binom{n}{n_1, n_2} = \binom{n}{n_1, n- n_1}\\ &= \frac {n!}{n_1! \, (n - n_1)!}\\ & = \binom{n}{n_1} \end{split} \]
1.4 De Moivre-Sterling approximation
The factorial is frequently approximated by the following formula derived by Abraham de Moivre (1667–1754) and James Stirling (1692-1770) \[ n! \approx \sqrt{2 \pi} n^{n+\frac{1}{2}} e^{-n} \] or equivalently on logarithmic scale \[ \log n! \approx \left(n+\frac{1}{2}\right) \log n -n + \frac{1}{2}\log \left( 2 \pi\right) \] The approximation is good for small \(n\) (but fails for \(n=0\)) and becomes more and more accurate with increasing \(n\). For large \(n\) the approximation can be simplified to \[ \log n! \approx n \log n -n \]
The de Moivre-Sterling approximation applied to the multinomial coefficient yields \[ \log W_K \approx - n \sum_{k=1}^K \frac{n_k}{n} \log\left( \frac{n_k}{n} \right) \] Hence, for large \(n\) and large \(n_k\) the logarithm of the multinomial coefficient equals \(n\) times the Shannon-Gibbs entropy of a categorical distribution with class probabilities \(n_k/n\).