2.1 Inclusion-Exclusion
Roughly speaking, a "sieve method" in enumerative combinatorics is a method for determining the cardinality of a set $S$ that begins with a larger set and somehow subtracts off or cancels out unwanted elements. Sieve methods have two basic variations: (1) We can first approximate our answer with an overcount, and then subtract off an overcounted approximation of our original error, and so on, until after finitely many steps we have "converged" to the correct answer. This method is the combinatorial essence of the Principle of Inclusion-Exclusion, to which this section and the next four are devoted. (2) The elements of the larger set can be weighted in a natural combinatorial way so that the unwanted elements cancel out, leaving only the original set $S$. We discuss this technique in Sections 2.6 and 2.7.
The Principle of Inclusion-Exclusion is one of the fundamental tools of enumerative combinatorics. Abstractly, the Principle of Inclusion-Exclusion amounts to nothing more than computing the inverse of a certain matrix. As such, it is simply a minor result in linear algebra. The beauty of the principle lies not in the result itself, but rather in its wide applicability. We will give several examples of problems that can be solved by Inclusion-Exclusion, some in a rather subtle way. First, we state the principle in its purest form.
2.1.1 Theorem. Let $S$ be an $n$-set. Let $V$ be the $2^n$-dimensional vector space (over some field $K$) of all functions $f\colon 2^S\to K$. Let $\phi\colon V\to V$ be the linear transformation defined by \begin{equation} \phi f(T) = \sum_{Y\supseteq T} f(Y), \text{ for all $T\subseteq S$.} \end{equation} Then $\phi^{-1}$ exists and is given by \begin{equation} \phi^{-1}f(T) = \sum_{Y\supseteq T} (-1)^{\#(Y-T)}f(Y), \text{ for all $T\subseteq S$.} \end{equation}

Proof. Define $\psi\colon V\to V$ by $\psi f(T) = \sum_{Y\supseteq T}(-1)^{\#(Y-T)}f(Y)$. Then (composing functions right to left)
\begin{aligned} \phi\psi f(T) &= \sum_{Y\supseteq T}(-1)^{\#(Y-T)}\phi f(Y) \\ &= \sum_{Y\supseteq T} (-1)^{\#(Y-T)}\sum_{Z\supseteq Y} f(Z)\\ &= \sum_{Z\supseteq T} \left(\sum_{Z\supseteq Y\supseteq T} (-1)^{\#(Y-T)}\right) f(Z). \end{aligned}

Note: "composing functions right to left" should be read here as follows: in $\phi\psi f(T)$, the map $\phi$ acts on $f$ first, and $\psi$ then acts on $\phi f$. This order differs from the usual convention for composing functions, which is somewhat unusual.
Setting $m = \# (Z- T)$, we have
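\begin{equation*} \sum_{\substack{Z\supseteq Y \supseteq T\\ (Z,T\ \mathrm{fixed})}} (-1)^{\#(Y-T)} = \sum_{i = 0}^{m} (-1)^{i}\binom{m}{i} = \delta_{0m}, \end{equation*} so $\phi\psi\, f(T) = f(T)$. Hence, $\phi\psi f = f$, so $\psi = \phi^{-1}$.

Note: the proof suggests that $\supseteq$ can be replaced by a more general partial order $\le$, provided the sign $(-1)^{\#(Y-T)}$ is replaced by the corresponding Möbius function; the binomial cancellation used above is specific to subset inclusion.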
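As an illustration of Theorem 2.1.1 (a sketch of one small case, not part of the proof), take $S = \{1,2\}$ and write $g = \phi f$, where the name $g$ is introduced only for this example. Then
\begin{equation*} \begin{aligned} g(\emptyset) &= f(\emptyset) + f(\{1\}) + f(\{2\}) + f(\{1,2\}), \\ g(\{1\}) &= f(\{1\}) + f(\{1,2\}), \\ g(\{2\}) &= f(\{2\}) + f(\{1,2\}), \\ g(\{1,2\}) &= f(\{1,2\}), \end{aligned} \end{equation*}
and the formula for $\phi^{-1}$ at $T = \emptyset$ gives
\begin{equation*} g(\emptyset) - g(\{1\}) - g(\{2\}) + g(\{1,2\}) = f(\emptyset). \end{equation*}
Indeed, $f(\{1\})$ and $f(\{2\})$ each cancel once, while $f(\{1,2\})$ appears with total coefficient $1 - 1 - 1 + 1 = 0$, which is the case $m = 2$ of the alternating binomial sum above.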
The following is the usual combinatorial situation involving Theorem 2.1.1. We think of $S$ as being a set of properties that the elements of some given set $A$ of objects may or may not have. For any subset $T$ of $S$, let $f_=(T)$ be the number of objects in $A$ that have exactly the properties in $T$ (so they fail to have the properties in $\overline T = S - T$). More generally, if $w\colon A\to K$ is any weight function on $A$ with values in a field (or abelian group) $K$, then one could set $f_=(T) = \sum_x w(x)$, where $x$ ranges over all objects in $A$ having exactly the properties in $T$. Let $f_\ge(T)$ be the number of objects in $A$ that have at least the properties in $T$. Clearly then,
\begin{equation} f_\ge(T) = \sum_{Y\supseteq T} f_=(Y). \label{E:f_\ge(T)} \end{equation} Hence by Theorem 2.1.1,
\begin{equation} f_=(T) = \sum_{Y\supseteq T}(-1)^{\#(Y-T)}f_\ge(Y). \label{E:4} \end{equation} In particular, the number of objects having none of the properties in $S$ is given by \begin{equation} f_=(\emptyset) = \sum_{Y}(-1)^{\#Y}f_\ge(Y), \label{E:5} \end{equation} where $Y$ ranges over all subsets of $S$. In typical applications of the Principle of Inclusion-Exclusion, it will be relatively easy to compute $f_\ge(Y)$ for $Y\subseteq S$, so equation \eqref{E:4} will yield a formula for $f_=(T)$.

In equation \eqref{E:4} one thinks of $f_\ge(T)$ (the term indexed by $Y = T$) as being a first approximation to $f_=(T)$. We then subtract
\begin{equation*} \sum_{\substack{Y\supseteq T\\ \#(Y-T) = 1}} f_\ge(Y), \end{equation*} to get a better approximation. Next we add back in \begin{equation*} \sum_{\substack{Y\supseteq T\\ \#(Y-T) = 2}} f_{\ge}(Y), \end{equation*} and so on, until finally reaching the explicit formula \eqref{E:4}. This reasoning explains the terminology "Inclusion-Exclusion."

Perhaps the most standard formulation of the Principle of Inclusion-Exclusion is one that dispenses with the set $S$ of properties per se, and just considers subsets of $A$. Thus, let $A_1, \dots, A_n$ be subsets of a finite set $A$. For each subset $T$ of $[n]$, let
\begin{equation*} A_T = \bigcap_{i\in T} A_i \end{equation*} (with $A_\emptyset = A$), and for $0\le k\le n$ set \begin{equation} S_k = \sum_{\#T = k} \# A_T, \label{E:S_k} \end{equation} the sum of the cardinalities, or more generally the weighted cardinalities \begin{equation*} w(A_T) = \sum_{x\in A_T} w(x), \end{equation*} of all $k$-tuple intersections of the $A_i$'s. Think of $A_i$ as defining a property $P_i$ by the condition that $x\in A$ satisfies $P_i$ if and only if $x\in A_i$. Then $A_T$ is just the set of objects in $A$ that have at least the properties in $T$, so by \eqref{E:5} the number $\#(\overline{A_1} \cap \dots \cap\overline{A_n})$ of elements of $A$ lying in none of the $A_i$'s is given by \begin{equation} \#(\overline{A_1} \cap \dots \cap\overline{A_n}) = S_0 - S_1 + S_2 - \dots + (-1)^{n}S_n, \label{E:7} \end{equation} where $S_0 = \#A_{\emptyset} = \#A$.

The Principle of Inclusion-Exclusion and its various reformulations can be dualized by interchanging $\cap$ and $\cup$, $\subseteq$ and $\supseteq$, and so on, throughout. The dual form of Theorem 2.1.1 states that if
\[ \widetilde{\phi} f(T) = \sum_{Y\subseteq T} f(Y), \quad \text{for all $T\subseteq S$}, \] then $\widetilde{\phi}^{-1}$ exists and is given by \begin{equation*} \widetilde{\phi}^{-1} f(T) = \sum_{Y\subseteq T} (-1)^{\#(T-Y)} f(Y), \quad\text{for all $T \subseteq S$}. \end{equation*} Similarly, if we let $f_\le(T)$ be the (weighted) number of objects of $A$ having at most the properties in $T$, then \begin{equation} \begin{aligned} f_\le(T) &= \sum_{Y\subseteq T}f_=(Y), \\ f_=(T) &= \sum_{Y\subseteq T} (-1)^{\#(T-Y)} f_\le(Y). \end{aligned}\label{E:8} \end{equation}

A common special case of the Principle of Inclusion-Exclusion occurs when the function $f_=$ satisfies $f_=(T) = f_=(T')$ whenever $\#T = \#T'$. Thus also $f_\ge(T)$ depends only on $\#T$, and we set $a(n-i) = f_=(T)$ and $b(n-i) = f_\ge(T)$ whenever $\#T= i$. (Caveat. In many problems the set $A$ of objects and the set $S$ of properties will depend on a parameter $p$, and the functions $a(i)$ and $b(i)$ may depend on $p$. Thus, for example, $a(0)$ and $b(0)$ are the number of objects having all the properties, and this number may certainly depend on $p$. Proposition 2.2.2 is devoted to the situation when $a(i)$ and $b(i)$ are independent of $p$.) We thus obtain from equations \eqref{E:f_\ge(T)} and \eqref{E:4} the equivalence of the formulas
\begin{align}
b(m) &= \sum_{i= 0}^m \binom{m}{i} a(i), \quad 0\le m\le n, \label{E:9} \\ a(m) &= \sum_{i=0}^m \binom{m}{i} (-1)^{m-i} b(i), \quad 0\le m \le n. \label{E:10} \end{align}

In other words, the inverse of the $(n+1)\times(n+1)$ matrix whose $(i,j)$-entry $(0\le i, j\le n)$ is $\binom{j}{i}$ has $(i,j)$-entry $(-1)^{j-i}\binom{j}{i}$. For instance,
\[ \begin{bmatrix} 1 & 1 & 1 & 1\\ 0 & 1 & 2 & 3\\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -2 & 3 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 1 \end{bmatrix} . \] Of course, we may let $n$ approach $\infty$ so that \eqref{E:9} and \eqref{E:10} are equivalent for $n = \infty$.

Note that in the language of the calculus of finite differences, \eqref{E:10} can be rewritten as
\[ a(m) = \Delta^m b(0), \quad 0 \le m \le n. \]
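To make this reformulation concrete, here is a small numerical check. The values $b(i) = i!$ are chosen purely for illustration and are an assumption, not taken from the text. Writing $\Delta b(i) = b(i+1) - b(i)$ for the usual forward difference operator, the difference table of $b(0), b(1), b(2), b(3) = 1, 1, 2, 6$ is
\[ \begin{array}{ccccccc} 1 & & 1 & & 2 & & 6 \\ & 0 & & 1 & & 4 & \\ & & 1 & & 3 & & \\ & & & 2 & & & \end{array} \]
Each row lists the differences of the row above, so the leftmost entries give $\Delta^0 b(0) = 1$, $\Delta^1 b(0) = 0$, $\Delta^2 b(0) = 1$, $\Delta^3 b(0) = 2$, that is, $a(0), a(1), a(2), a(3) = 1, 0, 1, 2$. This agrees with \eqref{E:10}, for instance
\[ a(3) = b(3) - 3b(2) + 3b(1) - b(0) = 6 - 6 + 3 - 1 = 2, \]
and substituting back into \eqref{E:9} recovers
\[ b(3) = a(0) + 3a(1) + 3a(2) + a(3) = 1 + 0 + 3 + 2 = 6. \]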