LN 2

MTH 318: COMBINATORICS
KRISHNA KAIPA
Abstract. Class notes for the course MTH318 Combinatorics at IISER-Pune during August se-
mester of 2019.
1. Introduction
The course policy can be found on the course webpage
www.iiserpune.ac.in/~kaipa/teaching/MTH318.
In this introductory lecture, we gave an overview of the contents of the course. Then we discussed
two problems without going into details, in order to motivate the subject and its methods.
(1) Let bn,k denote the number of ways to pick k objects out of a set of n-objects with repetition
(i.e. with replacement). This is also the number of ‘monomials’ X1a1 X2a2 . . . Xnan with ai
non-negative integers satisfying a1 + · · · + an = k. We talked about the generating functions
method to obtain this number bn,k .
Consider the ‘formal power series’ Bn (x) = k≥0 bn,k xk .
P
ClearlyP b1,k = 1 for all k ≥ 0 and hence B1 (x) = 1 + x + x2 + . . . . Then we showed that
bn,k = j bn−1,j by considering the possible values for an in the monomials interpretation
above. This gives Bn (x) = (1 + x + x2 + . . . )Bn−1 (x). Thus Bn (x) = (1 + x + x2 + . . . )n .
Since (1 + x + x2 + . . . )(1 − x) = 1, it follows that Bn (x)(1 − x)n = 1. From this, it can be
shown (see §2 below) that bn,k = n+k−1

k
.
(2) The number of derangements of a set S of size n. A derangement is a permutation (bijective

function) f : S → S such that f (s) 6= s for all s ∈ S. The main technique for determining
the number of derangements is the inclusion-exclusion principle.
1
2 KRISHNA KAIPA
2. Permutations, Combinations, and Formal Power Series

The number of r-permutations of a set of size n.
In other words the number of ordered arrangements of r objects taken from a set of n objects is
n!
n Pr = n(n − 1) . . . (n − r + 1) = .
(n − r)!
This is because there are n ways to pick the first object, n − 1 ways to pick the second object, and
so on until we get to (n − r + 1) ways to pick the r-th object.
The number of r-permutations of a set of size n with repetition or replacement.

In other words the number of r-permutations of a multiset {∞ · X1 , ∞ · X2 , . . . , ∞ · Xn } consisting
of an infinite supply of n distinct objects is simply
nr .
This is because there are n ways to pick the i-th objects for all 1 ≤ i ≤ r.
The number of r-combinations of a set of size n.

In other words the number of size r subsets a size n set is:

n n!
= n(n − 1) . . . (n − r + 1)/r! = .
r r!(n − r)!
To see this we note that n Pr is also equal to the number of ways to pick a subset of size r from a
set of size n, and then
to order the chosen r elements. In other words n Pr = r! nr which gives the
n n n

above formula for r . It is useful to note that r = n−r .
The number of r-combinations of a set of size n with repetition/replacement.

In other words the number of r-combinations of the multiset {∞·X1 , ∞·X2 , . . . , ∞·Xn } (as defined
above). Let us call this number bn,r . This is also the number of monomials X1a1 X2a2 . . . Xnan with ai
non-negative integers satisfying a1 + · · · + an = k. By setting bi = ai + 1 we can say that
bn,r = #{b1 , . . . , bn ∈ N : b1 + b2 + · · · + bn = n + r}.
Consider a string of n + r dots. If we mark n − 1 of the n + r − 1 spaces between the dots, then the
marks separate the n + r dots into n nonempty collection of dots. Conversely, given b1 , . . . , bn ∈ N
satisfying b1 + · · · + bn = n + r, we mark the n − 1 spaces after b1 dots, b1 + b2 dots, and so on upto
b1 + · · · + bn−1 dots. This establishes that

n+r−1 n+r−1
bn,r = = .
n−1 r
We now elaborate on the generating functions approach to obtaining bn,r .
The ring of formal power series

Let R[[X]] denote the set of formal power series of the form a0 + a1 X + a2 X 2 + . . . where ai ∈ R.
The word ‘formal’ means that Pwe do not need to think of the convergence questions of the series.
Given series i≥0 ai X and i≥0 bi X we can add them to get i≥0 (ai + bi )X i , and we can also
i i
P P
MTH 318: COMBINATORICS 3
multiply them to get

X X XX X
( ai X i )( bi X i ) = ai bj X i+j = X k (a0 bk + a1 bk−1 + · · · + ak b0 )
i≥0 i≥0 i≥0 j≥0 k≥0
As an example of multiplication, check that the product of the formal geometric seriesP1 + X +
X + . . . with the formal series 1 − X is just the series 1. More generally the formal series i≥0 ai X i
2
has a multiplicative inverse if and only if a0 6= 0. When a0 6= 0 the multiplicative inverse is unique.
(See
PAssignment 1).PThe formal derivative of the monomial X i is iX i−1 . We define the derivative
of i≥0 ai X to be i≥1 iai X i−1 . Given formal series a(X) and b(X), the product rule
i
d d d
dX
a(X)b(X) = ( dX a(X))b(X) + a(X)( dX b(X))
holds (See Assignment 1). As in calculus, there is also a higher-order derivative version of product
rule (See Assignment 1)
Xn
dn n
di dn−i
dX n a(X)b(X) = i
( dX i a(X))( dX n−i b(X))
i=0
PGenerating function approach to obtaining bn,r As discussed in §1, in terms of Bn (X) =

r 2 n
r≥0 bn,r X , we have Bn (X) = (1 + X + X + . . . ) . Thus Bn (X) is the multiplicative inverse of
n
(1 − X) . Using the higher-derivative product rule on the identity B1 (X)(1 − X) = 1, we get
md d m−1
( dX m B1 (X))(1 − X) − m( dX m−1 B1 (X) = 0.
This gives
d n−1 n−1
( dX n−1 B1 (X))(1 − X) = (n − 1)!B1 (X).
Multiplying by (1 − X) we get
n−1
d n
( dX n−1 B1 (X))(1 − X) = (n − 1)!,
which shows that the unique multiplicative inverse of (1 − X)n is

1 dn−1 2
X n + r − 1
Bn (X) = (1 + X + X + . . . ) = Xr.
n − 1! dX n−1 r≥0
n − 1
n+r−1

Therefore bn,r = r
.
Some practice problems.
P1) (from [Brualdi]) How many seven-digit numbers are there such that the digits are distinct
integers taken from {1, 2, . . . , 9} and such that the digits 5 and 6 do not appear consecutively in
either order?
Ans: The desired number is 9 P7 minus the number of seven-digit numbers with the last condition
replaced by the condition that 5 and 6 appear together. The latter number is (7 P5 ) × (6) × 2, where
7 P5 is the number of five-digit numbers with distinct digits drawn from {1, 2, 3, 4, 7, 9}, and the
factor 6 is for the 6 places in this five-digit number where the symbol 56 or 65 can be inserted, and
the last factor of 2 is for the two choices 56, 65.
4 KRISHNA KAIPA
3. Permutations of Multisets, Equivalence Relations

Practice problems continued
P2) (from [Brualdi]) Ten people, including two who do not wish to sit next to one another, are to
be seated at a round table. How many circular seating arrangements are there?
Ans: Without the restriction of the two people (call them A and B) not wanting to sit next to
each other, the answer would be 10!/10 = 9!. The number arrangements which we do not want, are
(9!/9) × 2 where (9!/9) is the number of circular permutations of nine objects (by treating A and
B as one bundle), and the factor 2 is for the ordering AB, BA. So the answer is 9! − 2 · 8!.
P3) (from [Brualdi]) How many eight-letter words can be constructed by using the 26 letters
{A, B, . . . , Z} if each word contains j letters from the set {A, E, I, O, U }? It is understood that
there isno restriction on the number of times a letter can be used in a word.
Ans: 8j · 5j · 218−j .
P4) (from [Brualdi]) A bakery has eight varieties of biscuits. If a box of biscuits contains 12
biscuits, how manydifferent options are there for a box of biscuits?
Ans: 1912
= n+k−1
k
with n = 8 and k = 12.
P4) (from [Brualdi]) What is the number of integral solutions of the equation x1 +x2 +x3 +x4 = 20
satisfying the constraints x1 ≥ 3, x2 ≥ 1, x3 ≥ 0, x4 ≥ 5 ?
Ans: same as the number of solutions in non-negative integers of the equation y1 +y2 + y3 + y4 = 11
where y1 = x1 − 3, y2 = x2 − 1, y3 = x3 , y4 = x4 − 5. Therefore, the answer is 14
11
= n+k−1
k
with
n = 4 and k = 11.
Permutations of Multisets
Consider a multiset {a1 · X1 , a2 · X2 , . . . , an · Xn } of n distinct objects Xi appearing with multiplicity
ai ∈ N for 1 ≤ i ≤ n. (Here all ai ’s are finite. The number of permutations of this multiset is:
(a1 + · · · + an )!
.
a1 !a2 ! . . . an !
Let m be the desired number. If the multiple copies of each symbol Xi are treated as being distinct,
then the number of permutations is clearly (a1 + · · · + an )!. But this also equals the number m in
question multiplied by the number (a1 !a2 ! . . . an !) of permutations of the ai copies of Xi for each i.
Thus we get the asserted formula for m.
It is complicated to write down a general formula for the number of r-permutations of this mul-
tiset. But for given n, r, ai ’s we can always determine the number
P6) (from [Brualdi]) Consider the multiset S = {3 · x, 2 · y, 4 · z} of nine objects of three types.
Find the number of 8-permutations of S.
Ans: The desired number is the number of 8-permutations of {2 · x, 2 · y, 4 · z} plus the number of
8-permutations {3 · x, 1 · y, 4 · z} plus the number of 8-permutations of {3 · x, 2 · y, 3 · z}. Hence the
answer is 8!/(2!2!4!) + 8!/(3!1!4!) + 8!/(3!2!3!). See also Assignment.
Equivalence Relations on a set

Given a set S, a relation on S is a subset R of S × S. A relation is called an equivalence relation
if i) (a, a) ∈ R for all a ∈ S, ii) (a, b) ∈ R ⇔ (b, a) ∈ R, iii) (a, b), (c, b) ∈ R ⇒ (a, c) ∈ R. The
conditions i), ii),iii) are known as reflexivity, symmetry, and transitivity. Instead of the notation
R ⊂ S × S, we often say a ∼ b whenever (a, b) ∈ R. For a ∈ S, the equivalence class of an element
a is [a] = {b ∈ S : b ∼ a}.
Theorem. The equivalence classes of an equivalence relation ∼ on a set S partition S (into non-
empty parts). Conversely, every partition of S arises from a unique equivalence relation on S.
Proof. Given an equivalence relation ∼ on S, suppose c ∈ [a] ∩ [b] then the fact that c ∼ a, c ∼ b
implies a ∼ b by transitivity. Therefore x ∼ a if and only if x ∼ b, and hence [a] = [b]. Thus, we
have shown that distinct equivalence classes are disjoint. In order to prove that the equivalence
classes partition S, it remains to show that every a ∈ S is in some class. This is true because
a ∈ [a]. `
Conversely, given a partition S = i∈I Si into disjoint parts Si indexed by an indexing set I, we
define a ∼ b if and only if there exists some i ∈ I with a, b ∈ Si . Clearly i) a ∼ a, ii) a ∼ b ⇔ b ∼ a,
iii) a ∼ b, b ∼ c ⇒ a ∼ c. Therefore ∼ is an equivalence relation, and clearly the equivalence classes
are just the Si for i ∈ I.
We note that the equivalence relation constructed (above) from a partition of S, coincides with ∼
when the partition is by equivalence classes of an equivalence relation ∼. Similarly, the partition into
equivalence classes of an equivalence relation ∼ on S constructed (above) from a partition of S coin-
cides with the original partition. This shows that the assignment ∼ 7→ partition by eq. classes of ∼
gives a bijective correspondence between the set of equivalence relations on S, and the set of parti-
tions of S.
6 KRISHNA KAIPA
4. Assignment 1
(1) Given a multiset S = {a1 · x1 , . . . , an · xn } of size m = a1 + · · · + an . Show that the number
of (m − 1) permutations of S is the number of permutations of S.
(2) In how many ways can four men and eight women be seated at a round table if there are to
be two women between consecutive men around the table?
(3) A group of mn people are to be arranged into m teams each with n players.
(a) Determine the number of ways if each team has a different name.
(b) Determine the number of ways if the teams don’t have names.
(4) Prove that a power series i≥0 ai xi has a multiplicative inverse if and only if a0 6= 0. More-
P
over, the multiplicative inverse is unique when it exists.
(5) Prove the product rule and the higher-order derivative product rule for formal power series.
(6) Let n be a positive integer. Suppose we choose a sequence i1 , i2 , . . . , in of integers between

1 and n at random.
(a) What is the probability that the sequence contains exactly n − 2 different integers? [Ans:
n(n − 1)(n − 2)(3n − 5)n!/(48nn ) ]
is the probability that the sequence contains exactly n − 3 different integers? [
(b) What
Ans: n4 n!(n − 2)(n − 3)/(12nn ) ]
(7) Consider the multiset {n · a, n · b, 1, 2, 3, ..., n + 1} of size 3n + 1. Determine the number of

its n-combinations. [Ans: (n + 1)2n .]
(8) Let P be the set of r-permutations of a set S of size m. Put an equivalence relation on P,
such that the set of equivalence classes is
(a) the set of r-combinations of S
(b) the set of circular r-permutations of S
(c) if r = m, the set of permutations of a multiset S 0 = {a1 · X1 , . . . , an · Xn } of size m.
(Given an eq. reln on a set A, we get an equivalence relation on any subset B ⊂ A. We
can take B = P and A = S m )
Show that each equivalence class has the same size, and hence obtain the known formulas
for the sizes of a), b), c).
5. Pigeonhole principle
Pigeonhole principle states that if n > k identical balls are distributed among k boxes, then
there exists a box with at least two balls.
A proof by contradiction is as follows: let ni be the number of balls placed in the i-th box for
1 ≤ i ≤ k. Suppose ni ≤ 1 for all i, then
k
X
n−k = (ni − 1) ≤ 0,
i=1
contradicting n > k.
Some sample applications of the pigeonhole principle.
1) (from [Bona]) A chess tournament has n participants, and any two players play exactly one
game against each other. Then it is true that in any given point of time, there are two players who
have finished the same number of games.
For 0 ≤ k ≤ n − 1, let bk be the number of players who have played k matches at the given point
in time. We have b0 + b1 + · · · + bn−1 = n. If b0 = 0 or if bn−1 = 0, then by pigeonhole principle
(with k = n − 1) we get that some bi ≥ 2. The case in which both b0 , bn−1 > 0 is impossible because
b0 > 0 ⇒ bn−1 = 0.
2) (from [Bona]) There is an element in the sequence {7, 77, 777, 7777, . . . , } that is divisible by
2019.
Let ai demote the i-th element of the sequence. We claim there exists 1 ≤ i ≤ 2019 for which
ai is divisible by 2019. Suppose not, then the list of 2019 remainders ai mod2019 take values in
1, 2, . . . , 2018. By pigeonhole principle, there exist 1 ≤ i < j ≤ 2019 with ai = aj mod2019, i.e.
2019 divides the number 77 . . . 7 × 10i = aj−i · 10i . Since 10i is relatively prime to 2019, it follows
that 2019 divides aj−i contradicting the hypothesis. Therefore, there exists 1 ≤ i ≤ 2019 for which
ai is divisible by 2019.
3) (from [Brualdi]) A chess master who has 11 weeks to prepare for a tournament decides to play
at least one game every day but, to avoid tiring himself, he decides not to play more than 12 games
during any calendar week. Show that there exists a succession of (consecutive) days during which
the chess master will have played exactly 21 games.
Let ai be the number of games played at the end of day i. We have 1 < a1 < a2 < · · · < a77 ≤
12 · 11 − 132. Consider the list of natural numbers:
a1 , a2 , . . . , a77 , a1 + 21, a2 + 21, . . . , a77 + 21
If the list has repetition, the only possibility is that ai + 21 = aj for some i < j, which means that
on the days i + 1, . . . j exactly 21 games have been played. Therefore, it suffices to show that the
list has repetition. Suppose not, then the list has 154 entries. This is impossible because the largest
entry of the list a77 + 21 ≤ 132 + 21 = 153.
8 KRISHNA KAIPA
4) (from[Brualdi]) Chinese Remainder Theorem: Given relatively prime natural numbers m and
n, and integers 0 ≤ a ≤ m − 1 and 0 ≤ b ≤ n. There exists a natural number x such that
x ≡ a mod m and x ≡ b mod n.
Consider the list of n remainders a + jm mod n for 0 ≤ j ≤ n − 1. In case a + im ≡ a + jm mod n

for some 0 ≤ i < j ≤ n − 1, then n divides (j − i)m. Since n is relatively prine to m, it follows that
n divides j − i contradicting the fact that 0 < j − i < n. Thus j 7→ a + jm mod n is a one-one
function from {0, 1, . . . , n − 1} to itself, and therefore is onto. In particular some a + im ≡ b mod n.
The principle used here is analogous to the pigeonhole principle which can be restated as: if A and
B are finite sets with |A| > |B| then a function f : A → B cannot be one-one.
6. Assignment 2
(1) (from [Bona]) (a) The set M consists of nine positive integers, none of which has a prime divi-
sor larger than six. Prove that M has two elements whose product is the square of an integer.
(b)* The set A consists of n + 1 positive integers, none of which have a prime divisor
that is larger than the n-th smallest prime number. Prove that there exists a non-empty
subset B ⊂ A so that the product of the elements of B is a perfect square. (Hint: rank
nullity theorem holds over any field including the binary field {0, 1} with 1 +1 = 0, 1 · 1 = 1.)
(2) (from [Bona]) Prove that among 502 positive integers, there are always two integers so that
either their sum or their difference is divisible by 1000.
(3) (from [Bona]) We select n + 1 different integers from the set {1, 2, . . . , 2n}. Prove that there
will always be two relatively prime integers among the selected integers.
10 KRISHNA KAIPA
7. Pigeon-hole Principle, general version

Pigeon-hole Principle, general version Let n, m and r be positive integers so that n > rm.
Let us distribute n identical balls into m identical boxes. Then there will be at least one box into
which we place at least r + 1 balls.
Example (from [Bona]): Ten points are given within a square of unit size. Then there are two of
them that are closer to each other than 0.48, and there are three of them that can be covered by a
disk of radius 0.5. √
We divide the unit square into 9 equal squares. Each sub-square has diameter 2/3 < 0.48. By
pigeonhole principle, there exists a sub-square with at least 2 points.
Similarly if we divide √
the unit square into four equal sub-squares, then each sub-square has a cir-
cumcircle of radius 1/ 8 < 0.5.
Problem (1) from Assignment 2: a) Label the numbers n1 , . . . , n9 and write nj = 2aj 3bj 5cj . Con-
sider the 3×9 matrix A with entries over {0, 1} whose j-th column is (aj mod 2, bj mod 2, cj mod 2)T .
There are in total 8 elements of {0, 1}3 and hence two of the nine column are identical, say columns
i and j. Then ni nj is a square.
(b) Label the numbers m1 , . . . , mn+1 , and let the first n primes be denoted p1 , . . . , pn . Write
n
a
Y
mj = pi ij .
i=1
Let M be the n × (n + 1) matrix with entries in {0, 1} defined by Mij = aij mod 2. Treating M as
a matrix over the binary field F2 , and using the rank-nullity theorem, the matrix M has nullity at
least 1. So there exists a non-zero vector (c1 , . . . , cn )T ∈ Fn2 such that M c = ~0. Let B be the subset
of A consisting of those mj for which cj 6= 0. It follows that:
X
aij = 0 mod 2, ∀ 1 ≤ i ≤ n.
{j:cj 6=0}
In particular the product of the numbers in B is a square.
We also discussed Problems (4)-(5) from Assignment 1.

8. Binomial Coefficients generalized

For x ∈ R and k ∈ Z we define

0

x
if k < 0
(1) = 1 if k = 0
k  x(x−1)...(x−k+1)

k!
if k > 0
For example, if n is a positive integer then:

−n k n+k−1
= (−1) .
k k
We recall that for a non-negative integer n we have the identity of polynomials:
n
n
X n
(1 + X) = X k.
k=0
k
We also recall Newton’s binomial theorem from calculus which states that for an arbitrary real
number n ∈ R and for −1 < x < 1, the infinite series
X n
xk ,
k≥0
k
converges to the number (1 + x)n .
We recall the identity

x+1 x x
(2) = + .
k k k−1
To see this, we first note that if k < 0 then both sides are 0. If k = 0, then the left side is 1 and
the right side is 1 + 0 = 1. If k > 0, it suffices to prove the polynomial identity

X +1 X X
(3) 0= − − .
k k k−1
where Xk := X(X−1)...(X+1−k)

k!
. This identity is easily checked directly.
There is also a combinatorial proof of the polynomial identity (3) : Since the polynomial on the
right has degree at most k, it suffices to show that this polynomial has more than k roots. We will
show that every natural number is a root. If x = n with n ∈ N, then x+1 k
is thenumber of size k
n
subsets of {1, . . . , n + 1}. The number of such subsets which contain n + 1 is k−1 and the number
n

of such subsets which do not contain (n + 1) is k+1 . Thus each x ∈ N is a root of (3).
The identity (2) can be generalized as below

Lemma 1. For x, y ∈ R and k ∈ Z, we have:
X k
x+y x y
(4) = , x, y ∈ R, k ∈ Z.
k i=0
k − i i
12 KRISHNA KAIPA
Proof.
If k < 0 both sides of the identity are 0. If k = 0, then the left side is 1 and the right-side is
x y
0 0
= 1. If k > 0, then it suffices to show the following identity in the polynomial ring R[X, Y ]:
X k
X +Y X Y
0 = h(X, Y ) := − .
k i=0
k−i i
Note that the (total) degree of h(X, Y ) is at-most k. (This means that any monomial X µ Y ν that
appears in h(X, Y ) must satisfy µ + ν ≤ k). Therefore, we can write
k
X
h(X, Y ) = hi (X)Y k−i ,
i=0
for some polynomials hi (X) ∈ R[X] of degree at most i. We note that h(m, n) = 0 for m, n ∈ N:
indeed any size-k subset of the set {1, . . . , m+n} is the disjoint union
ofP n of {1, . . . , m}
a size i subset
m+n k m
and a size (k − i) subset of {m + 1, . . . , m + n}. Therefore k = i=0 k−i i . In particular
for m ∈ N, the polynomial h(m, Y ) in R[Y ] of degree at most k has infinitely many roots (namely
Y = n for n ∈ N), and is hence the zero polynomial. Therefore
k
X
h(m, Y ) = hi (m)Y k−i = 0,
i=0
(here 0 is the zero polynomial in R[Y ]). This, in turn shows that for each 0 ≤ i ≤ k, the polynomial
hi (X) ∈ R[X] of degree at most i has infinitely many roots (namely X = m for m ∈ N), and is
therefore the zero polynomial. Returning to the expression h(X, Y ) = ki=0 hi (X)Y k−i , we conclude
P
that h(X, Y ) is the zero polynomial in R[X, Y ].
Example: Setting y = −1 and x = n in (4) and using the fact that −1

i
= (−1)i we get the
identity
X k
n−1 k−i n
= (−1) .
k i=0
i
Another generalization of (2) is as follows. Repeatedly using (2) (n + 1)-times (where n is a
non-negative integer), we get:

x+1 x x−1 x−n x−n
(5) = + + ··· + + .
k+1 k k k k+1
In particular for x = n where n is a non-negative integer, k ≥ 0 (or more generally k 6= −1) we
have:
X n
n+1 n n−1 0 j
(6) = + + ··· + = .
k+1 k k k j=k
k
A combinatorial interpretation of (6) is as follows. Given an integer j, there are ji n−j

k−i
subsets
of {1, . . . , n + 1} of size (k + 1) with the property that the (i + 1)-th largest element of the subset
is j + 1. Therefore we get the identity:
X
n+1 j n−j
(7) = .
k+1 j
i k−i
We note that (6) is a special case of (7) for i = k (or i = 0).
The most well known identity about binomial coefficients is the duality

n n
(8) = for n ∈ {0, 1, 2, . . . }, k ∈ Z.
k n−k
This follows from the symmetric expression n!/(k!(n − k)!) for either side of the identity.
Lemma 2. (Unimodality of binomial coefficients) Let 0 ≤ k ≤ n be integers. If n is even then the
maximum value of nk as a function of k occurs for k = n/2, and:

n n n n n
< < ··· < > > ··· > .
0 1 n/2 n/2 − 1 n
If n is odd, then the maximum value of nk as a function of k occurs at k = (n − 1)/2 and

k = (n + 1)/2, and:

n n n n n
< < ··· < = > ··· > .
0 1 (n − 1)/2 (n + 1)/2 n
Proof. This follows from the fact that
n
k − n−1

k k+1 2
n −1= −1=2 ,
k+1
n−k n−k
is negative if k < (n − 1)/2, positive if k > (n − 1)/2 and zero if k = (n − 1)/2.
For n ∈ {0, 1, 2, . . . }, the n-th Pascal matrix An of size (n + 1) × (n + 1) matrix defined by
Aij = j−1i−1
. The matrix An is upper triangular and has ones on the diagonal, and is hence
invertible.
Lemma 3. The ij-th entry of A−1
n is (−1)
i+j
times the ij-th entry of An . In other words,
1 ! 1 !
−1 −1
.. An .. .
. .
(−1)n (−1)n
In long form:
X
i+q p i
(−1) = δpq , p, q ∈ {0, 1, 2, . . . }
i
i q
Proof. Let V be the (n + 1)-dimensional vector space of polynomials in one variable X of degree at
most n and real coefficients:
V = {a0 + a1 X + · · · + an X n : a0 , . . . , an ∈ R}.
The function T : V → V given by T (f (X)) = f (X + 1) is a linear transformation. The matrix of T
with respect to the basis {1, X, X 2 , . . . , X n } is clearly An . The inverse transformation T −1 is given
by T −1 (f (X)) = f (X − 1). Since
(X − 1)j−1 = X j−1 − (j − 1)X j−2 + · · · + (−1)j−1 X 0 ,
it follows that the ij-th entry of the matrix representing T −1 is (−1)i+j j−1

i−1
.
Two other proofs of this result are in the Assignment.
14 KRISHNA KAIPA
9. Assignment 3
(1) Prove Lemma 3 by considering
1 dr
(1 + X)m .
r! dX r |X=−1
(2) Show the long form of Lemma 3 by first proving the identity:

n k n n−m
= .
k m m k−m
(3) Give two proofs –one combinatorial and one algebraic– of the following identities:
P n2
(a) 2n = .
P n k nk
n
(b) k k = 2 .
(c) i pi qi = pq 2p−q , p, q ∈ {0, 1, 2, . . . }.
P
(4) Let n and k be integers with 1 ≤ k ≤ n. Prove that

n
X n n 1 2n + 2 2n
= −
k=1
k k−1 2 n+1 n
(5) (optional) Let Vn be the vector space of polynomials in one variable of degree at most n and
real coefficients. Consider the basis {1, X, X 2 /2!, . . . , X n /n!} for Vn .
(a) Find the inverse of the (n + 1) × (n + 1) upper triangular matrix whose ij-th entry (for
j ≥ i) is 1/(j − i)!.
(b) Let a ∈ R. Find the matrix of the linear transformation Ta : Vn → Vn given by
Ta (f (X)) = f (X + a).
(c) Find the matrix of the linear transformation D : Vn → Vn given by Df (X) = f 0 (X)
(the derivative of f (X)).
(d) Verify that the linear transformation exp(aD) (using the usual series for exp) equals Ta
and interpret this.
(e) If V = R[X], show that Ta = exp(aD) by showing that Ta (f ) = exp(aD)(f ) for any
f ∈ R[X].
10. Sperner’s theorem

Sperner’s theorem Let S = {1, . . . , n}. Let 2S denote the set of all subsets of S. A chain of S
is a subset C of 2S such that if A, B ∈ C then either A ⊂ B or B ⊂ A. The largest possible size of
a chain of S is clearly n, and any such chain is of the form
{x1 } ⊂ {x1 , x2 } ⊂ · · · ⊂ {x1 , . . . , xi } ⊂ · · · ⊂ {x1 , . . . , xn },
where (x1 , . . . , xn ) is a permutation of (1, . . . , n). It follows that there are n! maximal chains of S.
Similarly there are k!(n − k)! maximal chains of S which contain a fixed subset B of S of size k.
An antichain of S is a subset A of 2S such that if A 6⊂ B for any pair of elements A, B ∈ A. For

n

any 1 ≤ k ≤ n, the set of all k-element subsets of S is an antichain of size k .
Theorem 4. (Sperner’s Theorem) The maximum possible size of an antichain of {1, . . . , n} is
n

bn/2c
.
Proof. Let β denote the maximum possible size of an antichain of S = {1, . . . , n}. As noted above,
there do exist antichains of size nk for any 1 ≤ k ≤ n. It follows from Lemma 2 that there exist
n n n

antichains of size bn/2c . Therefore, β ≥ bn/2c . It remains to show that β ≤ bn/2c .
Suppose C is a chain of S and A is an antichain of S. If A, B ∈ C then we have A ⊂ B or B ⊂ A,

and hence at most one of A and B can be an element of A (from the definition of antichain). This
shows that 0 ≤ |C ∩ A| ≤ 1.
Given an antichain A, for each A ∈ A let CA denote the set of all maximal chains of S which
contain A. If A, B ∈ A, then we note that CA and CB are disjoint because, as noted above, a chain
can contain at most one of A and B. Therefore, the quantity
X
|CA |,
A∈A
is the number of those maximal chains of S which have at least one element in common with A.
This number is clearly less than the number of all maximal chains of S, which as observed above is
n!. Hence, we get
X
(9) n! ≥ |CA |
A∈A
For each 1 ≤ k ≤ n, let αk denote the number of elements of A of size k. We must show that

n
≥ |A| = α1 + α2 + · · · + αn .
bn/2c
We also know that if |A| = k, then |CA | = k!(n − k)!. Therefore:
n n
X X X αk
|CA | = αk k!(n − k)! = n! n

A∈A k=1 k=1 k
Using Lemma 2 we get

X n!
(10) |CA | ≥ n
(α1 + α2 + · · · + αn ).
A∈A bn/2c
16 KRISHNA KAIPA
Combining inequalities (9) and (10), we conclude that:

n
(11) ≥ |A| ≥ β.
bn/2c
n

This completes the proof that β = bn/2c .
n

Remark : It can be shown that the only antichains of size bn/2c are: i) the collection of all size
n/2 subsets of S if n is even, ii) the collection of all size (n − 1)/2 subsets of S, and the collection
of all size (n + 1)/2 subsets of S if n is odd. Equality holds in (11) if and only if equality holds in
the inequalities (9) and (10). Equality in (10) holds if and only if all elements of A have size either
(n − 1)/2 or (n + 1)/2 if n is odd, and n/2 if n is even. Equality in (9) holds if and only if each
of the n! maximal chains of S contain a member of A. If n is even, it follows that equality holds
in (11) if and only if A is the set of all n/2-size subsets of S. If n is odd, with a little more work,
n n

it can be shown that either all the sets in A have size (n−1)/2 or all of them have size (n+1)/2 .
(See for example §7.2 in the book Combinatorics: Topics, Techniques, and Algorithms by Peter J.
Cameron)
11. Practice Problems for Quiz 1

(1) How many permutations are there of the letters of the word ADDRESSES? How many 8-
permutations are there of these nine letters?
(2) How many sets of three integers between 1 and 20 are possible if no two consecutive integers
are to be in a set?
(3) Use the pigeonhole principle to prove that the decimal expansion of a rational number m/n
eventually is repeating. For example,
34478/99900 = 0.34512512512512512 . . .
(4) (a) Show that if n + 1 integers are chosen from the set {1, 2, . . . , 2n}, then there are always
two which differ by 1.
(b) Show that if n + 1 distinct integers are chosen from the set {1, 2, . . . , 3n}, then there
are always two which differ by at most 2.
(c) Generalize the previous two parts of this problem.
(5) A collection of subsets of {1, 2, . . . , n} has the property that each pair of subsets has at least
one element in common. Prove that there are at most 2n−1 subsets in the collection.
Pn k n
k
(6) Evaluate k=0 (−1) k 10 .
(7) Find one binomial coefficient equal to the following expression:

n n n n
+3 +3 +
k k−1 k−2 k−3
(8) Determine the number of 12-combinations of the multiset S = {4 · a, 3 · b, 4 · c, 5 · d}.
(9) Determine the number of solutions of the equation X1 + X2 + X3 + X4 = 14 in nonnegative

integers X1 , X2 , X3 , and X4 not exceeding 8.
18 KRISHNA KAIPA
12. Quiz-1
(1) ([5 points]) Consider the multiset {n · a, 1, 2, 3, ..., n} of size 2n. Determine the number of
its n-combinations.
(2) ([5 points]) Find and prove a formula for

X m1 m2 m3
r,s,t≥0, r+s+t=n
r s t
All quantities involved are non-negative integers.
(3) ([5 points]) How many permutations of

FLOCCINAUCINIHILIPILIFICATION = F 2 L3 O2 C 4 I 9 N 3 A2 U 1 H 1 P 1 T 1
are there with no adjacent I’s.
(4) ([5 points]) There are 100 people at a party. Each person has an even number (possibly zero)
of acquaintances . Prove that there are three people at the party with the same number of
acquaintances (i.e. who know each other).
Answers.
(1) An n-combination of the given multiset is the same as the multiset formed by taking a with
multiplicity µ together with a size (n − µ) subset of {1, 2, . . . , n} for some 0 ≤ µ ≤ n. Thus
the desired answer is :
n
X n
= 2n .
µ=0
µ
Alternative Solution: We give another solution which is more complicated and hence not
preferable. But it does illustrate some binomial coefficient identities and generating function
techniques. Let X be the original multiset and let Y be the multiset {∞·a, ∞·1, . . . , ∞·n} of
n+1 distinct objects each having infinite multiplicity. Note that for treating n-combinations
of X, we may assume that the multiplicity of a in X is ∞ instead of n. The number of n-
combinations of Y is (n+1)+n−1 2n

n
= n
. Let Bi ⊂ Y denote the set of those n-combinations
of Y in which the element i of Y is used more than once. Clearly, the set of n-combinations
of X consists of the complement of ∪ni=1 Bi in the set of n-combinations of Y . Also for
I = (i1 , . . . , ik ) with 1 ≤ i1 < · · · < ik ≤ n, we have

(n + 1) + (n − 2k) − 1 2(n − k)
| ∩i∈I Bi | = = ,
n − 2k n
as it is equal to the number of solutions of y0 + y1 + · · · + yn = n − 2k with all yi being
non-negative integers. By the inclusion-exclusion principle, the number of n-combinations
of X is:
2n
Pn k−1 n 2(n−k)
Pn k n
2(n−k) P n−j n 2j

n
− k=1 (−1) k n
= k=0 (−1) n−k n
= j≥0 (−1) j n
.
The last expression is the constant term in the power series

n (1−X)n
Xn n
(X −2 − 1)n ( i≥0 ni X i ) = (1+X)X 2n D (1 − x)−1 ,
P
n!
where D = d/dX. Since Dn (1 − X)−1 = (1 − X)−(n+1) /n!, we need the constant term in the
Laurent-expansion of
(1 + X)n /(X n (1 − X))
which is the same as the coefficient of X n in the power series:
(1 + X)n (1 + X + X 2 + . . . ).
We conclude that the answer is ni=0 ni = 2n .
P
(2) The desired number is the coefficient of X n in
(1 + X)m1 (1 + X)m2 (1 + X)m3 = (1 + X)m1 +m2 +m3 .
Therefore, the answer is m1 +mn2 +m3 .

(3) The desired number is the same as the the number of permutations of the multiset
F 2 L3 O2 C 4 N 3 A2 U 1 H 1 P 1 T 1
of size 20 (we just removed the 9 I’s) multiplied by the number of ways to pick 9 spots
from the 21 spots consisting of the positions preceding, following and intermediate to the
20 letters created in step 1. This works out to be:

20! 21
×
2!3!2!4!3!2!1!1!1!1! 9
(4) Let n0 , n2 , . . . , n98 denote the number of persons in the party who have 0, 2, . . . , 98 acquain-
tances respectively. Suppose ni < 3 for all i. Since n0 + · · · + n98 = 100, we must have
ni = 2 for all i. Let A and B denote the 2 persons having zero acquaintances. The set of
acquaintances of any person C excludes at least 3 persons namely A, B, C. Thus n98 = 0,
contradicting the fact that n98 = 2. So the assumption that ni < 3 for all i is false.
20 KRISHNA KAIPA
13. Inclusion-Exclusion Principle
Inclusion-Exclusion Principle
Let A1 , A2 , . . . , An be finite subsets of a set S. Then
| ∪ni=1 Ai | = E1 − E2 + E3 + · · · + (−1)n−1 En ,
where
X
Ek = |Ai1 ∩ Ai2 ∩ · · · ∩ Aik |.
1≤i1 <i2 <···<ik ≤n
Proof. The proof is by induction on n. The base case n = 2 is well known:

|A1 ∪ A2 | = |A2 | + |A2 | − |A1 ∩ A2 |.
Assume the result is true for n − 1 sets A1 , . . . , An−1 . Let
X
Ek0 = |Ai1 ∩ Ai2 ∩ · · · ∩ Aik |
1≤i1 <i2 <···<ik ≤n−1
X
Ek00 = |An ∩ (Ai1 ∩ Ai2 ∩ · · · ∩ Aik )|
1≤i1 <i2 <···<ik ≤n−1
We also define E000 = |An |. It is understood that the quantities are zero when the summation runs
over an empty set: for example En0 = En00 = 0. We note that :
Ek = Ek0 + Ek−1
00
, 1≤k≤n
For 1 ≤ k ≤ n − 1 define Bi = Ai ∩ An . We can rewrite the expression for Ek00 as:
X
Ek00 = |Bi1 ∩ Bi2 ∩ · · · ∩ Bik |.
1≤i1 <i2 <···<ik ≤n−1
By the inductive hypothesis, we get:

n−1
X
(−1)i−1 Ei00 = | ∪n−1 n−1
i=1 Bi | = |An ∩ (∪i=1 Ai )|.
i=1
n−1
X
(−1)i−1 Ei0 = | ∪n−1
i=1 Ai |.
i=1
Therefore
n
X n−1
X n−1
X
(−1) i−1
Ei = (−1) i−1
Ei0 + E000 − (−1)i−1 Ei00 .
i=1 i=1 i=1
This can be written as:

| ∪n−1 n−1
i=1 Ai | + |An | − |An ∩ (∪i=1 Ai )|.
The last quantity is | ∪ni=1 Ai | using the base case n = 2.

Derangements of a finite set.

Let X be a finite set say X = {1, . . . , n}. The number of derangements of X is
Dn = #{f : X → X | f is bijective and f (i) 6= i ∀i}.
For example D1 = 0, D2 = 1 and D3 = 2.
Lemma 5. Dn is the closest integer to n!/e.
Proof. Define
Ai = {f : X → X | f (i) = i, f is bijective}.
Clearly Dn = n! − | ∪ni=1 Ai |. Note that Ai1 ∩ Ai2 ∩ · · · ∩ Aik is the set of permutations of X which
fix i1 , i2 , . . . , ik and hence:
|Ai1 ∩ Ai2 ∩ · · · ∩ Aik | = (n − k)!
for 1 ≤ i1 < i2 < ik ≤ n.
Therefore, the terms Ek in the application of inc.-exc. principle | ∪ni=1 Ai | = ni=1 (−1)i−1 Ei equal
P

n n!
Ek = (n − k)! = .
k k!
Using this we get
Dn = n! − n!( 1!1 − 2!1 + 3!1 + · · · + (−1)n−1 n!1 ).
In other words Dn /n! is the sum of the first (n + 1) terms (i.e. the (n + 1)-th partial sum) of the
alternating series
X∞
−1
e = (−1)i i!1 .
i=0
1
As the sequence i!
is monotonically decreasing, we have
|e−1 − Dn /n!| < 1
n+1!
.
Consequently |n!/e − Dn | < 1/(n + 1) ≤ 1/2. This proves that Dn is the closest integer to n!/e.
As another application of the inclusion-exclusion principle, we discussed the problem of deter-
mining the number of r-combinations of a multiset X = {n1 · a1 , . . . , nk · ak }. Let S be the multiset
{∞ · a1 , . . . , ∞ · ak }. The number of r-combinations of X is the number of r-combinations of S
minus |A1 ∪ . . . Ak | where Ai is the the number of r-combinations of S which uses ai more than ni
times. We discussed a specific example from [Brualdi].
22 KRISHNA KAIPA
14. Möbius inversion on the power-set of a finite set
Möbius inversion on the power-set of a finite set

Let X be a finite set, and 2X the power set of X. To each function f : 2X → R, we define another
function f † : 2X → R by X
f † (K) = f (L).
L⊂K
The following lemma tells us how to recover f from f † :

Lemma 6. X
f (K) = (−1)|K\L| f † (L)
L⊂K
Proof. Using the definition of f † the right hand side is:

X X X X X
(−1)|K\L| f † (L) = (−1)|K\L| f (J) = f (J) (−1)|K\L|
L⊂K L⊂K J⊂L J⊂K {L:K⊃L⊃J}
Let K̄ = K \ J and let L̄ = L \ J. We have

|K̄|
X
|K\L|
X
|K̄\L̄|
X |K̄|
(−1) = (−1) = (−1)i = δ0,|K̄| .
i=0
i
{L:K⊃L⊃J} L̄⊂K̄
Since δ0,|K̄| = δJ,K , we get:

X X
(−1)|K\L| f † (L) = f (J)δJ,K = f (K)
L⊂K J⊂K

We will now show how this result implies the inclusion-exclusion principle. First we make an
observation:
Let B1 , . . . , Bn be subsets of a set S. Let X = {1, . . . , n} and 2X the power set of X. For each
element I of 2X , i.e. for each subset of X, let SI denote the subset of S consisting of those elements
which belong to Bi for each i ∈ I but do not belong to Bj for j ∈ / I. In this definition, note that
n n
S∅ = S \ (∪i=1 Bi ) and SX = (∩i=1 Bi ). Clearly each element of S is in exactly one of the SI ’s. Thus
a
S= SI ,
I⊂X
where
SI = (∩i∈I Bi ) ∩ (∩j ∈I
/ (S \ Bj )).
At this stage we note that a
(∩i∈I Bi ) = SJ
J⊃I
Consider the function f : 2X → R given by f (I) = |SX\I |. Using the previous equality, we have
X
f † (I) = |SX\J | = | ∩i∈I
/ Bi |.
J⊂I
In particular f † (X) = S. We note that S∅ = ∩ni=1 (S \ Bi ), and hence:

| ∪ni=1 Bi | = |S| − |S∅ | = |S| − f (X)
Therefore,
X X X
| ∪ni=1 Bi | = |S| − (−1)|X\I| f † (I) = |S| − (−1)|J| | ∩j∈J Bj | = (−1)|J|−1 | ∩j∈J Bj |
I⊂X J⊂X J⊂X,J6=∅
The equality between the first and last terms of this chain of equalities is the inclusion-exclusion
principle.
24 KRISHNA KAIPA
15. Assignment 4
(1) Let n be a positive integer and let p1 , p2 , . . . pk be all the different prime numbers that divide
n. Let
φ(n) = #{1 ≤ j ≤ n : j is relatively prime to n}.
Use the inclusion-exclusion principle to show that
k
Y
φ(n) = n (1 − p−1
i )
i=1
(2) A metro train has six stops on its route from its starting station. There are 10 people on
the metro train as it departs its starting station. Each person exits the metro train at one
of its six stops, and at each stop at least one person exits. In how many ways can this happen?
(3) Prove that Dn is an even number if and only if n is an odd number.
(4) Determine the number of 10-combinations of the multiset S = {∞ · a, 4 · b, 5 · c, 7 · d}.

16. Möbius Inversion on Posets
Möbius Inversion on Posets A relation R on a set X is just a subset of X × X. The relation R

is called reflexive if (x, x) ∈ R for all x ∈ X. The relation R is called transitive if
(x, y), (y, z) ∈ R ⇒ (x, z) ∈ R.
The relation R is called symmetric if
(x, y) ∈ R ⇔ (y, x) ∈ R.
The relation R is called antisymmetric if
(x, y), (y, x) ∈ R ⇔ x = y.
We recall that an equivalence relation on a set X is a relation that is reflexive, transitive and
symmetric.
A partial order on a set X is a relation R on X which is reflexive, transitive and anti-symmetric.
For a partial order R, it is common to denote (a, b) ∈ R, as a ≤ b. The pair (X, ≤) is called a
partially-ordered set or in short a poset.
Examples of posets are
(1) X = 2S the power set of a set S. Here A ≤ B if A ⊂ B.
(2) A variant of (1) is to take X as the set of all linear subspaces of a vector space S. Here
A ≤ B is A is a linear subspace of B.
(3) X = R with usual meaning of ≤
(4) X = N with m ≤ n if m divides n.
A poset (X, ≤) is called a totally ordered set, if for any a, b either a ≤ b or b ≤ a. In other words
any two elements of X are comparable. In the examples above (3) is totally ordered. The example
(1) is totally ordered if S has atmost one element. Similarly the example (2) is totally ordered if
dim(S) ≤ 1. All other cases of the examples above are not totally ordered.
Given a poset (X, ≤) let

F(X) = {f : X × X → R : x 6≤ y ⇒ f (x, y) = 0}.
Note that F(X) is a linear subspace of the vector space of all functions {f : X × X → R}.
Under the condition that for each pair of elements x, y ∈ X the set (sometimes called interval )
{z ∈ X : x ≤ z ≤ y} is finite, we define a multiplication (called convolution) on F by defining
f (x, y) = 0 if x 6≤ y and in case x ≤ y we define:
X
(f ∗ g)(x, y) = f (x, z)g(z, y).
{z:x≤z≤y}
This multiplication is not necessarily commutative, i.e. f ∗g need not equal g ∗f . The multiplication
is however associative
X
(f ∗ g ∗ h)(x, y) = f (x, u)g(u, v)h(v, y) = (f ∗ g) ∗ h = f ∗ (g ∗ h)
{u,v:x≤u≤v≤y}
26 KRISHNA KAIPA
There is a unique multiplicative identity g with the property that f ∗g = g ∗f = f for all f ∈ F(X).
The Kronecker delta function g(x, y) = δx,y is a multiplicative identity because
(f ∗ δ)(x, y) = f (x, y) = (δ ∗ f ).
The uniqueness is left as an exercise (Assignment 5). Note that if X = {1, . . . , n} with the usual
meaning of ≤, then F(X) is isomorphic to the algebra Un of n × n real upper triangular matrices
by the map f 7→ Af where the ij-th entry of Af is f (i, j).
Returning to a general poset X, suppose f ∈ F(X) has a left inverse g i.e. g ∗ f = δ. Then, the
identity 1 = δ(x, x) = g(x, x)f (x, x) shows that f (x, x) 6= 0. Similarly if f has a right inverse h, i.e.
f ∗ h = δ, then again the condition f (x, x)h(x, x) = 1 forces f (x, x) 6= 0.
Lemma 7. Any f ∈ F(X) satisfying f (x, x) 6= 0 for all x ∈ X has a unique two-sided multiplicative
inverse.
Proof. First we note that if g is a left inverse of f and h is a right inverse of f then
h = δ ∗ h = (g ∗ f ) ∗ h = g ∗ (f ∗ h) = g ∗ δ = g.
Therefore, if f has a left inverse and a right inverse, then every left inverse equals a right inverse.
This shows that there is a unique h such that f ∗ h = h ∗ f = δ. We have already seen that
g(x, x) = h(x, x) = 1/f (x, x) for a left inverse g or a right inverse h.
We will now determine g(x, y) for x < y. We recall that the interval {z : x ≤ z ≤ y} is finite.
We partition this interval into parts M0 , M1 , . . . , Mn where M0 = {x} and for i > 0, Mi consists
of those z in this interval for which #{t : x < t ≤ z} = i. If z ∈ Mi where i > 0 and u satisfies
x ≤ u < z then
{t : x < t ≤ z} ⊃ {t : x < t ≤ u} q {t : u < t ≤ z}.
The sets in the disjoint union are nonempty because u belongs to the first set and z belongs to the
second set. Therefore, it follows that
#{t : x < t ≤ u} < #{t : x < t ≤ z} = i,
and hence u ∈ Mj for some j < i. Assume inductively that g(x, z) has been defined for z ∈
M0 , . . . , Mi−1 . For z ∈ Mi , we have
X
0 = δ(x, z) = g(x, z)f (z, z) + g(x, u)f (u, z).
{u:x≤u<z}
As observed above, each u satisfying x ≤ u < z also satisfies u ∈ Mj for some j < i. Therefore
g(x, u) is known by the inductive hypothesis. This allows us to determine
−1 X
g(x, z) = g(x, u)f (u, z).
f (z, z)
{u:x≤u<z}
Next, we determine the right inverse h(x, y) for x < y. We partition the interval {z : x ≤ z ≤ y}
into parts M00 , M10 , . . . , Mn0 where M00 = {y} and for i > 0, Mi0 consists of those z in this interval for
which #{t : z ≤ t < y} = i. If z ∈ Mi0 where i > 0 and v satisfies z < v ≤ y then
{t : z ≤ t < y} ⊃ {t : z ≤ t < v} q {t : v ≤ t < y}.
The sets in the disjoint union are nonempty because v belongs to the second set and z belongs to
the first set. Therefore, it follows that
#{t : v ≤ t < y} < #{t : z ≤ t < y} = i,
and hence v ∈ Mj0 for some j < i. Assume inductively that h(z, y) has been defined for z ∈
M00 , . . . , Mi−1
0
. For z ∈ Mi0 , we have
X
0 = δ(z, y) = f (z, z)h(z, y) + f (z, v)h(v, y).
v:z<v≤y
As observed above, each v satisfying z < v ≤ y also satisfies v ∈ Mj0 for some j < i. Therefore
h(v, y) is known by the inductive hypothesis. This allows us to determine
−1 X
h(z, y) = f (z, v)h(v, y).
f (z, z)
{v:z<v≤y}

28 KRISHNA KAIPA
17. Möbius Inversion on Posets, continued...
The function ζ(x, y) defined by ζ(x, y) = 1 if x ≤ y and zero otherwise, has an inverse by the
previous lemma. The Möbius function of X is the multiplicative inverse of ζ. It is denoted µ(x, y).
We have µ(x, x) = 1 and for x < y:
X
µ(x, y) = − µ(x, z).
{z:x≤z<y}
Example 1) If X is a totally ordered set, then


1
 if #{z : x ≤ z < y} = 0
µ(x, y) = −1 if #{z : x ≤ z < y} = 1

0 if #{z : x ≤ z < y} > 1
Example 2 Let (X, ≤) is the power set of a finite set S. Let F 0 (X) denote the vector space of all
R-valued functions on X. Consider the associative multiplication on F 0 (X) defined by (f ∗ g)(B) =
0
P
A⊂B f (A)g(B \ A). There is an injective linear map ı : F (X) ,→ F(X) given by ı(f )(A, B) = 0
if A 6⊂ B and ı(f )(A, B) = f (B \ A) if A ⊂ B. The map ı also respects multiplication:
X X
(ı(f ) ∗ ı(g))(A, B) = ı(f )(A, C)ı(g)(C, B) = f (C \ A)g(B \ C) = ı(f ∗ g).
C:A⊂C⊂B C:A⊂C⊂B
Note that ζ = ı(ζ 0 ) and δ = ı(δ 0 ) where ζ 0 (A) = 1 for all A ⊂ X and δ 0 (A) = δ∅,A . Any f ∈ F 0 (X)
satisfying f (∅) 6= 0 has a multiplicative inverse given by
X
g(Y ) = f−1(∅)
g(Z)f (Y \ Z).
Z(Y
It follows that ı(g) is the multiplicative inverse of ı(f ). In particular µ(A, B) = µ0 (B \ A) where
µ0 (∅) = 1 and for non-empty A we have
X
µ0 (A) = − µ0 (B).
B(A
It is clear that µ0 (A) depends only on the size of A, say µ0 (A) = ϕ(|A|) for some function ϕ :
{0, 1, . . . } → R satisfying ϕ(0) = 1 and for n > 0:
n−1
X n
ϕ(n) = − ϕ(i).
i=0
i
This can be expressed as saying that An (ϕ(0), . . . , ϕ(n))T = (1, 0, . . . , 0)T where An is the (n + 1) ×
i+j i−1
(n + 1) lower triangular Pascal matrix. Since we know that the (ij)-th entry of A−1

n is (−1) j−1
,
we get for m > 0:  ϕ(0) 
1
! 1 !
ϕ(1) 0 −1
 .  = A−1 .. = .. .
.. . .
ϕ(n) 0 (−1)n
Therefore µ0 (A) = (−1)|A| and returning to F(X), we conclude µ(K, L) = (−1)|L\K| for K ⊂ L.
Example 3) If X1 , . . . , Xk are posets, then the Cartesian product X = X1 × · · · × Xk is a

poset by defining (a1 , . . . , ak ) ≤ (b1 , . . . bk ) if ai ≤ bi for each i. (Check reflexive, transitive, and
antisymmetric properties Assignment 5). Let F 0 (X) = ki=1 F(Xi ) be the linear subspace of F(X)
Q
consisting of functions of the form
(Πki=1 fi ) ((a1 , . . . , ak ), (b1 , . . . bk )) = f1 (a1 , b1 )f2 (a2 , b2 ) . . . fk (ak , bk ).
Let ~a denote (a1 , . . . , ak ) and let ~c ≤ ~a denote ci ≤ ai for each 1 ≤ i ≤ k. We note that F 0 (X) is
also a sub-algebra of F(X) because
((Πki=1 fi )∗(Πki=1 gi ))(~a, ~b) = ~a≤~c≤~b (Πki=1 fi )(~a, ~c) (Πki=1 gi )(~c, ~b) = Πki=1 ai ≤ci ≤bi fi (ai , ci )gi (ci , bi ) =
P P
= Πki=1 (fi ∗ gi )(ai , bi ) = (Πki=1 (fi ∗ gi ))(~a, ~b).

As an application of this, suppose fi ∈ F(Xi ) have multiplicative inverses denoted fi−1 . We get
(Πki=1 fi ) ∗ (Πki=1 fi−1 ) = Πki=1 (fi ∗ fi−1 ) = Πki=1 δXi = Πki=1 (fi−1 ∗ fi ) = (Πki=1 fi−1 ) ∗ (Πki=1 fi )
Since δX = Πki=1 δXi , we conclude that Πki=1 fi−1 is the multiplicative inverse of Πki=1 fi . In particular
µX = Πki=1 µXi
mk
For a natural number n with k prime factors n = pm1 m2
1 p2 . . . pk , let Xi be the totally ordered set
Xi : p0i < p1i < · · · < .

The product X = X1 × · · · × Xk (the set of natural numbers with no prime factors other than
p1 , . . . , pk ) has the product partial order t ≤ s if t|s. The Möbius function
(
0 if some |bi − ai | > 1
µX (Πki=1 pai i , Πki=1 pbi i ) = Πki=1 µXi (pai i , pbi i ) = Πki=1 (bi −ai )
(−1) if 0 ≤ bi − ai ≤ 1 for all i
We note that for m|n, we have µ(m, n) = µ(1, n/m). In particular µX (1, n) is 0 if n is not square-
free (i.e. some mi > 1), and it is equal to (−1)k if n is square free (i.e. m1 = · · · = mk = 1).
Theorem 8. (Möbius Inversion on posets) Let (X, ≤) be a poset with the finite intervals property.
Suppose Pthat there is a least element 0 in X (i.e. 0 ≤ x for all x ∈ X). Given f : X → R define
f † (x) = y≤x f (y). We can recover f from f † by the formula
X
f (x) = f † (y)µX (y, x).
y≤x
Proof. Let F ∈ F(X) be defined by F (z, x) = δ0,z f (x). Let G = F ∗ ζ. We note that G(z, x) =
δ0,z f † (x). Since G ∗ µX = F ∗ ζ ∗ µ = F , it follows that
X X
f (x) = F (0, x) = G(0, y)µX (y, x) = f † (y)µX (y, x).
y≤x y≤x

Example 1) If X is a totally ordered set with a least element then Möbius Inversion states that
for any f : X → R: X
f † (x) = f (y) ⇒ f (x) = f † (x) − f † (x − 1)
y≤x
30 KRISHNA KAIPA
where x − 1 denotes the predecessor of x in the total order X.
Example 2) For X = 2S with S finite, the least element is the empty set ∅ and Möbius Inversion
states that for any f : X → R:
X X
f † (K) = f (L) ⇒ f (K) = (−1)|K\L| f † (L),
L⊂K L⊂K
which is Lemma 6.
in elementary number theory) Given a function f : N → R, let

Example 3) (Möbius Inversion P
† †
f : N → R be defined by f (m) = d|m f (d), then
X
f (n) = µ(1, n/d)f † (d).
d|n
mk
To see this, fix n = pm 1 m2
1 p2 . . . pk , and let X = {1, . . . , n}
Pwith †i ≤ j if i|j. The
P least element of
†
this poset is 1. By Möbius inversion in X, we have f (n) = d|n f (d)µ(d, n) = d|n f (d)µ(1, n/d).
Example 4) (Euler φ-function in elementary number theory) Let

φ(n) = #{1 ≤ j ≤ n : j is relatively prime to n}.
√
Let ξ = exp(2π −1/n). For 1 ≤ m ≤ n, the least natural number j for which (ξ m )j = 1 is n/d
where d = gcd(m, n). For a given divisor d of n, clearly there are φ(n/d) such values of m. This
shows that X X
φ† (n) = φ(d) = φ(n/d) = n.
d|n d|n
Therefore, by inversion formula we have:
X X
φ(n) = d µ(1, n/d) = n µ(1, d)/d.
d|n d|n
If p1 , . . . , pk are the prime divisors of n, then µ(1, d) is zero unless d|p1 p2 . . . pk . In case d|p1 p2 . . . pk ,
we have µ(1, d) equals +1 if d has an even number of prime factors and −1 is d has an odd number
of prime factors. It follows that
X
−1
µ(1, d)/d = (1 − p−1 −1
1 )(1 − p2 ) . . . (1 − pk ),
d|n
and hence
k
Y
φ(n) = n (1 − p−1
i ).
i=1
18. Assignment 5
(1) Check that if X is poset, then δ(x, y) is the unique multiplicative identity element of F(X).
(2) If X1 , . . . , Xk are posets, consider the relation on on the Cartesian product X = X1 ×· · ·×Xk
defined by (a1 , . . . , ak ) ≤ (b1 , . . . bk ) if ai ≤ bi for each i. Check that ≤ is a partial order.
(3) Consider the power set of {1, 2, 3} denoted X = 2{1,2,3} partially ordered by set containment.
Let a function f ∈ F(X) be defined by


 1, if A = B

2, if A ⊂ B and |B| − |A| = 1
f (A, B) =

 1, if A ⊂ B and |B| − |A| = 2

−1, if A ⊂ B and |B| − |A| = 3

Find the inverse of f with respect to the convolution product.
(4) Consider the multiset X = {n1 · a1 , n2 · a2 , . . . , nk · ak } of k distinct elements with pos-

itive repetition numbers n1 , n2 , · · · , nk . We introduce a partial order on the combina-
tions of X by stating the following relationship: If A = {p1 · a1 , p2 · a2 , · · · , pk · ak } and
B = {q1 · a1 , q2 · a2 , . . . , qk · ak } are combinations of X, then A ≤ B provided that pi ≤ qi for
i = 1, 2, . . . , k. Prove that this statement defines a partial order on X and then compute its
Mobius function.
32 KRISHNA KAIPA
19. Quiz-2
(1) ([4 points]) Let X be a poset with finite intervals property. Let µ(·, ·) denote the Möbius
function of X. Write down a recursive/inductive formula for µ(x, y) where x ≤ y.
P
Ans: µ(x, x) = 1 and µ(x, y) = − {z:x≤z<y} µ(x, z).
(2) ([8 points]) The function λ(n) is defined for positive integers n by the rule
X
λ(d) = log(n).
d|n
e
Prove that λ(n) = log(p) if n = p where p is a prime, and λ(n) = 0 otherwise.
Ans: By inversion we have

X
λ(n) = log(n/d)µ(d),
d|n
where µ(d) is 0 if d is not square-free, and in case d is square free then µ(d) = (−1)#prime factors of d .
If n = 1, we get
λ(1) = log(1)µ(1) = 0 × 1 = 0.
If n > 1, let p1 , . . . , pk denote the prime divisors of n. We can write:
X X
λ(n) = (log(n) − log(Πi∈I pi )) (−1)#I = log(n)(1 − 1)k − (−1)#I log(Πi∈I pi ))
I⊂{1,...,k} I⊂{1,...,k}
We recall our convention that 0k = δ0,k . Since k ≥ 1, the first term is 0. The other term is
k
X X k
X
− log(pi ) (−1)#I = log(pi ) (1 − 1)k−1 .
i=1 {i}⊂I⊂{1,...,k} i=1
Since (1 − 1)k−1 = δ1,k , we conclude that λ(n) = 0 if n has more than one prime factor, and
in case n = pe for some prime p and e ∈ N, we have log(n) = log(p).
(3) ([8 points]) Let ak,n denote the number of surjective functions (also called onto functions)
from {1, 2, . . . , k} to {1, 2, . . . , n}. Using inclusion-exclusion principle, find an expression for
ak,n . (Remark: The case k = 10 and n = 6 should be familiar !)
Ans: For I ⊂ {1, . . . , n} let BI denote the set of functions from {1, . . . , k} to {1, . . . , n}
whose image does not contain the members of I. For example if I = ∅ then BI consists of all
functions from {1, . . . , k} to {1, . . . , n}. Clearly #BI = (n − #I)k . By inclusion-exclusion
principle
n
n−j n
X X
#I k
ak,n = (−1) (n − #I) = (−1) jk.
j=0
j
I⊂{1,...,n}
20. Mid-Term Examination

P n2
(1) (a) (3 points) Give a combinatorial proof of the identity 2n n
= k k .
(b) (4 points) Prove that among 502 positive integers, there are always two integers so that
either their sum or their difference is divisible by 1000.
solution
a) The left side is the number of ways of picking a size-n subset of {1, . . . , 2n}. Partitioning
{1, . . . , 2n} as X q Y where X = {1, . . . , n} and Y = {n + 1, . . . , 2n}, each size-n subset of
{1, . . . , 2n} is obtained by picking a size-k subset of X and a size (n − k) subset of Y for
some k ∈ {0, . . . , n}. Therefore 2nn
equals
P n n P n2
k k n−k = k k .
b) Suppose we have a set of 502 positive integers x1 , . . . , x502 such that the difference of
any two of these integers is not divisible by 1000. The remainders yi = xi mod 1000 form a
size-502 subset of {0, . . . , 999}. The latter set can be partitioned into 501 boxes:
{0}, {500}, {1, 999}, {2, 998}, {499, 501}.
Any subset of {0, . . . , 999} which does not contain {j, 1000 − j} for any j ∈ {1, . . . , 499}, has
size at most 501. Therefore, every size 502-subset of {0, . . . , 999} must contain {j, 1000 − j}
for some j ∈ {1, . . . , 499}, in other words there exist i 6= j such that xi + xj is divisible by
1000.
(2) ( 6 points) How many permutations are there of the digits 1, 2, . . . , 8 in which none of the
patterns 12, 34, 56, 78 appears?
solution Let j ∈ {1, 2, 3, 4}. The number of permutations of the digits 1, 2, . . . , 8 in which j
of the patterns 12, 34, 56, 78 appear, is (8 − j)!. Therefore, by inclusion-exclusion principle
the answer to the problem is
8! − 4 × 7! + 6 × 6! − 4 × 5! + 4! = 24024.
(3) (7 points) Let n be a positive integer. Suppose we choose a sequence i1 , i2 , . . . , in of integers
between 1 and n at random. Note that order of the sequence matters, and repetition of
digits is allowed. What is the probability that the sequence contains exactly n − 3 different
integers?
solution The probability is #favourable outcomes/nn . A favourable item is obtained by the
following steps
(a) Pick a size n − 3 subset of {1, . . . , n},
(b) Form a multiset of size n whose underlying set is the set picked in the previous step
(c) Choose a permutation of the multiset in the previous step,
There are n3 ways to pick the set in step 1. For step 2, there are 3 different kinds of multisets
depending on the set of multiplicities: a) 4, 1,. . . , 1, b) 3, 2, 1, . . . , 1, c) 2,2, 2, 1, . . . , 1. The
number of corresponding multisets is a) n−3 1
, b) (n − 3)(n − 4), c) n−3 3
. For step 3, the
number of permutations depends on the types a)-c) in step 2): it is a) n!/4!, b) n!/(3!2!), c)
n!/(2!2!2!).
The total number of favourable outcomes is therefore:
n
× n−3 + n−3
n! n!
n!
3 1
· 4! + (n − 3)(n − 4) · 3!2! 3
· 2!2!2! ,
34 KRISHNA KAIPA
Dividing this by nn and simplifying the desired probability is:

n

4
n!(n − 2)(n − 3)/(12nn ).
(4) Let dn be the number of derangements of {1, 2, . . . , n}. Take d0 = 1. Let g(x) be the formal
power series
∞
X dj X j
g(X) = .
j=0
j!
(a) (4 points) Assuming the recurrence in part b), derive a differential equation for g(X)
and solve it to get a closed form solution for g(X).
(b) (4 points) Using the expression for dn or some other method prove that
dn+1 = n(dn + dn−1 ), n∈N
solution
(a)
∞ ∞
X n+1 X n(d n+1
d n+1 X n +dn−1 )X
g(X) = 1 + n+1!
=1+ n+1!
.
n=1 n=1
Differentiating, we get:
∞
X n(d n
0 n +dn−1 )X
g (X) = n!
= Xg(X) + Xg 0 (X).
n=1
Thus, we get the differential equation g 0 (X)(1 − X) = Xg(X). We can write this as
d
(g(X)(1 − X)) + g(X)(1 − X) = 0.
dX
The unique solution of this differential equation is g(X) = ce−X /(1−X) where c = g(0) = 1.
Thus g(X) = e−X /(1 − X).
Pn i
(b) We know dn /n! = i=0 (−1) /i!. Therefore
n n−1 n−1
(−1)i (−1)i (−1)i (−1)n (−1)n+1
X X X
n n 1
(d
n+1! n
+ dn−1 ) = n+1
( i!
)+ n+1
( i!
)=( i!
)+ n!
+ (n+1)!
.
i=0 i=0 i=0
The last expression above is also equal to dn+1 /(n+1)!. This shows that dn+1 = n(dn +dn−1 ).
Alternative solution of (b): Let ∆n denote the set of derangements of {1, . . . , n}. For
1 ≤ i ≤ n, let Fi be the set of derangements of {1, . . . , n + 1} which map n + 1 to i. Clearly
∆n+1 = qni=1 Fi .
It suffices to show that #Fi = (dn +dn−1 ). We can partition each Fi as qnj=1 Fij where Fij ⊂ Fi
consists of those derangements which map j to n + 1. Clearly #Fii = dn−1 . For j 6= i any
element of Fij is a bijective function g : {1, . . . , j −1, j +1, . . . , n} → {1, . . . , i−1, i+1, . . . , n}
satisfying g(x) 6= x for all x. For each j 6= i, there is a bijection from Fij to the set of
derangements of {1, . . . , n} which map j to i: This bijection is given by g 7→ ĝ where
(
g(x) if x 6= j
ĝ(x) =
i if x = j.
Thus
#∆n = #(qj∈{1,...,n}\i Fij .)
This shows that n
X
#Fi = #Fij = dn−1 + dn
i=1
(5) (7 points) Let a ≤ b be elements of a poset X with finite intervals property. By a chain of
length i between a and b we mean a sequence a = x0 < · · · < xi = b. Let µX denote the
Möbius function of X. Prove that :
X
µX (a, b) = (−1)length(C) ,
C
where the sum runs over all C chains between a and b.
solution We prove the result by induction on the size n of the interval {c

P : a ≤ c ≤ b}. In
order to prove that the result holds for the base case n = 1 we must show C (−1)length(C) =
µ(a, a) = 1. This follows because there is only one chain a = x0 = a between a and b = a.
For n > 1, assume inductively that the result holds if the size of the interval {c : a ≤ c ≤ b}
is at most n − 1. Next, suppose that the interval {c : a ≤ c ≤ b} has size n. By the inductive
formula for µX we have X
µ(a, b) = − µ(a, c).
{c:a≤c<b}
Any chain C : a = x0 < x1 < · · · < xi = b between a and b, can be written as C 0 < b where
C 0 is the chain C 0 : a = x0 < · · · < xi−1 . Thus C 7→ C 0 gives a bijection from the set of
chains between a and b to the set of chains which begin at a and end at some c < b. In this
notation:
0
X X
(−1)length(C) = − (−1)length(C ) .
C C0
By the inductive hypothesis, the right side of this equation is
X
− µ(a, c).
{c:a≤c<b}
36 KRISHNA KAIPA
21. Recurrence relations and generating functions
Recurrence relations and generating functions

The Hemachandra/Fibonacci numbers fn : Let fn denote the number of rhythms of (time)-duration
n − 1 consisting of two kinds of beats. A short beat and a long beat, and assume that short beat
is one time unit, and the long beat is two time units.
In terms of Sanskrit poetry, we want a vritta (poetry meter) of n − 1 matras (time units),
using two types of varnas (syllables): a laghu varna having one matra (short syllable of duration
one) and a guru varna having two matras (long syllable of duration 2). This goes back to Indian
mathematician Pingala (before Christ), or in less ancient work of Hemachandra (around 1150 AD
but before Fibonacci). The number of meters with i long syllables, is the number of permutations
of the multiset {(n − 1 − 2i) · a, i · b} which is n−1−i
i
. Thus
X
n−1−i

fn = i
.
i
Also fn = fn0
+ fn00
where fn0 , fn00
are the number of meters in which the last syllable is short, long
respectively. Clearly fn0 = fn−1 and fn00 = fn−2 , and hence
fn = fn−1 + fn−2 .
We assume that f0 = 0 and f1 = 1. Let H(x) = ∞ i
P
i=0 fi X be the generating function for the
Hemachandra numbers. We have
X
H(X) = X + (fi−1 + fi−2 )X i = X + XH(X) + X 2 H(X),
i≥2
which gives
H(X) = X/(1 − X − X 2 ).
√
Using the fact that X 2 + X − 1 = (X − φ−1 )(X + φ) where φ = 1+ 5
2
and using partial fractions
we get:
H(X) = √15 ( 1−Xφ
1 1
− 1+X/φ )
Using geometric series expansion we get:
√ √
fn = √1
5
(φn − (−φ−1 )n ) = √1
5
(( 1+2 5 )n − ( 1−2 5 )n ).
As another example of generating functions we recall that:
∞
X
−n n+k−1

(1 − X) = k
X k.
k=0
The Stirling number of the second kind Sn,k is the number of equivalence relations on {1, . . . , n}
having exactly k equivalence classes. Clearly k!Sn,k is the number of surjective functions from
{1, . . . , n} to {1, . . . , k} which we have worked out to be kj=0 (−1)k−j kj j n . Therefore
P
k
X
1
(−1)k−j kj j n .

Sn,k = k!
j=0
Sn,k X n = Sn,k X n can be expressed as

P P
For fixed k, the generating function Sk (X) = n≥k n≥0
X k
X
k Pk k−j k
X n k!1 (−1)k−j 1 1

Sk (X) = j
jn = k! j=0 (−1) j 1−jX
.
n≥0 j=0
(Note that we have used the convention 00 = 1 here for the term j = 0.)
38 KRISHNA KAIPA
22. Stirling numbers of Second kind, Bell numbers
Multiplying the expression for Sk (X) by Πki=1 (1 − iX) we get the expression:
k
Y k
X Y
1 X−1/i
f (X) = (X − 1/j) + jk 1/j−1/i
.
j=1 j=1 {i:i6=j}
We recall that the Lagrange interpolation polynomial for the data points (x1 , y1 ), . . . , (xk , yk ) (i.e.
the unique polynomial of degree at most k − 1 whose graph passes through these points) is
k
X Y
X−xi
yk xj −xi
.
j=1 {i:i6=j}
Qk
Therefore, we conclude f (X) − j=1 (X − 1/j) is the desired Lagrange interpolation polynomial for
the data points (1/j, 1/j k ) for 1 ≤ j ≤ k. However the polynomial X k − kj=1 (X − 1/j) is also a
Q
polynomial of degree at most k − 1 which interpolates the same data points. Therefore, we conclude
that f (X) = X k . In other words
Xk
Sk (X) = .
(1 − X)(1 − 2X) . . . (1 − kX)
The numbers Sn,k obey the recurrence
Sn,k = kSn−1,k + Sn−1,k−1 .
To see this, we note that given a partition of {1, . . . , n} into k parts, either the subset {1, . . . , n − 1}
is partitioned into k parts or k − 1 parts. In the former case, there are k ways to complete this
partition of {1, . . . , n − 1} to a partition of {1, . . . , n} ( by deciding where to send n). In the latter
case the only partition of {1, . . . , n} into k parts is to take {n} as a part together with the k − 1
parts of {1, . . . , n − 1}. Using this recurrence we get
X X X
Sk (X) = X k + Sn,k X n = X k + (X Sn−1,k−1 X n−1 ) + kX( Sn−1,k X n−1 ).
n>k n>k n>k
In other words
Sk (X) = XSk−1 (X) + kXSk (X).
Since Sn,1 = 1, we have S1 (X) = X/(1 − X) and hence we recover the formula:
Xk
Sk (X) = .
(1 − X)(1 − 2X) . . . (1 − kX)
The exponential generating function SkE (X) = n≥0 Sn,k X n /n! can be expressed as
P
X k
X Pk
SkE (X) X n n!k!
1
(−1)k−j kj j n = 1 k
k−j

= k! j=0 (−1) j
exp jX = (eX − 1)k /k!.
n≥0 j=0
The Bell number Bn is the number of equivalence relations on a set a set of size n.
Bn = Sn,0 + · · · + Sn,n .
Note that B0 = 1. A recurrence satisfied by the bell numbers is

n
X n
Bn+1 = Bk .
k=0
k
To see this, we note that given a partition of {1, . . . , n + 1}, the size of the part
containing n + 1 is
an integer (n − k + 1) (where 0 ≤ k ≤ n) and this part can be chosen in nk ways. The remaining
subset of {1, . . . , n+1} of size k can be partitioned in Bk ways. The exponential generating function
X X XX
i
B(X) = 1 + Bn X n /n! = 1 + Bi+1 X i+1 /(i + 1!) = 1 + k
Bk X i+1 /(i + 1)!.
n≥0 i≥0 i≥0 k≥0
Differentiating and setting j = i − k, we get:

X X
B 0 (X) = Bk X k /k! X j /j! = eX B(X).
k≥0 j≥0
If g(X) ∈ R[[X]] is a formal power series with zero constant term, then for any f ∈ R[[X]] the
composition f (g(X)) is well defined in R[[X]]. To see this suppose g(X) = b1 X + b2 X 2 + · · · +,
and f (X) = a0 + a1 X + . . . , then the coefficient of X m in f (g(X)) is the coefficient of X m in the
polynomial f˜(g̃(X) where:
f˜(X) = a0 + a1 X + . . . am X m , g(X) = b1 X + b2 X 2 + · · · + bm X m .
In particular if g(X) ∈ R[[X]] has zero constant term, then eg(X) is well defined. For example
X X
ee −1 ∈ R[[X]] but ee is not defined in R[[X]].
Returning to the differential equation B 0 (X) − eX B(X) = 0 in R[[X]], we can write this within
R[[X]] as
X X
ee −1 dX
d
(B(X)e1−e ) = 0
Note that for f (X), g(X) ∈ R[[X]], the product f (X)g(X) is the zero element of R[[X]] if and
X
only if one of f (X), g(X) is the zero element of R[[X]] (Assignment). Since ee −1 is not zero, it
d X
follows that dX (B(X)e1−e ) = 0. A formal power series f (X) has zero derivative if and only if
X
f (X) = c has only the constant term. Therefore, we conclude that B(X) = cee −1 . Since both of
X
the series B(X) and ee −1 have constant term 1, it follows that c = 1 and hence:
X −1
B(X) = ee .
P∞
Another way to derive this is to note that B n = S n,0 + · · · + Sn,n = k=0 Sn,k . Thus the exponential
generating function B(X) equals ∞ E
P
S
k=0 k (X) (the sum of the exponential generating functions of
X
the Stirling numbers of the second kind). Since Sk (X) = (e − 1) /k! we again get B(X) = ee −1 .
E X k
Stepping out of R[[X]] into the analytic setting of convergent real power series, we can write
eX −1 X
e = 1e ee . The coefficient of X n /n! in this convergent power series is:
n ∞
jn
X X
−1 n jX −1
e coeff. of X /n! in e /j! = e j!
j≥1 j=1
Therefore, we obtain Bn as the limit of a series:

∞
jn
X
1
Bn = e j!
j=1
40 KRISHNA KAIPA
23. Assignment 6
(1) A Poisson random variable X with parameter λ has probability mass function
Pr(X = k) = λk e−λ /k!, k = 0, 1, 2, . . . .
The n-th moment of a discrete random variable Y is y y n Pr(Y = y). Determine the n-th
P
moment of the Poisson random variable having unit expectation (mean) in terms of the
topics of this section. (Ans: Bn .)
(2) Let Vm be the vector space of polynomials with real coefficients of degree at most m in one
variable. Show that Ek (X) = X(X − 1) . . . (X − k + 1) for 0 ≤ k ≤ m (with E0 (X) = 1)
k
form a basis for Vm . Express the P standard basis ek (X) = X for 0 ≤ k ≤ m in terms of the
basis E0 , . . . , Em . (Ans: ej = i Sj,i Ei where Sj,i is the Stirling number of second kind.)
(First method: See quiz 3)
(Second method: Let P be the m + 1 × m + 1 matrix such that [e0 , . . . , em ] = [E0 , . . . , Em ]P .
Let P̃ be the matrix defined by P̃ij = (i − 1)!Pij . Obtain the relation B = AP̃ where A is
the upper triangular Pascal matrix and Bij = (i − 1)j−1 . Use knowledge of A−1 to solve the
problem.)
(3) Prove √that the n-th Fibonacci number fn is the integer that is closest to the number
√1 ( 1+ 5 )n .
5 2
(4) Prove that fn is divisible by 4 if and only if 6|n.
(5) By examining the Fibonacci sequence, make a conjecture about when fn is divisible by 7
and then prove your conjecture. (Ans 8|n.)
(6) Show that the product of two elements of R[[X]] is zero if and only if one of them is zero.
(7) (Optional) Let m and n be positive integers. Prove that if m is divisible by n, then fm is
divisible by fn .
More generally gcd(fm , fn ) = fgcd(m,n) .

24. Catalan numbers
Examples of solving recurrence relations.

Towers of Hanoi problem: There are 3 pegs, and on one of the pegs n circular disks of distinct
sizes are arranged in increasing order of size from top to bottom. Let hn be the number of moves
required transfer the disks to a different peg with the condition that at no point should a larger
disk be placed on a smaller disk. The only way to do this is to i) first move all but the largest disk
to a different peg (hn−1 moves required for this), then ii) move the largest disk to the remaining
vacant peg (1 move for this), and iii) move the other n − 1 disks to the peg in step ii)P (again hn−1
moves required). Thus, we have hn = 2hn−1 + 1 for n ≥ 1 and h0 = 0. Let H(X) = n≥0 hn X n .
From the recurrence we obtain
X
H(X) = X(2H(X) + 1−X 1 X
) ⇒ H(X) = (1−X)(1−2X) 1
= 1−2X 1
− 1−X = (2n − 1)X n .
n≥0
Catalan numbers
Ballot problem: A and B are contesting an election. Suppose n people vote for A and n people
vote for B. What is the probability that while counting the votes A never trails B?
Lattice Paths: Consider the integer lattice Z × Z. A walk consists of one step right (m, n) 7→
(m + 1, n) or one step up (m, n) 7→ (m, n + 1). How many paths are there from (0, 0) to (n, n) which
stay below the diagonal (but can touch it).
Triangulating a convex (n + 2)-gon (Euler): In how many ways can we triangulate a (n + 2)-gon
using (n − 1) ‘diagonals’ which do not cross each other.
The answer to the lattice problem and the polygon problem is Cn and the answer to the ballot
problem is Cn / 2nn
where Cn is the Catalan number

1 2n
Cn = .
n+1 n
The first few Catalan numbers are C0 = C1 = 1 and C2 = 2. In order to get the above formula for
Cn we will first derive the recurrence
Cn = C0 Cn−1 + C1 Cn−2 + · · · + Cn−1 C0 , n ≥ 1,
n
P
and then use the recurrence to determine the generating function C(X) = n≥0 Cn X . The
coefficient of X n in C(X) will give the formula for Cn . The Ballot problem can be converted to
the Lattice problem as follows. To a count of votes, we associate a lattice walk beginning at (0, 0)
and ending at (n, n) by taking a vote for A to a step ‘right’, and a vote for B as a step ‘up’. A
vote count in which A never trails B is a lattice walk which remains below diagonal. For the lattice
problem, given a sub-diagonal walk W let (k, k) – for 0 ≤ k ≤ n − 1 – be the last point prior to
(n, n) which lies on the given walk. Note that the walk can be broken as
W 0 → (k + 1, k) → W 0 → (n, n),
where W 0 is a lattice walk from (0, 0) to (k, k) and W 0 is a lattice walk from (k + 1, k) to (n, n − 1).
The number of such walks W 0 is Ck and the number of such walks W 00 is the same as the walk from
(0, 0) to (n, n − 1) − (k + 1, k) = (n − k − 1, n − k − 1) which is Cn−k−1 . Thus we get the desired
42 KRISHNA KAIPA
Pn−1
recurrence Cn = k=0 Ck Cn−k−1 .
To obtain the same recurrence for the polygon problem, label the vertices as v0 , . . . , vn+1 and
treat the edge vn vn+1 as the ‘base’ of the polygon. Given a triangulation of the polygon, the base
is part of a triangle vn vi vn+1 for some 0 ≤ i ≤ n − 1. To the left of this triangle is a triangulated
(n − i + 1)-gon (with vertices vi , vi+1 , . . . , vn ) and to the right of P
this triangle is a triangulated
(i + 2)-gon (with vertices vn+1 , v0 , . . . , vi ). Thus we again get Cn = n−1
i=0 Ci Cn−i−1 as desired.
In terms of the generating function C(X), the recurrence relation can be simply stated as saying
that the coefficient of X n−1 in C(X)2 for n ≥ 1 is the coefficient of X n in C(X). In other words
XC(X)2 = C(X) − 1.
By completing squares, we get
X
1/2
(2XC(X) − 1) = ±(1 − 4X)1/2 = ±

n
(−4)n X n .
n≥0
Since the constant term on the left side is −1, we must take ± in the right side to be − rather than
+. Therefore
X
1/2 1/2
n−1 2n−1 n−1
j 2j+1
Xj.
P
C(X) = n
(−1) 2 X = j≥0 j+1 (−1) 2
n≥1
Since
1/2 1 · 3 · . . . (2j − 1) 2j!
= (−1)j j+1
= (−1)j ,
j+1 (j + 1)!2 j!(j + 1)!22j+1
we get
1/2 j 2n+1 1 2n
Cn = (−1) 2 = .
n+1 n+1 n
25. Assignment 7
(1) Solve the recurrence relation hn = 8hn−1 − 16hn−2 for n ≥ 2. Take initial values h0 = −1
and h1 = 0.
(Ans: h(x) = (8x − 1)/(1 − 4x)2 = ∞ n n
P
n=0 4 (n − 1)x )
(2) Solve the non-homogeneous recurrence relation hn = 2hn−1 + n, n ≥ 1 with h0 = 1.

(Ans: h(X) = ∞ n
(1 + X(1 − X)−2 )/(1 − 2X) and hn = 3 · 2n − n − 2)
P
n=0 hn X =
(3) Let hn be the size of the set Mn consisting of 2 × n matrices of the form ( xy11 xy22 ... xn
... yn ) with
entries from {1, 2, . . . , 2n} (with no repetition) with rows and columns strictly increasing.
Prove that hn = Cn (the n-th Catalan number).
Solution 1 : Define h0 = 1. For a matrix M ∈ Mn , we note that yj ≥ 2j because yj is
the largest member of {x1 , . . . , xj , y1 , . . . , yj }. Similarly, xj ≤ 2j − 1 because xj is the least
member of {xj , . . . , xn , yj , . . . , yn }. Given M ∈ Mn , let j be the largest integer less than n
such that yj = 2j. We then have for n ≥ 1, the recurrence hn = h0 h̃n + h1 h̃n−1 + · · · + hn h̃0
where h̃n is the size of the subset M̃n ⊂ Mn consisting of matrices with the additional
property that yj >
x2 −1 x3 −1 ... xn −1
2j for 1 ≤ j ≤ n − 1. For M̃ ∈ M̃n , we note that the associated matrix
y1 −1 y2 ... yn−1 is in Mn−1 : The columns are increasing because xj − 1 ≤ 2j − 2 < yj−1 .
Thus, h̃n = hn−1 . We have shown that hn ’s obey the same recurrence as Cn and h0 = C0 = 1
and hence hn = Cn )
Solution 2 : A Dyck path is a path in the first quadrant of the integer lattice from (0, 0) to
(2n, 0) using n steps of the form (x, y) 7→ (x + 1, y + 1) (denoted %) and n steps of the form
(x, y) 7→ (x + 1, y − 1) (denoted &). We have seen that the number of Dyck paths is the
Catalan number Cn . For a Dyck path, we associate a matrix M ∈ M: the first row consists
of integers i such that the i-th step is % and the second row consists of integers i such that
the i-th step is &. Note that xi is the step number of the i-th % step and yi is the step
number of the i-th & step. At the end of xi steps the location is (xi , 2i − xi ) which must be
strictly above the x-axis, and hence which xi ≤ 2i − 1. At the end of yi steps the location is
(yi , yi − 2i) which must be on or above the x-axis, and hence which yi ≥ 2i. This shows that
M has increasing columns. Reversing the process we can define a Dyck path to a matrix
M ∈ M: The i-th step is % or & according as i appears in the first or second row of M ).
The maps are inverses of each other.
(4) Show that in R[[X]], we have:

√
X 2n 1− 1 − 4X X 1

2n
−1/2
(1 − 4X) = Xn and = X n.
n≥0
n 2X n≥0
n + 1 n
Using this prove the identity:
X n
2n + 1 1 2n + 2 1 2j 2n − 2j
= =
n 2 n+1 j=0
j+1 j n−j
44 KRISHNA KAIPA
26. Quiz-3 : 14-Oct 2019

(1) (7 points) Prove that two consecutive Fibonacci numbers are relatively co-prime. In other
words gcd(fn , fn+1 ) = 1 for all n ≥ 0.
Ans: Clearly gcd(f0 , f1 ) = gcd(1, 1) = 1. Assume gcd(fk , fk+1 ) = 1 for k ≤ n. We have
gcd(fn+1 , fn+2 ) = gcd(fn+1 , fn+1 + fn ) = gcd(fn , fn+1 ) = 1.
(2) (7 points) Suppose you deposit |5000 in a bank account that pays 6% interest at the end
of each year (compounded annually). From the second year onwards, at the beginning of
each year you deposit |1000. Let hn be the amountPin your account after n years (so h0 =
|5000). Determine the generating function h(x) = ∞ n
n=0 hn x and find a formula for hn .
Ans: WePhave h0 = 5000 and hn = 1000+1.06hn−1 , n ≥ 1. Consider the generating function
h(x) = n≥0 hn xn . We have:
X X
h(x) = 5000 + xn (1000 + 1.06hn−1 ) = 4000 + 1000( xn ) + 1.06xh(x)
n≥1 n≥0
Thus
! !
X X
h(x) = 1000(1 − 1.06x)−1 [4 + (1 − x)−1 ] = 1000 (1.06x)n · 5+ xn .
n≥0 n≥1
In particular
n
!
1.06n − 1
X
n n−k n
hn = 1000 5 · (1.06) + (1.06) = 1000 5(1.06) +
k=1
0.06
(3) (6 points) Let Vm be the vector space of polynomials with real coefficients of degree at most
m in one variable. Consider the basis Ek (X) = X(X − 1) . . . (X − k + 1) for 0 ≤ k ≤ m
(with E0 (X) = 1) for Vm . Express the standard basis ek (X) = X k for 0 ≤ k ≤ m in terms
of the basis E0 , . . . , Em and the Stirling number of second kind. (with proof)
Ans: Given m ≤ n, the total number mn of functions from {1, . . . , n} to {1, . . . , m} equals
the sum over k (as kranges from 0 to m) of the number of ways choosing a size k subset of
{1, . . . , m} and then choosing a surjective map from {1, . . . , n} to this subset. Hence:
m
n
X m
m = k!Sn,k .
k=0
k
The sum on k in the right side can be taken as going from 0 to n instead of 0 to m (because
m
k
= 0 when k > m). In particular the polynomial
n
X
Xn − Sn,k Ek (X),
k=0
of degree at most n has n + 1 roots namely 0, 1, . . . , m. This shows that
j
X
ej (X) = Sj,i Ei (X).
i=0
27. Stirling numbers of the First kind
Stirling numbers of the first kind cn,k is the number of ways to arrange n people into k non-empty
circles. The recurrence obeyed by cn,k is
cn+1,k = cn,k−1 + ncn,k .
This is because the first n of the n + 1 people can be arranged in either i) k − 1 circles or ii) k circles.
In case i), n + 1-st person forms a circle all by him/herself. In case ii) the n + 1-st Pperson has to
n
be inserted into one of the k circles, and there are n ways to insert. Let Ck (X) = n≥k cn,k X /n!
be the exponential generating function of the numbers cn,k . We note that cn,0 = δ0,n and hence
C0 (X) = 1. We also consider a generating function C(X, Y ) ∈ R[[X, Y ]] defined by
X
C(X, Y ) = Ck (X)Y k
k≥0
The recurrence above for cn+1,k gives

X m
d X k−1 X d
dX
C k (X) = k−1!
+ m!
(cm,k−1 + mcm,k ) = Ck−1 (X) + X dX Ck (X).
m≥k
Thus
d Ck−1 (X)
C (X)
dX k
= 1−X
.
This in turn gives the differential equation
∂ Y
∂X
C(X, Y )− 1−X
C(X, Y ) = 0.
As usual, if we have an element of h(X, Y ) = R[[X, Y ]] with zero constant term (i.e. h(0, 0) = 0)
∂h Y
and such that ∂X = − 1−X , then we can write the above equation as
∂
exp(−h) ∂X (C(X, Y ) exp(h)) = 0.
We can take
h(X, Y ) = Y log(1 − X) = −Y (X + X 2 /2 + X 3 /3 + . . . ).
We note that
X Y X
Y
exp(h) = (1 − X) = (−X)n = En (Y )(−1)n X n /n!
n≥0
n n≥0
where En (Y ) = Y (Y − 1) . . . (Y − n + 1). We leave it as an exercise to check that the product of

two
P elements of R[[X, Y ]] is zero if and only if one of the factors is zero. Since the element exp(h) =
n n ∂
n≥0 En (Y )(−1) X /n! is not the zero element of R[[X, Y ]], it follows that ∂X (C(X, Y ) exp(h)) =
∂
0. As an exercise check that ∂X f = 0 for an element f ∈ R[[X, Y ]] if and only if f ∈ R[[Y ]].
Therefore we conclude that C(X, Y ) exp(h) ∈ R[[Y ]]. Therefore
X X
C(X, Y )(1 − X)Y = C(0, Y )(1 − 0)Y = Ck (0)Y k = C0,k Y k = C0,0 = 1.
k≥0 k≥0
Finally, we get
C(X, Y ) = (1 − X)−Y .
46 KRISHNA KAIPA
We can expand C(X, Y ) in two ways

X X
En (−Y )(−1)n X n /n! = C(X, Y ) = Y k (− log(1 − X))k /k!
n≥0 k≥0
where the first expansion has been noted above, and the second follows by writing (1 − X)−Y as
exp(−Y log(1 − X)) This shows that
Ck (X) = (− log(1 − X))k /k!
Moreover the expansion
X X X
En (−Y )(−1)n X n /n! = C(X, Y ) = X n /n! ( Cn,k Y k )
n≥0 n≥0 k≤n
yields the identity (with Z = −Y )

X
En (Z) = (−1)n−k Cn,k Z k
k≤n
Therefore, just as the Stirling numbers Sn,k of the second kind were change of basis coefficients
expressing ej ’s in terms of Ei ’s, we see that the Stirling numbers Cn,k of the first kind (multiplied
by (−1)n−k ) are change of basis coefficients expressing Ej ’s in terms of ei ’s in the vector space of
polynomials of degree at most m in one variable Z.
28. Partition numbers
P
The analogue of Bell numbers for Stirling numbers of the first kind, i.e. k Cn,k works out to be
just n!. To see this we compare the coefficient of X n in the first and last power series in the equation
below: X X X
X i = (1 − X)−1 = C(X, 1) = X n /n! ( Cn,k ).
i≥0 n≥0 k≤n
In combinatorial terms, the number of ways of arranging n people into circles is n!. We note that
n! is also the number of permutations of the same n people, and in fact there is a natural bijection
between Perm(S) the set of permutations of a set S of size n, and the set Circ(S) of arranging S
into circles (where the circles are not labeled). Given σ ∈ Perm(S) we will associate an element
σ † ∈ Circ(S). Similarly, given τ ∈ Circ(S) we will associate an element τ† ∈ Perm(S), and we will
show that these maps are inverses of each other, thus defining a bijection between Perm(S) and
Circ(S).
Let S = {1, 2, . . . , n}. Given σ ∈ Perm(S) and i ∈ {1, . . . , n} the orbit of i is the subset of S
given by {i, σ(i), σ 2 (i), . . . }. Since this is a finite set there must be natural numbers k, m such that
σ k (i) = σ k+m (i). Since σ is a bijection, it follows that σ m (i) = i. Let m be the smallest integer with
this property. It follows that the orbit of i has size m. Note that if j is in the orbit of i, then the
orbit of j is the same as the orbit of i. In other words two distinct orbits do not overlap. Moreover
any j ∈ S is in its own orbit. We write the orbit as a circle with the clockwise traversal of the
circle corresponding to the sequence i, σ(i), σ 2 (i), . . . . Thus, we have obtained an arrangement S
into circles (namely the orbits), which we define to be σ † . Given an element τ ∈ Circ(S), we define
a permutation τ† of S by defining σ(i) (for each i ∈ S) to be the successor of i while traversing the
circle clockwise. (in particular, if the circle containing i has no other member except i, then σ fixes
i). Starting with σ ∈ Perm(S), it is clear that the element of Perm(S) defined by (σ † )† equals σ.
Similarly starting with a τ ∈ Circ(S), the element of Circ(S) defined by (τ† )† equals τ .
Partitions
We now study the analogue of Bell numbers when the set S = {1, . . . , n} is replaced by the multiset
{n · a}. Let p0 = 1 and let pn denote the ways to partition the multiset {n · a} into any number
of parts. In other words the number of ways to write n = n1 + n2 + · · · + nk where k ∈ {1, . . . , n}
and each ni > 0. Suppose there are ai parts of size i (where we allow ai = 0, then clearly pn is the
number of solutions a1 , . . . , an in non-negative integers to the equation
1a1 + 2a2 + · · · + nan = n.
We note that the coefficient of X n in the infinite product
X X X
( X i )( X 2i ) . . . ( X ji ) . . .
i≥0 i≥0 i≥0
is also the number of solution to the above equation. Thus the generating function
∞
X
n
Y 1
p(X) = pn X = i
.
n≥0 i=1
1 − X
48 KRISHNA KAIPA
Note that the infinite product makes sense because pn is the coefficient of X n in the finite product
n X n
Y Y 1
( X ji ) = i
.
j=1 i≥0 i=1
1 − X
n
Q∞ i
Similarly, the coefficient of X in the infinite product i=1 (1 − X ) actually equals the coefficient
of X n in the finite product ni=1 (1 − X i ). Since the coefficient of X n in the product
Q
∞ ∞
Y 1 Y
· (1 − X i )
i=1
1 − X i i=1
n
is the same as the coefficient of X in the finite product
n n
Y 1 Y
i
· (1 − X i ) = 1,
i=1
1 − X i=1
it follows that the reciprocal of p(X) is the infinite product ∞ i

Q
i=1 (1 − X ). Euler calculated the
first 50 or more terms in the expansion of this series and observed a pattern which he was able to
prove many years later (around 1783 AD). This is Euler’s famous Pentagonal number theorem. A
pentagonal number is an integer of the form (3m2 ± m)/2.
Theorem. (Euler’s pentagonal number theorem) The coefficient of X n in 1/p(X) is (−1)m if n =
(3m2 ± m)/2 and is zero if n is not a pentagonal number:
∞ ∞
2 2
Y X
i
(1 − X ) = 1/p(X) = (−1)m (X (3m −m)/2 + X (3m +m)/2 ).
i=1 m=0
An elementary combinatorial proof is certainly within the scope of this course, but will not give
it here. The interested student can see look up ‘Franklin’s proof of Euler’s pentagonal number
theorem’.
Another famous partition identity of Euler is:
Theorem. (Euler) Let po (n) be the number of partitions of n into parts of odd size, and let pd (n)
be the number of partitions of n into parts of distinct sizes. Then po (n) = pd (n).
n
P
Proof. We prove this by showing that the generating functions p o (X) = n≥0 po (n)X and pd (X) =
n
P
n≥0 pd (n)X are equal. It is easy to see that
∞
Y
pd (X) = (1 + X i ),
i=1
n
because the coefficient of X in this infinite product is the number of solutions to the above equation
P n
i=1 iai = n) with ai ∈ {0, 1}. Thus
Q∞ 2i
i=1 (1 − X ) 1
pd (X) = ∞Q i
= Q∞ 2i−1 )
= po (X)
i=1 (1 − X ) i=1 (1 − X

The subject of partitions is a vast and deep subject into which we do not go further. Ramanujan
observed and proved that : i) 5|p(n) if n ≡ 4 mod 5, ii) 7|p(n) if n ≡ 5 mod 7, iii) 11|p(n) if n ≡ 6
mod 11. The pentagonal number’s theorem is closely related to the Dedekind Eta function in the
theory of modular forms.
Remark (for those familiar with group theory): Let S = {1, . . . , n}. Recall the bijection σ 7→ σ †
from Perm(S) to Circ(S). Given σ ∈ Perm(S) let Lσ be the multiset consisting of the lengths of
the circles in σ † . Note that Lσ is just a partition of n. We introduce an equivalence relation on
Perm(S) be defining σ and τ to be equivalent if Lσ = Lτ . It is clear that the number of equivalence
classes is pn the number of partitions of n. The significance of this is that the equivalence relation
on Perm(S) we defined has an interpretation in the theory the symmetric group Sn (i.e. Perm(S)
viewed as a group): two permutations are equivalent if they are conjugate in the sense of group
theory. The equivalence classes are the conjugacy classes in the symmetric group. The conclusion
is that pn is the number of conjugacy classes in the symmetric group Sn .
50 KRISHNA KAIPA
29. Counting labeled trees
Number of labeled trees on n vertices A graph consists of a set V consisting of n vertices, and a
subset E of the n2 possible edges between these vertices. A path v1 v2 . . . vk in the graph consists
of a sequence of distinct vertices v1 , . . . , vk such that there is an edge between vi and vi+1 for each
1 ≤ i ≤ k − 1. A tree is a graph in which there is a unique path between any two vertices. Clearly,
a tree with one vertex has no edges. So, we now assume n > 1. Pick any vertex, say vn . Let r be
the number of edges emanating from vn . If r = 0, then there can be no path starting at vn which
is not the case. Therefore 1 ≤ r ≤ n − 1. Let w1 , . . . , wr be the neighbours of vn . If we remove vn
and the r edges emanating from vn , then we are left with a graph G0 on n − 1 vertices. In G0 , if
a vertex u can be connected to wi then it cannot be connected to wj for j 6= i: for otherwise, in
the original graph, we would have two distinct paths between wi and wj , namely wi − vn − wj and
the path P P 0 where P and P 0 are the paths in the new graph from wi to u and u to wj . Moreover,
any vertex u is connected to wi for some i, because otherwise u cannot be connected to vn in the
original graph G. Thus G0 decomposes into a disjoint union of graphs G1 , . . . , Gr with the property
that if u, v ∈ Gi then the unique path joining u and v in G actually lies in Gi . In other words, each
Gi is a tree. We claim that the number of edges in a tree is one less than the number of vertices
ν. This is true if ν = 1. Assume, inductively that the result is true for less than ν vertices. In
particular , in the graph G0 above the number of vertices exceeds the number of edges by r. In G,
there is one more vertex vn and r more edges (joining vn to w1 , . . . , wr ). Therefore the number of
vertices of G exceeds the number of edges by r + 1 − r = 1.
Let us label the vertices as v1 , v2 , . . . , vn . How many distinct trees are there on this set of vertices?
Let tn denote this number. Clearly t1 = 1 (no edges), and t2 = 1 (one edge) . If n = 3, of the 3
vertices one of them has 2 edges and the other two have 1 edge emanating. The vertex with 2 edges
can be chosen in 31 = 3 ways, and hence t3 = 3.
Theorem. (Cayley 1885) tn = nn−2 .
There are several proofs of this theorem. We give a proof using exponential generating functions.
A rooted labeled tree is a tree in which a particular vertex has been singled out. Let τn denote the
number of rooted labeled trees on n vertices. Clearly τn = ntn . Our goal is to show that τn = nn−1 .
This will show tn = nn−2 . It is easy to see that τ1 = t1 = 1. We define a formal power series:
∞
X
(12) T (x) = τn X n /n!
n=1
We now assume n ≥ 2. As above, let 1 ≤ r ≤ n − 1 be the number of edges emanating from

vn . We recall that the graph G0 is an unordered collection of r disjoint trees (on total vertex set
{v1 , . . . , vn−1 }) and each of these r trees has a root (the unique element of this tree which is a
neigbour of vn ). Let us order these trees as G1 , . . . , Gr . There are r! ways to do so. Let wi denote
the root of Gi . Let νi denote the number of vertices of Gi . The number of ways of choosing the
vertex sets of G1 , . . . , Gr is the multinomial coefficient ν1n−1

,...,νr
. Therefore, for n ≥ 2, we get:
n−1
X
1
X n − 1!
τn /n = r!
τν1 τν2 . . . τνr .
r=1 ν1 +···+νr =n−1
ν 1 !ν 2 ! . . . ν r !
Pn−1
Thus, for n ≥ 2, we conclude that τn /n! is the coefficient of X n−1 in r=1 T (x)r /r!, or also
n−1
X
eT (X) − 1 = T (X)r /r!,
r=1
r i
because T (X) has no terms X for i < r. Thus, we get:
∞
X
−1 + T (X)/X = τn X n−1 /n! = exp(T (x)) − 1.
n=2
Thus we have obtain the functional equation:
(13) T (x)e−T (x) = X
We will now solve this equation, to show τn = nn−1 . Let f (X) = Xe−X . Note that the composi-
tion f ◦ T = X. According to the Lagrange inversion theorem presented below, we have
τk kk−1
k· k!
= [X k−1 ]ekX = (k−1)!
.
Therefore, τk = k k−1 .
n
P
Theorem. (Lagrange ∼ 1770). Let f (X) = P n≥0 fnnX ∈ R[[X]] with f0 = 0, f1 6= 0. There
is a unique compositional inverse g(X) = n≥0 gn X ∈ R[[X]] with g0 = 0, g1 6= 0 satisfying
f (g(X)) = X. The coefficients of g are given by :
k[X k ]g(X) = Res(1/f k )
We will explain the term Res used above. Before, doing that we note that the existence of
g(X) is not difficult: If we plug in an unknown series g(X) = g1 X + g2 X 2 + . . . into the equation
f (g(X)) = X, we get:
f1 (g1 X + g2 X 2 + . . . ) + f2 (g1 X + g2 X 2 + . . . )2 + · · · = X(f1 g1 ) + X 2 (f1 g2 + f2 g12 ) + . . .
This gives g1 = 1/f1 , g2 = −f2 g12 /f1 . Similarly, having solved for g1 , . . . , gk we can determine gk+1
in terms of g1 , . . . , gk . Thus the exact formula for the coefficient gk is the important part of this
theorem.
We need to enlarge our ring R[[X]] of formal power series to include negative powers of X. The
ring of Laurent series denoted R((X)) consists of the set of formal series
X
{ an X n | k ∈ Z}
n≥k
with finitely many negative powers of X. The operations of addition and multiplication of Laurent
series are clear. Note that any non-zero element f (X) of of R((X)) can be written uniquely as
X m f † (X) where m ∈ Z (possibly negative) and f † (X) ∈ R[[X]] and has nonzero constant term.
The integer m is called the order of f and denoted ord(f ). At this point we note that the reciprocal
of f already exists in R((X)): it is X −m (1/f † (X)) where the reciprocal 1/f † (X) exists in R[[X]]
because f † (X) has non-zero constant term. Since every non-zero element of R((X)) has a recipro-
cal, it follows that R((X)) is a field.
52 KRISHNA KAIPA
We can extend the differentiation operation D on R[[X]] to R((X)): we just define DX n = nX n−1
for all integers n and differentiate a Laurent series term-wise. We note that
D : R((X)) → R((X))
is a linear transformation whose kernel is clearly the constants (i.e. power series with only constant
term). Note that the image of D consists of all Laurent series in which the coefficient of X −1 is
zero. This is because DX n = nX n−1 can never be a nonzero scalar multiple of X −1 .
We define the residue Res(f ) of a Laurent series f to be the coefficient of X −1 in f . Note that
Res : R((X)) → R
is a surjective linear transformation (of vector spaces over R). As noted above:
ker(Res) = Im(D).
In other words Res(f 0 ) = 0 where f 0 = Df . As a corollary we note that:
Res(f g 0 ) = −Res(f 0 g).
Another property of Res is:
[X k ]f = Res(X −k−1 f ),
which follows from the definition of Res.
A third property of Res is that for a nonzero f we have:

Res(f 0 /f ) = ord(f ).
To see this we write f = X ord(f ) f † and note that
f 0 /f = ord(f )/X + Df † /f † .
The first term has residue equal to ord(f ) and the second term is in R[[X]] and hence has residue
zero.
The most important property of Res (for the proof of Lagrange’s theorem) is
Res((f ◦ g)(X) · g 0 (X)) = Res(f )ord(g)
where g(X) ∈ R[[X]] with zero constant term (in other words ord(g) > 0). To see this we write
f = Res(f )X −1 + f1 (X), where f1 ∈ Ker(Res). Since Ker(Res) = Im(D) it follows that f1 = f20 for
some Laurent series f2 . We have
0
(f ◦ g)(X) · g 0 (X) = Res(f ) gg + (f2 ◦ g)0 .
Thus Res((f ◦ g)(X) · g 0 (X)) = Res(f )ord(g).
Returning to the proof of Lagrange’s theorem, let g(X) be the compositional inverse of f (X).
Note that ord(f ) = ord(g) = 1.
k[X k ]g = kRes(X −k−1 g) = −Res((DX −k )g) = Res(X −k g 0 ) = Res((f −k ◦ g)g 0 ) = Res(f −k )ord(g)
Since ord(g) = 1, we conclude that
k[X k ]g = Res(f −k ).
Remarks: Our presentation of Lagrange’s theorem is based on https://en.wikipedia.org/

wiki/Formal_power_series#The_Lagrange_inversion_formula.
The function W (x) = −T (−x) is the power series at x = 0 of the Lambert W -function which oc-
curs in many contexts in physics and chemistry. See https://en.wikipedia.org/wiki/Lambert_
W_function.
54 KRISHNA KAIPA
30. Assignment 8
(1) Consider a rooted tree on n vertices. We divide the vertex set as V0 , V1 , . . . where V0 consists
of the root and Vi consists of those vertices such that the unique path from the root to this
vertex has i edges. We say that an unlabeled rooted tree is a binary tree if for each for each
i and for each v ∈ Vi set of neighbours of v in Vi+1 –called the children of v– is at most 2
in number. The children are partitioned into two (possibly empty) parts called left children
and right children. If a vertex has only one child, then this child can be left or right. If a
vertex has two children, then one child is left and the other is right. Show that the number
of binary trees on n vertices is the Catalan number Cn . Example: for n = 3 the five binary
trees are (where r is root):
r → L → L, r → L → R, r → R → L, r → R → R, r → LR
(Ans: Let an be the number of such binary trees. Define a0 = 1. The left side ‘branch’
of the root is again a binary tree with j vertices, and the right side ‘branch’ of the root is a
binary tree with n − 1 − j vertices. Therefore, the numbers an satisfy the same recurrence
as that of the Catalan numbers an = a0 an−1 + a1 an−2 + · · · + an−1 a0 for n ≥ 1 and a0 = C0 .
Therefore, an = Cn .)
(2) In this problem we will give a combinatorial proof of the following partition identity in
R[[X, Y ]]:
∞ ∞
Y 1 X
k Xk
= Y
i=1
1 − Y Xi k=0
(1 − X)(1 − X 2 ) . . . (1 − X k )
(a) We can expand the left side as a power series k,n pn,k X n Y k . Interpret pn,k as the
P
number of partitions of n into exactly k parts. P
(b) We can expand the right side as a power series k,n qn,k X n Y k . Interpret qn,k as the
number of partitions of n whose largest part is k.
(c) Prove the identity by proving that pn,k = qn,k . Given a partition n = n1 + · · · + nk
with n1 ≥ n2 ≥ · · · ≥ nk , consider a n × n matrix with entries in {0, 1}. The i-th row
has ni ones followed by (n − ni ) zeros. For i > k the rows consist of zeros. Conversely
to each such n × n matrix (with {0, 1} entries, in which the number of ones in each
row is non-increasing, and within each row the zeros trail the ones) we get a partition.
Consider the transpose operation on such matrices.
(3) Consider the equation XC(X)2 = C(X) − 1 for the generating function n≥0 Cn X n of the
P
Catalan numbers. Let A(X) = C(X) − 1. Convert the above equation to a functional
equation for A(X) of a form to which Lagrange inversion can be applied. Using Lagrange
1 2n

inversion obtain the formula Cn = n+1 n .
31. Philip Hall marriage theorem/Systems of Distinct Representatives
Systems of Distinct Representatives

Let X be a finite set, and let let S1 , . . . , Sm be a collection of subsets of X (we allow Si = Sj even
if i 6= j). A system of distinct representatives (in short SDR) or a transversal for the collection
S1 , . . . , Sm is a choice of m distinct elements x1 , . . . , xm of X with the property that xi ∈ Si .
Examples:
• Let X = {1, . . . , 5} and S1 = {1, 2, 3} , S2 = {1, 4, 5} and S3 = {3, 5}. Here a transversal is
given by f (1) = 2, f (2) = 4, f (3) = 5. Another transversal is f (1) = 2, f (2) = 1, f (3) = 3.
• Let S1 , S2 , S3 , S4 be the family of subsets of the set X = {a, b, c, d, e}, defined by S1 =
{a, b, c}, S2 = S4 = {b, d}, S3 = {a, b, d}. Here (c, b, a, d) is an SDR.
Marriage Problem Let X be a set of n men, and A = {W1 , . . . , Wm } a set of m women. Each
woman Wi prefers a subset Si ⊂ X. The marriage problem is to find a SDR, i.e. to find m distinct
men M1 , . . . Mm such that for each 1 ≤ i ≤ m, the man Mi is in the preferred list of Wi .
A necessary condition for existence of a SDR:

For a SDR to exist it is necessary that for each B ⊂ {1, . . . , m} we must have
(14) | ∪i∈B Si | ≥ |B|.
Indeed, if this condition fails for some B ⊂ {1, . . . , m} of size say j where 1 ≤ j ≤ m, then it is
impossible to find j distinct representatives for the Si , i ∈ B because the union of these Si ’s itself
has size less than j. It turns out that there is no other obstruction to finding an SDR. The following
theorem is attributed to Philip Hall (1935) but D. Kőnig had also proved it earlier (in context of
matching for bipartite graphs)
Theorem. (Philip Hall marriage theorem) The necessary condition is sufficient.

Proof. The proof is by induction on m. If m = 1 then we just have to pick an element from S1
and this is possible because the condition (14) says that |S1 | ≥ 1. Assume inductively that given
any finite set Y and k < m subsets of Y , then the condition (14) is also sufficient. Returning
to the m subsets S1 , . . . , Sm of X, we consider two cases: i) either there exists B ⊂ {1, . . . , m}
with 1 ≤ |B| ≤ m − 1 for which equality holds in (14), or ii) for every B ⊂ {1, . . . , m} with
1 ≤ |B| ≤ m − 1, inequality holds in (14).
In case i) suppose wlog that |S1 ∪ · · · ∪ Sj | = j. Since j < m by inductive hypothesis, we get
distinct elements x1 , . . . , xj such that xi ∈ Si . Moreover S1 ∪ · · · ∪ Sj must equal {x1 , . . . , xj } as
both sets have the same size j. Now let Y = X \ {x1 , . . . , xj } and Ai = Sj+i \ {x1 , . . . , xj } for
1 ≤ i ≤ m−j. We claim that the condition (14) holds for this new collection: for B ⊂ {1, . . . , m−j}
we have:
∪i∈B Ai = (∪i∈B Sj+i ) \ {x1 , . . . , xj } = ((S1 ∪ · · · ∪ Sj ) ∪ (∪i∈B Sj+i )) \ {x1 , . . . , xj }.
Since (S1 ∪ · · · ∪ Sj ) ∪ (∪i∈B Sj+i ) has size at least |B| + j by (14), it follows that
| ∪i∈B Ai | = | ((S1 ∪ · · · ∪ Sj ) ∪ (∪i∈B Sj+i )) \ {x1 , . . . , xj }| ≥ |B|.
56 KRISHNA KAIPA
Since m − j < m, the inductive hypotheses gives an SDR xj+1 , . . . , xm for A1 , . . . , Am−j . Combining
with x1 , . . . , xj we get an SDR x1 , . . . , xm for S1 , . . . , Sm .
Case ii): here | ∪i∈B Si | > |B| for all B ⊂ {1, . . . , m}. Since |Sm | > 1, pick xm ∈ Sm and consider
Y = X \ {xm } and Ai = Si \ {xm } for 1 ≤ i ≤ m − 1. The condition (14) holds for this new
collection because for B ⊂ {1, . . . , m − 1} we have:
| ∪i∈B Ai | = |(∪i∈B Si ) \ {xm }| > |B| − 1 ≥ |B|.
Since m − 1 < m, the inductive hypotheses gives an SDR x1 , . . . , xm−1 for A1 , . . . , Am−1 . Combining
with xm we get an SDR x1 , . . . , xm for S1 , . . . , Sm .
First application of Hall’s theorem

Lemma 9. Suppose we have two partitions m
` `m
i=1 Ai and i=1 Bi of a set Y such that each of the
sets Ai and Bj have the same size n. We claim that there exist ai ∈ Ai for 1 ≤ i ≤ m and a
permutation σ of {1, . . . , n} such that ai ∈ Bσ(i) for 1 ≤ i ≤ m.
Proof. Let X = {1, . . . , m} and consider subset S1 , . . . , Sm of X defined by
Si = {1 ≤ j ≤ m : Ai ∩ Bj 6= ∅}.
Given 1 ≤ j ≤ m and B ⊂ {1, . . . , m} of size j we claim (14) holds: indeed | ∪i∈B Si | equals the
number of nonempty sets in the right side of
qi∈B Ai = qm
i=1 (Bi ∩ (qi∈B Ai ))
Therefore the size of the set on the right side in this equation is at-most n · | ∪i∈B Si | (note each Bi
has size n) where as the set on the left side has size |B|n. This establishes that | ∪i∈B Si | ≥ |B|.
By Hall’s theorem, we get an SDR σ(1), . . . , σ(m) of m distinct elements of {1, . . . , m} (in other
words a permutation of {1, . . . , m}) such that σ(i) ∈ Si which means that Ai ∩ Bσ(i) 6= ∅. Pick
ai ∈ Ai ∩ Bσ(i) .
As a corollary, we can prove the following theorem related to group theory (this topic is optional):
Theorem. Let H be a subgroup of a group G such that [G : H] = m is finite. There exist
g1 , . . . , gm ∈ G such that
G = ∪m m
i=1 gi H = ∪i=1 Hgi .
Proof. We will prove the theorem under the extra assumption that G is finite. In the Assignment,
we will drop this assumption. Let |H| = n and |G| = mn. We apply the previous lemma with
A1 , . . . , Am and B1 , . . . , Bm being respectively, the set of left cosets of H in G and the set of right
cosets of H in G. By the lemma there exist distinct elements a1 , . . . , am of G such that:
G = qm m
i=1 ai H = qi=1 Hai .

32. Birkhoff- von Neumann theorem on double stochastic matrices
Second application of Philip Hall theorem: The Birkhoff- von Neumann theorem.
We begin with some definitions.

• A vector v ∈ Rn is called a probability vector is vi ≥ 0 and
P
vi = 1.
• A n × n real matrix A is called Markov if each of the n columns of A is a probability vector.
• A n × n real matrix A is called doubly stochastic if both A and AT (the transpose of A) are
Markov.
• A n×n permutation matrix A is a matrix with entries in {0, 1} such that each row and column
has exactly one non-zero entry. There are n! permutation matrices. For each permutation σ
of {1, . . . , n} the permutation matrix Aσ is defined by Aij = δσ(i),j . Note that a permutation
matrix is doubly stochastic.
• If V is a real vector space and v1 , . . . , vm ∈ V then a convex combination of v1 , . . . , vm is a
linear combination t1 v1 + · · · + tm vm with the property that ti ≥ 0 and t1 + · · · + tm = 1.
• A convex combination of Markov matrices is Markov, and similarly a convex combination
of doubly stochastic matrices is doubly stochastic.
Theorem. (G. Birkhoff 1946, J. von Neumann 1953) Every n × n doubly stochastic matrix is a
convex combination of n × n permutation matrices.
Proof. The proof is by induction on the number mA of non-zero entries of the doubly stochastic
matrix A. Since each column of A must have a non-zero entry, it follows that mA ≥ n. In fact, if
mA = n then it is clear that A is obtained by permuting the columns of the n × n identity matrix,
i.e. A is a permutation matrix. Thus the base case of the induction, mA = n has been established.
We now assume inductively that the theorem is true if mA < m. Given a doubly stochastic matrix
A with mA = m, let X = {1, . . . , n} and define S1 , . . . , Sn ⊂ X by
Si = {j ∈ X : Aij 6= 0}.
For B ⊂ X let SB = ∪i∈B Si . We claim that |SB | ≥ |B| for all B ⊂ X :
X n
XX XX XX n
XX X
|B| = 1= Aij = Aij = Aij ≤ Aij = 1 = |SB |.
i∈B i∈B j=1 i∈B j∈SB j∈SB i∈B j∈SB i=1 j∈SB
Therefore, by Philip Hall theorem there exists a permutation σ of X = {1, . . . , n} such that σ(i) ∈
Si , or in other words Ai,σ(i) > 0. Let λ = min{Ai,σ(i) : 1 ≤ i ≤ n}. Note that 0 < λ ≤ 1 and that
λ = 1 if and only if A = Pσ . Therefore, if λ = 1 then we have already expressed A as a convex
combination of permutation matrices. If λ < 1, we define
A − λPσ
A0 = .
1−λ
Note that the sum of the j-th column of A0 equals
( sum of the j-th column of A) − λ
=1
1−λ
58 KRISHNA KAIPA
The same is true for the row-sums of A0 also. Moreover A0ij ≥ 0, because:
(
Aij /(1 − λ) if j 6= σ(i)
A0ij = Aiσ(i) −λ ≥ 0.
1−λ
if j = σ(i)
Thus A0 is doubly stochastic. Also, the number of non-zero entries of A exceeds the number of
non-zero entries of A0 by
#{i : Ai,σ(i) = λ} ≥ 1.
Thus the number of non-zero entries of A0 is strictly P less than
P the number of non-zero entries of A.
By the inductive hypothesis, it follows that A0 = τ tτ Pτ , τ tτ = 1, where the sum runs over all
permutations τ of {1, . . . , n}, and where tτ ≥ 0. It follows that
X
A = λPσ + (1 − λ)tτ Pτ .
τ
We note that the right side is indeed a convex combination of permutation matrices.
33. Generalized Phillip Hall marriage theorem, Kőnig-Egerváry theorem
Let S1 , . . . , Sm be subsets of a finite set X. For a nonempty subset B of {1, . . . , m}, let SB = ∪i∈B Si .
We recall the Philip Hall theorem: for an SDR of S1 , . . . , Sm to exist, the necessary condition
|SB | ≥ |B| ∀B ⊂ {1, . . . , m},
is also sufficient. There is a generalization of the theorem when the necessary condition does not
hold. We define a deficit of S1 , . . . , Sm to capture the worst case of the violation of the necessary
condition, i.e. the largest value of |B| − |SB | for which |SB | < |B|:
∆ = max{|B| − |SB | : |SB | < |B|}

If the necessary condition is never violated, we define ∆ = 0. In other words
∆ = maxB⊂{1,...,m} max{0, |B| − |SB |}.
Theorem. (Generalized Phillip Hall marriage theorem) The largest sub-collection of S1 , . . . , Sm for
which an SDR exists is m − ∆.
Proof. Suppose a sub-collection of size m − t of S1 , . . . , Sm has an SDR. We may assume this
sub-collection is S1 , . . . , Sm−t and let xi ∈ Si for 1 ≤ i ≤ m − t be an SDR . Enlarge the set
X by adding t extra elements X̃ = X q {z1 , . . . , zt }, and define a collection of m subsets of X̃
by S̃i = Si q {z1 , . . . , zt }. The collection S̃1 , . . . , S̃m of subsets of X̃ has an SDR: take the SDR
xi ∈ Si ⊂ S̃i for 1 ≤ i ≤ m − t, and take zi ∈ S̃i for m − t + 1 ≤ i ≤ m. Since the Hall condition is
necessary, we must have for each non-empty B ⊂ {1, . . . , m}:
|B| ≤ | ∪i∈B S̃i | = t + | ∪i∈B Si |,
which shows that ∆ ≤ t, i.e. m − ∆ ≥ m − t.
Conversely, let X̃ = X q {z1 , . . . , z∆ } and consider the collection of m subsets of X̃ given by
S̃i = Si q {z1 , . . . , z∆ }. For each non-empty B ⊂ {1, . . . , m}, we have:
| ∪i∈B S̃i | = ∆ + | ∪i∈B Si | = |B| + [ ∆ − ( |B| − | ∪i∈B Si | ) ] ≥ |B|,
where the last inequality follows from the definition of ∆. Since the Hall condition is satisfied, there
exists an SDR xi ∈ S̃i for 1 ≤ i ≤ m. Note that S̃i = Si q {z1 , . . . , z∆ } and hence at most ∆ of
these xi ’s can be in {z1 , . . . , z∆ }. It follows that a sub-collection of size at least m − ∆ of S1 , . . . , Sm
has an SDR.
Combining these two assertions,we conclude that the maximum possible size of a sub-collection
of S1 , . . . , Sm which has an SDR is m − ∆.
The next theorem (∼ 1914) stated in terms of 0, 1-matrices is a another version of the generalized
Philip Hall theorem.
Theorem 10. (Kőnig-Egerváry theorem) Let A be a m × n matrix with 0, 1 entries. Define a line
of a m × n matrix to be a row or column. The maximum possible size of a subset of the set of 1’s of
A, no two of which are on the same line is equal to the minimum possible size of a subset of lines
of A which together contain all the 1’s of A.
60 KRISHNA KAIPA
Proof. Let X = {1, . . . , n} and consider the collection of subsets S1 , . . . , Sm of X given by

Si = {j ∈ X : Aij = 1}.
Let B be a non-empty subset of {1, . . . , m} and suppose the sub-collection Si for i ∈ B has an
SDR, given by ji ∈ Si for i ∈ B. We note that the collection of 1’s of A indexed by the entries
{(i, ji ) : i ∈ B} is a collection of 1’s no two of which are on the same line. Conversely given such
a collection of 1’s, appearing in places (i1 , j1 ), (i2 , j2 ), . . . , (iµ , jµ ) it follows that the i1 . . . , iµ are
distinct j1 , . . . , jµ are distinct. Thus the sub-collection Si1 , . . . , Siµ has an SDR. It follows that the
maximum possible size of a collection of such 1’s is the maximum possible size of a sub-collection
of S1 , . . . , Sm which has an SDR.
Given a collection of lines of A which includes all the 1’s of A, let C ⊂ {1, . . . , m} be the set
of rows in this collection (it is possible that C = ∅). Let B = {1, . . . , m} \ C. The columns
SB = ∪i∈B Si must be in the collection of lines. Thus the minimum possible size of a collection of
such lines is
min{m − |B| + |SB | : B ⊂ {1, . . . , m}}.
If B = ∅ i.e. C = {1, . . . , m} then the corresponding collection has size m. So the minimum value
above is m − ∆ where
∆ = max{|B| − |SB | : B ⊂ {1, . . . , m}, |B| > |SB |}.
The result now follows from the generalized Philip Hall theorem. Moreover, given a collection
S1 , . . . , Sm of subsets of a finite set X (say of size n), we can form the 0, 1 -matrix A of size
m × n above, and hence the Kőnig-Egerváry theorem is equivalent to the generalized Philip Hall
theorem.
34. Assignment 9
(1) Let A = (A1 , A2 , . . . , An ) be a family of sets with an SDR. Let x be an element of A1 . Prove
that there is an SDR containing x, but show by example that it may not be possible to find
an SDR in which x represents A1 .
(2) Suppose A = (A1 , A2 , . . . , An ) is a family of sets that “more than satisfies” the marriage
condition in the sense that:
|Ai1 ∪ Ai2 ∪ · · · ∪ Aik | ≥ k + 1 ∀ k = 1, 2, . . . , n, 1 ≤ i1 < · · · < ik ≤ n.
Let x ∈ A1 . Prove that A has an SDR in which x represents A1 .
(3) Let n > 1 and let A = (A1 , A2 , . . . , An ) be the family of subsets of {1, 2, . . . , n}, where
Ai = {1, 2, . . . , n} \ {i} for i = 1, 2, . . . , n. Prove that A has an SDR and that the number
of SDRs is the n-th derangement number Dn .
(4) A corporation has seven available positions Y1 , Y2 , . . . , Y7 and there are ten applicants X1 , . . . , X10 .
The set of positions each applicant is qualified for is given, respectively, by
{Y1 , Y2 , Y6 }, {Y2 , Y6 , Y7 }, {Y3 , Y4 }, {Y1 , Y5 }, {Y6 , Y7 }, {Y3 }, {Y2 , Y3 }, {Y1 , Y3 }, {Y1 }, {Y 5}.
Determine the largest number of positions that can be filled by the qualified applicants and
justify your answer.
(5) Let A = (A1 , A2 , A3 , A4 , A5 , A6 ) where

A1 = {a, b, c}, A2 = {a, b, c, d, e}, A3 = {a, b}, A4 = {b, c}, A5 = {a}, A6 = {a, c, e}.
Does the family A have an SDR? If not, what is the largest number of sets in the family
with an SDR?
(6) (optional) In this problem we drop the assumption that G is finite, in the theorem about
group theory in the previous section.
(a) Prove the following group theory lemma:
Lemma. Let H be a subgroup of a group G such that [G : H] = m is finite. Prove that
there is a normal subgroup N of G with N contained in H and such that the quotient
group G/N is finite. (Hint: consider the kernel of the homomorphism G → Sm afforded
by the left action of G on G/H.)
(b) Let N be as in part a) and let Ḡ denote the finite group G/N . Let H̄ = H/N . Note
that [Ḡ : H̄] = m. (Prove this if it is not clear). Since we have proved the theorem for
finite G we obtain elements x1 , x2 , . . . , xm of G such that
Ḡ = ∪m m
i=1 xi H̄ = ∪i=1 H̄xi .
Use this to show that:
G = ∪m m
i=1 xi H = ∪i=1 Hxi .
62 KRISHNA KAIPA
35. Quiz-4
(1) (6 points) Consider the equation XC(X)2 = C(X)−1 for the generating function n≥0 Cn X n
P
of the Catalan numbers. Let A(X) = C(X) − 1. Convert the above equation to a functional
equation for A(X) of a form to which Lagrange inversion can be applied. Using Lagrange
1 2n

inversion obtain the formula Cn = n+1 n
.
Ans: We have X = A(X)/(1+A(X))2 which shows that A(X) is the compositional inverse
of f (X) = X/(1 + X)2 . By Lagrange inversion, for n ≥ 1, we have
Cn = n1 Res(f −k ) = n1 Res(X −n (1 + X)2n ) = n1 [X n−1 ](1 + X)2n = n1 n−1
2n 1 2n

= n+1 n
.
(2) (7 points) Let S1 , . . . , Sn be n sets that have an SDR. Suppose that a1 , . . . , at are an SDR for
S1 , . . . , St where t < n. Prove that S1 , . . . , Sn have an SDR including the elements a1 , . . . , at ,
but not necessarily as representatives of S1 , . . . , St . Give an example of sets S1 , . . . , Sn and
elements a1 , . . . , at that constitute an SDR for S1 , . . . , St but which cannot be representa-
tives of S1 , . . . , St in any SDR of S1 , . . . , Sn .
Ans: Let x1 , . . . , xn be an SDR for S1 , . . . , Sn . If {a1 , . . . , at } ⊂ {x1 , . . . , xn } then we are

done. Otherwise, there is an integer 0 ≤ µ ≤ t − 1 such that (upto relabeling a1 , . . . , at and
S1 , . . . , St ) that {a1 , . . . , aµ } ⊂ {x1 , . . . , xn } and aµ+1 , . . . , at ∈
/ {x1 , . . . , xn }. It suffices to
show that we can increase the quantity {a1 , . . . , at } ∩ {a1 , . . . , at }.
Case 1: If µ < t/2. In this case replacing xµ+1 , . . . , xt by aµ+1 , . . . , at respectively, we get
an SDR
x1 , . . . , xµ , aµ+1 , . . . , at , xt+1 , . . . , xn
which contains {aµ+1 , . . . , at }. This SDR has at least t − µ > µ elements from {a1 , . . . , at }.
Case 2: µ ≥ t/2: Suppose there is an i ∈ {1, . . . , t − µ} such that xµ+i ∈ / {a1 , . . . , aµ }
(this is always possible if µ > t/2) then replacing xµ+i by aµ+i we get a new SDR with at
least µ + 1 elements from {a1 , . . . , at }.
It remains to consider the case when µ = t/2 and {xt/2+1 , . . . , xt } = {a1 , . . . , at/2 }. Here
we note that {x1 , . . . , xt/2 } and {a1 , . . . , at/2 } are disjoint. Replacing xi by ai and xt/2+i by
at/2+i for 1 ≤ i ≤ t/2, we get an SDR containing all of a1 , . . . , at .
part2) : Let S1 , . . . , Sn ⊂ S where |S| ≥ n (which is necessary for an SDR of S1 , . . . , Sn

to exist). Let a1 , . . . , at be distinct elements of S, and suppose S1 = · · · = Sn−1 = S and
Sn = {at }. The Hall condition holds for S1 , . . . , Sn : For a subset B of {1, . . . , n}, if B = {n}
then | ∪i∈B Si | = 1 = |B|. For any other B, we have | ∪i∈B Si | = |S| ≥ n ≥ |B|. Thus
an SDR for S1 , . . . , Sn does exist. Clearly in any such SDR at represents Sn and hence
a1 , . . . , at cannot represent S1 , . . . , St in any SDR for S1 , . . . , St . Since S1 = · · · = St = S,
clearly a1 , . . . , at is an SDR for S1 , . . . , St .
(3) (7 Points) Let P (X) = ∞ n

P
n=1 pn X be the generating function of the partition numbers.
Consider the formal power series defined by:
∞
X XP 0 (X)
an X n = .
n=1
P (X)
(where P 0 (X) is the derivative of P (X)). Give an interpretation of an in terms of elementary
number theory.
0 (X) P∞ kX k
Ans: Since P (X) = Π∞ k −1
k=1 (1 − X ) we get XP
P (X)
= k=1 1−X k (see justification below).
n
The contribution to the coefficient of X of the k-th summand is 0 if k - n and k if k | n.
Thus an = σ(n) the sum of divisors of n.
0 (X)
Justification for XP = ∞ kX k
P
P (X) k=1 1−X k :
In terms of Q(X) = 1/P (X) = ∞ k
Q
k=1 (1 − X ), we must show
X∞
0 kX k
−XQ · Q = 1−X k
,
k=1
which in turn reduces to
∞
X Y
Q0 = (1 − X k )0 (1 − X i )
k=1 i∈N\{k}
Let Trn denote the polynomial of degree at most n obtained by truncating a power series
up-to the X n -th term. In other words Trn C(X) = (c0 + c1 X + · · · + cn X n ). Now, two
elements A(X), B(X) ∈ R[[X]] are equal if and only if Trn A = Trn B for all non-negative
integers n. Thus we must show (for each n):
 
∞
X Y
Trn (Q0 ) = Trn  (1 − X k )0 (1 − X i )
k=1 i∈N\{k}
This follows from:

!0    
n+1
Y n+1
X Y X∞ Y
Trn (Q0 ) = Trn (1 − X k ) = Trn  (1 − X k )0 (1 − X i ) = Trn  (1 − X k )0 (1 − X i )
k=1 k=1 i∈{1,...n+1}\{k} k=1 i∈N\{k}
64 KRISHNA KAIPA
36. Additional Practice Problems

(1) Recall that the Stirling number S(n, k) of the second kind, is the number of ways to put
n distinct objects into k unlabeled boxes, such that no box is non-empty. Determine the
following values (with reasoning), assuming n ≥ 2. Express your final answer in terms of
binomial coefficients.
(a) S(n, 2).
(b) S(n, n − 1).
(c) S(n, n − 2).
(2) a) Show that the number of partitions of n with each part at most 2 is bn/2c + 1.
b)* Show that the number of partitions of n with each part at most 3 is the nearest integer
to (n + 3)2 /12 .
(3) Determine a recurrence relation for the number an of ternary strings (made up of 0’s, 1’s,
and 2’s) of length n that do not contain two consecutive 0’s or two consecutive 1’s. Then
find a formula for an .
(4) Let S be the multiset {∞ · e1 , ∞ · e2 , ∞ · e3 , ∞ · e4 }. Determine the generating function

for the sequence h0 , h1 , h2 , · · · , hn where hn is the number of n- combinations of S with the
following added restrictions:
(a) Each ei occurs an odd number of times.
(b) Each ei occurs a multiple-of-3 number of times.
(c) The element e1 does not occur, and e2 occurs at most once.
(d) The element e1 occurs 1,3, or 11 times, and the element e2 occurs 2, 4 or 5 times.
(e) Each ei occurs at least 10 times.
(5) Find the exponential generating function for the number of symmetric n × n permutation
matrices.
37. Final Exam

(1) (7 pts) Suppose we have two partitions m
` `m
i=1 Ai and i=1 Bi of a set Y such that each of the
sets Ai and Bj have the same size n. Show that there is a relabeling of the sets B1 , . . . , Bm
such that each Ai ∩ Bi is non-empty.
(2) (a) (4 pts) Give a formula (proof not required) for the generating function C(X) = ∞ n
P
n=0 Cn X
of the Catalan numbers (Recall C0 = C1 = 1 and C2 = 2)
(b) (4 pts) Using part a) or otherwise, express the following expression as a binomial coef-
ficient (with proof)
n
X 1 2j 2n − 2j
j=0
j+1 j n−j
(3) Let hn denote the number of ways of putting a certain structure (let us call it h-structure)
on a set of size n ≥ 1. The exponential generating function for h-structures is h(X) =
P ∞ n
n=1 hn X /n!. For example if h is the trivial structure (putting no structure) on a set of
size n (so that hn = 1) then h(X) = eX − 1.
(a) (5 pts) Let g and h be two structures. An f -structure on a set S of size n is obtained
by
• partitioning S into any number of non-empty blocks. The blocks are not labeled.
• putting an h-structure on each block.
• putting a g-structure on the set of blocks.
Obtain an expression for f (X) in terms of g(X) and h(X).
(b) (5 pts) Recall that the Bell number Bn is Pthe number of equivalence relations on a set
of size n. Obtain a formula for B(X) = ∞ B
n=1 n X n
/n!, preferably using part a).
(c) (5 pts) Recall that the Stirling number of the first kind cn,k is the number of ways of
arranging n distinct people into k unlabeled circles. Obtain a formula for Ck (X) =
n
P
n≥1 cn,k X /n!, preferably using part a).
(d) (5 pts) A rooted forest on n labeled vertices {v1 , . . . , vn } is a disjoint union of any num-
ber of rooted labeled trees such that union of the vertex set of all the trees is {v1 , . . . , vn }.
Recall that a rooted tree is just a tree with one of its vertices singled out. Let τn the
number of rooted labeled trees on n vertices, and let fn be the number of rooted forests
on n labeled vertices. For
P∞example f1 = 1, f2 = 3, and τnP = nn−1 . Express, preferably
using part a), F (X) = n=1 fn X n /n! in terms of τ (X) = ∞ n
n=1 τn X /n!. Find a simple
∞ ∞
relation between the sequence {fn }n=1 and the sequence {τn }n=2 , and use it to obtain
the functional equation for τ (X).
Solution to Question 1) done in class
Solution to Question 2)
a) We have
X √
1 2n 1− 1−4X

C(X) = n+1 n
Xn = 2X
.
n≥0
66 KRISHNA KAIPA
b) We have X
2n
X n = (XC(X))0 = (1 − 4X)−1/2 .

n
n≥0
Thus n
X
1 2j 2n−2j

j+1 j n−j
j=0
n
is the coefficient of X in
√
(1−4X)−1/2 −1
X
1− 1−4X √ 1 −1/2 2n−1
n−1
(−4)n X n−1 /2 =
P
2X
· 1−4X
= 2X
= n n≥1 n−1
X .
n≥1
2n+1

Therefore, the answer is n
.
Solution to Question 3)
a) We claim f (X) = g(h(X)). Indeed:

X X 1 n
fn = gk hi1 hi2 . . . hik
k≥1
k! i1 i2 . . . ik
{(i1 ,i2 ,...,ik )∈Nk :i1 +···+ik =n}
This gives
fn X n hi1 X i1 hi2 X i2 hik X ik k
X X Y Xg
gk k h(X)
f (X) = n!
= k! i1 ! i2 !
... ik !
= k!
= g(h(X))
n≥1 k≥1 (i1 ,i2 ,...,ik )∈Nk k≥1
x −1
b) Here f (X) = B(X) and h(X) = g(X) = eX − 1. Thus B(X) = ee − 1.
c) Here f (X) = Ck (X) and gn = δk,n which gives g(X) = X k /k!. PAlso hn = (n − 1)! as it is
the number of circular permutation of n objects, which gives h(X) = n≥1 X n /n = − log(1 − X).
Thus
Ck (X) = (− log(1 − X))k /k!
d) For F (X) we must take hn = τn and gn = 1. Thus F (X) = eτ (X) − 1.
Next, we note that for n ≥ 2, we have τn /n (the number of unrooted labeled trees on n vertices
{v1 , . . . , vn }) equals fn−1 : to see this we note that given such a tree, if we remove vn and all edges
emanating from it, we get a rooted forest on {v1 , . . . , vn−1 }. Thus
X nf X n
τ (X) = X + n−1
n!
= X + XF (X) = Xeτ (X) .
n≥2
This we get the functional equation τ (X)e−τ (X) = X.

38. Epilogue
There are some topics which should have been covered to some extant during the course, but
which we did not have time for. Here we list some of these topics/techniques.
Introduction to Ramsey Theory. This topic will be covered in the Graph Theory course. A
basic example of this theory is the following observation. In a party with at least 6 people, there
will either be a group of 3 people who know each other, or a group of 3 people who are strangers
to each others. Given positive integers m and n, the Ramsey number, R(m, n) is the minimum
number of people that should be in a party so that either there is a group of m people who know
each other pairwise, or a group of n people who are pairwise strangers. The technique known as
the probabilistic method gained prominence after it was used by Erdős (1947) to obtain the lower
bound R(m, m) ≥ 2m/2 .
m+n−2

An upper bound for the Ramsey numbers is R(m, n) ≤ m−1
. A nice way to prove this is the
technique of double induction below
Technique of Double Induction.
Theorem. If a statement P (m, n) for (m, n) ∈ N × N is true for (m, n) ∈ {1} × N and for
(m, n) ∈ N × {1}, and if P (m, n) follows from P (m − 1, n) and P (m, n − 1), then P (m, n) holds
for all (m, n) ∈ N × N.
Proof. This is just a variant of ordinary induction, by inducting on k = m + n. Let R(k) be the
assertion that P (m, n) is true whenever m + n = k. The base case is k = 2. Assume R(k − 1)
holds. For any (m, n) with m + n = k, we note that (m − 1, n) and (m, n − 1) are true by inductive
hypothesis. We now use the fact P (m − 1, n) and P (m, n − 1) ⇒ P (m, n) to deduce that P (m, n)
is true. Thus R(k) holds.
The marching band Problem. Exercise 3.4.26 of [Brualdi]
Suppose that the mn people of a marching band are standing in a rectangular formation of m rows
and n columns in such a way that in each row each person is taller than the one to his or her right.
Suppose that the leader rearranges the people in each column in decreasing order of height from
front to back. Show that the rows are still arranged in decreasing order of height from left to right.

LN 2

Uploaded by

Copyright:

Available Formats

LN 2

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

LN 2

Uploaded by

Copyright:

Available Formats

MTH 318: COMBINATORICS

(2) The number of derangements of a set S of size n. A derangement is a permutation (bijective

2. Permutations, Combinations, and Formal Power Series

The number of r-permutations of a set of size n with repetition or replacement.

The number of r-combinations of a set of size n.

The number of r-combinations of a set of size n with repetition/replacement.

The ring of formal power series

multiply them to get

PGenerating function approach to obtaining bn,r As discussed in §1, in terms of Bn (X) =

which shows that the unique multiplicative inverse of (1 − X)n is

3. Permutations of Multisets, Equivalence Relations

Equivalence Relations on a set

(6) Let n be a positive integer. Suppose we choose a sequence i1 , i2 , . . . , in of integers between

(7) Consider the multiset {n · a, n · b, 1, 2, 3, ..., n + 1} of size 3n + 1. Determine the number of

Some sample applications of the pigeonhole principle.

Consider the list of n remainders a + jm mod n for 0 ≤ j ≤ n − 1. In case a + im ≡ a + jm mod n

7. Pigeon-hole Principle, general version

In particular the product of the numbers in B is a square.

We also discussed Problems (4)-(5) from Assignment 1.

8. Binomial Coefficients generalized

converges to the number (1 + x)n .

We recall the identity

The identity (2) can be generalized as below

A combinatorial interpretation of (6) is as follows. Given an integer j, there are ji n−j

We note that (6) is a special case of (7) for i = k (or i = 0).

(4) Let n and k be integers with 1 ≤ k ≤ n. Prove that

10. Sperner’s theorem

An antichain of S is a subset A of 2S such that if A 6⊂ B for any pair of elements A, B ∈ A. For

Suppose C is a chain of S and A is an antichain of S. If A, B ∈ C then we have A ⊂ B or B ⊂ A,

Using Lemma 2 we get

Combining inequalities (9) and (10), we conclude that:

11. Practice Problems for Quiz 1

(7) Find one binomial coefficient equal to the following expression:

(8) Determine the number of 12-combinations of the multiset S = {4 · a, 3 · b, 4 · c, 5 · d}.

(9) Determine the number of solutions of the equation X1 + X2 + X3 + X4 = 14 in nonnegative

(2) ([5 points]) Find and prove a formula for

All quantities involved are non-negative integers.

(3) ([5 points]) How many permutations of

The last expression is the constant term in the power series

13. Inclusion-Exclusion Principle

Proof. The proof is by induction on n. The base case n = 2 is well known:

By the inductive hypothesis, we get:

This can be written as:

The last quantity is | ∪ni=1 Ai | using the base case n = 2.

Derangements of a finite set.

14. Möbius inversion on the power-set of a finite set

Möbius inversion on the power-set of a finite set

The following lemma tells us how to recover f from f † :

Proof. Using the definition of f † the right hand side is:

Let K̄ = K \ J and let L̄ = L \ J. We have

Since δ0,|K̄| = δJ,K , we get:

In particular f † (X) = S. We note that S∅ = ∩ni=1 (S \ Bi ), and hence:

(3) Prove that Dn is an even number if and only if n is an odd number.

(4) Determine the number of 10-combinations of the multiset S = {∞ · a, 4 · b, 5 · c, 7 · d}.

16. Möbius Inversion on Posets

Möbius Inversion on Posets A relation R on a set X is just a subset of X × X. The relation R

Given a poset (X, ≤) let

17. Möbius Inversion on Posets, continued...

Example 1) If X is a totally ordered set, then