Inclusion-Exclusion Principle: idea and how to use it

What is the Inclusion-Exclusion Principle?

The Inclusion-Exclusion Principle may sound fancy but it is nothing more than a counting strategy. The Principle of Inclusion-Exclusion provides a method to find the size of the union of a group of sets, given the size of each set and the size of all possible intersections among the sets.

To illustrate, we take a look at these examples. Do not worry if the solutions do not make sense to you yet. We will revisit these examples and break down the solutions in a more organized way.

In a class of 35 students, 20 are taking math, 15 are taking computer science, and 10 are taking both math and computer science. How many students are taking either math or computer science?

We apply the Inclusion-Exclusion Principle and get $$20+15-10 = 25.$$ Therefore, 25 students are taking either math or computer science.

A large software development company employs 115 computer programmers. Of them, 75 are proficient in Java, 50 in C#, 100 in Python, 35 in C# and Java, 60 in Java and Python, 45 in C# and Python, and just 10 programmer is proficient in all three languages above. Determine the number of computer programmers who are not proficient in none of these three languages.

We again apply the Inclusion-Exclusion Principle and get $$75+50+100-35-60-45+10 = 95.$$ In addition, 115-95 = 20. Therefore, 20 computer programmers who are not proficient in none of these three languages.

Inclusion-Exclusion Formula

We see the Inclusion-Exclusion Principle in different areas of mathematics such as combinatorics and probability. Due to the different focuses of the fields, the inclusion-exclusion principle will be expressed differently. However, the idea is the same.

Combinatorics

Combinatorics is concerned with counting the number of elements in different sets. The notation \( N(A) \) or \( |A| \) represents the size of a set \(A\), also known as the cardinality of \(A\). We have the following formula:

The two-set case: $$ N(A \cup B) = N(A) + N(B) – N(A \cap B)$$

The three-set case: $$N(A \cup B \cup C) = N(A) + N(B) + N(C) – N(A \cap B) – N(B \cap C) – N(A\cap C) + N(A \cap B \cap C)$$

Probability

Probability is concerned with measuring the likelihood of events. The notation \(P[A]\) represents the probability of event \(A\) occurring. We have the following formula:

The inclusion-exclusion principle of two sets is expressed as $$ P[A \cup B] = P[A] + P[B] -P[A \cap B]$$

The three-set case: $$P[A \cup B \cup C] = P[A] + P[B] + P[C] – P[A \cap B] – P[B \cap C] – P[A\cap C] +P[A \cap B \cap C]$$

The formula also extends from the simple two-set or three-set case to a general \(n\)-set scenario, which makes this principle a powerful tool because it provides a systematic approach to solving complex counting problems and determining probabilities of compound events. However, we do not want to get lost in abstract and complex formulas. For the sake of simplicity, we will focus on the two-set and three-set cases.

When to use the Inclusion-Exclusion Principle?

From the above formulas, we see that the Inclusion-Exclusion principle expresses the size of the union of a group of sets by the size of each set and the size of all possible intersections among the sets. When we find ourselves in such a scenario with a group of sets, the Inclusion-Exclusion principle is a good candidate to consider. Unfortunately, nothing is absolute. This is just an initial filter to increase our chance to choose the right tool to tackle a problem.

Finally, let us look at the examples from above, one more time.

Example1

In a class of 35 students, 20 are taking math, 15 are taking computer science, and 10 are taking both math and computer science. How many students are taking either math or computer science?

To illustrate it visually, we consider the following Venn diagram,

First, we introduce some notation. Let S denote the set of the students in the class, \(A\) denote the students who take math, and \(B\) denote the students who take computer science. The intersection part, i.e., the overlapped part, \(A \cap B\), denotes the students who take both.

Next, we try to express the question in terms of the notation. From the given information, we have \(N(S) = 35\), \(N(A) = 20\), \(N(B) = 15\), and \(N (A \cap B) = 10\). Our goal is to find the size of the union of the sets \(A\) and \(B\), i.e., \(N(A \cup B)\), which is the number of students who take either math or computer science.

The Inclusion-Exclusion Principle gives an easy way to figure out the size \(N(A \cup B)\).
$$ N(A \cup B) = N(A) + N(B) – N(A\cap B) = 20+15-10 = 25$$ There are 25 students who take either math or computer science.

Example 2

A large software development company employs 115 computer programmers. Of them, 75 are proficient in Java, 50 in C#, 100 in Python, 35 in C# and Java, 60 in Java and Python, 45 in C# and Python, and just 10 programmer is proficient in all three languages above. Determine the number of computer programmers who are not proficient in none of these three languages.

Solution

Firstly, we introduce some notation. Let J denote the programmers who are proficient in Java. Let C denote the programmers who are proficient in C#. Let P denote the programmers who are proficient in Python.

Secondly, we try to express the question using the notation. We are given that

  • \(N(J) = 75\)
  • \(N(C)=50\)
  • \(N(P) = 100\)
  • \(N(C \cap J) = 35\)
  • \(N(J \cap P) = 60\)
  • \(N(C\cap P) = 45\)
  • \(N(J \cap C \cap P) = 10\)

Moreover, the number of computer programmers who are not proficient in none of these three languages can be obtained by\( 115 – N(J \cup C \cup P)\).

To continue, we apply the Inclusion-Exclusion formula and find that

$$\begin{aligned}
N(J \cup C \cup P) &= (N(J)+ N(C) + N(P) – N(C \cap J) – N(J \cap P) – N(C\cap P) + N(J \cap C \cap P) \\
&=75+50+100-35-60-45+10 \\
&= 95
\end{aligned}$$ Furthermore, \( 115 – N(J \cup C \cup P) = 115 – 95 = 20. \)

Therefore, we find 20 computer programmers not proficient in none of these three languages.

More Advanced Examples

We have more examples selected from sample questions for the Actuarial Probability Exam.

As mentioned before, even though the principle is a powerful tool for solving complex counting problems, it does not mean it is always “the” tool when we deal with a group of sets or events.

An example where the Inclusion-Exclusion Principle is not helpful

If this question feels difficult for you, for the moment, it is okay. All you have to remember is that the Inclusion-Exclusion Principle is a tool that can solve all kinds of counting problems. For those interested in becoming an actuary, you might want to take on the challenge, as this is a sample question from the Actuarial Probability Exam.

Similarly, we will analyze the problem as before. Click the buttons, you will see how I break things down. Your approach might be different from mine. That is great! It means you challenged yourself instead of just copying what I do. Kudos to you!

An auto insurance company has 10,000 policyholders. Each policyholder is classified as

  • young or old;
  • male or female; and
  • married or single

Of these policyholders, 3000 are young, 4600 are male and 7000 are married. The policyholders can also be classified as 1320 young males, 3010 married males, and 1400 young married persons. Finally, 600 of the policyholders are young married males. Calculate the number of the company’s policyholders who are young, female, and single.

Solution

Given:

Let \( N(C) \) denote the number of policyholders in classification \( C \) 
Let \( Y \) = young , \( M \) = male, \(S\) = single. Note that the complement sets are \(Y’\) = old, \(M’\) = female, and \(S’\) = married according to the classifications. We know that

  • young: \( N(Y) = 3000 \)
  • male: \( N(M) = 4600 \)
  • married: \( N(S’) = 7000 \)
  • young and male: \( N(Y \cap M) = 1320 \)
  • married and male: \( N(S’ \cap M) = 3010 \)
  • young and married: \( N(Y \cap S’) = 1400 \)
  • young and married and male: \( N( Y \cap S’ \cap M) = 600 \)

Goal:

Find the number of the policyholders who are young, female and single, \(N(Y \cap M’ \cap S) \)

Let us try to achieve the goal:

Although our initial idea is to utilize the Inclusion-Exclusion Principle, after a few attempts we realize that we do not have enough information to apply the formula for three sets. So we have to try to see if we can find something useful.

We observe that

\(\begin{aligned}
N(Y \cap S’) &= N( Y \cap S’ \cap M) + N( Y \cap S’ \cap M’) \\
&= 600 + N( Y \cap S’ \cap M’)\\
&=1400
\end{aligned}\)

Hence, \(N( Y \cap S’ \cap M’) = 1400 – 600 = 800\).

We also observe that \( N(Y \cap M’ \cap S) + N( Y \cap S’ \cap M’) = N( Y \cap M’) \). In addition,

\(\begin{aligned}
N( Y \cap M’) &= N(Y) – N(Y \cap M) \\
&= 3000 – 1320 \\
&= 1680
\end{aligned}\)
Therefore, $$N(Y \cap M’ \cap S) = N( Y \cap M’) – N( Y \cap S’ \cap M’) = 1680 – 800 = 880.$$

Similar Posts

One Comment

Comments are closed.