Basics of probability (II)

Reference (of all figures and content): Introduction to Probability, 2nd edition [Blitzstein and Hwang, 2019]

Aug 1, 2022

Recall from the previous post that the naive definition of probability requires two conditions: 1) the sample space is finite, and 2) all outcomes are equally likely. Under these conditions, a probability can be computed by “counting”.

Counting

Under the naive definition, the probability of an event is the number of outcomes in the event divided by the number of outcomes in the sample space. For example, suppose we have a bag of 10 seeds, 5 black and 5 white, and we want the probability of drawing a black seed (the event). Here the sample space is the set of 10 seeds in the bag, and drawing any particular seed is one possible outcome. “Counting” the black seeds gives 5 outcomes in the event, so the probability of this event is 5/10 = 1/2.
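
The counting above can be sketched in a few lines of Python. This is only an illustration: the list `bag` and the labels "black" and "white" are names assumed here, not something from the book.

```python
from fractions import Fraction

# Assumed representation: the bag holds 5 black and 5 white seeds.
bag = ["black"] * 5 + ["white"] * 5

# Naive probability = (outcomes in the event) / (outcomes in the sample space).
favorable = sum(1 for seed in bag if seed == "black")
p_black = Fraction(favorable, len(bag))

print(p_black)  # 1/2
```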

Multiplication rule

However, this is a relatively simple event that takes only one step. An experiment completed in multiple steps, for example picking 1 black seed and then 1 white seed, is called a compound experiment, and we can use the multiplication rule to count its outcomes.

Consider a compound experiment consisting of two sub-experiments, A and B. Suppose A has a possible outcomes and, for each of those, B has b possible outcomes. Then the compound experiment has ab possible outcomes.
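
As a minimal sketch of the multiplication rule, assume A is a coin flip (a = 2) and B is a die roll (b = 6). Both sub-experiments are made up for illustration. Listing every pairing confirms the count ab.

```python
from itertools import product

# Assumed sub-experiments: A is a coin flip, B is a die roll.
outcomes_a = ["H", "T"]
outcomes_b = [1, 2, 3, 4, 5, 6]

# Every outcome of the compound experiment pairs one outcome of A with one of B.
compound = list(product(outcomes_a, outcomes_b))

print(len(compound))  # 12
```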

Since the experiment consists of several “steps”, and each step requires sampling, there are two scenarios: sampling (e.g. picking seeds) with replacement and sampling without replacement. Both are easy to see with the bag example. From the bag of 5 white seeds and 5 black seeds, we want to draw a white seed in the first round and a white seed again in the second round.

Sampling with replacement
With replacement, we take a white seed, put it back, and take a white seed again. There are 5 possibilities for the first white seed and 5 for the second, so the number of possible outcomes is 5*5 = 5^2 = 25.
In general, when making k choices from n objects with replacement, there are n^k possible outcomes.
Sampling without replacement
Without replacement, we take a white seed, keep it, and pick the second white seed from the remaining 4 white seeds. The number of possible outcomes is 5*(5-1) = 20.
In general, when making k choices from n objects without replacement (with k ≤ n), there are n(n-1)(n-2)...(n-k+1) possible outcomes.
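
Both formulas can be checked by listing the ordered draws for the white seeds. In this sketch the labels `w1` to `w5` are assumed names for the 5 white seeds, and `math.perm` requires Python 3.8 or later.

```python
from itertools import permutations, product
from math import perm

white = ["w1", "w2", "w3", "w4", "w5"]  # assumed labels for the 5 white seeds
n, k = len(white), 2

# With replacement: the same seed may be drawn twice, giving n**k ordered draws.
with_replacement = list(product(white, repeat=k))

# Without replacement: a drawn seed is kept, giving n(n-1)...(n-k+1) ordered draws.
without_replacement = list(permutations(white, k))

print(len(with_replacement), n**k)           # 25 25
print(len(without_replacement), perm(n, k))  # 20 20
```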

A permutation is an arrangement of elements in some order. For example, 3,1,5,2,4 is a permutation of 1,2,3,4,5. A permutation can be seen as sampling n out of n objects without replacement: setting k = n in the formula n(n-1)(n-2)...(n-k+1) gives n(n-1)(n-2)...(1) = n!, the number of permutations of n objects.
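
A quick sketch for n = 5 lists all orderings of 1 to 5 and compares the count with 5!. The variable names are chosen here for illustration.

```python
from itertools import permutations
from math import factorial

n = 5
# All ways to arrange 1..n, i.e. sampling n out of n without replacement.
orderings = list(permutations(range(1, n + 1)))

print(len(orderings), factorial(n))  # 120 120
print((3, 1, 5, 2, 4) in orderings)  # True
```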

Labeling objects

As mentioned before, sets and subsets are unordered by definition. For example, {1,3,4} = {4,1,3}. Hence, when counting possible outcomes, we have to decide whether two arrangements of the same elements count as one outcome or two; that is, whether order matters for the event.

A critical technique in counting is to label the objects. Generally, we first count all possible outcomes as if order mattered, then decide whether order actually matters for the event, and adjust for overcounting accordingly.

Adjust for overcounting

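What does it mean when order matters or doesn’t matter? Let’s look at an example. Consider a group of four people: how many ways are there to choose a two-person committee? Applying the rule of labeling, we label the people 1 to 4 and pick the two members one after another. There are 4 choices for the first member and 3 for the second, giving 4*3 = 12 ordered picks.

However, order does not matter for a committee: picking person 1 and then person 2 gives the same committee as picking 2 and then 1. In this example, we overcount each committee by a factor of 2, so the number of ways should be 12 (the total number of ordered picks)/2 = 6.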
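
The overcounting can be made concrete with a short sketch. The labels 1 to 4 for the four people are an assumption for illustration.

```python
from itertools import combinations, permutations

people = [1, 2, 3, 4]  # assumed labels for the four people

# Ordered picks: (1, 2) and (2, 1) are counted separately.
ordered = list(permutations(people, 2))

# Ignoring order collapses each pair of ordered picks into one committee.
committees = {frozenset(pick) for pick in ordered}

print(len(ordered))                        # 12
print(len(committees))                     # 6
print(len(list(combinations(people, 2))))  # 6
```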

Binomial coefficient

A binomial coefficient counts the number of subsets of a certain size for a set, such as the number of ways to choose a committee of size k from a set of n people.
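
In the usual notation (assumed here), the binomial coefficient is written as "n choose k". For 0 ≤ k ≤ n it follows from the without-replacement count after adjusting for the k! orderings of each subset:

```latex
\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!} = \frac{n!}{(n-k)!\,k!},
\qquad \text{e.g. } \binom{4}{2} = \frac{4 \cdot 3}{2!} = 6 .
```

The example value matches the committee count of 6 obtained above.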

The binomial coefficient has an important application in algebra, the binomial theorem.
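
In its standard form, with x and y as generic symbols, the binomial theorem reads:

```latex
(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}
```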

To dig a little further, let’s expand the formula. I will use the explanation from the book for better understanding.
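
The idea behind the expansion is that each term of (x + y)^n comes from picking either x or y from each of the n factors. The sketch below follows that idea. The exponent n = 4 and the variable names are assumptions for illustration, and the code is a brute-force check rather than the book's proof.

```python
from collections import Counter
from itertools import product
from math import comb

n = 4  # assumed exponent, chosen for illustration

# Expanding (x + y)^n means picking either x or y from each of the n factors.
picks = product("xy", repeat=n)

# Group the 2**n resulting terms by how many x's were picked.
terms_by_x_count = Counter(pick.count("x") for pick in picks)

for k in range(n + 1):
    # The coefficient of x^k y^(n-k) is the number of ways to choose
    # which k of the n factors contribute an x.
    print(k, terms_by_x_count[k], comb(n, k))
```

For each k, the two printed counts agree, which is what the binomial theorem states.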

Non-naive probability

So far, we have only discussed probability under the naive definition, which requires a finite sample space and equally likely outcomes. However, there are scenarios where these assumptions do not apply. To generalize the notion of probability, mathematicians made a short wish list of how they want probability to behave. In mathematics, the items on such a list are called “axioms”.

In mathematics or logic, an axiom is an unprovable rule or first principle accepted as true because it is self-evident or particularly useful. (Merriam-Webster)

Axioms
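
For reference, the two axioms in their usual form are given below. The notation is assumed here: S is the sample space, and P is a probability function that assigns each event A ⊆ S a number P(A) between 0 and 1.

```latex
\begin{align*}
&\text{1. } P(\emptyset) = 0, \quad P(S) = 1. \\
&\text{2. If } A_1, A_2, \dots \text{ are disjoint events, then }
  P\Bigl(\bigcup_{j=1}^{\infty} A_j\Bigr) = \sum_{j=1}^{\infty} P(A_j).
\end{align*}
```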

Properties
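
Three standard properties that follow from the axioms, for any events A and B, can be written as follows. The notation A^c for the complement of A is assumed here.

```latex
\begin{align*}
&\text{1. } P(A^c) = 1 - P(A). \\
&\text{2. If } A \subseteq B, \text{ then } P(A) \le P(B). \\
&\text{3. } P(A \cup B) = P(A) + P(B) - P(A \cap B).
\end{align*}
```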

“Particularly, the third property is a special case of inclusion-exclusion, a formula for finding the probability of a union of events when the events are not necessarily disjoint (or finite).”
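
The two-event formula P(A ∪ B) = P(A) + P(B) - P(A ∩ B) can be checked numerically. The sketch below uses an assumed example, one roll of a fair die, where naive probability applies. The events A and B and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # assumed example: one roll of a fair die
A = {2, 4, 6}                      # the roll is even
B = {4, 5, 6}                      # the roll is greater than 3

def prob(event):
    """Naive probability: size of the event over size of the sample space."""
    return Fraction(len(event), len(sample_space))

# A and B overlap in {4, 6}, so adding P(A) and P(B) counts that part twice.
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)

print(lhs, rhs)  # 2/3 2/3
```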