Probability is central to mathematics and to many fields beyond it: it underpins decision-making, prediction, and the analysis of uncertainty. This guide presents its fundamental principles and some practical applications.

## Sample Space and Events

Probability theory starts from random experiments, that is, experiments whose outcome cannot be predicted in advance. Two notions describe such an experiment: its sample space and its events.

A **sample space** $S$ is the set of all possible outcomes of a random experiment. For example, when rolling a six-sided die, the sample space is $S=\{1,2,3,4,5,6\}$.

**Events** are subsets of the sample space. They represent specific outcomes or combinations of outcomes. For instance, if we define event $A$ as “rolling an even number,” then $A=\{2,4,6\}$.
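
The die example above can be sketched in Python by representing the sample space and the event as sets. The helper name `occurs` is an illustrative assumption, not part of any library.

```python
# Sample space for one roll of a six-sided die
S = frozenset(range(1, 7))

# Event A: "rolling an even number", a subset of S
A = frozenset(x for x in S if x % 2 == 0)

def occurs(event, outcome):
    """An event occurs when the observed outcome belongs to it."""
    return outcome in event

print(sorted(A))   # [2, 4, 6]
print(A <= S)      # True: every event is a subset of the sample space
```

Here `occurs(A, 4)` is true and `occurs(A, 3)` is false, which matches the reading of $A$ as "the roll is even".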

In the context of probability theory, the repetition of a random experiment is commonly referred to as a trial. Each trial represents a distinct occurrence of the experiment and is subject to the same underlying probability distribution. For instance, in the case of a coin toss, each trial results in either heads or tails. It is noteworthy that the sample space of the experiment is contingent upon the specific definition of the random experiment.

Since events are sets, they combine through the usual set operations. Let $I$ denote a subset of the natural numbers, and let $(A_k)_{k\in I}$ be a family of events defined on a sample space $S$. The union and the intersection of the events $A_k$ are again events. Specifically, the event $E=\bigcup_{k\in I} A_k$ occurs if and only if at least one of the events $A_k,\;k\in I$ occurs. Similarly, the event $F=\bigcap_{k\in I} A_k$ occurs if and only if all of the events $A_k,\;k\in I$ occur.
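
As a small illustration, the following sketch builds the union and the intersection of a finite family of events on the die sample space. The three events in `family` are hypothetical choices made for this example.

```python
S = frozenset(range(1, 7))

# Hypothetical family of events (A_k), indexed by k in I = {1, 2, 3}
family = {
    1: frozenset({2, 4, 6}),  # the roll is even
    2: frozenset({4, 5, 6}),  # the roll is at least 4
    3: frozenset({3, 6}),     # the roll is a multiple of 3
}

# E occurs when at least one A_k occurs; F occurs when all A_k occur
E = frozenset().union(*family.values())
F = S.intersection(*family.values())

print(sorted(E))  # [2, 3, 4, 5, 6]
print(sorted(F))  # [6]
```

For every outcome, membership in `E` agrees with "at least one event occurs" and membership in `F` agrees with "all events occur".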

## Probability definition

### Sigma-algebra

We begin with the notion of a sigma-algebra, usually written $\sigma$-algebra, a fundamental concept in both probability theory and measure theory. A sigma-algebra is a collection of subsets of a given set, typically the sample space, that satisfies the properties listed below. It specifies which subsets are events, that is, which subsets can be assigned a probability.

A $\sigma$-algebra $\mathscr{F}$ on a sample space $S$ is a collection of subsets of $S$ that satisfies the following three properties:

- The sample space $S\in\mathscr{F}$.
- If $A\in \mathscr{F}$, then its complement $A^c:=S\setminus A\in \mathscr{F}$.
- If $A_1,A_2,A_3,\cdots$ is a countable sequence of elements in $\mathscr{F}$, then their union $$ \bigcup_{n=1}^{+\infty} A_n\in \mathscr{F}.$$
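
On a finite sample space the three properties can be checked mechanically. The sketch below is only an illustration under that finiteness assumption: a finite $S$ has finitely many subsets, so countable unions reduce to finite ones and closure under pairwise unions suffices. The function name `is_sigma_algebra` is hypothetical.

```python
from itertools import combinations

def is_sigma_algebra(S, F):
    """Check the three sigma-algebra properties for a collection F of subsets of a FINITE set S."""
    S = frozenset(S)
    F = {frozenset(A) for A in F}
    # Property 1: the sample space belongs to F
    if S not in F:
        return False
    # Property 2: closed under complement
    if any((S - A) not in F for A in F):
        return False
    # Property 3 (finite case): closed under pairwise unions
    return all((A | B) in F for A, B in combinations(F, 2))

S = {1, 2, 3, 4, 5, 6}
print(is_sigma_algebra(S, [set(), S]))                         # True
print(is_sigma_algebra(S, [set(), {2, 4, 6}, {1, 3, 5}, S]))   # True
print(is_sigma_algebra(S, [set(), {1}, S]))                    # False
```

The last collection fails because it contains $\{1\}$ but not its complement $\{2,3,4,5,6\}$.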

### What is a probability?

Throughout, we fix a sample space $S$ and a $\sigma$-algebra $\mathscr{F}$ on $S$. Probability is then defined as follows.

A probability is a function $P:\mathscr{F}\to\mathbb{R}$ that satisfies the following properties:

- For any $A\in\mathscr{F}$, $P(A)\ge 0$.
- $P(S)=1$.
- If $A_1,A_2,A_3,\cdots$ is a countable sequence of disjoint (mutually exclusive) events in $\mathscr{F}$ (meaning that $A_i\cap A_j=\emptyset$ for $i\neq j$), then the probability of the union of these events is equal to the sum of their individual probabilities: $$ P\left(\bigcup_{i=1}^{+\infty} A_i\right)=\sum_{i=1}^{+\infty} P(A_i).$$
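
As a concrete instance, the sketch below defines the uniform probability on the die sample space, with $\mathscr{F}$ taken to be all subsets of $S$, and checks the three properties on a few events. The uniform model is an assumption (a fair die), and on a finite space additivity only needs to be checked for finitely many disjoint events.

```python
from fractions import Fraction

S = frozenset(range(1, 7))

def P(event):
    """Uniform probability on a fair die (an assumption): P(A) = |A| / |S|."""
    return Fraction(len(event), len(S))

A = frozenset({1, 2})
B = frozenset({5})

print(P(S))                      # 1
print(P(A) >= 0)                 # True
print(P(A | B) == P(A) + P(B))   # True, since A and B are disjoint
```

Exact fractions are used instead of floats so that equalities such as $P(A\cup B)=P(A)+P(B)$ can be tested without rounding error.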

## Probability formula

This section collects the basic formulas and properties of probability. Throughout, $S$ is a sample space, $\mathscr{F}$ is a $\sigma$-algebra on $S$, and $P$ is a probability on $\mathscr{F}$.

### The probability of the complementary event

For a probability $P$ on $\mathscr{F}$, we have $P(\emptyset)=0$ and $0\le P(A)\le 1$ for any $A\in\mathscr{F}$.

To prove that $P(\emptyset)=0$, consider the events $A_n=\emptyset$ for $n=1,2,3,\cdots$. They are pairwise disjoint and $\emptyset =\cup_{n=1}^{+\infty} A_n$, so by countable additivity $$ P(\emptyset)=\sum_{n=1}^{+\infty}P(A_n).$$ This is a convergent numerical series (because its sum is $P(\emptyset)\in\mathbb{R}$). Thus its general term tends to zero: $\lim_{n\to \infty}P(A_n)=0$. But $P(A_n)=P(\emptyset)$ for every $n$, which forces $P(\emptyset)=0$. Taking $A_n=\emptyset$ for $n\ge 3$ in the countable additivity property then shows that $P$ is also finitely additive: $P(A_1\cup A_2)=P(A_1)+P(A_2)$ for disjoint events $A_1,A_2$. Furthermore, if $A\in\mathscr{F}$, then $S=A\cup A^c$ with $A\cap A^c=\emptyset$. Thus $1=P(S)=P(A)+P(A^c)\ge P(A)\ge 0$. This ends the proof.

For any event $A\in\mathscr{F}$, the probability of its complement is $P(A^c)=1-P(A)$.

Indeed, $S=A\cup A^c$ and $A\cap A^c=\emptyset$, so $1=P(S)=P(A)+P(A^c)$, and the result follows.
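
The complement formula is often the quickest route to an "at least one" probability. The sketch below assumes three tosses of a fair coin with equally likely outcomes (an assumption made for this example) and computes the probability of at least one head through the complement "no head at all".

```python
from fractions import Fraction
from itertools import product

# Sample space: all sequences of three tosses, assumed equally likely
S = set(product("HT", repeat=3))

def P(event):
    return Fraction(len(event), len(S))

A = {w for w in S if "H" in w}   # at least one head
A_c = S - A                      # complement: no head, i.e. {('T', 'T', 'T')}

print(P(A_c))       # 1/8
print(1 - P(A_c))   # 7/8
print(P(A))         # 7/8
```

Counting the single outcome in $A^c$ is easier than enumerating the seven outcomes in $A$.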

### Useful probability formulas

Recall that the set difference $F\setminus E$ is the set of all elements $x$ that belong to $F$ but not to $E$: $x\in F\setminus E$ if and only if $x\in F$ and $x\notin E$. The following result gives the probability of $F\setminus E$.

For any pair of events $(E,F)$ in $\mathscr{F}$, we have $P(F\setminus E)=P(F)-P(E\cap F)$.

Indeed, observe that \begin{align*} & F=(F\cap E)\cup (F\setminus E)\cr & (F\cap E)\cap (F\setminus E)=\emptyset.\end{align*} Thus, $P(F)=P(F\cap E)+P(F\setminus E)$. This ends the proof.

The next result gives the probability of the union of two arbitrary events.

Let $A$ and $B$ denote arbitrary events in the sigma-algebra $\mathscr{F}$. Then $$ P(A\cup B)=P(A)+P(B)-P(A\cap B).$$

Indeed, the event $A\cup B$ is the union of the pairwise disjoint events $A\cap B$, $A\setminus B$ and $B\setminus A$. Thus, \begin{align*} P(A\cup B)=P(A\cap B)+P(A\setminus B)+P(B\setminus A).\end{align*} By the formula for the probability of a set difference, $P(A\setminus B)=P(A)-P(A\cap B)$ and $P(B\setminus A)=P(B)-P(A\cap B)$. Substituting these two equalities gives the result.
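
The difference and union formulas can be checked numerically on the fair die. This is a sketch under the uniform-probability assumption; the events `A` and `B` are arbitrary choices for illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))

def P(event):
    """Uniform probability on a fair die (an assumption)."""
    return Fraction(len(event), len(S))

A = frozenset({2, 4, 6})  # even roll
B = frozenset({4, 5, 6})  # roll of at least 4

# Difference formula: P(B \ A) = P(B) - P(A ∩ B)
print(P(B - A), P(B) - P(A & B))         # 1/6 1/6

# Union formula: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
print(P(A | B), P(A) + P(B) - P(A & B))  # 2/3 2/3
```

Simply adding $P(A)+P(B)=1$ would count the outcomes $4$ and $6$ twice; subtracting $P(A\cap B)=1/3$ corrects for that.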

The inclusion relation “$\subset$” is a partial order on the collection of sets $\mathscr{F}$. With respect to this order, the probability $P$ is monotone, more precisely increasing.

For any $A,B\in\mathscr{F}$ with $A\subset B$, we have $P(A)\le P(B)$.

Indeed, since $A\subset B$, the event $B$ is the disjoint union of $A$ and $B\setminus A$, so $$ P(B)=P(A)+P(B\setminus A)\ge P(A).$$

### The complete system of events

A family of events $\{A_i:i\in \Lambda\subset\mathbb{N}\}\subset \mathscr{F}$ is called a complete system of events if the events are mutually exclusive (pairwise disjoint) and their union is the sample space: $S=\cup_{i\in \Lambda} A_i$.

Let $\{A_i:i\in \Lambda\subset\mathbb{N}\}\subset \mathscr{F}$ be a complete system of events. Then for any $B\in\mathscr{F}$, $$ P(B)=\sum_{i\in \Lambda} P(B\cap A_i).$$

Since the events $A_i$ are mutually exclusive, so are the events $B\cap A_i$. Therefore $$ P\left(\bigcup_{i\in \Lambda} (B\cap A_i)\right)= \sum_{i\in \Lambda} P(B\cap A_i).$$ On the other hand, $$ \bigcup_{i\in \Lambda} (B\cap A_i)=B\cap \left(\bigcup_{i\in \Lambda} A_i\right)=B\cap S=B.$$ This ends the proof.
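
The following sketch checks this decomposition on the fair die, again under the uniform-probability assumption. The helper `is_complete_system` and the particular partition are hypothetical names and choices made for the example.

```python
from fractions import Fraction

S = frozenset(range(1, 7))

def P(event):
    """Uniform probability on a fair die (an assumption)."""
    return Fraction(len(event), len(S))

def is_complete_system(parts, S):
    """True when the parts are pairwise disjoint and their union is S."""
    parts = list(parts)
    disjoint = all(
        parts[i].isdisjoint(parts[j])
        for i in range(len(parts))
        for j in range(i + 1, len(parts))
    )
    covers = frozenset().union(*parts) == S
    return disjoint and covers

system = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]
B = frozenset({2, 4, 6})  # even roll

total = sum(P(B & A_i) for A_i in system)
print(is_complete_system(system, S))  # True
print(total, P(B))                    # 1/2 1/2
```

Each piece $B\cap A_i$ contains exactly one even number, so each contributes $1/6$ to the sum.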

## How to construct a probability from an experiment

To derive a probability from a random experiment, we typically combine the definition of probability with a few of its basic properties. The following example illustrates the method.

The experiment consists of shooting an arrow at a somewhat peculiar target of infinite size: the circle of radius 1 bounds disk 0, the circle of radius 2 bounds annulus 1, the circle of radius 3 bounds annulus 2, and so on. We record the number of the disk or annulus where the arrow lands. In this context, the sample space is $S=\mathbb{N}$ and the $\sigma$-algebra is $\mathscr{F}=\mathcal{P}(\mathbb{N})$, that is, the $\sigma$-algebra generated by the singletons $\{n\}$, $n\in\mathbb{N}$. It therefore suffices to define the probability $P$ on the singletons and to extend it to $\mathcal{P}(\mathbb{N})$ by $\sigma$-additivity: $P(A)=\sum_{n\in A}P(\{n\})$. Note that if we set $P(\{n\})=\frac{1}{n+1}$, we do not define a probability, since the series $\sum_{n=0}^{+\infty} P(\{n\})$ diverges. However, if we set $P(\{n\})=\frac{6}{\pi^2(n+1)^2}$, then the series $\sum_{n=0}^{+\infty} P(\{n\})$ converges, and $$ P(S)=\sum_{n=0}^{+\infty} P(\{n\})=\frac{6}{\pi^2}\sum_{n=1}^{+\infty} \frac{1}{n^2}=1,$$ since $$ \sum_{n=1}^{+\infty} \frac{1}{n^2}=\frac{\pi^2}{6}.$$
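
A quick numerical check makes the contrast between the two candidate weights visible. The sketch below uses floating-point partial sums, so it only suggests the limits and proves nothing. The function names are hypothetical.

```python
import math

def p(n):
    """Candidate weight P({n}) = 6 / (pi^2 (n+1)^2)."""
    return 6.0 / (math.pi ** 2 * (n + 1) ** 2)

def partial_mass(N):
    """Sum of p(n) for n = 0, ..., N-1; approaches 1 as N grows."""
    return sum(p(n) for n in range(N))

def harmonic_partial(N):
    """Sum of 1/(n+1) for n = 0, ..., N-1; grows without bound."""
    return sum(1.0 / (n + 1) for n in range(N))

for N in (10, 1_000, 100_000):
    print(N, partial_mass(N), harmonic_partial(N))
```

The first column of sums stays below 1 and creeps towards it, while the harmonic sums keep growing (roughly like $\ln N$), so the weights $\frac{1}{n+1}$ cannot be normalised into a probability of this form.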

## Sequences of events

To speak of monotone sequences of events, we need an order relation on the space in which they live. Here the events belong to the $\sigma$-algebra $\mathscr{F}$, which is ordered by inclusion “$\subset$”. A sequence of events $\{A_n:n\in\mathbb{N}\}\subset \mathscr{F}$ is called increasing if $A_n\subset A_{n+1}$ for all $n$, and decreasing if $A_{n+1}\subset A_n$ for all $n$.

Continuity Theorems: Suppose that $\{A_n:n\in\mathbb{N}\}$ is a sequence of events. Then

- If the sequence is increasing, then $$ \lim_{n\to+\infty} P(A_n)=P\left(\lim_{n\to+\infty} A_n\right)=P\left(\bigcup_{n=0}^{+\infty} A_n\right).$$
- If the sequence is decreasing, then $$ \lim_{n\to+\infty} P(A_n)=P\left(\lim_{n\to+\infty} A_n\right)=P\left(\bigcap_{n=0}^{+\infty} A_n\right).$$

$\bullet$ We assume that the sequence $(A_n)_n$ is increasing. Construct a sequence of mutually exclusive events $(B_k)_k$ such that $\cup_{k=0}^n B_k=\cup_{k=0}^n A_k$ for all $n$. It suffices to select $B_0=A_0$ and $B_{k+1}=A_{k+1}\setminus A_k$ for all $k$. Since $A_k\subset A_{k+1}$, we have $A_{k+1}\cap A_k=A_k$, and therefore \begin{align*} P(B_{k+1})&=P(A_{k +1})-P(A_{k+1}\cap A_k)\cr &= P(A_{k +1})-P(A_k).\end{align*} Using this equality, we obtain \begin{align*}P\left(\bigcup_{n=0}^{+\infty} A_n\right)&= P\left(\bigcup_{n=0}^{+\infty} B_n\right) = \sum_{n=0}^{+\infty} P(B_n)\cr &= P(B_0)+\sum_{k=1}^{+\infty} P(B_k)\cr&= P(A_0)+ \lim_{n\to+\infty} \sum_{k=1}^{n} P(B_k)\cr& = P(A_0)+\lim_{n\to+\infty} \sum_{k=1}^{n}[ P(A_k)-P( A_{k-1})]\cr & = P(A_0)+\lim_{n\to+\infty} (P(A_n)-P(A_0))\cr &= \lim_{n\to+\infty}P(A_n).\end{align*}

$\bullet$ We assume that the sequence $(A_n)_n$ is decreasing. Then the sequence of complementary events $(A_n^c)_n$ is increasing. Thus, by the first point, we have $$ P\left(\bigcup_{n=0}^{+\infty} A_n^c\right)=\lim_{n\to+\infty} P(A^c_n)=1-\lim_{n\to+\infty} P(A_n).$$ On the other hand, by De Morgan's law, \begin{align*} P\left(\bigcup_{n=0}^{+\infty} A_n^c\right)&=P\left(\left(\bigcap_{n=0}^{+\infty} A_n\right)^c\right)\cr &=1- P\left(\bigcap_{n=0}^{+\infty} A_n\right). \end{align*} Comparing the two expressions gives $\lim_{n\to+\infty} P(A_n)=P\left(\bigcap_{n=0}^{+\infty} A_n\right)$. This ends the proof.
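
Both continuity statements can be illustrated with the probability $P(\{k\})=\frac{6}{\pi^2(k+1)^2}$ on $\mathbb{N}$ from the arrow example. In this sketch, $A_n=\{0,\dots,n\}$ is increasing with union $\mathbb{N}$, and $B_n=\{n,n+1,\dots\}$ is decreasing with empty intersection; these two sequences are choices made for illustration. The values are floating-point approximations.

```python
import math

def p(k):
    """Weight of the singleton {k} in the arrow example."""
    return 6.0 / (math.pi ** 2 * (k + 1) ** 2)

def P_A(n):
    """P(A_n) for A_n = {0, ..., n}: increasing, with union equal to N."""
    return sum(p(k) for k in range(n + 1))

def P_B(n):
    """P(B_n) for B_n = {n, n+1, ...}, computed as 1 - P({0, ..., n-1})."""
    return 1.0 - sum(p(k) for k in range(n))

for n in (1, 10, 100, 1_000, 10_000):
    print(n, P_A(n), P_B(n))
```

The printed values of `P_A(n)` increase towards $P(\mathbb{N})=1$, and those of `P_B(n)` decrease towards $P(\emptyset)=0$, as the theorem predicts.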

## Boole’s inequality

Boole’s Inequality, also known as the Union Bound, is a fundamental concept in probability theory. It provides an upper bound on the probability of the union of multiple events. We have seen that for arbitrary events $A$ and $B$, we have $P(A\cup B)=P(A)+P(B)-P(A\cap B)$. This implies that $$ P(A\cup B)\le P(A)+P(B).$$ By induction, for any events $A_1,A_2,\cdots,A_n$, we have $$ P(A_1\cup A_2\cup\cdots\cup A_n)\le P(A_1)+P(A_2)+\cdots+P(A_n).$$ More generally, we have the following result:

Boole’s Inequality: For any events $(A_n)_{n\in\mathbb{N}}$, we have $$P\left(\bigcup_{n=0}^{+\infty} A_n\right)\le \sum_{n=0}^{+\infty} P(A_n).$$

We select $U_n=\cup_{k=0}^n A_k$. Clearly, $$ \bigcup_{k=0}^{+\infty} A_k=\bigcup_{k=0}^{+\infty} U_k.$$ Now we define $B_0=U_0=A_0$ and $$ B_{n+1}=U_{n+1}\setminus U_n\subset A_{n+1}.$$ Observe that $B_n\subset A_n$ for all $n$, so that $P(B_n)\le P(A_n)$ by monotonicity. Moreover, the events $(B_n)_n$ are mutually exclusive and $$ \bigcup_{k=0}^{+\infty} B_k=\bigcup_{k=0}^{+\infty} A_k.$$ Thus \begin{align*}P\left(\bigcup_{n=0}^{+\infty} A_n\right)&=P\left(\bigcup_{n=0}^{+\infty} B_n\right)\cr &= \sum_{n=0}^{+\infty} P(B_n)\le \sum_{n=0}^{+\infty} P(A_n).\end{align*}
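
The construction used in the proof can be replayed on a finite family of overlapping events. The sketch below assumes the uniform probability on a fair die; the function name `disjointify` and the three events are hypothetical choices for illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))

def P(event):
    """Uniform probability on a fair die (an assumption)."""
    return Fraction(len(event), len(S))

def disjointify(events):
    """Return B_0 = A_0 and B_{n+1} = U_{n+1} \\ U_n, where U_n = A_0 ∪ ... ∪ A_n."""
    seen = frozenset()   # running union U_n
    pieces = []
    for A in events:
        pieces.append(A - seen)  # keep only the outcomes not covered so far
        seen |= A
    return pieces

events = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 4, 5})]
pieces = disjointify(events)
union = frozenset().union(*events)

print([sorted(B) for B in pieces])        # [[1, 2, 3], [4], [5]]
print(P(union), sum(P(B) for B in pieces))  # 5/6 5/6
print(sum(P(A) for A in events))          # 3/2
```

The pieces are pairwise disjoint, each $B_n$ is contained in $A_n$, and their probabilities add up to $P(\cup A_n)=5/6$, which is below the union bound $3/2$. As this example shows, the bound can exceed 1 and is then uninformative.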