Random Variables

Sources:

  1. NOTES ON PROBABILITY by Greg Lawler

Notation

| Symbol | Type | Description |
|---|---|---|
| \(\Omega\) | Set | Sample space, the set of all possible outcomes |
| \(\omega\) | Element of \(\Omega\) | A specific outcome in the sample space |
| \(\mathcal{F}\) | \(\sigma\)-algebra | Event space, the collection of subsets of \(\Omega\) that satisfy the properties of a \(\sigma\)-algebra |
| \(\mathbb{P}\) | Function \(\mathbb{P} : \mathcal{F} \to [0,1]\) | Probability measure, a function satisfying Kolmogorov's axioms |
| \(X\) | Random variable | A measurable function \(X : \Omega \to \mathbb{R}\) |
| \(\mu_X\) | Function \(\mu_X : \mathcal{B} \to [0,1]\) | Distribution of the random variable \(X\), defined on Borel subsets of \(\mathbb{R}\) |
| \(\mathcal{B}\) | \(\sigma\)-algebra | The Borel \(\sigma\)-algebra on \(\mathbb{R}\) |
| \(F_X(x)\) | Cumulative distribution function (CDF) | Probability that \(X\) takes a value less than or equal to \(x\), \(F_X(x) = \mathbb{P}(X \leq x)\) |
| \(f_X(x)\) | Probability density function (PDF) | Describes the density of \(X\) if \(X\) is absolutely continuous |
| \(p_X(x)\) | Probability mass function (PMF) | Describes the probability of \(X\) taking a specific value \(x\) if \(X\) is discrete |
| \((\mathbb{R}, \mathcal{B}, \mu_X)\) | Probability space | The transformed probability space induced by the random variable \(X\) |

Abbreviations

| Abbreviation | Description |
|---|---|
| r.v. | Random variable |
| PDF | Probability density function |
| PMF | Probability mass function |
| CDF | Cumulative distribution function |

Definition

The term "random variable" is somewhat misleading, as it is neither "random" nor a "variable" in the conventional sense. Instead, it is a function.

A random variable \(X\) is a measurable function that maps outcomes in the sample space \(\Omega\) to the real numbers \(\mathbb{R}\). Formally, it is defined as:

\[ X: \Omega \longrightarrow \mathbb{R} \]

such that for every Borel set \(B \subseteq \mathbb{R}\),

\[ X^{-1}(B)=\{\omega \in \Omega: X(\omega) \in B\} \in \mathcal{F} . \]

Note that \(X^{-1}(B)\) is a set of outcomes, i.e., an event.

We use the shorthand notation \[ \{X \in B\} = \{\omega \in \Omega: X(\omega) \in B\} \]

to denote the event \(X^{-1}(B)\).
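As a concrete illustration, a random variable on a finite sample space can be sketched in Python as an ordinary function, with the event \(\{X \in B\}\) computed by filtering outcomes. The two-coin sample space and the helper name `preimage` are assumptions made for this sketch; they do not come from the source notes. On a finite space with \(\mathcal{F}\) taken to be all subsets, every function is measurable, so no measurability check is needed here.

```python
from itertools import product

# Assumed toy sample space: all outcomes of two coin flips.
omega = set(product("HT", repeat=2))

def X(outcome):
    """Random variable: number of heads in the outcome."""
    return sum(1 for side in outcome if side == "H")

def preimage(rv, sample_space, B):
    """Return the event {rv in B} = {w in sample_space : rv(w) in B}."""
    return {w for w in sample_space if rv(w) in B}

# The event {X in {1}}: exactly one head.
event = preimage(X, omega, {1})
print(sorted(event))
```

The result is a subset of `omega`, which matches the remark that \(X^{-1}(B)\) is a set of outcomes rather than a number.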

Distribution of a Random Variable

If \(X\) is a random variable, then for every Borel set \(B \subseteq \mathbb{R}\), we have \(X^{-1}(B) \in \mathcal{F}\). Using this, we can define a function \(\mu_X\) on Borel sets: \[ \mu_X(B)=\mathbb{P}(X \in B)=\mathbb{P}\left(X^{-1}(B)\right) . \]

This function \(\mu_X\) is a probability measure on \(\mathcal{B}\), making \(\left(\mathbb{R}, \mathcal{B}, \mu_X\right)\) a probability space. The measure \(\mu_X\) is called the distribution of the random variable \(X\).
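The formula \(\mu_X(B)=\mathbb{P}(X^{-1}(B))\) can be sketched on a finite probability space as follows. The two-dice space, the uniform probabilities, and the function names `P` and `mu_X` are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed toy probability space: two fair dice, all 36 outcomes equally likely.
omega = list(product(range(1, 7), repeat=2))
prob = {w: Fraction(1, 36) for w in omega}

def X(w):
    """Random variable: sum of the two dice."""
    return w[0] + w[1]

def P(event):
    """Probability measure on subsets of omega."""
    return sum(prob[w] for w in event)

def mu_X(B):
    """Distribution of X: mu_X(B) = P(X^{-1}(B))."""
    return P([w for w in omega if X(w) in B])

print(mu_X({7}))           # 1/6
print(mu_X(range(2, 13)))  # 1
```

Here `mu_X` takes sets of real values as its argument, while `P` takes sets of outcomes; the preimage is what connects the two.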

Nature of Random Variable

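The essence of using a random variable is to transform the probability space: \((\Omega, \mathcal{F}, \mathbb{P})\) is turned into \(\left(\mathbb{R}, \mathcal B, \mu_X\right)\), which puts the problem in a form that is more convenient to handle mathematically.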

Explanation

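First, recall:

  • For the probability space \((\Omega, \mathcal{F}, \mathbb{P})\), the argument of the probability measure \(\mathbb P(E)\) is \(E\), with \(E \subseteq \Omega\) and \(E \in \mathcal F\).
  • For the probability space \(\left(\mathbb{R}, \mathcal B, \mu_X\right)\), the argument of the probability measure \(\mu_X(B)\) is \(B\), with \(B \subseteq \mathbb{R}\) and \(B \in \mathcal B\).

Although we define \(B\) and \(E\) as events by means of statements (see the earlier discussion), \(B\) and \(E\) are themselves sets of outcomes.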

Example

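For example, define the random experiment as "buy a burger and taste what meat its patty is made of", and specify:

  • \(\Omega = \{\text{beef}, \text{pork}, \text{duck}, \text{fish}\}\); denote the four elements (outcomes) by \(\omega_1, \omega_2, \omega_3, \omega_4\).
  • \(\mathcal F = \{\emptyset, E_1, E_2, \Omega\}\), where \(E_1 = \{\omega_1,\omega_4\}\) and \(E_2 = \{\omega_2,\omega_3\}\); \(\emptyset\) and \(\Omega\) are included so that \(\mathcal F\) is a \(\sigma\)-algebra.
    • Define the event \(E_1\): "the burger is beef or fish". This event is the set of \(\omega_1, \omega_4\), i.e. \(E_1=\{\omega_1, \omega_4\}\), with \(\omega_1, \omega_4 \in \Omega\).
    • Define the event \(E_2\): "the burger is pork or duck", i.e. \(E_2=\{\omega_2, \omega_3\}\), with \(\omega_2, \omega_3 \in \Omega\).
  • \(\mathbb P(E)\) = the probability that the event \(E\) occurs.

The probability space is \((\Omega, \mathcal{F}, \mathbb{P})\).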

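Next, define the random variable \(X\) by \(X(\omega_i) = i\). Writing \(\mathcal X\) for the set of values taken by \(X\), we have:

  • \(\mathcal X = \{1,2,3,4\}\); denote the four elements (outcomes) by \(x_1, x_2, x_3, x_4\).
  • \(\mathcal B = \{\emptyset, B_1, B_2, \mathcal X\}\), where \(B_1 = \{x_1,x_4\} = \{1,4\}\) and \(B_2 = \{x_2,x_3\} = \{2,3\}\); \(\emptyset\) and \(\mathcal X\) are included so that \(\mathcal B\) is a \(\sigma\)-algebra.
    • Define the event \(B_1\): "the event \(X^{-1}(B_1)\) occurs". This event is the set of \(x_1, x_4\), i.e. \(B_1=\{x_1, x_4\} =\{1, 4\}\), with \(1, 4 \in \mathcal X\) and \(\{1,4\} \in \mathcal B\).
    • Define the event \(B_2\): "the event \(X^{-1}(B_2)\) occurs". This event is the set of \(x_2, x_3\), i.e. \(B_2=\{x_2, x_3\} =\{2, 3\}\), with \(2, 3 \in \mathcal X\) and \(\{2,3\} \in \mathcal B\).
  • \(\mu_X(B)\) = the probability that the event \(B\) occurs.

The probability space is \(\left(\mathcal {X}, \mathcal B, \mu_X\right)\), which plays the role of \(\left(\mathbb{R}, \mathcal B, \mu_X\right)\). In this toy example \(\mathcal B\) is a small \(\sigma\)-algebra on \(\mathcal X\), chosen to match \(\mathcal F\), rather than the full Borel \(\sigma\)-algebra.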

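Define the event \(B\): "\(X\) takes the value 1 or 4". This event is really the set of \(x_1, x_4\), i.e. \(B=\{1, 4\}\), where \(x_1, x_4 \in \mathcal X\) and \(\mathcal X\) is a subset of \(\mathbb R\).

Note that \(B_1, B_2\) are themselves just sets of outcomes, but we define them through the two statements "\(X^{-1}(B_1)\) holds" and "\(X^{-1}(B_2)\) holds". When the outcome takes a value in \(B_1\) or \(B_2\), the corresponding statement is true, which means that the event occurs.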
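The burger example can be checked mechanically. The sketch below makes two assumptions that are not in the notes: it adds the empty set and the whole space to each family of events, since a \(\sigma\)-algebra must contain them, and it assigns the illustrative probabilities 3/5 and 2/5 to \(E_1\) and \(E_2\). The helper names `is_sigma_algebra` and `preimage` are also hypothetical.

```python
from fractions import Fraction

# Outcomes omega_1..omega_4 and the index map i used by X(omega_i) = i.
index = {"beef": 1, "pork": 2, "duck": 3, "fish": 4}
omega = frozenset(index)
E1, E2 = frozenset({"beef", "fish"}), frozenset({"pork", "duck"})
F = {frozenset(), E1, E2, omega}      # assumed: sigma-algebra generated by E1, E2

calX = frozenset({1, 2, 3, 4})
B1, B2 = frozenset({1, 4}), frozenset({2, 3})
calB = {frozenset(), B1, B2, calX}    # assumed: sigma-algebra generated by B1, B2

def is_sigma_algebra(space, family):
    """Check closure under complement and union on a finite space."""
    return (space in family
            and all(space - A in family for A in family)
            and all(A | B in family for A in family for B in family))

def X(w):
    """Random variable from the example: X(omega_i) = i."""
    return index[w]

def preimage(B):
    """The event X^{-1}(B) as a subset of omega."""
    return frozenset(w for w in omega if X(w) in B)

# X is measurable if every set in calB pulls back to an event in F.
measurable = all(preimage(B) in F for B in calB)

# Assumed probabilities, for illustration only.
P = {frozenset(): Fraction(0), E1: Fraction(3, 5), E2: Fraction(2, 5), omega: Fraction(1)}
mu_X = {B: P[preimage(B)] for B in calB}
print(measurable, mu_X[B1], mu_X[B2])  # True 3/5 2/5
```

Without the empty set and the whole space, the two-element family would fail the `is_sigma_algebra` check.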

Cumulative distribution function (CDF)

The distribution \(\mu_X\) is often expressed in terms of its cumulative distribution function (CDF): \[ F_X(x)=\mathbb{P}(X \leq x)=\mu_X((-\infty, x]) \]

where \((-\infty, x]\) is indeed a Borel set in \(\mathbb{R}\).

Properties of a CDF:

  1. \(\lim _{x \rightarrow-\infty} F(x)=0\).

  2. \(\lim _{x \rightarrow \infty} F(x)=1\).

  3. \(F\) is non-decreasing.

  4. \(F\) is right-continuous: \[ F\left(x^{+}\right)=\lim _{\epsilon \downarrow 0} F(x+\epsilon)=F(x) . \]
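These four properties can be observed on a small discrete example. The sketch below assumes \(X\) is the result of one fair six-sided die, which is not an example from the notes, and evaluates the CDF exactly with fractions.

```python
from fractions import Fraction

# Assumed example: X is the result of one fair six-sided die.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def F(x):
    """CDF: F(x) = P(X <= x)."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

print(F(0), F(3), F(3.5), F(6))  # 0 1/2 1/2 1

eps = 1e-9
print(F(3 + eps) == F(3))  # True: right-continuous at the jump
print(F(3 - eps) == F(3))  # False: the left limit is 1/3
```

The jump at \(x = 3\) shows why the definition uses \(\leq\): the value at the jump belongs to the right-hand piece, so \(F\) is right-continuous but not left-continuous there.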

Reconstruction from the CDF

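From \(F_X\), we can reconstruct \(\mu_X\). The CDF fixes the measure of every half-line, \[ \mu_X((-\infty, x])=F_X(x), \]

and hence of every half-open interval, \(\mu_X((a, b]) = F_X(b) - F_X(a)\) for \(a < b\). These values extend uniquely to all Borel sets.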
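A minimal sketch of this reconstruction, assuming an exponential distribution with rate 1 as the example: interval probabilities come from differences of the CDF, and the mass at a single point is approximated by a small step to the left. The function names `mu_interval` and `mu_point` are hypothetical.

```python
import math

def F(x, rate=1.0):
    """CDF of an exponential distribution (assumed example)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def mu_interval(a, b):
    """mu_X((a, b]) = F(b) - F(a) for a <= b."""
    return F(b) - F(a)

def mu_point(x, eps=1e-9):
    """mu_X({x}) = F(x) - F(x-), approximated with a small left step."""
    return F(x) - F(x - eps)

print(mu_interval(0.0, 1.0))       # 1 - exp(-1), about 0.632
print(mu_interval(1.0, math.inf))  # exp(-1), about 0.368
print(mu_point(1.0))               # close to 0: no mass at a single point
```

More complicated Borel sets are handled by combining such intervals, which is what the unique extension formalizes.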

Discrete and continuous random variables

  • If \(\mu_X\) gives measure one to a countable set of reals, then \(X\) is called a discrete random variable.
    • In this case, \(X\) can be described by a probability mass function (PMF).
  • If \(\mu_X\) gives zero measure to every singleton set, and hence to every countable set, \(X\) is called a continuous random variable.
    • If it is absolutely continuous, \(X\) can be described by a probability density function (PDF).

Probability density function (PDF)

For a continuous random variable \(X\), the PDF \(f_X\), if it exists, satisfies:

  1. \[ F_X(x)=\int_{-\infty}^x f_X(t) d t . \]

  2. If \(f_X\) is continuous at \(x\),

\[ f_X(x)=\frac{d}{d x} F_X(x) . \]

  3. The total integral equals 1:

\[ \int_{-\infty}^{\infty} f_X(x) d x=1 \]
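The three properties can be checked numerically for a specific density. The sketch below assumes the standard normal distribution as the example and uses a simple trapezoidal rule, truncating the lower limit at \(-10\) where the remaining tail is negligible; `integrate` is a hypothetical helper, not a library function.

```python
import math

def f(x):
    """PDF of the standard normal distribution (assumed example)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def F(x):
    """CDF of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def integrate(g, a, b, n=20_000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + i * h) for i in range(1, n)) + 0.5 * g(b))

x, h = 0.7, 1e-5
err_cdf = abs(integrate(f, -10.0, x) - F(x))             # property 1
err_pdf = abs((F(x + h) - F(x - h)) / (2 * h) - f(x))    # property 2
err_total = abs(integrate(f, -10.0, 10.0) - 1.0)         # property 3
print(err_cdf, err_pdf, err_total)  # all three should be tiny
```

The second check uses a central difference as a stand-in for the derivative, so it only confirms the identity up to numerical error.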

Probability mass function (PMF)

The PMF \(p_X(x)\) of a discrete random variable is defined as: \[ p_X(x)=\mathbb{P}(X=x), \]

where \(p_X(x)>0\) for values \(x\) in the support of \(X\).

Note: In writing \(\mathbb{P}(X=x)\), we are using \(X=x\) to denote an event, consisting of all outcomes \(\omega\) to which \(X\) assigns the number \(x\). This event is also written as \(\{X=x\}\); formally, \(\{X=x\}\) is defined as \(\{\omega \in \Omega: X(\omega)=x\}\), but writing \(\{X=x\}\) is shorter and more intuitive.
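To make the event \(\{X=x\}\) concrete, the following sketch builds it explicitly and reads the PMF off its size. It assumes a toy space of two fair dice with \(X\) the larger of the two results; neither the space nor the name `p_X` as a Python function comes from the notes.

```python
from fractions import Fraction
from itertools import product

# Assumed toy sample space: two fair dice, all 36 outcomes equally likely.
omega = list(product(range(1, 7), repeat=2))

def X(w):
    """Random variable: the larger of the two dice."""
    return max(w)

def p_X(x):
    """PMF: p_X(x) = P({w in omega : X(w) = x})."""
    event = [w for w in omega if X(w) == x]  # the event {X = x}
    return Fraction(len(event), len(omega))

support = sorted({X(w) for w in omega})
for x in support:
    print(x, p_X(x))
```

Values outside the support, such as \(x = 7\), give an empty event and therefore probability zero.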