Random Variables
A discrete random variable is one whose set of possible values is countable.
Probability Distributions for Discrete Random Variables
The probability mass function (pmf) of a discrete rv \(X\) is \(p(x)=P(X=x)\); it satisfies \(0\le p(x)\le1\) and \(\sum_{x}p(x)=1\).
The cumulative distribution function (cdf) \(F(x)\) of a rv \(X\) with pmf \(p(x)\) is given by \(F(x)=P(X\le x)=\sum_{y\le x}p(y)\).
Note that for a discrete distribution the cdf is not continuous: it is a step function that jumps by \(p(x)\) at each possible value \(x\) and is flat in between.
To get the pmf from the cdf: \(P(a\le X\le b)=F(b)-F(a^{-})\), where \(a^{-}\) is the largest possible value of \(X\) strictly less than \(a\).
In particular, \(P(X=a)=F(a)-F(a^{-})\).
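As a quick sanity check, here is a minimal Python sketch of this relationship; the pmf below is a made-up example, not one from the text:

```python
# Minimal sketch: recover the pmf from the cdf for a made-up discrete rv.
pmf = {1: 0.2, 2: 0.5, 4: 0.3}  # hypothetical pmf; probabilities sum to 1

def cdf(x):
    """F(x) = P(X <= x) = sum of p(y) over y <= x."""
    return sum(p for y, p in pmf.items() if y <= x)

def cdf_minus(a):
    """F(a^-): the cdf just below a, i.e. summed over y < a."""
    return sum(p for y, p in pmf.items() if y < a)

for a in pmf:
    # P(X = a) = F(a) - F(a^-): the jump of the step-function cdf at a
    assert abs((cdf(a) - cdf_minus(a)) - pmf[a]) < 1e-12

print(cdf(2))                 # F(2) = 0.7
print(cdf(3))                 # F(3) = 0.7 (flat between possible values)
print(cdf(2) - cdf_minus(2))  # P(X = 2) = 0.5
```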
Expected Values of Discrete Random Variables
The expected value of a discrete distribution: \(E(X)=\mu_{X}=\sum_{x\in D}x\,p(x)\), where \(D\) is the set of possible values. Beware of the word expected, as \(E(X)\) may not itself be a possible value of \(X\) (a fair die has \(E(X)=3.5\)).
Any distribution that has a large amount of probability far from \(\mu\) is said to have a heavy tail. This is trivially the case when \(\mu\) is infinite, but the mean need not be infinite for a tail to be heavy. Be careful when making inferences about such distributions!
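As an illustration (my own example, not the text's): NumPy's Zipf sampler with exponent 2 has \(p(k)\propto1/k^{2}\), so \(\mu=\sum_{k}k\,p(k)\) diverges, and the running sample mean never settles down:

```python
import numpy as np

# Zipf(a=2) has pmf p(k) proportional to 1/k^2, so E(X) diverges.
rng = np.random.default_rng(0)
samples = rng.zipf(2.0, size=1_000_000)

# The running sample mean keeps drifting upward instead of converging.
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, samples[:n].mean())
```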
The variance of a distribution: \(V(X)=\sigma_{X}^{2}=\sum_{x\in D}(x-\mu_{X})^{2}p(x)=E[(X-\mu_{X})^{2}]\)
More generally, let \(h(X)\) be a function of \(X\). Then \(E[h(X)]=\sum_{x\in D}h(x)p(x)\)
The variance of \(h(X)\) is \(V[h(X)]=\sum_{x\in D}\{h(x)-E[h(X)]\}^{2}p(x)\)
Linear scaling of \(X\): \(E(aX+b)=aE(X)+b\) and \(V(aX+b)=a^{2}V(X)\), so \(\sigma_{aX+b}=|a|\,\sigma_{X}\).
The mode of a discrete distribution is the value of \(x\) at which the pmf is maximized.
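Pulling the formulas of this section together, a minimal Python sketch over a made-up pmf, checking the expectation, variance, linear-scaling, and mode definitions:

```python
# Made-up pmf for illustration.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 5: 0.2}

def expect(h):
    """E[h(X)] = sum over x in D of h(x) * p(x)."""
    return sum(h(x) * p for x, p in pmf.items())

mu = expect(lambda x: x)               # E(X) = 2.0
var = expect(lambda x: (x - mu) ** 2)  # V(X) = E[(X - mu)^2] = 2.6

# Linear scaling: E(aX + b) = a*E(X) + b and V(aX + b) = a^2 * V(X).
a, b = 3.0, -2.0
assert abs(expect(lambda x: a * x + b) - (a * mu + b)) < 1e-12
scaled_mu = a * mu + b
assert abs(expect(lambda x: (a * x + b - scaled_mu) ** 2) - a ** 2 * var) < 1e-12

# Mode: the value of x where the pmf is largest.
mode = max(pmf, key=pmf.get)
print(mu, var, mode)  # 2.0 2.6 1
```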
Standardized Variables
If we define \(Y=\frac{X-\mu_{X}}{\sigma_{X}}\), then \(\mu_{Y}=0,\sigma_{Y}=1\).
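A quick numeric check of this, again over a made-up pmf:

```python
# Standardize a made-up pmf: Y = (X - mu) / sigma should have mean 0, sd 1.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 5: 0.2}

mu = sum(x * p for x, p in pmf.items())
sigma = sum((x - mu) ** 2 * p for x, p in pmf.items()) ** 0.5

# Y takes the value (x - mu) / sigma with the same probability as x.
pmf_y = {(x - mu) / sigma: p for x, p in pmf.items()}
mu_y = sum(y * p for y, p in pmf_y.items())
var_y = sum((y - mu_y) ** 2 * p for y, p in pmf_y.items())
print(round(mu_y, 12), round(var_y, 12))  # 0.0 1.0
```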
Independence of Random Variables
If \(X\) and \(Y\) are random variables, then they are independent if and only if \(P(X=i,Y=j)=p_{X}(i)\,p_{Y}(j)\ \forall i,j\)
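As a sketch, checking the factorization on a small made-up joint pmf (constructed as a product, so independence holds by design):

```python
# Made-up joint pmf over (i, j), built as a product of two marginals.
px = {0: 0.3, 1: 0.7}
py = {0: 0.5, 1: 0.25, 2: 0.25}
joint = {(i, j): px[i] * py[j] for i in px for j in py}

# Marginals recovered from the joint pmf by summing out the other variable.
pX = {i: sum(p for (a, _), p in joint.items() if a == i) for i in px}
pY = {j: sum(p for (_, b), p in joint.items() if b == j) for j in py}

# Independence: P(X=i, Y=j) == pX(i) * pY(j) for all i, j.
independent = all(abs(joint[(i, j)] - pX[i] * pY[j]) < 1e-12
                  for i in pX for j in pY)
print(independent)  # True
```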
Markov Inequality
Let \(Y\) be a nonnegative random variable. Then for any \(c>0\): \(P(Y\ge c)\le\frac{E(Y)}{c}\)
Proof Sketch:
Let \(Y\) take on values \(u_{i}\ge0\). Note that \(E(Y)=\sum_{i}u_{i}\,p(u_{i})\).
Split the sum over \(0\le u_{i}<c\) and \(u_{i}\ge c\): the first piece is nonnegative, and in the second piece each \(u_{i}\ge c\), so \(E(Y)\ge\sum_{u_{i}\ge c}u_{i}\,p(u_{i})\ge c\sum_{u_{i}\ge c}p(u_{i})=c\,P(Y\ge c)\).
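A numeric sanity check of the bound, with a made-up nonnegative pmf and a few choices of \(c\):

```python
# Markov: P(Y >= c) <= E(Y) / c for a nonnegative rv and c > 0.
pmf = {0: 0.5, 1: 0.2, 3: 0.2, 10: 0.1}  # made-up nonnegative pmf

ey = sum(u * p for u, p in pmf.items())  # E(Y) = 1.8
for c in (1, 2, 5, 8):
    tail = sum(p for u, p in pmf.items() if u >= c)  # exact P(Y >= c)
    print(c, tail, ey / c, tail <= ey / c)           # the bound always holds
```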
Chebyshev’s Inequality
Chebyshev’s Inequality: If \(k\ge 1\), then \(P(|X-\mu|\ge k\sigma)\le\frac{1}{k^{2}}\)
This gives an upper bound on the probability of being at least \(k\) standard deviations from the mean, for any \(k\ge1\), not just integers.
This is true for any distribution with finite variance (the argument below is for the discrete case). Because of its generality, it is not a tight bound.
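A quick numeric comparison of the exact tail probability against the \(1/k^{2}\) bound, over a made-up pmf, also shows how loose the bound typically is:

```python
# Chebyshev: P(|X - mu| >= k*sigma) <= 1/k^2 for any pmf with finite variance.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 5: 0.2}  # made-up pmf

mu = sum(x * p for x, p in pmf.items())
sigma = sum((x - mu) ** 2 * p for x, p in pmf.items()) ** 0.5

for k in (1, 1.5, 2, 3):
    tail = sum(p for x, p in pmf.items() if abs(x - mu) >= k * sigma)
    print(k, tail, 1 / k**2, tail <= 1 / k**2)  # exact tail vs. the bound
```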
Proof sketch:
Let \(Y=\frac{X-\mu_{X}}{k\sigma_{X}}\). Then \(\mu_{Y}=0\) and \(\sigma_{Y}=1/k\).
Trivially, we know that \(\sigma_{Y}^{2}=E(Y^{2})=\sum_{y}y^{2}p(y)=\frac{1}{k^{2}}\) (using \(\mu_{Y}=0\)).
Now the important thing to show is that \(P(|Y|\ge1)=P(|X-\mu_{X}|\ge k\sigma_{X})\). This is not hard: \(|Y|\ge1\) holds exactly when \(|X-\mu_{X}|\ge k\sigma_{X}\), so the two events are identical.
Looking at the sum again, note that \(\frac{1}{k^{2}}=\sum_{y}y^{2}p(y)\ge\sum_{|y|\ge1}y^{2}p(y)\ge\sum_{|y|\ge1}p(y)=P(|Y|\ge1)\), since \(y^{2}\ge1\) whenever \(|y|\ge1\). This concludes the proof.
Another way to prove it is to apply Markov’s Inequality with \(Y=|X-\mu|^{2}\) and \(c=k^{2}\sigma^{2}\): \(P(|X-\mu|\ge k\sigma)=P(|X-\mu|^{2}\ge k^{2}\sigma^{2})\le\frac{E[|X-\mu|^{2}]}{k^{2}\sigma^{2}}=\frac{1}{k^{2}}\)