#set page(paper: "a4", flipped: true, margin: 0.5cm, columns: 4)
#set text(size: 9pt)
#set list(spacing: 1.2em)

- *Mutually Exclusive*: $A inter B = emptyset$
- *Union*: $A union B = { x : x in A or x in B }$
- *Intersection*: $A inter B = { x : x in A and x in B }$
- *Complement*: $A' = { x : x in S and x in.not A }$
- $(A inter B)' = (A' union B')$
- *Multiplication*: $r$ experiments performed sequentially, where experiment $i$ has $n_i$ outcomes: $n_1 dot ... dot n_r$ possible outcomes in total
- *Addition*: if $e$ can be performed in $k$ non-overlapping ways, with $n_i$ choices for way $i$, total ways: $n_1 + ... + n_k$
- *Permutation*: arrangement of $r$ objects out of $n$, _ordered_. $P^n_r = n!/(n-r)!, P^n_n = n!$
- *Combination*: selection of $r$ objects out of $n$, _unordered_. $vec(n, r) = n!/(r!(n-r)!), vec(n, r) times P^r_r = P^n_r$

== Probability
- Axioms:
  + $0 <= P(A) <= 1$
  + $P(S) = 1$
- Propositions:
  + $P(emptyset) = 0$
  + If $A_1, ..., A_n$ are mutually exclusive, $P(A_1 union ... union A_n) = P(A_1) + ... + P(A_n)$
  + $P(A') = 1 - P(A)$
  + $P(A) = P(A inter B) + P(A inter B')$
  + $P(A union B) = P(A) + P(B) - P(A inter B)$
  + If $A subset B, P(A) <= P(B)$

== Conditional Probability
- $P(B|A)$ is the probability of $B$ given that $A$ has occurred
- $P(B|A) = P(A inter B) / P(A)$
- $P(A inter B) = P(B|A)P(A)$
- $P(A|B) = (P(A)P(B|A)) / P(B)$
- $P(A inter B inter C) = P(A)P(B|A)P(C|B inter A)$
- *Independent*: $P(A inter B) = P(A)P(B), A perp B$
- If $P(A) != 0, A perp B arrow.l.r P(B|A) = P(B)$ (knowledge of $A$ does not change $B$)
- *Independence vs mutually exclusive*
  - $P(A) > 0 and P(B) > 0, A perp B arrow.double.r "not mutually exclusive"$
- *Partition*: if $A_1, ..., A_n$ are mutually exclusive and $union.big^n_(i=1) A_i = S$, then $A_1, ..., A_n$ is a partition of $S$
- $P(B) = sum^n_(i=1) P(B inter A_i) = sum^n_(i=1) P(A_i)P(B|A_i)$
  - $n = 2, P(B) = P(A)P(B|A) + P(A')P(B|A')$
- *Bayes' Theorem*: $P(A_k|B) = (P(A_k)P(B|A_k)) / (sum^n_(i=1)P(A_i)P(B|A_i))$
  - $n = 2, P(A|B) = (P(A)P(B|A)) / (P(A)P(B|A) + P(A')P(B|A'))$
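A quick numeric check of the total probability and Bayes' formulas above; a minimal sketch where $P(A)$, $P(B|A)$, $P(B|A')$ are made-up values, not from these notes.

```python
# Sketch: verify total probability and Bayes' theorem numerically.
# The three input probabilities below are hypothetical.
p_a = 0.3          # P(A)
p_b_a = 0.9        # P(B|A)
p_b_na = 0.2       # P(B|A')

# Total probability: P(B) = P(A)P(B|A) + P(A')P(B|A')
p_b = p_a * p_b_a + (1 - p_a) * p_b_na

# Bayes: P(A|B) = P(A)P(B|A) / P(B)
p_a_b = p_a * p_b_a / p_b
print(p_b, p_a_b)  # 0.41 0.6585...
```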
== Random Variables
- Notations:
  - ${X = x} = {s in S : X(s) = x} subset S$
  - ${X in A} = {s in S : X(s) in A} subset S$

== Probability Distributions
- PMF (_Discrete_) of $X$: $f(x) = cases(P(X=x) "if" x in R_X, 0 "otherwise")$
- Properties (*must* satisfy)
  + $f(x_i) >= 0, x_i in R_X$
  + $f(x_i) = 0, x_i in.not R_X$
  + $sum^infinity_(i=1) f(x_i) = 1$
- PDF (_Continuous_) of $X$ is a function that satisfies the following
  + $f(x) >= 0, x in R_X "and" f(x) = 0, x in.not R_X$
  + $integral_(R_X) f(x) dif x = 1$
  + $a <= b, P(a <= X <= b) = integral^b_a f(x) dif x$
- To validate a PDF, check (1) and (2)
- CDF (Discrete): $F(x) = P(X <= x)$
  - $P(a <= X <= b) = P(X <= b) - P(X < a)$
- A PDF only requires $f(x) >= 0$; it is not necessary that $f(x) <= 1$

== Expectation
- Expectation (Discrete): $ E(X) = mu_X = sum_(x_i in R_X) x_i f(x_i) $
- Expectation (Continuous): $ E(X) = mu_X = integral^infinity_(-infinity) x f(x) dif x $
- *Properties*
  + $E(a X + b) = a E(X) + b$
  + $E(X + Y) = E(X) + E(Y)$
  + Let $g(dot)$ be an arbitrary function. $ E[g(X)] = sum g(x)f(x) \ "or" \ E[g(X)] = integral_(R_X) g(x)f(x) dif x $
    - example: $E(X^2) = sum x^2 f(x)$
- Variance: $ sigma^2_X = V(X) = E[(X - mu)^2] = E(X^2) - E(X)^2 $
  - Discrete: $V(X) = sum (x - mu_X)^2 f(x)$
  - Continuous: $V(X) = integral^infinity_(-infinity) (x - mu_X)^2 f(x) dif x$
- *Properties*
  + $V(a X + b) = a^2 V(X)$
  + $V(X) = E(X^2) - E(X)^2$
  + Standard deviation: $sigma_X = sqrt(V(X))$

== Joint Probability Function
- Discrete: $ f_(X, Y)(x, y) = P(X = x, Y = y) $
- *Properties*
  + $f_(X,Y)(x, y) >= 0, (x, y) in R_(X,Y)$
  + $f_(X,Y)(x, y) = 0, (x, y) in.not R_(X,Y)$
  + $sum^infinity_(i=1) sum^infinity_(j=1) f_(X,Y)(x_i, y_j) = 1$
- Continuous: $ P((X, Y) in D) = integral.double_((x, y) in D) f(x, y) dif y dif x $
  - $P(a <= X <= b, c <= Y <= d) = integral^b_a integral^d_c f(x, y) dif y dif x$
- *Properties*
  + $f_(X,Y)(x, y) >= 0$, for any $(x, y) in R_(X,Y)$
  + $f_(X,Y)(x, y) = 0$, for any $(x, y) in.not R_(X,Y)$
  + $integral^infinity_(-infinity) integral^infinity_(-infinity) f_(X,Y)(x, y) dif x dif y = 1$

=== Marginal Probability Distribution
- Discrete: $f_X (x) = sum_y f_(X,Y)(x, y)$
- Continuous: $f_X (x) = integral^infinity_(-infinity) f_(X,Y)(x, y) dif y$
- Conditional distribution: $ f_(Y|X) (y|x) = (f_(X,Y)(x, y)) / (f_X (x)) $
- If $f_X (x) > 0, f_(X,Y)(x, y) = f_X (x) f_(Y|X) (y|x)$
- $P(Y <= y | X = x) = integral^y_(-infinity) f_(Y|X)(y|x) dif y$
- $E(Y | X = x) = integral^infinity_(-infinity) y f_(Y|X)(y|x) dif y$

=== Independent Random Variables
- *Independent*: $ f_(X, Y)(x, y) = f_X (x) f_Y (y) $
- *Properties*
  + If $X, Y$ are independent random variables, $ P(X <= x, Y <= y) = P(X <= x) P(Y <= y) $
  + $g_1(X) "and" g_2(Y)$ are independent (e.g. $X^2$ and $log(Y)$ are independent)
  + If $f_X (x) > 0, "then" f_(Y|X)(y|x) = f_Y (y)$
#colbreak()
=== Expectation and Covariance
- *Expectation*: $ E(g(X, Y)) = sum_x sum_y g(x, y) f_(X, Y)(x, y) \ E(g(X, Y)) = integral^infinity_(-infinity) integral^infinity_(-infinity) g(x, y) f_(X, Y)(x, y) dif y dif x $
#[
#let cov = "cov"
- *Covariance*: $ cov(X, Y) = sum_x sum_y (x - mu_X)(y - mu_Y) f_(X, Y)(x, y) $
- *Properties*
  + $cov(X, Y) = E(X Y) - E(X)E(Y)$
    - $E(X Y) = integral integral x y f(x, y) dif y dif x$
  + If $X$ and $Y$ are independent, $cov(X, Y) = 0$
    - $X perp Y => cov(X, Y) = 0$
    - $cov(X, Y) = 0 arrow.double.not X perp Y$
    - Independence gives $E(X Y) = E(X)E(Y)$
  + $cov(a X + b, c Y + d) = a c dot cov(X, Y)$
  + $V(a X + b Y) = a^2 V(X) + b^2 V(Y) + 2 a b dot cov(X, Y)$
]
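The expectation and covariance identities above can be sanity-checked on a small discrete joint PMF; a sketch where the $2 times 2$ table of probabilities is hypothetical.

```python
import numpy as np

# Sketch: check marginals, E, V and cov(X,Y) = E(XY) - E(X)E(Y)
# on a hypothetical joint PMF f[i, j] = P(X = xs[i], Y = ys[j]).
xs = np.array([0.0, 1.0])
ys = np.array([1.0, 2.0])
f = np.array([[0.1, 0.3],
              [0.4, 0.2]])   # entries sum to 1

fx = f.sum(axis=1)           # marginal f_X(x): sum over y
fy = f.sum(axis=0)           # marginal f_Y(y): sum over x
ex, ey = xs @ fx, ys @ fy    # E(X), E(Y)
vx = (xs**2) @ fx - ex**2    # V(X) = E(X^2) - E(X)^2
exy = xs @ f @ ys            # E(XY) = sum_x sum_y x y f(x,y)
print(exy - ex * ey)         # cov(X, Y); -0.1 for this table
```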
= Probability Distributions
== Discrete Distributions
=== Discrete Uniform Distribution
If a random variable $X$ assumes the values $x_1, ..., x_k$ with _equal_ probability, then $X$ follows the discrete uniform distribution. The PMF of $X$ is
$ f_X(x) = cases(1/k\, &x = x_1\,...\,x_k \ 0 & "otherwise") $
$ E(X) = sum^k_(i=1)x_i f_X (x_i) = 1/k sum^k_(i=1)x_i $
$ V(X) = E(X^2) - E(X)^2 = 1/k sum^k_(i=1)x_i^2 - mu^2_X $
- *Bernoulli Trial*: experiment with only 2 outcomes (1/0)
- *Bernoulli Random Variable*: let $X$ be the number of successes in a Bernoulli trial, so $X$ has only 2 values; $p$ is the probability of success. PMF: $ f_X (x) &= P(X = x) = cases(p &"if" x = 1, 1-p &"if" x = 0) \ &= p^x (1-p)^(1-x), "for" x = 0,1 $
  - $X ~ "Bernoulli"(p)$
  - $q = 1-p$
  - $f_X (1) = p, f_X (0) = q$
  - $E(X) = p$
  - $V(X) = p q$
- *Bernoulli Process*: repeated independent and identical Bernoulli trials
- *Binomial Random Variable*: number of successes in $n$ trials of a Bernoulli process
  - probability of $x$ successes in $n$ trials
  - $X ~ "Bin"(n, p)$
  - $P(X = x) = vec(n,x)p^x (1-p)^(n-x)$
  - $E(X) = n p$
  - $V(X) = n p(1-p)$
- *Negative Binomial Distribution*: number of trials needed for $k$ successes
  - probability that $x$ trials are needed for $k$ successes
  - $X ~ "NB"(k, p)$
  - $f_X (x) = P(X = x) = vec(x - 1, k - 1)p^k (1-p)^(x-k)$
  - $E(X) = k/p$
  - $V(X) = ((1-p)k) / p^2$
- *Geometric Distribution*: number of trials needed until the first success occurs
  - $X ~ "Geom"(p) = "NB"(1, p)$
  - $f_X (x) = P(X = x) = (1-p)^(x-1)p$
  - $E(X) = 1/p$
  - $V(X) = (1-p)/p^2$
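The closed-form means and variances above can be cross-checked with `scipy.stats`; a sketch with hypothetical parameters. Note that scipy's `nbinom` counts _failures_ before the $k$-th success rather than total trials.

```python
from scipy import stats

n, p, k = 10, 0.3, 3   # hypothetical parameters

# Binomial: P(X = x) = C(n,x) p^x (1-p)^(n-x); E = np, V = np(1-p)
binom = stats.binom(n, p)
print(binom.pmf(4), binom.mean(), binom.var())  # mean 3.0, var 2.1

# Geometric (trials until first success): E = 1/p, V = (1-p)/p^2
geom = stats.geom(p)
print(geom.pmf(2), geom.mean(), geom.var())     # 0.21 3.33... 7.77...

# Negative binomial: scipy counts failures, so adding k back
# converts to total trials and recovers E(X) = k/p.
nb = stats.nbinom(k, p)
print(nb.mean() + k, k / p)                     # 10.0 10.0
```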