
9. Expectation, Indicator Random Variables, and Linearity

by EDGE-AI 2022. 1. 22.

These are my notes from the Harvard University Statistics 110 lectures.

 

CDF : F(x) = P(X ≤ x), x ∈ R

discrete

Find P(1 < X ≤ 3) using F

P(X ≤ 1) + P(1 < X ≤ 3) = P(X ≤ 3)

∴ P(1 < X ≤ 3) = F(3) - F(1). In general, P(a < X ≤ b) = F(b) - F(a) for a < b
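
As a quick sanity check of P(a < X ≤ b) = F(b) - F(a), here is a minimal sketch that assumes X is the roll of a fair six-sided die (the die and the name `die_cdf` are illustrative choices, not from the lecture):

```python
from fractions import Fraction

def die_cdf(x):
    """CDF F(x) = P(X <= x) for a fair six-sided die (assumed example)."""
    # Count the faces 1..6 that are <= x.
    favorable = sum(1 for face in range(1, 7) if face <= x)
    return Fraction(favorable, 6)

# P(1 < X <= 3) via the CDF difference F(3) - F(1)
via_cdf = die_cdf(3) - die_cdf(1)

# The same probability by adding the PMF over the faces 2 and 3
via_pmf = Fraction(1, 6) + Fraction(1, 6)

print(via_cdf, via_pmf)  # 1/3 1/3
```

Using `Fraction` keeps the arithmetic exact, so the two results can be compared with `==`.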

 

Properties of CDF

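  1. Increasing (non-decreasing): x1 ≤ x2 implies F(x1) ≤ F(x2)
  2. Right-continuous
    • Not necessarily left-continuous (a discrete CDF jumps at each support point)
  3. F(x) -> 0 as x -> -∞, F(x) -> 1 as x -> ∞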

Independence of random variables

X, Y are indep r.v.s if P(X≤x, Y≤y) = P(X≤x)P(Y≤y) for all x, y

Discrete case: P(X=x, Y=y) = P(X=x)P(Y=y) for all x, y
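
To make the discrete criterion concrete, the sketch below checks the factorization for two assumed examples: two fair dice (independent) and a die paired with the sum of both dice (dependent). The helper `is_independent` and the dice setup are hypothetical illustrations.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Check P(X=x, Y=y) == P(X=x) * P(Y=y) for all x, y in the marginal supports.

    `joint` maps (x, y) pairs to probabilities; missing pairs have probability 0.
    """
    px, py = {}, {}
    for (x, y), pr in joint.items():
        px[x] = px.get(x, 0) + pr  # marginal PMF of X
        py[y] = py.get(y, 0) + pr  # marginal PMF of Y
    return all(joint.get((x, y), 0) == px[x] * py[y] for x, y in product(px, py))

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
pr = Fraction(1, 36)

# X = first die, Y = second die
two_dice = {(a, b): pr for a, b in rolls}

# X = first die, Y = sum of both dice
die_and_sum = {(a, a + b): pr for a, b in rolls}

print(is_independent(two_dice), is_independent(die_and_sum))  # True False
```

The second pair fails because, for example, P(X=1, Y=12) = 0 while P(X=1)P(Y=12) > 0.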

 

Averages (Means, Expected Values)

Discrete r.v.s

  • {1, 2, 3, 4, 5, 6} -> (1+2+3+4+5+6) / 6 = 3.5
  • 1, 2, ..., 99, 100 -> sum = 101 * 50 = 5050, so the average is 5050 / 100 = 50.5
    • 1/n ∑[j = 1~n] j = (n+1)/2 (arithmetic series)
  • {1, 1, 1, 1, 1, 3, 3, 5}
    • add and divide by 8: (1+1+1+1+1+3+3+5) / 8 = 2
    • equivalently, 5/8 * 1 + 2/8 * 3 + 1/8 * 5 = 2 (weighted average)
    • Average of a discrete r.v. X: E(X) = ∑ x P(X=x), summed over x with P(X=x) > 0
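
The two ways of averaging can be compared directly. This is a small sketch using the eight values implied by the weights 5/8, 2/8, 1/8 (five 1s, two 3s, one 5); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

values = [1, 1, 1, 1, 1, 3, 3, 5]

# Plain average: add everything up and divide by the number of values.
plain_average = Fraction(sum(values), len(values))

# Weighted average: each distinct value x is weighted by its relative frequency,
# which plays the role of P(X = x) in E(X) = sum of x * P(X = x).
counts = Counter(values)
weighted_average = sum(Fraction(c, len(values)) * x for x, c in counts.items())

print(plain_average, weighted_average)  # 2 2
```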

Bernoulli r.v.s

X ~ Bern(p), E(X) = 1*P(X=1) + 0*P(X=0) = p

X = 1 if event A occurs, 0 otherwise (indicator r.v.)

then E(X) = 1 * P(A) + 0 * P(A^c) = P(A)
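
A tiny check of E(X) = P(A) for an indicator, assuming the event A is "a fair die shows an even number" (an example chosen here for illustration):

```python
from fractions import Fraction

outcomes = range(1, 7)   # faces of a fair die (assumed example)
prob = Fraction(1, 6)    # each face is equally likely

def indicator(outcome):
    """Indicator of A = 'the roll is even': 1 if A occurs, 0 otherwise."""
    return 1 if outcome % 2 == 0 else 0

# E(X) = sum over outcomes of X(outcome) * P(outcome)
expected_indicator = sum(indicator(w) * prob for w in outcomes)

# P(A) computed directly
prob_A = sum(prob for w in outcomes if w % 2 == 0)

print(expected_indicator, prob_A)  # 1/2 1/2
```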

Binomial r.v.s

X ~ Bin(n,p) 

In the third line of the derivation, j runs from 0 to n-1.
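
The derivation itself is not reproduced in the text of these notes. One standard way to compute E(X) directly from the PMF, which may be laid out differently from the original, is sketched below. It assumes q = 1 - p, uses the identity k·C(n,k) = n·C(n-1,k-1), and substitutes j = k - 1:

```latex
\begin{align*}
E(X) &= \sum_{k=0}^{n} k \binom{n}{k} p^{k} q^{n-k}
      = \sum_{k=1}^{n} n \binom{n-1}{k-1} p^{k} q^{n-k} \\
     &= np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} q^{n-k} \\
     &= np \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j} q^{n-1-j} \\
     &= np\,(p+q)^{n-1} = np
\end{align*}
```

The last step is the binomial theorem applied to (p + q)^(n-1) with p + q = 1.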

Linearity

E(X+Y) = E(X) + E(Y), even if X and Y are dependent

E(cX) = cE(X) if c is a const
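
Linearity for dependent variables can be verified by brute force. The sketch below assumes two fair dice, with X = the first die and Y = the larger of the two dice, so X and Y are clearly dependent; the setup and the helper `expectation` are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
prob = Fraction(1, 36)

def expectation(g):
    """E[g(a, b)] over the 36 equally likely rolls (a, b)."""
    return sum(g(a, b) * prob for a, b in rolls)

e_x = expectation(lambda a, b: a)                  # X = first die
e_y = expectation(lambda a, b: max(a, b))          # Y = max of both dice
e_sum = expectation(lambda a, b: a + max(a, b))    # X + Y

# E(cX) = c E(X) with c = 3
e_3x = expectation(lambda a, b: 3 * a)

print(e_sum == e_x + e_y, e_3x == 3 * e_x)  # True True
```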

 

Redo Binomial

X = X_1 + ... + X_n where X_i ~ Bern(p)

By linearity, E(X) = E(X_1) + ... + E(X_n) = n * E(X_1) = np
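
The result E(X) = np can be checked against the definition E(X) = ∑ k P(X=k). This is a minimal sketch with assumed parameters n = 10 and p = 3/10; `binomial_mean` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def binomial_mean(n, p):
    """E(X) for X ~ Bin(n, p), summed directly from the PMF C(n,k) p^k q^(n-k)."""
    q = 1 - p
    return sum(k * comb(n, k) * p**k * q**(n - k) for k in range(n + 1))

n, p = 10, Fraction(3, 10)  # assumed example parameters
mean_from_pmf = binomial_mean(n, p)

print(mean_from_pmf, n * p)  # 3 3
```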

Hypergeometric distribution

Ex. 5-card hand, X = number of aces. Let X_j be the indicator of the jth card being an ace, 1 ≤ j ≤ 5

E(X) = E(X_1 + ... + X_5) = E(X_1) + ... + E(X_5) = 5 E(X_1) (by symmetry) = 5 * P(1st card is an ace) = 5 * 4/52 = 5/13, even though the X_j's are dependent

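The expected value of the Hypergeometric is computed the same way as for the Binomial: write X as a sum of indicator r.v.s and apply linearity.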
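
The value 5/13 can be cross-checked against the Hypergeometric PMF. The PMF formula used here, P(X=k) = C(4,k) C(48,5-k) / C(52,5), is the standard one and is not stated in these notes; `aces_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def aces_pmf(k, hand_size=5, aces=4, deck=52):
    """P(exactly k aces in a random hand), via the standard Hypergeometric PMF."""
    return Fraction(
        comb(aces, k) * comb(deck - aces, hand_size - k),
        comb(deck, hand_size),
    )

# E(X) = sum of k * P(X = k); a hand can hold 0 to 4 aces
expected_aces = sum(k * aces_pmf(k) for k in range(0, 5))

print(expected_aces)  # 5/13
```

The indicator argument reaches the same number without ever writing down the PMF.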

 

Geometric distribution

Geom(p): independent Bern(p) trials, number of failures before the first success. Let X ~ Geom(p), q = 1-p

(F,F,F,F,F,S) -> P(X=5) = q^5 p

PMF: P(X=k) = q^k p, k ∈ {0, 1, 2, ...}

Expected value of the Geometric

Story proof

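Let c = E(X). Conditioning on the first trial: c = 0 * p + (1+c) * q = q + cq

1+c : after one failure (which counts as 1), the problem restarts, so the expected number of further failures is again c.

∴ c(1 - q) = q, and since 1 - q = p, c = q/p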
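
A numerical check of c = q/p: the sketch below sums k * q^k * p over a long but finite range of k, so it approximates the infinite series. The value p = 0.25 and the truncation at 10,000 terms are arbitrary choices for illustration.

```python
def geometric_mean_partial(p, terms=10_000):
    """Partial sum of E(X) = sum over k >= 0 of k * q^k * p for X ~ Geom(p)."""
    q = 1.0 - p
    return sum(k * q**k * p for k in range(terms))

p = 0.25                     # assumed example value
approx = geometric_mean_partial(p)
exact = (1 - p) / p          # q / p from the story proof

print(abs(approx - exact) < 1e-9)  # True
```

Floating-point arithmetic is used here, so the comparison is made with a tolerance instead of exact equality.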

 

Source: https://www.edwith.org/ai152
