Probability

Complement Law – $P(\bar{A}) = 1 - P(A)$

Laws Of Addition – $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, if A and B are not mutually exclusive

$P(A \cup B) = P(A) + P(B)$, if A and B are mutually exclusive

Conditional Probability – $P(A|B) = \dfrac{P(A \cap B)}{P(B)}$

Independent Condition – If A and B are independent, $P(A \cap B) = P(A) \times P(B)$

Laws Of Multiplication – If A and B are dependent, $P(A \cap B) = P(A) \times P(B|A)$ or

$P(A \cap B) = P(B) \times P(A|B)$
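
The laws above can be checked numerically on a small sample space. The following is a minimal sketch, assuming one roll of a fair six-sided die; the events `A` and `B` and the helper `prob` are illustrative choices, not part of these notes.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # hypothetical event "even number"
B = {4, 5, 6}   # hypothetical event "greater than 3"

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & omega), len(omega))

p_complement = 1 - prob(A)                  # complement law
p_union = prob(A) + prob(B) - prob(A & B)   # addition law (not mutually exclusive)
p_a_given_b = prob(A & B) / prob(B)         # conditional probability

# The addition law agrees with counting the union directly.
assert p_union == prob(A | B)
# Multiplication law for dependent events: P(A and B) = P(B) x P(A|B).
assert prob(A & B) == prob(B) * p_a_given_b
```

Here A and B are dependent, since P(A) x P(B) = 1/4 while P(A ∩ B) = 1/3, so the independence shortcut would give the wrong answer.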

Descriptive Statistics

Population Mean, $\mu = \dfrac{\sum \text{all values}}{N}$

Sample Mean, $\bar{x} = \dfrac{\sum \text{all values}}{n}$

Population Variance, $\sigma^2 = \dfrac{\sum (X - \mu)^2}{N}$

Sample Variance, $S^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$

Standard Deviation = square root of $\sigma^2$ or $S^2$
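
As a rough illustration of the difference between the N and n-1 divisors, the sketch below uses Python's standard `statistics` module on a made-up data set (the values in `data` are hypothetical).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values

mean = statistics.mean(data)

# Population formulas divide by N.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas divide by n - 1.
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

# Same quantities written out from the definitions.
ss = sum((x - mean) ** 2 for x in data)   # sum of squared deviations
manual_pop_var = ss / len(data)
manual_samp_var = ss / (len(data) - 1)
```

The sample variance is always the larger of the two for the same data, because the same sum of squares is divided by a smaller number.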

Probability Distribution

Expected Value, $E(x) = \sum_{\text{all } x} x \, P(X = x) = \mu$

Properties of E(x),

$E(a) = a$

$E(ax) = aE(x)$

$E(ax \pm b) = aE(x) \pm b$

$E(x_1 \pm x_2) = E(x_1) \pm E(x_2)$

$E(x^2) = \sum_{\text{all } x} x^2 \, P(X = x)$

Variance, $Var(x) = E[(x - \mu)^2]$ or $Var(x) = E(x^2) - [E(x)]^2$

Properties of Var(x),

$Var(a) = 0$

$Var(ax) = a^2 Var(x)$

$Var(ax \pm b) = a^2 Var(x)$

$Var(x_1 \pm x_2) = Var(x_1) + Var(x_2)$, if $x_1$ and $x_2$ are independent

$E(x^2) = \sum_{\text{all } x} x^2 \, P(X = x)$

Standard Deviation = square root of $Var(x)$
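
The expectation and variance rules above can be verified on a small discrete distribution. This sketch assumes a hypothetical distribution (number of heads in two fair coin tosses); the helper `expect` and the constants `a` and `b` are illustrative only.

```python
from fractions import Fraction

# Hypothetical distribution: number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expect(g):
    """E[g(x)] = sum over all x of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x)                      # E(x)
e_x2 = expect(lambda x: x * x)                # E(x^2)
var_shortcut = e_x2 - mu ** 2                 # E(x^2) - [E(x)]^2
var_definition = expect(lambda x: (x - mu) ** 2)

a, b = 3, 7   # arbitrary constants
e_axb = expect(lambda x: a * x + b)
var_axb = expect(lambda x: (a * x + b - e_axb) ** 2)

assert e_axb == a * mu + b            # E(ax + b) = aE(x) + b
assert var_axb == a ** 2 * var_shortcut   # Var(ax + b) = a^2 Var(x)
```

Using `Fraction` keeps the arithmetic exact, so the two variance formulas can be compared with `==` rather than with a tolerance.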

Binomial Distribution – $x \sim Bin(n, p)$

Characteristics,

The experiment consists of a fixed number of trials, n

The result of each trial is either a success or a failure

The probability of success, p, is the same for every trial, and the trials are independent

$E(x) = np$

$Var(x) = npq$, where $q = 1 - p$
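
A short numerical check of E(x) = np and Var(x) = npq is sketched below. The function `binom_pmf` and the values n = 10, p = 0.3 are assumptions chosen for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3   # hypothetical parameters
q = 1 - p

probs = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean = sum(k * pk for k, pk in enumerate(probs))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))

# These should agree with the closed forms np and npq up to rounding.
closed_mean = n * p
closed_var = n * p * q
```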

Continuous Distribution – $x \sim N(\mu, \sigma^2)$

Standardising, $z = \dfrac{x - \mu}{\sigma}$

Normal Approximation to Binomial Distribution – $x \sim N(np, npq)$

Conditions,

Number of trials n > 50

Must use continuity correction
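
To see why the continuity correction matters, the sketch below compares the exact binomial probability P(X ≤ 55) with the normal approximation, with and without the 0.5 adjustment. The parameters n = 100, p = 0.5 and the cut-off 55 are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5   # hypothetical parameters, n > 50
q = 1 - p

# Approximating distribution N(np, npq); NormalDist takes the standard deviation.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * q))

# Exact binomial probability P(X <= 55).
exact = sum(comb(n, k) * p ** k * q ** (n - k) for k in range(56))

# Normal approximation with continuity correction: P(X <= 55.5).
with_correction = approx.cdf(55.5)

# Normal approximation without it: P(X <= 55).
without_correction = approx.cdf(55)

error_with = abs(with_correction - exact)
error_without = abs(without_correction - exact)
```

In this example the corrected value is much closer to the exact probability than the uncorrected one.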

Joint Probability

Conditional Mean – $E(x \mid y = y_1) = \sum_{\text{all } x} x \, P(X = x \mid Y = y_1)$

$E(XY) = \sum_{\text{all } x} \sum_{\text{all } y} x y \, P(X = x \text{ and } Y = y)$

When x and y are independent, E(XY) = E(X) E(Y)

Covariance of 2 random variables, $\sigma_{xy}$ – $Cov(X, Y) = E(XY) - E(X)E(Y)$

When X and Y are independent, $Cov(X, Y) = 0$, since $E(XY) = E(X)E(Y)$

Correlation Coefficient, $\rho = \dfrac{Cov(X, Y)}{\sqrt{Var(X) \, Var(Y)}}$, $-1 \le \rho \le 1$

Formula for Variance of linear combinations of 2 dependent variables –

$Var(X \pm Y) = Var(X) + Var(Y) \pm 2Cov(X, Y)$

$Var(aX \pm bY) = a^2 Var(X) + b^2 Var(Y) \pm 2ab \, Cov(X, Y)$
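
The covariance, correlation and variance-of-a-sum formulas can be checked on a small joint distribution. The joint probabilities below are made up for illustration, and `expect` is a hypothetical helper.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint distribution P(X = x and Y = y) for two dependent 0/1 variables.
joint = {
    (0, 0): Fraction(3, 8), (0, 1): Fraction(1, 8),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def expect(g):
    """E[g(X, Y)] = sum over all x, y of g(x, y) * P(X = x and Y = y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y      # E(XY) - E(X)E(Y)
var_x = expect(lambda x, y: x * x) - e_x ** 2
var_y = expect(lambda x, y: y * y) - e_y ** 2
rho = float(cov) / sqrt(float(var_x * var_y))     # correlation coefficient

# Variances of X + Y and X - Y computed directly from the joint distribution.
var_sum = expect(lambda x, y: (x + y) ** 2) - (e_x + e_y) ** 2
var_diff = expect(lambda x, y: (x - y) ** 2) - (e_x - e_y) ** 2

assert var_sum == var_x + var_y + 2 * cov
assert var_diff == var_x + var_y - 2 * cov
```

Because the covariance here is positive, the variance of the sum is larger than Var(X) + Var(Y) and the variance of the difference is smaller.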

Distribution Of Sample Mean And Sample Proportion

Let X denote the population variable, $\mu$ the population mean and $\sigma^2$ the population variance.

Then, for the mean $\bar{x}$ of a sample of size n,

$\bar{x} \sim N(\mu, \sigma^2/n)$

Let p denote the population proportion and P the sample proportion, with n the sample size,

then

$P \sim N\{p, \; p(1-p)/n\}$

if p is unknown,

$P \sim N\{P, \; P(1-P)/n\}$ approx., where P is the sample proportion, with the use of the continuity correction $\pm \frac{1}{2n}$

Theory Of Estimation

Mean Square Error – $MSE = E[(V - \theta)^2]$, where V is the value of the estimator and $\theta$ is the true value

Best estimator of the true value is the one that yields the lowest MSE

Confidence Interval – The interval within which the true value lies with a stated level of confidence.

3 Cases Of Formula For Confidence Interval –

For population mean where

$\sigma^2$ given: $\mu = \bar{x} \pm (\sigma^2/n)^{1/2} \, Z_{\text{sig level}}$

$\sigma^2$ unknown, sample size n > 50: $\mu = \bar{x} \pm (S^2/n)^{1/2} \, Z_{\text{sig level}}$

$\sigma^2$ unknown, sample size n < 50: $\mu = \bar{x} \pm (S^2/n)^{1/2} \, t_{\text{sig level}}$
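
For difference in population means $\mu_x - \mu_y$ where

$\sigma^2$ given: $\mu_D = (\bar{x} - \bar{y}) \pm (\sigma_x^2/n_x + \sigma_y^2/n_y)^{1/2} \, Z_{\text{sig level}}$

$\sigma^2$ unknown, sample size n > 50: $\mu_D = (\bar{x} - \bar{y}) \pm (S_x^2/n_x + S_y^2/n_y)^{1/2} \, Z_{\text{sig level}}$

$\sigma^2$ unknown, sample size n < 50: $\mu_D = (\bar{x} - \bar{y}) \pm (S_p^2/n_x + S_p^2/n_y)^{1/2} \, t_{\text{sig level}}$

where pooled variance, $S_p^2 = \dfrac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_x + n_y - 2}$

$S_p^2 = \dfrac{S_x^2 (n_x - 1) + S_y^2 (n_y - 1)}{n_x + n_y - 2}$

Paired Samples: $\mu_D = \bar{D} \pm (S_D^2/n_D)^{1/2} \, t_{\text{sig level}}$, where D is the difference between the paired samples.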
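
The two expressions for the pooled variance are equivalent, which the following sketch checks on two small made-up samples (`x` and `y` are hypothetical data). It also computes the standard error that multiplies the t value in the small-sample interval.

```python
from math import sqrt
from statistics import mean, variance

x = [1, 2, 3, 4]   # hypothetical sample from population X
y = [2, 4, 6]      # hypothetical sample from population Y
n_x, n_y = len(x), len(y)

# First form: sums of squared deviations about each sample mean.
ss_x = sum((v - mean(x)) ** 2 for v in x)
ss_y = sum((v - mean(y)) ** 2 for v in y)
sp2_from_sums = (ss_x + ss_y) / (n_x + n_y - 2)

# Second form: sample variances weighted by their degrees of freedom.
sp2_from_variances = (variance(x) * (n_x - 1) + variance(y) * (n_y - 1)) / (n_x + n_y - 2)

# Standard error of the difference in sample means using the pooled variance.
std_error = sqrt(sp2_from_sums / n_x + sp2_from_sums / n_y)
```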

For Population Proportion, $P \sim N\{p, \; p(1-p)/n\}$

p not given, then the variance is estimated with $P(1-P)/n$, giving the confidence interval

$p = P \pm (P(1-P)/n)^{1/2} \, Z_{\text{sig level}}$
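
The Z-based intervals above all have the form "estimate ± Z x standard error". A minimal sketch using the standard library follows; the helper `z_interval` and all sample figures are hypothetical. The t-based cases are not covered because the standard library has no t quantile function.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(estimate, std_error, confidence=0.95):
    """Two-sided interval: estimate +/- Z * standard error."""
    # Z value leaving (1 - confidence) / 2 in each tail of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * std_error
    return estimate - margin, estimate + margin

# Population mean, variance known: x-bar = 50, sigma^2 = 16, n = 64 (made-up figures).
mean_ci = z_interval(50, sqrt(16 / 64))

# Population proportion: sample proportion P = 0.4, n = 100 (made-up figures).
P, n = 0.4, 100
prop_ci = z_interval(P, sqrt(P * (1 - P) / n))
```

Raising the confidence level widens the interval, since a larger Z value multiplies the same standard error.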

Hypothesis Testing

Procedure:

State the null and alternative hypotheses

Determine whether the test is one-sided or two-sided

Find $Z_{test}$ or $t_{test}$ and compare the result with $Z_{critical}$ or $t_{critical}$ respectively

Decision Rule: if $|Z_{test}| < Z_{critical}$ or $|t_{test}| < t_{critical}$, do not reject the null hypothesis; otherwise reject it

Conclude in relation to the hypothesis / question
e.g., $Z_{test} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
P-value –
Decision Rule
Reject $H_0$ if p-value < level of significance
Do not reject (accept) $H_0$ if p-value $\geq$ level of significance
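
The procedure can be sketched for a two-sided Z test with known population standard deviation. The function `z_test` and the sample figures are hypothetical; the sketch shows that the critical-value rule and the p-value rule lead to the same decision.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Two-sided Z test of H0: mu = mu_0 with known sigma."""
    std_normal = NormalDist()
    z = (x_bar - mu_0) / (sigma / sqrt(n))          # test statistic
    z_critical = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    p_value = 2 * (1 - std_normal.cdf(abs(z)))      # two-sided p-value
    reject = abs(z) > z_critical
    return z, p_value, reject

# Made-up example: sample mean 52 from n = 36, testing mu = 50 with sigma = 6.
z, p_value, reject = z_test(52, 50, 6, 36)

# Both decision rules agree at the 5% level.
assert reject == (p_value < 0.05)
```

For a one-sided test, the critical value would use `1 - alpha` and the p-value would use a single tail.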