Unit 04 - Probability - 1 Per Page
amit-chaturvedi -
1
Stat E100
Unit 4: Probability
IPS Chapter 4
2
Unit 4 Outline: Probability
• Definitions and Concepts in Probability
• Rules of Probability
• Independence
• Conditional Probability
• Introduction to Random Variables
• Discrete vs. Continuous
• Means (Expected Values), Variances, and Correlations
3
Probability Terminology
Terminology
• random phenomenon: a process whose individual outcomes are uncertain but which shows a regular distribution of outcomes over a large number of repetitions.
Examples:
• Coin tossing and dice rolling
• The lottery and other games of chance
• Drawing a random sample from some population
• outcome: the value of one replication of a random experiment or phenomenon.
Coin tossing:
• H with one toss of a coin
• HTT with three tosses
4
Probability Terminology (cont.)
• sample space (labeled S): the set of all possible outcomes of a
random phenomenon
• Examples:
1. Toss a coin three times: S = {HHH,THH,HTH,…,TTT}
2. Face showing when rolling a six-sided die: S = {1,2,3,4,5,6}
3. Pick a real number between 1 and 20: S = [1, 20]
• event (labeled A): a set of outcomes of a random phenomenon.
• Examples:
1. The event A that exactly two heads are obtained when a coin is tossed three times: A={HHT,HTH,THH}
2. The result of the toss of a fair die is an even number: A = {2,4,6}
3. The number chosen from the set of all real numbers between 1 and 20 is at most 8.23: A = [1, 8.23]
5
Simple Probability
• The probability of an event can be thought of as its long-run
proportion/frequency
• For a random phenomenon, if the sample space is finite and if
all of the individual outcomes have the same probability,
then the probability of an event A (written P(A)) is the ratio
P(A) = (# of elements in A) / (# of elements in S)
Use this formula to determine the probability of getting two
heads in three tosses of a coin. What is the probability of getting
an even number in one roll of a die?
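A minimal sketch of the counting formula, assuming equally likely outcomes, answers the two questions above by listing the sample space and counting. The helper name `prob` is illustrative and not part of the course material.

```python
from fractions import Fraction
from itertools import product


def prob(event, sample_space):
    """P(A) = (# of elements in A) / (# of elements in S), equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))


# Three coin tosses: S = {HHH, HHT, ..., TTT} (8 outcomes)
tosses = ["".join(t) for t in product("HT", repeat=3)]
p_two_heads = prob(lambda o: o.count("H") == 2, tosses)

# One roll of a six-sided die: S = {1, 2, 3, 4, 5, 6}
die = list(range(1, 7))
p_even = prob(lambda o: o % 2 == 0, die)

print(p_two_heads, p_even)  # 3/8 1/2
```

Using `Fraction` keeps the answers exact instead of rounding them to decimals.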
6
Events in Sample Spaces
(more Terminology)
• The union of two events A and B is the event that either A occurs or B
occurs or both occur: (A or B) = A ∪ B
• The intersection of two events A and B is the event that both A and B
occur: (A and B) = A ∩ B
Set theory notation (the ∩ and ∪) is widely used, but not
needed in this class
• The complement of an event A, A^c, is the event that A does not occur
and thus consists of outcomes that are not in A. The text just calls it
“not A”
7
Rules of Probability
Rule 1: The probability of any event A satisfies
0 ≤ P(A) ≤ 1
Rule 2: P(S) = 1. The probability of the sample space S is 1
Rule 3: P(A^c) = 1 – P(A). The probability of an event not
happening is 1 – (Probability of event happening)
Rule 4: If A and B are disjoint events then
P(A or B) = P(A) + P(B).
Note: the general rule is P(A or B) = P(A) + P(B) – P(A and B)
Justification for these can be seen in Venn Diagrams…
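As a rough check of these rules, the sketch below verifies each one on a single roll of a fair die. The events A, B, and C are chosen here purely for illustration and do not come from the slides.

```python
from fractions import Fraction

S = set(range(1, 7))  # sample space: one roll of a fair six-sided die
A = {2, 4, 6}         # even number
B = {4, 5, 6}         # at least 4 (overlaps A)
C = {1, 3}            # disjoint from A


def P(event):
    """Probability by counting equally likely outcomes."""
    return Fraction(len(event), len(S))


assert 0 <= P(A) <= 1                      # Rule 1
assert P(S) == 1                           # Rule 2
assert P(S - A) == 1 - P(A)                # Rule 3: complement
assert P(A | C) == P(A) + P(C)             # Rule 4: disjoint events
assert P(A | B) == P(A) + P(B) - P(A & B)  # general addition rule

print(P(A | B))  # 2/3
```

In Python, `A | B` is the set union (A or B) and `A & B` is the set intersection (A and B).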
8
Venn Diagram (sample space S)
Rule 3. For any event A, P(A^c) = 1 - P(A)
9
Venn Diagram (sample space S)
Rule 4. If A and B are disjoint events then
P(A or B) = P(A) + P(B)
A and B are disjoint (mutually exclusive):
- No outcomes in common
- Cannot happen at the same time
10
Venn Diagram (sample space S)
In general for any events A and B:
P(A or B) = P(A) + P(B) – P(A and B)
11
Independent events
• There is a 5th ‘rule’ for ‘independent’ pairs of events
• Motivation on next slide, using coin tossing
• Rule 5: Two events A and B are independent if and only if
knowing that one event occurs does not change the probability
that the other event occurs. If A and B are independent, then:
P(A and B) = P(A)×P(B)
Sometimes called the multiplication rule for independent
events.
• Does knowing the results of flipping a fair coin once affect the
chances of heads on a 2nd flip?
Note: Independence can’t be easily drawn in a Venn Diagram…
12
An Example
• There is a bag with 3 balls in it: 1 is red, and 2 are black
• You draw two balls out of the bag, one at a time (without
replacement). Define the events:
A: the first ball drawn is black
B: the second ball drawn is black
• Are A and B independent?
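One way to answer the question is to list every ordered draw and compare P(A and B) with P(A)×P(B). This is a sketch under the assumption that all six ordered draws are equally likely; the labels `R`, `B1`, `B2` are made up here to tell the balls apart.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R", "B1", "B2"]             # 1 red, 2 black (labels are illustrative)
draws = list(permutations(balls, 2))  # 6 equally likely ordered draws


def is_black(ball):
    return ball.startswith("B")


def P(event):
    """Probability of an event over the equally likely ordered draws."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))


p_a = P(lambda d: is_black(d[0]))                      # first ball black
p_b = P(lambda d: is_black(d[1]))                      # second ball black
p_ab = P(lambda d: is_black(d[0]) and is_black(d[1]))  # both black
independent = p_ab == p_a * p_b

print(p_a, p_b, p_ab, independent)  # 2/3 2/3 1/3 False
```

Since P(A and B) = 1/3 differs from P(A)×P(B) = 4/9, the two events are not independent when drawing without replacement.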
14
Conditional probability
• conditional probability: the probability of one event occurring
under the condition that we know the outcome of another event
• Let A and B be two events in a sample space, with P(A) > 0. The
conditional probability of event B, given that A has occurred,
written P(B|A), is
P(B|A) = P(A and B) / P(A)
• P(B|A) is read as “the probability of B, given that A has happened,”
or the probability of B if A is true.
• Note that if A and B are independent, P(B|A) = P(B)
• Conditional probability can get tricky!!!
15
“Simple” Example
• There is a bag with 3 balls in it: 1 is red, and 2 are black
• You draw two balls out of the bag, one at a time (without
replacement). Define the events:
A: the first ball drawn is black
B: the second ball drawn is black
P(B|A) = P(A and B) / P(A)
= P(Both Balls are Black) / P(First Ball is Black)
= (2/6) / (4/6) = 1/2
[Tree diagram: First Ball, Second Ball]
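The same conditional probability can be recovered by counting, as a sketch assuming the six ordered draws are equally likely. The ball labels are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# All ordered draws of two balls from a bag with 1 red (R) and 2 black (B1, B2)
draws = list(permutations(["R", "B1", "B2"], 2))

first_black = [d for d in draws if d[0] != "R"]       # event A
both_black = [d for d in first_black if d[1] != "R"]  # event (A and B)

p_a = Fraction(len(first_black), len(draws))       # 4/6
p_a_and_b = Fraction(len(both_black), len(draws))  # 2/6
p_b_given_a = p_a_and_b / p_a                      # definition of P(B|A)

print(p_b_given_a)  # 1/2
```

Equivalently, once a black ball has been removed, one red and one black ball remain, so the chance the second is black is 1/2.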
16
Very tricky…
• The Monty Hall Problem
There are prizes behind 3 doors: two are ‘worthless’ (an
ant farm) and one is expensive (like a new car)
You are asked to choose one of the 3 doors
Then, Monty Hall (from Let’s Make a Deal) opens one
of the other 2 doors and shows you a worthless prize
• Should you switch doors?
• NYTimes take: http://www.nytimes.com/2008/04/08/science/08monty.html
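A quick simulation can build intuition here. This sketch assumes Monty knows where the car is and always opens a worthless door that you did not pick; the function names, the number of trials, and the seed are arbitrary choices for illustration.

```python
import random


def play(switch, rng):
    """Play one game; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a door that is neither the player's pick nor the car
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining unopened door
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=20_000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

Under these assumptions the staying strategy should win about 1/3 of the time and the switching strategy about 2/3 of the time.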
17
A general multiplication rule
(from conditional probability)
• Suppose A and B are two events in a sample space (not
necessarily independent). Then
P(A and B) = P(B | A) × P(A)
P(A and B and C) = P(C | A and B) × [P(B | A) × P(A)]
The first relationship is a simple algebraic rearrangement of
the definition of conditional probability:
P(B|A) = P(A and B) / P(A)
18
A Simple Example
It is known that approximately 20% of men and 3% of women are taller
than 6 feet in the US.
Let F = the event that someone is female and T = taller than 6 feet.
a) What is P(T | F)? What is P(T | F^c)?
b) What is the probability that the next person walking through the door
is a woman and taller than 6 feet?
c) What is the probability that the next person walking through the door
is taller than 6 feet?
19
Simple Example (cont.)
c) What is the probability that the next person walking through the
door is taller than 6 feet?
Two ways for this to happen: (T and F) or (T and F^c) [Think Venn Diagrams]
P(T) = P(T and F) + P(T and F^c)
= P(T | F) P(F) + P(T | F^c) P(F^c)
= 0.03 × 0.50 + 0.20 × 0.50 = 0.115
20
2-way tables can help organize your thinking…

Tall (6' or more)
                Tall: Yes                      Tall: No
Female: Yes     P(F and T) = P(F)*P(T | F)     P(F)*P(not T | F)
                = (0.5)*(0.03) = 0.015         = (0.5)*(0.97) = 0.485
Female: No      P(not F)*P(T | not F)          P(not F)*P(not T | not F)
                = (0.5)*(0.20) = 0.100         = (0.5)*(0.80) = 0.400
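The 2-way table can also be built in a few lines. This is a sketch using the slide's numbers (P(F) = 0.5, P(T | F) = 0.03, P(T | not F) = 0.20); the variable names are illustrative.

```python
p_f = 0.50              # P(F), as used in the slide's calculation
p_t_given_f = 0.03      # P(T | F)
p_t_given_not_f = 0.20  # P(T | not F)

# Each cell is P(row) * P(column | row), the general multiplication rule
joint = {
    ("F", "T"): p_f * p_t_given_f,
    ("F", "not T"): p_f * (1 - p_t_given_f),
    ("not F", "T"): (1 - p_f) * p_t_given_not_f,
    ("not F", "not T"): (1 - p_f) * (1 - p_t_given_not_f),
}

# P(T) is the sum of the "Tall: Yes" column
p_t = joint[("F", "T")] + joint[("not F", "T")]

for cell, p in joint.items():
    print(cell, round(p, 3))
print("P(T) =", round(p_t, 3))  # P(T) = 0.115
```

The four cells sum to 1, which is a handy check that the table was filled in correctly.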
21
Bayes’ Rule
• Bayes’ rule (formula) provides a way to go from P(B | A) to
P(A | B) (they are in general not equal…)
• If A and B are two events whose probabilities are
not 0 or 1, then
P(A | B) = P(B | A) P(A) / P(B)
= P(B | A) P(A) / [P(B | A) P(A) + P(B | A^c) P(A^c)]
22
Conditional probability and Bayes’ Rule
d) What is the probability that a randomly selected 6
foot tall person is female?
P(F | T) = P(F and T) / P(T)
= P(T | F) P(F) / [P(T | F) P(F) + P(T | F^c) P(F^c)]
= (0.03 × 0.50) / (0.03 × 0.50 + 0.20 × 0.50)
= 0.015 / 0.115 = 0.130
Or, you can just use the 2x2 table from 2 slides earlier.
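Bayes' rule translates directly into a small function. The sketch below reproduces the calculation for part d) with the slide's numbers; the function name `bayes` and its argument names are illustrative.

```python
def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """P(A | B) = P(B|A) P(A) / [P(B|A) P(A) + P(B|A^c) P(A^c)]."""
    numerator = p_b_given_a * p_a
    denominator = numerator + p_b_given_not_a * (1 - p_a)
    return numerator / denominator


# A = female (F), B = taller than 6 feet (T)
p_f_given_t = bayes(p_b_given_a=0.03, p_a=0.50, p_b_given_not_a=0.20)
print(round(p_f_given_t, 3))  # 0.13
```

Note how different P(F | T) ≈ 0.13 is from P(T | F) = 0.03.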
24
Random Variables
• A random variable is a variable whose value is a
numerical outcome of a random phenomenon.
• Usually denoted by capital letters X, Y or Z
• Example: toss a coin three times, let X = number
of observed heads.
S = {TTT,HTT,THT,TTH,THH,HTH,HHT,HHH}
values for X = {0, 1, 2, 3}
25
Random variables versus data
• There is an important distinction between random variables (Xi) and realized observed data (xi)
• A random variable is theoretical and has not yet been observed, but it has the potential to take different values with certain probabilities
• The larger the number of observations (n), the more the histogram of the observed data x1, x2, ... xn resembles the probability distribution of the (theoretical) random variable (Xi)
• We have been using random variables all along, but we called them ‘variables’: responses to a random sample/poll, public transit data, toddlers’ nutritional intake, etc.
26
Discrete Random Variables
• In the coin tossing example, the number of heads, X, is called a
discrete random variable: a variable with a finite number
of distinct values
• X has a simple distribution of values with associated
probabilities. This is the theoretical analogue of the
observed frequency distribution in a set of numbers.
• For X, we can summarize its distribution in a table or
histogram
27
Formal Terminology
• A Discrete Random Variable X can take on values in a finite set with k members. We denote these possible outcomes by x1, x2, x3, x4, x5, ...,xk
• The Probability Distribution of X is specified by assigning probabilities to each of the possible outcomes. The probability distribution tells us how likely each outcome is.
• The probabilities assigned to each outcome must satisfy the following conditions:
1. For each xi, P(X = xi) is between 0 and 1.
   That is, 0 ≤ P(X = xi) ≤ 1 for all possible values of xi
2. The sum of the probabilities equals 1.
   That is, P(X = x1) + P(X = x2) + … + P(X = xk) = 1
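The two conditions are easy to check mechanically. Here is a sketch of such a check; the function name and the tolerance used for the floating-point sum are assumptions made for illustration.

```python
import math


def is_valid_distribution(probs, tol=1e-9):
    """Check both conditions for a discrete distribution {value: probability}."""
    in_range = all(0 <= p <= 1 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one


# X = number of heads in three tosses of a fair coin
heads_in_three = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

print(is_valid_distribution(heads_in_three))    # True
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # False
```

The second call fails because its probabilities add up to 1.1.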
28
A coin tossing random variable
• Let X be the number of heads when a fair coin is tossed
four times (whoa, way more complex!?)
S = {HHHH, HHHT, HHTH, ..., TTTT}
P(HHHH) = P(HHHT) = ... = P(TTTT) = 1/16
• X is a discrete random variable with 5 possible values
{0, 1, 2, 3, 4}
• Its probability distribution is:
Value 0 1 2 3 4
Probability 1/16 4/16 6/16 4/16 1/16
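The table can be reproduced by listing all 16 outcomes and counting heads, as in this sketch (assuming a fair coin, so every outcome has probability 1/16):

```python
from collections import Counter
from itertools import product

# All 16 equally likely outcomes of four tosses, e.g. ('H', 'H', 'T', 'H')
outcomes = list(product("HT", repeat=4))

# For each value of X (number of heads), count the outcomes that produce it
counts = Counter(outcome.count("H") for outcome in outcomes)

for x in sorted(counts):
    print(x, f"{counts[x]}/{len(outcomes)}")
```

The counts 1, 4, 6, 4, 1 match the numerators in the table.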
29
Possible outcomes in 4 tosses of a coin
30
Probability distribution for X, the number of
heads when a coin is tossed 4 times
31
Defining a Discrete Random Variable’s
Probability Distribution
The probability distribution for a discrete random variable X
can be completely defined when depicted in any of 3 forms:
1) Tabular form (shown 3 slides earlier)
2) Graphically (last slide)
3) By a formula (example coming in next unit)
Note: When k gets large, the formula based approach is often
the only feasible choice.
32
Expected value (aka mean) of a discrete
random variable
• Sometimes called average or mean value of a random variable
• Motivated by X = number of heads in 4 tosses of a coin: |S| = 16.
S = {HHHH,THHH,HTHH,HHTH,HHHT,TTHH,…,TTTT}
Average value of X, using its population of values
• 1/16 × (4 + 3 + 3 + 3 + 3 + 2 + … + 0), or
• (4 × 1/16) + (3 × 4/16) + (2 × 6/16) + (1 × 4/16) + (0 × 1/16) = 32/16 = 2.0
• Can be expressed as: μX = Σ xi P(X = xi)
= x1 p1 + x2 p2 + ... + xk pk
Center at 2
33
Mean of a discrete random variable
• If X is a discrete random variable with k possible values, its probability distribution is:
Value x1 x2 x3 ... xk
Probability p1 p2 p3 ... pk
• Mean (or “expected value”) of X, denoted μX, is given by:
μX = E(X) = Σ xi P(X = xi) = x1 p1 + x2 p2 + ... + xk pk
If X represents a measure on some member of a population, then E(X) is the population mean of this measure
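As a small sketch, the mean formula applied to the four-toss distribution gives the value 2 found earlier. The function name `expected_value` is illustrative.

```python
from fractions import Fraction


def expected_value(dist):
    """E(X) = x1*p1 + x2*p2 + ... + xk*pk for a distribution {value: probability}."""
    return sum(x * p for x, p in dist.items())


# X = number of heads in 4 tosses of a fair coin
heads_in_four = {
    0: Fraction(1, 16),
    1: Fraction(4, 16),
    2: Fraction(6, 16),
    3: Fraction(4, 16),
    4: Fraction(1, 16),
}

print(expected_value(heads_in_four))  # 2
```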
34
Variance of a discrete random variable
• If X is a discrete random variable with k possible values, its probability distribution is:
Value x1 x2 x3 ... xk
Probability p1 p2 p3 ... pk
• The variance of X, denoted σX², is given by
σX² = (x1 - μX)² p1 + (x2 - μX)² p2 + ... + (xk - μX)² pk
This is sometimes written as:
Var(X) = E[(X - μX)²] = Σ (xi - μX)² pi   (sum over i = 1, ..., k)
Standard deviation for a RV is the square root of variance:
SD(X) = √Var(X) = √E[(X - μX)²]
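The variance and standard deviation formulas can be sketched the same way, again using the number of heads in 4 tosses of a fair coin. The function names are illustrative.

```python
import math
from fractions import Fraction


def mean(dist):
    """E(X) for a distribution {value: probability}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = sum of (xi - mean)^2 * pi."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


heads_in_four = {
    0: Fraction(1, 16),
    1: Fraction(4, 16),
    2: Fraction(6, 16),
    3: Fraction(4, 16),
    4: Fraction(1, 16),
}

var_x = variance(heads_in_four)
sd_x = math.sqrt(var_x)
print(var_x, sd_x)  # 1 1.0
```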
35
Correlation between 2 Discrete RV’s
Recall that the correlation for two variables, when we have
observations on the variables, is
r = (1/(n - 1)) Σ [(xi - x̄)/sx] [(yi - ȳ)/sy]   (sum over i = 1, ..., n)
The definition of correlation for two random variables X and Y is:
ρXY = E[ ((X - μX)/σX) ((Y - μY)/σY) ]
**We’ll never have you calculate this by hand
36
Correlation (ρ) of Random Variables
Correlation for random variables has the same properties and
interpretation as with data
ρXY > 0 means that when X tends to be larger than its mean,
Y will tend to be larger than its mean
ρXY < 0 → when X > μX, Y tends to be smaller than its mean
-1 ≤ ρXY ≤ 1
The use of correlation in applications is more important than
formula/definition of ρ…
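Since this is never done by hand, a short sketch shows how a computer would apply the definition of ρ. The pair of random variables is a made-up illustration, not from the slides: toss a fair coin three times, let X be the number of heads in the first two tosses and Y the number of heads in all three.

```python
import math
from itertools import product

outcomes = list(product("HT", repeat=3))  # 8 equally likely outcomes
p = 1 / len(outcomes)


def X(o):
    return o[:2].count("H")  # heads in the first two tosses


def Y(o):
    return o.count("H")      # heads in all three tosses


def E(f):
    """Expected value of f(outcome) over the equally likely outcomes."""
    return sum(f(o) * p for o in outcomes)


mu_x, mu_y = E(X), E(Y)
sd_x = math.sqrt(E(lambda o: (X(o) - mu_x) ** 2))
sd_y = math.sqrt(E(lambda o: (Y(o) - mu_y) ** 2))

# rho = E[ ((X - mu_X)/sd_X) * ((Y - mu_Y)/sd_Y) ]
rho = E(lambda o: ((X(o) - mu_x) / sd_x) * ((Y(o) - mu_y) / sd_y))
print(round(rho, 3))  # 0.816
```

The correlation is positive, as expected: more heads in the first two tosses tends to go with more heads overall.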
37
Continuous random variables
• A continuous random variable is a variable taking all
possible values in an interval of numbers
• If X is a continuous random variable, its probability
distribution is described by a density curve (defined
graphically or as a formula)
• The probability of an event is the area under the
curve above the values of X that make up the event
• What density curve have we already talked about in
detail?
38
39
Rules for variances (IPS pg. 271)
• Rule 1: If X is a random variable with variance σX² and a and b
are fixed constants, then
var(a + bX) = b² σX²
• Rule 2: If X and Y have correlation ρ then:
var(X + Y) = σX² + σY² + 2ρ σX σY
var(X – Y) = σX² + σY² – 2ρ σX σY
var(aX – bY) = a² σX² + b² σY² – 2(a)(b)ρ σX σY
• If X and Y are not correlated, ρ = 0 and the above simplifies to:
var(X + Y) = var(X – Y) = σX² + σY²
var(aX – bY) = a² σX² + b² σY²
40
Example – linear combo of random variables
Yearly vet costs for cats have an average of $200 and a sd of $100. For dogs, these costs average $250 with a standard deviation of $150. You can assume these costs are independent. Let X = yearly cat cost and Y = yearly dog cost.
a) What is the expected total yearly vet cost for someone who owns
one cat and one dog?
Mean(1X + 1Y) = 1(μX) + 1(μY) = 200 + 250 = $450
b) What is the standard deviation of the total yearly vet cost for
someone who owns one cat and one dog?
Var(1X + 1Y) = 1² σX² + 1² σY² + 2(1)(1)ρ σX σY
= 1²(100²) + 1²(150²) + 2(0)(1)(1)(100)(150) = 32500
SD(X + Y) = √32500 = $180.28
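The vet cost calculation can be sketched as a reusable function for var(aX + bY). The function name and argument names are illustrative; ρ = 0 reflects the independence assumption in the example.

```python
import math


def var_linear_combo(a, b, sd_x, sd_y, rho=0.0):
    """var(aX + bY) = a^2 sd_x^2 + b^2 sd_y^2 + 2 a b rho sd_x sd_y."""
    return a**2 * sd_x**2 + b**2 * sd_y**2 + 2 * a * b * rho * sd_x * sd_y


# One cat (sd $100) and one dog (sd $150), costs assumed independent
mean_total = 1 * 200 + 1 * 250
var_total = var_linear_combo(1, 1, sd_x=100, sd_y=150, rho=0.0)
sd_total = math.sqrt(var_total)

print(mean_total, var_total, round(sd_total, 2))  # 450 32500.0 180.28
```

Passing a negative `b` covers the var(aX – bY) form of the rule, since the cross term then changes sign.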