Lecture II



Calculation Time


Bivariate Frequency Distribution

Worker  Age  Wage    Worker  Age  Wage
   1     26   785      16     23   680
   2     39  1480      17     23   790
   3     37  1390      18     27  1080
   4     24   825      19     28  1240
   5     25   680      20     32  1250
   6     21   625      21     36  1460
   7     27  1050      22     33  1270
   8     32  1225      23     34  1430
   9     30  1300      24     30  1260
  10     33  1420      25     27  1020
  11     26   890      26     22   880
  12     28  1020      27     23   640
  13     22   630      28     24   680
  14     28  1040      29     32  1200
  15     29  1240      30     31  1150

Age \ Wage   500-700   700-900   900-1100   1100-1300   1300-1500   Total
20-24           5         3         -           -           -          8
25-29           1         2         5           2           -         10
30-34           -         -         -           6           3          9
35-39           -         -         -           -           3          3
Total           6         5         5           8           6         30
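The two-way table can be rebuilt from the raw worker data with a short script. This is a minimal sketch, not the lecture's own method: the helper names `age_class` and `wage_class` are hypothetical, and it assumes that the upper wage limit of each class is excluded (so a wage of 1300 falls in 1300-1500, which is what the totals in the table imply).

```python
from collections import Counter

# (age, wage) for workers 1..30, copied from the worker table
records = [
    (26, 785), (39, 1480), (37, 1390), (24, 825), (25, 680),
    (21, 625), (27, 1050), (32, 1225), (30, 1300), (33, 1420),
    (26, 890), (28, 1020), (22, 630), (28, 1040), (29, 1240),
    (23, 680), (23, 790), (27, 1080), (28, 1240), (32, 1250),
    (36, 1460), (33, 1270), (34, 1430), (30, 1260), (27, 1020),
    (22, 880), (23, 640), (24, 680), (32, 1200), (31, 1150),
]

def age_class(age):
    """Map an age to its 5-year class label: 20-24, 25-29, ..."""
    low = 20 + 5 * ((age - 20) // 5)
    return f"{low}-{low + 4}"

def wage_class(wage):
    """Map a wage to its class label: 500-700, 700-900, ...
    Assumption: the upper limit of each class is excluded."""
    low = 500 + 200 * ((wage - 500) // 200)
    return f"{low}-{low + 200}"

# Cell counts, row totals (by age) and column totals (by wage)
counts = Counter((age_class(a), wage_class(w)) for a, w in records)
row_totals = Counter(age_class(a) for a, _ in records)
col_totals = Counter(wage_class(w) for _, w in records)

for age in sorted(row_totals):
    row = {w: counts[(age, w)] for w in sorted(col_totals, key=lambda s: int(s.split("-")[0]))}
    print(age, row, "total:", row_totals[age])
```

Counting with a `Counter` keyed on the pair (age class, wage class) keeps the cell counts and the marginal totals consistent, because all three are derived from the same records.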

Problem 1

In a newspaper account describing the incidence of influenza among tubercular persons living in the same family, the following paragraph appeared: Exactly a fifth of the 100,000 inhabitants showed signs of tuberculosis and no fewer than 5,000 among them had an attack of influenza, but among them only 1,000 lived in uninfected houses. In contrast with this, 1/15th of the tubercular persons who did not have influenza were still exposed to infection. Altogether 21,000 were attacked by influenza and 41,000 were exposed to risk of infection, but the number who, having influenza but not tuberculosis, lived in houses where no other cases of influenza occurred was only 2,000.

Redraft the information in a concise tabular form.

Ratio Chart

A ratio chart is used when we want to see the rate of change in a variable, or when a comparison is to be made between the rates of change of two or more variables.

With a linear scale, the physical distance on the scale from, say, $1 to $2 is the same as the distance from $2 to $3.

On a ratio scale, the intervals between amounts are determined by their ratio to each other. Thus the distance from $1 to $2 is the same as the distance from $2 to $4 or from $3 to $6 because each represents an increase of 100%.

A ratio scale makes it possible to compare at a glance the magnitudes of changes that occur at different ranges. For example, if an investment rises from $1 to $2 during one time period, and later in the graph it rises from $3 to $6, a ratio scale will make it obvious that each rise was of the same magnitude. A linear scale would make it appear that the second rise was proportionally three times as great as the first rise.
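The comparison can be checked numerically. The sketch below assumes that a ratio scale places each amount at the logarithm of its value (base 10 here, though any base gives the same conclusion); the function names are illustrative only.

```python
import math

def linear_distance(a, b):
    """Distance between two amounts on a linear scale."""
    return b - a

def ratio_distance(a, b):
    """Distance between two amounts on a ratio (log) scale."""
    return math.log10(b / a)

# A rise from 1 to 2 and a rise from 3 to 6 are both increases of 100%
print(linear_distance(1, 2), linear_distance(3, 6))   # second looks three times as large
print(ratio_distance(1, 2), ratio_distance(3, 6))     # identical on a ratio scale
```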

On a linear scale, a vertical inch (or any other distance) represents the same number of units (such as dollars) wherever it appears on the graph. On a ratio scale, a vertical inch represents the same degree of growth wherever it appears on the graph.

A ratio scale demonstrates comparisons, growth, relationships between items and between time periods. It is especially useful for plotting two or more items on the same graph, because it's the only way to compare the growth of two items starting at different levels. If you compare them on a linear scale, the apparent differences between them will be meaningless.

If a graph bears no indication of which kind of scale it's using, it probably is a linear scale. If it's a ratio scale, there should be a label on it somewhere saying ratio scale, log scale, or semi-log scale.

Why Ratio Scale

To see the reason a ratio scale is important, look at this graph. It shows a trend that appears out of control. After making a slow start, the subject of the graph appears to be growing faster and faster, and near the end of the graph it is exploding.

What is this thing that's racing to the sky? Is it the federal debt, careening on the road to national bankruptcy? Is it some investment in the last stages of a runaway bull market, climbing toward its final blowoff?

Well, as a matter of fact, it's neither of those things. It's the yearly value of a 5% passbook savings account with interest compounded year after year. Hold on to the account long enough and the curve will rise with a steeper and steeper arc, even though each year's value is only 5% greater than the prior year's, just as at the beginning of the trend.

Sooner or later, a linear graph will turn any constant growth rate into a picture of a skyrocket. It distorts comparisons because it makes the change from 100 to 120 appear to be 20 times as large as the change from 5 to 6, even though each is, proportionately, a 20% increase.

Now look at this graph. The same 5% growth rate is plotted on a ratio scale. Each horizontal gridline represents the same percentage growth from the previous gridline. Since the growth is a constant 5% per year, the plot line is perfectly straight, because 5% represents the same degree of gain towards the end of the graph as it did at the beginning.
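A small numerical sketch of the same point: for an account growing at 5% a year, the gains per decade keep increasing on a linear scale, while the gains in the logarithm of the balance are constant, which is why the plot is a straight line on a ratio scale. The opening balance of 100 is an assumed, illustrative figure.

```python
import math

# Balance of a 5% compound-interest account, sampled every 10 years.
# The opening balance of 100 is a hypothetical figure for illustration.
balance = [100 * 1.05 ** year for year in range(0, 51, 10)]

# Gain per decade on a linear scale: keeps growing
linear_steps = [b - a for a, b in zip(balance, balance[1:])]

# Gain per decade on a ratio (log) scale: the same every decade
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(balance, balance[1:])]

print([round(s, 2) for s in linear_steps])
print([round(s, 4) for s in log_steps])
```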

Economists and mathematicians almost always use ratio scales whenever rates of growth are important.

Ratio scales are appropriate for investment prices, sales figures, income, or any other absolute amounts being plotted over a period of time. They should not be used to plot anything in which a relationship is already inherent in the amounts, such as percentages (like the inflation rate) or ratios between two items (such as a gold-silver ratio or price-earnings ratio), because the benefit provided by a ratio scale is already built into the figures being plotted.

Measures of Central Tendency

Mean

\[ \bar{x} = \frac{\sum x}{N} \]

OR

\[ \bar{x} = \frac{\sum f x}{\sum f} \]

Mean of Grouped Data

\[ \bar{x} = \frac{\sum f m}{\sum f} \]

(where m is the midpoint of the class interval)

Problem 2

Marks (x)   Frequency (f)
    9             1
   10             2
   11             3
   12             6
   13            10
   14            11
   15             7
   16             3
   17             2
   18             1
Total            46

Calculate Mean

Problem 3

Monthly earnings   No. of Workers
27.5-32.5               120
32.5-37.5               152
37.5-42.5               170
42.5-47.5               214
47.5-52.5               410

Calculate Mean
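As a check on hand calculations, the frequency-weighted mean formulas can be applied to the data of Problems 2 and 3. This is only a sketch; the function name `frequency_mean` is hypothetical, and for the grouped data each class is represented by its midpoint, as in the grouped-mean formula.

```python
def frequency_mean(values, freqs):
    """Mean of a frequency distribution: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)

# Problem 2: discrete marks with frequencies
marks = [9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
mark_freqs = [1, 2, 3, 6, 10, 11, 7, 3, 2, 1]
mean_marks = frequency_mean(marks, mark_freqs)

# Problem 3: grouped data, each class replaced by its midpoint m
classes = [(27.5, 32.5), (32.5, 37.5), (37.5, 42.5), (42.5, 47.5), (47.5, 52.5)]
workers = [120, 152, 170, 214, 410]
midpoints = [(low + high) / 2 for low, high in classes]
mean_earnings = frequency_mean(midpoints, workers)

print(round(mean_marks, 2), round(mean_earnings, 2))
```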

Problem 4

The mean age of a group of 100 persons was found to be 32.02. Later, it was discovered that age 57 was misread as 27. Find the corrected mean.

Problem 5

The mean monthly salary paid to all employees in a company was Rs 500. The monthly salaries paid to male and female employees average Rs 520 and Rs 420 respectively. Determine the percentage of males and females employed by the company.

Mean Formula to be used

\[ \bar{x}_{12} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2} \]
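One way to use the combined-mean formula on Problem 5 is to write the share of the first group as p, so that the overall mean equals p times the first mean plus (1 - p) times the second, and solve for p. The sketch below assumes this rearrangement; both function names are hypothetical.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Combined mean of two groups: (N1*x1 + N2*x2) / (N1 + N2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

def share_of_group1(overall_mean, mean1, mean2):
    """Solve overall = p*mean1 + (1 - p)*mean2 for the share p of group 1."""
    return (overall_mean - mean2) / (mean1 - mean2)

# Problem 5: overall mean Rs 500, males Rs 520, females Rs 420
p_male = share_of_group1(500, 520, 420)
print(f"males: {p_male:.0%}, females: {1 - p_male:.0%}")

# Cross-check with the combined-mean formula on a group of 100 employees
print(combined_mean(100 * p_male, 520, 100 * (1 - p_male), 420))
```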

Problem 6

  x     f
  9     1
 10     2
 11     3
 12     6
 13    10
 14    11
 15     7
 16     3
 17     2
 18     1
Total  46

Short cut method for calculating mean

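\[ \bar{x} = A + \frac{\sum f d}{\sum f} \]

where A is an assumed mean and d = x - A is the deviation of each value from A.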

Problem 7

Monthly earning   No. of workers
1000-1500               3
1500-2000              24
2000-2500              55
2500-3000              51
3000-3500              39
3500-4000              22
4000-4500              15
4500-5000              11
Total                 220

Short cut method for calculating mean

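\[ \bar{x} = A + \frac{\sum f d'}{\sum f} \times i \]

where A is an assumed mean, i is the class width and d' = (m - A)/i is the step deviation of each class midpoint m from A.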
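The short-cut (step-deviation) method can be compared with the direct grouped mean on the Problem 7 data. This is a sketch under the usual reading of the formula, with d' = (m - A)/i; the assumed mean A = 2750 is an arbitrary choice of class midpoint, and any other value gives the same result.

```python
lower_limits = [1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500]
freqs = [3, 24, 55, 51, 39, 22, 15, 11]
width = 500                                   # class width i

midpoints = [low + width / 2 for low in lower_limits]

A = 2750                                      # assumed mean (arbitrary choice)
steps = [(m - A) / width for m in midpoints]  # step deviations d'

# Short-cut method: A + (sum of f*d' / sum of f) * i
mean_shortcut = A + sum(f * d for f, d in zip(freqs, steps)) / sum(freqs) * width

# Direct method for comparison: sum of f*m / sum of f
mean_direct = sum(f * m for f, m in zip(freqs, midpoints)) / sum(freqs)

print(round(mean_shortcut, 2), round(mean_direct, 2))
```

The step deviations are small integers (-3 to 4 here), which is what makes the method convenient for hand calculation.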

Weighted Arithmetic Mean

\[ \bar{x} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4}{w_1 + w_2 + w_3 + w_4} \]

OR

\[ \bar{x} = \frac{\sum w x}{\sum w} \]

Toy             Price per Toy, Rs (x)   Number Sold (w)
Car                      3                    50
Locomotive               5                    25
Airplane                 7                    15
Double Decker            9                    10
Total                                        100
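Applied to the toy data, the weighted mean uses the number sold as the weight for each price. A minimal sketch (the function name is illustrative):

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

prices = [3, 5, 7, 9]          # Rs per toy: Car, Locomotive, Airplane, Double Decker
sold = [50, 25, 15, 10]        # number sold (weights)

average_price = weighted_mean(prices, sold)
simple_average = sum(prices) / len(prices)

print(average_price, simple_average)
```

The weighted figure is lower than the simple average of the four prices because the cheapest toy accounts for half of all sales.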

Median

\[ \text{Median} = \left( \frac{N + 1}{2} \right) \text{th item} \]

Also, when the data is in class intervals:

\[ Me = l + \frac{\frac{N + 1}{2} - C}{f} \times i \]

where l is the lower limit of the median class

i is the width of median class

f is frequency of median class

C is the cumulative frequency up to (excluding) the median class

N is the total number of cases

Problem 8

Class Intervals   Cumulative Frequency
 0-5                     29
 5-10                   224
10-15                   465
15-20                   582
20-25                   634
25-30                   644
30-35                   650
35-40                   653
40-45                   655
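The grouped median formula can be traced step by step on the Problem 8 data. The sketch below follows the formula as stated in this lecture, locating the median class with (N + 1)/2; note that some textbooks use N/2 instead, which shifts the result slightly. The function name is hypothetical.

```python
def grouped_median(lower_limits, width, cum_freq):
    """Me = l + ((N + 1)/2 - C) / f * i, from cumulative frequencies."""
    n = cum_freq[-1]                       # N, total number of cases
    target = (n + 1) / 2
    for k, cf in enumerate(cum_freq):
        if cf >= target:                   # first class whose cumulative count reaches the target
            c = cum_freq[k - 1] if k > 0 else 0   # C, cumulative frequency before the median class
            f = cf - c                     # frequency of the median class
            return lower_limits[k] + (target - c) / f * width
    raise ValueError("cumulative frequencies must be non-decreasing and non-empty")

lower_limits = [0, 5, 10, 15, 20, 25, 30, 35, 40]
cum_freq = [29, 224, 465, 582, 634, 644, 650, 653, 655]

median = grouped_median(lower_limits, 5, cum_freq)
print(round(median, 2))
```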

Quartiles, Deciles and Percentiles

The Median divides a distribution into two halves.

The first and third quartiles (denoted Q1 and Q3) are defined as follows:

25% of the data lie below Q1 (and 75% is above Q1),

25% of the data lie above Q3 (and 75% is below Q3)

Deciles divide the data into 10 equal parts.

Percentiles divide the data into 100 equal parts.

The nth percentile of a data set is the value such that n percent of the items lie below it.

Percentile

\[ P_n = l + \frac{\frac{nN}{100} - C}{f} \times i \]

Decile

\[ D_n = l + \frac{\frac{nN}{10} - C}{f} \times i \]

Quartile

\[ Q_n = l + \frac{\frac{nN}{4} - C}{f} \times i \]

Problem 9

Minutes/week     No. of teenagers
0-99.5                 27
99.5-199.5             42
199.5-299.5            65
299.5-399.5            78
399.5-499.5            58
499.5-599.5            32
599.5 & above           9

The above distribution represents the number of minutes spent by a group of teenagers in going to movies.

Find the median, 1st and 3rd quartile, 70th percentile and 8th decile.
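The percentile, decile and quartile formulas differ only in the divisor (100, 10 or 4), so one function can cover all three. The sketch below applies it to the Problem 9 data; the function name and its parameters are hypothetical, and the open-ended last class is given no width, so a quantile falling in it is reported as an error rather than guessed.

```python
def grouped_quantile(lower_limits, widths, freqs, n, parts):
    """Return l + (n*N/parts - C) / f * i for grouped data.
    parts = 4 for quartiles, 10 for deciles, 100 for percentiles."""
    total = sum(freqs)                     # N
    target = n * total / parts
    cum = 0                                # C, cumulative frequency before the current class
    for l, i, f in zip(lower_limits, widths, freqs):
        if f > 0 and cum + f >= target:
            if i is None:
                raise ValueError("quantile falls in an open-ended class")
            return l + (target - cum) / f * i
        cum += f
    raise ValueError("n must not exceed parts")

lower_limits = [0, 99.5, 199.5, 299.5, 399.5, 499.5, 599.5]
widths = [99.5, 100, 100, 100, 100, 100, None]   # last class is open-ended
freqs = [27, 42, 65, 78, 58, 32, 9]

q1 = grouped_quantile(lower_limits, widths, freqs, 1, 4)
q3 = grouped_quantile(lower_limits, widths, freqs, 3, 4)
p70 = grouped_quantile(lower_limits, widths, freqs, 70, 100)
d8 = grouped_quantile(lower_limits, widths, freqs, 8, 10)

print(round(q1, 2), round(q3, 2), round(p70, 2), round(d8, 2))
```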

Problem 10

Marks           No. of Students
More than 0          50
More than 10         46
More than 20         40
More than 30         20
More than 40         10
More than 50          3

Above is the distribution of marks obtained by 50 students in Decision Science.

Calculate the median marks. If 60 percent of the students pass this test, find the minimum marks obtained by a pass candidate.

Inter Quartile Range

The inter-quartile range (IQR) is the difference between the third and first quartiles, i.e. IQR = Q3 - Q1

Example

The ordered blood pressure data is:

113, 124, 124, 132, 146, 151, 170

Q1 = 124, Q3 = 151

Inter Quartile Range (IQR) is 151-124 = 27


An alternative definition of Q1 and Q3 is based on Q1 having a rank position = 0.25(n+1) and Q3 having rank position = 0.75(n+1), where n is the sample size.

If n = 10, then Q1 would have rank position 0.25 × 11 = 2.75 and Q3 would have rank position 0.75 × 11 = 8.25. Therefore Q1 is found by interpolating between the second and third observations, and Q3 is found by interpolating between observations 8 and 9.

JMP software uses this definition of quartiles.
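The rank-position definition can be sketched as a small function that interpolates linearly between neighbouring observations. This is an illustration of the 0.25(n+1) and 0.75(n+1) rule only, not a reproduction of any particular software's code; the function name is hypothetical. It is applied here to the ordered blood pressure data.

```python
def quantile_by_rank(sorted_data, p):
    """Value at rank position p*(n + 1), with linear interpolation
    between the two neighbouring observations (ranks start at 1)."""
    n = len(sorted_data)
    rank = p * (n + 1)
    k = int(rank)                 # whole part of the rank
    frac = rank - k               # fractional part used for interpolation
    if k < 1:
        return sorted_data[0]
    if k >= n:
        return sorted_data[-1]
    return sorted_data[k - 1] + frac * (sorted_data[k] - sorted_data[k - 1])

blood_pressure = [113, 124, 124, 132, 146, 151, 170]

q1 = quantile_by_rank(blood_pressure, 0.25)   # rank 0.25 * 8 = 2
q3 = quantile_by_rank(blood_pressure, 0.75)   # rank 0.75 * 8 = 6
print(q1, q3, q3 - q1)
```

With n = 7 the rank positions are whole numbers, so no interpolation is needed and the result agrees with the IQR of 27 found above.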

Box-plots

A box-plot is a visual description of the distribution based on

Minimum

Q1

Median

Q3

Maximum

Useful for comparing large sets of data


Example 1

The pulse rates of 12 individuals arranged in increasing order are:

62, 64, 68, 70, 70, 74, 74, 76, 76, 78, 78, 80

Q1 = (68 + 70)/2 = 69, Q3 = (76 + 78)/2 = 77

IQR = 77 - 69 = 8
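The five numbers behind the box-plot for this example can be computed as follows. The sketch assumes the convention used in the example, where Q1 and Q3 are the medians of the lower and upper halves of the ordered data; other quartile definitions may give slightly different values.

```python
from statistics import median

pulse = [62, 64, 68, 70, 70, 74, 74, 76, 76, 78, 78, 80]   # already in increasing order

half = len(pulse) // 2
lower_half = pulse[:half]      # for an odd n this leaves out the middle value
upper_half = pulse[-half:]

five_number_summary = {
    "minimum": pulse[0],
    "Q1": median(lower_half),
    "median": median(pulse),
    "Q3": median(upper_half),
    "maximum": pulse[-1],
}
iqr = five_number_summary["Q3"] - five_number_summary["Q1"]

print(five_number_summary, "IQR:", iqr)
```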


Example 1: Box-plot

Example 2: Box-plots of intensities from 11 gene expression arrays

Outliers

An outlier is an observation which does not appear to belong with the other data

Outliers can arise because of a measurement or recording error or because of equipment failure during an experiment, etc.

An outlier might be indicative of a sub-population, e.g. an abnormally low or high value in a medical test could indicate presence of an illness in the patient.


Outlier Boxplot

Re-define the upper and lower limits of the boxplots (the whisker lines) as:

Lower limit = Q1 - 1.5 × IQR, and

Upper limit = Q3 + 1.5 × IQR

Note that the lines may not go as far as these limits

If a data point is < lower limit or > upper limit, the data point is considered to be an outlier.

Example: box-plot with outliers marked (figure)

Problem 11

The numbers below represent forty eighteen-hole scores of a golfer at Crystal Springs Golf Club. Make a box-and-whisker plot of this data and find the outlier.

88 80 80 77 84 74 80 79 83 77

75 75 78 79 97 81 72 85 75 76

78 76 77 72 75 70 78 77 72 79

78 75 79 77 63 73 77 78 82 86
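A sketch of how the 1.5 × IQR rule can be applied to these scores. It uses `statistics.quantiles` from the Python standard library (Python 3.8 or later), whose default "exclusive" method places the quartiles at rank positions 0.25(n+1) and 0.75(n+1); a different quartile convention could move the limits slightly. Under this rule more than one score may be flagged.

```python
from statistics import quantiles

scores = [
    88, 80, 80, 77, 84, 74, 80, 79, 83, 77,
    75, 75, 78, 79, 97, 81, 72, 85, 75, 76,
    78, 76, 77, 72, 75, 70, 78, 77, 72, 79,
    78, 75, 79, 77, 63, 73, 77, 78, 82, 86,
]

# Quartiles by the rank-position rule p*(n + 1) (the default 'exclusive' method)
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Any score outside the limits is treated as an outlier
outliers = sorted(s for s in scores if s < lower_limit or s > upper_limit)

print("Q1:", q1, "median:", q2, "Q3:", q3)
print("limits:", lower_limit, upper_limit)
print("outliers:", outliers)
```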

Mode

Mode by Interpolation

\[ Mo = l_1 + \frac{f_2}{f_0 + f_2} \times i \]

OR

\[ Mo = l_2 - \frac{f_0}{f_0 + f_2} \times i \]

OR

\[ Mo = l_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i \]

Where l1 is the lower limit of the modal class

l2 is the upper limit of the modal class

f0 equals the frequency of the class next below modal class in value

f1 equals the frequency of the modal class in value

f2 equals the frequency of the following class (class next above modal class) in value and

i equals the interval of the modal class

Problem 12

Wage Group   Frequency
14-18             6
18-22            18
22-26            19
26-30            12
30-34             5
34-38             4
38-42             3
42-46             2
46-50             1
50-54             0
54-58             1
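The interpolation formulas can be compared on this wage distribution. The sketch below takes the class with the highest frequency as the modal class and assumes it is neither the first nor the last class; the variable names mirror the symbols l1, l2, f0, f1, f2 and i defined above. The formulas based on the neighbouring frequencies alone and the one based on frequency differences generally give different values.

```python
lower_limits = [14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54]
freqs = [6, 18, 19, 12, 5, 4, 3, 2, 1, 0, 1]
i = 4                                  # class interval

k = freqs.index(max(freqs))            # position of the modal class
# Assumption: the modal class has a neighbour on each side
f0, f1, f2 = freqs[k - 1], freqs[k], freqs[k + 1]
l1 = lower_limits[k]                   # lower limit of the modal class
l2 = l1 + i                            # upper limit of the modal class

mode_from_lower = l1 + f2 / (f0 + f2) * i
mode_from_upper = l2 - f0 / (f0 + f2) * i
mode_by_differences = l1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * i

print(round(mode_from_lower, 2), round(mode_from_upper, 2), mode_by_differences)
```

The first two formulas are algebraically equivalent, so they always agree with each other.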

Mode = 3 Median - 2 Mean

Problem 13

Twenty percent of the workers in a firm employing a total of 2,000 earn less than Rs 2 per hour, 440 earn from Rs 2 to Rs 2.24 per hour, 24% earn from Rs 2.25 to Rs 2.49 per hour, 370 earn from Rs 2.50 to Rs 2.74 per hour, 12% earn from Rs 2.75 to Rs 2.99 per hour and the rest earn Rs 3 or more per hour. Set up a frequency table and calculate the modal wage.

Thank You
