by Assoc. Prof. Sami Fethi

Research in business studies Research Methods in Business Studies © 2009/10, Sami Fethi, EMU, All Right Reserved, Pearson Education, 2005, 3. Ed. Department of Business Administration SPRING 2009-10 Quantitative and Qualitative Data Analysis by Assoc. Prof. Sami Fethi


Page 1

Research in business studies

Research Methods in Business Studies © 2009/10, Sami Fethi, EMU, All Right Reserved, Pearson Education, 2005, 3. Ed.

Department of Business Administration

SPRING 2009-10

Quantitative and Qualitative Data Analysis

by

Assoc. Prof. Sami Fethi

Page 2

Quantitative data analysis

o Examining differences
o Relationship between variables
o Explaining and predicting relationship between variables
o Data reduction, structure and dimension
o Additional methods
o Characteristics of qualitative research
o Qualitative data
o Analytical procedure
o Interpretation
o Strategies for qualitative analysis
o Quantify qualitative data
o Validity in qualitative research

Page 3

Examining differences

Hypotheses about one mean

In research we often have to make statements about the mean. When the population variance is unknown, the standard error of the mean is also unknown and must be estimated from sample data:

$SD_{\bar{X}} = \dfrac{SD'}{\sqrt{N}}$, with $SD' = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \bar{X})^2}{N - 1}}$

where $SD_{\bar{X}}$ = standard error of the mean, $SD'$ = estimated standard deviation, $N$ = sample size, and $N - 1$ is the degrees of freedom.

Example 1: For a supermarket chain to add a new product, at least 100 units must be sold per week. The new product is tested in ten randomly selected stores for a limited time. Apply a one-tailed t-test to answer the question: will the new product sell more than 100 units per week?
a) Construct the hypotheses.
b) Calculate the mean and standard deviation if they are not given.
c) Calculate the standard error of the mean.
d) Find the t-value.
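Steps b) to d) of Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the `sales` list is hypothetical data, not the lecture's sample, and the function names are made up for the example. The last line reuses the summary figures quoted in the worked answer (mean 109.4, standard error 4.55).

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (mean, sd, standard error, t) for a test against mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # estimated SD, divides by N - 1
    se = sd / math.sqrt(n)          # standard error of the mean
    t = (mean - mu0) / se
    return mean, sd, se, t


def t_from_summary(mean, mu0, se):
    """t-value when the mean and its standard error are already known."""
    return (mean - mu0) / se


# Hypothetical weekly unit sales in ten stores (illustrative only).
sales = [86, 97, 114, 108, 123, 112, 99, 135, 121, 105]
mean, sd, se, t = one_sample_t(sales, mu0=100)
print(f"mean={mean:.1f} sd={sd:.2f} se={se:.2f} t={t:.2f}")

# Summary figures from the worked answer.
print(round(t_from_summary(109.4, 100, 4.55), 2))  # 2.07
```

The resulting t-value is then compared with the one-tailed table value for N - 1 degrees of freedom.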

Page 4

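Examining differences

a) H0: µ ≤ 100
   H1: µ > 100

b) $\bar{X}$ and SD' are given as 109.4 and 14.90 respectively.

c) $SD_{\bar{X}} = SD'/\sqrt{N} = 4.55$ (the value 4.55 corresponds to SD' = 14.40; with SD' = 14.90 the quotient $14.90/\sqrt{10}$ would be about 4.71).

d) $t = (\bar{X} - \mu)/SD_{\bar{X}} = (109.4 - 100)/4.55 = 2.07$. The t-table value is 1.83 at the 5% significance level (one-tailed, 9 degrees of freedom). Since 2.07 > 1.83, we reject the null hypothesis.

Hypotheses about two means

This is usually associated with a question such as: are the tastes in region A different from the tastes in region B?

$Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SD_{\bar{X}_1 - \bar{X}_2}}$

where $\bar{X}_1$ = sample mean for the first sample and $\bar{X}_2$ = sample mean for the second sample.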

Page 5

Examining differences

$SD_{\bar{X}_1 - \bar{X}_2}$ = the standard error of the difference in means.

µ1 and µ2 are the unknown population means, and the general estimate of $SD_{\bar{X}_1 - \bar{X}_2}$ is:

$SD_{\bar{X}_1 - \bar{X}_2} = \sqrt{SD_{\bar{X}_1}^2 + SD_{\bar{X}_2}^2} = \sqrt{\dfrac{SD_1^2}{N_1} + \dfrac{SD_2^2}{N_2}}$

If the two population variances are assumed to be equal, the common population variance can be estimated by pooling the samples. When the variances are unknown and the standard errors of the means must be estimated, t is an adequate test statistic, distributed with v = N1 + N2 - 2 degrees of freedom.

Example 2: A manufacturer has developed a new product and wonders whether the label of the package should be red or blue. The new product is tested with the two different labels in ten randomly selected stores each. The mean sales obtained are 403.0 for the red package and 390.3 for the blue package. The standard error of the difference in means is 8.15.

Page 6

Examining differences

a) Construct the hypotheses.
b) Find the t-value.

a) H0: (µ1 - µ2) = 0, H1: (µ1 - µ2) ≠ 0
   or H0: (µ1 - µ2) ≤ 0, H1: (µ1 - µ2) > 0

b) $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SD_{\bar{X}_1 - \bar{X}_2}} = \dfrac{(403.0 - 390.3) - 0}{8.15} = 1.56$

v = 10 + 10 - 2 = 18 degrees of freedom. At the 5% level with 18 degrees of freedom, the (two-tailed) critical value from the table is 2.101. Since 1.56 < 2.101, the null hypothesis H0: (µ1 - µ2) = 0 cannot be rejected. This means that the two unknown population means are assumed to be the same.
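The two-sample calculation can be checked with a short Python sketch. The function names are illustrative; the numbers are those quoted in Example 2, and the critical value is simply the table value stated in the slides rather than something the code derives.

```python
import math


def se_of_difference(sd1, n1, sd2, n2):
    """Standard error of the difference between two sample means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def two_sample_t(mean1, mean2, se_diff, hypothesised_diff=0.0):
    """t-value for H0: mu1 - mu2 = hypothesised_diff."""
    return ((mean1 - mean2) - hypothesised_diff) / se_diff


t_value = two_sample_t(403.0, 390.3, 8.15)
df = 10 + 10 - 2
critical = 2.101  # two-tailed 5% table value for 18 df, as quoted in the slides
reject_h0 = abs(t_value) > critical
print(round(t_value, 2), df, reject_h0)  # 1.56 18 False
```

`se_of_difference` is included for the case where the two standard deviations and sample sizes are given instead of the standard error itself.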

Page 7

Useful alternative tests

In problems involving one or two population means, t-methods are usually appropriate, but non-parametric methods are often good alternatives.

o Non-parametric methods have the advantage of requiring fewer assumptions, but they are less powerful than t-methods (see Siegel and Castellan, 1998).
o The main difference between them is that t-methods deal with means, while non-parametric methods are concerned with medians.

ANOVA (analysis of variance) compares more than two groups simultaneously. The method rests on the ratio of systematic variance to unsystematic variance.

o In ANOVA, the following are computed:
   The total variation, by comparing each observation with the grand mean.
   The between-group variation, by comparing the treatment means with the grand mean.
   The within-group variation, by comparing each score in the group with the group mean.

Recall MANOVA (multivariate analysis of variance): unlike ANOVA, it has more than one dependent variable.
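The three sources of variation listed above can be illustrated with a small Python sketch. The three groups are hypothetical scores chosen so the arithmetic is easy to follow; the function name is made up for this example.

```python
def anova_decomposition(groups):
    """Split total variation into between-group and within-group parts."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total: each observation against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Between: each treatment mean against the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within: each score against its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_between, ss_within


# Hypothetical scores for three treatments (illustrative only).
groups = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
ss_total, ss_between, ss_within = anova_decomposition(groups)
print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```

The between-group and within-group sums of squares add up to the total, which is the identity the F-ratio builds on.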

Page 8

Comparison of more than two groups

Example 3: Three advertising campaigns are tested in 24 randomly selected cities comparable in size and demographics. The following output shows the results of the ANOVA:

Source           Sum of sq.   Degrees of freedom   Mean sq.   F-ratio
Between groups   49.0         2                    24.5       5.88
Within groups    87.5         21                   4.17
Total            136.5        23

Page 9

Example 3
a) Construct the hypotheses.
b) Find the F-value and determine whether it is significant.
c) Comment on the F-value.

a) H0: G1 = G2 = G3 (all group means are equal)
   H1: not all group means are equal
   d.f.: total 24 - 1 = 23, between groups 3 - 1 = 2, within groups 24 - 3 = 21.

b) F calculated = 24.5/4.17 = 5.88, where 24.5 = 49.0/2 is the between-group mean square. The critical value has (k - 1, n - k) = (3 - 1, 24 - 3) = (2, 21) degrees of freedom. From the F-distribution, F critical is 3.47 at the 5% level.

c) Since 5.88 is greater than 3.47, we reject the null hypothesis that the group means are equal and accept the alternative hypothesis that the advertising campaigns vary in effectiveness.
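The F-ratio in Example 3 follows directly from the sums of squares and degrees of freedom in the ANOVA output. A minimal Python sketch (the function name is illustrative; the critical value is the table value quoted above, not computed):

```python
def f_ratio(ss_between, ss_within, n_obs, n_groups):
    """Return degrees of freedom, mean squares and the F-ratio."""
    df_between = n_groups - 1
    df_within = n_obs - n_groups
    ms_between = ss_between / df_between   # 49.0 / 2 = 24.5
    ms_within = ss_within / df_within      # 87.5 / 21, about 4.17
    return df_between, df_within, ms_between, ms_within, ms_between / ms_within


df_b, df_w, ms_b, ms_w, f = f_ratio(49.0, 87.5, n_obs=24, n_groups=3)
f_critical = 3.47  # table value for (2, 21) degrees of freedom
print(df_b, df_w, ms_b, round(ms_w, 2), round(f, 2), f > f_critical)
# 2 21 24.5 4.17 5.88 True
```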

Page 10

Relationship between variables

In research, we are often concerned with whether two or more variables are related, that is, whether they covary.

o Correlation coefficient
Based on the Pearson criterion, it measures the strength of the linear relationship between two variables, for example x and y.
o Theoretically, the correlation coefficient can take values from -1 to 1. A correlation coefficient of 1 tells us that the two variables covary perfectly and positively, whereas -1 shows that the two variables are perfectly inversely related. A value close to 0 indicates that the variables are not linearly related. The formula of the correlation coefficient is as follows:

$r_{XY} = \dfrac{\sum (x_i - \bar{X})(y_i - \bar{Y})}{\sqrt{\sum (x_i - \bar{X})^2 \sum (y_i - \bar{Y})^2}}$

where $\bar{X}$ and $\bar{Y}$ represent the sample means of X and Y.
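The correlation formula translates almost term by term into Python. The sketch below uses only the standard library; the `ads` and `sales` lists are hypothetical values for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data (illustrative only).
ads = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
print(round(pearson_r(ads, sales), 3))  # 0.775
```

Perfectly linear data gives exactly the boundary values: `pearson_r([1, 2, 3], [2, 4, 6])` is 1 and `pearson_r([1, 2, 3], [6, 4, 2])` is -1, up to floating-point rounding.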

Page 11

Relationship between variables

o Correlation coefficient
A correlation coefficient shows covariation between two variables, not that the variables are causally related. The square of the correlation coefficient is the coefficient of determination:
R2 = Explained variation / Total variation
o Example 4 - partial correlation
Using Table 1, calculate the relationship between advertisement recognition, appeal and sex. In other words, is the relationship between advertisement recognition and appeal influenced by controlling for sex?

Page 12

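Example 4
o This is a partial correlation. It can be formulated with the partial correlation coefficient $r_{12.3}$, here $r_{\text{ad.rec},\,\text{appeal}\,.\,\text{sex}}$:

$r_{12.3} = \dfrac{r_{12} - r_{13}\, r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}}$

$r_{\text{ad.rec},\,\text{appeal}\,.\,\text{sex}} = \dfrac{0.24 - (0.33)(-0.09)}{\sqrt{1 - (0.33)^2}\,\sqrt{1 - (-0.09)^2}} = 0.29$

The result 0.29 is obtained only when the product of the two correlations with sex is negative, so 0.09 is taken here as -0.09.

o This shows that, controlling for sex, the observed relationship between ad.rec and appeal is positive and strengthened.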
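The partial correlation in Example 4 can be reproduced with a short Python sketch. One assumption is made explicit: the reported result of 0.29 is only reached when one of the two correlations with sex is negative, so the sketch treats 0.09 as -0.09. With both values positive the same formula gives about 0.22.

```python
import math


def partial_correlation(r12, r13, r23):
    """Correlation between variables 1 and 2, controlling for variable 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# r12 = ad recognition vs appeal; r13 and r23 are the correlations with sex.
# Assumption: the sign of 0.09 is negative (see the note above).
r_controlled = partial_correlation(0.24, 0.33, -0.09)
print(round(r_controlled, 2))  # 0.29
```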

Page 13

Explaining and predicting relationship between variables

o Explaining and predicting relationships between variables are important tasks in business research. One of the most widely applied and useful approaches to examining relationships between variables is regression analysis. In regression analysis, we want to fit the model that best describes the data, which is done by applying the method of least squares. More precisely, a straight line is fitted that minimizes the sum of squared vertical deviations from that line, as shown in Figure 1.
o Single Linear Regression

$Y_i = a_0 + a_1 x_i + e_i$

where Y = the outcome variable, X = the predictor variable, a1 = the slope of the straight line fitted to the data, a0 = the intercept of the line, and ei = the difference between the score predicted and the score actually obtained. This difference is called the residual.
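The least-squares line can be computed by hand with the usual closed-form expressions for the slope and intercept. The following Python sketch uses hypothetical data and illustrative names; it is not the lecture's data matrix.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a0, a1) and residuals for y = a0 + a1*x + e."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    a1 = s_xy / s_xx               # slope
    a0 = mean_y - a1 * mean_x      # intercept
    residuals = [y - (a0 + a1 * x) for x, y in zip(xs, ys)]
    return a0, a1, residuals


# Hypothetical predictor and outcome values (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a0, a1, residuals = fit_line(xs, ys)
print(round(a0, 2), round(a1, 2))  # 2.2 0.6
```

With an intercept in the model, the residuals of the fitted line always sum to zero, which is a quick sanity check on the computation.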

Page 14

Explaining and Predicting Relationship between Variables

Single Linear Regression

Figure 1 The linear model

Page 15

Single Linear Regression
Example 5

Table 2 Data matrix

o Assume that a car dealer collects data for six months on four variables: TV advertising, print advertising, competitors' advertising and sales. Y is sales. The car dealer expects car sales to be positively correlated with TV-ads and Print-ads and negatively correlated with competitors' ads.

Page 16

Simple Mean Regression - output
Example 5

Table 3 Simple mean regression-output

o For the car dealer data of Example 5 (six months of data on TV advertising, print advertising, competitors' advertising and sales, with sales as Y), comment on the estimated coefficient and t-ratio, as well as R2, for TV-ads, based on the output in Table 3.

Page 17

Simple Mean Regression - output
Example 5 - Answer

o The estimated constant term of 0.7 shows that if the dealer does not use TV-ads at all (TV-ads = 0), the estimated expected value of car sales is 0.7 units, that is, seven cars. The estimated regression coefficient of sales on TV-ads is 0.9. This coefficient shows that if the variable TV-ads is increased by 1 unit, the estimated expected value of car sales increases by 0.9 units, that is, nine cars. The R-square (R2) of 85.3 percent shows that the sample coefficient of determination is equal to 0.853. Practically speaking, this means that the variation in the variable TV-ads explains 85.3 percent of the variation in the dependent variable car sales. The estimated t-value on TV-ads is 4.81, which is greater than 2 (the rule-of-thumb tabular value from the t-distribution), so it is significant at the 5% and 1% levels. This means that we can reject the null hypothesis that the corresponding population regression coefficient is equal to zero. The conclusion then is that TV-ads and sales are significantly related to each other, or that TV-ads have a positive impact on sales.
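The interpretation of the intercept and slope can be made concrete with a small Python sketch. The coefficients 0.7 and 0.9 and the t-value 4.81 are the figures from the answer; the scale of ten cars per unit is an assumption inferred from the wording "0.9 units, that is nine cars", and the names are illustrative.

```python
A0 = 0.7             # estimated constant term
A1 = 0.9             # estimated coefficient on TV-ads
CARS_PER_UNIT = 10   # assumption inferred from the wording of the answer


def predicted_sales(tv_ads):
    """Estimated expected car sales (in units) for a given level of TV-ads."""
    return A0 + A1 * tv_ads


def is_significant(t_ratio, critical=2.0):
    """Rule-of-thumb check: |t| above roughly 2 is treated as significant."""
    return abs(t_ratio) > critical


print(predicted_sales(0))                                      # 0.7
extra_cars = (predicted_sales(1) - predicted_sales(0)) * CARS_PER_UNIT
print(round(extra_cars, 1))                                    # 9.0
print(is_significant(4.81))                                    # True
```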

Page 18

Assumptions in Regression analysis

o The expected value of the error term is zero.
o The variance of the error term is constant for each X. This is termed homoscedasticity. If the variance of e varies with X, this is termed heteroscedasticity.
o The errors for the observations are uncorrelated.
o e should be normally distributed for each X.
o The error term should not be correlated with x: corr(e, x) = 0.
o It is also a common assumption that the regression model should be linear in its parameters.
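Two of these assumptions can be probed informally from the residuals of a fitted model. The Python sketch below computes the Durbin-Watson statistic (values near 2 suggest uncorrelated errors) and a crude spread ratio that compares residual size for high and low values of X. Both helper names and the sample residuals are hypothetical, and the checks are rough screening aids rather than formal tests.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic computed from residuals in observation order."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def spread_ratio(xs, residuals):
    """Mean squared residual in the upper half of x divided by the lower half.

    A ratio far from 1 hints at heteroscedasticity.
    """
    ordered = [e for _, e in sorted(zip(xs, residuals))]
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return (sum(e * e for e in upper) / half) / (sum(e * e for e in lower) / half)


# Hypothetical residuals from a fitted line (illustrative only).
xs = [1, 2, 3, 4, 5, 6]
residuals = [0.2, -0.1, 0.3, -0.4, 0.1, -0.1]
print(round(durbin_watson(residuals), 2), round(spread_ratio(xs, residuals), 2))
```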

Page 19

Correlation Coefficients - output
Example 6

Table 4 Correlation coefficients-output

o For the same car dealer data (six months of data on TV advertising, print advertising, competitors' advertising and sales, with sales as Y), use the concept of the correlation coefficient to explain the relationships between the variables under inspection, based on the information given in Table 4.

Page 20

Correlation Coefficients - output
Example 6 - Answer

o The correlations between car sales (the dependent variable) and TV advertising, print advertising and competitors' advertising (the explanatory variables) are expected to be high, while the correlations among the explanatory variables themselves are expected to be low. A high correlation coefficient between, for example, TV advertising and print advertising indicates a high degree of multicollinearity, which distorts the estimation results. To remedy this situation, one of the highly correlated variables can be dropped from the regression equation. In Table 4, for example, the correlation between sales and TV-ads is 0.92, which is a reasonably high score, whereas the correlation between sales and Comp-ads is 0.155, which is a very low score.
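Screening the explanatory variables for multicollinearity amounts to computing their pairwise correlations and flagging the high ones. The sketch below is illustrative only: the data are hypothetical, and the cut-off of 0.8 is an assumed rule of thumb, not a value given in the slides.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def flag_multicollinearity(explanatory, threshold=0.8):
    """Return pairs of explanatory variables whose |r| exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(explanatory.items(), 2):
        if abs(pearson_r(a, b)) > threshold:
            flagged.append((name_a, name_b))
    return flagged


# Hypothetical six-month data (illustrative only).
explanatory = {
    "tv": [1, 2, 3, 4, 5, 6],
    "print": [2, 4, 6, 8, 10, 12],   # moves in lockstep with tv
    "comp": [3, 1, 4, 1, 5, 9],
}
print(flag_multicollinearity(explanatory))  # [('tv', 'print')]
```

A flagged pair is a candidate for dropping one of its two variables from the regression equation.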

Page 21

Multiple Regression

Table 5 Multiple regression – output

o In multiple regression, two or more independent (explanatory) variables are used to explain or predict the dependent variable. The purpose is to make the model more realistic, to control for other variables, to explain more of the variance in the dependent variable, and to reduce the residuals. The following is a typical example of multiple regression output.
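To show what a multiple regression estimates, the following Python sketch fits a model with two predictors by solving the least-squares normal equations using only the standard library. The data are hypothetical and were generated exactly from y = 1 + 2*x1 - 3*x2, so the fit recovers those coefficients; in practice a statistics package would be used and would also report t-ratios and R2.

```python
def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_multiple(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] for predictor rows."""
    design = [[1.0, *row] for row in rows]        # prepend the constant term
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data generated from y = 1 + 2*x1 - 3*x2 (illustrative only).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
ys = [-3, 2, -5, 0, -7, -2]
coefficients = fit_multiple(rows, ys)
print([round(c, 6) for c in coefficients])
```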

Page 22

Dummy Variables

Table 6 Coding of dummy variable

o In a multiple regression, dummy variables can be used in two ways. A dummy variable can serve as a dependent variable whose values are 1 or 0; such a variable is also called dichotomous. It can also serve as an independent variable that takes the value 0 or 1. Dummy variables are used in an analysis when a variable does not have numerical values. For example, season in the following table is a nominal-scaled variable that cannot be ranked, so to be used in a regression analysis the seasons need to be assigned numbers.

Page 23

Dummy variables
Example 7

Table 6 Coding of dummy variable

o In the following table, there are three new variables A, B and C, which code the four seasons as different combinations of zeros and ones. Assume that the following regression model for sales of women's clothing, in which the price (P) is also included, has been estimated:

Sale = 1000 - 0.5P + 100A - 20B - 50C

a) Calculate the sales in the summer, taking the dummy variables into account (P = $200).
b) Calculate the sales in the autumn, taking the dummy variables into account (P = $200).
c) Compare the sales in winter and spring at the same price.

Page 24

Dummy variables
Example 7 - Answer

o Three new variables A, B and C code the four seasons as different combinations of zeros and ones. The estimated regression model for sales of women's clothing, with the price (P) included, is:

Sale = 1000 - 0.5P + 100A - 20B - 50C

a) Sales in the summer (P = $200):

Sale = 1000 - 0.5(200) + 100(1) - 20(0) - 50(0) = $1000

b) Sales in the autumn (P = $200):

Sale = 1000 - 0.5(200) + 100(0) - 20(1) - 50(0) = $880

c) Sales in winter and spring at the same price:

Winter: Sale = 1000 - 0.5(200) + 100(0) - 20(0) - 50(1) = $850
Spring: Sale = 1000 - 0.5(200) + 100(0) - 20(0) - 50(0) = $900

At the same price, winter sales are $50 lower than spring sales.
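The estimated equation can be evaluated for each season with a short Python sketch. The season coding below (summer A = 1, autumn B = 1, winter C = 1, spring as the all-zero reference category) is inferred from the worked substitutions, since the coding table itself is not reproduced in this transcript; the names are illustrative.

```python
# Assumed coding of the seasons as (A, B, C), inferred from the worked answer.
SEASON_DUMMIES = {
    "summer": (1, 0, 0),
    "autumn": (0, 1, 0),
    "winter": (0, 0, 1),
    "spring": (0, 0, 0),   # reference category
}


def predicted_sale(price, season):
    """Evaluate Sale = 1000 - 0.5P + 100A - 20B - 50C for one season."""
    a, b, c = SEASON_DUMMIES[season]
    return 1000 - 0.5 * price + 100 * a - 20 * b - 50 * c


for season in SEASON_DUMMIES:
    print(season, predicted_sale(200, season))
# summer 1000.0
# autumn 880.0
# winter 850.0
# spring 900.0
```

Each dummy coefficient is the difference in expected sales between that season and the reference season at the same price.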