Secondary-Data Analysis: Issues and Examples 周雪光 Stanford University.


Secondary-Data Analysis: Issues and Examples

周雪光 Stanford University

Why secondary data?

The role of data in social science research

Data availability delimits or expands the horizon of our views, theories, approaches, and knowledge growth

French government archive on policing, folklore
The role of organizations in social stratification

The challenges of collecting first-hand data

What is secondary data?
Data that has been collected (and analyzed) by others
Credibility, and potential for cumulative knowledge
The possibility of further use (reanalysis of data)

GSS, CPS, PSID: hundreds of research articles
New information from the same data, because of new analytical tools, new theoretical perspectives, and new operationalizations.

An example: the diffusion of medical innovation

Coleman et al. (1966); Burt (1987); Marsden and Podolny (1990); Strang and Tuma (1993); Van den Bulte and Lilien (2001)

A variety of secondary data available
Data collected by government agencies: census, industrial survey, firm survey
Special surveys/studies by government agencies: SME firm finance, employment quality survey
Data collected by other researchers (ICPSR)
Data collected by for-profit databanks (COMPUSTAT, etc.)

Considerations in making data accessible in the public domain
Replicability in scientific research (recent practice in natural science, economics, and sociology)
Accumulation of knowledge
The monopoly of data and knowledge

Issues related to the use of secondary data
An observation: the issues are similar to data issues in other types of empirical research

Assessment of data quality
The purpose of the data and the information it contains
The population of study, sampling framework, and procedures
Methods of data collection, response rate
Data coding and entry
Codebook: questionnaire, coding scheme, etc.
Previous research using the data

The limitations of secondary-data-based research
Data quality and representativeness
New organizational forms, new environments
Different research purposes and information
Limitations in available information
Cross-sectional vs. longitudinal data
New topics: EQ, social networks, inter-firm contractual relationships

General observations

A large proportion of research is based on secondary data

The issues encountered in using secondary data are similar to data issues in other contexts

There is a need for a research community for the sharing of secondary data
Making data available in the public domain
Data evaluation and quality checks

Example 1

“Medical Innovation Revisited: Social Contagion versus Marketing Effort”

Christophe Van den Bulte and Gary L. Lilien

AJS 2001 (106)

Medical Innovation: a dataset's rich journey

Coleman et al. (1966): the pattern of adoption of a new medicine in the mid-1950s.

The theme: what determines doctors' adoption decisions? The uncertainty of a new medicine and the mechanisms that affect doctors' decisions.

Social networks
Social positions

Research design
Four cities in Illinois
126 doctors interviewed (total n = 148)
Information: which channels affect a doctor's adoption decision?

Initial stage, mid-stage, and final stage: what is the most important factor?
Channels: salesperson, professional magazine, mailed advertisement, pharmacy magazine, colleague, conference, other.

Subsequent studies

Burt (1987)
Theme: diffusion mechanisms
Not cohesion, but structural equivalence

Marsden & Podolny (1990), Strang & Tuma (1993)
Statistical models of diffusion
S & T: isomorphic mechanisms

Van den Bulte & Lilien (2001)
Theme: the diffusion mechanisms of innovation
New theory: the role of marketing, not social networks
New data collection
Findings: the intensity of marketing wipes out the network effects

Example 2

“Embeddedness in the Making of Financial Capital: How Social Relations and Networks Benefit Firms Seeking Financing”

Brian Uzzi

ASR 1999 (64)

Research issues

Theme: the channel and cost of bank loans
Theoretical: the nonlinear effects of social networks
Embeddedness vs. arm's-length social relations: the strength and complementarity of networks

To address problems in quantitative data – “multiplicity” in business transactions.

Research design

Triangulated research methods: theory, quantitative data, and case studies

Secondary data collected by U.S. government

Case study to provide context and details of social network in operation

Characteristics of Interviewees in the Field Research: Relationship Managers (RMs) at Chicago Banks, 1988

Coefficients from the Heckman Selection Regression of Access to Credit and Interest Rate on Loan on Selected Independent Variables: U.S. Nonagricultural Firms, 1989

Social Embeddedness and the Firm’s Cost of Financial Capital: U.S. Nonagricultural Firms, 1989

Summary

The role of data in social science research
Evolving standards and expectations in data quality
The importance of a research community