1017269_frmFile2

8/11/2019 1017269_frmFile2

http://slidepdf.com/reader/full/1017269frmfile2 1/4

Names and Numbers:

Types of Variables

Statistics provide a way of dealing with numbers. Before

leaping headlong into statistical tests, it is necessary to get an idea of how

these numbers come about, what they represent, and the various forms they

can take.

Let’s begin by examining a simple experiment. Suppose an investigator

has a hunch that clam juice is an effective treatment for the misery of pso-

riasis. He assembles a group of patients, randomizes them to a treatment and

control group, and gives clam juice to the treatment group and something

that looks, smells, and tastes like clam juice (but isn’t) to the control group.

After a few weeks, he measures the extent of psoriasis on the patients, per-

haps by estimating the percent of body involvement or by looking at the

change in size of a particular lesion. He then does some number crunchingto determine if clam juice is as good a treatment as he hopes it is.

Let’s have a closer look at the data from this experiment. To begin

with, there are at least two variables. A definition of the term variable is a

little hard to come up with, but basically it relates to anything that is mea-

sured or manipulated in a study. The most obvious variable in this exper-

iment is the measurement of the extent of psoriasis. It is evident that this

is something that can be measured. A less obvious variable is the nature of

treatment—drug or placebo. Although it is less evident how you might

1

1

There are four types of variables. Nominal and ordinal variables consist ofcounts in categories and are analyzed using “nonparametric” statistics.

Interval and ratio variables consist of actual quantitative measurements and

are analyzed using “parametric” statistics.

8/11/2019 1017269_frmFile2


convert this to a number, it is still clearly something that is varied in the

course of the experiment.

A few more definitions are in order. Statisticians frequently speak of

independent and dependent variables. In an experiment, the independent variables are those that are varied by and under the control of the exper-

imenter; the dependent variables are those that respond to experimental

manipulation. In the current example, the independent variable is the type

of therapy —clam juice or placebo—and the dependent variable is the size

of lesions or body involvement. Although in this example the identifica-

tion of independent and dependent variables is straightforward, the dis-

tinction is not always so obvious. Frequently, researchers must rely on

natural variation in both types of variables and look for a relationship

between the two. For example, if an investigator was looking for a

relationship between smoking and lung cancer, an ethics committee

would probably take a dim view of ordering 1,000 children to smoke a pack

of cigarettes a day for 20 years. Instead, the investigator must look for a

relationship between smoking and lung cancer in the general adult popula-

tion and must assume that smoking is the independent variable and that lung

cancer is the dependent variable; that is, the extent of lung cancer dependson variations in smoking.

There are other ways of defining types of variables that turn out to be

essential in determining the ways the numbers will be analyzed. Variables are

frequently classified as nominal, ordinal, interval, or ratio (Figure 1-1).

A nominal variable is simply a named category. Our clam juice versus place-

bo is one such variable, as is the sex of the patient, or the diagnosis given to

a group of patients.

2 PDQ STATISTICS

Dependent(reflux)

Independent(proton pump

inhibitorversus placebo)

Interval(36°-38°C)

8/11/2019 1017269_frmFile2


An ordinal variable is a set of ordered categories. A common example

in the medical literature is the subjective judgment of disease staging in can-

cer, using categories such as stage 1, 2, or 3. Although we can safely say that

stage 2 is worse than stage 1 but better than stage 3, we don’t really know by how much.

The other kinds of variables consist of actual measurements of individ-

uals, such as height, weight, blood pressure, or serum electrolytes. Statisti-

cians distinguish between interval variables, in which the interval between

measurements is meaningful (for example, 32° to 38°C), and ratio variables,

in which the ratio of the numbers has some meaning. Having made this dis-

tinction, they then analyze them all the same anyway. The important dis-

tinction is that these variables are measured quantities, unlike nominal and

ordinal variables, which are qualitative in nature.

So where does the classification lead us? The important distinction is

between the nominal and ordinal variables on one hand and the interval and

ratio variables on the other. It makes no sense to speak of the average value

of a nominal or ordinal variable—the average sex of a sample of patients or,

strictly speaking, the average disability expressed on an ordinal scale. How-

ever, it is sensible to speak of the average blood pressure or average heightof a sample of patients. For nominal variables, all we can deal with is the

number of patients in each category. Statistical methods applied to these two

broad classes of data are different. For measured variables, it is generally

assumed that the data follow a bell curve and that the statistics focus on the

center and width of the curve. These are the so-called parametric statistics.

By contrast, nominal and ordinal data consist of counts of people or things

in different categories, and a different class of statistics, called nonpara-

metric statistics (obviously!), is used in dealing with these data.

C.R.A.P. DETECTORS

Question. What kind of variables are these—nominal or ordinal? Are

they appropriate?

Chapter 1 Names and Numbers: Types of Variables 3

Example 1-1

To examine a program for educating health professionals in a sports injury clinicabout the importance of keeping detailed medical records, a researcher doesa controlled trial in which the dependent variable is the range of motion ofinjured joints, which is classified as (a) worse, (b) same, or (c) better, and theindependent variable is (a) program or (b) no program.

8/11/2019 1017269_frmFile2


Answer. The independent variable is nominal , and the dependent vari-

able, as stated, is ordinal . However, there are two problems with the choice.

First, detailed medical records may be a good thing and may even save some

lives somewhere. But range of motion is unlikely to be sensitive to changesin recording behavior. A better choice would be some rating of the quality

of records. Second, range of motion is a nice ratio variable. To shove it into

three ordinal categories is just throwing away information.

C.R.A.P. Detector I-1

Dependent variables should be sensible. Ideally, they should be clinically

important and related to the independent variable.

C.R.A.P. Detector I-2

In general, the amount of information increases as one goes from nom-

inal to ratio variables. Classifying good ratio measures into large categories

is akin to throwing away data.

4 PDQ STATISTICS

1017269_frmFile2

Documents

Transcript of 1017269_frmFile2