1017269_frmFile2
Click here to load reader
Transcript of 1017269_frmFile2
8/11/2019 1017269_frmFile2
http://slidepdf.com/reader/full/1017269frmfile2 1/4
Names and Numbers:
Types of Variables
Statistics provide a way of dealing with numbers. Before
leaping headlong into statistical tests, it is necessary to get an idea of how
these numbers come about, what they represent, and the various forms they
can take.
Let’s begin by examining a simple experiment. Suppose an investigator
has a hunch that clam juice is an effective treatment for the misery of pso-
riasis. He assembles a group of patients, randomizes them to a treatment and
control group, and gives clam juice to the treatment group and something
that looks, smells, and tastes like clam juice (but isn’t) to the control group.
After a few weeks, he measures the extent of psoriasis on the patients, per-
haps by estimating the percent of body involvement or by looking at the
change in size of a particular lesion. He then does some number crunchingto determine if clam juice is as good a treatment as he hopes it is.
Let’s have a closer look at the data from this experiment. To begin
with, there are at least two variables. A definition of the term variable is a
little hard to come up with, but basically it relates to anything that is mea-
sured or manipulated in a study. The most obvious variable in this exper-
iment is the measurement of the extent of psoriasis. It is evident that this
is something that can be measured. A less obvious variable is the nature of
treatment—drug or placebo. Although it is less evident how you might
1
1
There are four types of variables. Nominal and ordinal variables consist ofcounts in categories and are analyzed using “nonparametric” statistics.
Interval and ratio variables consist of actual quantitative measurements and
are analyzed using “parametric” statistics.
8/11/2019 1017269_frmFile2
http://slidepdf.com/reader/full/1017269frmfile2 2/4
convert this to a number, it is still clearly something that is varied in the
course of the experiment.
A few more definitions are in order. Statisticians frequently speak of
independent and dependent variables. In an experiment, the independent variables are those that are varied by and under the control of the exper-
imenter; the dependent variables are those that respond to experimental
manipulation. In the current example, the independent variable is the type
of therapy —clam juice or placebo—and the dependent variable is the size
of lesions or body involvement. Although in this example the identifica-
tion of independent and dependent variables is straightforward, the dis-
tinction is not always so obvious. Frequently, researchers must rely on
natural variation in both types of variables and look for a relationship
between the two. For example, if an investigator was looking for a
relationship between smoking and lung cancer, an ethics committee
would probably take a dim view of ordering 1,000 children to smoke a pack
of cigarettes a day for 20 years. Instead, the investigator must look for a
relationship between smoking and lung cancer in the general adult popula-
tion and must assume that smoking is the independent variable and that lung
cancer is the dependent variable; that is, the extent of lung cancer dependson variations in smoking.
There are other ways of defining types of variables that turn out to be
essential in determining the ways the numbers will be analyzed. Variables are
frequently classified as nominal, ordinal, interval, or ratio (Figure 1-1).
A nominal variable is simply a named category. Our clam juice versus place-
bo is one such variable, as is the sex of the patient, or the diagnosis given to
a group of patients.
2 PDQ STATISTICS
Dependent(reflux)
Independent(proton pump
inhibitorversus placebo)
Interval(36°-38°C)
8/11/2019 1017269_frmFile2
http://slidepdf.com/reader/full/1017269frmfile2 3/4
An ordinal variable is a set of ordered categories. A common example
in the medical literature is the subjective judgment of disease staging in can-
cer, using categories such as stage 1, 2, or 3. Although we can safely say that
stage 2 is worse than stage 1 but better than stage 3, we don’t really know by how much.
The other kinds of variables consist of actual measurements of individ-
uals, such as height, weight, blood pressure, or serum electrolytes. Statisti-
cians distinguish between interval variables, in which the interval between
measurements is meaningful (for example, 32° to 38°C), and ratio variables,
in which the ratio of the numbers has some meaning. Having made this dis-
tinction, they then analyze them all the same anyway. The important dis-
tinction is that these variables are measured quantities, unlike nominal and
ordinal variables, which are qualitative in nature.
So where does the classification lead us? The important distinction is
between the nominal and ordinal variables on one hand and the interval and
ratio variables on the other. It makes no sense to speak of the average value
of a nominal or ordinal variable—the average sex of a sample of patients or,
strictly speaking, the average disability expressed on an ordinal scale. How-
ever, it is sensible to speak of the average blood pressure or average heightof a sample of patients. For nominal variables, all we can deal with is the
number of patients in each category. Statistical methods applied to these two
broad classes of data are different. For measured variables, it is generally
assumed that the data follow a bell curve and that the statistics focus on the
center and width of the curve. These are the so-called parametric statistics.
By contrast, nominal and ordinal data consist of counts of people or things
in different categories, and a different class of statistics, called nonpara-
metric statistics (obviously!), is used in dealing with these data.
C.R.A.P. DETECTORS
Question. What kind of variables are these—nominal or ordinal? Are
they appropriate?
Chapter 1 Names and Numbers: Types of Variables 3
Example 1-1
To examine a program for educating health professionals in a sports injury clinicabout the importance of keeping detailed medical records, a researcher doesa controlled trial in which the dependent variable is the range of motion ofinjured joints, which is classified as (a) worse, (b) same, or (c) better, and theindependent variable is (a) program or (b) no program.
8/11/2019 1017269_frmFile2
http://slidepdf.com/reader/full/1017269frmfile2 4/4
Answer. The independent variable is nominal , and the dependent vari-
able, as stated, is ordinal . However, there are two problems with the choice.
First, detailed medical records may be a good thing and may even save some
lives somewhere. But range of motion is unlikely to be sensitive to changesin recording behavior. A better choice would be some rating of the quality
of records. Second, range of motion is a nice ratio variable. To shove it into
three ordinal categories is just throwing away information.
C.R.A.P. Detector I-1
Dependent variables should be sensible. Ideally, they should be clinically
important and related to the independent variable.
C.R.A.P. Detector I-2
In general, the amount of information increases as one goes from nom-
inal to ratio variables. Classifying good ratio measures into large categories
is akin to throwing away data.
4 PDQ STATISTICS