The Usability Perception Scale (UPscale): A Measure for Evaluating Feedback Displays
Beth Karlin Transformational Media Lab
Rebecca Ford Center for Sustainability
Underlying Assumptions
1. Technology and new media are changing how people interact with our natural, built, and social worlds.
2. There are potential opportunities to leverage these changes for pro-social / pro-environmental benefit.
3. A psychological approach provides a theoretical base and empirical methodology to study this potential.
B. Karlin
Transformational Media Lab Mission:
Our lab studies how technology and new media are (and can be) used to transform individuals, communities, and systems.
Documentary Film
Campaigns
Home Energy Management
Energy Feedback “Information about the result of a process or action that can be used in modification or control of a process or system”
Oxford English Dictionary
Energy Feedback
1888
• Average frequency: monthly (approx. 12 data points/year)
• Average frequency: hourly (approx. 8,760 data points/year)
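The data-point counts above follow from simple arithmetic; a minimal sketch to verify the monthly and hourly figures:

```python
# Data points per year at the two feedback frequencies on the slide.
readings_per_month = 1
monthly_points = 12 * readings_per_month   # one utility bill per month -> 12/year

hours_per_year = 24 * 365
hourly_points = hours_per_year             # one smart-meter reading per hour -> 8,760/year

print(monthly_points, hourly_points)       # 12 8760
```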
Energy usage tells its own story...
[Figure: power consumption over time; y-axis label: Power Consumption (Watts)]
Small changes, big impacts
$9.24 → $5.28 (savings: $3.96, or 43%)
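The dollar figures on the slide are internally consistent; a quick verification:

```python
# Verify the slide's savings figures: $9.24 before vs. $5.28 after.
before, after = 9.24, 5.28
savings = before - after            # $3.96
pct = savings / before * 100        # about 42.9%, i.e. the slide's 43%
print(f"Savings: ${savings:.2f} ({pct:.0f}%)")
```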
And the computer is still plugged in…
(uci@home project)
blu-ray netflix streaming
Appliance Disaggregation (up to 6.3 trillion data points/year)
200 microsecond sampling
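The 6.3-trillion figure is consistent with the stated 200-microsecond sampling rate if roughly 40 channels are monitored; the channel count is my assumption, not stated on the slide:

```python
# Back-of-envelope check on the disaggregation data volume.
sample_period_s = 200e-6                       # 200-microsecond sampling
samples_per_sec = 1 / sample_period_s          # 5,000 samples/second/channel
seconds_per_year = 60 * 60 * 24 * 365          # 31,536,000
per_channel_year = samples_per_sec * seconds_per_year  # ~1.58e11 points/channel/year

# Channels needed to reach the slide's "up to 6.3 trillion points/year"
# (assumption: the total aggregates over many monitored circuits).
implied_channels = 6.3e12 / per_channel_year
print(round(implied_channels))                 # about 40
```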
Savings Add Up
“Household actions can provide a behavioral wedge to rapidly reduce carbon emissions … without waiting for new technologies or regulations or changing household lifestyle.”
Dietz, Gardner, Gilligan, Stern, & Vandenbergh (2009)
• 5-12% reduction in 5 years
• 9-22% reduction in 10 years
Over 200 devices on the market
(Karlin, Ford, & Squiers, in press)
What are we missing?
Public and Private Interest
Feedback is ✗ can be effective…
• 100+ studies conducted since 1976
• Reviews found average 10% savings
• Mean r-effect size = .1174 (p < .001)
• Significant variability in effects (from negative effects to over 20% savings)
Darby, 2006; Ehrhardt-Martinez et al., 2010; Fischer, 2008; Karlin & Zinger, in preparation
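Meta-analyses commonly pool per-study correlations through Fisher's z transform before reporting a mean r; a minimal sketch with made-up study values (not the actual data behind r = .1174):

```python
import math

def mean_r_fisher(rs):
    """Pool correlation coefficients via Fisher's z transform,
    the usual approach when averaging r effect sizes across studies."""
    zs = [math.atanh(r) for r in rs]       # r -> z (variance-stabilizing)
    return math.tanh(sum(zs) / len(zs))    # mean z -> back to r

# Illustrative per-study effect sizes (hypothetical, for demonstration only).
rs = [0.05, 0.10, 0.15, 0.20, 0.08]
print(round(mean_r_fisher(rs), 3))
```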
It depends…
[Figure: average savings by feedback type, ranging from 2% to 20% (Ehrhardt-Martinez, Laitner, & Donnelly, 2010)]
Moderators identified in meta-analysis:
• Study population (WHO?)
• Study duration (HOW LONG?)
• Frequency of feedback (HOW OFTEN?)
• Feedback medium (WHAT TYPE?)
• Disaggregation (WHAT LEVEL?)
• Comparison (WHAT MESSAGE?)
Karlin & Zinger, in preparation
Methodological Limitations
1. Not naturalistic
• Participants generally recruited to participate
• May be different from “active adopters”
2. Not comparative
• Most studies test one type of feedback (vs. control)
• Very few studies isolate or combine variables
3. Not testing mediation
• DV is energy use, but studies rarely test possible mediators to explain effectiveness
B. Karlin, 2013
Does program x lead to outcome y?
Program x → Outcome y
Simple causal model
What is the program?
What is going on here?
How do we measure outcomes?
How and for whom does program x lead to outcome y?
How and For Whom?
Beyond kWh Model
Experience
Usability
Psychometrics
• Theory and technique of measurement: knowledge, abilities, attitudes, traits
• Construction and validation of instruments: questionnaires, tests, assessments
Psychometric Properties
1. Factor Structure
2. Reliability
3. Criterion Validity
4. Sensitivity
System Usability Scale
(Brooke, 1986)
Identified Factors: 1. System usability 2. Learnability
Other scales
• ASQ: user satisfaction
• SUMI: affect, efficiency, learnability, helpfulness, control
• PSSUQ: system usefulness, information quality, interface quality
• QUIS: overall reaction, learning, terminology, information flow, system output, system characteristics
• UMUX: efficiency, effectiveness, satisfaction
Identified Limitations
1. Designed primarily to evaluate products or systems rather than info-visualizations
2. Assessed with metrics primarily associated with ease of use (e.g., learnability) and efficiency; less focus on continued engagement
Identified Needs
1. Address the unique needs of eco-feedback displays (as opposed to systems or products)
2. Incorporate validated sub-scales for ease of use and engagement
UPscale (Usability Perception)
Testing UPscale
• Online survey (Mechanical Turk)
• 1,103 respondents
• Part of a larger study testing framing messages and info-visualization
Results
1. Factor Structure
2. Reliability
• Overall scale (α=.85)
• Ease of use (α=.84)
• Engagement (α=.83)
3. Criterion Validity: behavioral intention (p<.001)
• Overall scale (r=.536)
• Ease of use (r=.213)
• Engagement (r=.685)
4. Sensitivity
• Image type: full scale (F=3.616, p=.001), ease of use subscale (F=6.411, p<.001), engagement subscale (F=1.744, p=.095)
• Demographic variables: Full scale (age, environmentalism); Engagement (gender, age, environmentalism, income); Ease of use (none)
Closing Thoughts “If you do not know how to ask the right question, you know nothing.”
– W. Edwards Deming
Beth Karlin Transformational Media Lab
Rebecca Ford Center for Sustainability
Thank You!
A Theoretical Approach
Program x → Outcome y
• Hypothesis / theory clearly defined and operationalized
• Metrics tested for reliability & validity
How and for whom does program x lead to outcome y?