Get Another Label? Using Multiple, Noisy Labelers Joint work with Victor Sheng and Foster Provost...

Posted on 15-Jan-2016


Get Another Label? Using Multiple, Noisy Labelers

Joint work with Victor Sheng and Foster Provost

Panos Ipeirotis

Stern School of Business, New York University


Motivation

Many tasks rely on high-quality labels for objects:
– relevance judgments
– duplicate database records
– image recognition
– song categorization
– videos

Labeling can be relatively inexpensive, using Mechanical Turk, ESP game …

ESP Game (by Luis von Ahn)


Mechanical Turk Example

“Are these two documents about the same topic?”


Mechanical Turk Example


Motivation

Labels can be used in training predictive models:
– duplicate detection systems
– image recognition
– web search

But: labels obtained from the above sources are noisy, and this directly affects the quality of the learned models.

– How can we know the quality of the annotators?
– How can we know the correct answer?
– How can we best use noisy annotators?


Quality and Classification Performance

[Figure: Accuracy vs. number of examples (Mushroom dataset), one curve per labeling quality Q = 0.5, 0.6, 0.8, 1.0.]

As labeling quality increases, classification quality increases


How to Improve Labeling Quality

Find better labelers
– Often expensive, or beyond our control

Use multiple, noisy labelers: repeated-labeling
– Our focus


Multiple labelers and resulting label quality

Multiple labelers and classification quality

Selective label acquisition

Our Focus: Labeling using Multiple Noisy Labelers


Majority Voting and Label Quality

[Figure: Integrated quality vs. number of labelers (1, 3, 5, …, 13), one curve per individual labeler accuracy P = 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0.]

Ask multiple labelers, keep majority label as “true” label

Quality is the probability of the majority label being correct

P is the probability of an individual labeler being correct
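The majority-vote quality curves can be computed exactly with a binomial sum; a minimal sketch, assuming binary labels, independent labelers of equal accuracy, and an odd number of labelers so that ties cannot occur (`majority_quality` is a hypothetical helper name):

```python
from math import comb

def majority_quality(p, n):
    """Probability that the majority vote of n independent labelers,
    each correct with probability p, yields the correct label.
    Assumes binary labels and an odd n (no ties possible)."""
    assert n % 2 == 1, "use an odd number of labelers to avoid ties"
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n // 2 + 1, n + 1))
```

For labelers better than random (p > 0.5) the integrated quality rises toward 1 as labelers are added; for p < 0.5 the majority vote actually gets worse, matching the P = 0.4 curve.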

So…

(Sometimes) the quality obtained from multiple noisy labelers is better than the quality of the best labeler in the set


Multiple noisy labelers improve quality

So, should we always get multiple labels?


Tradeoffs for Classification

Get more labels → improve label quality → improve classification
Get more examples → improve classification

[Figure: Accuracy vs. number of examples (Mushroom dataset), one curve per labeling quality Q = 0.5, 0.6, 0.8, 1.0.]


Basic Labeling Strategies

Get as many data points as possible, one label each

Repeatedly-label everything, same number of times


Repeat-Labeling vs. Single Labeling

P = 0.6 (labeling quality), K = 5 (labels per example)

Repeated

Single

With high noise, repeated labeling is better than single labeling


Repeat-Labeling vs. Single Labeling

P = 0.8 (labeling quality), K = 5 (labels per example)

Repeated

Single

With low noise, getting more (single-labeled) examples is better

Estimating Labeler Quality

(Dawid & Skene, 1979): “multiple diagnoses”

– Assume equal labeler qualities to start
– Estimate “true” labels for the examples
– Estimate the quality of each labeler, given the “true” labels
– Repeat until convergence
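The iteration above can be sketched as a small EM loop. This is a minimal illustration for binary labels only; the function name, the data layout, the uniform class prior, and the single accuracy number per labeler are simplifying assumptions of this sketch (Dawid & Skene estimate full per-labeler confusion matrices and class priors):

```python
from collections import defaultdict

def dawid_skene_binary(labels, n_iters=20):
    """EM sketch in the spirit of Dawid & Skene (1979), binary case.
    labels: dict mapping example_id -> list of (labeler_id, label), label in {0, 1}.
    Returns (posterior Pr{label = 1} per example, accuracy estimate per labeler)."""
    labelers = {w for votes in labels.values() for w, _ in votes}
    # Start from equal qualities: posterior = fraction of positive votes (majority-style).
    post = {ex: sum(l for _, l in votes) / len(votes) for ex, votes in labels.items()}
    for _ in range(n_iters):
        # M-step: accuracy = expected fraction of a labeler's votes agreeing with the "true" label.
        agree, total = defaultdict(float), defaultdict(float)
        for ex, votes in labels.items():
            for w, l in votes:
                agree[w] += post[ex] if l == 1 else (1 - post[ex])
                total[w] += 1
        acc = {w: agree[w] / total[w] for w in labelers}
        # E-step: posterior of the true label given the votes and accuracies (uniform prior).
        for ex, votes in labels.items():
            p1 = p0 = 1.0
            for w, l in votes:
                p1 *= acc[w] if l == 1 else 1 - acc[w]
                p0 *= acc[w] if l == 0 else 1 - acc[w]
            post[ex] = p1 / (p1 + p0)
    return post, acc
```

On data with a consistently wrong labeler, the loop learns a low accuracy for that labeler and discounts its votes, which plain majority voting cannot do.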


Selective Repeated-Labeling

We have seen:
– With noise, and with enough (noisy) examples, getting multiple labels is better than single-labeling

Can we do better?

Selectively allocate the repeated-labeling resources to the data points with the highest uncertainty score, e.g. {+,-,+,+,-,+,+} vs. {+,+,+,+}


Natural Candidate: Entropy

Entropy is a natural measure of label uncertainty:

E({+,+,+,+,+,+})=0 E({+,-, +,-, +,- })=1

Strategy: Get more labels for high-entropy examples

E(S) = −(|S+|/|S|)·log2(|S+|/|S|) − (|S−|/|S|)·log2(|S−|/|S|)

where S+ are the positive labels and S− the negative labels in S
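The formula transcribes directly into code; a minimal sketch (`label_entropy` is a hypothetical helper name):

```python
from math import log2

def label_entropy(pos, neg):
    """Entropy of an observed multiset of labels: pos positive and neg negative votes."""
    n = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:  # 0·log(0) is taken as 0
            p = count / n
            e -= p * log2(p)
    return e
```

A unanimous set like {+,+,+,+,+,+} gives entropy 0, and an evenly split set like {+,−,+,−,+,−} gives entropy 1, as in the examples above.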


What Not to Do: Use Entropy

[Figure: Labeling quality vs. number of labels (waveform, p=0.6) for entropy-based selection (ENTROPY) and uniform round robin (UNF).]

Entropy improves at first, but hurts in the long run

Why Not Entropy?

In the presence of noise, entropy will stay high even after many labels

Entropy is scale invariant
– (3+, 2−) has the same entropy as (600+, 400−)
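The scale invariance is easy to check numerically; a quick sketch (`H` is a hypothetical helper recomputing the entropy of a vote split):

```python
from math import log2

def H(pos, neg):
    # Binary entropy of an observed vote split.
    n = pos + neg
    return -sum((c / n) * log2(c / n) for c in (pos, neg) if c)

# Entropy depends only on the ratio of votes, not their volume, so five
# votes and a thousand votes with the same 3:2 split look equally "uncertain".
assert abs(H(3, 2) - H(600, 400)) < 1e-12
```

This is exactly why entropy cannot tell that (600+, 400−) carries overwhelming evidence for “+” while (3+, 2−) carries almost none.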


Estimating Label Uncertainty (LU)

Observe +’s and –’s and compute Pr{+|obs} and Pr{-|obs}

Label uncertainty = tail of beta distribution

[Figure: Beta probability density function over [0, 1]; S_LU is the tail mass of the posterior beta distribution on the far side of 0.5.]
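The beta-tail score can be computed without any special libraries, since for integer vote counts the Beta CDF reduces to a binomial sum. A sketch assuming a uniform Beta(1, 1) prior on the probability of the positive class (`label_uncertainty` is a hypothetical helper name):

```python
from math import comb

def label_uncertainty(pos, neg):
    """S_LU: tail of the posterior Beta(pos+1, neg+1) on the far side of 0.5.
    Uses the identity I_x(a, b) = Pr{Binomial(a+b-1, x) >= a} for integer a, b,
    so no scipy is needed."""
    a, b = pos + 1, neg + 1
    n = a + b - 1
    cdf_half = sum(comb(n, j) * 0.5**n for j in range(a, n + 1))  # Pr{p <= 0.5}
    return min(cdf_half, 1 - cdf_half)
```

Unlike entropy, which stays near 0.88 for both (7+, 3−) and (14+, 6−), this score falls from about 0.113 to about 0.039 as evidence accumulates at the same split ratio, so well-supported examples stop attracting new labels.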

Label Uncertainty

p=0.7 5 labelers

(3+, 2-) Entropy ~ 0.97


Label Uncertainty

p=0.7 10 labelers

(7+, 3-) Entropy ~ 0.88


Label Uncertainty

p=0.7 20 labelers

(14+, 6-) Entropy ~ 0.88


Comparison


[Figure: Labeling quality vs. number of labels (waveform, p=0.6) for UNF (uniform round robin), MU, LU, and LMU; label uncertainty (LU) outperforms the uniform round-robin baseline.]


Model Uncertainty (MU)

However, labelers are not our only source of labels

A classifier can also give us labels!

Model uncertainty: get more labels for ambiguous/difficult examples

Intuitively: make sure that difficult cases are correct

[Figure: A 2-D cloud of + and − examples on either side of a decision boundary; the examples marked “?” near the boundary are the ambiguous/difficult cases.]


Label + Model Uncertainty

Label and model uncertainty (LMU): avoid examples where either strategy is certain

S_LMU = sqrt(S_LU · S_MU)
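One way to read the combination: the geometric mean is small whenever either factor is small, so an example is skipped as soon as either strategy is certain about it. A one-line sketch assuming the geometric-mean form (`lmu_score` is a hypothetical helper name):

```python
from math import sqrt

def lmu_score(s_lu, s_mu):
    """Combined score: low whenever either the label-uncertainty or the
    model-uncertainty score is low, i.e. whenever either strategy is
    already certain about the example."""
    return sqrt(s_lu * s_mu)
```

An arithmetic mean would not have this property: an example the labels have settled could still rank highly just because the model finds it hard.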

Comparison


[Figure: Labeling quality vs. number of labels (waveform, p=0.6) for UNF (uniform round robin), MU, LU, and LMU; label + model uncertainty (LMU) performs best.]

Model uncertainty alone also improves quality


Classification Improvement

[Figure: Accuracy vs. number of labels (spambase, p=0.6) for UNF, MU, LU, and LMU.]


Conclusions

Gathering multiple labels from noisy labelers is a useful strategy

Under high noise, repeated labeling is almost always better than single labeling

Selective repeated-labeling using label and model uncertainty is even more effective


More Work to Do

Estimating the labeling quality of each labeler

Increased compensation vs. labeler quality

Example-conditional quality issues (some examples more difficult than others)

Multiple “real” labels

Hybrid labeling strategies using “learning-curve gradient”

Other Projects

SQoUT project
Structured Querying over Unstructured Text
http://sqout.stern.nyu.edu
Faceted Interfaces

EconoMining project
The Economic Value of User Generated Content
http://economining.stern.nyu.edu


SQoUT: Structured Querying over Unstructured Text

Information extraction applications extract structured relations from unstructured text

May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, which is in the front line of the world's response to the deadly Ebola epidemic in Zaire , is finding itself hard pressed to cope with the crisis…

Date        Disease Name      Location
Jan. 1995   Malaria           Ethiopia
July 1995   Mad Cow Disease   U.K.
Feb. 1995   Pneumonia         U.S.
May 1995    Ebola             Zaire

Information Extraction System

(e.g., NYU’s Proteus)

Disease Outbreaks in The New York Times


SQoUT: The Questions

Text Databases → Extraction System(s) → Output Tuples

1. Retrieve documents from database/web/archive
2. Process documents
3. Extract output tuples

Questions:
1. How do we retrieve the documents?
2. How do we configure the extraction systems?
3. What is the execution time?
4. What is the output quality?

SIGMOD’06, TODS’07, + in progress

EconoMining Project

Show me the Money!

Applications (in increasing order of difficulty)

Buyer feedback and seller pricing power in online marketplaces (ACL 2007)

Product reviews and product sales (KDD 2007)

Importance of reviewers based on economic impact (ICEC 2007)

Hotel ranking based on “bang for the buck” (WebDB 2008)

Political news (MSM, blogs), prediction markets, and news importance

Basic Idea

Opinion mining an important application of information extraction

Opinions of users are reflected in some economic variable (price, sales)

Some Indicative Dollar Values (Positive / Negative)

Natural method for extracting sentiment strength and polarity

good packaging -$0.56

Naturally captures the pragmatic meaning within the given context

captures misspellings as well

Positive? Negative ?

Thanks!

Q & A?