Evaluating How Developers Use General-Purpose Web-Search for Code Retrieval

Date: May 29, 2018

Md Masudur Rahman, Jed Barson, Sydney Paul, Joshua Kayan, Federico Andres Lois, Sebastian Fernandez Quezada, Christopher Parnin, Kathryn T. Stolee, Baishakhi Ray

Transcript of Evaluating How Developers Use General-Purpose Web-Search for Code Retrieval

Page 1:

Evaluating How Developers Use General-Purpose Web-Search for Code Retrieval

Date: May 29, 2018

Md Masudur Rahman, Jed Barson, Sydney Paul, Joshua Kayan, Federico Andres Lois, Sebastian Fernandez Quezada, Christopher Parnin, Kathryn T. Stolee, Baishakhi Ray

Page 2:

Coding Task

Convert a date string to a time object

Page 3:

string to time

Page 4:

string to time

Search Log
string to time

Page 5:

string to time

Java

Search Log
string to time

Page 6:

string to time using java

Search Log
string to time

Page 7:

string to time using java

Search Log
string to time
string to time using java

Page 9:

string to time using java

DateTime

Search Log
string to time
string to time using java

Page 10:

date string to DateTime using java

Search Log
string to time
string to time using java

Page 11:

date string to DateTime using java

Search Log
string to time
string to time using java
date string to DateTime using java

Page 12:

date string to DateTime using java

Joda Time library

Search Log
string to time
string to time using java
date string to DateTime using java

Page 13:

date string to DateTime using Joda Time library

Search Log
string to time
string to time using java
date string to DateTime using java

Page 14:

date string to DateTime using Joda Time library

Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda…

Page 15:

date string to DateTime using Joda Time library

Page 16:

date string to DateTime using Joda Time library

X

X

X

Page 17:

world cup fixtures

Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda …

Page 18:

world cup fixtures

Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda …
world cup fixtures

Page 19:

place to visit in gothenburg

Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda …
world cup fixtures

Page 20:

place to visit in gothenburg

Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda …
world cup fixtures
place to visit in gothenburg

Page 21:

Code Query

Code

Query
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library

Page 22:

Search Task

Code

Query
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library

Search Task: Convert a date string to a DateTime object using Joda Time library

Page 23:

Code vs Non-code

Code

Query
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library

Non-Code

Query
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

Page 24:

General Purpose Search Engine for Code Retrieval

Code

Query
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library

Non-Code

Query
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

Page 25:

Research Goal

๏ Query characteristics
๏ User behavior

Code

Query
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library

Non-Code

Query
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

Page 26:

Dataset

Query Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

Chrome plugin

Users: 310 (mostly developers)

Total queries: 150K

Consists of code and non-code queries

Page 27:

Dataset

Query Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

No label: Code or Non-code?

Page 28:

Dataset

Query Search Log
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

No label: Code or Non-code?

Query Classifier

Page 29:

Intent-based Query Classification

Page 30:

Code Intent Analysis

Query: javascript function to get mp3 play length

Page 31:

Code Intent Analysis

Query: javascript function to get mp3 play length

CodeScore: ?

Page 32:

Code Intent Analysis

Query: javascript function to get mp3 play length

Token Code Intent

S = set of code-related tags
n = popularity of a tag

Token      javascript  function  to  get  mp3  play  length
CodeScore  17          7         0   6    5    8     3

Query CodeScore: ?

Page 33:

Code Intent Analysis

Query: javascript function to get mp3 play length

Token Code Intent

Token      javascript  function  to  get  mp3  play  length
CodeScore  17          7         0   6    5    8     3

Query Code Intent

CodeScore: 46
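
On this slide the per-token scores add up to the query score (17 + 7 + 0 + 6 + 5 + 8 + 3 = 46). A minimal sketch of that aggregation follows, assuming the query score is a plain sum of token scores. The transcript does not show how each token score is computed from S and n, so the token scores are simply copied from the slide; the function name and the zero default for unknown tokens are illustrative assumptions.

```python
# Token scores copied from the slide. How each score is derived from
# S (code-related tags) and n (tag popularity) is not shown in the
# transcript, so that step is not modeled here.
TOKEN_CODE_INTENT = {
    "javascript": 17, "function": 7, "to": 0, "get": 6,
    "mp3": 5, "play": 8, "length": 3,
}


def query_code_score(query, token_scores):
    """Sum per-token scores; unknown tokens contribute 0 (an assumption)."""
    return sum(token_scores.get(token, 0) for token in query.lower().split())


score = query_code_score(
    "javascript function to get mp3 play length", TOKEN_CODE_INTENT
)
print(score)  # 46
```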

Page 34:

Query Code Score

Query                               Code Score
string to time                      12
string to time using java           20
date string to DateTime using java  22.5
world cup fixtures                  0
messi curly goal                    2.6
place to visit in gothenburg        0

Page 35:

Query Code Score

Query                               Code Score  Label
string to time                      12          ?
string to time using java           20          ?
date string to DateTime using java  22.5        ?
world cup fixtures                  0           ?
messi curly goal                    2.6         ?
place to visit in gothenburg        0           ?

Page 36:

Query Code Score

Query                               Code Score  Label
string to time                      12          ?
string to time using java           20          ?
date string to DateTime using java  22.5        ?
world cup fixtures                  0           ?
messi curly goal                    2.6         ?
place to visit in gothenburg        0           ?

Classifier Evaluation

Threshold = 10

Manually annotated 380 queries

Precision: 87%
Recall: 86%
F1-score: 87%
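
For reference, the F1-score is the harmonic mean of precision and recall. The sketch below applies that standard definition to the rounded percentages on the slide; the function name is illustrative. With these rounded inputs the result is about 0.865, which is consistent with the reported 87% if the unrounded precision and recall were slightly higher.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values from the slide; the unrounded values are not given.
print(round(f1_score(0.87, 0.86), 3))  # 0.865
```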

Page 37:

Query Code Score

Query                               Code Score  Label
string to time                      12          Code
string to time using java           20          Code
date string to DateTime using java  22.5        Code
world cup fixtures                  0           Non-code
messi curly goal                    2.6         Non-code
place to visit in gothenburg        0           Non-code

Classifier Evaluation

Threshold = 10

Manually annotated 380 queries

Precision: 87%
Recall: 86%
F1-score: 87%
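
The labeling step on this slide can be sketched as a simple threshold rule. The threshold of 10 and the example scores come from the slide. Whether a score exactly equal to the threshold counts as code is not stated, so the `>=` comparison below is an assumption, and `label_query` is a hypothetical helper name.

```python
THRESHOLD = 10  # value shown on the slide


def label_query(code_score, threshold=THRESHOLD):
    """Label a query from its code score.

    Assumption: a score equal to the threshold is treated as code.
    """
    return "Code" if code_score >= threshold else "Non-code"


# Scores copied from the slide.
SLIDE_SCORES = {
    "string to time": 12,
    "string to time using java": 20,
    "date string to DateTime using java": 22.5,
    "world cup fixtures": 0,
    "messi curly goal": 2.6,
    "place to visit in gothenburg": 0,
}

for query, score in SLIDE_SCORES.items():
    print(f"{query}: {label_query(score)}")
```

All six labels produced this way agree with the Label column on the slide.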

Page 38:

Query Code Score

Query                               Code Score  Label
string to time                      12          Code
string to time using java           20          Code
date string to DateTime using java  22.5        Code
world cup fixtures                  0           Non-code
messi curly goal                    2.6         Non-code
place to visit in gothenburg        0           Non-code

Classifier Evaluation

Threshold = 10

Manually annotated 380 queries

Precision: 87%
Recall: 86%
F1-score: 87%

Annotated Data

Code: 89K (59%)
Non-code: 61K (41%)

Page 39:

Research Questions

Query Characteristics

RQ1. How do query characteristics differ for code and non-code queries?

User Behavior

RQ2. How do search behaviors vary for code and non-code related queries?

RQ3. How do task sessions vary for code and non-code related search tasks?

Page 40:

Results

Page 41:

RQ1: Query Characteristics

Code queries are often longer (more tokens) than non-code queries

Code
date string to DateTime using java
date string to DateTime using Joda Time library
javascript function to get mp3 play length

Non-code
world cup fixtures
messi curly goal
hotel in gothenburg
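
The length comparison can be illustrated with the example queries on this slide. The sketch below assumes whitespace tokenization, which the transcript does not specify, and `mean_token_count` is a hypothetical helper. It only shows the kind of measurement involved, not the study's actual analysis over the full query log.

```python
def mean_token_count(queries):
    """Average number of whitespace-separated tokens per query."""
    return sum(len(query.split()) for query in queries) / len(queries)


# Example queries copied from the slide.
code_queries = [
    "date string to DateTime using java",
    "date string to DateTime using Joda Time library",
    "javascript function to get mp3 play length",
]
non_code_queries = [
    "world cup fixtures",
    "messi curly goal",
    "hotel in gothenburg",
]

print(mean_token_count(code_queries))      # 7.0
print(mean_token_count(non_code_queries))  # 3.0
```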

Page 42:

RQ1: Query Characteristics

Code
date string to DateTime using java
date string to DateTime using Joda Time library
javascript function to get mp3 play length

Non-code
world cup fixtures
messi curly goal
hotel in gothenburg

Page 44:

RQ1: Query Characteristics

Code Non-code

16K 12K 33K

Code queries use a smaller vocabulary (fewer unique tokens) than non-code queries
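
Vocabulary here means the set of unique tokens across all queries of one kind. The sketch below shows one way to split two vocabularies into code-only, shared, and non-code-only tokens, assuming lowercased whitespace tokens. The helper names are hypothetical and the sample queries are a handful taken from earlier slides, so the counts are toy values and say nothing about the corpus-level figures on this slide.

```python
def vocabulary(queries):
    """Set of unique lowercased whitespace tokens across the queries."""
    return {token for query in queries for token in query.lower().split()}


def vocabulary_split(code_queries, non_code_queries):
    """Return (code-only, shared, non-code-only) token sets."""
    code_vocab = vocabulary(code_queries)
    non_code_vocab = vocabulary(non_code_queries)
    return (
        code_vocab - non_code_vocab,
        code_vocab & non_code_vocab,
        non_code_vocab - code_vocab,
    )


# Toy sample taken from earlier slides, not the real 150K-query log.
code_only, shared, non_code_only = vocabulary_split(
    ["string to time", "string to time using java"],
    ["world cup fixtures", "place to visit in gothenburg", "hotel in gothenburg"],
)
print(len(code_only), len(shared), len(non_code_only))  # 4 1 8
```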

Page 45:

RQ2: Query Search Behavior

Type      Query                                            # terms added  # terms deleted
Code      string to time                                   -              -
Code      string to time using java                        2              -
Code      date string to DateTime using Joda Time library  4              2
Non-code  hotel in gothenburg                              -              -
Non-code  best hotel in gothenburg                         1              -

Page 46:

RQ2: Query Search Behavior

Type      Query                                            # terms added  # terms deleted
Code      string to time                                   -              -
Code      string to time using java                        2              -
Code      date string to DateTime using Joda Time library  4              2
Non-code  hotel in gothenburg                              -              -
Non-code  best hotel in gothenburg                         1              -

Edited query

Page 47:

RQ2: Query Search Behavior

Type      Query                                            # terms added  # terms deleted
Code      string to time                                   -              -
Code      string to time using java                        2              -
Code      date string to DateTime using Joda Time library  4              2
Non-code  hotel in gothenburg                              -              -
Non-code  best hotel in gothenburg                         1              -

Users often add or delete more terms when editing a code query (avg. 2) than when editing a non-code query (avg. 1)
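
Counting added and deleted terms between two consecutive queries can be sketched with set differences. The version below assumes lowercased whitespace tokens and ignores repeated terms; the transcript does not describe the exact comparison used, so counts for some rows (for example the Joda Time query) may differ depending on tokenization and case handling. `term_edits` is a hypothetical helper name.

```python
def term_edits(previous_query, current_query):
    """Return (terms added, terms deleted) between two consecutive queries.

    Assumption: terms are lowercased whitespace tokens compared as sets.
    """
    previous_terms = set(previous_query.lower().split())
    current_terms = set(current_query.lower().split())
    added = current_terms - previous_terms
    deleted = previous_terms - current_terms
    return len(added), len(deleted)


print(term_edits("string to time", "string to time using java"))      # (2, 0)
print(term_edits("hotel in gothenburg", "best hotel in gothenburg"))  # (1, 0)
```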

Page 48:

RQ2: Query Search Behavior

Type  Query                                            # terms added  # terms deleted  Code Score
Code  string to time                                   -              -                12
Code  string to time using java                        2              -                20
Code  date string to DateTime using Joda Time library  4              2                30.5

Page 49:

RQ2: Query Search Behavior

Type  Query                                            # terms added  # terms deleted  Code Score
Code  string to time                                   -              -                12
Code  string to time using java                        2              -                20
Code  date string to DateTime using Joda Time library  4              2                30.5

Users edit queries to increase code intent

Page 50:

RQ3: Task Search Behavior

Task           Queries                                          # queries  Task intent
Code Task      string to time                                   4          Converting a date string to a Time object
               string to time using java
               date string to DateTime using Joda Time library
Non-code Task  hotel in gothenburg                              2          Hotel booking in Gothenburg
               best hotel in gothenburg

More queries are required to complete a code task

Page 51:

RQ3: Task Search Behavior

Task           Queries                                          Task intent                                Search duration (minutes)  # web visits
Code Task      string to time                                   Converting a date string to a Time object  6                          15
               string to time using java
               date string to DateTime using Joda Time library
Non-code Task  hotel in gothenburg                              Hotel booking in Sweden                    2                          5
               hotel in stockholm

More time and more website visits are required to complete code-related tasks
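
A task session groups the queries issued for one intent together with the effort spent on it. The sketch below shows one possible way to represent sessions and average their effort per task kind. The `TaskSession` type and `summarize` helper are hypothetical, and the two sessions hold only the example values from this slide, not the study's data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskSession:
    kind: str            # "code" or "non-code"
    intent: str
    queries: list
    duration_min: float  # search duration in minutes
    web_visits: int


def summarize(sessions, kind):
    """Average duration and web visits over sessions of one kind."""
    chosen = [s for s in sessions if s.kind == kind]
    return {
        "avg_duration_min": mean(s.duration_min for s in chosen),
        "avg_web_visits": mean(s.web_visits for s in chosen),
    }


# Example values copied from the slide.
SESSIONS = [
    TaskSession(
        "code", "Converting a date string to a Time object",
        ["string to time", "string to time using java",
         "date string to DateTime using Joda Time library"],
        6, 15,
    ),
    TaskSession(
        "non-code", "Hotel booking in Sweden",
        ["hotel in gothenburg", "hotel in stockholm"],
        2, 5,
    ),
]

print(summarize(SESSIONS, "code"))
print(summarize(SESSIONS, "non-code"))
```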

Page 52:

Summary

General Search Engine

Code Non-Code

Code queries are linguistically different
Users modify code queries more often
Users spend significantly more effort on code tasks

Page 53:

Summary

General Search Engine

Code Non-Code

Code queries are linguistically different
Users modify code queries more often
Users spend significantly more effort on code tasks

Code search is less effective

Page 54:

Summary

General Search Engine

Code Non-Code

Code queries are linguistically different
Users modify code queries more often
Users spend significantly more effort on code tasks

Code search is less effective

Special treatment is required to improve code retrieval

Page 55:

Questions?

General Search Engine

Code Non-Code

Code queries are linguistically different
Users modify code queries more often
Users spend significantly more effort on code tasks

Code search is less effective

Special treatment is required to improve code retrieval