
Modern On-Page Factors
SMX Advanced

Matthew Peters, PhD
matt@moz.com / @mattthemathman

“philadelphia phillies”

“Relevance” vs “Ranking”

Conceptually, “relevance” determination and “ranking” can be thought of as two different steps (even if they are implemented as one in a search engine):

1. Relevance
2. Ranking
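
A minimal sketch of this two-step view, assuming a simple term-overlap relevance test and a placeholder scoring function (neither is from the deck; real engines may fuse both steps into one pass):

    def is_relevant(query, doc):
        """Step 1 (relevance): does the document mention every query term?"""
        return all(term in doc.lower() for term in query.lower().split())

    def search(query, index, score):
        relevant = [doc for doc in index if is_relevant(query, doc)]
        # Step 2 (ranking): order the relevant set by a scoring function.
        return sorted(relevant, key=lambda doc: score(query, doc), reverse=True)

    index = ["Philadelphia Phillies schedule",
             "Phillies cheesesteak recipes",
             "Philadelphia Phillies roster and news"]
    print(search("philadelphia phillies", index,
                 score=lambda q, d: len(d)))  # placeholder score, not a real model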

Is this page relevant to “philadelphia phillies”?

query-body similarity: 0.74
query-title similarity: 0.8
query-H1 similarity: 1.0
etc …
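
The per-field scores above treat the body, title, and H1 as separate “documents”. A hedged sketch of that extraction step using BeautifulSoup (a library choice of this transcript, not the deck); the similarity scoring itself is sketched in the next section:

    from bs4 import BeautifulSoup

    html = """<html><head><title>Philadelphia Phillies tickets and news</title></head>
    <body><h1>Philadelphia Phillies</h1><p>Game recaps, scores, roster ...</p></body></html>"""

    soup = BeautifulSoup(html, "html.parser")
    fields = {
        "title": soup.title.get_text() if soup.title else "",
        "h1": soup.h1.get_text() if soup.h1 else "",
        "body": soup.get_text(" ", strip=True),
    }
    print(fields)  # each field can now be scored against the query separately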

Measuring query-document similarity

Goal: given a query + document string, compute “similarity”.

See “Introduction to Information Retrieval” by Manning et al.: http://nlp.stanford.edu/IR-book/
> 700 papers

Example: the query “philadelphia phillies” passes through a Query Model, the page through a Document Model, and a Scoring function combines the two into a similarity score (0.74 here).

Query Model: tokenization, normalization (stemming), query expansion, intent
Document Model: tokenization, normalization (stemming), vector space representation, language model

In this context “document” can also refer to the title tag, meta description, H1, etc.
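
A minimal sketch of one standard scoring function from the IR literature, cosine similarity over TF-IDF vectors (the deck names TF-IDF later but does not specify an implementation; scikit-learn and the example documents are assumptions here):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    documents = [
        "The Philadelphia Phillies are a professional baseball team ...",
        "A page about walking tours of historic Philadelphia ...",
    ]
    query = "philadelphia phillies"

    # Fit vocabulary and IDF weights on the documents, then project the
    # query into the same vector space.
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(documents)
    query_vector = vectorizer.transform([query])

    # One score per document; higher means more similar to the query.
    print(cosine_similarity(query_vector, doc_vectors)[0])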

Query representation

Language identification
Word segmentation (Japanese, Chinese)
Tokenization + normalization: {reviews, reviewer, reviewing} -> review
Spelling correction
Query expansion
User intent (transactional, navigational, informational), local
Classification (images, video, news)
Topic Model (LDA)
Entity extraction
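
A quick sketch of the tokenization + normalization example above, using NLTK’s Porter stemmer (the stemmer evaluated later in the deck; NLTK itself is an assumption of this transcript):

    from nltk.stem import PorterStemmer

    stemmer = PorterStemmer()
    for word in ["reviews", "reviewer", "reviewing"]:
        print(word, "->", stemmer.stem(word))
    # All three stem to "review", so they match the same normalized query term.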

Document representation

TF-IDF

Language Model: P(optimization | search, engine) >> P(walking | search, engine)

Probability Ranking Principle: P(R = 1 | d, q) or P(R = 0 | d, q)
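
A toy sketch of the language-model intuition on the slide: text about search engines makes “optimization” far more probable than “walking” after “engine”. This maximum-likelihood bigram model is illustrative only (the deck’s two-word conditioning and training data are not specified):

    from collections import Counter, defaultdict

    text = ("search engine optimization helps a search engine rank pages "
            "for search engine users").split()

    # Count how often each word follows each previous word.
    bigram_counts = defaultdict(Counter)
    for prev, word in zip(text, text[1:]):
        bigram_counts[prev][word] += 1

    def p_next(word, prev):
        """Maximum-likelihood P(word | prev); zero for unseen bigrams."""
        total = sum(bigram_counts[prev].values())
        return bigram_counts[prev][word] / total if total else 0.0

    print(p_next("optimization", "engine"))  # 1/3 in this toy corpus
    print(p_next("walking", "engine"))       # 0.0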

Which method performs best?

What are the characteristics of sites that rank highly?

14,000+ keywords
Top 50 results
600,000 URLs
Google-US, no personalization
March 2013

Mean Spearman Correlation

Remember: “correlation is not causation”
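
A hedged sketch of the study’s core measurement, the Spearman rank correlation between a factor and search position, computed per keyword with scipy (the numbers below are invented, not the study’s data):

    from scipy.stats import spearmanr

    # For one keyword: a factor's scores for the top 10 results, in SERP order.
    factor_scores = [0.97, 0.95, 0.91, 0.88, 0.90, 0.71, 0.65, 0.60, 0.55, 0.40]
    positions = list(range(1, 11))  # 1 = best position

    rho, p_value = spearmanr(factor_scores, positions)
    print(rho)  # negative here: higher scores sit at better (lower) positions

    # The study then averages this per-keyword rho over 14,000+ keywords.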

We tried a few different types of smoothing for the language model; Dirichlet smoothing worked best (Zhai and Lafferty, SIGIR 2001).
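
For reference, the standard Dirichlet-smoothed estimate from the cited paper (the deck names the method but does not show the formula):

    % Dirichlet smoothing (Zhai & Lafferty, SIGIR 2001):
    % tf(w, d) = count of w in document d, |d| = document length,
    % P(w | C) = collection-wide probability of w, mu = smoothing parameter.
    P(w \mid d) = \frac{\mathrm{tf}(w, d) + \mu \, P(w \mid C)}{\lvert d \rvert + \mu}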

Impact of stemming

The Porter stemmer provided a slight increase in correlations. These correlations are still relatively low compared to other factors.

Experiment: “movie reviews”

For each query: 500 pages
- 50 results (10% relevant)
- 450 random pages (90% irrelevant)

Top 50 ranked by PA:

URL ID | PA | In SERP?
86     | 92 | 1
355    | 90 | 0
…      | …  | …
27     | 18 | 0

Top 50 ranked by Language Model:

URL ID | Language Model | In SERP?
213    | 0.97           | 1
156    | 0.95           | 1
…      | …              | …
355    | 0.06           | 0

P@50 is the “Precision of the top 50 results”: the percentage of the top 50 results by PA / Language Model that are actually in the SERP.
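
A minimal sketch of the P@50 computation just described, with invented data standing in for the 500 pages per query:

    import random

    def precision_at_k(scored_pages, k=50):
        """scored_pages: (score, in_serp) pairs; in_serp is 1 if in the SERP."""
        top_k = sorted(scored_pages, key=lambda pair: pair[0], reverse=True)[:k]
        return sum(in_serp for _, in_serp in top_k) / k

    # Toy setup mirroring the slide: 50 SERP pages + 450 random pages,
    # scored by a hypothetical model that tends to score SERP pages higher.
    random.seed(0)
    pages = [(random.random() + 0.5 * in_serp, in_serp)
             for in_serp in [1] * 50 + [0] * 450]
    print(precision_at_k(pages, k=50))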

Takeaways

Implication: Query-document similarity is based on decades of research. It’s immune to algorithm change.

Action item: With sophisticated query and document models, there is no need to optimize separately for similar words, e.g. “movie reviews” vs “movie review”.

Action item: Each page is relevant to many different keywords, so optimize each page for a broad set of related keywords instead of a single keyword.

Use case: Content creation. What keywords will this new blog post target? Is it relevant to a set of queries?

Thanks for watching!

Matthew Peters
matt@moz.com / @mattthemathman