Anyway, Let's Talk About Search

Post on 12-Jul-2015


Anyway, Let's Talk About Search

Andy Dai andy@dorm7.com

What is search?

Find documents related to Django

$ grep -ri django *

from django.db.models import Q
docs = Document.objects.filter(Q(title__icontains="django") | Q(body__icontains="django"))

Not Scalable!

This problem was actually solved a long time ago

Inverted Index

[Diagram: an inverted index. Each term (django, flask, python, web, ruby) points to a posting list of the files that contain it, drawn from file1 to file4.]
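As a rough sketch of the data structure in the diagram, an inverted index can be held in a dictionary that maps each term to the set of documents containing it. The corpus below is hypothetical (the file contents are made up for illustration) and `build_index` is an illustrative helper, not code from the talk.

```python
from collections import defaultdict

# Hypothetical toy corpus; file names and contents are made up for illustration.
corpus = {
    "file1": "django web",
    "file2": "flask web",
    "file3": "ruby web",
    "file4": "python django web",
}


def build_index(docs):
    """Map each term to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for term in text.lower().split():
            index[term].add(name)
    return index


index = build_index(corpus)
print(sorted(index["django"]))
```

Looking up a term is now a single dictionary access, instead of a scan over every document as with `grep` or `icontains`.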

Search (python AND django)

[Diagram: the same inverted index (terms django, flask, python, web, ruby; files file1 to file4), with file4 shown separately, apparently as the query result.]
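A minimal sketch of how an AND query can be answered from such an index: fetch the posting list of each term and intersect them. The posting lists below are assumptions chosen for illustration, since the exact term-to-file mapping on the slide is not recoverable, and `search_and` is a hypothetical helper name.

```python
# Hypothetical posting lists; the slide's exact mapping is not recoverable.
index = {
    "django": {"file1", "file4"},
    "python": {"file2", "file4"},
    "web": {"file1", "file2", "file3", "file4"},
}


def search_and(index, *terms):
    """Return the documents that contain every one of the given terms."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


result = search_and(index, "python", "django")
print(sorted(result))
```

A term that is missing from the index contributes an empty posting list, so the whole AND query returns no documents.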

Phrase Search

django: file1 (1, 3)

web: file1 (2), file2 (3, 4), file3 (41), file4 (53)

Search (“django web”)

django: file1 (1, 3)

web: file1 (2), file2 (3, 4), file3 (41), file4 (53)
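The positions stored next to each file are what make phrase search possible. The sketch below uses the posting data as read from the slide and checks whether the terms of a phrase occur at consecutive positions in the same file. `phrase_search` is a hypothetical helper written for illustration, not code from the talk.

```python
# Positional postings as read from the slide: term -> {file: [positions]}.
index = {
    "django": {"file1": [1, 3]},
    "web": {"file1": [2], "file2": [3, 4], "file3": [41], "file4": [53]},
}


def phrase_search(index, phrase):
    """Return files in which the phrase's terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Only files that contain every term can match.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for name in sorted(candidates):
        # A start position p matches if term i appears at p + i for every i.
        if any(
            all(p + i in index[t][name] for i, t in enumerate(terms))
            for p in index[terms[0]][name]
        ):
            hits.append(name)
    return hits


print(phrase_search(index, "django web"))
```

With this data, only file1 matches "django web": "django" sits at position 1 and "web" at position 2.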

Building Inverted Index

“Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.”

django high level python web framework encourage rapid development clean pragmatic design

What we do

• Split into tokens

• Drop stop words

• Normalize

  • Lowercase

  • Stemming
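The steps above can be sketched as a small analysis function. This is an illustration under stated assumptions: the stop-word list is a tiny made-up set, and `stem` is a naive stand-in that only strips a trailing "s". A real system would use a proper stemmer such as Porter's.

```python
import re

# Tiny illustrative stop-word list (an assumption, not from the talk).
STOP_WORDS = {"a", "an", "and", "is", "that", "the"}


def stem(word):
    """Naive stand-in for a real stemmer: strip a single trailing 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def analyze(text):
    """Turn raw text into the list of terms that would be indexed."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)                    # split into tokens
    tokens = [t for t in tokens if t.lower() not in STOP_WORDS]   # drop stop words
    return [stem(t.lower()) for t in tokens]                      # lowercase + stem


sentence = (
    "Django is a high-level Python Web framework that encourages "
    "rapid development and clean, pragmatic design."
)
print(" ".join(analyze(sentence)))
```

Run on the example sentence, this produces the same twelve terms listed on the slide. The query string should go through the same function, so that a search for "Encourages" matches the indexed term "encourage".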

Congratulations! Now you can build your own search

Wasn't today's talk supposed to be about Django + ElasticSearch?

ElasticSearch

• Open Source

• Distributed

• Document-based

• JSON over HTTP
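To make "JSON over HTTP" concrete, here is a hedged sketch that builds a search request using only the Python standard library. The node address (`localhost:9200`, the default port), the index name `docs` and the field name `body` are all assumptions for illustration. The request is only constructed, not sent, because sending it needs a running node.

```python
import json
import urllib.request

# Assumptions: a local node on the default port and a hypothetical index "docs"
# whose documents have a "body" field.
url = "http://localhost:9200/docs/_search"
query = {"query": {"match": {"body": "django"}}}

request = urllib.request.Request(
    url,
    data=json.dumps(query).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending requires a running ElasticSearch node, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["hits"]["hits"]
print(request.get_method(), request.full_url)
```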

Multiple Backends

• ElasticSearch

• Whoosh

• Solr

• Xapian

$ pip install django-haystack
$ pip install elasticsearch

Configuration

• settings.py

• search_indexes.py

• search template for document
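For the first item, a settings fragment might look like the sketch below. It assumes django-haystack 2.x with its ElasticSearch backend and a node running locally. The URL and index name are placeholders, and the actual values used in the talk's repository may differ.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    # ... your existing Django apps ...
    "haystack",
]

HAYSTACK_CONNECTIONS = {
    "default": {
        # Backend class shipped with django-haystack 2.x.
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
        "URL": "http://127.0.0.1:9200/",  # assumed local node
        "INDEX_NAME": "haystack",         # assumed index name
    },
}
```

Because Haystack supports multiple backends, switching to another one such as Whoosh is mostly a matter of changing `ENGINE` and its backend-specific options.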

Let's go straight to the code

https://github.com/daikeren/es_test

Q&A