How to Make Ruby CGI Script Faster - Quick Tricks for Speeding Up CGI -
How to Make Ruby CGI Script Faster
Quick Tricks for Speeding Up CGI Scripts
makoto kuwata
http://www.kuwata-lab.com/
Nagoya RubyKaigi 02
What I'll talk about, and what I won't
I'll talk about
Why is CGI so slow?
How to improve your code
I won't talk about
Scale out
Database
Key Value Store
Or other kool topics
Why Is a CGI Script So Slow?
Example Code
    require 'cgi'
    cgi = CGI.new
    print cgi.header
    odd = false
    ENV.each do |k, v|
      odd = !odd
      klass = odd ? 'odd' : 'even'
      print "<tr class=\"#{klass}\">\n"
      print "  <td>#{k}</td><td>#{v}</td>\n"
      print "</tr>\n"
    end
Benchmark
[Chart: share of total run time per phase. Phases: process invocation, require 'cgi', cgi = CGI.new, render HTML. Values shown: 1.80%, 0.87%, 57.78%, 39.55%.]
Mac OS X 10.6, Ruby 1.8.7-p334, Core2 Duo 2GHz
https://gist.github.com/850390
Why is CGI so slow?
FACT
Process invocation is slow ("process startup is slow" is a fact, but it is not the real truth)
TRUTH
Library loading is much slower (the real truth is that loading libraries is what is slow)
Benchmark of 'require' (Ruby 1.8.7-p334)
[Bar chart: time in ms to require each library, x-axis 0 to 50 ms. Libraries measured: (none), erb, time, uri, fileutils, cgi, tmpdir, pstore, date2, openssl, tempfile, cgi/session, yaml, rexml/document. The longest bar is 40.96 ms.]
Library loading is much slower than process invocation
https://gist.github.com/850386
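A measurement of this kind can be reproduced roughly as follows. This is a minimal sketch and not the script from the gist: `time_require` is a hypothetical helper, and it assumes that starting a fresh interpreter per sample is acceptable. Each number therefore includes process invocation, and the "(none)" case measures invocation alone.

```ruby
require 'rbconfig'

# Hypothetical helper: wall-clock milliseconds needed to start a fresh Ruby
# process that requires +lib+ (or requires nothing when +lib+ is nil).
def time_require(lib = nil)
  script = lib ? "require '#{lib}'" : ''
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  system(RbConfig.ruby, '-e', script) or raise "could not load #{lib}"
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000.0
end

# A fresh process per sample keeps already-loaded libraries from hiding the cost.
[nil, 'time', 'uri', 'fileutils'].each do |lib|
  puts format('%-10s %8.2f ms', lib || '(none)', time_require(lib))
end
```

Subtracting the "(none)" time from the others gives an estimate of the loading cost alone. Averaging several runs would make the numbers steadier.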
Benchmark of 'require' (Ruby 1.9.2-p180)
[Bar chart: time in ms to require each library, x-axis 0 to 120 ms. Libraries measured: (none), erb, time, uri, fileutils, cgi, tmpdir, pstore, date2, openssl, tempfile, cgi/session, yaml, rexml/document. The longest bar is 109.42 ms.]
https://gist.github.com/850386
Benchmark of 'require' (Ruby 1.8.7-p334 vs. 1.9.2-p180)
[Bar chart: the same libraries with both Ruby versions side by side, x-axis 0 to 120 ms.]
1.9 is considerably slower than 1.8 for both process invocation and library loading
Case Study: tmpdir
tmpdir.rb requires fileutils.rb, which makes it slow to load
    ## before
    require 'fileutils'
    def Dir.mktmpdir()
      ....
      if block_given?
        ...
        FileUtils.remove_entry_secure path
        ...
Case Study: tmpdir
Change to require only when necessary
    ## after
    def Dir.mktmpdir()
      ....
      if block_given?
        ...
        require 'fileutils' \
          unless defined?(FileUtils)
        FileUtils.remove_entry_secure path
        ...
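The same require-on-demand pattern can be tried outside tmpdir.rb. The following is a minimal sketch under stated assumptions: `discard_dir` and its `keep` flag are hypothetical names made up for illustration, not part of any library.

```ruby
# Hypothetical helper: the common path returns early and never pays
# the cost of loading fileutils.rb.
def discard_dir(path, keep = true)
  return false if keep
  # Load fileutils.rb only on the branch that actually needs it.
  require 'fileutils' unless defined?(FileUtils)
  FileUtils.remove_entry_secure(path)
  true
end
```

A script that only ever calls `discard_dir(path)` with the default flag never loads fileutils.rb through this helper.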
Benchmark: loading tmpdir.rb (ms)
before  17.84
after    6.99
Case Study: delegate
DelegateClass makes library loading slow
    ## cgi.rb
    class Cookie < DelegateClass(Array)
      ...
    ## tempfile.rb
    class Tempfile < DelegateClass(File)
      ...
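The slides do not show what replaced DelegateClass. One possibility, shown here purely as an assumption, is to inherit from the wrapped class directly, so that no delegating methods have to be generated when the file is loaded. `SimpleCookie` is a hypothetical class for illustration and is not the actual cgi.rb code.

```ruby
# Hypothetical cookie class that subclasses Array directly
# instead of going through DelegateClass(Array).
class SimpleCookie < Array
  attr_reader :name

  def initialize(name, *values)
    super(values)   # the cookie values become the array elements
    @name = name
  end

  def to_s
    "#{name}=#{join('&')}"
  end
end

cookie = SimpleCookie.new('lang', 'ja', 'en')
puts cookie.to_s    # Array methods such as size and first work unchanged
```

Direct inheritance is not always a drop-in replacement for delegation, so this only sketches the direction.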
Benchmark: load time (ms)
              Using DelegateClass   Without DelegateClass
cgi.rb        22.53                 19.29
tempfile.rb   17.31                 14.17
Case Study: openssl
openssl pollutes other libraries: it is slow to load, so every library that requires it becomes slow too
    ## cgi/session.rb
    def create_new_id
      require 'securerandom'
      session_id = SecureRandom.hex(16)
      ...
    ## securerandom.rb
    require 'openssl'
Case Study: openssl
Calculate the session id without securerandom.rb
    require 'digest/sha2'
    PRIVATE = '431e077067178a6dd061f1e2ab'
    def create_new_id()
      # Mix random numbers, the current time, and the process id into a seed
      seed = [rand(), rand(), rand(), rand(), Time.now.to_f, $$].pack('ddddds')
      Digest::SHA256.hexdigest(seed + PRIVATE)
    end
How to Improve Your Code?
How to improve?
    ESCAPE_ = {
      "&" => "&amp;",
      "<" => "&lt;",
      ">" => "&gt;",
      '"' => "&quot;",
    }
    def escape_html(s)
      s.to_s.gsub(/[&<>"]/) {|c| ESCAPE_[c] }
    end
    alias h escape_html
Answer
    ESCAPE_ = { .... }
    # one pass, with a block call per matched character
    def h(s)
      s.to_s.gsub(/[&<>"]/) {|c| ESCAPE_[c] }
    end
    # four passes, each with a plain string replacement
    def escape_html(s)
      s.to_s.gsub(/&/, "&amp;").
             gsub(/>/, "&gt;").
             gsub(/</, "&lt;").
             gsub(/"/, "&quot;")
    end
Benchmark (Ruby 1.8.7-p334)
[Bar chart: h() vs. escape_html() for the inputs s1, s2, and s3, x-axis 0 to 50.]
s1: 0 HTML special characters
s2: 5 HTML special characters
s3: 15 HTML special characters
h() is much slower than escape_html() when the input contains '&<>"' characters
https://gist.github.com/850396
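To repeat this comparison on another Ruby version, something like the following can be used. It is a rough sketch and not the gist's script: `h_block`, `h_chain`, `seconds_for`, and the sample string are names and data made up here. Results depend heavily on the Ruby version, and the slide's numbers apply to 1.8.7-p334 only.

```ruby
ESCAPE_TABLE = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }

# Block form: one pass, but the block runs once per matched character.
def h_block(s)
  s.to_s.gsub(/[&<>"]/) { |c| ESCAPE_TABLE[c] }
end

# Chained form: four passes, each with a plain string replacement.
# '&' must be replaced first so the other entities are not escaped twice.
def h_chain(s)
  s.to_s.gsub(/&/, '&amp;').gsub(/>/, '&gt;').gsub(/</, '&lt;').gsub(/"/, '&quot;')
end

# Hypothetical timing helper: seconds taken by +n+ calls of the block.
def seconds_for(n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

sample = '<a href="/search?q=1&r=2">result</a>'
puts format('block: %.4f s', seconds_for(10_000) { h_block(sample) })
puts format('chain: %.4f s', seconds_for(10_000) { h_chain(sample) })
```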
How to improve?
    def CGI::parse(query)
      params = {}
      query.split(/[&;]/n).each do |pairs|
        key, val = pairs.split(/=/, 2) \
                     .collect {|v| CGI::unescape(v) }
        (params[key] ||= []) << val
      end
      params
    end
Answer
    def CGI::parse(query)
      params = {}
      query.split(/[&;]/n).each do |pairs|
        key, val = pairs.split(/=/, 2)
        key = CGI::unescape(key)
        val = CGI::unescape(val)
        (params[key] ||= []) << val
      end
      params
    end
Answer
    def CGI::parse(query)
      params = {}
      query.split(/[&;]/n).each do |pairs|
        key, val = pairs.split(/=/, 2)
        key = CGI::unescape(key) if key =~ /%/
        val = CGI::unescape(val)
        (params[key] ||= []) << val
      end
      params
    end
Ignore '+' :)
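The last version skips unescaping when there is nothing to unescape. A self-contained sketch of that idea follows; it is not the cgi.rb code. `parse_query` is a hypothetical name, `URI.decode_www_form_component` stands in for `CGI::unescape`, and unlike the slide it also checks for '+' so that form-encoded spaces still decode.

```ruby
require 'uri'

# Hypothetical parser: decode a token only when it contains an escape.
def parse_query(query)
  params = {}
  query.split(/[&;]/).each do |pair|
    next if pair.empty?
    key, val = pair.split('=', 2)
    key = URI.decode_www_form_component(key) if key =~ /[%+]/
    val = URI.decode_www_form_component(val) if val && val =~ /[%+]/
    (params[key] ||= []) << val
  end
  params
end

p parse_query('a=1&b=x%20y&a=2;c=p+q')
```

Most keys in a typical query string are plain ASCII, so the regexp test is usually cheaper than an unconditional decode call.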
Benchmark
[Bar charts: parse1(), parse2(), and parse3() on Ruby 1.8.7-p334 and on Ruby 1.9.2-p180, x-axis 0 to 8.]
https://gist.github.com/850402
How to improve?
    <form action="/" method="post">
      <p>Name: <input name="user[123][name]"></p>
      <p>Mail: <input name="user[123][mail]"></p>
      <p>Items: <input name="user[123][items][]">
                <input name="user[123][items][]"></p>
      <p><input type="submit"></p>
    </form>
Answer
    <form action="/" method="post">
      <p>Name: <input name="user.123.name"></p>
      <p>Mail: <input name="user.123.mail"></p>
      <p>Items: <input name="user.123.items[]">
                <input name="user.123.items[]"></p>
      <p><input type="submit"></p>
    </form>
Example Code
    def parse_dotted_query(qs, d=nil)
      params = {}
      rexp = d ? /[#{d}] */n : DEFAULT_SEP
      (qs || '').split(rexp).each do |p|
        k, v = unescape(p).split('=', 2)
        normalize_dotted_params(params, k, v)
      end
      return params
    end
    DEFAULT_SEP = /[&;] */n
    def normalize_dotted_params(params, k, v)
      items = k.split(/\./)
      hash = params
      # walk down (or create) one nested hash per dotted segment
      items[0...-1].each do |item|
        if hash[item].is_a?(Hash)
          hash = hash[item]
        else
          hash = hash[item] = {}
        end
      end
      item = items[-1]
      #if item.end_with?('[]')
      if item =~ /\[\]\z/
        # a trailing '[]' collects repeated values into an array
        item = item[0...-2]
        if hash[item].is_a?(Array)
          hash[item] << v
        else
          hash[item] = [v]
        end
      else
        hash[item] = v
      end
    end
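To see what structure the dotted names produce, here is a condensed, self-contained restatement of the same approach. It is a sketch, not the original code: `parse_dotted` and `store_dotted` are hypothetical names, and `URI.decode_www_form_component` is assumed as a stand-in for the `unescape` helper, whose definition is not shown on the slide.

```ruby
require 'uri'

# Hypothetical helper: store +value+ under a dotted +key+ such as "user.123.name".
def store_dotted(params, key, value)
  *parents, leaf = key.split('.')
  hash = params
  parents.each do |name|
    hash = hash[name].is_a?(Hash) ? hash[name] : (hash[name] = {})
  end
  if leaf.end_with?('[]')
    name = leaf[0...-2]
    if hash[name].is_a?(Array)
      hash[name] << value
    else
      hash[name] = [value]
    end
  else
    hash[leaf] = value
  end
end

# Hypothetical parser: unescape each pair once, then split it on the first '='.
def parse_dotted(query)
  params = {}
  (query || '').split(/[&;] */).each do |pair|
    key, value = URI.decode_www_form_component(pair).split('=', 2)
    next if key.nil? || key.empty?
    store_dotted(params, key, value)
  end
  params
end

p parse_dotted('user.123.name=Foo&user.123.items%5B%5D=a&user.123.items%5B%5D=b')
```

Splitting on a literal '.' needs no bracket matching, which is where the simplification over the `user[123][name]` style comes from.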
Benchmark
[Bar charts: parse_nested_query() vs. parse_dotted_query() on Ruby 1.8.7-p334 and on Ruby 1.9.2-p180, x-axis 0 to 14.]
https://gist.github.com/850407
How to improve?
    cgi = CGI.new
    upfile = cgi['file']
    fpath = "up/" + upfile.original_filename
    File.open(fpath, 'wb') do |f|
      while s = upfile.read(4096)
        f.write(s)
      end
    end
    upfile.close()
Renaming is faster than reading the temporary file and writing it out again
Answer
    cgi = CGI.new
    upfile = cgi['file']
    fpath = "up/" + upfile.original_filename
    if upfile.local_path    # when Tempfile
      File.rename(upfile.local_path, fpath)
    else                    # when StringIO
      File.open(fpath, 'wb') do |f|
        f.write(upfile.read())
      end
    end
Moving a temporary file that lives on another partition is too expensive
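The partition problem shows up in code as well: `File.rename` raises `Errno::EXDEV` when source and destination are on different file systems. A minimal sketch of a rename with a copy fallback follows; `move_upload` is a hypothetical helper and not part of cgi.rb.

```ruby
require 'fileutils'

# Hypothetical helper: rename when possible, copy only when the
# destination is on another file system.
def move_upload(src, dest)
  File.rename(src, dest)
  :renamed
rescue Errno::EXDEV
  # Cross-device move: the data has to be copied after all.
  FileUtils.cp(src, dest)
  File.unlink(src)
  :copied
end
```

If the helper returns `:copied` for every upload, the temporary directory and the upload directory are on different partitions and the rename is not saving anything.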
Answer
    ENV['TMPDIR'] = '/home/username/tmp'
    require 'cgi'
    cgi = CGI.new
    upfile = cgi['file']
    puts upfile.local_path   #=> "/home/username/tmp/CGI201102-1671"
Set $TMPDIR before loading cgi.rb, and make sure the directory is writable
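Whether a given directory is really picked up can be checked without cgi.rb, because `Tempfile` takes its default directory from `Dir.tmpdir`, which consults `ENV['TMPDIR']`. The following is a small sketch; `tempfile_dir_with_tmpdir` is a hypothetical helper written for this check only.

```ruby
require 'tmpdir'
require 'tempfile'

# Hypothetical check: report the directory a Tempfile lands in
# while TMPDIR points at +dir+, then restore the old setting.
def tempfile_dir_with_tmpdir(dir)
  saved = ENV['TMPDIR']
  ENV['TMPDIR'] = dir
  file = Tempfile.new('upload')
  File.dirname(file.path)
ensure
  file.close! if file      # close and delete the temporary file
  ENV['TMPDIR'] = saved    # assigning nil removes the variable again
end
```

If the result is not the directory that was passed in, the directory is probably missing or not writable, and `Dir.tmpdir` has fallen back to another location.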
How to improve?
    def read_multipart(boundary, cont_len)
      while s = stdin.read(10*1024)
        while s =~ boundary
          ...
C/Java style coding is slow in Ruby (writing Ruby the way you would write C or Java does not make it fast; rewrite it in the style Ruby is good at)
Answer
    def read_multipart(boundary, cont_len)
      max = 10 * 1024 * 1024
      if cont_len <= max
        # small request: read everything at once and let split do the work
        items = stdin.read().split(boundary)
        ...
      else
        while s = stdin.read(10*1024)
          while s =~ boundary
            ...
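The two branches can be sketched in runnable form as follows. This is only an illustration of the read-then-split idea: `split_parts` is a hypothetical function, the boundary is assumed to be a plain string, and real multipart parsing (part headers, CRLF handling, the closing delimiter) is left out.

```ruby
require 'stringio'

# Hypothetical splitter: one read plus String#split for small bodies,
# chunked reading with a carry-over buffer for large ones.
def split_parts(io, boundary, content_length, max = 10 * 1024 * 1024, chunk = 10 * 1024)
  if content_length <= max
    parts = io.read.split(boundary, -1)
    parts.pop if parts.last == ''
    return parts
  end
  parts = []
  buf = ''.dup
  while (s = io.read(chunk))
    buf << s
    pieces = buf.split(boundary, -1)
    buf = pieces.pop          # may end in a partial boundary, so keep it
    parts.concat(pieces)
  end
  parts << buf unless buf.empty?
  parts
end

body = 'a--b--c'
p split_parts(StringIO.new(body), '--', body.size)
```

The small-body branch hands the whole job to `String#split`, which runs in C inside the interpreter, while the chunked branch keeps memory bounded for large uploads.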
Conclusion
Is it really Ruby's fault?
There is a lot of code out there that is not very efficient
If your code is slow, it is because of you, not because of Ruby.
one more thing...
Benchmarker
Benchmark utility
Repeat benchmarks, average results, ...
gem install benchmarker
CGIAlt, CGIExt
CGIAlt
Fast and cgi.rb compatible
http://cgialt.rubyforge.org/
CGIExt
Implemented in C
http://cgiext.rubyforge.org/
thank you