How to Make Ruby CGI Script Faster - Quick Tricks for Speeding Up CGI -

makoto kuwata
http://www.kuwata-lab.com/

Nagoya RubyKaigi 02

What I will and won't talk about

I'll talk about

Why is CGI so slow?

How to improve your code?

I won't talk about

Scale out

Database

Key Value Store

Or other cool topics

Why Is CGI Script So Slow?

Example Code

    require 'cgi'
    cgi = CGI.new
    print cgi.header
    odd = false
    ENV.each do |k, v|
      odd = ! odd
      klass = odd ? 'odd' : 'even'
      print "<tr class=\"#{klass}\">\n"
      print "  <td>#{k}</td><td>#{v}</td>\n"
      print "</tr>\n"
    end

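Benchmark

Pie chart: share of total time per phase of the example code
    Phases: Process Invocation, require 'cgi', cgi = CGI.new, render HTML
    Shares as extracted (not matched to phases): 1.80%, 0.87%, 57.78%, 39.55%

Mac OS X 10.6, Ruby 1.8.7-p334, Core2 Duo 2GHz

https://gist.github.com/850390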



Why is CGI so slow?

FACT

Process invocation is slow
(this is a fact, but it is not the real cause)

TRUTH

Library loading is much slower
(the real cause is that loading libraries is slow)

Benchmark of 'require' (ms), Ruby 1.8.7-p334

    (none)            6.15
    erb               9.26
    time             15.9
    uri              16.15
    fileutils        17.26
    cgi              17.31
    tmpdir           17.84
    pstore           19.19
    date2            19.4
    openssl          21.92
    tempfile         22.53
    cgi/session      30.18
    yaml             32.27
    rexml/document   40.96

Library loading is much slower than process invocation
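A rough way to reproduce this kind of measurement is sketched below with the standard Benchmark module. The helper name require_time_ms is hypothetical, and timing inside one running process (rather than starting a fresh ruby per library, as a careful benchmark would) is an assumption made to keep the example short.

```ruby
require 'benchmark'

# Hypothetical helper: time a single `require` in milliseconds.
# A library that is already loaded costs almost nothing to require again,
# so each library should be measured at most once per process.
def require_time_ms(lib)
  Benchmark.realtime { require lib } * 1000.0
end

%w[erb uri fileutils tmpdir].each do |lib|
  printf("%-12s %8.2f ms\n", lib, require_time_ms(lib))
end
```

Libraries that pull in other libraries (tmpdir loading fileutils, for example) will show the combined cost, which is the effect the case studies in this talk look at.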





Benchmark of 'require' (ms), Ruby 1.9.2-p180

    (none)           13.35
    erb              21.58
    time             30.88
    uri              43.42
    fileutils        37.52
    cgi              25.82
    tmpdir           39.07
    pstore           48.8
    date2             0
    openssl          36.16
    tempfile         53.82
    cgi/session      57.38
    yaml             81.61
    rexml/document  109.42





Benchmark of 'require': 1.8.7-p334 and 1.9.2-p180 side by side (ms)

1.9 is much slower than 1.8 at both process invocation and library loading

Case Study: tmpdir

tmpdir.rb requires fileutils.rb, which makes tmpdir.rb slow to load

    ## before
    require 'fileutils'
    def Dir.mktmpdir()
      ....
      if block_given?
        ...
        FileUtils.remove_entry_secure path
        ...

Case Study: tmpdir

Change it to require fileutils only when necessary

    ## after
    def Dir.mktmpdir()
      ....
      if block_given?
        ...
        require 'fileutils' \
          unless defined?(FileUtils)
        FileUtils.remove_entry_secure path
        ...
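The same pattern can be tried outside tmpdir.rb. The following is a minimal sketch, assuming a made-up module named Scratch with a made-up method remove_tree; only the `require ... unless defined?(...)` line is taken from the slide.

```ruby
# Hypothetical module for illustration; not part of tmpdir.rb.
module Scratch
  # Removes a directory tree. fileutils.rb is loaded on the first call only,
  # so a script that never calls this method never pays for loading it.
  def self.remove_tree(path)
    require 'fileutils' unless defined?(FileUtils)
    FileUtils.remove_entry_secure(path)
  end
end
```

The trade-off is that the load cost moves from startup to the first call, which suits CGI scripts where most requests never reach that code path.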

Benchmark: loading tmpdir.rb (ms)

    before   17.84
    after     6.99

Case Study: delegate

Using DelegateClass makes library loading slow

    ## cgi.rb
    class Cookie < DelegateClass(Array)
      ...

    ## tempfile.rb
    class Tempfile < DelegateClass(File)
      ...

Benchmark: load time (ms)

                  Using DelegateClass   Without DelegateClass
    cgi.rb              17.31                  14.17
    tempfile.rb         22.53                  19.29
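Part of this cost comes from DelegateClass itself: when the class expression is evaluated, it defines a forwarding method for each public instance method of the wrapped class. The sketch below, using only delegate.rb and Benchmark from the standard library, is one way to make that work visible. The variable names are illustrative, and absolute timings will differ by machine and Ruby version.

```ruby
require 'benchmark'
require 'delegate'

# Time how long it takes just to build the delegating base classes.
array_ms = Benchmark.realtime { DelegateClass(Array) } * 1000.0
file_ms  = Benchmark.realtime { DelegateClass(File) }  * 1000.0
printf("DelegateClass(Array): %.2f ms\n", array_ms)
printf("DelegateClass(File):  %.2f ms\n", file_ms)

# Methods defined directly on the generated class: the forwarders
# (push, each, ...) plus the __getobj__/__setobj__ accessors.
forwarders = DelegateClass(Array).instance_methods(false)
puts "methods defined for Array: #{forwarders.size}"
```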

Case Study: openssl

openssl pollutes other libraries: it is slow to load, so every library that requires it becomes slow as well

    ## cgi/session.rb
    def create_new_id
      require 'securerandom'
      session_id = SecureRandom.hex(16)
      ...

    ## securerandom.rb
    require 'openssl'

Case Study: openssl

Calculate the session id without relying on securerandom.rb

    require 'digest/sha2'   # provides Digest::SHA256

    PRIVATE = '431e077067178a6dd061f1e2ab'
    def create_new_id()
      seed = [rand(), rand(), rand(), rand(),
              Time.now.to_f, $$].pack('ddddds')
      Digest::SHA256.hexdigest(seed + PRIVATE)
    end

How to Improve Your Code?

How to improve?

    ESCAPE_ = {
      "&" => "&amp;",
      "<" => "&lt;",
      ">" => "&gt;",
      '"' => "&quot;",
    }
    def escape_html(s)
      s.to_s.gsub(/[&<>"]/) {|c| ESCAPE_[c] }
    end
    alias h escape_html

Answer

    ESCAPE_ = { .... }
    def h(s)
      s.to_s.gsub(/[&<>"]/) {|c| ESCAPE_[c] }
    end
    def escape_html(s)
      s.to_s.gsub(/&/, "&amp;").
             gsub(/>/, "&gt;").
             gsub(/</, "&lt;").
             gsub(/"/, "&quot;")
    end

Benchmark: h() vs escape_html() on s1, s2, s3 (Ruby 1.8.7-p334; bar chart, axis range 0 to 50)

    s1 : 0 html characters
    s2 : 5 html characters
    s3 : 15 html characters

h() is much slower than escape_html() when the string contains '&<>"' characters
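To check this on your own Ruby, a comparison along the following lines can be run with the standard Benchmark module. This is only a sketch: the constant name ESCAPE_TABLE, the sample strings and the iteration count are made up for illustration, and results on a current Ruby may differ from the 1.8.7 numbers in the talk.

```ruby
require 'benchmark'

ESCAPE_TABLE = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }

# One gsub with a block and a hash lookup per matched character.
def h(s)
  s.to_s.gsub(/[&<>"]/) { |c| ESCAPE_TABLE[c] }
end

# Four chained gsub calls with plain string replacements.
def escape_html(s)
  s.to_s.gsub(/&/, '&amp;').gsub(/>/, '&gt;').gsub(/</, '&lt;').gsub(/"/, '&quot;')
end

plain  = 'no special characters in this string at all'
markup = '<a href="/?x=1&y=2">"x" & "y"</a> <b>&</b>'

n = 50_000
Benchmark.bm(22) do |bm|
  bm.report('h(plain)')            { n.times { h(plain) } }
  bm.report('escape_html(plain)')  { n.times { escape_html(plain) } }
  bm.report('h(markup)')           { n.times { h(markup) } }
  bm.report('escape_html(markup)') { n.times { escape_html(markup) } }
end
```

Both functions produce the same output; only the number of block calls per string differs.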




How to improve?

    def CGI::parse(query)
      params = {}
      query.split(/[&;]/n).each do |pairs|
        key, val = pairs.split(/=/, 2) \
                     .collect {|v| CGI::unescape(v) }
        (params[key] ||= []) << val
      end
      params
    end

Answer

    def CGI::parse(query)
      params = {}
      query.split(/[&;]/n).each do |pairs|
        key, val = pairs.split(/=/, 2)
        key = CGI::unescape(key)
        val = CGI::unescape(val)
        (params[key] ||= []) << val
      end
      params
    end

Answer

    def CGI::parse(query)
      params = {}
      query.split(/[&;]/n).each do |pairs|
        key, val = pairs.split(/=/, 2)
        key = CGI::unescape(key) if key =~ /%/
        val = CGI::unescape(val)
        (params[key] ||= []) << val
      end
      params
    end

Ignore '+' :)
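What "ignore '+'" means in practice can be seen with a standalone version of the last variant. This is a sketch under assumptions: the method name parse_query is hypothetical, the /n regexp flag is dropped, and a nil guard is added for parameters that have no '=' part.

```ruby
require 'cgi'

# Standalone sketch of the third variant (hypothetical name).
def parse_query(query)
  params = {}
  query.split(/[&;]/).each do |pair|
    key, val = pair.split('=', 2)
    key = CGI.unescape(key) if key =~ /%/   # keys without '%' skip unescape entirely
    val = CGI.unescape(val) if val          # guard: "flag" has no value part
    (params[key] ||= []) << val
  end
  params
end

parse_query('a=1&b=x%20y&a=2')   # => {"a"=>["1", "2"], "b"=>["x y"]}
parse_query('k%5B%5D=1')         # => {"k[]"=>["1"]}
parse_query('a+b=c+d')           # => {"a+b"=>["c d"]}  ('+' in the key is left as is)
```

Keys rarely contain escaped characters, so skipping the unescape call for them saves one method call per parameter at the cost of not decoding '+' in key names.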

Benchmark: parse1(), parse2(), parse3() on Ruby 1.8.7-p334 and Ruby 1.9.2-p180 (bar charts, axis range 0 to 8)




How to improve?

    <form action="/" method="post">
      <p>Name: <input name="user[123][name]"></p>
      <p>Mail: <input name="user[123][mail]"></p>
      <p>Items: <input name="user[123][items][]">
                <input name="user[123][items][]"></p>
      <p><input type="submit"></p>
    </form>

Answer

    <form action="/" method="post">
      <p>Name: <input name="user.123.name"></p>
      <p>Mail: <input name="user.123.mail"></p>
      <p>Items: <input name="user.123.items[]">
                <input name="user.123.items[]"></p>
      <p><input type="submit"></p>
    </form>

Example Code

    DEFAULT_SEP = /[&;] */n

    def parse_dotted_query(qs, d=nil)
      params = {}
      rexp = d ? /[#{d}] */n : DEFAULT_SEP
      (qs || '').split(rexp).each do |p|
        k, v = unescape(p).split('=', 2)
        normalize_dotted_params(params, k, v)
      end
      return params
    end

    def normalize_dotted_params(params, k, v)
      items = k.split(/\./)
      hash = params
      items[0...-1].each do |item|
        if hash[item].is_a?(Hash)
          hash = hash[item]
        else
          hash = hash[item] = {}
        end
      end
      item = items[-1]
      #if item.end_with?('[]')
      if item =~ /\[\]\z/
        item = item[0...-2]
        if hash[item].is_a?(Array)
          hash[item] << v
        else
          hash[item] = [v]
        end
      else
        hash[item] = v
      end
    end

Benchmark: parse_nested_query() vs parse_dotted_query() on Ruby 1.8.7-p334 and Ruby 1.9.2-p180 (bar charts, axis range 0 to 14)




How to improve?

    cgi = CGI.new
    upfile = cgi['file']
    fpath = "up/" + upfile.original_filename
    File.open(fpath, 'wb') do |f|
      while s = upfile.read(4096)
        f.write(s)
      end
    end
    upfile.close()

Renaming the temporary file is faster than reading it and writing it out again

Answer

    cgi = CGI.new
    upfile = cgi['file']
    fpath = "up/" + upfile.original_filename
    if upfile.local_path    # when Tempfile
      File.rename(upfile.local_path, fpath)
    else                    # when StringIO
      File.open(fpath, 'wb') do |f|
        f.write(upfile.read())
      end
    end

Moving a temporary file to another partition is very expensive
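The rename approach and its cross-partition case can be combined in one helper. The following is a minimal sketch, assuming a hypothetical method save_upload and using Tempfile#path instead of the local_path accessor that cgi.rb adds to uploaded files. File.rename raises Errno::EXDEV when source and destination are on different filesystems, and that is where the expensive copy happens.

```ruby
# Hypothetical helper: store an uploaded body at dest.
# upfile is assumed to be a Tempfile-like object (responds to #path)
# or a StringIO-like object (responds only to #read).
def save_upload(upfile, dest)
  src = upfile.respond_to?(:path) ? upfile.path : nil
  if src
    begin
      File.rename(src, dest)            # cheap: same filesystem, no data copied
    rescue Errno::EXDEV
      require 'fileutils' unless defined?(FileUtils)
      FileUtils.mv(src, dest)           # other partition: copies the data, then deletes
    end
  else
    File.open(dest, 'wb') { |f| f.write(upfile.read) }
  end
end
```

fileutils is loaded only inside the rescue branch, so the common same-partition path keeps the startup cost low.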

Answer

    ENV['TMPDIR'] = '/home/username/tmp'
    require 'cgi'
    cgi = CGI.new
    upfile = cgi['file']
    puts upfile.local_path
      #=> "/home/username/tmp/CGI201102-1671"

Set $TMPDIR before loading cgi.rb
(it must be set before require 'cgi'; also make sure the directory is writable)

How to improve?

    def read_multipart(boundary, cont_len)
      while s = stdin.read(10*1024)
        while s =~ boundary
          ...

C/Java style coding is slow in Ruby
(writing Ruby the way you would write C or Java does not make it fast; rewrite the code in the style Ruby is good at)

Answer

    def read_multipart(boundary, cont_len)
      max = 10 * 1024 * 1024
      if cont_len <= max
        items = stdin.read().split(boundary)
        ...
      else
        while s = stdin.read(10*1024)
          while s =~ boundary
            ...
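The small-body branch can be tried in isolation. The sketch below assumes a hypothetical helper split_small_body and uses StringIO in place of stdin; the chunked fallback for large bodies is left to the caller and is not shown.

```ruby
require 'stringio'

MAX_IN_MEMORY = 10 * 1024 * 1024   # same 10 MB threshold as the slide

# Hypothetical helper: read a small request body in one call and cut it
# with String#split. Returns nil when the body is too large, so the
# caller can fall back to chunked reading.
def split_small_body(io, boundary, cont_len, max = MAX_IN_MEMORY)
  return nil if cont_len > max
  io.read(cont_len).to_s.split(boundary)
end

body  = "--B\r\npart1\r\n--B\r\npart2\r\n--B--\r\n"
parts = split_small_body(StringIO.new(body), '--B', body.bytesize)
# parts[1] is "\r\npart1\r\n" and parts[2] is "\r\npart2\r\n";
# parts[0] is the empty text before the first boundary.
```

One read plus one split hands the scanning work to String#split, which runs in C, instead of looping over chunks in Ruby.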

Conclusion

Is it really Ruby's fault?
There is a lot of code out there that is not very efficient

If your code is slow, it is due to yourself, not to Ruby.

one more thing...

Benchmarker

Benchmark utility

Repeat benchmarks, average results, ...

    gem install benchmarker

CGIAlt, CGIExt

CGIAlt

Fast and cgi.rb compatible

http://cgialt.rubyforge.org/

CGIExt

Implemented in C

http://cgiext.rubyforge.org/

thank you
