How a Web-accelerator Accelerates Your Site / Alexander Krizhanovsky (NatSys Lab., Tempesta Technologies)
Benchmark data (chart): target vs. actual RPS for HyperScan, vanilla Nginx, and PCRE-JIT.

Target RPS | HyperScan | Vanilla Nginx | PCRE-JIT
      1000 |   1000.03 |       1000.00 |  1000.02
      2000 |   2000.04 |       2000.00 |  2000.08
      3000 |   3000.07 |       3000.00 |  3000.07
      4000 |   4000.09 |       4000.00 |  4067.27
      5000 |   5000.00 |       5000.00 |  4168.44
      6000 |   6000.00 |       6000.00 |  4109.90
      7000 |   6854.61 |       7000.00 |  4132.23
      8000 |   6854.76 |       8000.00 |  4135.11
      9000 |   6814.08 |       9000.00 |  4136.29
     10000 |   6850.59 |      10000.00 |  4132.00
     15000 |   6855    |      15000    |  4136
     20000 |   6851    |      20000.56 |  4135
     25000 |   6855    |      23830.83 |  4132
     30000 |   6851    |      23828.54 |  4133
Web-acceleration Technologies
Alexander Krizhanovsky
Tempesta Technologies, Inc.
Who am I?
CEO & CTO at NatSys Lab & Tempesta Technologies
Tempesta Technologies (Seattle, WA)
Subsidiary of NatSys Lab. developing Tempesta FW, the first and only hybrid of an HTTP accelerator and a firewall for DDoS mitigation & WAF
NatSys Lab (Moscow, Russia)
Custom software development in:
high performance network traffic processing
databases
Web-content Acceleration
Web-framework caching (e.g. Django caching)
=> whole site, pages, compiled objects, templates, any data
Downstream caching (RFC 7234, e.g. mod_cache):
reduces origin server requests (thundering herd)
=> whole site, pages
forward proxy cache (e.g. Squid, ATS)
reverse proxy (Web-accelerator) cache (e.g. Squid, Varnish etc.)
SSL acceleration
Private caching (Web-browser cache)
...eAccelerator, xslcache etc.
Web-acceleration
Web-caching
(how Web-accelerator accelerates your site)
To Cache
static (e.g. video, images, CSS, HTML)
some dynamic
Negative results (e.g. 404)
Permanent redirects
Incomplete results (206, RFC 7233 Range Requests)
Methods: GET, POST, whatever
GET /script?action=delete: caching this is your responsibility
(though some servers don't cache URIs w/ arguments)
Not to Cache
Responses to Authenticated requests
Unsafe methods (RFC 7231 4.2.1)
(safe methods: GET, HEAD, OPTIONS, TRACE)
Explicit no-cache directive
Set-Cookie (?)
Cache POST?
Idempotent POST (e.g. web search): cached just like GET
Non-idempotent POST (e.g. blog comment): cache the response for a following GET
RFC 7234 4.4: URI must be invalidated
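The POST rules above can be sketched as a toy cache (not from the talk; the dictionary cache and function names are illustrative): an idempotent POST is keyed by URI + body and served like GET, while a non-idempotent POST invalidates the URI per RFC 7234 4.4 and its response may satisfy a following GET.

```python
# Toy in-memory cache illustrating the POST-caching rules above.
cache = {}  # (method, uri[, body]) -> response

def handle_post(uri, body, idempotent, origin):
    if idempotent:
        # Idempotent POST (e.g. search): cache like GET, keyed by URI + body.
        key = ("POST", uri, body)
        if key not in cache:
            cache[key] = origin(uri, body)
        return cache[key]
    # Non-idempotent POST: the URI must be invalidated (RFC 7234 4.4);
    # the fresh response may be stored to satisfy a subsequent GET.
    cache.pop(("GET", uri), None)
    resp = origin(uri, body)
    cache[("GET", uri)] = resp
    return resp

def handle_get(uri, origin):
    key = ("GET", uri)
    if key not in cache:
        cache[key] = origin(uri)
    return cache[key]
```

A repeated idempotent POST with the same body then hits the cache instead of the origin, and a GET issued right after a non-idempotent POST is served from the stored response.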
Cache Cookies?
Varnish, Nginx, ATS don't cache responses w/ Set-Cookie by default
mod_cache and Squid do cache responses w/ Set-Cookie by default
RFC 7234:
Note that the Set-Cookie response header field [RFC6265] does not
inhibit caching; a cacheable response with a Set-Cookie header
field can be (and often is) used to satisfy subsequent requests to
caches. Servers who wish to control caching of these responses are
encouraged to emit appropriate Cache-Control response header
fields.
Cache Entries Freshness
RFC 7234: freshness_lifetime > current_age
Freshness calculation:
Last-Modified: when the resource was modified at the origin server
Date: response generation timestamp
Age: how long the object has been in the proxy cache
Expires: when the cache entry expires
Revalidation: conditional requests (RFC 7232, e.g. If-Modified-Since)
Background activity or on-request job
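The freshness rule above can be shown as a minimal sketch (not from the talk). It assumes headers are pre-parsed into a dict with epoch-second timestamps, and simplifies the RFC 7234 age calculation by ignoring network delay:

```python
import time

def freshness_lifetime(resp):
    """Cache-Control: max-age wins over Expires - Date (RFC 7234 4.2.1)."""
    for d in resp.get("cache-control", "").split(","):
        d = d.strip()
        if d.startswith("max-age="):
            return int(d.split("=", 1)[1])
    if "expires" in resp and "date" in resp:
        return resp["expires"] - resp["date"]
    return 0  # heuristic freshness omitted for brevity

def current_age(resp, now):
    """Simplified RFC 7234 4.2.3: Age header + time resident in this cache."""
    return resp.get("age", 0) + (now - resp["stored_at"])

def is_fresh(resp, now=None):
    # The entry is servable while freshness_lifetime > current_age.
    now = time.time() if now is None else now
    return freshness_lifetime(resp) > current_age(resp, now)
```

An entry stored at t=1000 with max-age=60 is fresh at t=1030 (age 30) and stale at t=1061, after which the cache must revalidate or fetch anew.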
Stale Cache Entries
Sometimes this is OK, e.g. Nginx: proxy_cache_use_stale
Expired responses
Invalidated by unsafe methods
Error responses for the URI
Timeout
Etc.
Cache-Control
A cache MUST obey the requirements of the Cache-Control directives
Freshness and staleness control
Explicit cache/no-cache
Private caching (browser vs. proxy): about caching, not privacy!
Pragma: no-cache
Vary
(secondary keys say hello to databases)
Accept-Language: return a localized version of the page (no need for /en/index.html)
User-Agent: mobile vs desktop (bad!)
Accept-Encoding: don't send a compressed page if the browser doesn't understand it
Request headers normalization is required!
Buffering vs Streaming
Buffering
Seems to be everyone's default
Performance degradation on large messages
200 means OK, i.e. not an incomplete response
Streaming
Tengine (patched Nginx) w/
proxy_request_buffering & fastcgi_request_buffering
More performant, but 200 doesn't mean a full response
Cache Storage
Plain files (Nginx, Squid, Apache HTTPD)
Meta-data in RAM
Filesystem database
Easy to manage
Database (Apache Traffic Server, Tempesta FW)
Faster access
Persistency (experimental in Varnish, upcoming in Tempesta FW)
no real consistency
Cache Storage: mmap(2)
Alistair Wooldridge, BBC Digital Media Distribution: How we
improved throughput by 4x,
http://www.bbc.co.uk/blogs/internet/entries/17d22fb8-cea2-49d5-be14-86e7a1dcde04
48 CPUs, 512GB RAM, 8TB SSD
Cache Key
Primary key: URI path + Host
POST key: URI path + Host + body
Secondary (Vary) key: any headers
E.g. Nginx custom cache key:
proxy_cache_key "$request_uri|$request_body"
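The key scheme above can be sketched in a few lines (an illustration, not any accelerator's actual code): hash the primary key of Host + URI path, and for POST mix in the body, mirroring the nginx proxy_cache_key example.

```python
import hashlib

def cache_key(host, uri, method="GET", body=b""):
    """Primary cache key: Host + URI path; POST also hashes the body."""
    h = hashlib.sha256()
    h.update(host.lower().encode())  # Host is case-insensitive: normalize
    h.update(b"|")
    h.update(uri.encode())
    if method == "POST":
        h.update(b"|")
        h.update(body)
    return h.hexdigest()
```

Hashing keeps the key fixed-size regardless of URI or body length; the Vary secondary key would be appended the same way.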
Cache Purging
$ curl -X PURGE
Not RFC-defined
Squid, Varnish, Nginx (by wildcard)
Use case:
Update some resource at upstream (POST can invalidate an entry)
Send PURGE & GET requests to the cache
Now cache is up to date
Cache Busting
No access to the Web-accelerator or Web-server
E.g. how to force users to fetch a new version of a CSS file or an ad?
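The usual answer is a versioned URL: embed a content hash in the asset name, so a new version gets a new URL and every cache you cannot purge (including browser caches) fetches it afresh. A minimal sketch, with a hypothetical helper name:

```python
import hashlib

def busted_name(filename, content: bytes):
    """Cache busting: style.css -> style.<hash8>.css, where the hash
    is derived from the file content, so any change yields a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = filename.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{filename}.{digest}"
```

Build tools typically generate such names at deploy time and rewrite references in HTML, which lets the hashed assets be served with a very long freshness lifetime.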