HTTPプロクシライブラリproxy2の設計と実装

Design and Implementation of proxy2, the HTTP Proxy Library

inaz2, PyCon JP 2016, 2016/09/22

Transcript of HTTPプロクシライブラリproxy2の設計と実装

Page 1: HTTPプロクシライブラリproxy2の設計と実装

HTTPプロクシライブラリproxy2の設計と実装

inaz2

PyCon JP 2016

2016/09/22

Design and Implementation of proxy2, the HTTP Proxy Library

Page 2: HTTPプロクシライブラリproxy2の設計と実装

About me

• inaz2

  • https://twitter.com/inaz2

  • https://github.com/inaz2

• Security engineer & Python programmer

• Weblog: ももいろテクノロジー

  • http://inaz2.hatenablog.com/


Page 3: HTTPプロクシライブラリproxy2の設計と実装

HTTP Proxy

• There are some proxies for caching or load balancing

• But the “proxy” in this talk is a little different from these


Page 4: HTTPプロクシライブラリproxy2の設計と実装

Do you know Proxomitron?

• http://www.proxomitron.info/

• From 1999 to 2003


Page 5: HTTPプロクシライブラリproxy2の設計と実装

Local debug proxy

• Intercept and modify the HTTP request/response

[Diagram: the Request and the Response both pass through the proxy, which does the logging and modifying]

Page 6: HTTPプロクシライブラリproxy2の設計と実装

Major debugging proxies

• Useful for debugging and security testing

• Burp Proxy

  • https://portswigger.net/burp/proxy.html

• Fiddler

  • http://www.telerik.com/fiddler

• OWASP ZAP

  • https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project

• Charles

  • https://www.charlesproxy.com/

• mitmproxy

  • https://mitmproxy.org/


Page 7: HTTPプロクシライブラリproxy2の設計と実装

These are useful but …

• Not intended for automated translation

• Not intended for large-scale logging and statistics

• Extensible, but not handy to extend

• I need a proxy like tcpdump (or like tail -f)

• I need a proxy that is easy to use with crawlers

• I need a fully customizable proxy


Page 8: HTTPプロクシライブラリproxy2の設計と実装

proxy2

• https://github.com/inaz2/proxy2

• Single Python script

• Requires no external modules

• Supports IPv6

• Supports HTTP/1.1 persistent connections

• Supports HTTPS relay/intercept

• Easy to customize with Python!


Page 9: HTTPプロクシライブラリproxy2の設計と実装

Demo


Page 10: HTTPプロクシライブラリproxy2の設計と実装

Customizing handlers

• Change User-Agent header
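A rough sketch of what such a customization could look like. `ProxyBase` and `Request` are hypothetical stand-ins so the example runs on its own; they are not proxy2's classes. The hook follows the `request_handler(req)` form used in this talk, and the actual script's signature may differ. Written for Python 3.

```python
class Request:
    """Hypothetical stand-in for a parsed request (not proxy2's own type)."""

    def __init__(self, method, url, headers):
        self.method = method
        self.url = url
        self.headers = dict(headers)


class ProxyBase:
    """Hypothetical stand-in for the proxy base class."""

    def request_handler(self, req):
        # Default hook: pass the request through untouched.
        return req


class UserAgentProxy(ProxyBase):
    """Derived class that overrides a single hook."""

    def request_handler(self, req):
        # Replace whatever User-Agent the client sent.
        req.headers["User-Agent"] = "Mozilla/5.0 (proxy2-example)"
        return req


req = Request("GET", "http://www.example.com/", {"User-Agent": "curl/7.50"})
req = UserAgentProxy().request_handler(req)
print(req.headers["User-Agent"])
```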


Page 11: HTTPプロクシライブラリproxy2の設計と実装


Page 12: HTTPプロクシライブラリproxy2の設計と実装


Page 13: HTTPプロクシライブラリproxy2の設計と実装

Design and implementation


Page 14: HTTPプロクシライブラリproxy2の設計と実装

Disclaimer

• This script doesn’t support Python 3 yet …

  • Pull requests are welcome (;´Д`)


Page 15: HTTPプロクシライブラリproxy2の設計と実装

Design policy

• Keep it simple, with few dependencies

  • Single Python script

  • Use standard modules only

• Implement it as a base class

  • Prepare {request,response,save}_handler()

  • Users derive the class and override each handler

  • Default handlers dump HTTP headers and some useful info


Page 16: HTTPプロクシライブラリproxy2の設計と実装

Connection flow and handlers

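[Diagram: client ↔ proxy2 ↔ server]

• Request (client → proxy2)

  • request_handler(req) (modify the request)

• Request (proxy2 → server)

• Response (server → proxy2)

  • response_handler(req, res) (modify the response)

• Response (proxy2 → client)

  • save_handler(req, res) (task that takes a long time)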
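The flow above can be sketched as a base class that calls the three hooks in order. Only the hook names come from the talk. The `Proxy` class, the `handle()` method, the dict-shaped `req`/`res`, and the `fetch`/`send` callables are assumptions made for illustration, and placing `save_handler` after the reply is my reading of the diagram.

```python
class Proxy:
    """Hypothetical base class; only the three hook names come from the talk."""

    def request_handler(self, req):
        return req  # override to modify the request

    def response_handler(self, req, res):
        return res  # override to modify the response

    def save_handler(self, req, res):
        # Default: dump a one-line summary.
        print(req["method"], req["url"], "->", res["status"])

    def handle(self, req, fetch, send):
        req = self.request_handler(req)        # before forwarding
        res = fetch(req)                       # proxy -> server
        res = self.response_handler(req, res)  # before replying
        send(res)                              # proxy -> client
        self.save_handler(req, res)            # slow work, after the reply


events = []


class TracingProxy(Proxy):
    """Records the order in which the hooks fire."""

    def request_handler(self, req):
        events.append("request_handler")
        return req

    def response_handler(self, req, res):
        events.append("response_handler")
        return res

    def save_handler(self, req, res):
        events.append("save_handler")


def fake_fetch(req):
    events.append("forward to server")
    return {"status": 200, "headers": {}, "body": b"ok"}


def fake_send(res):
    events.append("reply to client")


request = {"method": "GET", "url": "http://www.example.com/", "headers": {}}
TracingProxy().handle(request, fake_fetch, fake_send)
print(events)
```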

Page 17: HTTPプロクシライブラリproxy2の設計と実装

Making an HTTP server is easy

• Use the BaseHTTPServer module

  • https://hg.python.org/cpython/file/2.7/Lib/BaseHTTPServer.py

• Server with multi-threading and IPv6 support

• Request handler
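A minimal sketch of the two pieces named above: a multi-threaded server bound to IPv6, and a request handler. It uses Python 3's `http.server` and `socketserver`, the successors of the Python 2 `BaseHTTPServer` and `SocketServer` modules. The class names, the `serve()` helper, and the port are assumptions for illustration.

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn


class ThreadingHTTPServer6(ThreadingMixIn, HTTPServer):
    """One thread per connection, listening on an IPv6 socket."""

    address_family = socket.AF_INET6
    daemon_threads = True  # worker threads exit with the main thread


class HelloHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # needed for persistent connections

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Blocks forever; call it from your own entry point."""
    httpd = ThreadingHTTPServer6(("::1", port), HelloHandler)
    httpd.serve_forever()
```

A proxy would define `do_GET` (and the other methods) to forward the request instead of answering it directly.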


Page 18: HTTPプロクシライブラリproxy2の設計と実装

Roadblocks on HTTP/1.1 proxy

• HTTP/1.1 Persistent Connection

• Content-Encoding

• Hop-by-hop Headers


Page 19: HTTPプロクシライブラリproxy2の設計と実装

HTTP/1.1 Persistent Connection

• Reusing connection to the same server

• httplib.HTTPConnection()

  • Low-level HTTP client

• threading.local()

  • Thread-local storage (as the server is multi-threaded)
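One way the two pieces could fit together: each thread keeps its own table of open connections, keyed by origin. This sketch uses Python 3's `http.client` (the successor of `httplib`). The helper name `get_connection` and the table layout are assumptions, not necessarily how proxy2 organizes it.

```python
import http.client
import threading

_tls = threading.local()  # each server thread sees its own attributes


def get_connection(scheme, netloc, timeout=10):
    """Return this thread's connection to the origin, creating it on first use."""
    if not hasattr(_tls, "conns"):
        _tls.conns = {}
    origin = (scheme, netloc)
    if origin not in _tls.conns:
        if scheme == "https":
            conn = http.client.HTTPSConnection(netloc, timeout=timeout)
        else:
            conn = http.client.HTTPConnection(netloc, timeout=timeout)
        # The socket is opened lazily, on the first request.
        _tls.conns[origin] = conn
    return _tls.conns[origin]


first = get_connection("http", "www.example.com")
second = get_connection("http", "www.example.com")
print(first is second)  # same thread, same origin: the connection is reused
```

A real proxy also has to discard the cached connection when a request on it fails, since the server may have closed it in the meantime.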


Page 20: HTTPプロクシライブラリproxy2の設計と実装

Content-Encoding

• Response body can be compressed

  • For handlers, proxy2 decompresses and re-compresses it

• gzip and zlib modules (for gzip and deflate)
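A sketch of the decompress/re-compress step using the standard `gzip` and `zlib` modules in Python 3. The function names are assumptions. The fallback to raw deflate is there because some servers send "deflate" bodies without the zlib wrapper.

```python
import gzip
import zlib


def decode_body(data, encoding):
    """Return the plain body for a given Content-Encoding value."""
    if encoding in ("", "identity"):
        return data
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(data)
    if encoding == "deflate":
        try:
            return zlib.decompress(data)  # zlib-wrapped stream
        except zlib.error:
            return zlib.decompress(data, -zlib.MAX_WBITS)  # raw deflate
    raise ValueError("unsupported Content-Encoding: %r" % encoding)


def encode_body(data, encoding):
    """Re-compress a (possibly modified) plain body."""
    if encoding in ("", "identity"):
        return data
    if encoding in ("gzip", "x-gzip"):
        return gzip.compress(data)
    if encoding == "deflate":
        return zlib.compress(data)
    raise ValueError("unsupported Content-Encoding: %r" % encoding)


plain = b"<html>hello</html>" * 10
wire = encode_body(plain, "gzip")
print(decode_body(wire, "gzip") == plain)
```

After a handler changes the body, the Content-Length header has to be recomputed from the re-compressed bytes.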


Page 21: HTTPプロクシライブラリproxy2の設計と実装

Hop-by-hop Headers

• In RFC 2616 (now obsolete), a proxy must remove the following headers:

  • Connection, Keep-Alive, Proxy-Authenticate, Proxy-Authorization, TE, Trailers, Transfer-Encoding, Upgrade

• RFC 7230 no longer defines the implicit list

  • "hop-by-hop" header fields are required to appear in the Connection header field (A.2)

  • http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1771.html

• Even so, proxy2 removes the above headers for compatibility
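A sketch of the filtering step. The fixed list is the RFC 2616 one quoted above. Also dropping every field named in the Connection header follows RFC 7230 and is my addition here, not something the talk says proxy2 does. The function name and the list-of-pairs header shape are assumptions.

```python
# The RFC 2616 list quoted above, lower-cased for comparison.
HOP_BY_HOP = (
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
)


def filter_headers(headers):
    """Drop hop-by-hop fields from a list of (name, value) pairs."""
    drop = set(HOP_BY_HOP)
    for name, value in headers:
        if name.lower() == "connection":
            # RFC 7230: fields named in Connection are hop-by-hop as well.
            for token in value.split(","):
                if token.strip():
                    drop.add(token.strip().lower())
    return [(n, v) for n, v in headers if n.lower() not in drop]


incoming = [
    ("Host", "www.example.com"),
    ("Connection", "keep-alive, X-Foo"),
    ("Keep-Alive", "timeout=5"),
    ("X-Foo", "1"),
    ("Accept", "*/*"),
]
print(filter_headers(incoming))
```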


Page 22: HTTPプロクシライブラリproxy2の設計と実装

Handling HTTPS

• HTTPS = HTTP over SSL/TLS

• When you access “https://www.example.com/” through a proxy, the client first sends this HTTP request to the proxy:

  • CONNECT www.example.com:443 HTTP/1.1

• The proxy returns the HTTP response:

  • 200 Connection Established

• After that, the client starts the SSL/TLS handshake and encrypted transmission
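The proxy's side of this exchange can be sketched as parsing the request line and answering with a fixed status line. The function and constant names are assumptions for illustration.

```python
# Reply that tells the client the tunnel is open; nothing follows the headers.
ESTABLISHED = b"HTTP/1.1 200 Connection Established\r\n\r\n"


def parse_connect(request_line):
    """Split 'CONNECT host:port HTTP/1.1' into (host, port)."""
    method, target, _version = request_line.split()
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    # IPv6 literals arrive in brackets, e.g. [::1]:8443.
    return host.strip("[]"), int(port)


print(parse_connect("CONNECT www.example.com:443 HTTP/1.1"))
```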


Page 23: HTTPプロクシライブラリproxy2の設計と実装

HTTPS relay

• Just relay handshakes and encrypted payloads

• proxy2 can’t understand the content

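[Diagram: client ↔ proxy2 ↔ server. The client sends CONNECT and proxy2 answers Connection Established; after that, the handshake and encrypted transmission run between client and server through proxy2]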

Page 24: HTTPプロクシライブラリproxy2の設計と実装

HTTPS relay

• select.select()

  • Pick out readable sockets in the list

  • Receive data and send it to the other socket
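A sketch of such a relay loop. The function name, the idle timeout, and the buffer size are assumptions; the structure (select on both sockets, copy whatever is readable to the other side) is what the bullets describe.

```python
import select
import socket


def relay(sock_a, sock_b, timeout=20.0, bufsize=8192):
    """Copy bytes both ways until one side closes, errors, or goes idle."""
    socks = [sock_a, sock_b]
    while True:
        readable, _, errored = select.select(socks, [], socks, timeout)
        if errored or not readable:
            return  # socket error, or nothing happened within the timeout
        for src in readable:
            dst = sock_b if src is sock_a else sock_a
            data = src.recv(bufsize)
            if not data:
                return  # peer closed its end
            dst.sendall(data)
```

In the HTTPS relay case, `sock_a` would be the client connection and `sock_b` a socket connected to the host named in the CONNECT request.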


Page 25: HTTPプロクシライブラリproxy2の設計と実装

HTTPS intercept (Man-in-the-Middle)

• The proxy generates the certificate for a requested domain

• And works as an HTTPS server with the generated certificate

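[Diagram: client ↔ proxy2 ↔ server. After CONNECT and Connection Established, there are two separate sessions: handshake and transmission between client and proxy2, and handshake and transmission between proxy2 and server]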

Page 26: HTTPプロクシライブラリproxy2の設計と実装

HTTPS intercept (Man-in-the-Middle)

• ssl.wrap_socket()

  • Make a socket over SSL/TLS

  • With a private key and the corresponding public key’s certificate

  • Wrap BaseHTTPRequestHandler.connection
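The module-level `ssl.wrap_socket()` named here is the Python 2-era call; it was deprecated in Python 3.7 and removed in 3.12. A sketch of the same step in Python 3, using `SSLContext.wrap_socket()` instead. The function name and the file paths passed to it are assumptions.

```python
import ssl


def wrap_server_side(conn, certfile, keyfile):
    """Turn an accepted plain socket into the server end of a TLS session."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The certificate for the requested domain, plus its private key.
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    # Performs the TLS handshake with the client on this socket.
    return context.wrap_socket(conn, server_side=True)
```

Inside a request handler, the wrapped socket would replace `self.connection`, and `self.rfile`/`self.wfile` would need to be rebuilt from it with `makefile()` before reading the decrypted request.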


Page 27: HTTPプロクシライブラリproxy2の設計と実装

Generating SSL/TLS certificates

• In this case, proxy2 depends on OpenSSL

  • As you know, poor implementations cause severe security risks

• OpenSSL makes a Certificate Authority “proxy2 CA” and generates certificates signed by the CA

• The browser can install the CA certificate from “http://proxy2.test/” through proxy2
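A sketch of how the CA and a per-domain certificate could be produced by calling the `openssl` command-line tool. The file names, the 2048-bit key size, the validity period, and the helper names are assumptions; only the CA name “proxy2 CA” comes from the slide. Note that current browsers typically also expect a subjectAltName entry, which this minimal version leaves out.

```python
import subprocess


def ca_commands(ca_key="ca.key", ca_cert="ca.crt", days=3650):
    """Commands that create the CA key and its self-signed certificate."""
    return [
        ["openssl", "genrsa", "-out", ca_key, "2048"],
        ["openssl", "req", "-new", "-x509", "-days", str(days),
         "-key", ca_key, "-out", ca_cert, "-subj", "/CN=proxy2 CA"],
    ]


def host_commands(hostname, key="cert.key", ca_key="ca.key",
                  ca_cert="ca.crt", days=3650):
    """Commands that create a key, a CSR, and a CA-signed cert for one host."""
    csr = hostname + ".csr"
    crt = hostname + ".crt"
    return [
        ["openssl", "genrsa", "-out", key, "2048"],
        ["openssl", "req", "-new", "-key", key,
         "-subj", "/CN=" + hostname, "-out", csr],
        ["openssl", "x509", "-req", "-in", csr, "-CA", ca_cert,
         "-CAkey", ca_key, "-CAcreateserial", "-days", str(days), "-out", crt],
    ]


def run_all(commands):
    """Run each command, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


for argv in ca_commands() + host_commands("www.example.com"):
    print(" ".join(argv))
```

Calling `run_all(ca_commands())` once and `run_all(host_commands(name))` per requested domain would write the files into the current directory.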

[Diagram: proxy2 CA signs the certificates; the client, saying “I’ll trust your signature.”, accepts the signed certificates]

Page 28: HTTPプロクシライブラリproxy2の設計と実装


Page 29: HTTPプロクシライブラリproxy2の設計と実装


Page 30: HTTPプロクシライブラリproxy2の設計と実装

Recap

• Proxy is fun

• Python’s “batteries” are very powerful

  • BaseHTTPServer, httplib, threading, gzip, zlib, select, ssl

• HTTP proxy is easy to understand but not simple

• proxy2 made it simple


Page 31: HTTPプロクシライブラリproxy2の設計と実装

References

• proxy2: HTTPS pins and needles

  • http://www.slideshare.net/inaz2/20150509-sumidasec-47934674

• RFC 2616 (now obsolete)

  • https://tools.ietf.org/html/rfc2616

• RFC 7230-7235

  • https://tools.ietf.org/html/rfc7230


Page 32: HTTPプロクシライブラリproxy2の設計と実装

Thank you!

inaz2
