Python profiling

44
Python Profiling Dreampuf Sep. 2012 http://huangx.in

description

Python profiling article http://huangx.in/2012/9/python_profiling.html

Transcript of Python profiling

Page 1: Python profiling

Python ProfilingDreampuf Sep. 2012http://huangx.in

Page 2: Python profiling

Introducton

性能分析(Performance analysis 或 profiling)是收集程序运行时信息为手段研究程序行为的分析方法。

-- Wikipedia

Page 3: Python profiling

Introductondef qks(ls):    if not ls : return ls   p, ls = ls[0], ls[1:]    return qks([i for i in ls if i < p]) + [p] + qks([i for i in ls if i >= p])    if __name__ == "__main__":    import random   ls = [random.randint(0, 100) for i in xrange(10)]

Page 4: Python profiling

Agenda

time utility, time module, timeit, sys.getsizeof

Module: profile, cProfile, hotshot, pystats

Line Profiler, Memory Profiler, Objgraph

RunSnakeRun

In Action

Conclusion

Page 5: Python profiling

绿曼巴蛇,最快的蛇,每小时11公里。

Page 6: Python profiling

time utility

dreampuf@HX:~/hub/fortest$ time python test_timeit.py0.143877983093

real     0m0.184s  #实际总耗时user     0m0.166s  #应用耗时sys     0m0.015s  #应用期间执行系统指令耗时

Page 7: Python profiling

time utility

    dreampuf@HX:~/hub/fortest$ time sleep 5        real     0m5.036s    user     0m0.001s    sys     0m0.003s

Page 8: Python profiling

time utility

系统级别耗时(⼀一般而言 real > user + sys)应用耗时和系统调用耗时比率

Page 9: Python profiling

time module    dreampuf@HX:~/hub/fortest$ python test_timeit.py    2.31266021729e-05

    import time    def time_module():        start = time.time()        qks(ls)        return time.time() - start        COUNT = 100000    print sum([time_module() for i in xrange(COUNT)])/COUNT

Page 10: Python profiling

time module

平台精度灵活

Page 11: Python profiling

time module class TimerIt(object): def __enter__(self): self.start = time.time() def __exit__(self, *args): print time.time() - self.start

with t: qks(ls*10) with t: qks(ls*100)

Page 12: Python profiling

timeit

    @Timer    def caller():        qks(ls)        print caller.timeit(10000)   #0.14427113533

Page 13: Python profiling

timeit

In [1]: from timeit import timeitIn [2]: timeit('s.appendleft(3)',   ...:        'import collections; s= collections.deque()', number=1000000)Out[2]: 0.1150519847869873In [3]: timeit('s.insert(0, 3)', 's=[]', number=1000000)Out[3]: 392.4638919830322

Page 14: Python profiling

timeit

不同平台使用的time不同(Xinux 使用 `time.time()`, Windows 使用`time.clock()`)

python -m timeit -s "import collections; s = collections.deque()" -n 1000000 "s.appendleft(3)"

Page 15: Python profiling

sys.getsizeof    import sys    def sys_getsizeof():        a = range(1000000)        b = xrange(1000000)        c = [i for i in xrange(1000000)]        print sys.getsizeof(a)            print sys.getsizeof(b)             print sys.getsizeof(c)    

Page 16: Python profiling

sys.getsizeof    import sys    def sys_getsizeof():        a = range(1000000)        b = xrange(1000000)        c = [i for i in xrange(1000000)]        print sys.getsizeof(a)     #8000072        print sys.getsizeof(b)     #40        print sys.getsizeof(c)     #8697472

Page 17: Python profiling

sys.getsizeof

    class Entity(object):        def __init__(self, name, age):            self.name = name            self.age = age    es = [Entity("dreampuf", 100) for i in xrange(1000000)]    print sys.getsizeof(es)     #8697472    print sum([sys.getsizeof(i) for i in es])   #64000000

Page 18: Python profiling

sys.getsizeof

和实际有所出入(Python对象缓存池)

容器还是对象

Page 19: Python profiling

眼镜蛇王,聪明

Page 20: Python profiling

Module: profile

    from cProfile import Profile as profile    def profile_module():        p = profile()        p.enable()        qks(ls*100)        p.disable()        print p.getstats()

Page 21: Python profiling

Module: profile

dreampuf@HX:~/hub/fortest$ python test_timeit.py[_lsprof.profiler_entry(code="<method 'disable' of '_lsprof.Profiler' objects>", callcount=1....

Page 22: Python profiling

Module: profile    from cProfile import Profile as profile    from pstats import Stats    def profile_module():        p = profile()        p.snapshot_stats()        p.enable()        dirs(".")        p.disable()        p.print_stats(2)

Page 23: Python profiling

Module: profile    dreampuf@HX:~/hub/fortest$ python test_timeit.py             135259 function calls in 0.071 seconds            Ordered by: cumulative time            ncalls  tottime  percall  cumtime  percall filename:lineno(function)            1    0.002    0.002    0.071    0.071 test_timeit.py:101(dirs)           22    0.005    0.000    0.068    0.003 test_timeit.py:74(main)        10519    0.008    0.000    0.047    0.000 ast.py:203(walk)        10498    0.004    0.000    0.035    0.000 {method 'extend' of 'c...        20975    0.015    0.000    0.031    0.000 ast.py:173(iter_child_nodes)           21    0.000    0.000    0.014    0.001 ast.py:32(parse)           21    0.014    0.001    0.014    0.001 {compile}        26234    0.009    0.000    0.011    0.000 ast.py:161(iter_fields)        39368    0.008    0.000    0.008    0.000 {isinstance}

.....

Page 24: Python profiling

Module: profile

行级别调用信息(时间,次数)可以保存执行结果友好的输出信息

Page 25: Python profiling

profile vs. cProfile vs. hotshot实现 开销 维护 平台

profile Python 高 持续 Python Like

cProfile Python C API 中 持续 CPython

hotshot C module 低 停止 CPython

Page 26: Python profiling

Line Profiler

    from line_profiler import LineProfiler    def line_profiler():        p = LineProfiler()        dirs_lineprofiler = p(dirs)        dirs_lineprofiler(".")        p.print_stats()

Page 27: Python profiling

Line Profiler    dreampuf@HX:~/hub/fortest$ python test_timeit.py    Timer unit: 1e-06 s    File: test_timeit.py    Function: dirs at line 101    Total time: 0.108716 s       Line #      Hits         Time  Per Hit   % Time  Line Contents    ==============================================================       101                                           def dirs(dirpath):       102         1         1129   1129.0      1.0      for current, ds, fs in os.walk(dirpath):       103        49           56      1.1      0.1          for f in fs:       104        48           50      1.0      0.0              if f[-3:] not in ('tml', '.py'):       105        26           16      0.6      0.0                  continue       106        22       107461   4884.6     98.8              main(os.path.join(current, f))       107         1            4      4.0      0.0          break

Page 28: Python profiling

Line Profiler

直观准确(Python C API 实现的trace)

Page 29: Python profiling

Memory Profiler

    dreampuf@HX:~/hub/fortest$ python test_timeit.py    Filename: test_timeit.py        Line #    Mem usage    Increment   Line Contents    ================================================       101      8.68 MB      0.00 MB   def dirs(dirpath):       102      8.68 MB      0.01 MB       for current, ds, fs in os.walk(dirpath):       103     11.59 MB      2.91 MB           for f in fs:       104     11.59 MB      0.00 MB               if f[-3:] not in ('tml', '.py'):       105     11.59 MB      0.00 MB                   continue       106     11.59 MB      0.00 MB               main(os.path.join(current, f))       107     11.59 MB      0.00 MB           break

Page 30: Python profiling

Objgraph

    import objgraph    def objgraph_profile():        c = (i for i in xrange(1000000))        objgraph.show_refs([c], filename='simple-c.png')

Page 31: Python profiling

Objgraph

a = range(1000000)

Page 32: Python profiling

Objgraph

b = xrange(1000000)

Page 33: Python profiling

Objgraph

c = (i for i in xrange(1000000))

Page 34: Python profiling

Objgraph

c = (i for i in xrange(1000000))

Page 35: Python profiling

Objgraph

es = [Entity("dreampuf", 100) for i in xrange(1000)]

Page 36: Python profiling

Objgraph

es = [Entity("dreampuf", 100) for i in xrange(1000)]

Page 37: Python profiling

Run Snake Run

Page 38: Python profiling

In Action    File: /home/dreampuf/hub/guokr/dev_guokr/nutshell/tag/views_tag.py    Function: tag_index at line 50    Total time: 0.144165 s         Line #      Hits         Time  Per Hit   % Time  Line Contents    ==============================================================        50                                           @profile.profile        51                                           def tag_index(req, tag_name):                  73                                               #标签页问答动态        74         1         4536   4536.0      3.1      question_list, page_next = qa_question_list(tag_name, TAG_ACTIVITY, page_len*(page_num-1), page_len)        75         2         4237   2118.5      2.9      activity_info_list = [qa_item(question_id) for question_id in question_list]

        79                                               #取出100条做合并,不⼀一定能合并出5个        80         1          544    544.0      0.4      tag_hot_answer_id = cacheutil.mccallcache(600)(pgrpc.backend.get_tag_hot_qa)(tag_name, count=20)        81         1        23148  23148.0     16.1      tag_hot_answer = contents.get_list(tag_hot_answer_id)        82                                               #tag_hot_qa_id = list(set([i.related_content_id for i in tag_hot_answer]))        93                                               #标签页侧边栏        94         1        63515  63515.0     44.1      side = _tag_side(tag_name)

Page 39: Python profiling

In Action    Filename: /home/dreampuf/hub/guokr/dev_guokr/nutshell/tag/views_tag.py       Line #    Mem usage    Increment   Line Contents    ================================================        50                             @profile.mem_profile        51                             def tag_index(req, tag_name):        55     26.09 MB      1.68 MB       if req.method != 'GET':        56                                     raise Http404

        73                                 #标签页问答动态        74     25.26 MB     -0.82 MB       question_list, page_next = qa_question_list(tag_name, TAG_ACTIVITY, page_len*(page_num-1), page_len)        75     26.09 MB      0.82 MB       activity_info_list = [qa_item(question_id) for question_id in question_list]               83                                 #有序且不重复        84     24.84 MB     -1.25 MB       tag_hot_qa_id = []        85     26.09 MB      1.25 MB       for i in tag_hot_answer:

Page 40: Python profiling

In Action    File: /home/dreampuf/hub/guokr/dev_guokr/nutshell/api/views_tag.py    Function: tag_logo at line 29    Total time: 0.32325 s       Line #      Hits         Time  Per Hit   % Time  Line Contents    ==============================================================        31                                           @profile.profile        32                                           def tag_logo(req, tag_name):        33         1           13     13.0      0.0      ukey = req.session.get('ukey')        36         1        32441  32441.0     10.0      tag_info = services.minerva.get_tag(tag=tag_name)               46         1        17087  17087.0      5.3      form = TagLogoForm(req.POST, req.FILES)        47         1          274    274.0      0.1      if not form.is_valid():        48                                                   return sessauth.api_error(2,u'参数格式错误')               51         1         8438   8438.0      2.6      data,request_id, filename = get_upload_file(req, FILE_UPLOAD_USER_MAXSIZE)               55         1            5      5.0      0.0      try:        56         1       104795 104795.0     32.4          gkimage = imageutil.GKImage(data).resize((48, 48))        57         1          395    395.0      0.1          new_data = gkimage.str_value()        58                                               except Exception, ex:        59                                                   print ex

Page 41: Python profiling

In Action    Filename: /home/dreampuf/hub/guokr/dev_guokr/nutshell/api/views_tag.py    Line #    Mem usage    Increment   Line Contents    ================================================        31                             @profile.mem_profile        32     25.62 MB      0.00 MB   def tag_logo(req, tag_name):        33     25.62 MB      0.01 MB       ukey = req.session.get('ukey')        46     26.31 MB      0.68 MB       form = TagLogoForm(req.POST, req.FILES)        47     26.31 MB      0.00 MB       if not form.is_valid():        48                                     return sessauth.api_error(2,u'参数格式错误')                           50     26.31 MB      0.00 MB       try:        51     26.95 MB      0.64 MB           data,request_id, filename = get_upload_file(req, FILE_UPLOAD_USER_MAXSIZE)        52                                 except UploadError,ex:        53                                     return sessauth.api_error(ex.errnum,ex.errmsg)        55     26.95 MB      0.00 MB       try:        56     27.11 MB      0.16 MB           gkimage = imageutil.GKImage(data).resize((48, 48))        57     27.11 MB      0.00 MB           new_data = gkimage.str_value()        58                                 except Exception, ex:        59                                     print ex        79     27.29 MB      0.18 MB       pgrpc.cache.kill('tag', tag_name)        87     26.66 MB     -0.64 MB       'context': tag_info['logo'],        89     26.68 MB      0.02 MB       services.juno.audit(**data)

Page 42: Python profiling

How to

定位问题,缓存  

优化算法,使用trick  

理清逻辑,缓存,分时计算  

Page 43: Python profiling

Conclusion

理解你的程序不要过度抽象使用合适的Entity

使用`__slots__`

避免重复引用不常见情况使用异常,避免对常见情况使用异常推崇函数式编程和迭代,使用装饰器和元类

Page 44: Python profiling

Reference- [究竟是什么使Linux滴答地响](http://article.yeeyan.org/view/20180/5371)- [The Python Profilers](http://docs.python.org/library/profile.html)- [Profiling threads](http://code.activestate.com/recipes/465831-profiling-threads/)- [High Performance Python tutorial](http://ianozsvald.com/HighPerformancePythonfromTrainingatEuroPython2011_v0.2.pdf)- [RunSnakeRun](http://www.vrplumber.com/programming/runsnakerun/) 图形性能分析工具- [用profile协助程序性能优化](http://blog.csdn.net/lanphaday/article/details/1483728)- [profile,cProfile, and stats - Performance analysis of Python programs](http://www.doughellmann.com/PyMOTW/profile/)- [Line Profiler](http://packages.python.org/line_profiler/)- [Memory Profiler](https://github.com/fabianp/memory_profiler)- [Objgraph](http://mg.pov.lt/objgraph/)- [A guide to analyzing Python performance](http://www.huyng.com/posts/python-performance-analysis/)- [性能分析](http://zh.wikipedia.org/wiki/性能分析)