PyCon India 2012 Talk

21
Compe&&on Programming and Problem Solving using (of course) Python Dhruv Baldawa @dhruvbaldawa

Transcript of PyCon India 2012 Talk

Page 1: PyCon India 2012 Talk

Compe&&on  Programming  and  Problem  Solving  using    (of  course)  Python  

Dhruv  Baldawa  

@dhruvbaldawa  

Page 2: PyCon India 2012 Talk

About  Me  

•  B.E.  Computer  Science  from  University  of  Mumbai  

•  GSoCer  2011  and  2012  •  Na&onal  Cyber  Olympiad  Gold  medallist  2007  •  Currently  working  at  Enthought,  Inc.  

@dhruvbaldawa  

Page 3: PyCon India 2012 Talk

Why  is  solving  difficult  ?  

•  the  dataset  is  too  large  to  be  iterate  even  once  

•  the  complexity  is  huge  or  very  bad  •  or  BOTH  !!  

@dhruvbaldawa  

Page 4: PyCon India 2012 Talk

What  this  talk  covers  

•  Memory  Management  – Common  problems  – List  forms  v/s  Iterator  forms  –  itertools  

•  Time  Management  – Choosing  the  correct  data  structures  – Approach  

•  Profiling  &ps  

@dhruvbaldawa  

Page 5: PyCon India 2012 Talk

Memory  Management  

Why  is  it  important  ?  •  Most  of  the  common  compe&&ons  require  you  store  and  compute  large  amounts  of  data.  

•  Running  out  of  memory  limit  is  pre[y  common.  

•  There  is  “always”  a  way  to  improve  memory  consump&on  of  your  program.  

@dhruvbaldawa  

Page 6: PyCon India 2012 Talk

Common  Problems  

•  MemoryError[1]  – when  you  run  out  of  memory,  but  the  situa&on  can  s&ll  be  rescued  (by  dele&ng  certain  objects)  

•  OverflowError[1]  – when  the  result  of  an  arithme&c  opera&on  is  too  large  to  be  represented  

@dhruvbaldawa  

Page 7: PyCon India 2012 Talk

List  forms  v/s  Iterator  forms  

•  List  forms  – be[er  to  store  and  re-­‐use  results  of  computa&ons  – use  if  you  want  to  perform  list  opera&ons,  where  you  need  to  store  en&re  lists  

– consumes  much  more  memory  – only  prefer  when  a  real  list  cannot  be  avoided  – all  the  elements  are  generated/ini&alized  at  once    

@dhruvbaldawa  

Page 8: PyCon India 2012 Talk

List  forms  v/s  Iterator  forms  

•  Iterator  forms  –  lazy  and  on-­‐demand  genera&on  of  values  – very  low  memory  consump&on  – very  useful  when  you  only  need  to  work  with  the  value  at  hand  

– Cons:  •  cannot  step  backwards  •  cannot  skip  or  jump  forwards    

–  they  are  scalable  and  memory-­‐friendly  and  used  where  real  lists  are  not  required  

@dhruvbaldawa  

Page 9: PyCon India 2012 Talk

List  forms  v/s  Iterator  forms  

•  range  v/s  xrange[2]  •  list  comprehensions  v/s  generator  expressions  •  func&ons  v/s  generators  

@dhruvbaldawa  

Page 10: PyCon India 2012 Talk

import  itertools  

•  itertools  module  provides  a  set  of  fast,  memory  efficient  iterators  

•  provides  fast  implementa&ons  for  common  jobs  like  product,  permuta&ons,  combina&ons  

•  itertools  documenta&on  

@dhruvbaldawa  

Page 11: PyCon India 2012 Talk

import  itertools  

@dhruvbaldawa  

Page 12: PyCon India 2012 Talk

import  itertools  

@dhruvbaldawa  

Page 13: PyCon India 2012 Talk

Time  Op&miza&ons  

•  choosing  the  right  data  structure  •  choosing  the  right  approach  

@dhruvbaldawa  

Page 14: PyCon India 2012 Talk

choosing  the  right  data  structure  

“Bad  programmers  worry  about  the  code.  Good  programmers  worry  about  data  structures  and  their  rela9onships.”  

 

-­‐-­‐Linus  Torvalds  

@dhruvbaldawa  

Page 15: PyCon India 2012 Talk

Data  Structures  in  Python  

•  don’t  re-­‐invent  the  wheel.  Use  tuple,  list,  dict,  sets,  as  they  are  coded  in  C  and  hence  are  FAST  

•  for  membership  tests  use  dict/set  [O(1)]  instead  of  lists/tuple  [O(n)]  

•  use  collec&ons  •  Queue  opera&ons  like  pop(),  insert()  are  be[er  in  collec&ons.deque  [O(1)]  than  lists  [O(n)]  

•  use  bisect,  heapq  for  sorted  lists  

@dhruvbaldawa  

Page 16: PyCon India 2012 Talk

import  collec&ons[4]  

•  deque  •  Counter  •  OrderedDict  •  defaultdict  

@dhruvbaldawa  

Page 17: PyCon India 2012 Talk

Memoiza&on[5]  •  caching  results  from  previous  procedure  calls,  and  using  it  directly  

•  if  a  func&on  call  returns  the  same  value  when  the  same  set  of  arguments  are  passed,  then  it  can  be  memoized  !  

!def memoized_function(value):! if value in cache:! return cache[value]! else:!! ! !cache[value] = compute(value)!! ! !return cache[value]!

@dhruvbaldawa  

Page 18: PyCon India 2012 Talk

Collatz  Conjecture[6]  

f(n)  =  n/2;  if  n  is  even,                  =  3n+1;  if  n  is  odd    For  example:  f(4)  =  4  -­‐>  2  -­‐>  1  f(3)  =  3  -­‐>  10  -­‐>  5  -­‐>  16  -­‐>  8  -­‐>  4  -­‐>  2  -­‐>  1  

@dhruvbaldawa  

Page 19: PyCon India 2012 Talk

Mul&plica&on  problem  

•  Mul&ply  two  numbers  assuming  there  is  no  mul&plica&on  operator  available  

@dhruvbaldawa  

Page 20: PyCon India 2012 Talk

Profiling  

•  On  Unix  machines,  you  can  simply:  &me  python  my_script.py  

•  python  –m  cProfile  my_script.py  •  On  iPython,  you  can  use:  %&meit  my_expensive_func&on()  

@dhruvbaldawa  

Page 21: PyCon India 2012 Talk

References  1.  h[p://docs.python.org/library/excep&ons.html  2.  h[p://jus&ndailey.blogspot.in/2011/09/python-­‐range-­‐vs-­‐xrange.html  3.  h[p://en.wikipedia.org/wiki/List_comprehension  4.  h[p://docs.python.org/library/collec&ons.html  5.  h[p://en.wikipedia.org/wiki/Memoiza&on  6.  h[p://en.wikipedia.org/wiki/Collatz_conjecture  7.  h[p://wiki.python.org/moin/PythonSpeed/PerformanceTips  8.  h[p://wiki.python.org/moin/PythonSpeed  9.  h[p://www.udacity.com/wiki/CS215%20Unit%201%20Code?

course=cs215  

@dhruvbaldawa