Post on 02-Jan-2016
Regularities Considered Harmful: Forcing Randomness to Memory Accesses to Reduce Row Buffer Conflicts for Multi-Core, Multi-Bank Systems
Embedded Lab. Park Yeongseong
ACM ASPLOS ’13, Heekwon Park, Computer Science Department, University of Pittsburgh
Contents
◦ Introduction
◦ Background
◦ Regularity Considered Harmful
◦ Design and Implementation
◦ Performance Evaluation
◦ Conclusions
◦ Q&A
Introduction
Recent computer architecture (multi-core)
A vast amount of main memory
Need to re-examine
◦ Internal policies, mechanisms
Rethinking the memory allocation issue
Background Problem
◦ Row buffer conflict
Approach
◦ Memory container
◦ Randomize memory access
< Conceptual memory organization >
Row-buffer conflict
◦ Precharging
◦ Activating operation
Cost of a conflict
◦ Delay
◦ Energy consumption
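The cost of a row-buffer conflict can be illustrated with a small latency model. This is a minimal sketch, not taken from the slides: the timing constants are assumed round numbers, and the `Bank` class and `total_latency` helper are hypothetical names introduced only for illustration. A hit pays for the column access alone, while a conflict pays for precharging the open row and activating the new one first.

```python
# Illustrative timing parameters in nanoseconds (assumed values, not from the slides).
T_CAS = 15  # column access on a row that is already open
T_RCD = 15  # activate: load a row into the row buffer
T_RP = 15   # precharge: close the currently open row


class Bank:
    """One DRAM bank with a single row buffer (hypothetical model)."""

    def __init__(self):
        self.open_row = None  # no row is open initially

    def access(self, row):
        """Return the latency of accessing `row` and update the row buffer."""
        if self.open_row == row:
            return T_CAS                    # row-buffer hit
        if self.open_row is None:
            latency = T_RCD + T_CAS         # empty buffer: activate only
        else:
            latency = T_RP + T_RCD + T_CAS  # conflict: precharge, then activate
        self.open_row = row
        return latency


def total_latency(rows):
    """Sum the latency of a sequence of row accesses to one fresh bank."""
    bank = Bank()
    return sum(bank.access(r) for r in rows)


same_row = total_latency([7] * 8)        # one activate, then seven hits
alternating = total_latency([7, 9] * 4)  # every access after the first conflicts
print(same_row, alternating)
```

Under these assumed timings, alternating between two rows of the same bank costs well over twice as much as staying on one row, which is the overhead the slides refer to.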
Background
< Row-buffer hit and conflict overhead >
< Conflict does not occur > < Conflict occurs>
Kernel-level memory allocator
◦ Mapping between virtual pages and physical page frames
Memory controller
◦ Banks
CPU cache mode
◦ Uncacheable
Access the variables numerous times
◦ The two variables are mutually dependent
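The probing idea above (repeatedly accessing two uncached variables and timing the result) can be mimicked in simulation. The sketch below is an assumption-heavy illustration, not the measurement code behind the slides: the address mapping in `bank_and_row`, the constants `NUM_BANKS` and `ROW_SIZE`, and the latencies are all hypothetical. It shows how pairs of addresses that fall in the same bank but different rows stand out by their longer elapsed time.

```python
NUM_BANKS = 8           # hypothetical number of banks
ROW_SIZE = 8192         # hypothetical bytes per row
HIT, CONFLICT = 15, 45  # assumed latencies in nanoseconds


def bank_and_row(addr):
    """Hypothetical mapping: consecutive row-sized chunks rotate across banks."""
    chunk = addr // ROW_SIZE
    return chunk % NUM_BANKS, chunk // NUM_BANKS


def probe(addr_a, addr_b, iterations=1000):
    """Simulated elapsed time of alternately accessing two uncached addresses.

    Simplification: the first access to an empty bank is charged as a conflict.
    """
    open_rows = {}  # bank index -> currently open row
    elapsed = 0
    for i in range(iterations):
        bank, row = bank_and_row(addr_a if i % 2 == 0 else addr_b)
        elapsed += HIT if open_rows.get(bank) == row else CONFLICT
        open_rows[bank] = row
    return elapsed


# Offsets (in rows) from address 0 whose probe time is closer to all-conflict
# than to all-hit; these are the addresses sharing a bank with address 0.
threshold = 1000 * (HIT + CONFLICT) // 2
conflicting = [k for k in range(1, 33) if probe(0, k * ROW_SIZE) > threshold]
print(conflicting)
```

On real hardware the mapping is unknown, which is why the timing has to be measured; here the mapping is fixed in advance so the slow offsets simply recover the `NUM_BANKS` stride.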
Memory Organization Analysis
Figure (d) ranges from 0 to 2,000,000 (roughly 128MB size)
Figure (c) zooms in on the 590,000 ~ 640,000 portion of Figure (d)
Figure (b) zooms in on a portion of the iterations of Figure (c)
Figure (a) zooms in on a portion of the iterations of Figure (b)
< Analysis result>
Regularity Considered Harmful
< Sequential access pattern >
Modified algorithm
◦ Place the two variables in the same cache line
◦ Different starting physical addresses
Average elapsed time
◦ 2052 μsec
< Random access pattern >
Average elapsed time
◦ 1925 μsec
Probability of accessing the same bank under random access
◦ “1/total number of banks”
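The “1/total number of banks” figure can be sanity-checked with a short Monte Carlo run. This is a sketch under the assumption that each access lands on a bank chosen uniformly and independently; the function name `same_bank_probability`, the bank count of 16, and the trial count are illustrative choices, not values from the slides.

```python
import random


def same_bank_probability(num_banks, trials=100_000, seed=0):
    """Estimate how often two uniformly random accesses hit the same bank."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    same = sum(
        rng.randrange(num_banks) == rng.randrange(num_banks)
        for _ in range(trials)
    )
    return same / trials


estimate = same_bank_probability(16)
print(estimate, 1 / 16)
```

The estimate should sit close to 1/16, and it shrinks as the number of banks grows, which is why spreading accesses randomly across many banks makes back-to-back accesses to one bank unlikely.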
Design and Implementation
< Comparison between buddy and randomized algorithm>
Individual page frame management
Downward search
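Individual page frame management with random selection can be sketched as follows. This is only an illustration of the general idea: the `RandomizedFrameAllocator` class, its flat free list, and the frame count are assumptions made here, and the sketch does not model the downward search or the actual kernel data structures used in the paper. Unlike a buddy allocator, which hands out contiguous blocks, each frame is tracked on its own and picked at random.

```python
import random


class RandomizedFrameAllocator:
    """Tracks free page frames individually and hands them out in random order.

    Hypothetical sketch: no double-free checking, no contiguous allocations.
    """

    def __init__(self, num_frames, seed=None):
        self._free = list(range(num_frames))
        self._rng = random.Random(seed)

    def alloc(self):
        """Return a randomly chosen free frame number."""
        if not self._free:
            raise MemoryError("no free page frames")
        i = self._rng.randrange(len(self._free))
        # Swap the chosen frame with the last one so removal is O(1).
        self._free[i], self._free[-1] = self._free[-1], self._free[i]
        return self._free.pop()

    def free(self, frame):
        """Return a frame to the free pool."""
        self._free.append(frame)

    def free_count(self):
        return len(self._free)


allocator = RandomizedFrameAllocator(num_frames=1024, seed=1)
frames = [allocator.alloc() for _ in range(8)]
print(frames)
```

Because successive allocations are scattered over the whole frame range, consecutive virtual pages no longer map to a regular stride of physical frames.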
Experiment environment
◦ IBM x3650 M2 Server
◦ Intel Xeon X5570 quad-core processors
◦ 32GB DDR3 Memory
◦ 450GB SAS Disk 8
◦ Linux kernel version 2.6.32
Performance Evaluation
Benchmark categories
◦ Group 1: Memory-intensive benchmarks
  Stream, Sysbench-memory, Ramspeed
◦ Group 2: CPU- or I/O-intensive benchmarks
  Kernel Compile, Dbench, Unixbench
◦ Group 3: Benchmarks representing diverse application domains
  PARSEC
< Memory intensive benchmark results >
< CPU or I/O intensive benchmark results >
Conclusions
Memory container: kernel-level memory allocator
◦ Multi-core, multi-bank systems
Dedicate multiple banks to a core
◦ Maximize memory parallelism
Randomizing memory allocation algorithm
◦ Reduces accesses to the same bank
References
◦ http://people.cs.pitt.edu/~parkhk/publications.html
◦ Lee Sang-yeop, "Performance Analysis by Memory Reference Pattern on Multi-Core, Multi-Bank Systems," master's thesis (in Korean)