Locality of Reference

Post on 21-Jan-2018


Locality of Reference

2017.10.28 NAGOYA.BIN #1

Kouji Matsui (@kekyo2)

Kouji Matsui - kekyo

• Nagoya city, Aichi pref., JP

• Twitter – @kekyo2 / Facebook

• ux-spiral corporation

• Microsoft Most Valuable Professional VS and DevTech 2015-

• Certified Scrum master / Scrum product owner

• Center CLR organizer.

• .NET/C#/F#/IL/metaprogramming and the like…

• Bike rider

Agenda

• Physical side scales

• Logical side scales

• Data stream between physicals and logicals

• Locality of reference

• Anti-locality of reference

• Conclusion

Physical side scales

[Figure: three processors (Processor #1 to #3), each containing four physical cores (Physical Core #1 to #12 in total), each physical core exposing four logical cores. Logical core numbering starts at #1, #17, and #33 for the three processors.]

• Memory and I/O are bound to a fixed CPU/core (not configurable).

• The “shared cache memory” is bound to a fixed CPU/core (not configurable).

• The “cache memory” is bound to a fixed CPU/core (not configurable).


Logical side scales

[Figure: five processes (Process #1 to #5), each with its own virtual memory space and its own group of threads. Thread numbering starts at #1, #11, #21, #31, and #41 for the five processes.]

[Figure: the thread groups of the five processes drawn above four logical cores (Logical Core #1 to #4).]

This is the true story: the execution context lives on the logical core.

The logical cores switch execution context between the threads.
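The switching described above can be sketched as a toy round-robin scheduler. This is an illustration only, not how a real OS scheduler works: the names `round_robin` and `count_switches`, the one-thread-per-slice policy, and the thread and core counts are all assumptions chosen to mirror the diagram (more runnable threads than logical cores).

```python
from collections import deque

def round_robin(thread_ids, core_count, slices):
    """Toy model: in each time slice, every logical core runs the next waiting thread.

    Returns one list per core holding the thread ids it ran, in order.
    Hypothetical policy for illustration; real schedulers also weigh
    priority, affinity, and blocking.
    """
    ready = deque(thread_ids)
    history = [[] for _ in range(core_count)]
    for _ in range(slices):
        running = []
        for core in range(core_count):
            if not ready:
                break  # fewer runnable threads than cores
            thread = ready.popleft()
            history[core].append(thread)
            running.append(thread)
        ready.extend(running)  # preempted threads go to the back of the queue
    return history

def count_switches(core_history):
    """Number of times a core replaced one thread's context with another's."""
    return sum(1 for a, b in zip(core_history, core_history[1:]) if a != b)

history = round_robin(thread_ids=range(1, 11), core_count=4, slices=5)
for core, ran in enumerate(history, start=1):
    print(f"Logical Core #{core}: ran threads {ran}, "
          f"{count_switches(ran)} context switches")
```

Each core ends up running a different thread in every slice, so whatever the previous thread left in that core's cache is of little use to the next one.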


Data stream between physicals and logicals

[Figure: the data path, built up in three steps. Threads run on Logical Core #1 to #4. Each logical core has its own L1/L2 cache (L1/L2 cache #1 to #4). The L1/L2 caches sit above L3 cache #1 and L3 cache #2. The L3 caches sit above NUMA node bound memory.]
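One way to reason about the hierarchy above (L1/L2, then L3, then NUMA node bound memory) is the standard average-access-time formula: every access pays the latency of each level it reaches, and only misses travel further out. The sketch below assumes that model. The function name `average_access_time` and every latency and hit rate in it are hypothetical values in arbitrary units, not measurements of any CPU.

```python
def average_access_time(levels, memory_latency):
    """Expected latency of one access through a cache hierarchy.

    levels: list of (hit_latency, hit_rate), nearest cache first.
    memory_latency: cost paid when every cache level misses.
    """
    expected = 0.0
    reach_probability = 1.0  # chance that an access gets as far as this level
    for hit_latency, hit_rate in levels:
        expected += reach_probability * hit_latency
        reach_probability *= (1.0 - hit_rate)
    return expected + reach_probability * memory_latency

# Hypothetical numbers, arbitrary units: (latency, hit rate) for L1/L2 and L3.
good_locality = average_access_time([(4, 0.9), (40, 0.5)], memory_latency=200)
poor_locality = average_access_time([(4, 0.5), (40, 0.5)], memory_latency=200)
print(f"good locality: {good_locality:.1f}, poor locality: {poor_locality:.1f}")
```

With these made-up inputs, lowering only the first-level hit rate multiplies the average cost several times over, which is the sense in which the miss penalty is "large".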

Locality of reference

[Figure: Logical Core #4 with L1/L2 cache #4, L3 cache #2, and NUMA node bound memory. The data of the Thread #33 context (declaredType, currentType, stopType, field, FieldInfo[]) is loaded/preloaded from memory toward the core.]

[Figure: after a switch, the data of the Thread #42 context (__stack0_0, __stack0_1, __stack0_2, __stack1_0, declaredType, currentType, local0, local1, field) is loaded/preloaded through the same caches.]

[Figure: Logical Core #3 (L1/L2 cache #3) and Logical Core #4 (L1/L2 cache #4) above L3 cache #2 and NUMA node bound memory. After a switch, the same data (declaredType, currentType, stopType, field, FieldInfo[]) is loaded/preloaded again and appears at several levels of the hierarchy.]
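The load/preload steps above can be made concrete with a toy cache model. A miss loads a whole cache line, so neighbouring addresses arrive together and the next access to them is a hit. The sketch below is a simulation, not a benchmark: `count_misses`, the line size, the line count, and the matrix size are all made-up illustrative values.

```python
from collections import OrderedDict

def count_misses(addresses, line_size=8, line_count=4):
    """Count misses in a tiny fully associative LRU cache.

    An address belongs to cache line `address // line_size`. The sizes
    are toy values, not those of any real CPU.
    """
    cache = OrderedDict()  # keys are line numbers, ordered oldest to newest
    misses = 0
    for address in addresses:
        line = address // line_size
        if line in cache:
            cache.move_to_end(line)  # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > line_count:
                cache.popitem(last=False)  # evict the least recently used line
    return misses

ROWS, COLS = 16, 16  # a 16x16 matrix stored row by row in one flat block

# Same 256 elements, visited in two different orders.
row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

print("row-major misses:   ", count_misses(row_major))
print("column-major misses:", count_misses(col_major))
```

Both loops touch exactly the same data. Only the order differs, and in this model the order alone decides whether most accesses hit or miss.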

Anti-locality of reference

[Figure: Logical Core #3 (L1/L2 cache #3) and Logical Core #4 (L1/L2 cache #4) above L3 cache #2 and NUMA node bound memory. Both cores load/preload the same common value into their own caches.]

These threads access a common value.

[Figure: same layout. Both cores write the common value back.]

Race condition (the cores pay a cache coherence penalty).

Strategy:

• Make the data immutable

• Use a hashed indexer
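The slide does not spell out what the hashed indexer looks like. One possible reading is a value split into slots, with each writer picking its slot by hash so that writers rarely touch the same location. The sketch below follows that reading; the class `ShardedCounter`, the helper `run`, and the per-slot locks are assumptions made for illustration. In CPython the interpreter lock hides real coherence traffic, so this shows the structure only, not the speedup.

```python
import threading

class ShardedCounter:
    """Counter split into slots so that writers rarely share a slot."""

    def __init__(self, shard_count):
        self._counts = [0] * shard_count
        self._locks = [threading.Lock() for _ in range(shard_count)]

    def increment(self, key):
        index = hash(key) % len(self._counts)  # the "hashed index"
        with self._locks[index]:
            self._counts[index] += 1

    def total(self):
        # Slots are combined only when a total is needed.
        # The result is exact once all writers have finished.
        return sum(self._counts)

def run(worker_count=4, increments=1000):
    """Start `worker_count` writers, each incrementing its own slot."""
    counter = ShardedCounter(shard_count=worker_count)

    def work(worker_id):
        for _ in range(increments):
            counter.increment(worker_id)

    threads = [threading.Thread(target=work, args=(i,))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()

print(run())
```

In a language with flat arrays, neighbouring slots could still land on one cache line, so a real implementation would likely also pad the slots apart.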


Conclusion

The execution context is bound not to the thread but to the CPU core: the CPU core is what executes code.

CPU cores have a structured, nested cache system.

The cache miss penalty is large.

The cache coherency penalty is large.

Both penalties apply to I/O systems too.

Important cache-related design principles:

◦ Locality of reference

◦ Immutability
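The immutability point can be sketched as follows: a shared value that is never written in place needs no write-back, so readers on different cores can keep their cached copies. The class `Settings` and its fields are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Settings:
    """Hypothetical shared value; frozen means fields cannot be reassigned."""
    name: str
    retries: int

shared = Settings(name="demo", retries=3)

# Readers can use `shared` freely: nobody modifies it in place.
# An update builds a new object instead of changing the existing one.
updated = replace(shared, retries=5)

try:
    shared.retries = 10  # an in-place write is rejected
except FrozenInstanceError:
    print("in-place write rejected")

print(shared, updated)
```

Publishing `updated` then means swapping a single reference, while existing readers keep working with the old, unchanged object.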

Thanks for joining!

My blog

◦ http://www.kekyo.net/

Current active project:

◦ IL2C - A translator implementation of .NET intermediate language to C language.

◦ YouTube: http://bit.ly/2xtu4MH

◦ GitHub: https://github.com/kekyo/IL2C