TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

34
Paper Presentati 2011 Fall Distributed Information Processi TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems Jeongho Kim, Youngseok Son M.S. Student. Real-Time Operating Systems Laboratory Seoul National University

description

TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems. Jeongho Kim, Youngseok Son M.S. Student. Real-Time Operating Systems Laboratory Seoul National University. Introduction Techniques I mplementation: TreadMarks Evaluation. - PowerPoint PPT Presentation

Transcript of TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Page 1: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Jeongho Kim, Youngseok SonM.S. Student.

Real-Time Operating Systems Laboratory Seoul National University

Page 2: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

I. IntroductionII. TechniquesIII. Implementation: TreadMarksIV. Evaluation

ContentsTreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Page 3: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

The distributed shared memory (DSM) implements the shared memory model in distributed systems, which have no physical shared memory

The shared memory model provides a virtual address space shared between all processors

The high cost of communication in distributed systems is that DSM systems move data between processors

Shared memory

Distributed Shared MemoryI. Introduction

network

Processor 1

Mem 1

Processor 2

Mem 2

Processor N

Mem N

Page 4: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

OS dependency Kernel modifications

Poor performance ① Communication overhead② False sharing

Problems of Existing DSMI. Introduction

Page 5: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

① Communication overhead If the communication occurs whenever the x variable is changed, it

costs a lot of overhead.

Solution) The communication occurs, when the x variable is changed finally

Problems of Existing DSMI. Introduction

w(x) w(x) w(x)Process A

Process B

w(x) w(x) w(x)

page page page

page

Process A

Process B

t

t

t

t

Page 6: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

② False sharing A situation in which two or more processes access different variables

within a page.• If only one process is allowed to write to a page at a time, false sharing

leads to unnecessary communication, called the “ping-pong” effect.

Problems of Existing DSMI. Introduction

Page

ProcessCode using variable a

a bPage

ProcessCode using variable b

a b

① Access to write variable a

② Transfer the page

③ Access to read variable b ④ waitX

Page 7: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

User-level implementation Good performance

Reduction of communication overhead• Lazy release consistency• Invalidate-based protocol

Solving false sharing• multiple writer protocol

ObjectivesI. Introduction

Page 8: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

I. IntroductionII. TechniquesIII. Implementation: TreadMarksIV. Evaluation

ContentsTreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Page 9: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

1. Lazy release consistency2. Multiple-writer protocol

Related WorksII. Related Works

Page 10: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Release Consistency (RC) is a memory consistency model that permits a node to delay making its changes of shared data visible to other nodes until certain synchronization ac-cesses occur

Shared memory accesses Ordinary access Synchronization access

• acquire access• release access

Example

1. Lazy Release ConsistencyII. Related Works

w(y)1w(x)1 rel(L1)Process

acq(L1) t

L1: lock related to x, y

Page 11: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Two types of RC① Eager release consistency② Lazy release consistency

1. Lazy Release ConsistencyII. Related Works

Page 12: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

① Eager release consistency (ERC) postpones sending the modifications to the next release. Understanding ERC

• Release operation does not complete (is not performed) until the acknowl-edgements from all the processes are received.

• L1 is lock related to x, y and L2 is lock related to z

1. Lazy Release ConsistencyII. Related Works

Process A

w(y)1w(x)1

acq(L1)

w(z)1acq(L2)

r(x)0

z=1

z=1

r(x)0

x=1y=1

rel(L1)

apply changes

r(x)1

rel(L2)

apply changesr(z)0

r(y)1 r(z)1

r(z)1Process B

Process Capply changes

acq(L1)

t

t

t

apply changes

x=1y=1

Page 13: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

① Eager release consistency (ERC) postpones sending the modifications to the next release. Understanding ERC

• Release operation does not complete (is not performed) until the acknowl-edgements from all the processes are received.

• L1 is lock related to x, y and L2 is lock related to z

1. Lazy Release ConsistencyII. Related Works

w(y)1w(x)1

acq(L1)

w(z)1acq(L2)

r(x)0

z=1

z=1

r(x)0

x=1y=1

rel(L1)

apply changes

r(x)1

rel(L2)

apply changesr(z)0

r(y)1 r(z)1

r(z)1

apply changes

acq(L1)

t

t

t

apply changes

x=1y=1

Process A

Process B

Process C

Page 14: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

② Lazy release consistency (LRC) postpones sending of modi-fications until a remote processor actually needs them Understanding ERC

• It is guaranteed that the acquirer of the same lock sees the modification that precede the release in program order.

1. Lazy Release ConsistencyII. Related Works

Process A

Process B

rel(L1)w(x)1

t

r(x)0 r(x)0 acq(L1)r(x)0 r(x)1

Process Cr(x)0 r(x)0 acq(L2)r(x)0 r(x)0

How did process B know whether variable x has to be updated?1) Write notice2) Timestamp

Page 15: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

1. Lazy Release ConsistencyII. Related Works

② Lazy release consistency (LRC)1) Write Notice

• The releaser send write notice to all other processes

Process A

Process B

rel(L1)w(x)1

r(x)0 r(x)0 acq(L1)r(x)0 r(x)1write notice x=1

Page 16: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

1. Lazy Release ConsistencyII. Related Works

② Lazy release consistency (LRC)2) Timestamp

• Satisfying the happened-before relationship between all operations is enough to satisfy LRC

• The ordering is applied to process intervals.– Interval is a segment of time in the execution of a single process.– New interval begins each time a process executes a synchronization opera-

tion.

Processrel(L1) acq(L3)

1 2 3

Page 17: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

1. Lazy Release ConsistencyII. Related Works

② Lazy release consistency (LRC)2) Timestamp

• It is implemented as a vector timestamp

Process 1rel(L1)

Process 2

Process 2

acq(L1) rel(L1)

(1,0,0)

(1,1,0)

(1,0,1)

(1,2,0) (1,3,0)

(2,3,0)

(1,3,2)

Page 18: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Multiple-writer protocol is designed to solve false sharing It is possible that several processes make modifications to different

variables at the same page

Two techniques Twin

• It is a copied page of original page• It will be compared with a changed page.

Diff• diff is a difference between twin and ‘copyset’

2. Multiple-Writer ProtocolII. Related Works

X Y

Process2Process1

PageVariable Variable

Page 19: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Two techniques (cont’d) Twin

Diff

Note that twinning and diffing not only allows multiple independent writers but also significantly reduces the amount of data sent

2. Multiple-Writer ProtocolII. Related Works

write P twin

writable working copy

release:

diff

Page 20: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Solving communication overhead Lazy release consistency

Solving false sharing Multiple-writer protocol

Implementation Lazy release consistency + Multiple-writer protocol Eager release consistency + Multiple-writer protocol

SummaryII. Related Works

Page 21: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Eager release consistency + multiple-writer protocol

SummaryII. Related Works

t

P1

P2

rel(L1)w(x)

acq(L1)

P3

acq(L2)

rel(L2)

r(y)

w(y) inv(P)

inv(P)

xy

page Pr(x)

twin diff

twin diff

Page 22: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Lazy release consistency + multiple-writer protocol

SummaryII. Related Works

t

P1

P2

rel(L1)w(x)

acq(L1)

P3

acq(L2)

rel(L2)

r(x)

w(y) inv(P)

inv(P)

xy

page Pr(y)

twin diff

twin diff

Page 23: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

I. IntroductionII. TechniquesIII. Implementation: TreadMarksIV. Evaluation

ContentsTreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Page 24: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

1) Data structures2) Interval and Diff creation3) Access misses4) Garbage collection5) Etc

ImplementationIII. Implementation: TreadMarks

t

Page 25: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Lazy release consistency Write notice Timestamp

Multiple-writer protocol Diff

1) Data Structure (1/2)III. Implementation: TreadMarks

t

Page 26: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

1) Data Structure (2/2)III. Implementation: TreadMarks

t

CopysetPage State

Page Array1 2 3 4 ...

Write Notice Records

Write Notice Record 1

Write Notice Record 2

Write Notice Record 3

Processor Array1 2 3 4 ...

1 2 3 4 ... Process Array

Vector Timestamp

Diff Pool

Diff value is managed by diff poolfor efficient memory management

If there is a write notice record in a page,happen before relationship should be consideredthrough the vector time stamp

Page 27: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Interval craetion logically

• a new interval begins at each release and acquire In practice

• Interval creation can be postponed until we communicate with another process, avoiding overhead if a lock is reacquired by the same processor

Diff creation With lazy diff creation, these pages remain writable until a diff request

or a write notice arrives for that page• a subsequent write results in a write notice for the next interval

2) Interval and Diff creationIII. Implementation: TreadMarks

t

Page 28: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

When access miss occurs without write notice

• Initially setup that processor 0 has the page with write notice

1. Get the diffs from the write notice with small timestamp2. Create an actual diff which is correction of all diff related to the page3. The twin is discarded and the result is copied to copyset

3) Access MissesIII. Implementation: TreadMarks

t

Page 29: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Lock & barrier Statically assigned manager

Garbage collection It reclaim the space used by write notice records, interval records, and

diffs It is triggered when the free space drops below a threshold

Unix Aspects TreadMarks relies on Unix standard libraries

• Remote process creation, interprocessor communication, and memory management.

4) Etc.III. Implementation: TreadMarks

t

Page 30: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

I. IntroductionII. TechniquesIII. Implementation: TreadMarksIV. Evaluation

ContentsTreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems

Page 31: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Experimental Environment 8 DECstation-5000/240 Connected to a 100-Mbps ATM LAN and a 10-Mbps Ethernet

Applications Water – molecular dynamics simulation , 343 molecules for 5 steps Jacobi – Successive Over-Relaxation with a grid of 2000 by 1000 el-

ements TSP – branch & bound algorithm to solve the traveling salesman prob-

lem for a 19 cities Quicksort – sorts an array of 256K integers. Using bubblesort to sort

subarray of less than 1K element ILINK – genetic linkage analysis

Experimental EnvironmentlV. Evaluation

Page 32: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Message rate messages / sec

Speedup

ResultslV. Evaluation

Page 33: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

Diff creation rate diffs / rate

ResultslV. Evaluation

Page 34: TreadMarks : Distributed Shared Memory on Standard Workstations and Operating Systems

Paper Presentation2011 Fall Distributed Information Processing

End of Doc.