The World Leader in High Performance Signal Processing Solutions SMP Implementing on Blackfin BF561 Graf Yang ( 杨明明 ) Oct 18, 2008

The World Leader in High Performance Signal Processing Solutions

SMP Implementing on Blackfin BF561

Graf Yang (杨明明), Oct 18, 2008

Agenda

• BF561 architecture
• Cache coherency solution
• Interrupt dispatch
• SMP status and applications
• SMP performance
• Limitations

BF561 architecture

BF561 architecture (cont.)

Block diagram

BF561 architecture (cont.)

Memory architecture

• L1 runs at core speed
  • L1 scratchpad SRAM: 4K
  • L1 instruction cache: 16K
  • L1 instruction SRAM: 16K
  • L1 data cache: 32K
  • L1 data SRAM: 32K

• L2 runs at 1/2 core speed
  • Data or instruction SRAM: 128K
  • Shared by CoreA/B
  • Cacheable (caching disabled in SMP)

BF561 architecture (cont.)

Compare to x86

Feature                      BF561                   x86
Cache coherency              N/A                     Cache coherency protocols
Atomic instruction           N/A                     LOCK# signal, LOCK instruction prefix
Local interrupt controller   CEC                     LAPIC
System interrupt controller  SIC (SICA, SICB)        IOAPIC
Local timer                  Core timer              LAPIC timer/TSC
Peripheral timer             General purpose timer   HPET/8254 PIT
Inter-processor interrupt    SICB                    LAPIC

BF561 architecture (cont.)

How to boot CoreB

Core   Booting method                                     Starting address          Setting
CoreA  Execute from 16-bit external memory (bypass)       0x2000 0000 (BANK0)       BMODE[1:0]#=00
CoreA  Boot from 8/16-bit external flash memory           0xEF00 0000 (BOOTROM)     BMODE[1:0]#=01
CoreA  Boot from SPI serial EEPROM (16-bit addressable)   0xEF00 0000 (BOOTROM)     BMODE[1:0]#=11
CoreB  Execute from L1 instruction memory                 0xFF60 0000 (L1 I-SRAM)   SICA_SYSCR[5]
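The CoreB entry in the boot table can be illustrated with a small simulation. This is only a sketch: the register is modelled as a plain variable, the macro and function names are hypothetical, and the polarity (bit 5 set holds CoreB, clearing it lets CoreB start at the L1 I-SRAM address) is an assumption that the slide does not spell out. CoreB's startup code is assumed to have been copied into L1 instruction memory beforehand.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 5 of SICA_SYSCR, as named in the boot table. The macro name is
 * hypothetical. */
#define SKETCH_COREB_HOLD_BIT  (1u << 5)
/* CoreB start address from the boot table (L1 I-SRAM). */
#define SKETCH_COREB_START     0xFF600000u

/* Simulated register, not the real memory-mapped one. Assumption: the
 * hold bit is set after reset, so CoreB idles until CoreA releases it. */
static uint32_t sim_sica_syscr = SKETCH_COREB_HOLD_BIT;

static int coreb_is_held(void)
{
    return (sim_sica_syscr & SKETCH_COREB_HOLD_BIT) != 0;
}

/* CoreA calls this once CoreB's code is in place. Returns the address
 * at which CoreB is expected to begin executing. */
static uint32_t release_coreb(void)
{
    assert(coreb_is_held());
    sim_sica_syscr &= ~SKETCH_COREB_HOLD_BIT;
    return SKETCH_COREB_START;
}
```

On real hardware the read-modify-write would target the memory-mapped register and would normally be bracketed by whatever synchronization the platform requires.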

Cache coherency solution

• Why cache coherence
  • Jiffies, spin locks, semaphores, mutexes, ...

Cache coherency solution (cont.)

• Cache policy
  • Main memory: write-through
  • Shared on-chip SRAM (L2 SRAM): not cacheable

• Global lock: protects atomic data
  • A special spin lock that resides in the shared on-chip SRAM (L2 SRAM)
  • Operating functions: _get_core_lock/_put_core_lock
  • Parameter: address of the atomic data

• Spin lock: based on the global lock
  • Invalidates the entire data cache if the same lock has been acquired by another CPU

• Atomic ops: based on the global lock
  • Protects the atomic operations

• Memory barrier
  • Invalidates the entire data cache
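The layering described on this slide can be sketched in portable C. This is an illustrative model, not the kernel's code: the global lock is a C11 atomic flag rather than a word in uncached L2 SRAM, the cache invalidate is a counter, and every name prefixed with sketch_ (along with the last_owner field) is hypothetical. The rule "invalidate when another CPU held the lock last" is one reading of the slide's wording.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the single global lock kept in uncached L2 SRAM. */
static atomic_flag sketch_core_lock = ATOMIC_FLAG_INIT;
/* Counts simulated whole-data-cache invalidations. */
static unsigned long sketch_dcache_invalidations;

/* addr mirrors the slide's "address of atomic data" parameter; this
 * model does not use it. */
static void sketch_get_core_lock(const volatile void *addr)
{
    (void)addr;
    while (atomic_flag_test_and_set_explicit(&sketch_core_lock,
                                             memory_order_acquire))
        ;  /* spin until the other core releases the global lock */
}

static void sketch_put_core_lock(const volatile void *addr)
{
    (void)addr;
    atomic_flag_clear_explicit(&sketch_core_lock, memory_order_release);
}

static void sketch_invalidate_dcache(void)
{
    sketch_dcache_invalidations++;  /* placeholder for the real operation */
}

/* Spin lock built on the global lock. last_owner is the CPU that took
 * the lock most recently, or -1 if it has never been taken. */
struct sketch_spinlock {
    int locked;
    int last_owner;
};

static void sketch_spin_lock(struct sketch_spinlock *l, int cpu)
{
    for (;;) {
        sketch_get_core_lock(l);
        if (!l->locked) {
            int prev = l->last_owner;
            l->locked = 1;
            l->last_owner = cpu;
            sketch_put_core_lock(l);
            /* Data guarded by the lock may be stale in this core's
             * cache if the other core held the lock last. */
            if (prev != -1 && prev != cpu)
                sketch_invalidate_dcache();
            return;
        }
        sketch_put_core_lock(l);  /* held elsewhere: retry */
    }
}

static void sketch_spin_unlock(struct sketch_spinlock *l)
{
    sketch_get_core_lock(l);
    assert(l->locked);
    l->locked = 0;
    sketch_put_core_lock(l);
}

/* Atomic add: the global lock makes the read-modify-write indivisible. */
static int sketch_atomic_add(volatile int *v, int delta)
{
    sketch_get_core_lock(v);
    int result = (*v += delta);
    sketch_put_core_lock(v);
    return result;
}
```

In this model a lock that bounces between the two cores triggers one full invalidate per hand-over, while a lock retaken by the same core costs nothing extra.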

Interrupt dispatch

• Peripheral interrupts trigger both cores

• Two kinds of IRQ handlers

Interrupt dispatch (cont.)

• Time monotonicity problem
  • Using the two core timers makes time non-monotonic
  • Using a gptimer with handle_simple_irq() causes CoreB to become stuck

• Solution
  • Use general purpose timer0 instead of the core timers
  • Use handle_percpu_irq() instead of handle_simple_irq()
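One plausible reading of the CoreB problem is that a flow handler with a shared "in progress" guard drops the timer event on whichever core arrives second, whereas a per-CPU flow handler consults no shared state. The model below illustrates that difference only. It is a simplified sketch with hypothetical names and does not reproduce the kernel's handle_simple_irq() or handle_percpu_irq(); the explanation itself is an assumption, not something the slides state.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 2

/* Simplified, hypothetical IRQ descriptor: one in-progress flag shared
 * by both cores, with a tick counter per CPU. */
struct sketch_irq_desc {
    bool in_progress;
    unsigned long ticks[SKETCH_NR_CPUS];
};

/* Shared-guard flow: returns false (event dropped) when the other core
 * is still inside the handler. */
static bool simple_flow_enter(struct sketch_irq_desc *d, int cpu)
{
    if (d->in_progress)
        return false;
    d->in_progress = true;
    d->ticks[cpu]++;
    return true;
}

static void simple_flow_exit(struct sketch_irq_desc *d)
{
    assert(d->in_progress);
    d->in_progress = false;
}

/* Per-CPU flow: no shared state is consulted, so each core always
 * services its own copy of the timer interrupt. */
static void percpu_flow(struct sketch_irq_desc *d, int cpu)
{
    d->ticks[cpu]++;
}
```

With the shared guard, a core that repeatedly loses the race never advances its tick count; with the per-CPU flow both counters advance on every timer event.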

Interrupt dispatch (cont.)

• Inter-processor interrupt: SICB_SYSCR
  • Writing 1 to CA_supplement_int0 triggers an interrupt to CoreA
  • Writing 1 to CB_supplement_int0 triggers an interrupt to CoreB
  • The interrupt handler writes 1 to the relevant bit to clear the interrupt request

• Inter-processor interrupt implementation
  • Per-cpu message queue
  • Per-cpu spin lock
  • Per-cpu interrupt
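The three per-cpu pieces fit together roughly as follows. This is a minimal sketch under stated assumptions: the register is a simulated variable, the bit positions are invented for illustration (the slides do not give the SICB_SYSCR layout), messages are plain integers, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NR_CPUS   2
#define SKETCH_QUEUE_LEN 8
/* Hypothetical request bit for each core; not the real register layout. */
#define SKETCH_IPI_BIT(cpu) (1u << (cpu))

/* Simulated interrupt-request bits standing in for SICB_SYSCR. */
static _Atomic uint32_t sim_ipi_reg;

/* One message queue per core, each guarded by its own spin lock. */
struct sketch_ipi_queue {
    atomic_flag lock;
    unsigned head, tail;
    int msgs[SKETCH_QUEUE_LEN];
};

static struct sketch_ipi_queue sketch_queues[SKETCH_NR_CPUS] = {
    { ATOMIC_FLAG_INIT, 0, 0, {0} },
    { ATOMIC_FLAG_INIT, 0, 0, {0} },
};

/* Queue a message for the target core, then raise its interrupt.
 * Returns 0 on success, -1 if the target's queue is full. */
static int sketch_send_ipi(int target, int msg)
{
    struct sketch_ipi_queue *q = &sketch_queues[target];
    int ok = -1;

    while (atomic_flag_test_and_set(&q->lock))
        ;  /* spin on the per-cpu lock */
    if (q->tail - q->head < SKETCH_QUEUE_LEN) {
        q->msgs[q->tail % SKETCH_QUEUE_LEN] = msg;
        q->tail++;
        ok = 0;
    }
    atomic_flag_clear(&q->lock);

    if (ok == 0)
        atomic_fetch_or(&sim_ipi_reg, SKETCH_IPI_BIT(target));
    return ok;
}

/* Runs on the receiving core: acknowledge the request, then drain up to
 * max pending messages into out. Returns the number drained. */
static int sketch_ipi_handler(int cpu, int *out, int max)
{
    struct sketch_ipi_queue *q = &sketch_queues[cpu];
    int n = 0;

    /* Models the handler writing 1 to the bit to clear the request. */
    atomic_fetch_and(&sim_ipi_reg, ~SKETCH_IPI_BIT(cpu));

    while (atomic_flag_test_and_set(&q->lock))
        ;
    while (q->head != q->tail && n < max) {
        out[n++] = q->msgs[q->head % SKETCH_QUEUE_LEN];
        q->head++;
    }
    atomic_flag_clear(&q->lock);
    return n;
}
```

Acknowledging the request before draining means a message queued during the drain raises the bit again instead of being lost.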

SMP status and applications

• SMP status
  • 2008R1.5: svn://sources.blackfin.uclinux.org/svn/uclinux-dist/branches/2008R1/bfin_patch/smp_patch/
  • Trunk: svn://sources.blackfin.uclinux.org/svn/uclinux-dist/trunk/bfin_patch/smp_patch/

• Applications: multi-tasking
  • Video encoder/decoder: codec1 on CoreB, codec2 on CoreA
  • VoIP: codec on CoreB, network stack on CoreA

SMP performance

• Whetstone test result
  • Test software: Whetstone
  • Test hardware: BF561, core clock 600 MHz, system clock 100 MHz
  • Test environment 1: UP
  • Test environment 2: SMP

Command line             UP    SMP
whetstone                15s   15s
whetstone ; whetstone    30s   30s
whetstone & whetstone    30s   20s

• Performance analysis
  • The entire data cache was invalidated 79130 times during the Whetstone test
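The table's figures can be turned into a speedup and a parallel efficiency. The helper names below are hypothetical. With two concurrent Whetstone runs taking 30 s on UP and 20 s on SMP, the speedup is 30/20 = 1.5 against an ideal of 2 for two cores, an efficiency of 0.75. The slides list the cache invalidations under performance analysis, presumably as one contributor to that gap.

```c
#include <assert.h>

/* Speedup of the SMP kernel over the UP kernel for the same workload. */
static double sketch_speedup(double t_up, double t_smp)
{
    assert(t_smp > 0.0);
    return t_up / t_smp;
}

/* Fraction of the ideal N-core speedup that was actually achieved. */
static double sketch_efficiency(double t_up, double t_smp, int cores)
{
    assert(cores > 0);
    return sketch_speedup(t_up, t_smp) / cores;
}
```

For the single-process rows (15 s on both kernels, 30 s for the sequential pair) the speedup is 1.0, as expected when only one core has work at a time.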

Limitations

• Storing routines in L1 I-SRAM
• Storing shared data in L1 D-SRAM
• User multi-threads running on different cores

Questions?

The End Thank you!