Kandroid for nhn_deview_20131013_v5_final
High Performance
Android App. Development
양정수 (Jeongsoo Yang)
yangjeongsoo at gmail.com
www.kandroid.org
CONTENT
1. The History of Android Performance Features
2. Performance Comparison
of the Three Programming Models
3. Case Study :
Performance Features of the Google Chrome Browser
4. Questionnaire :
Multi-Core vs. GPU,
Android vs. Chrome,
and Beyond Android
Let’s go back to last summer.
“Why will Android always lag behind iOS?”
If you are a manufacturer,
how would you solve this problem?
What was OOOOOOO’s Approach?
1. The Optimization of SoC, Android Platform,
and Built-in Apps
2. Belief that this was Apple’s approach to
success
The history of Android Performance Features
• Android alpha (at least two internal releases)
• Android beta (5 Nov. 2007) – SDK : Java VM vs. Dalvik VM
• Android first commercial release (23 Sep. 2008)
• AOSP (21 Oct. 2008) – Zygote : Preload & Prelink (ASLR)
• Cupcake (27 Apr. 2009) – NDK & Stable native API
• Froyo (20 May 2010) – JIT (Just-in-time compilation)
• Gingerbread (6 Dec. 2010) – StrictMode & NativeActivity
• Honeycomb (22 Feb. 2011) – GPUI, SMP, and RenderScript
• JellyBean (27 Jun. 2012) – Project Butter (Jank Buster)
Source : http://www.youtube.com/watch?v=V5E5revikUU
Fast & Smooth - Jelly Bean and Project Butter
Facebook Android Development :
A Scrolling Performance Story - Dec 5, 2012
Their Android Performance Challenges
• Why does Android stutter more?
• Measure Improvement
• Garbage Collection
• Memory
• View Optimization
• Main Thread
• User Perception
Source : http://velocity.oreilly.com.cn/2012/ppts/Facebook-Android-Performance-OReilly-Velocity-Beijing-Dec-2012.pdf
The World of ListView
Tuesday, June 1, 2010, Google I/O
[Bar chart: FPS (axis 0 to 60) for three adapter implementations: Dumb, Recycling views, ViewHolder]
Source : https://dl.google.com/googleio/2010/android-world-of-listview-android.pdf
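The three adapter implementations named in the chart can be sketched in plain Java. This is only an illustration of which costs each one avoids, not the code from the talk: FakeView, Strategy, and simulate() are hypothetical stand-ins (FakeView plays the role of android.view.View), and the simulation assumes a single recyclable row.

```java
public final class ListViewStrategies {

    /** Hypothetical stand-in for android.view.View that counts costly calls. */
    public static final class FakeView {
        static int inflations = 0; // stands in for LayoutInflater.inflate()
        static int lookups = 0;    // stands in for View.findViewById()
        private final StringBuilder text = new StringBuilder();
        private Object tag;

        static FakeView inflate() { inflations++; return new FakeView(); }
        StringBuilder findText() { lookups++; return text; }
        void setTag(Object tag) { this.tag = tag; }
        Object getTag() { return tag; }
    }

    /** Same shape as Adapter.getView(position, convertView, parent), minus the parent. */
    public interface Strategy {
        FakeView getView(int position, FakeView convertView);
    }

    private static void bind(StringBuilder text, int position) {
        text.setLength(0);
        text.append("row ").append(position);
    }

    /** "Dumb": inflate a new row and look up its child on every call. */
    public static FakeView dumb(int position, FakeView convertView) {
        FakeView row = FakeView.inflate();
        bind(row.findText(), position);
        return row;
    }

    /** "Recycling views": reuse convertView, but still look up the child each time. */
    public static FakeView recycling(int position, FakeView convertView) {
        FakeView row = (convertView != null) ? convertView : FakeView.inflate();
        bind(row.findText(), position);
        return row;
    }

    /** "ViewHolder": reuse convertView and cache the child lookup in its tag. */
    public static FakeView viewHolder(int position, FakeView convertView) {
        FakeView row = convertView;
        if (row == null) {
            row = FakeView.inflate();
            row.setTag(row.findText());
        }
        bind((StringBuilder) row.getTag(), position);
        return row;
    }

    /** Scrolls through the given number of rows; returns {inflations, lookups}. */
    public static int[] simulate(Strategy strategy, int rows) {
        FakeView.inflations = 0;
        FakeView.lookups = 0;
        FakeView convertView = null;
        for (int position = 0; position < rows; position++) {
            convertView = strategy.getView(position, convertView);
        }
        return new int[] {FakeView.inflations, FakeView.lookups};
    }
}
```

Calling simulate(ListViewStrategies::viewHolder, 100) performs one inflation and one lookup, while the dumb version performs one hundred of each.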
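[Diagram: AdapterView update pipeline. Labels: Event (AdapterView), Invalidate, Adapter getView() (Dumb / Recycle / ViewHolder), Measurement, Layout, Draw, Bitmap Decoding, Network I/O, Storage, Async Drawable. Two parts of the pipeline are marked A and B.]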
• If an AdapterView has many children,
part A is as important as part B.
• If the AdapterView has few children,
part B becomes a bottleneck.
• Bitmap decoding is responsible.
Memory and Performance
March 29, 2013, 11th Kandroid Conference
Source : http://www.kandroid.org/board/data/board/conference/file_in_body/1/511th_kandroidconf_memory_and_performance.pdf
What lessons can we learn from history?
1. Always Measure
• Before beginning optimization, ensure the
problem requires solving.
2. Terminology used for Android Performance Features
• Bitmap Allocation
• Layer, DisplayList, DisplayList Property
• Input Latency
• FPS, VSync
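"Always measure" can start with something as small as the sketch below, which turns a list of frame timestamps into an average FPS and a count of frames that overran a 60 Hz budget. The class and method names are hypothetical, and the 60 Hz refresh rate is an assumption; on Android the timestamps could, for example, be collected from Choreographer.FrameCallback.doFrame(long frameTimeNanos).

```java
public final class FrameStats {

    /** Frame budget at an assumed 60 Hz refresh rate, in nanoseconds (about 16.67 ms). */
    static final long FRAME_BUDGET_NANOS = 1_000_000_000L / 60;

    /** Average frames per second over the whole recording. */
    public static double averageFps(long[] frameTimesNanos) {
        if (frameTimesNanos.length < 2) {
            return 0.0;
        }
        long elapsed = frameTimesNanos[frameTimesNanos.length - 1] - frameTimesNanos[0];
        if (elapsed <= 0) {
            return 0.0;
        }
        // N timestamps delimit N - 1 frames.
        return (frameTimesNanos.length - 1) * 1e9 / elapsed;
    }

    /** Number of frames whose duration exceeded the frame budget. */
    public static int countSlowFrames(long[] frameTimesNanos) {
        int slow = 0;
        for (int i = 1; i < frameTimesNanos.length; i++) {
            if (frameTimesNanos[i] - frameTimesNanos[i - 1] > FRAME_BUDGET_NANOS) {
                slow++;
            }
        }
        return slow;
    }
}
```

Recording both numbers before and after a change shows whether an optimization actually removed slow frames or only moved the average.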
Comparison and Analysis of
the Three Programming Models
in Google Android (2012, Intel)
• What are the Three Programming Models ?
• Working Flow Comparison
• Execution Model Comparison
• Performance Difference and Analysis
• Differences in Development and Deployment
• Conclusion & Unified Programming Model
Source : http://people.apache.org/~xli/papers/applc2012-android-programming-models.pdf
[Timeline figure, 2008 to 2014: SDK (API Level), AOSP Branch, NDK (Revision), and RenderScript releases by year, including android.support.v8.renderscript]
What are the Three Programming Models?
Experiment setup : Balls
SDK : Workflow & Execution Model
[Workflow diagram: source program -> Java compiler (JDK) -> bytecode (class file) -> dx utility (Android SDK) -> bytecode (dex file) -> Dalvik VM (Android Runtime), which takes the input and produces the output]
public static long sumArray(int[] arr) {
    long sum = 0;
    for (int i : arr) {
        sum += i;
    }
    return sum;
}
Java bytecode (class file):
000b: iload 05
000d: iload 04
000f: if_icmpge 0024
0012: aload_3
0013: iload 05
0015: iaload
0016: istore 06
0018: lload_1
0019: iload 06
001b: i2l
001c: ladd
001d: lstore_1
001e: iinc 05, #+01
0021: goto 000b
Dalvik bytecode (dex file):
0007: if-ge v0, v2, 0010
0009: aget v1, v8, v0
000b: int-to-long v5, v1
000c: add-long/2addr v3, v5
000d: add-int/lit8 v0, v0, #int 1
000f: goto 0007
SDK : Workflow & Execution Model
[Diagram: ActivityThread runs a Looper over a MessageQueue; messages are handled by H.handleMessage() and ViewRootImpl.handleMessage(). Other diagram labels: TLS, Queue, Thread Pool, execute().]
AsyncTask
• onPreExecute()
• onProgressUpdate()
• onPostExecute()
• doInBackground()
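The execution model above (one thread draining a message queue, with background work handed to a thread pool and results posted back) can be sketched with java.util.concurrent alone. MiniLooper and all of its methods are hypothetical names chosen for illustration; this is not how Android's Looper, Handler, or AsyncTask are implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

public final class MiniLooper {
    private static final Runnable QUIT = () -> { };
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final ExecutorService workers = Executors.newFixedThreadPool(2);

    /** Enqueue a message; it runs later on the thread that called loop(). */
    public void post(Runnable message) {
        queue.add(message);
    }

    public void quit() {
        queue.add(QUIT);
    }

    /** Run the supplier on a worker thread, then deliver its result on the looper thread. */
    public <T> void runInBackground(Supplier<T> doInBackground, Consumer<T> onPostExecute) {
        workers.execute(() -> {
            T result = doInBackground.get();
            post(() -> onPostExecute.accept(result));
        });
    }

    /** Drain the queue on the calling thread until quit() is processed. */
    public void loop() {
        try {
            while (true) {
                Runnable message = queue.take();
                if (message == QUIT) {
                    break;
                }
                message.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.shutdown();
        }
    }

    /** The log is only touched by messages, so it needs no locking. */
    public static List<String> demo() {
        MiniLooper looper = new MiniLooper();
        List<String> log = new ArrayList<>();
        looper.post(() -> log.add("onPreExecute"));
        looper.<Integer>runInBackground(
                () -> 6 * 7,
                result -> {
                    log.add("onPostExecute:" + result);
                    looper.quit();
                });
        looper.loop();
        return log;
    }
}
```

The design point is that only the looper thread touches UI-like state; worker threads communicate solely by posting messages back to the queue.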
[Diagram 1: the Android stack (Applications, Libraries, Runtime, Linux Kernel). The HelloAndroid application (Activity, Service, Receiver, Provider, View) runs on ActivityThread with a Looper, a MessageQueue, H.handleMessage(), and ViewRoot.handleMessage(); a custom implementation reaches a custom library through JNI. Libraries: Surface Manager, OpenGL|ES, SGL, Media Framework, FreeType, SSL, SQLite, WebKit, Libc. Runtime: Dalvik Virtual Machine, Core Libraries.]
[Diagram 2: the same stack for NativeActivity: ActivityThread with the Looper, MessageQueue, H.handleMessage(), and ViewRoot.handleMessage(), plus a custom library reached through JNI.]
NDK : Workflow & Execution Model
[Diagram: helloworld.rs (native layer, RenderScript, C99) is processed by llvm-rs-cc into helloworld.bc (native layer, LLVM code) and the reflected layer, ScriptC_helloworld.java and ScriptField_xxxxxxxxx.java, which sits next to the app Java sources.]
RenderScript : Workflow & Execution Model
[Diagram: the APK file holds helloworld.bc (native layer, LLVM code), ScriptC_helloworld.java (reflected layer), and the app Java sources (framework layer). The Java sources go through the Dalvik JIT compiler; the .bc files and the system lib. go through libbcc, an LLVM-based JIT compiler. Targets: multicore CPUs and GPUs/DSPs.]
RenderScript : Workflow & Execution Model
Performance Analysis : Multiple Worker Thread
[Chart: series SDK, SDK-MT, NDK, and NDK-MT; one axis runs from 0 to 70, the other from 200 to 1000]
Source : http://people.apache.org/~xli/papers/applc2012-android-programming-models.pdf
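As a rough idea of what a multi-threaded (MT) variant involves on the SDK side, the sketch below splits a per-frame ball update across worker threads. It is an assumption-laden illustration, not the benchmark code from the paper: BallsMt, step(), and stepParallel() are hypothetical names, and the one-dimensional bounce is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class BallsMt {

    /** Advance balls [from, to) by dt and bounce them off the walls at 0 and 1. */
    public static void step(float[] x, float[] vx, float dt, int from, int to) {
        for (int i = from; i < to; i++) {
            x[i] += vx[i] * dt;
            if (x[i] < 0f || x[i] > 1f) {
                vx[i] = -vx[i];
            }
        }
    }

    /** Same update, with the ball array partitioned into one slice per worker thread. */
    public static void stepParallel(float[] x, float[] vx, float dt, int threads) {
        // A real app would create the pool once and reuse it for every frame.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            int chunk = (x.length + threads - 1) / threads;
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(x.length, t * chunk);
                final int to = Math.min(x.length, from + chunk);
                tasks.add(() -> {
                    step(x, vx, dt, from, to);
                    return null;
                });
            }
            pool.invokeAll(tasks); // blocks until every slice has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each slice touches a disjoint index range, the workers need no locking; the only synchronization point is the wait at the end of the frame.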
Performance Analysis : Runtime design diff.
[Chart: series SDK-MT, NDK-MT, and Renderscript; one axis runs from 0 to 70, the other from 200 to 1000]
Source : http://people.apache.org/~xli/papers/applc2012-android-programming-models.pdf
Current Issues & Unified Programming Model
Class            Sub Class                       SDK  NDK  RS   NDK + RS
Programmability  Memory Management               O    X    X    X
                 Library Extensibility           O    O    X    O
                 Portability                     O    △    O    O
Security         Strong Typing and Verification  O    X    X    X
Performance      Vector Type                     X    △    O    O
                 Thread Pool                     O    △    O    O
                 OpenGLES                        O    O    X    O
[Slide callouts on the table: 1, 2, 3, 3]
1. Is it possible to support vector type without changing the JNI implementation?
2. Why did Google make RS separate from NDK?
3. Are Memory Management and Strong Typing critical issues?
Hooray! We done the GDK building system.
This building system is independent on NDK.
It builds what is should (bitcodes), and then transfer the control
to NDK building system, doing the remaining building.
This code is still ugly. Need cleanup.
gdk git commit id : edde771d8940a6f1b00fd68bcca1486b575e6d9e
Author : Nowar Gu
GDK : What is GDK?
Current Issues & Unified Programming Model
LOCAL_PATH := $(call my-dir)
include $(CLEAR_VARS)
LOCAL_MODULE := hello_llvm
LOCAL_CFLAGS := -D NUM=7788
LOCAL_SRC_FILES := hello_llvm.c test.cpp
LOCAL_C_INCLUDES := jni/test-include
include $(BUILD_BITCODE)
include $(CLEAR_VARS)
LOCAL_MODULE := test2
LOCAL_SRC_FILES := test2.c
include $(BUILD_BITCODE)
Android-portable.mk
$ cd ~/android-4.1.1_r1/gdk/samples/hello-llvm/jni
$ export OUT=~/android-4.1.1_r1/out/target/product/generic
$ ../../../gdk-build --ndk-root=~/android-ndk-r8b
GDK : hello-llvm Sample Code
Build Command
$ ../../../gdk-build --ndk-root=/home/jsyang/android-ndk-r8b
Compile Bitcode : hello_llvm <= hello_llvm.c
Compile++ Bitcode: hello_llvm <= test.cpp
BitcodeLibrary : libhello_llvm.bc
Install BCLib : libhello_llvm.bc => res/raw/libhello_llvm.bc
Compile Bitcode : test2 <= test2.c
BitcodeLibrary : libtest2.bc
Install BCLib : libtest2.bc => res/raw/libtest2.bc
Gdbserver : [arm-linux-androideabi-4.6] libs/armeabi/gdbserver
Gdbsetup : libs/armeabi/gdb.setup
Gdbserver : [arm-linux-androideabi-4.6] libs/armeabi-v7a/gdbserver
Gdbsetup : libs/armeabi-v7a/gdb.setup
Compile thumb : hello_llvm <= hello_llvm.c
Compile++ thumb : hello_llvm <= test.cpp
StaticLibrary : libstdc++.a
SharedLibrary : libhello_llvm.so
Install : libhello_llvm.so => libs/armeabi/libhello_llvm.so
Compile thumb : hello_llvm <= hello_llvm.c
Compile++ thumb : hello_llvm <= test.cpp
StaticLibrary : libstdc++.a
SharedLibrary : libhello_llvm.so
Install : libhello_llvm.so => libs/armeabi-v7a/libhello_llvm.so
Google I/O 2013 : Game Development Env.
Game Services
In Practice
An Introduction to
Play Game Services
Google I/O 2012 Keynote
Android vs. Chrome
Google I/O 2013 Keynote
Sundar Pichai (SVP, Android, Chrome & Apps)
Performance Features
of the Google Chrome Browser
• Why did Google Develop Chrome?
• Chrome’s Multi-process Architecture on Android
• Chrome’s Hardware Acceleration on Android
• Chrome’s Networking on Android
• Improved VSync scheduling for Chrome on Android
Why did Google Develop Chrome?
[Diagram labels: Chromium, Chromium OS; safer, more stable, faster]
• Chrome browser : uses multiple processes !
• Separate threads for separate web apps
• Separate address spaces for separate web apps
• Sandbox the web app’s process
• Out-of-Process iframes
• blink : fast rendering engine, small footprint
• v8 : optimized JS engine, many opportunities
Chrome’s Multi-process Architecture on Android
Source : https://sites.google.com/a/chromium.org/dev/developers/design-documents/multi-process-architecture
Chrome’s Multi-process Architecture on Android
[Diagram: Browser Process (Main Thread (UI), I/O Thread) and Render Process (Main Thread, Render Thread), connected by IPC]
android:process=":sandboxed_process0" android:isolatedProcess="true"
android:process=":privileged_process2" android:isolatedProcess="false"
android:isolatedProcess
If set to true, this service will run under a special process that is isolated from the rest of
the system and has no permissions of its own. The only communication with it is
through the Service API (binding and starting).
Chrome’s Hardware Acceleration on Android
Source : http://www.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome
Software Rendering Architecture
[Diagram: Render Process (WebKit / Skia) and Browser Process, connected by IPC; a bitmap is passed through shared memory and the Browser Process owns the HWND]
Chrome’s Hardware Acceleration on Android
Source : http://www.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome
Compositing with the GPU process
[Diagram: Render Process (WebKit / Skia, Compositor, Compositor Context), GPU Process (server), and Browser Process, connected by IPC; GL/D3D commands plus bitmaps and arrays are passed through shared memory, and output goes to the HWND]
Chrome’s Hardware Acceleration on Android
Compositing with the GPU process
USER     PID    PPID   NAME
u0_a94   24458  125    org.chromium.content_shell_apk
u0_a94   24462  24458  GC
u0_a94   24463  24458  Signal Catcher
u0_a94   24464  24458  JDWP
u0_a94   24465  24458  Compiler
u0_a94   24466  24458  ReferenceQueueD
u0_a94   24467  24458  FinalizerDaemon
u0_a94   24468  24458  FinalizerWatchd
u0_a94   24469  24458  Binder_1
u0_a94   24470  24458  Binder_2
u0_a94   24472  24458  AsyncTask #1
...
u0_a94   24486  24458  ntent_shell_apk
...
u0_i50   24526  125    org.chromium.content_shell_apk:sandboxed_process1
Chrome’s Hardware Acceleration on Android
Compositing with the GPU Thread
[Diagram: the Browser Process hosts the GPU Thread (server); labels: Compositor Context, GL/D3D]
Chrome’s Hardware Acceleration on Android
systrace, chrome://gpu
Improved vsync scheduling for Chrome on Android - Author: skyostil@
• Motivation
• Improved vsync scheduling
• Vsync notification message
• Triggering vsync based on input
• Case studies
• Conclusion
Source : https://docs.google.com/a/chromium.org/document/d/16822du6DLKDZ1vQVNWI3gDVYoSqCSezgEmWZ0arvkP8/edit
Improved VSync Scheduling on Android
[Diagram: two timelines, "Drawing without VSync" and "Drawing with VSync", each with Display, GPU, and CPU rows across four VSync intervals, with frames numbered 0 to 4]
Google I/O 2012 : For Butter or Worse - VSync
Source : http://commondatastorage.googleapis.com/io2012/presentations/live%20to%20website/109.pdf
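The difference between the two timelines comes down to when frame work starts relative to the next VSync. The sketch below computes the next VSync boundary and checks whether a piece of work would overrun it. VsyncClock and its methods are hypothetical names, the code assumes the current time is not earlier than the last VSync, and it is not taken from Android or Chrome.

```java
public final class VsyncClock {

    /** VSync period at an assumed 60 Hz display, in nanoseconds. */
    public static final long PERIOD_60HZ_NANOS = 1_000_000_000L / 60;

    /** First VSync strictly after nowNanos, given the time of a previous VSync. */
    public static long nextVsync(long nowNanos, long lastVsyncNanos, long periodNanos) {
        long elapsed = nowNanos - lastVsyncNanos;
        long periods = elapsed / periodNanos + 1;
        return lastVsyncNanos + periods * periodNanos;
    }

    /** True if work started at startNanos would still be running at the next VSync. */
    public static boolean missesDeadline(long startNanos, long workNanos,
                                         long lastVsyncNanos, long periodNanos) {
        long deadline = nextVsync(startNanos, lastVsyncNanos, periodNanos);
        return startNanos + workNanos > deadline;
    }
}
```

With a period of 10 units, work that takes 3 units fits when it starts on the VSync (time 0) but misses the deadline when it starts at time 8, which is the situation VSync-aligned drawing is meant to avoid.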
Improved VSync Scheduling on Android
Old Architecture
[Diagram: Browser Process and Render Process timelines, showing the system VSync and the internal timer’s tick]
Improved VSync Scheduling on Android
Old Architecture
[Diagram: the same timelines, with ~3.2ms annotated between the system VSync and the internal timer’s tick]
Improved VSync Scheduling on Android
Old Architecture and Problems
Source : https://docs.google.com/a/chromium.org/document/d/16822du6DLKDZ1vQVNWI3gDVYoSqCSezgEmWZ0arvkP8/edit
Improved VSync Scheduling on Android
New Architecture
[Diagram: Browser Process and Render Process timelines, showing the system VSync, the internal timer’s tick, and a ~3.2ms annotation]
Improved VSync Scheduling on Android
New Architecture : Improvement and Problem
Source : https://docs.google.com/a/chromium.org/document/d/16822du6DLKDZ1vQVNWI3gDVYoSqCSezgEmWZ0arvkP8/edit
Improved VSync Scheduling on Android
New Architecture : Improvement and Limit
Conclusion
This document describes improvements to vsync scheduling which allows
Chrome on Android to generally respond to scroll gestures within a single
vsync interval. These improvements apply to regular page scrolling, while
lowering the latency of main thread and JavaScript-driven updates are left as
future work.
Source : https://docs.google.com/a/chromium.org/document/d/16822du6DLKDZ1vQVNWI3gDVYoSqCSezgEmWZ0arvkP8/edit
What lessons can we learn from Chrome?
1. On a GUI system, the scheduler for input and drawing
is the most important component.
2. If you review Chrome technology from this perspective,
you can find valuable documents.
3. One recommended document:
https://docs.google.com/document/d/1LUFA8MDpJcDHE0_L2EHvrcwqOMJhzl5dqb0AlBSqHOY/edit
Multi-Core vs. GPU
[Diagram: SoC blocks ISP, GPU, CPU, and DSP around LPDDR3 memory; annotations: ~12GB/s, 70GFLOPs, ~20GFLOPs]
Dark Silicon and the End of Multicore Scaling http://falsedoor.com/doc/ISCA11.pdf
GreenDroid: An Architecture for the Dark Silicon Age http://darksilicon.org/papers/taylor-aspdac-final-2012.pdf
Android vs. Chrome
We have two options: The first is maintaining the status
quo, the other is merging the two platforms.
Status Quo
• Android : Phone, Tablet, Google TV, Car
• Chromium : Chrome Browser, Chrome OS
Merged State
• Android-centric Merger
• Chrome-centric Merger
• Alternate Merger
Android vs. Chrome
For an Android-centric merger to succeed,
we must answer one question:
“Is it possible to build Chrome
via the Android infrastructure?”
For a Chrome-centric merger to succeed,
we must answer a different question:
“Is it possible to run Android apps in Chrome?”
Do you see any alternative approaches?
Beyond Android
With the current market saturation, is it possible to create
a new platform?
• B2G, Tizen, LG Web OS, Ubuntu Mobile
The successful development of a new platform could lead
to advances. Nevertheless, it would be just one more in
an already saturated market. So, we must streamline the
market and focus on cross-platform compatibility with
the status quo.
• Web-based (e.g. PhoneGap)
• Native-based (e.g. Cocos2d-x)
• VM-based (e.g. Mono)
Thank you for listening.
Please feel free to ask any questions.