Robotics - PKU · 2020. 2. 9.
Robotics
13
AI Slides (6e) c© Lin Zuoquan@PKU 1998-2020 13 1
13 Robotics∗
13.1 Robots
13.2 Computer vision
13.3 Motion planning
13.4 Controller
Robots
Robots are physical agents that perform tasks by manipulating the physical world
Wide application: Industry, Agriculture, Transportation, Health, Environments, Exploration, Personal Services, Entertainment, Human augmentation and so on ⇐ Robotic age ⇒ Intelligent Robots
Types of robots
• Manipulators: physically anchored to their workplace
e.g., factory assembly line, the International Space Station
• Mobile robots: move about their environment
– Unmanned ground vehicles (UGVs), e.g., the planetary rover (on Mars), intelligent vehicles
– Unmanned air vehicles (UAVs), i.e., drones
– Autonomous underwater vehicles (AUVs)
– Autonomous fight unit
• Mobile manipulators: combine mobility with manipulation
– Humanoid robots: mimic the human torso
Other: prosthetic devices (e.g., artificial limbs), intelligent environments (e.g., houses equipped with sensors and effectors), multibody systems (swarms of small cooperating robots)
Hardware
A diverse set of robot hardware comes from interdisciplinary technologies
– Processors (controllers)
– Sensors
– Effectors
– – Manipulators
Sensors
Passive sensors or active sensors
– Range finders: sonar (land, underwater), laser range finder, radar (aircraft), tactile sensors, GPS
– Imaging sensors: cameras (visual, infrared)
– Proprioceptive sensors: shaft decoders (joints, wheels), inertial sensors, force sensors, torque sensors
Manipulators
[Figure: manipulator arm with joints labeled R (revolute) and P (prismatic)]
Configuration of robot specified by 6 numbers ⇒ 6 degrees of freedom (DOF)
6 is the minimum number required to position the end-effector arbitrarily.
For dynamical systems, add velocity for each DOF.
Non-holonomic robots
[Figure: car with configuration (x, y, θ)]
A car has more DOF (3) than controls (2), so it is non-holonomic; it cannot generally transition between two infinitesimally close configurations
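A minimal sketch of the point above, assuming a simple kinematic model that is not taken from the slides: the configuration is (x, y, θ), but the two controls (forward speed `v` and turning rate `omega`) can only move the car along its current heading. The function name `step` and the time step `dt` are illustrative.

```python
import math

def step(config, v, omega, dt=0.1):
    """Advance a car-like robot by one time step.

    config = (x, y, theta): 3 degrees of freedom.
    Controls (v, omega): forward speed and turning rate, only 2 controls.
    """
    x, y, theta = config
    # The velocity is always aligned with the heading: no sideways motion.
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return (x, y, theta)

# Heading along the x-axis: whatever controls are chosen, y does not change
# within a single step, so the nearby configuration (0, 0.01, 0) cannot be
# reached directly.
start = (0.0, 0.0, 0.0)
after = step(start, v=1.0, omega=0.5)
```

Reaching a sideways configuration takes a sequence of maneuvers (as in parallel parking), which is what makes planning for such robots harder.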
Software
Pipeline architecture: execute multiple processes in parallel
– sensor interface layer
– perception layer
– planning and control layer
– vehicle interface layer
Example: A robot car
[Figure: software architecture of a robot car, with modules grouped into SENSOR INTERFACE (laser, camera, radar, GPS, IMU, and wheel velocity interfaces; RDDF database), PERCEPTION (UKF pose estimation; laser, vision, and radar mappers; road finder; surface assessment), PLANNING & CONTROL (top level control, path planner, steering control, throttle/brake control), VEHICLE INTERFACE (Touareg interface, power server interface), USER INTERFACE (touch screen UI, wireless E-Stop), and GLOBAL SERVICES (process controller, health monitor, data logger, file system, inter-process communication (IPC) server, time server)]
Computer vision
[Figure: vision pipeline from LOW-LEVEL VISION to HIGH-LEVEL VISION. Representations: image sequence; edges, regions, features, disparity, optical flow; depth map; objects, scenes, behaviors. Processes: filters, edge detection, matching, segmentation; shape from (s.f.) contour, motion, shading, stereo; object recognition, tracking, data association]
Vision requires combining multiple cues and commonsense knowledge
Visual recognition
Computer vision
– visual recognition
– – image classification ⇐ (deep) learning
– – – object detection, segmentation, image captioning, etc.
⇐ 2D→3D
Deep learning has become an important tool for computer vision
e.g., CNNs, such as PixelCNN
E.g., ImageNet: large scale visual recognition challenge
Images
195 209 221 235 249 251 254 255 250 241 247 248
210 236 249 254 255 254 225 226 212 204 236 211
164 172 180 192 241 251 255 255 255 255 235 190
167 164 171 170 179 189 208 244 254 234
162 167 166 169 169 170 176 185 196 232 249 254
153 157 160 162 169 170 168 169 171 176 185 218
126 135 143 147 156 157 160 166 167 171 168 170
103 107 118 125 133 145 151 156 158 159 163 164
095 095 097 101 115 124 132 142 117 122 124 161
093 093 093 093 095 099 105 118 125 135 143 119
093 093 093 093 093 093 095 097 101 109 119 132
095 093 093 093 093 093 093 093 093 093 093 119
255 251
I(x, y, t) is the intensity at (x, y) at time t
CCD camera ≈ 1,000,000 pixels; human eyes ≈ 240,000,000 pixels, i.e., 0.25 terabits/sec
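As a small illustration of I(x, y, t), the sketch below stores a grayscale image as rows of 8-bit intensities and looks one value up. The convention that x is the column and y is the row, and the helper name `intensity`, are assumptions made for this example.

```python
# A tiny grayscale image stored row by row; values are 8-bit intensities
# (0 = black, 255 = white). The numbers are the first three rows of the
# pixel grid shown on the slide, truncated to four columns.
image = [
    [195, 209, 221, 235],
    [210, 236, 249, 254],
    [164, 172, 180, 192],
]

def intensity(frames, x, y, t):
    """Return I(x, y, t): intensity at column x, row y of frame t."""
    return frames[t][y][x]

frames = [image]                            # a one-frame "video"
value = intensity(frames, x=2, y=1, t=0)    # 249
```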
Image classification
CNNs achieve state-of-the-art results
Perception
Perception: the process of mapping sensor measurements into internal representations of the environment
– sensors: noisy
– environment: partially observable, unpredictable, dynamic
• HMMs/DBNs can represent the transition and sensor models of a partially observable environment
• DNNs can recognize various objects in visual data
– the best internal representation is not known
– unsupervised learning to learn sensor and motion models from data
Perception
Stimulus (percept) S, World W
S = g(W)
E.g., g = “graphics”. Can we do vision as inverse graphics?
W = g⁻¹(S)
Problem: massive ambiguity
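One way to see the ambiguity, sketched under the assumption that g is a simple pinhole projection (the slides do not specify g): different worlds can produce exactly the same stimulus, so g has no unique inverse. The function name `project` and the focal length `f` are illustrative.

```python
def project(point, f=1.0):
    """Pinhole projection g: 3D world point (X, Y, Z) -> 2D image point."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

# Two different worlds: a point nearby and a point twice as far away
# along the same viewing ray.
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)

# Both produce the same stimulus, so W cannot be recovered from S alone.
same_stimulus = project(near) == project(far)
```

Prior knowledge about the world is what lets a vision system choose among the many worlds consistent with one image.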
Localization
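Compute current location and orientation (pose) given observations (DBN)
[Figure: dynamic Bayesian network over poses X_{t−1}, X_t, X_{t+1}, actions A_{t−2}, A_{t−1}, A_t, and observations Z_{t−1}, Z_t, Z_{t+1}]
• Treat localization as a regression problem
• Can be done in DNNs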
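The DBN view of localization can be sketched as a discrete Bayes filter that alternates a motion update (action A) with a sensor update (observation Z) over a belief on the pose X. This is a minimal sketch on a hypothetical five-cell circular corridor; the motion and sensor probabilities are made-up values, not taken from the slides.

```python
def predict(belief, move, p_correct=0.8):
    """Motion update on a circular 1D world: shift belief by `move` cells.

    With probability p_correct the robot moves as commanded; otherwise it
    stays in place (a deliberately crude motion model).
    """
    n = len(belief)
    return [p_correct * belief[(i - move) % n] + (1 - p_correct) * belief[i]
            for i in range(n)]

def update(belief, world, z, p_hit=0.9):
    """Sensor update: weight each cell by P(z | cell), then normalize."""
    weighted = [b * (p_hit if world[i] == z else 1 - p_hit)
                for i, b in enumerate(belief)]
    total = sum(weighted)
    return [w / total for w in weighted]

world = ["door", "wall", "wall", "door", "wall"]
belief = [0.2] * 5                      # uniform prior over the pose

belief = update(belief, world, "door")  # observe a door
belief = predict(belief, move=1)        # action: move right one cell
belief = update(belief, world, "wall")  # observe a wall

best = max(range(len(belief)), key=lambda i: belief[i])
```

After these three steps the belief concentrates on the cells just to the right of a door, and the remaining probability mass reflects sensor and motion noise.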
Mapping
Localization: given map and observed landmarks, update pose distribution
Mapping: given pose and observed landmarks, update map distribution
SLAM (simultaneous localization and mapping): given observed landmarks, update pose and map distribution
Probabilistic formulation of SLAM: add landmark locations L1, . . . , Lk to the state vector, proceed as for localization
Detection
Edge detection in image ⇐ discontinuities in scene
1) depth
2) surface orientation
3) reflectance (surface markings)
4) illumination (shadows, etc.)
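A minimal sketch of edge detection as thresholding intensity differences between neighboring pixels. The function name `horizontal_edges`, the threshold, and the sample values are illustrative; practical detectors smooth the image first and use gradients in both directions.

```python
def horizontal_edges(image, threshold=50):
    """Mark positions where intensity jumps sharply between horizontal neighbors."""
    edges = []
    for row in image:
        edges.append([abs(row[x + 1] - row[x]) > threshold
                      for x in range(len(row) - 1)])
    return edges

# A dark region (about 95) next to a bright region (about 250).
image = [
    [93, 95, 97, 250, 251],
    [93, 93, 95, 249, 254],
]
edges = horizontal_edges(image)
# edges[r][x] is True where the jump between columns x and x + 1 is large.
```

The detector only reports that a discontinuity exists. It cannot say which of the four scene causes listed above produced it.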
Object detection
Single object: classification + localization
e.g., single-stage object detectors: YOLO/SSD/RetinaNet
Multiple objects: each image needs a different number of outputs
– apply a CNN to many different crops of the image; the CNN classifies each crop as object or background
e.g., lots of object detectors
3D object detection: harder than 2D
e.g., simple camera model
Object detection + Captioning = Dense captioning
Objects + relationships = Scene graphs
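The "many crops" idea for multiple objects can be sketched as a sliding window. This is a minimal sketch in which `toy_classifier` is a hypothetical stand-in for a CNN; the window size, stride, and brightness rule are assumptions made for the example.

```python
def sliding_windows(width, height, size, stride):
    """Yield (x, y, size, size) boxes covering the image."""
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            yield (x, y, size, size)

def detect(image, classify, size=2, stride=1):
    """Run `classify` on every crop; keep boxes not labeled background."""
    h, w = len(image), len(image[0])
    detections = []
    for (x, y, bw, bh) in sliding_windows(w, h, size, stride):
        crop = [row[x:x + bw] for row in image[y:y + bh]]
        label = classify(crop)
        if label != "background":
            detections.append(((x, y, bw, bh), label))
    return detections

# Hypothetical stand-in for a CNN: a crop is an "object" if all pixels are bright.
def toy_classifier(crop):
    return "object" if all(p > 200 for row in crop for p in row) else "background"

image = [
    [10, 10, 10, 10],
    [10, 255, 255, 10],
    [10, 255, 255, 10],
    [10, 10, 10, 10],
]
detections = detect(image, toy_classifier)
```

The number of detections varies with the image content, which is why multi-object detection cannot use a fixed-size output.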
Prior knowledge
Shape from . . .    Assumes
motion              rigid bodies, continuous motion
stereo              solid, contiguous, non-repeating bodies
texture             uniform texture
shading             uniform reflectance
contour             minimum curvature
Object recognition
Simple idea
– extract 3D shapes from 2D image
Problems
– extracting curved surfaces from image
– improper segmentation, occlusion
– unknown illumination, shadows, noise, complexity, etc.
Approaches
– machine learning (deep learning) methods based on image statistics
Segmentation
Segmentation: is all needed
– Semantic segmentation: no objects, just pixels
label each pixel in the image with a category label; don't differentiate instances, only care about pixels
e.g., CNN
– Instance segmentation + object detection → multiple objects
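Semantic segmentation can be sketched as an independent per-pixel choice of the best-scoring category. The class names and scores below are hypothetical; in practice the scores would come from a CNN.

```python
def semantic_segmentation(scores, classes):
    """Pick the highest-scoring class independently for every pixel.

    scores[y][x] is a list of per-class scores, e.g. the output of a CNN.
    """
    return [[classes[max(range(len(classes)), key=lambda c: px[c])]
             for px in row] for row in scores]

classes = ["sky", "cow"]
# Hypothetical scores for a 1x3 image containing two adjacent cows.
scores = [[[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]]
labels = semantic_segmentation(scores, classes)
```

Both cow pixels receive the same label, so the output cannot tell the two animals apart. Separating them is the job of instance segmentation.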
Motion Planning
Path planning: find a path from one configuration to another
– various path planning algorithms in discrete spaces
– path plan vs. task plan
Continuous spaces: plan in configuration space defined by the robot’s DOFs
[Figure: three arm configurations (conf-1, conf-2, conf-3) shown in the workspace and as points in configuration space, with labels s and e]
Solution is a point trajectory in free C-space
Configuration space planning
Basic problem: ∞^d states! Convert to finite state space
Cell decomposition:
divide up space into simple cells, each of which can be traversed “easily” (e.g., convex)
becomes a discrete graph search problem
Skeletonization:
reduce the free space to a one-dimensional representation
identify a finite number of easily connected points/lines that form a graph s.t. any two points are connected by a path on the graph
Cell decomposition example
2-DOF robot arm
[Figure: the arm in workspace coordinates and the corresponding configuration space, each with start and goal marked]
Problem: may be no path in pure free-space (white area) cells
Solution: recursive decomposition of mixed (free+obstacle) cells
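Once the space is divided into cells, planning becomes graph search. The sketch below assumes a uniform grid (rather than the recursive decomposition of mixed cells mentioned above) and uses breadth-first search; the grid and the function name `grid_path` are illustrative.

```python
from collections import deque

def grid_path(grid, start, goal):
    """Breadth-first search over free cells (0 = free, 1 = obstacle).

    Cells are (row, col); moves are 4-connected. Returns a list of cells
    from start to goal, or None if the free cells do not connect them.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = grid_path(grid, start=(0, 0), goal=(2, 0))
```

If a coarse grid marks a narrow passage as blocked, the search returns None even though a path exists. Subdividing mixed cells addresses that case.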
Skeletonization: Voronoi diagram
Voronoi diagram: locus of points equidistant from obstacles
Problem: doesn’t scale well to higher dimensions
Skeletonization: probabilistic roadmap
A probabilistic roadmap is generated by sampling random points in C-space and keeping those in free space; create a graph by joining pairs with straight lines
Problem: need to generate enough points to ensure that every start/goal pair is connected through the graph
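A minimal sketch of a probabilistic roadmap in a 2D unit-square C-space. The circular obstacles, the connection radius, and the sampled collision check along each straight line are all assumptions made for this example, not details from the slides.

```python
import math
import random
from collections import deque

def is_free(p, obstacles):
    """A point is free if it lies outside every circular obstacle."""
    return all(math.dist(p, c) > r for c, r in obstacles)

def segment_free(p, q, obstacles, steps=20):
    """Approximate check: sample points along the straight line p-q."""
    return all(is_free((p[0] + (q[0] - p[0]) * i / steps,
                        p[1] + (q[1] - p[1]) * i / steps), obstacles)
               for i in range(steps + 1))

def build_roadmap(n, obstacles, radius, extra=(), seed=0):
    """Sample n random points in the unit square, keep the free ones,
    and join nearby pairs by collision-free straight lines."""
    rng = random.Random(seed)
    samples = [(rng.random(), rng.random()) for _ in range(n)]
    nodes = list(extra) + [p for p in samples if is_free(p, obstacles)]
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if (math.dist(nodes[i], nodes[j]) <= radius
                    and segment_free(nodes[i], nodes[j], obstacles)):
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def connected(edges, a, b):
    """Breadth-first search: is node b reachable from node a?"""
    seen, queue = {a}, deque([a])
    while queue:
        u = queue.popleft()
        if u == b:
            return True
        for v in edges[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

obstacles = [((0.5, 0.5), 0.2)]          # one disc in the middle
start, goal = (0.05, 0.05), (0.95, 0.95)
nodes, edges = build_roadmap(200, obstacles, radius=0.25, extra=(start, goal))
reachable = connected(edges, 0, 1)       # start is node 0, goal is node 1
```

Whether `reachable` is true depends on the number of samples and the radius, which is the problem stated above: too few points can leave start and goal in separate components.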
Controller
Can view the motor control problem as a search problem in the dynamic rather than kinematic state space:
– state space defined by x1, x2, . . . , ẋ1, ẋ2, . . .
– continuous, high-dimensional
Deterministic control: many problems are exactly solvable, esp. if linear, low-dimensional, exactly known, observable
Simple regulatory control laws are effective for specified motions
Stochastic optimal control: very few problems exactly solvable ⇒ approximate/adaptive methods
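As one example of a simple regulatory control law, the sketch below uses a proportional-derivative (PD) controller to drive a unit point mass to a target. The choice of PD control, the gains `kp` and `kd`, and the integration step are assumptions made for this example; the slides do not name a specific law.

```python
def simulate_pd(target, kp=20.0, kd=8.0, dt=0.01, steps=1000):
    """Drive a unit point mass to `target` with a PD control law.

    The dynamic state is (x, v): position and velocity.
    Control law: force = kp * (target - x) - kd * v.
    """
    x, v = 0.0, 0.0
    for _ in range(steps):
        force = kp * (target - x) - kd * v
        v += force * dt          # unit mass: acceleration equals force
        x += v * dt              # semi-implicit Euler integration
    return x, v

x, v = simulate_pd(target=1.0)
```

The derivative term damps the motion. With `kd = 0` the same loop would oscillate around the target instead of settling on it.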
Example
Suppose a controller has “intended” control parameters θ0 which are corrupted by noise, giving θ drawn from P_θ0
Output (e.g., distance from target) y = F(θ)
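A minimal sketch of this setup, under loud assumptions: F is a hypothetical two-parameter function that is zero whenever θ1·θ2 = 1, and P_θ0 is taken to be independent Gaussian noise added to each intended parameter. Neither choice comes from the slides. The sketch estimates the expected squared error by Monte Carlo sampling for two intended settings that are both exact without noise.

```python
import random

def F(theta):
    """Hypothetical output: signed distance from target, zero when t1*t2 == 1."""
    t1, t2 = theta
    return t1 * t2 - 1.0

def expected_sq_error(theta0, sigma=0.05, n=10000, seed=0):
    """Monte Carlo estimate of E[F(theta)^2] with theta drawn from P_theta0.

    P_theta0 is assumed here to be Gaussian noise of std `sigma` added
    independently to each intended parameter.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        theta = [t + rng.gauss(0.0, sigma) for t in theta0]
        total += F(theta) ** 2
    return total / n

# Both intended settings hit the target exactly when there is no noise...
balanced, lopsided = (1.0, 1.0), (4.0, 0.25)
# ...but they differ in how much the noise hurts.
err_balanced = expected_sq_error(balanced)
err_lopsided = expected_sq_error(lopsided)
```

In this toy model the balanced setting has the smaller expected error, so noise singles out one choice among many settings that are equally good in the noise-free case.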
Biological motor control
Motor control systems are characterized by massive redundancy
Infinitely many trajectories achieve any given task
E.g., 3-link arm moving in plane throwing at a target
simple 12-parameter controller, one degree of freedom at target
11-dimensional continuous space of optimal controllers
Idea: if the arm is noisy, only “one” optimal policy minimizes error at target
I.e., noise-tolerance might explain actual motor behaviour