LMS Filtering

Adaptive Filtering
Let the filter coefficients be
$$\mathbf{w} = [w_0, w_1, \ldots, w_{N-1}]^T.$$
Filter output:
$$y(n) = \sum_{k=0}^{N-1} w_k^*\, x(n-k) = \mathbf{w}^H\mathbf{x}(n) = \hat{d}(n),$$
where
$$\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^T.$$
Wiener-Hopf equation:
$$R(n)\,\mathbf{w}(n) = \mathbf{r}(n) \;\Longrightarrow\; \mathbf{w}_{\rm opt}(n) = R(n)^{-1}\mathbf{r}(n),$$
where
$$R(n) = E\{\mathbf{x}(n)\mathbf{x}(n)^H\}, \qquad \mathbf{r}(n) = E\{\mathbf{x}(n)\,d^*(n)\}.$$
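To make the formula concrete, here is a minimal numpy sketch (not from the slides) that solves the Wiener-Hopf equation with sample averages in place of the expectations; the test signal and `w_true` are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 4, 10000                       # filter length, number of snapshots

w_true = rng.standard_normal(N) + 1j * rng.standard_normal(N)
x = rng.standard_normal(K + N) + 1j * rng.standard_normal(K + N)

# Row n is x(n)^T = [x(n), x(n-1), ..., x(n-N+1)]
X = np.array([x[n:n - N:-1] for n in range(N, K + N)])
d = X @ np.conj(w_true)               # desired signal d(n) = w_true^H x(n)

R = X.T @ X.conj() / K                # sample estimate of R = E{x x^H}
r = X.T @ d.conj() / K                # sample estimate of r = E{x d^*}
w_opt = np.linalg.solve(R, r)         # R^{-1} r, recovers w_true here
```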
Adaptive Filtering (cont.)
Example 1: Unknown system identification.
Adaptive Filtering (cont.)
Example 2: Unknown system equalization.
Adaptive Filtering (cont.)
Example 3: Noise cancellation.
Adaptive Filtering (cont.)
Example 5: Interference cancellation without reference input.
Adaptive Filtering (cont.)
Idea of the Least-Mean-Square (LMS) algorithm:
$$\mathbf{w}_{k+1} = \mathbf{w}_k - \mu\,\nabla_{\mathbf{w}} E\{|e_k|^2\}, \qquad (*)$$
where the indices are given as subscripts [e.g. $d(k) = d_k$], and
$$E\{|e_k|^2\} = E\{|d_k - \mathbf{w}_k^H\mathbf{x}_k|^2\} = E\{|d_k|^2\} - \mathbf{w}_k^H\mathbf{r} - \mathbf{r}^H\mathbf{w}_k + \mathbf{w}_k^H R\,\mathbf{w}_k,$$
$$\nabla_{\mathbf{w}} E\{|e_k|^2\} = R\mathbf{w} - \mathbf{r}.$$
Use single-sample estimates of $R$ and $\mathbf{r}$:
$$\hat{R} = \mathbf{x}_k\mathbf{x}_k^H, \qquad \hat{\mathbf{r}} = \mathbf{x}_k d_k^*,$$
and insert them into $(*)$:
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \mu\,\mathbf{x}_k e_k^*, \qquad e_k = d_k - \mathbf{w}_k^H\mathbf{x}_k \qquad \text{(LMS algorithm)}.$$
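A minimal numpy sketch of this update (the function name and the data layout, rows of a matrix holding the snapshots, are my own; mu must be chosen by the user):

```python
import numpy as np

def lms(X, d, mu):
    """X: (K, N) array whose k-th row is x_k; d: (K,) desired samples."""
    K, N = X.shape
    w = np.zeros(N, dtype=complex)
    e = np.zeros(K, dtype=complex)
    for k in range(K):
        xk = X[k]
        e[k] = d[k] - np.vdot(w, xk)      # e_k = d_k - w_k^H x_k
        w = w + mu * xk * np.conj(e[k])   # w_{k+1} = w_k + mu x_k e_k^*
    return w, e
```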
Adaptive Filtering: Convergence Analysis
Convergence analysis: Subtract $\mathbf{w}_{\rm opt}$ from both sides of the previous equation:
$$\underbrace{\mathbf{w}_{k+1} - \mathbf{w}_{\rm opt}}_{\mathbf{v}_{k+1}} = \underbrace{\mathbf{w}_k - \mathbf{w}_{\rm opt}}_{\mathbf{v}_k} + \mu\,\mathbf{x}_k(d_k^* - \mathbf{x}_k^H\mathbf{w}_k) \qquad (**)$$
and note that
$$\mathbf{x}_k(d_k^* - \mathbf{x}_k^H\mathbf{w}_k) = \mathbf{x}_k d_k^* - \mathbf{x}_k\mathbf{x}_k^H\mathbf{w}_k = \mathbf{x}_k d_k^* - \mathbf{x}_k\mathbf{x}_k^H\mathbf{w}_k + \mathbf{x}_k\mathbf{x}_k^H\mathbf{w}_{\rm opt} - \mathbf{x}_k\mathbf{x}_k^H\mathbf{w}_{\rm opt}$$
$$= (\mathbf{x}_k d_k^* - \mathbf{x}_k\mathbf{x}_k^H\mathbf{w}_{\rm opt}) - \mathbf{x}_k\mathbf{x}_k^H\mathbf{v}_k.$$

Observe that
$$E\{\mathbf{x}_k(d_k^* - \mathbf{x}_k^H\mathbf{w}_k)\} = \underbrace{\mathbf{r} - R\,\mathbf{w}_{\rm opt}}_{\mathbf{0}} - R\,E\{\mathbf{v}_k\} = -R\,E\{\mathbf{v}_k\}.$$
Let $\mathbf{c}_k = E\{\mathbf{v}_k\}$. Then
$$\mathbf{c}_{k+1} = [I - \mu R]\,\mathbf{c}_k. \qquad (***)$$
Sufficient condition for convergence ($\mathbf{c}_k \to \mathbf{0}$):
$$0 < \mu < \frac{2}{\lambda_{\max}},$$
where $\lambda_{\max}$ is the largest eigenvalue of $R$.
Normalized LMS
A promising variant of LMS is the so-called Normalized LMS (NLMS) algorithm:
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \frac{\mu}{\|\mathbf{x}_k\|^2}\,\mathbf{x}_k e_k^*, \qquad e_k = d_k - \mathbf{w}_k^H\mathbf{x}_k \qquad \text{(NLMS algorithm)}.$$
The sufficient condition for convergence:
$$0 < \mu < 2.$$
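A corresponding sketch of the NLMS update (the small `eps` guarding against division by zero is a common practical tweak, not from the slides):

```python
import numpy as np

def nlms(X, d, mu, eps=1e-8):
    """Same data layout as lms() above; eps guards against ||x_k|| = 0."""
    K, N = X.shape
    w = np.zeros(N, dtype=complex)
    for k in range(K):
        xk = X[k]
        e = d[k] - np.vdot(w, xk)                 # a priori error
        w = w + (mu / (np.vdot(xk, xk).real + eps)) * xk * np.conj(e)
    return w
```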
Recursive Least Squares
Idea of the Recursive Least Squares (RLS) algorithm: use the sample estimate $\hat{R}_k$ (instead of the true covariance matrix $R$) in the equation for the weight vector and find $\mathbf{w}_{k+1}$ as an update to $\mathbf{w}_k$. Let
$$\hat{R}_{k+1} = \lambda\hat{R}_k + \mathbf{x}_{k+1}\mathbf{x}_{k+1}^H, \qquad \hat{\mathbf{r}}_{k+1} = \lambda\hat{\mathbf{r}}_k + \mathbf{x}_{k+1}d_{k+1}^*,$$
where $\lambda \le 1$ is the (so-called) forgetting factor. Using the matrix inversion lemma, we obtain
$$\hat{R}_{k+1}^{-1} = (\lambda\hat{R}_k + \mathbf{x}_{k+1}\mathbf{x}_{k+1}^H)^{-1} = \frac{1}{\lambda}\left[\hat{R}_k^{-1} - \frac{\hat{R}_k^{-1}\mathbf{x}_{k+1}\mathbf{x}_{k+1}^H\hat{R}_k^{-1}}{\lambda + \mathbf{x}_{k+1}^H\hat{R}_k^{-1}\mathbf{x}_{k+1}}\right].$$
Therefore,
$$\mathbf{w}_{k+1} = \hat{R}_{k+1}^{-1}\hat{\mathbf{r}}_{k+1} = \frac{1}{\lambda}\left[\hat{R}_k^{-1} - \frac{\hat{R}_k^{-1}\mathbf{x}_{k+1}\mathbf{x}_{k+1}^H\hat{R}_k^{-1}}{\lambda + \mathbf{x}_{k+1}^H\hat{R}_k^{-1}\mathbf{x}_{k+1}}\right]\left(\lambda\hat{\mathbf{r}}_k + \mathbf{x}_{k+1}d_{k+1}^*\right)$$
$$= \mathbf{w}_k - \mathbf{g}_{k+1}\mathbf{x}_{k+1}^H\mathbf{w}_k + \mathbf{g}_{k+1}d_{k+1}^*,$$
where
$$\mathbf{g}_{k+1} = \frac{\hat{R}_k^{-1}\mathbf{x}_{k+1}}{\lambda + \mathbf{x}_{k+1}^H\hat{R}_k^{-1}\mathbf{x}_{k+1}}.$$
Hence, the updating equation for the weight vector is
$$\mathbf{w}_{k+1} = \mathbf{w}_k - \mathbf{g}_{k+1}\mathbf{x}_{k+1}^H\mathbf{w}_k + \mathbf{g}_{k+1}d_{k+1}^* = \mathbf{w}_k + \mathbf{g}_{k+1}\underbrace{(d_{k+1}^* - \mathbf{x}_{k+1}^H\mathbf{w}_k)}_{e_{k,k+1}^*} = \mathbf{w}_k + \mathbf{g}_{k+1}\,e_{k,k+1}^*.$$
RLS algorithm:

Initialization: $\mathbf{w}_0 = \mathbf{0}$, $P_0 = \epsilon^{-1}I$ (with a small $\epsilon > 0$).

For each $k = 1, 2, \ldots$, compute:
$$\mathbf{h}_k = P_{k-1}\mathbf{x}_k,$$
$$\gamma_k = 1/(\lambda + \mathbf{h}_k^H\mathbf{x}_k),$$
$$\mathbf{g}_k = \mathbf{h}_k\gamma_k,$$
$$P_k = \lambda^{-1}\left(P_{k-1} - \mathbf{g}_k\mathbf{h}_k^H\right),$$
$$e_{k-1,k} = d_k - \mathbf{w}_{k-1}^H\mathbf{x}_k,$$
$$\mathbf{w}_k = \mathbf{w}_{k-1} + \mathbf{g}_k\,e_{k-1,k}^*,$$
$$e_k = d_k - \mathbf{w}_k^H\mathbf{x}_k.$$
Here $P_k = \hat{R}_k^{-1}$.
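A minimal numpy sketch of this recursion (the default values of `lam` and `eps` are hypothetical choices, not values from the slides):

```python
import numpy as np

def rls(X, d, lam=0.99, eps=1e-2):
    """X: (K, N) array with rows x_k; d: (K,) desired samples."""
    K, N = X.shape
    w = np.zeros(N, dtype=complex)
    P = np.eye(N, dtype=complex) / eps           # P_0 = eps^{-1} I
    for k in range(K):
        xk = X[k]
        h = P @ xk                               # h_k = P_{k-1} x_k
        gamma = 1.0 / (lam + np.vdot(h, xk).real)
        g = gamma * h                            # gain vector g_k
        P = (P - np.outer(g, h.conj())) / lam    # P_k = R_hat_k^{-1}
        e = d[k] - np.vdot(w, xk)                # a priori error e_{k-1,k}
        w = w + g * np.conj(e)
    return w
```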
Example
LMS linear predictor of the signal
$$x(n) = 10\,e^{j2\pi f n} + e(n),$$
where $f = 0.1$ and

- $N = 8$,
- $e(n)$ is circular unit-variance white noise,
- $\mu_1 = 1/[10\,\mathrm{tr}(R)]$, $\mu_2 = 1/[3\,\mathrm{tr}(R)]$, $\mu_3 = 1/[\mathrm{tr}(R)]$.
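A sketch of how this experiment might be set up (the one-step-ahead predictor structure, predicting $x(n)$ from the previous $N$ samples, is my assumption about the slides' setup; uses `lms()` from the sketch above):

```python
import numpy as np

rng = np.random.default_rng(1)
f, N, K = 0.1, 8, 2000
n = np.arange(K + N)
x = 10 * np.exp(2j * np.pi * f * n) \
    + (rng.standard_normal(K + N) + 1j * rng.standard_normal(K + N)) / np.sqrt(2)

# Row k holds the N past samples [x(k+N-1), ..., x(k)]; predict d(k) = x(k+N)
X = np.array([x[k + N - 1::-1][:N] for k in range(K)])
d = x[N:]

trR = N * np.mean(np.abs(x) ** 2)        # tr(R) = N E{|x|^2} for this signal
for c in (10, 3, 1):                     # the three step sizes mu = 1/[c tr(R)]
    w, e = lms(X, d, mu=1 / (c * trR))
    print(c, np.mean(np.abs(e[-500:]) ** 2))   # steady-state squared error
```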
Adaptive Beamforming
The above scheme describes narrowband beamforming, i.e.,

- conventional beamforming if $w_1, \ldots, w_N$ do not depend on the input/output array signals,
- adaptive beamforming if $w_1, \ldots, w_N$ are determined and optimized based on the input/output array signals.

Input array signal vector:
$$\mathbf{x}(i) = [x_1(i), x_2(i), \ldots, x_N(i)]^T.$$
Complex beamformer output:
$$y(i) = \mathbf{w}^H\mathbf{x}(i).$$
Adaptive Beamforming (cont.)
Input array signal vector:
$$\mathbf{x}(k) = [x_1(k), x_2(k), \ldots, x_N(k)]^T.$$
Complex beamformer output:
$$y(k) = \mathbf{w}^H\mathbf{x}(k),$$
where
$$\mathbf{x}(k) = \underbrace{\mathbf{x}_s(k)}_{\text{signal}} + \underbrace{\mathbf{x}_N(k)}_{\text{noise}} + \underbrace{\mathbf{x}_I(k)}_{\text{interference}}.$$
The goal is to filter out $\mathbf{x}_I$ and $\mathbf{x}_N$ as much as possible and, therefore, to obtain an approximation $\hat{\mathbf{x}}_S$ of $\mathbf{x}_S$. The most popular criteria of adaptive beamforming:

Minimum MSE:
$$\min_{\mathbf{w}}\,\text{MSE}, \qquad \text{MSE} = E\{|d(i) - \mathbf{w}^H\mathbf{x}(i)|^2\}.$$

Maximum Signal-to-Interference-plus-Noise Ratio (SINR):
$$\max_{\mathbf{w}}\,\text{SINR}, \qquad \text{SINR} = \frac{E\{|\mathbf{w}^H\mathbf{x}_s|^2\}}{E\{|\mathbf{w}^H(\mathbf{x}_I + \mathbf{x}_N)|^2\}}.$$
Adaptive Beamforming (cont.)
In the sequel, we consider the max SINR criterion. Rewrite the snapshot model as
$$\mathbf{x}(k) = s(k)\,\mathbf{a}_s + \mathbf{x}_I(k) + \mathbf{x}_N(k),$$
where $\mathbf{a}_S$ is the known steering vector of the desired signal. Then
$$\text{SINR} = \frac{\sigma_s^2\,|\mathbf{w}^H\mathbf{a}_s|^2}{\mathbf{w}^H E\{(\mathbf{x}_I + \mathbf{x}_N)(\mathbf{x}_I + \mathbf{x}_N)^H\}\mathbf{w}} = \frac{\sigma_s^2\,|\mathbf{w}^H\mathbf{a}_s|^2}{\mathbf{w}^H R\,\mathbf{w}},$$
where $R = E\{(\mathbf{x}_I + \mathbf{x}_N)(\mathbf{x}_I + \mathbf{x}_N)^H\}$ is the interference-plus-noise covariance matrix.

Obviously, the SINR does not depend on rescaling of $\mathbf{w}$, i.e., if $\mathbf{w}_{\rm opt}$ is an optimal weight vector, then $\alpha\mathbf{w}_{\rm opt}$ is such a vector too. Therefore, max SINR is equivalent to
$$\min_{\mathbf{w}}\,\mathbf{w}^H R\,\mathbf{w} \quad \text{subject to} \quad \mathbf{w}^H\mathbf{a}_S = \text{const}.$$
Let $\text{const} = 1$. Then
$$H(\mathbf{w}) = \mathbf{w}^H R\,\mathbf{w} + \lambda(1 - \mathbf{w}^H\mathbf{a}_s) + \lambda^*(1 - \mathbf{a}_s^H\mathbf{w}),$$
$$\nabla_{\mathbf{w}} H(\mathbf{w}) = R\mathbf{w} - \lambda\,\mathbf{a}_s = \mathbf{0} \;\Longrightarrow\; R\mathbf{w} = \lambda\,\mathbf{a}_s \;\Longrightarrow\; \mathbf{w}_{\rm opt} = \lambda\,R^{-1}\mathbf{a}_s.$$
This is a spatial version of the Wiener-Hopf equation!

From the constraint equation, we obtain
$$\lambda = \frac{1}{\mathbf{a}_s^H R^{-1}\mathbf{a}_s}$$
and therefore
$$\mathbf{w}_{\rm opt} = \frac{R^{-1}\mathbf{a}_s}{\mathbf{a}_s^H R^{-1}\mathbf{a}_s} \qquad \text{(MVDR beamformer)}.$$
Substituting $\mathbf{w}_{\rm opt}$ into the SINR expression, we obtain
$$\max\,\text{SINR} = \text{SINR}_{\rm opt} = \frac{\sigma_s^2\,(\mathbf{a}_s^H R^{-1}\mathbf{a}_s)^2}{\mathbf{a}_s^H R^{-1} R R^{-1}\mathbf{a}_s} = \sigma_s^2\,\mathbf{a}_s^H R^{-1}\mathbf{a}_s.$$
If there are no interference sources (only white noise with variance $\sigma^2$):
$$\text{SINR}_{\rm opt} = \frac{\sigma_s^2}{\sigma^2}\,\mathbf{a}_s^H\mathbf{a}_s = N\,\frac{\sigma_s^2}{\sigma^2}.$$
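A minimal sketch of the MVDR weights for a hypothetical uniform linear array scenario (half-wavelength spacing; the interference and noise parameters are placeholders, loosely matching the examples below):

```python
import numpy as np

def steering(N, theta_deg):
    """ULA steering vector for lambda/2 element spacing."""
    theta = np.deg2rad(theta_deg)
    return np.exp(1j * np.pi * np.arange(N) * np.sin(theta))

N = 8
a_s = steering(N, 0)                     # desired signal from broadside
a_i = steering(N, 30)                    # one interferer at 30 degrees
sigma2, inr = 1.0, 1e4                   # noise power; INR = 40 dB
R = inr * np.outer(a_i, a_i.conj()) + sigma2 * np.eye(N)

Ria = np.linalg.solve(R, a_s)            # R^{-1} a_s
w_mvdr = Ria / np.vdot(a_s, Ria)         # w_opt = R^{-1} a_s / (a_s^H R^{-1} a_s)
print(abs(np.vdot(w_mvdr, a_s)))         # distortionless response: w^H a_s = 1
```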
Adaptive Beamforming (cont.)
Let us study what happens with the optimal SINR if the covariance matrix includes the signal component:
$$R_x = E\{\mathbf{x}\mathbf{x}^H\} = R + \sigma_s^2\,\mathbf{a}_s\mathbf{a}_s^H.$$
Using the matrix inversion lemma, we have
$$R_x^{-1}\mathbf{a}_s = (R + \sigma_s^2\,\mathbf{a}_s\mathbf{a}_s^H)^{-1}\mathbf{a}_s = \left[R^{-1} - \frac{R^{-1}\mathbf{a}_s\mathbf{a}_s^H R^{-1}}{1/\sigma_s^2 + \mathbf{a}_s^H R^{-1}\mathbf{a}_s}\right]\mathbf{a}_s$$
$$= \left(1 - \frac{\mathbf{a}_s^H R^{-1}\mathbf{a}_s}{1/\sigma_s^2 + \mathbf{a}_s^H R^{-1}\mathbf{a}_s}\right)R^{-1}\mathbf{a}_s = \alpha\,R^{-1}\mathbf{a}_s.$$

The optimal SINR is not affected!

However, the above result holds only if

- there is an infinite number of snapshots, and
- $\mathbf{a}_S$ is known exactly.
Adaptive Beamforming (cont.)
Gradient algorithm maximizing SINR (very similar to LMS):
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \mu\,(\mathbf{a}_s - \mathbf{x}_k\mathbf{x}_k^H\mathbf{w}_k),$$
where, again, we use the simple notation $\mathbf{w}_k = \mathbf{w}(k)$ and $\mathbf{x}_k = \mathbf{x}(k)$. The vector $\mathbf{w}_k$ converges to $\mathbf{w}_{\rm opt} \propto R^{-1}\mathbf{a}_s$ if
$$0 < \mu < \frac{2}{\lambda_{\max}} \;\Longleftarrow\; 0 < \mu < \frac{2}{\mathrm{tr}\{R\}}.$$
The disadvantage of gradient algorithms is that the convergence may be very slow, i.e., it depends on the eigenvalue spread of $R$.
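A sketch of this gradient update on simulated interference-plus-noise snapshots, reusing `N`, `a_s`, `a_i`, `inr`, `R`, and `Ria` from the MVDR sketch above; the printed error stays large after many iterations for an admissible `mu`, which illustrates exactly the slow-convergence drawback just mentioned:

```python
rng = np.random.default_rng(2)
K = 20000
mu = 1.0 / (3 * np.trace(R).real)             # admissible: mu < 2/tr(R)

w = np.zeros(N, dtype=complex)
for _ in range(K):
    s_i = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
    n_k = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    xk = np.sqrt(inr) * s_i * a_i + n_k       # interference-plus-noise snapshot
    w = w + mu * (a_s - xk * np.vdot(xk, w))  # w_{k+1} = w_k + mu (a_s - x_k x_k^H w_k)
print(np.linalg.norm(w - Ria) / np.linalg.norm(Ria))  # still large: slow convergence
```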
Example
- $N = 8$,
- single signal from $\theta_s = 0^\circ$ and SNR = 0 dB,
- single interference from $\theta_I = 30^\circ$ and INR = 40 dB,
- $\mu_1 = 1/[50\,\mathrm{tr}(R)]$, $\mu_2 = 1/[15\,\mathrm{tr}(R)]$, $\mu_3 = 1/[5\,\mathrm{tr}(R)]$.
Adaptive Beamforming (cont.)
Sample Matrix Inversion (SMI) Algorithm:
$$\mathbf{w}_{\rm SMI} = \hat{R}^{-1}\mathbf{a}_S,$$
where $\hat{R}$ is the sample covariance matrix
$$\hat{R} = \frac{1}{K}\sum_{k=1}^{K}\mathbf{x}_k\mathbf{x}_k^H.$$
Reed-Mallett-Brennan (RMB) rule: under mild conditions, the mean losses (relative to the optimal SINR) due to the SMI approximation of $\mathbf{w}_{\rm opt}$ do not exceed 3 dB if
$$K \ge 2N.$$
Hence, the SMI provides a very fast convergence rate, in general.
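A minimal SMI sketch on simulated snapshots, reusing `N`, `a_s`, `a_i`, and `inr` from the MVDR sketch above; `K = 2N` follows the RMB rule:

```python
rng = np.random.default_rng(3)
K = 2 * N                                      # RMB rule: K >= 2N
s_i = (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
noise = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
Xs = np.sqrt(inr) * np.outer(s_i, a_i) + noise # row k is the snapshot x_k^T
R_hat = Xs.T @ Xs.conj() / K                   # (1/K) sum_k x_k x_k^H
w_smi = np.linalg.solve(R_hat, a_s)            # w_SMI = R_hat^{-1} a_s
```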
Adaptive Beamforming (cont.)
Loaded SMI:
$$\mathbf{w}_{\rm LSMI} = \hat{R}_{\rm DL}^{-1}\mathbf{a}_S, \qquad \hat{R}_{\rm DL} = \hat{R} + \gamma I,$$
where the optimal loading factor $\gamma \simeq 2\sigma^2$. LSMI allows convergence faster than $N$ snapshots!

LSMI convergence rule: under mild conditions, the mean losses (relative to the optimal SINR) due to the LSMI approximation of $\mathbf{w}_{\rm opt}$ do not exceed a few dB if
$$K \ge 2L,$$
where $L$ is the number of interfering sources. Hence, the LSMI provides a faster convergence rate than SMI (usually, $2N \gg 2L$)!
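The corresponding LSMI weights, reusing `R_hat`, `sigma2`, `N`, and `a_s` from the sketches above and taking the loading factor as $\gamma = 2\sigma^2$:

```python
gamma = 2 * sigma2                                        # gamma ~ 2 sigma^2
w_lsmi = np.linalg.solve(R_hat + gamma * np.eye(N), a_s)  # (R_hat + gamma I)^{-1} a_s
```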
Example
- $N = 10$,
- single signal from $\theta_s = 0^\circ$ and SNR = 0 dB,
- single interference from $\theta_I = 30^\circ$ and INR = 40 dB,
- SMI vs. LSMI.
Adaptive Beamforming (cont.)
Hung-Turner (Projection) Algorithm:
$$\mathbf{w}_{\rm HT} = \left(I - X(X^H X)^{-1}X^H\right)\mathbf{a}_S,$$
where $X = [\mathbf{x}_1, \ldots, \mathbf{x}_K]$, i.e., a data-orthogonal projection is used instead of the inverse covariance matrix. For the Hung-Turner method, a satisfactory performance is achieved with
$$K \ge L.$$
Optimal value of $K$:
$$K_{\rm opt} = \sqrt{(N+1)L} - 1.$$
Drawback: the number of sources should be known a priori.
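A minimal sketch of the projection beamformer, reusing the snapshots `Xs` (rows $\mathbf{x}_k^T$), `N`, and `a_s` from the SMI sketch above; with $L = 1$ interferer, a single snapshot already satisfies $K \ge L$:

```python
Xd = Xs[:1].T                                  # N x K data matrix with K = 1
P_perp = np.eye(N) - Xd @ np.linalg.solve(Xd.conj().T @ Xd, Xd.conj().T)
w_ht = P_perp @ a_s                            # (I - X (X^H X)^{-1} X^H) a_s
```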
This effect is sometimes referred to as the signal cancellation phenomenon. Additional constraints are required to stabilize the mean beam response:
$$\min_{\mathbf{w}}\,\mathbf{w}^H R\,\mathbf{w} \quad \text{subject to} \quad C^H\mathbf{w} = \mathbf{f}.$$

1. Point constraints. Matrix of constrained directions:
$$C = [\mathbf{a}_{S,1}, \mathbf{a}_{S,2}, \ldots, \mathbf{a}_{S,M}],$$
where the $\mathbf{a}_{S,i}$ are all taken in the neighborhood of $\mathbf{a}_S$ and include $\mathbf{a}_S$ as well. Vector of constraints:
$$\mathbf{f} = [1, 1, \ldots, 1]^T.$$
Adaptive Beamforming (cont.)
Generalized Sidelobe Canceller (GSC): Let us decompose
$$\mathbf{w}_{\rm opt} = R^{-1}C(C^H R^{-1}C)^{-1}\mathbf{f}$$
into two components, one in the constrained subspace and one orthogonal to it:
$$\mathbf{w}_{\rm opt} = (P_C + P_C^{\perp})\,\mathbf{w}_{\rm opt} = C(C^H C)^{-1}\underbrace{C^H R^{-1}C(C^H R^{-1}C)^{-1}}_{I}\mathbf{f} + P_C^{\perp}R^{-1}C(C^H R^{-1}C)^{-1}\mathbf{f}.$$
Generalizing this approach, we obtain the following decomposition for $\mathbf{w}_{\rm opt}$:
$$\mathbf{w}_{\rm opt} = \mathbf{w}_q - B\,\mathbf{w}_a,$$
where
$$\mathbf{w}_q = C(C^H C)^{-1}\mathbf{f}$$
is the so-called quiescent weight vector,
$$B^H C = 0,$$
$B$ is the blocking matrix, and $\mathbf{w}_a$ is the new adaptive weight vector.
Generalized Sidelobe Canceller (GSC):
The choice of $B$ is not unique. We can take $B = P_C^{\perp}$; however, in this case $B$ is not of full rank. A more common choice is an $N \times (N-M)$ full-rank matrix $B$. Then, the vectors $\mathbf{z} = B^H\mathbf{x}$ and $\mathbf{w}_a$ both have shorter length $(N-M) \times 1$ relative to the $N \times 1$ vectors $\mathbf{x}$ and $\mathbf{w}_q$.

Since the constrained directions are blocked by the matrix $B$, the signal cannot be suppressed and, therefore, the weight vector $\mathbf{w}_a$ can adapt freely to suppress interference by minimizing the output GSC power
$$Q_{\rm GSC} = (\mathbf{w}_q - B\mathbf{w}_a)^H R\,(\mathbf{w}_q - B\mathbf{w}_a) = \mathbf{w}_q^H R\,\mathbf{w}_q - \mathbf{w}_q^H R B\,\mathbf{w}_a - \mathbf{w}_a^H B^H R\,\mathbf{w}_q + \mathbf{w}_a^H B^H R B\,\mathbf{w}_a.$$
The solution is
$$\mathbf{w}_{a,\rm opt} = (B^H R B)^{-1}B^H R\,\mathbf{w}_q.$$
Adaptive Beamforming (cont.)
Generalized Sidelobe Canceller (GSC): Noting that
$$y(k) = \mathbf{w}_q^H\mathbf{x}(k), \qquad \mathbf{z}(k) = B^H\mathbf{x}(k),$$
we obtain
$$R_z = E\{\mathbf{z}(k)\mathbf{z}(k)^H\} = B^H E\{\mathbf{x}(k)\mathbf{x}(k)^H\}B = B^H R B,$$
$$\mathbf{r}_{yz} = E\{\mathbf{z}(k)\,y^*(k)\} = B^H E\{\mathbf{x}(k)\mathbf{x}(k)^H\}\mathbf{w}_q = B^H R\,\mathbf{w}_q.$$
Hence,
$$\mathbf{w}_{a,\rm opt} = R_z^{-1}\,\mathbf{r}_{yz}$$
(a Wiener-Hopf equation!).
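A minimal GSC sketch for a single distortionless constraint ($C = \mathbf{a}_s$, $\mathbf{f} = [1]$), reusing `N`, `a_s`, `R`, and `w_mvdr` from the MVDR sketch above; the blocking matrix is built here from an orthonormal basis of the orthogonal complement of $C$, one of many valid choices:

```python
M = 1
C = a_s.reshape(N, M)                          # single constraint: a_s^H w = 1
f = np.ones(M)
w_q = C @ np.linalg.solve(C.conj().T @ C, f)   # quiescent weights C (C^H C)^{-1} f
B = np.linalg.svd(C)[0][:, M:]                 # N x (N-M) blocking matrix, B^H C = 0
Rz = B.conj().T @ R @ B                        # R_z = B^H R B
ryz = B.conj().T @ R @ w_q                     # r_yz = B^H R w_q
w_a = np.linalg.solve(Rz, ryz)                 # Wiener-Hopf solution
w_gsc = w_q - B @ w_a
print(np.allclose(w_gsc, w_mvdr))              # matches the MVDR weights
```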
How to Choose B?
Choose $N - M$ linearly independent vectors $\mathbf{b}_i$:
$$B = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{N-M}]$$
so that
$$\mathbf{b}_i \perp \mathbf{c}_k, \qquad i = 1, 2, \ldots, N-M, \quad k = 1, 2, \ldots, M,$$
where $\mathbf{c}_k$ is the $k$th column of $C$. There are many possible choices of $B$!
Example: GSC in the Particular Case of a Normal-Direction (Single) Constraint and a Particular Choice of Blocking Matrix

In this particular example,
$$C = [1, 1, \ldots, 1]^T,$$
$$B^H = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{bmatrix},$$
and
$$\mathbf{x}(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_N(k) \end{bmatrix}, \qquad \mathbf{z}(k) = \begin{bmatrix} x_1(k) - x_2(k) \\ x_2(k) - x_3(k) \\ \vdots \\ x_{N-1}(k) - x_N(k) \end{bmatrix}.$$
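A short numpy check of this particular blocking matrix (the array size `N` is a placeholder):

```python
import numpy as np

N = 8                                          # placeholder array size
C = np.ones((N, 1))                            # broadside constraint direction
Bh = np.eye(N - 1, N) - np.eye(N - 1, N, k=1)  # rows [1, -1, 0, ...], etc.
print(np.allclose(Bh @ C, 0))                  # B^H C = 0: broadside is blocked
```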
Partially Adaptive Beamforming
In many applications, the number of interfering sources is much less than the number of adaptive weights [adaptive degrees of freedom (DOFs)]. In such cases, partially adaptive arrays can be used.

Idea: use a nonadaptive preprocessor reducing the number of adaptive channels:
$$\mathbf{y}(i) = T^H\mathbf{x}(i),$$
where

- $\mathbf{y}$ has reduced dimension $M \times 1$ ($M < N$) compared with the $N \times 1$ vector $\mathbf{x}$,
- $T$ is an $N \times M$ full-rank matrix.
Partially Adaptive Beamforming
There are two types of nonadaptive preprocessors:

- subarray preprocessor,
- beamspace preprocessor.

For an arbitrary preprocessor:
$$R_y = E\{\mathbf{y}(i)\mathbf{y}(i)^H\} = T^H E\{\mathbf{x}(i)\mathbf{x}(i)^H\}T = T^H R\,T.$$
Recall the previously-used representation:
$$R = A S A^H + \sigma^2 I.$$
After the preprocessing, we have
$$R_y = T^H A S A^H T + \sigma^2 T^H T = \tilde{A} S \tilde{A}^H + Q,$$
$$\tilde{A} = T^H A, \qquad Q = \sigma^2 T^H T.$$

- Preprocessing changes the array manifold.
- Preprocessing may lead to colored noise.

Choosing $T$ with orthonormal columns, we have
$$T^H T = I,$$
and, therefore, the effect of colored noise may be removed.
Partially Adaptive Beamforming

Preprocessing matrix in this particular case:
$$T^H = \frac{1}{\sqrt{3}}\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \end{bmatrix}$$
(note that $T^H T = I$ here!). In the general case,
$$T = \begin{bmatrix} \mathbf{a}_{S,1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{a}_{S,2} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{a}_{S,M} \end{bmatrix},$$
where $L = N/M$ is the size of each subarray, and $T^H T = I$ holds true if $\mathbf{a}_{S,k}^H\mathbf{a}_{S,k} = 1$, $k = 1, 2, \ldots, M$.
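A minimal sketch of this block-diagonal subarray preprocessor for the $N = 9$, $M = 3$ case above (broadside subarray steering assumed):

```python
import numpy as np

N, M = 9, 3
L = N // M                                # subarray size
a_sub = np.ones(L) / np.sqrt(L)           # unit-norm broadside subarray steering
T = np.zeros((N, M))
for m in range(M):
    T[m * L:(m + 1) * L, m] = a_sub       # block-diagonal structure
print(np.allclose(T.T @ T, np.eye(M)))    # T^H T = I: the noise stays white
```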
Wideband Space-Time Processing (cont.)
Wideband case:

- higher dimension of the problem ($NP$ instead of $N$),
- the steering vector depends on frequency.
Constant Modulus Algorithm (cont.)
Simple example: 2 sources, 2 antennas.

Let $\mathbf{w} = [w_1, w_2]^T$ be a beamformer. Output of beamforming:
$$y_k = \mathbf{w}^H\mathbf{x}_k = [w_1^*\;\; w_2^*]\begin{bmatrix} x_{1,k} \\ x_{2,k} \end{bmatrix}.$$
Constant modulus property: $|s_{1,k}| = |s_{2,k}| = 1$ for all $k$.

Possible optimization problem:
$$\min_{\mathbf{w}} J(\mathbf{w}), \qquad J(\mathbf{w}) = E[(|y_k|^2 - 1)^2].$$
The CMA cost function, viewed as a function of $y$ (for simplicity, $y$ is taken to be real here), has no unique minimum! Indeed, if $y_k = \mathbf{w}^H\mathbf{x}_k$ is CM, then $\alpha\mathbf{w}$ is another such beamformer, for any scalar $\alpha$ that satisfies $|\alpha| = 1$.
[Figure: 2 (real-valued) sources, 2 antennas.]
Iterative Optimization
Cost function:
$$J(\mathbf{w}) = E[(|y_k|^2 - 1)^2], \qquad y_k = \mathbf{w}^H\mathbf{x}_k.$$
Stochastic gradient method: $\mathbf{w}_{k+1} = \mathbf{w}_k - \mu[\nabla J(\mathbf{w}_k)]$, where $\mu$ is the step size, $\mu > 0$.

Derivative: Use $|y_k|^2 = y_k y_k^* = \mathbf{w}^H\mathbf{x}\mathbf{x}^H\mathbf{w}$:
$$\nabla J = 2E\{(|y_k|^2 - 1)\,\nabla(\mathbf{w}^H\mathbf{x}_k\mathbf{x}_k^H\mathbf{w})\} = 2E\{(|y_k|^2 - 1)\,(\mathbf{x}_k\mathbf{x}_k^H\mathbf{w})\} = 2E\{(|y_k|^2 - 1)\,\mathbf{x}_k y_k^*\}.$$
Algorithm CMA(2,2):
$$y_k = \mathbf{w}_k^H\mathbf{x}_k,$$
$$\mathbf{w}_{k+1} = \mathbf{w}_k - \mu\,\mathbf{x}_k(|y_k|^2 - 1)\,y_k^*.$$
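A minimal sketch of CMA(2,2) on a hypothetical $2 \times 2$ instantaneous mixture of two unit-modulus (QPSK-like) sources; the step size and the mixing matrix are placeholders:

```python
import numpy as np

rng = np.random.default_rng(4)
K, mu = 5000, 1e-3
S = np.exp(1j * (np.pi / 2) * rng.integers(0, 4, size=(2, K)))  # |s_{i,k}| = 1
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
X = A @ S                                  # received 2 x K data, columns x_k

w = np.array([1.0, 0.0], dtype=complex)    # some nonzero initialization
for k in range(K):
    xk = X[:, k]
    y = np.vdot(w, xk)                     # y_k = w_k^H x_k
    w = w - mu * xk * (abs(y) ** 2 - 1) * np.conj(y)
```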
Advantages:

- The algorithm is extremely simple to implement.
- Adaptive tracking of sources.
- Converges to minima close to the Wiener beamformers (for each source).

Disadvantages:

- Noisy and slow.
- The step size should be small, else unstable.
- Only one source is recovered (which one?).
- Possible convergence to a local minimum (with finite data).
Other CMAs
Alternative cost function: CMA(1,2)
$$J(\mathbf{w}) = E[(|y_k| - 1)^2] = E[(|\mathbf{w}^H\mathbf{x}_k| - 1)^2].$$
Corresponding CMA iteration:
$$y_k = \mathbf{w}_k^H\mathbf{x}_k,$$
$$\epsilon_k = \frac{y_k}{|y_k|} - y_k,$$
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \mu\,\mathbf{x}_k\,\epsilon_k^*.$$
This is similar to LMS, with update error $\frac{y_k}{|y_k|} - y_k$. The desired signal is estimated by $\frac{y_k}{|y_k|}$.
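The corresponding CMA(1,2) loop, reusing the mixture `X` from the CMA(2,2) sketch above (the guard on $|y_k| = 0$ is a practical tweak, not from the slides):

```python
w = np.array([1.0, 0.0], dtype=complex)
mu = 1e-2
for k in range(X.shape[1]):
    xk = X[:, k]
    y = np.vdot(w, xk)                     # y_k = w_k^H x_k
    eps_k = y / max(abs(y), 1e-12) - y     # update error y_k/|y_k| - y_k
    w = w + mu * xk * np.conj(eps_k)
```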
Other CMAs (cont.)
Normalized CMA (NCMA; becomes scaling independent):
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \frac{\mu}{\|\mathbf{x}_k\|^2}\,\mathbf{x}_k\,\epsilon_k^*.$$

Orthogonal CMA (OCMA): whiten using the data covariance $\hat{R}$:
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \mu\,\hat{R}_k^{-1}\mathbf{x}_k\,\epsilon_k^*.$$
Least squares CMA (LSCMA): block update, trying to optimize iteratively
$$\min_{\mathbf{w}}\,\|\mathbf{s}^H - \mathbf{w}^H X\|^2,$$
where $X = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_T]$ and $\mathbf{s}^H$ is the best blind estimate at step $k$ of the complete source vector (at all time points $t = 1, 2, \ldots, T$):
$$\mathbf{s}^H = \left[\frac{y_1}{|y_1|}, \frac{y_2}{|y_2|}, \ldots, \frac{y_T}{|y_T|}\right], \qquad y_t = \mathbf{w}_k^H\mathbf{x}_t, \quad t = 1, 2, \ldots, T,$$
and
$$\mathbf{w}_{k+1}^H = \mathbf{s}^H X^H (X X^H)^{-1}.$$
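A minimal sketch of a few LSCMA block iterations, reusing the $2 \times T$ data matrix `X` from the CMA(2,2) sketch above:

```python
w = np.array([1.0, 0.0], dtype=complex)
G = X @ X.conj().T                         # X X^H, fixed across iterations
for _ in range(20):                        # a few block iterations
    y = w.conj() @ X                       # y_t = w^H x_t, t = 1..T
    s = np.conj(y / np.abs(y))             # column s, so that s^H = [y_t/|y_t|]
    w = np.linalg.solve(G, X @ s)          # equivalent to w^H = s^H X^H (X X^H)^{-1}
print(np.abs(w.conj() @ X)[:5])            # outputs approach unit modulus
```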