Source: links.uwaterloo.ca/amath391w13docs/set4.pdf
Lecture 10
Discrete Fourier Transforms (cont’d)
Some properties of DFTs
We now establish a few properties of DFTs which are discrete analogues of properties of Fourier
Transforms that you may have seen.
Linearity
F(f + g) = Ff + Fg. (1)
This is quite trivial to show. Let h = f + g, F = Ff and G = Fg. By definition, for 0 ≤ k ≤ N − 1,
H[k] = Σ_{n=0}^{N−1} h[n] exp(−i2πkn/N)
= Σ_{n=0}^{N−1} (f [n] + g[n]) exp(−i2πkn/N)
= Σ_{n=0}^{N−1} f [n] exp(−i2πkn/N) + Σ_{n=0}^{N−1} g[n] exp(−i2πkn/N)
= F [k] + G[k], (2)
which proves Eq. (1).
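This linearity is easy to confirm numerically. Below is a minimal Python sketch; the naive `dft` helper and the sample vectors `f` and `g` are illustrative choices, not part of the lecture:

```python
import cmath

def dft(x):
    """Naive O(N^2) DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

f = [1.0, 2.0, 0.0, -1.0]
g = [0.5, -1.0, 3.0, 2.0]
h = [a + b for a, b in zip(f, g)]          # h = f + g

F, G, H = dft(f), dft(g), dft(h)
# Linearity: H[k] = F[k] + G[k] for every k (up to rounding)
assert all(abs(H[k] - (F[k] + G[k])) < 1e-9 for k in range(len(f)))
```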
Conjugate symmetry
Let F be the DFT of a sampled real signal f of N points. Then
F [N − k] = F*[k], k = 0, 1, · · · , N − 1, (3)
where the asterisk denotes complex conjugation.
Proof: By definition,
F [k] = Σ_{n=0}^{N−1} f [n] exp(−i2πkn/N), (4)
so that
F [N − k] = Σ_{n=0}^{N−1} f [n] exp(−i2π(N − k)n/N)
= Σ_{n=0}^{N−1} f [n] exp(i2πkn/N) exp(−i2πn)
= F*[k], (5)
where we have used the facts that exp(−i2πn) = 1 for integer n and that f [n] is real, so that the remaining sum is the complex conjugate of F [k],
and the desired result follows.
Note that the above result may also be written as
F [k] = F*[N − k], k = 0, 1, · · · , N − 1. (6)
In other words, F [k] and F [N − k] are complex conjugates of each other. The consequences of this
very important result will be discussed shortly.
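The symmetry is easy to verify numerically. In this Python sketch the naive `dft` helper and the real test vector are illustrative assumptions; note that for k = 0 the index N − k = N wraps around to 0, by the N-periodicity of the DFT:

```python
import cmath

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

f = [3.0, 1.0, -2.0, 0.5, 4.0, -1.0]       # a real signal, N = 6
N = len(f)
F = dft(f)
# Conjugate symmetry: F[N - k] = conj(F[k]) for a real signal
for k in range(N):
    assert abs(F[(N - k) % N] - F[k].conjugate()) < 1e-9
```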
Shift Theorem
Let
f = (f [0], f [1], · · · , f [N − 1]) and g = (f [1], f [2], · · · , f [N − 1], f [0]). (7)
In other words, g is obtained from f by shifting the sequence of sample values one space to the left.
We may consider the first element f [0] to be “wrapped-around” and placed at the end, or we may view
it as coming from the next N data points, since f [N ] = f [0]. Then the DFT coefficients of G = Fg
and F = Ff are related as follows,
G[k] = ω^{−k} F [k], k = 0, 1, · · · , N − 1, where ω = exp(−i2π/N). (8)
Proof: By definition,
G[k] = Σ_{n=0}^{N−1} g[n] exp(−i2πkn/N)
= Σ_{n=0}^{N−1} f [n + 1] exp(−i2πkn/N)
= Σ_{n=0}^{N−1} f [n + 1] exp(−i2πk(n + 1)/N) exp(i2πk/N)
= exp(i2πk/N) Σ_{m=0}^{N−1} f [m] exp(−i2πkm/N) (setting m = n + 1 and using f [N ] = f [0])
= ω^{−k} F [k]. (9)
Remarks:
1. The Shift Theorem may be applied repeatedly. For example, if gM is obtained by left-shifting
the entries of f M times, i.e.,
f = (f [0], f [1], · · · , f [N − 1]) and gM = (f [M ], f [M + 1], · · · , f [M − 2], f [M − 1]), (10)
then (Exercise)
GM [k] = ω^{−Mk} F [k]. (11)
2. The Shift Theorem may also be applied for right-shifted sequences. We leave this as an exercise.
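The Shift Theorem, including the M-fold version of Eq. (11), can be checked directly. In this Python sketch, the naive `dft` helper, the test vector and the shift amount `M` are all illustrative assumptions:

```python
import cmath

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

f = [1.0, 4.0, -2.0, 3.0, 0.0]
N, M = len(f), 2
gM = f[M:] + f[:M]                          # circular left-shift by M places
F, GM = dft(f), dft(gM)
omega = cmath.exp(-2j * cmath.pi / N)
# Shift Theorem: G_M[k] = omega^(-M k) F[k]
for k in range(N):
    assert abs(GM[k] - omega ** (-M * k) * F[k]) < 1e-9
```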
Convolution Theorem
Let f and g be two N -periodic complex vectors. Define the (circular) convolution of these two vectors
as the vector h with components
h[n] = Σ_{j=0}^{N−1} f [j]g[n − j], n = 0, 1, · · · , N − 1. (12)
Then the DFT of h is related to the DFTs of f and g as follows,
H[k] = F [k]G[k], (13)
in other words, the pointwise product – the discrete analogue of multiplying two functions together,
i.e., f(x)g(x).
Proof:
H[k] = Σ_{n=0}^{N−1} h[n] exp(−i2πkn/N)
= Σ_{n=0}^{N−1} Σ_{j=0}^{N−1} f [j]g[n − j] exp(−i2πkn/N)
= Σ_{n=0}^{N−1} Σ_{j=0}^{N−1} f [j] exp(−i2πkj/N) g[n − j] exp(−i2πk(n − j)/N)
= Σ_{j=0}^{N−1} f [j] exp(−i2πkj/N) Σ_{n=0}^{N−1} g[n − j] exp(−i2πk(n − j)/N)
= Σ_{j=0}^{N−1} f [j] exp(−i2πkj/N) [ Σ_{l=0}^{N−1} g[l] exp(−i2πkl/N) ]
= F [k]G[k]. (14)
The second-to-last line in the sequence follows from the fact that the products f [j]g[n− j] exhaust all
possible pairs since the vectors are N -periodic.
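The theorem can be verified numerically for small vectors. In the Python sketch below, the `dft` and `circ_conv` helpers and the test vectors are illustrative assumptions; the indices of g are taken mod N to implement the N-periodicity:

```python
import cmath

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def circ_conv(f, g):
    """Circular convolution: h[n] = sum_j f[j] g[(n - j) mod N]."""
    N = len(f)
    return [sum(f[j] * g[(n - j) % N] for j in range(N)) for n in range(N)]

f = [1.0, 2.0, 3.0, 4.0]
g = [0.0, 1.0, 0.5, -1.0]
H = dft(circ_conv(f, g))
F, G = dft(f), dft(g)
# Convolution Theorem: H[k] = F[k] G[k]
assert all(abs(H[k] - F[k] * G[k]) < 1e-9 for k in range(len(f)))
```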
Some consequences of the above results
Conjugate Symmetry
Recall that if f ∈ RN is a real-valued signal, then its DFT F has conjugate symmetry of the form
F [N − k] = F*[k], k = 0, 1, · · · , N, (15)
where the asterisk denotes complex conjugation.
Assuming that N is even, this implies that we need only compute at most the first N/2 coefficients,
F [0], F [1], · · · , F [N/2 − 1]. The other coefficients may be computed by complex conjugating these
coefficients. But there are some other interesting features.
First of all, note that for k = 0, conjugate symmetry implies that F [N ] = F*[0]. But by definition,
F [0] = Σ_{n=0}^{N−1} f [n], (16)
which is real-valued. Therefore F [N ] = F [0] ∈ R.
For further discussion, we must consider two cases:
1. N is even (This is the case for most applications.)
Setting k = N/2 (an integer) in Eq. (15) implies that the “middle element” satisfies
F [N/2] = F*[N/2], (17)
implying that F [N/2] ∈ R. All other DFT coefficients F [1], F [2], · · · F [N/2 − 1] are complex. If
we consider a real number to represent one degree of freedom and a complex number two degrees
of freedom, then the total number of “degrees of freedom” represented by the DFT coefficients,
F [0], F [1], · · · , F [N/2 − 1], F [N/2], (18)
is 1 + 2 × (N/2 − 1) + 1 = N . This is the number of degrees of freedom represented by the original
data vector f ∈ RN .
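This degrees-of-freedom count reflects the fact that the half-spectrum F [0], · · · , F [N/2] determines the full DFT. The following Python sketch (with an assumed naive `dft` helper and an arbitrary real vector, neither taken from the lecture) rebuilds the upper half of the spectrum from the lower half:

```python
import cmath

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

f = [2.0, -1.0, 0.5, 3.0, 1.0, -2.0, 0.0, 4.0]   # real, N = 8 (even)
N = len(f)
F = dft(f)
assert abs(F[0].imag) < 1e-9 and abs(F[N // 2].imag) < 1e-9   # both real
# Rebuild F[N/2 + 1 .. N - 1] from the half-spectrum via F[N - k] = conj(F[k])
half = F[: N // 2 + 1]
rebuilt = half + [half[N - k].conjugate() for k in range(N // 2 + 1, N)]
assert all(abs(rebuilt[k] - F[k]) < 1e-9 for k in range(N))
```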
Example: Let N = 10. The periodic DFT 10-vector is composed of the elements
F [0], F [1], · · · , F [5], · · · , F [9]. (19)
The “middle element,” F [10/2] = F [5], is real-valued. Because of conjugate symmetry, the
sequence in (19) is determined uniquely by the DFT coefficients F [0], · · · , F [5]. Since elements
F [1] to F [4] are complex, the total number of degrees of freedom is 1 + 4 × 2 + 1 = 10.
2. N is odd
In this case, N/2 is not an integer, so there is no “middle element” F [N/2]. The N -dimensional
DFT is uniquely determined by the elements
F [0], F [1], · · · , F [int(N/2)], (20)
where int(x) denotes “the integer part of x”.
Example: Let N = 9. The periodic DFT 9-vector is composed of the elements
F [0], F [1], · · · , F [4], F [5], · · · , F [8]. (21)
There is no “middle element”. Here, int(N/2) = int(9/2) = 4. Because of conjugate symmetry,
the elements in (21) are determined by the DFT coefficients F [0], · · · , F [4]. (Setting k = 4 in (15)
yields F [5] = F*[4].) Since elements F [1] to F [4] are complex, the total number of degrees of
freedom is 1 + 4 × 2 = 9.
Finally, we note that Examples 1 and 2 of the previous lecture, the cos(2xn) and sin(2xn) data
sets, along with Example 3, where an additional sin(5xn) set was added, demonstrated conjugate
symmetry.
High- and low-frequency DFT coefficients
From Eq. (15), it follows that
|F [N − k]| = |F [k]|, k = 0, 1, 2, · · · , N − 1. (22)
In other words, a plot of the magnitudes of DFT coefficients |F [k]| will be symmetric with respect to
the “middle”:
1. In the case that N is even, the symmetry will be about the line k = N/2.
2. In the case that N is odd, the symmetry will be about the line k = N/2, a non-integer.
In order to simplify the discussion, and keeping in mind that most, if not all, applications employ even
values of N , we shall assume, from this point onward, that N is even.
The plots of DFT coefficient magnitudes presented in the previous lecture all demonstrate this
symmetry.
From our discussions of Fourier series expansions, we know that the magnitudes of the coefficients
an and bn of cos(nx) and sin(nx), respectively, decay to zero in the limit n → ∞. Since the DFT is
based on complex exponentials, we also expect the DFT coefficients |F [k]| to decay with increasing k.
But from the conjugate symmetry property, the region of high-frequency DFTs is centered about the
value k = N/2, as sketched below.
General behaviour of magnitudes |F [k]| of DFT coefficients, 0 ≤ k ≤ N − 1: low-frequency regions near k = 0 and k = N − 1, with increasing oscillation toward the middle and the highest frequency near k = N/2.
In applications, it is often more desirable to consider a slightly revised version of this plot of DFT
coefficient magnitudes. Recalling that the DFT values F [k] are N -periodic, the right-half of the above
figure is identical to the plot of DFT coefficients immediately to the left of the point k = 0. As such,
we may consider the plot of F [k] magnitudes centered at k = 0, as shown below. In this way, moving
away from k = 0, in either direction corresponds to higher frequencies.
We shall discuss the region of high-frequency DFT coefficients very shortly.

A revised version of the above plot of the magnitudes |F [k]| of DFT coefficients: centred at k = 0, with −N/2 + 1 ≤ k ≤ N/2, the low-frequency region lies around k = 0 and the high-frequency regions lie toward k = ±N/2.

Shift Theorem
As mentioned in the previous lecture, the Shift Theorem may be applied more than once to treat
sequences that have been left- or right-shifted by M entries. We leave it as an exercise for the reader
to show how the Shift Theorem may be used to derive the DFT of the sin(2xn) sequence of Example
2 of the previous lecture from the cos(2xn) sequence of Example 1 (or vice versa).
Convolution Theorem
We shall discuss some interesting consequences and applications of this theorem in the next lecture.
“Thresholding” of DFT coefficients as a method of data compression
Data compression is a fundamental area of research and development in signal and image processing.
Practically speaking, you’d like to get as many songs or images on a DVD – or your iPod – as you
can. But you probably know that in order to squeeze more songs on a device, you have to “compress”
the digital data sets representing the songs/images, i.e., reduce the storage space required for each
item. But reducing the storage space means throwing out some information, implying a reduction
in the quality or fidelity of the song or image. Because of the redundancy of signals and images,
some compression is possible without any noticeable changes in aural or visual quality. But as the
compression is pushed higher, noticeable distortions eventually appear – for example, echoing or hissing
in the case of audio and blockiness or blurring in the case of images.
It is not the purpose of this course to study data compression methods in any detail. However,
our study of Fourier, and later wavelet, transforms naturally takes us to some of the basic concepts
that underlie compression methods.
The truncation of the Fourier series expansion of a function f(x), i.e., approximating f(x) by the
partial sum SN (x) may be viewed as a compression method. In general, it is impossible to store all
Fourier coefficients an and bn. We know that for a function f ∈ L2[a, b], the coefficients an and bn
decay to zero: for a sufficiently large N , all coefficients for which n > N will be “negligible.” (Of
course, as we have discussed, N will depend on the decay of the coefficients which, in turn, depends
on the regularity of the function f .)
In the case of discrete data sets, i.e., the discrete Fourier transform, we already have a finite
number of coefficients representing our signal/image of interest. Given a signal f ∈ RN , the N DFT
coefficients F [k], k = 0, 1, · · · , N − 1, permit a perfect reconstruction of f . We now wish to perform
compression on this data set.
That being said, it is probably useful to mention one simple, yet important, fact: You can’t
perform compression by simply deleting the signal values f [n]. That would be too brutal. That is not
to say that you couldn’t exploit the redundancy of the signal, e.g., that contiguous elements – f [n]
and f [n + 1] – of the signal are generally close in magnitude. You could do this by keeping f [n] and
perhaps using a single digit, say “1”, to indicate that f [n] is repeated in the next data element. Or
you could register the difference between f [n] and f [n + 1] which generally would require less storage
space. This is the essence of “predictive coding.” But this is getting us deeper into the subject of
data compression and away from the course. Here, we simply wish to show that one can work on the
discrete transforms – Fourier and wavelet – of signals.
One of the simplest methods of performing compression is thresholding, i.e., a perhaps less brutal
removal of “insignificant coefficients” in a discrete Fourier or wavelet transform. (The words significant
and insignificant are heavily used in signal/image compression literature, especially with regard to
wavelets.) The idea is simple: You set a threshold value ǫ > 0, and delete all coefficients with
magnitudes less than ǫ. Let’s present this a little more mathematically:
Thresholding algorithm: Let f = (f [0], f [1], · · · , f [N − 1]) ∈ RN (or CN) represent our signal of
interest, with DFT,
F = Ff = (F [0], F [1], · · · , F [N − 1]) ∈ CN . (23)
Let ǫ ≥ 0 be a threshold parameter. Now define the new sequence
F̃ǫ = (F̃ǫ[0], · · · , F̃ǫ[N − 1]), (24)
109
as follows,
F̃ǫ[k] = F [k] if |F [k]| ≥ ǫ, and F̃ǫ[k] = 0 if |F [k]| < ǫ. (25)
Then define
f̃ǫ = F−1F̃ǫ. (26)
f̃ǫ is the “compressed signal” corresponding to the threshold parameter ǫ. Of course, f̃0 = f , since
you haven’t thrown away any DFT coefficients. We expect that for “small” values of ǫ, f̃ǫ should
approximate f well. (The question, of course, is “What is small?” It will depend upon the signal, in
particular on the entire ensemble of DFT coefficients F [k].) This could be summarized in the statement
‖f − f̃ǫ‖ → 0 as ǫ → 0. (27)
The quantity on the left may be viewed as the approximation error.
Of course, the question that remains is, “What values of ǫ do we use?” This depends on the
signal and the nature of the DFT spectrum F [k]. One could, for example, sort the DFT coefficients in
decreasing order of magnitude and then decide where to “cut.” Instead of prescribing an ǫ-value, one
may also decide to throw away a prescribed percentage of DFT coefficients based on their insignificance,
e.g., the 10% most insignificant coefficients.
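The thresholding algorithm of Eqs. (23)–(26) fits in a few lines. In this Python sketch, the helpers `dft`, `idft` and `threshold_compress` and the toy signal are illustrative assumptions, not code from the lecture:

```python
import cmath

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

def idft(X):
    """Inverse DFT: x[n] = (1/N) sum_k X[k] exp(+i 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N for n in range(N)]

def threshold_compress(f, eps):
    """Zero out all DFT coefficients with |F[k]| < eps, then invert."""
    F = dft(f)
    F_eps = [Fk if abs(Fk) >= eps else 0.0 for Fk in F]
    kept = sum(1 for Fk in F_eps if Fk != 0.0)
    return idft(F_eps), kept

f = [1.0, 0.9, 1.1, 1.0, -1.0, -1.1, -0.9, -1.0]   # toy signal
f_tilde, kept = threshold_compress(f, 0.5)
assert kept <= len(f)
# eps = 0 keeps every coefficient and reconstructs f exactly
f_exact, _ = threshold_compress(f, 0.0)
assert all(abs(a - b) < 1e-9 for a, b in zip(f, f_exact))
```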
Example: We now illustrate this method with a simple example. Consider the following function
f(x) = e−x2/10[sin(2x) + 2 cos(4x) + 0.4 sin(x) sin(10x)], 0 ≤ x ≤ 2π. (28)
(This function was also used in the book of Boggess and Narcowich, p. 139.) We sample this function
on [0, 2π] with N = 256, i.e.,
f [n] = f(xn), xn = 2πn/N, n = 0, 1, · · · , 255. (29)
This original sampled signal is plotted on the top left of the next figure. Consecutive data points f [n]
have been connected with straight lines so that the signal may be seen more clearly. On the top right
is a plot of the magnitudes of the DFT coefficients F [k] of this signal. Note that the 2π-extension of
this signal is not continuous at the endpoints, since f(0) ≠ f(2π).
The sets of plots that follow show the results of thresholding for increasing values of ǫ: 1.5, 2.5,
10.0, 20.0 and 40.0. In each case, the signal f̃ and thresholded DFT spectrum F̃ are shown. Also, the
percentage of coefficients retained by the thresholding procedure is given, as well as the L2 (Euclidean)
distance ‖f − f̃‖ between f and f̃ , as well as the percentage relative error of approximation, computed
as follows,
(‖f − f̃‖/‖f‖) × 100. (30)
There are some noteworthy features demonstrated in these plots:
1. First of all, as expected, as ǫ increases, more and more of the inner “high frequency” region of
the DFT spectrum gets deleted.
2. Also as expected, the L2 (Euclidean) distance ‖f − f̃‖ increases as ǫ increases.
3. Note that at small ǫ values, e.g., ǫ = 1.5, the most significant error/distortion to the signal
occurs in the region 5 ≤ x ≤ 6. Indeed, one observes “ringing” there. This is because of the
discontinuity of the 2π-periodic extension of f : The value of f(x255) is close to zero, whereas
f(x256) = f(x0) = 2. That being said, we must mention that the use of all 256 DFT coefficients
will reconstruct the data series f [n] perfectly. However, when terms of the DFT are deleted, the
Gibbs phenomenon will appear.
4. As ǫ is increased, the signal is altered in other regions as well, e.g., the relative minimum near
x = 0.6, and the relative maximum near x = 0.1.
Finally, let us return to the observation that the relative error in approximation of the signal f by
f̃ǫ increases with ǫ. In the next figure is plotted the relative error vs. ǫ for 0 ≤ ǫ ≤ 110. To construct
this plot, the relative errors were computed in increments of ∆ǫ = 0.5.
This figure is a simple example of a rate-distortion curve: a plot of the distortion, or error, vs.
the rate of thresholding. There are some noteworthy features in this plot:
1. For 0 ≤ ǫ < 1, there is no noticeable error. This might be because there are no coefficients with
magnitudes less than these ǫ values, or there are so few that their absence does not affect the
fidelity of reconstruction.
2. As ǫ is increased, there are flat regions. Once again, these might correspond to regions that are
not occupied by the DFT coefficients.
3. At ǫ = 107.5, 100% error is achieved. This is because the DFT coefficient with the highest
magnitude is F [4] = 107.36.
Thresholding: Given an ǫ > 0, remove all DFT coefficients |F [k]| < ǫ.
Left: Original signal f [n], n = 0, · · · , 255, obtained by sampling f(x) in Eq. (28). Right: Magnitudes |F [k]| of
DFT coefficients. ‖f‖ = 12.95.
Threshold ǫ = 1.5. 44.9% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes |F̃ [k]|. L2
error ‖f − f̃‖2 = 0.83. Relative L2 error 6.3%.
Threshold ǫ = 2.5. 25.4% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes |F̃ [k]|. L2
error ‖f − f̃‖2 = 1.19. Relative L2 error 9.0%.
Threshold ǫ = 10.0. 8.2% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes |F̃ [k]|. L2
error ‖f − f̃‖2 = 2.18. Relative L2 error 16.5%.
Threshold ǫ = 20.0. 5.1% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes |F̃ [k]|. L2
error ‖f − f̃‖2 = 3.75. Relative L2 error 28.5%.
Threshold ǫ = 40.0. 3.1% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes |F̃ [k]|. L2
error ‖f − f̃‖2 = 5.44. Relative L2 error 41.4%.
Relative error of approximation of signal f with thresholded signal f̃ǫ vs. ǫ.
In order to clarify the questions raised in points 1 and 2 above, it is probably more instructive to
plot the relative error vs. the percentage of coefficients removed: this is a more accurate indication of
the compression rate. This plot is shown below.
Relative error of approximation of signal f with thresholded signal f̃ǫ vs. percentage of coefficients removed.
This plot is much more instructive. It shows that, in fact, significant compression can be performed
before the error becomes appreciable. When 70% of the coefficients are removed, the relative error is
10%.
For this simple example, the compression ratio R may be defined as
R = (number of DFT coefficients used in original signal) / (number of DFT coefficients used in thresholded signal). (31)
In this case, the numerator is N = 256. A plot of the relative error vs. compression rate is shown
below. This is the true rate-distortion curve for this experiment.
Relative error of approximation of signal f with thresholded signal f̃ǫ vs. compression rate.
The point situated roughly at (3, 10) corresponds to the 70% removal rate mentioned for the
previous plot. The plot stops at R = 128 – after this rate, virtually all coefficients are removed, which
corresponds to R = ∞, where the asymptotic value of 100% relative error is achieved.
Lecture 11
Discrete Fourier transforms (cont’d)
Thresholding as a method of denoising
The method of thresholding can also be used to “denoise” signals. (In fact, it has been a rather
standard method for wavelet-based denoising of signals and images.) First of all, we should qualify
what we mean by “noisy signals/images.”
The transmission, reproduction or recording of signals/images, as “pure” as they may be initially,
generally introduces distortions. Some of these distortions may be quite systematic in nature, e.g.,
the scratch on a lens of a digital camera. But signals/images may also be subject to distortions that
may be considered as random in nature, for example, the distortion of an audio signal that is sent
over a very poor communications line. There are various models for such degradations, according to
the application. In what follows, we employ one of the simplest and most standard models, namely
additive Gaussian noise. Our actual implementation of this model is also quite simple. (That
being said, the majority of research papers basically use the same type of simplified model.)
Let f0 = (f0[0], f0[1], · · · , f0[N − 1]) denote a “pure” or “noiseless”, i.e., undegraded, signal. For
example, it could represent part of an audio track that was recorded in a “perfect studio”. (Of course,
no such studio exists.) We then assume that this perfect signal f0 is degraded according to the
following model,
f = f0 + n, (32)
where n ∈ RN denotes a random N -vector. The components n[i], 0 ≤ i ≤ N − 1 (we’ll use i as
an index instead of the usual n, to avoid the confusing notation “n[n]”) are independent random
variables, which are identically distributed according to the normal or Gaussian distribution N (0, σ),
i.e., zero-mean, standard deviation σ > 0. The vector f then represents the noisy signal. Of course,
what we want is to find f0, or at least a good approximation to it, from f .
As you know from probability/statistics, a proper interpretation of this model implies that we
must consider a large collection or ensemble of such noisy signals produced by this random process.
f0 remains the same, but we’ll have many different noisy signals f produced by the random N -vectors
n. And if we examined the values assumed by a particular entry in the n vectors, say n[5], we would
see that, very roughly, the mean of these values would be near zero, and the standard deviation near σ.
Actually, let’s stop here for a moment and mention that this represents one way of extracting
approximations to the noiseless signal f0: by collecting a large number M of such distorted signals f
and taking the average of them. If M is large enough, then the average of the n vectors will be roughly
(0, 0, · · · , 0) ∈ RN . Therefore the average of all of these noisy signals will be a rough approximation
to f0. This is one of the oldest methods of noise reduction.
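A rough Python simulation of this averaging denoiser (with an illustrative constant “signal”, a fixed random seed and arbitrary parameter choices, none taken from the lecture) shows the noise shrinking as copies are averaged:

```python
import random

random.seed(0)
N, sigma, M = 64, 0.5, 2000
f0 = [1.0] * N                      # toy noiseless signal (illustrative)
# Average M independent noisy copies f = f0 + n, with n[i] ~ N(0, sigma)
avg = [0.0] * N
for _ in range(M):
    for i in range(N):
        avg[i] += f0[i] + random.gauss(0.0, sigma)
avg = [a / M for a in avg]
# A single noisy copy, for comparison
single = [f0[i] + random.gauss(0.0, sigma) for i in range(N)]
avg_dev = max(abs(a - b) for a, b in zip(avg, f0))
single_dev = max(abs(a - b) for a, b in zip(single, f0))
# The averaged signal is far closer to f0 than a single noisy copy is
assert avg_dev < 0.2 and avg_dev < single_dev
```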
Here, however, we assume that we do not have access to a large number of noisy signals, but only
one – produced, in essence, from a particular realization of the random vector n. In the numerical
experiment below, this particular realization is constructed by simply generating N random numbers
from a random number generator that is designed to generate them according to a normal N (0, σ)
distribution.
Here is the important point:
By “denoising” the noisy signal f , we mean finding approximations to the
noiseless signal f0.
We can never find f0 exactly, since the elements of the random vector n are not known deterministically.
The best we can do is to find approximations to f0.
For the noiseless signal, we shall once again employ the discrete signal of length N = 256,
f0[n] = f(xn), xn = 2πn/N, n = 0, 1, · · · , 255, (33)
obtained by sampling the function
f(x) = e−x2/10[sin(2x) + 2 cos(4x) + 0.4 sin(x) sin(10x)], 0 ≤ x ≤ 2π. (34)
The function is plotted once again at the top left in the figure below.
A particular vector n ∈ RN was also generated by means of a random number generator (in the
FORTRAN programming language) using standard deviation value σ = 0.1. The vector n is plotted
at the top right in the figure below.
Finally, the noisy signal f = f0 + n is constructed by adding the components of these two signals,
cf. Eq. (32). The result is plotted at the bottom of the figure.
We now show that some “denoising” of the signal f , i.e., finding approximations to f0, may be
achieved by thresholding the discrete Fourier transform F of f . First of all, recall that the DFT is a
Left: Noiseless signal f0[n], sampled from f(x) in Eq. (34). Right: Noise vector n, σ = 0.1.
Resulting noisy signal f = f0 + n.
linear operator. This means that
F = F(f) = F(f0 + n) = F(f0) + F(n) = F0 + N. (35)
Here we run the risk of confusion since N , the DFT of n, also denotes the number of samples. We
hope that things will be clear by context.
This addition property of the DFTs is illustrated in the next figure. At the top left is the DFT
F0 of the noiseless signal f0. At the top right is the DFT N of the pure-noise signal n. These two are
added to produce the DFT F of the noisy signal f .
Left: Noiseless DFT F0, obtained from the sampled signal f0. Right: Noise DFT N , obtained from noise vector n.
Resulting noisy DFT F = F0 + N .
We now step back and examine the DFT N of the pure-noise signal. Perhaps the most noteworthy
feature of this plot is that the DFT coefficients do not exhibit the decay characteristic of “normal”
signals. In fact, they do not appear to decay at all. It is a fact, which will not be proved here,
that the coefficients N [k] are also random – for all intents and purposes, we may view them as being
generated randomly from a normal distribution. (That being said, each of the coefficients N [k] is
related deterministically to the noise vector coefficients n[i].) In the next figure, we show how the
amplitude of the DFT coefficients is related to the amplitude of the noise vector n. It shows the
noise vector n used in this experiment, with σ = 0.1 (top), along with its DFT, and a noise vector
corresponding to σ = 0.5, so that the random entries can assume values of larger magnitude.
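The “flatness” of the noise spectrum can be illustrated with Parseval’s relation for this (unnormalized) DFT, Σ_k |N [k]|² = N Σ_i n[i]², a standard fact not proved in these notes. The Python sketch below (fixed seed, assumed naive `dft` helper, arbitrary parameters) also checks that the high-frequency half of the spectrum carries a comparable share of the energy:

```python
import cmath, random

def dft(x):
    """Naive DFT: X[k] = sum_n x[n] exp(-i 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

random.seed(1)
N, sigma = 128, 0.1
noise = [random.gauss(0.0, sigma) for _ in range(N)]
Nk = dft(noise)
# Parseval: the noise energy is spread over all N frequency bins
lhs = sum(abs(z) ** 2 for z in Nk)
rhs = N * sum(v * v for v in noise)
assert abs(lhs - rhs) < 1e-6 * rhs
# The middle (high-frequency) half holds a comparable share of the energy,
# i.e. the spectrum does not decay the way a "normal" signal's does
hi = sum(abs(Nk[k]) ** 2 for k in range(N // 4, 3 * N // 4))
assert 0.2 < hi / lhs < 0.8
```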
Noise and its DFT representation
Pure-noise signal n, zero-mean, σ = 0.1, and corresponding DFT coefficient magnitudes |N [k]|.
Pure-noise signal n, zero-mean, σ = 0.5, and corresponding DFT coefficient magnitudes |N [k]|.
We now come to the main point behind this thresholding denoiser. The coefficients N [k] of
the pure noise vector are relatively small in magnitude for all frequencies k. They are seen to be
insignificant with respect to the low-frequency coefficients F0[k] of the noiseless signal. They are not
insignificant with respect to the high-frequency coefficients F0[k]. Therefore we shall assume that
most of the high-frequency content of the noisy DFT F comes from noise. Since these coefficients are
insignificant with respect to much of the low-frequency content, we conjecture that thresholding might
be able to “remove” much of the noise content in F , and thereby provide reasonable approximations
to F0, hence f0.
The results of thresholding for a number of ǫ values are shown in the next figures. For each ǫ
value, the resulting signal is presented, as well as the L2 error and the relative L2 error of approximation
to the noiseless signal f0.
What is rather interesting, and potentially discouraging, is that the L2 error, 1.71, for the case
ǫ = 2.0 is greater than the error, 1.66, for the actual noisy signal! As ǫ is increased, however, the error
decreases – it is 1.64 at ǫ = 3.0 – but then increases again.
The results are not very encouraging and, indeed, thresholding of DFTs is not a very good
method. But it’s not the thresholding that’s the problem – it’s the DFTs. They are too global: each
DFT coefficient contains information from the entire signal. We’ll see later that thresholding works
quite well with wavelet transforms, because of the locality of wavelet functions.
Simple denoising by thresholding
Left: Original noisy signal f = f0 + n, σ = 0.1. Right: Magnitudes |F [k]| of its DFT coefficients. L2 error
‖f0 − f‖2 = 1.66. Relative L2 error 12.6%.
Threshold ǫ = 2.0. 57.4% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes
|F̃ [k]|. L2 error ‖f0 − f̃‖2 = 1.71. Relative L2 error 13.0%.
Threshold ǫ = 3.0. 37.1% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes
|F̃ [k]|. L2 error ‖f0 − f̃‖2 = 1.64. Relative L2 error 12.4%.
Threshold ǫ = 4.0. 25.3% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes
|F̃ [k]|. L2 error ‖f0 − f̃‖2 = 1.70. Relative L2 error 12.9%.
Threshold ǫ = 5.0. 16.8% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes
|F̃ [k]|. L2 error ‖f0 − f̃‖2 = 1.72. Relative L2 error 13.0%.
Threshold ǫ = 10.0. 9.0% of original coeffs retained. Reconstructed signal f̃ [n] and magnitudes
|F̃ [k]|. L2 error ‖f0 − f̃‖2 = 2.06. Relative L2 error 15.6%.
A closer look at the Convolution Theorem
In this section, we examine some particular examples, along with a very simple, yet interesting,
application to signal processing.
Recall the Convolution Theorem:
Let f and g be two N -periodic complex vectors. Define the (circular) convolution of these two
vectors as the vector h with components
h[n] = Σ_{j=0}^{N−1} f [j]g[n − j], n = 0, 1, · · · , N − 1. (36)
Then the DFT of h is related to the DFTs of f and g as follows,
H[k] = F [k]G[k]. (37)
We first rewrite the RHS of (36) slightly, via a change of variables:
h[n] = Σ_{j=0}^{N−1} f [n − j]g[j], n = 0, 1, · · · , N − 1. (38)
In this way, we can view f as a “signal” and g as a “mask”: The convolution then produces a new
signal h from f .
In the figure below, we align the vector g with f in a manner appropriate to the convolution. The
terms that are joined by lines are multiplied and then added together to form the entry h[n]:
Terms in the convolution of f with g contributing to h[n]: g[0] is aligned with f [n], g[1] with f [n − 1], and g[−1] = g[N − 1] with f [n + 1]; the aligned terms are multiplied and then summed.
The convolution operation may be viewed as a kind of “reversed scalar product”: to compute h[n],
we “flip” the order of the elements of g with respect to g[0], which is lined up with f [n], and then
perform the scalar product. We’ll come back to this idea in our study of wavelets.
Let us now examine a few special cases for g:
1. g[0] = 1 and g[n] = 0 otherwise: The only term that contributes to the sum in Eq. (36) is
f [n]g[0] = f [n]. Therefore
h[n] = f [n], n = 0, 1, · · · , N − 1, (39)
or simply h = f . This has the appearance of an identity operation, but it is more convenient to
view g as the discrete version of the Dirac delta function. This will become clearer in the next
example.
In this example, the DFTs of f and h are identical, i.e., H = F . From the Convolution Theorem,
H[k] = F [k]G[k], implying that G[k] = 1. But we could have also derived this result by directly
computing the DFT of g:
G[k] = Σ_{n=0}^{N−1} g[n] exp(−i2πkn/N)
     = g[0] exp(0) (since only g[0] is nonzero)
     = 1. (40)
2. g[1] = 1 and g[n] = 0 otherwise: The only term that contributes to the sum in Eq. (36) is
f [n − 1]g[1] = f [n − 1]. Therefore,
h[n] = f [n − 1], n = 0, 1, · · · , N − 1. (41)
Thus,
(h[0], h[1], · · · , h[N − 1]) = (f[N − 1], f[0], f[1], · · · , f[N − 2]). (42)
In other words, g corresponds to the right-shift operator.
We’ll leave it as an exercise for the reader to determine the DFT of g, i.e., G[k], in two different
ways.
3. Of course, we can generalize the above result: g[k0] = 1 and g[n] = 0 otherwise, where k0 ∈
{0, 1, · · · , N − 1}. Then g is a k0-fold right-shift operator.
Once again, we’ll leave it as an exercise for the reader to determine the DFT of g, i.e., G[k], in
two different ways.
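These delta-mask examples are easy to check numerically. The following Python sketch is ours, not part of the original notes (the course's own code, shown later, is in MATLAB): it builds a mask with a single 1 at a position k0, computes the circular convolution of Eq. (38) directly, and verifies both the shift behaviour and the Convolution Theorem.

```python
import numpy as np

N = 8
rng = np.random.default_rng(0)
f = rng.standard_normal(N)

# Mask with a single 1 at position k0 (k0 = 1 is example 2 above).
k0 = 1
g = np.zeros(N)
g[k0] = 1.0

# Circular convolution h[n] = sum_j f[n - j] g[j], indices taken mod N (Eq. (38)).
h = np.array([sum(f[(n - j) % N] * g[j] for j in range(N)) for n in range(N)])

# g acts as the right-shift operator: h[n] = f[n - 1] (mod N).
assert np.allclose(h, np.roll(f, k0))

# Convolution Theorem: H[k] = F[k] G[k].
assert np.allclose(np.fft.fft(h), np.fft.fft(f) * np.fft.fft(g))

# Direct DFT of the mask: G[k] = exp(-i 2 pi k k0 / N).
k = np.arange(N)
assert np.allclose(np.fft.fft(g), np.exp(-2j * np.pi * k * k0 / N))
```

Setting k0 = 0 recovers the Dirac delta mask of example 1, with G[k] = 1 for all k.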
Lecture 12
Discrete Fourier transforms (cont’d)
A closer look at the Convolution Theorem: “Averaging” as a convolution
With reference to Eq. (38) from the previous lecture, we now consider the following “mask” g: For
an α ∈ [0, 1],
g[0] = α,
g[1] = (1/2)(1 − α),
g[−1] = g[N − 1] = (1/2)(1 − α),
g[n] = 0, otherwise. (43)
In other words, g has at most three non-zero elements. When α = 1, we have the Dirac delta mask.
Note that
g[0] + g[1] + g[−1] = 1. (44)
The convolution of a signal f with g then produces the signal

h[n] = g[1]f[n − 1] + g[0]f[n] + g[−1]f[n + 1]
     = αf[n] + (1/2)(1 − α)(f[n − 1] + f[n + 1]). (45)
This may be viewed as a weighted averaging of f [n] with its immediate neighbours to produce a new
signal value h[n]. In the special case α = 1/3, the weighting is uniform:
h[n] = (1/3)(f[n − 1] + f[n] + f[n + 1]). (46)
Since only the immediate neighbours of f [n] are employed in this averaging procedure, it is
often referred to as “local averaging.” The effect of this procedure is to “smoothen out” a signal.
For example, if f [n] lies higher in value than its neighbours, as sketched below, then averaging will
produce a lower value. And, of course, if f [n] lies lower in value, then averaging will produce a higher,
i.e., more positive, value.
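As a concrete check, here is a minimal Python sketch of this mask (the helper name local_average is ours); with α = 1/3 it reproduces the edge behaviour described above:

```python
import numpy as np

def local_average(f, alpha=1/3):
    """Eq. (45): h[n] = alpha*f[n] + (1 - alpha)/2 * (f[n-1] + f[n+1]),
    with indices taken circularly (mod N)."""
    # np.roll(f, 1)[n] = f[n-1] and np.roll(f, -1)[n] = f[n+1]
    return alpha * f + 0.5 * (1 - alpha) * (np.roll(f, 1) + np.roll(f, -1))

# A signal with an "edge" between positions 4 and 5.
f = np.array([1., 1., 1., 1., 1., 0., 0., 0., 0., 0.])
h = local_average(f)
# h[4] moves one-third of the way down (to 2/3) and h[5] one-third of the
# way up (to 1/3); values inside the constant regions are unchanged.
# (Being circular, the averaging also blurs the wrap-around edge at 9/0.)
```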
This “smoothing” effect may also be viewed as “blurring”, especially if there are sharp discontinuities in the signal, as sketched in the next figure. Signal f consists of two flat, i.e., constant, regions,
Figure: Local averaging of f[n] with its neighbours f[n − 1] and f[n + 1] to produce h[n].
and an “edge,” or discontinuity, between 4 and 5. One application of the convolution/local averaging
will lower the value of f[4] one-third of the way down towards the value of f[5], producing the value h[4],
and raise the value of f[5] one-third of the way up, producing the value h[5]. In summary, signal values
are changed at 4 and 5. The other signal values are unaffected since they lie in constant regions –
local averaging will not change their values. The result of this operation is a slightly “blurred” edge,
i.e., a more gradual change in values from the highest ones to the lowest ones.
Figure: The blurring of an “edge” or discontinuity of a signal by local averaging. Top: original signal f, with edge between 4 and 5. Middle: one application of local averaging, h = f ∗ g. Bottom: another application of local averaging, r = h ∗ g = f ∗ (g ∗ g).
Another application of the averaging operator will alter the values of the signal at 3, 4, 5 and 6.
The reader can see that the gradient of the signal has been further decreased in magnitude, i.e., the
graph has become less steep.
One final point: the reader may have already noticed how each application of the local averaging
operator increases the region of influence, i.e., the points affected by the averaging. Each application
affects an additional signal value, previously unaffected, on either side of the original edge.
Local averaging viewed in the frequency domain
Let us now examine what is happening in the frequency domain, i.e., in “k-space,” with the DFTs.
Once again, the DFT H of the blurred signal will be related to F as follows,
H[k] = F [k]G[k], k = 0, 1, · · · , N − 1. (47)
Since we know g, we may compute G[k]: By definition,
G[k] = Σ_{n=0}^{N−1} g[n] exp(−i2πkn/N)
     = g[−1] exp(i2πk/N) + g[0] exp(0) + g[1] exp(−i2πk/N)
     = α + (1/2)(1 − α)[exp(i2πk/N) + exp(−i2πk/N)]
     = α + (1 − α) cos(2πk/N). (48)
Therefore,

H[k] = F[k][α + (1 − α) cos(2πk/N)]. (49)
One immediate consequence of this relation is that

H[0] = F[0]. (50)

In other words, the zero-frequency component of F is unchanged. But what about the other frequencies?
We need to examine the graph of the function G[k] vs. k.
First, we identify some other important values:
G[N/2] = 2α − 1, G[N/4] = G[3N/4] = α. (51)
A qualitative sketch of the graph of G[k] for α < 1/2 is shown in the next figure.
Figure: The DFT G[k] = α + (1 − α) cos(2πk/N) of the local averaging convolution kernel g, showing the dampening of magnitudes of high-frequency DFT coefficients.
Perhaps the most important feature of the graph is that
|G[k]| < 1, 1 ≤ k ≤ N − 1. (52)
Then from the fact that
|H[k]| = |F [k]||G[k]|, (53)
we may conclude that
|H[k]| < |F [k]|, 1 ≤ k ≤ N − 1. (54)
In other words, the magnitudes of the DFT coefficients F [k] have been reduced to produce H[k]. For
the particular case α = 1/3, the degree of shrinking is greatest in the high-frequency region, i.e.,
N/4 ≤ k ≤ 3N/4.
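Formula (48) and the special values (51) are easy to verify numerically; the Python sketch below (ours, not from the notes) compares the numerical DFT of the mask against the closed form.

```python
import numpy as np

N, alpha = 256, 1/3
g = np.zeros(N)
g[0] = alpha
g[1] = g[N - 1] = 0.5 * (1 - alpha)   # g[-1] = g[N-1]

G = np.fft.fft(g)                     # numerical DFT of the mask
k = np.arange(N)
G_formula = alpha + (1 - alpha) * np.cos(2 * np.pi * k / N)   # Eq. (48)

assert np.allclose(G, G_formula)      # imaginary parts vanish, as expected
assert np.isclose(G_formula[0], 1.0)                 # G[0] = 1
assert np.isclose(G_formula[N // 2], 2 * alpha - 1)  # G[N/2] = 2a - 1
assert np.isclose(G_formula[N // 4], alpha)          # G[N/4] = a
assert np.all(np.abs(G_formula[1:]) < 1)             # |G[k]| < 1, k != 0
```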
Of course, this result is not surprising – we expected that the blurring or smearing of a signal
means that higher frequency components are being diminished in magnitude. But the main point is
that our analysis allows us to move from deblurring or denoising operations in the spatial or temporal
domain to equivalent operations in the frequency domain. We may choose to modify the DFT F of a
signal in order to denoise/deblur it, rather than working on the signal f itself. Of course, the method
of thresholding of DFT coefficients examined in the previous lecture is an example.
Repeated applications of the averaging operator
Let us change the name of the averaged signal h = f ∗ g to be h1. If we now apply the averaging
operator to the averaged signal h1, the result is a new signal – call it h2:
h2 = h1 ∗ g = (f ∗ g) ∗ g. (55)
In the frequency domain, the DFT of h2 is H2 = H1G. But H1 = FG. The net result is
that

H2[k] = (F[k]G[k])G[k] = F[k]G[k]^2. (56)
It is straightforward to show that if we apply the convolution/averaging operator n times, the result
is the signal hn with DFT

Hn[k] = F[k]G[k]^n. (57)
Recall that G[0] = 1 and that |G[k]| < 1 for k ≠ 0. It follows that

Hn[0] = F[0], n = 1, 2, · · · , (58)

and, for k ≠ 0,

G[k]^n → 0 as n → ∞. (59)
This implies that in the limit n → ∞, the DFT Hn – recall that it is a complex N-vector
– will approach the limiting N-vector

H = (F[0], 0, 0, · · · , 0). (60)
The reader may already see that this corresponds to the DFT of a constant function, as expected: If
you keep taking averages, you eventually smooth the function out to a constant, i.e., h[n] = C. The
question is, “What is the value of the constant C?” We leave it for the reader to show that
C = (1/N) Σ_{n=0}^{N−1} f[n], (61)
i.e., the average value of the signal f . This seems to make sense, from a conservation principle, since
the convolution coefficients were chosen to “conserve signals,” cf. Eq. (44).
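This limiting behaviour can be confirmed numerically. The sketch below (our own Python, not from the notes) repeatedly applies the α = 1/3 mask to a random signal and checks that it flattens to the mean value (61), with DFT approaching (F[0], 0, · · · , 0) as in (60).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
f = rng.standard_normal(N)

alpha = 1/3
def local_average(s):
    # Eq. (45), circularly: h[n] = a*s[n] + (1-a)/2 * (s[n-1] + s[n+1])
    return alpha * s + 0.5 * (1 - alpha) * (np.roll(s, 1) + np.roll(s, -1))

h = f.copy()
for _ in range(20000):    # enough iterations for every G[k]^n, k != 0, to die out
    h = local_average(h)

# h is now (numerically) the constant C = (1/N) sum_n f[n], Eq. (61),
# and its DFT is (numerically) (F[0], 0, ..., 0), Eq. (60).
assert np.allclose(h, f.mean())
assert np.allclose(np.fft.fft(h)[1:], 0.0, atol=1e-8)
```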
Denoising by local averaging
The fact that local averaging/convolution smoothens a signal suggests that it may be able to perform
denoising, essentially by averaging out the fluctuations produced by the additive noise. This is also
supported by the fact that local averaging dampens the magnitudes of high-frequency coefficients.
As such, we consider applying the local averaging operator introduced in the previous section, with
α = 1/3, to the noisy signal f[n] examined in the previous lecture and treated with the thresholding
algorithm. Recall that the noisy signal was given by f = f0 + n, where f0 is the noiseless signal
given by Eq. (34) and n is an N -vector composed of random numbers generated by a random number
generator from a normal distribution N (0, σ), zero-mean and standard deviation σ = 0.1.
The results are shown in the accompanying figures. We first present again the noiseless signal f0
along with its DFT, then the noisy signal f along with its DFT. In the following figures, we show the
results of applying one, two, three and four convolutions to the signal. Most noticeably, the L2 error
between the signal and the noiseless signal f0 has been reduced from 1.66 to 1.31 after one application
of averaging. (Recall that the first application of thresholding resulted in an increase in the error.)
The L2 error is further reduced to 1.19 after another convolution. However, the third convolution
increases the error, as does the fourth. This decrease in error, followed by an increase, indicates the
tradeoff between smoothing of the signal to remove noise and smoothing of the signal away from the
underlying noiseless signal f0.
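The experiment is easy to reproduce in outline. Since Eq. (34) is not restated in this lecture, the Python sketch below (ours) substitutes a simple two-mode sinusoid for f0 – an assumption, so the error values differ from those quoted above – but the initial drop in the L2 error after averaging is the same.

```python
import numpy as np

# Stand-in for the noiseless signal f0 of Eq. (34), which is defined in
# the previous lecture (assumption: a simple two-mode sinusoid).
N = 256
x = 2 * np.pi * np.arange(N) / N
f0 = np.sin(x) + 0.5 * np.sin(3 * x)

rng = np.random.default_rng(0)
f = f0 + 0.1 * rng.standard_normal(N)      # noisy signal, sigma = 0.1

alpha = 1/3
avg = lambda s: alpha * s + 0.5 * (1 - alpha) * (np.roll(s, 1) + np.roll(s, -1))

h = f.copy()
errs = [np.linalg.norm(f - f0)]            # L2 error before any averaging
for _ in range(4):
    h = avg(h)
    errs.append(np.linalg.norm(h - f0))
# errs[1] < errs[0]: the first averaging removes more noise than it
# distorts the signal.  For signals with more high-frequency content,
# the error eventually turns back up as oversmoothing sets in.
```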
Note also the effects of the convolutions on the DFT spectra – one application of convolution
significantly diminishes the high-frequency coefficients in the region 75 ≤ k ≤ 175. After two
applications, this part of the spectrum is almost eliminated. Further applications continue to diminish
other parts of the spectrum, which may actually “oversmooth” the signal.
This experiment has shown that the local averaging/convolution method seems to work better
than the thresholding method. But it is only one experiment, and definite conclusions cannot yet be
drawn.
The connection between local averaging/convolution and high-frequency damping also suggests
that we could perform denoising in the frequency domain, as was done for thresholding. Instead of
merely discarding DFT coefficients deemed insignificant, we could apply some kind of dampening
factor to the spectrum, with greater dampening being performed for high frequencies.
I thank one of the students in the class (E. Grant) for telling me after class that this is essentially
Simple denoising by the convolution operation fc[n] = (1/3)(f[n − 1] + f[n] + f[n + 1]).
Left: Original noiseless signal, N = 256 samples, f0[n], n = 0, 1, · · · , 255. Right: Magnitudes |F0[k]| of DFT
coefficients.
Left: Noisy signal f = f0 + n, σ = 0.1, 0 ≤ x ≤ 2π, N = 256 samples. Right: Magnitudes |F[k]| of
DFT coefficients. L2 error ‖f0 − f‖2 = 1.66. Relative L2 error 12.6%.
Application of one convolution. Reconstructed signal f̃[n] and magnitudes |F̃[k]|. L2 error ‖f0 − f̃‖2 = 1.31.
Relative L2 error 10.0%.
Application of two convolutions. Reconstructed signal f̃[n] and magnitudes |F̃[k]|. L2 error ‖f0 − f̃‖2 = 1.19.
Relative L2 error 9.0%.
Application of three convolutions. Reconstructed signal f̃[n] and magnitudes |F̃[k]|. L2 error ‖f0 − f̃‖2 = 1.25.
Relative L2 error 9.5%.
Application of four convolutions. Reconstructed signal f̃[n] and magnitudes |F̃[k]|. L2 error ‖f0 − f̃‖2 = 1.29.
Relative L2 error 9.8%.
what is done by sophisticated audio processing software packages. For example, I have been told that
a software package will ask for a sample of the “background noise”. It then performs a frequency
analysis of this noise – analogous to the DFT – and then allows you to design a “shaper” to modify
these frequencies as desired.
If time permits, perhaps we can return to this topic near the end of this course. In the meantime,
the reader may wish to experiment with various “frequency-shaping” formulas in an effort to denoise
a given signal.
Signal/image enhancement and “Deconvolution”
In many practical applications, we are given a degraded signal, say, h, and asked to find good approximations to the original signal f that was degraded to produce h. The degradation could be caused by
noise or by blurring, or both. In fact, in the signal processing literature, the general model for degradation
is a composition of a blur with a noise operator (often additive noise).
If we happen to know (or assume!) that the degradation was accomplished by convolution with a
kernel g, then the DFTs of the degraded signal h and the original signal f are related as follows,

H[k] = F[k]G[k]. (62)

Now suppose that we know the operator g, hence the DFT coefficients G[k]. One may well be tempted
to solve for F[k] by division, i.e.,

F[k] = H[k]/G[k], k = 0, 1, · · · , N − 1, (63)

and then to perform an inverse DFT on F to obtain f, i.e., f = F⁻¹F.
Very nice in theory, but not often successful in practice! One reason is that some coefficients G[k]
may be zero, or very close to zero in magnitude; dividing by them amplifies any noise or numerical
error enormously, so the procedure is unstable. A more stable procedure would be to find a DFT F
that minimizes the squared distance

‖GF − H‖². (64)

This is a least-squares problem, which is generally more stable.
But there are other problems. Generally, such inverse problems, i.e., given an h, find f such
that h = f ∗ g, are said to be ill-posed because they lack unique solutions. There are often many, if
not an infinite number of, solutions that satisfy the relation, at least approximately. One must often
impose additional conditions on the solution during the process, which restricts the space of solutions
that we are exploring, but we still may be able to find useful approximations. The imposition of
additional conditions (do you recall the Lagrange multiplier technique in advanced calculus?) is
known as regularization.
The problem of “inverting” Eq. (62), i.e., find F given H, is known as “deconvolution,” for reasons
that should be clear: One obtains H from F by convolution, so obtaining F from H is the reverse
process, i.e., “undoing” the convolution.
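As an illustration of the instability and of one standard regularization (a sketch of our own, with a made-up test signal and regularization parameter λ = 10⁻³), compare naive division (63) with the Tikhonov-style division F̂[k] = conj(G[k]) H[k]/(|G[k]|² + λ), which per coefficient solves the least-squares problem (64) penalized by λ‖F‖². The blur here is three applications of the α = 1/3 averaging mask, whose DFT has near-zeros around k ≈ N/3 and 2N/3.

```python
import numpy as np

N = 256
x = 2 * np.pi * np.arange(N) / N
f = np.sin(x) + 0.3 * np.sin(5 * x)        # hypothetical original signal

# Blur kernel: the alpha = 1/3 averaging mask applied three times, so the
# effective transfer function is G[k]^3, with near-zeros around k = N/3, 2N/3.
alpha = 1/3
g = np.zeros(N)
g[0] = alpha
g[1] = g[-1] = 0.5 * (1 - alpha)
G = np.fft.fft(g) ** 3

rng = np.random.default_rng(0)
h = np.real(np.fft.ifft(np.fft.fft(f) * G)) + 1e-4 * rng.standard_normal(N)
H = np.fft.fft(h)

# Naive deconvolution, Eq. (63): noise at the near-zeros of G is amplified
# by factors of order 1/|G[k]|, which here reach roughly 10^6.
f_naive = np.real(np.fft.ifft(H / G))

# Tikhonov-regularized division: modes with tiny |G[k]| are damped instead.
lam = 1e-3
f_reg = np.real(np.fft.ifft(np.conj(G) * H / (np.abs(G) ** 2 + lam)))

err_naive = np.linalg.norm(f - f_naive)
err_reg = np.linalg.norm(f - f_reg)
# err_reg stays small; err_naive is larger by orders of magnitude.
```

The choice of λ trades stability against fidelity: larger λ suppresses noise amplification more strongly but also damps legitimate signal content, a simple instance of the regularization described above.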
Signal/image enhancement: When does one stop?
In the previous experiments on denoising using thresholding and convolution, we applied a particular
operation to the noisy signal several times and observed the errors between the enhanced (i.e., denoised)
signal and the reference signal f0. In both cases, as is found in general, the error first decreases,
but then increases. As such, there are optimal “cutoff times”. Of course, if we know the original signal
f0, we know when to stop. But what if we don’t know f0, the situation we face most often? For
example, we may retrieve a noisy or blurred image f from somewhere, perhaps our own digital camera
and wish to enhance it, i.e., denoise or deblur it. Most often, this is done by trial and error – we look
at the result of the enhancement process and decide if we are satisfied with the result. If not, we may
wish to continue with the enhancement process, perhaps tweaking the control parameters.
A big question in signal/image processing is, “How do we automate this process?” For example,
how can we program a computer to know when to stop applying an operation, say convolution, to a
noisy image, if we don’t know the noiseless image f0?
A simple illustration of DFT for audio signal processing: Handel’s “Hallelujah”
chorus
This lecture actually began with a presentation of some results of simple DFT-based denoising, now
applied to a real audio signal – Handel’s “Hallelujah” chorus – instead of a seemingly sterile mathematical function as done earlier. The results, along with the MATLAB file used to generate them
(written by D. Brunet, my former Ph.D. student and the TA for this course in Winter 2011), can be
found in a folder posted below this week’s set of lectures on Waterloo LEARN. The notes summarizing
the presentation are included below.
Appendix
DFT-based denoising of an audio signal: Handel’s “Hallelujah” Chorus
Here the ideas of thresholding and low-pass filtering for the purpose of denoising are applied to a more
realistic data set, namely an almost 9-second stream of Handel’s famous “Hallelujah” Chorus. I thank
Dominique Brunet, the TA for this course, for writing a MATLAB script, soundexampleDFT.m, which
performs these denoising operations. The MATLAB file is copied below and also posted along with
these notes at the course website on UW-ACE.
The Handel chorus is available from MATLAB as the digital audio file ’handel’, a data set of
N = 73113 points that were obtained by sampling the (continuous) audio signal at a frequency of
8192 Hz (i.e., 8192 = 2^13 samples per second). In the MATLAB program below, it is accessed by
means of the load command. The signal is then normalized to assume values in [−1, 1] by dividing all
elements by the data value with the highest magnitude. This “noiseless audio signal” is also written
into the file Hallelujah.wav, which is also posted.
Gaussian noise of standard deviation σ = 0.1 is then added to this data set to produce a noisy
audio signal, written into the file noisyHallelujah.wav, also posted.
%%
clear all
close all
%% load signal
VOLUME = 1; % multiplicative factor of the amplitude of the signal
load(’handel’,’Fs’,’y’); % load "handel" in vector ’y’ with sample frequency of ’Fs’ hertz
y=y/max(abs(y)); % normalize amplitude of signal
len = length(y); % number of samples in ’y’
%% generate white noise
seed = 0;
randn(’state’,seed) % initialize pseudo-random number generator
sigma = 0.1; % standard deviation of the noise
n = sigma*randn(len,1); % create vector of Gaussian noise
%% additive noise
yn = y + n;
%% low-pass filter
med = (len+1)/2; % middle of the signal
cut = med/2; % cut 50% of the signal
medmin = ceil(med - cut); % lower bound
medmax = floor(med + cut); % upper bound
Yn = fft(yn); % DFT
lYn = Yn;
lYn(medmin:medmax) = 0; % clamp middle frequencies
lyn = real(ifft(lYn)); % inverse-DFT
%% threshold
T = 10000;
Yn = fft(yn); % DFT
pYn = abs(Yn).^2; % Power Spectrum of Yn
tYn = Yn.*(pYn>T); % cut frequencies under threshold T
tyn = real(ifft(tYn)); % inverse-DFT
%% play all sounds
pause % (press ’enter’ to continue)
sound(y*VOLUME,Fs) % original
pause
sound(yn*VOLUME,Fs) % noisy
pause
sound(lyn*VOLUME,Fs) % low-pass filter
pause
sound(tyn*VOLUME,Fs) % hard threshold filter
%% display all sound waves
t=(0:len-1)/Fs; % time (step = 1/Fs)
pause
figure, plot(t,y), title(’Handel’), xlabel(’time (s)’), ylabel(’intensity (%)’)
pause
figure, plot(t,yn), title(’noisy Handel’), xlabel(’time (s)’), ylabel(’intensity (%)’), axis([0 9 -1 1])
pause
figure, plot(t,lyn), title(’low-pass Handel’), xlabel(’time (s)’), ylabel(’intensity (%)’), axis([0 9 -1 1])
pause
figure, plot(t,tyn), title(’hard threshold Handel’), xlabel(’time (s)’), ylabel(’intensity (%)’), axis([0 9 -1 1])
%% frequency analysis
Y = fft(y);
pY = abs(Y).^2;
%rY = real(Y);
%iY = imag(Y);
Yn = fft(yn);
pYn = abs(Yn).^2;
%rYn = real(Yn);
%iYn = imag(Yn);
plYn = abs(lYn).^2;
%rlYn = real(lYn);
%ilYn = imag(lYn);
ptYn = abs(tYn).^2;
%rtYn = real(tYn);
%itYn = imag(tYn);
pause
%% display all power spectrums
figure, plot(pY), title(’power spectrum Handel’)
%figure, plot(rY), title(’real Handel’)
%figure, plot(iY), title(’imaginary Handel’)
pause
figure, plot(pYn), title(’power spectrum noisy Handel’)
%figure, plot(rYn), title(’real noisy Handel’)
%figure, plot(iYn), title(’imaginary noisy Handel’)
pause
figure, plot(plYn), title(’power spectrum of low-pass filtered noisy Handel’)
%figure, plot(rlYn), title(’real of low-pass filtered noisy Handel’)
%figure, plot(ilYn), title(’imaginary of low-pass filtered noisy Handel’)
pause
figure, plot(ptYn), title(’power spectrum of hard thresholded noisy Handel’)
%figure, plot(rtYn), title(’real of hard thresholded noisy Handel’)
%figure, plot(itYn), title(’imaginary of hard thresholded noisy Handel’)
%% write sound waves
wavwrite(y,Fs,’Hallelujah.wav’)
wavwrite(yn,Fs,’noisyHallelujah.wav’)
wavwrite(lyn,Fs,’LPnoisyHallelujah.wav’)
wavwrite(tyn,Fs,’HTnoisyHallelujah.wav’)
The DFT of the noisy signal is then constructed using the fft command. Then two procedures
are applied to this noisy signal in an effort to denoise it:
1. Low-pass filtering: This is a removal of all high-frequency coefficients. In this experiment,
the removal was rather brutal: 50% of the coefficients were removed. (This means that, of the DFT
coefficients ranging from k = 0 to N − 1 = 73112, those in the range [N/4, 3N/4] were set to zero.)
The inverse DFT of the resulting set was then taken using the ifft command.
The resulting audio signal is stored in the file LPnoisyHallelujah.wav.
When this signal was played in class, it sounded quite “hollow”.
2. Thresholding: Here, all DFT coefficients of the noisy signal whose power (squared magnitude) fell
under a prescribed threshold T – in this case T = 10000 – were set to zero. The inverse DFT of the resulting set
was then taken and the resulting audio signal stored as HTnoisyHallelujah.wav.
When this signal was played in class, it sounded perhaps a little less hollow, but one could
hear some continuous high frequency whining. This is due to the fact that a small number of
high frequency DFT coefficients survived the thresholding. These few coefficients produced the
constant “whine.”
The graphs of all four audio signals – original (noiseless), noisy, low-pass denoised, threshold
denoised – are shown in the figure below, and the next figure shows the DFTs of these signals.
The corresponding audio files are also posted as .wav files.
Left: Original signal, N = 73113 data points. Right: Noisy signal.
Left: Result of low-pass filter applied to noisy signal. Right: Result of hard threshold applied to
noisy signal.
Left: DFT of original signal. Right: DFT of noisy signal.
Left: DFT of low-pass filtered noisy signal. The frequency components in the interval [N/4, 3N/4]
have been removed. Right: DFT of hard-thresholded noisy signal. Some isolated individual
high-frequency DFT components remain.