
    IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 8, AUGUST 2007 1567

Tree-Search Multiple-Symbol Differential Decoding for Unitary Space-Time Modulation

    Volker Pauli, Student Member, IEEE, and Lutz Lampe, Member, IEEE

Abstract: Differential space-time modulation (DSTM) using unitary-matrix signal constellations is an attractive solution for transmission over multiple-input multiple-output (MIMO) fading channels without requiring channel state information (CSI) at the receiver. To avoid a high error floor for DSTM in relatively fast MIMO fading channels, multiple-symbol differential detection (MSDD) has to be applied at the receiver. MSDD jointly processes blocks of several received matrix-symbols, and power efficiency improves as the blocksize increases. But since the search space of MSDD grows exponentially with the blocksize and also with the number of transmit antennas and the data rate, the complexity of MSDD quickly becomes prohibitive. In this paper, we investigate the application of tree-search algorithms to overcome the complexity limitation of MSDD. We devise a nested MSDD structure consisting of an outer and a number of inner tree-search decoders, which renders MSDD feasible for wide ranges of system parameters. Decoder designs tailored for diagonal and orthogonal DSTM codes are given, and a more power-efficient variant of MSDD, so-called subset MSDD, is proposed. Furthermore, we derive a tight symbol-error rate approximation for MSDD, which lends itself to efficient numerical evaluation. Numerical and simulation results for different DSTM constellations and fading channel scenarios show that the new tree-search MSDD achieves a significantly better performance-complexity tradeoff than benchmark decoders.

Index Terms: Differential space-time modulation (DSTM), multiple-input multiple-output (MIMO) fading channels, multiple-symbol differential detection (MSDD), sphere decoding, tree-search decoding.

    I. INTRODUCTION

DIFFERENTIAL space-time modulation (DSTM) with unitary matrix-signal constellations is an attractive solution for power- and/or bandwidth-efficient transmission over multiple-input multiple-output (MIMO) fading channels when channel state information (CSI) is not available at the receiver. Original work on DSTM in, e.g., [1]–[4] considered DSTM over block-wise constant fading (block-fading) channels and conventional differential detection (CDD) based on two consecutively received symbols. As shown in [2, Section IV-C], CDD incurs a loss of 3 dB in power efficiency compared to space-time (ST) transmission with perfect CSI in block-fading channels. In case of relatively fast fading, CDD further suffers from a high error floor due to channel variations from one symbol to the other, and DSTM performs even worse than single-antenna differential transmission [5].

Paper approved by R. W. Heath, Jr., the Editor for Modulation/Detection of the IEEE Communications Society. Manuscript received October 12, 2005; revised May 8, 2006 and December 4, 2006. This work was supported in part by the Deutsche Forschungsgemeinschaft (DFG) under Grant HU 634/6 and in part by the German Academic Exchange Service (DAAD). This paper was presented in part at the IEEE Global Telecommunications Conference (GLOBECOM), 2005.

V. Pauli is with the Lehrstuhl für Informationsübertragung, Universität Erlangen–Nürnberg, Erlangen 91058, Germany (e-mail: [email protected]).

L. Lampe is with the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada (e-mail: [email protected]).

Digital Object Identifier 10.1109/TCOMM.2007.902500

To overcome this limitation, multiple-symbol differential detection (MSDD) for DSTM was proposed in [5], [6]. Analogous to MSDD for single-antenna differential phase-shift keying (DPSK) (e.g., [7]), $N-1$ data symbols are jointly detected based on $N$ consecutively received symbols. The performance of MSDD improves for increasing observation window size $N$, and approaches that of coherent detection with perfect CSI. As its complexity grows exponentially with $N$, a number of suboptimal MSDD algorithms were designed for particular DSTM constellations and/or values of $N$, cf. e.g. [8]–[10]. Furthermore, suboptimal decision-feedback differential detection (DFDD) and trellis-based noncoherent sequence detection were devised for DSTM with diagonal constellations [1], [2] and with orthogonal designs [3], [11] in [5], [12] and [13], respectively.

Recently, the authors presented an optimal maximum-likelihood (ML) MSDD algorithm for single-antenna DPSK [14]. This algorithm is a depth-first tree search or sphere decoding (SD) algorithm, and was referred to as multiple-symbol differential sphere decoding (MSDSD) in [14].¹ The complexity of MSDSD was shown to be comparable to that of DFDD in most cases. In [15], a similar Fano decoding algorithm for MSDD of single-antenna DPSK was investigated.

In this paper, we propose the application of tree-search decoding for DSTM transmission over MIMO fading channels and MSDD. In this context, we make the following contributions.

• Three tree-search decoders are devised for MSDD. These include MSDSD and Fano decoding similar to the single-antenna schemes from [14], [15], as well as a new MSDSD scheme using a Fano-type metric for quicker termination of the search.

• The size of the DSTM constellation grows exponentially with the number of transmit antennas $N_T$ and the data rate $R$. To avoid a corresponding increase in decoding complexity, we propose to nest fast CDD algorithms into the outer tree-search MSDD. Whereas sphere decoding is typically proposed for CDD, e.g., [4], [12], we show that stack decoding is better suited when CDD is integrated into MSDD.

• Considering a continuous MIMO fading channel model, we provide a tight symbol-error rate (SER) approximation for MSDD, which lends itself to efficient numerical evaluation. Thereby, we consider the individual SERs for the $N-1$ consecutive data symbols jointly processed in one MSDD block. Numerical results show that the SERs for the individual symbols in one block vary significantly. This motivates the design of a more power-efficient variant of MSDD, which we refer to as subset MSDD.

• We present a thorough comparison of the proposed tree-search decoding schemes in terms of SER performance and computational complexity considering different DSTM constellations. Numerical and simulation results confirm that considerable improvements in power efficiency over DFDD and CDD are achieved while decoding complexity remains moderate and, in fact, comparable to that of DFDD.

¹In the context of this paper, the terms "decoding" and "detection" refer to the same procedure, and are used interchangeably. When referring to the signal set of DSTM, the terms "constellation" and "(space-time) code" are used interchangeably.

0090-6778/$25.00 © 2007 IEEE

The paper is organized as follows. Section II introduces the DSTM system model. The tree-search MSDD algorithms are developed and optimized for DSTM in Section III. Expressions for the SER of MSDD are derived and subset MSDD is devised in Section IV. Numerical and simulated SER-performance and complexity results are presented and discussed in Section V, and conclusions are given in Section VI.

In this paper, bold upper case $\mathbf{X}$ and lower case $\mathbf{x}$ denote matrices and vectors, respectively. $(\cdot)^H$, $(\cdot)^T$, $\mathrm{tr}\{\cdot\}$, $\|\cdot\|$, $\det\{\cdot\}$, $\otimes$, and $E\{\cdot\}$ denote Hermitian transposition, transposition, trace, Frobenius norm, determinant of a matrix, Kronecker product, and expectation, respectively. $\mathbf{I}_L$ is the $L \times L$ identity matrix and $\mathrm{diag}\{\mathbf{X}_1, \ldots, \mathbf{X}_L\}$ is an $LM \times LN$ block-diagonal matrix with the $M \times N$ matrices $\mathbf{X}_l$ on its main diagonal. A symmetric $L \times L$ Toeplitz matrix is defined by $\mathrm{toeplitz}\{x_1, \ldots, x_L\}$.

II. SYSTEM MODEL

We consider a transmission scheme using $N_T$ transmit and $N_R$ receive antennas. At the transmitter, $N_T R$ bits are mapped to $(N_T \times N_T)$-dimensional unitary matrices $\mathbf{V}[k]$, which are taken from a set $\mathcal{V} = \{\mathbf{V}^{(l)} \mid l \in \{1, \ldots, L\},\ L = 2^{N_T R}\}$. $R$ is the data rate in bit per channel use. In order to facilitate noncoherent detection, the data symbols $\mathbf{V}[k]$ are differentially encoded into transmit symbols

$$\mathbf{S}[k] = \mathbf{V}[k]\,\mathbf{S}[k-1], \qquad \mathbf{S}[0] = \mathbf{I}_{N_T}. \tag{1}$$

At time $\kappa = kN_T + i$, the transmitter radiates from antenna $j$ the element $s_{i,j}[k]$ in the $i$th row and $j$th column of $\mathbf{S}[k]$, $1 \le i, j \le N_T$. The transmit power is independent of $N_T$ at all times: $\sum_{j=1}^{N_T} |s_{i,j}[k]|^2 = E_b R/T$, $1 \le i \le N_T$, where $E_b$ is the energy per information bit and $T$ is the modulation interval.
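The differential encoding rule (1) can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: matrices are plain nested lists, and the helper names `matmul`, `identity`, `diag_phase`, and `differential_encode` are hypothetical and do not appear in the paper.

```python
import cmath

def matmul(a, b):
    """Multiply two square complex matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """n x n identity matrix with complex entries."""
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def diag_phase(phases):
    """Hypothetical helper: diagonal unitary matrix with the given phases."""
    n = len(phases)
    return [[cmath.exp(1j * phases[i]) if i == j else 0j for j in range(n)]
            for i in range(n)]

def differential_encode(data_symbols):
    """Apply S[k] = V[k] S[k-1] with S[0] = I, cf. (1)."""
    n_t = len(data_symbols[0])
    transmit = [identity(n_t)]
    for v in data_symbols:
        transmit.append(matmul(v, transmit[-1]))
    return transmit
```

Because every $\mathbf{V}[k]$ is unitary, each $\mathbf{S}[k]$ produced this way is unitary as well, which is the property the decision rule in Section III relies on.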

We assume a frequency-nonselective MIMO Rayleigh fading channel. The equivalent complex baseband (ECB) received signal $r_{i,j}[k]$ at time $\kappa = kN_T + i$ and antenna $j$ is given by

$$r_{i,j}[k] = \sum_{\iota=1}^{N_T} s_{i,\iota}[k]\, h_{\iota,j}[\kappa] + n_j[\kappa], \qquad 1 \le i \le N_T,\ 1 \le j \le N_R. \tag{2}$$

$h_{\iota,j}[\kappa]$ and $n_j[\kappa]$ denote, respectively, the complex fading gain between transmit antenna $\iota$ and receive antenna $j$ and the zero-mean complex additive spatially and temporally white Gaussian noise (AWGN) with variance $\sigma_n^2 = N_0/T$ effective at the $j$th receive antenna at time $\kappa$, while $N_0$ is the two-sided noise-power spectral density in the ECB. The fading channels are assumed to be spatially uncorrelated with identical temporal correlation according to Clarke's fading model [16]. In particular, $h_{\iota,j}[\kappa]$ is zero-mean complex Gaussian distributed with autocorrelation $\psi_{hh}[\lambda] = E\{h_{\iota,j}[\kappa + \lambda]\, h^*_{\iota,j}[\kappa]\} = J_0(2\pi B_h T \lambda)$, where $J_0(\cdot)$ and $B_h T$ denote the zeroth-order Bessel function of the first kind and the maximum normalized fading bandwidth, respectively.
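As a rough numerical illustration of the Clarke autocorrelation, the following Python sketch evaluates $\psi_{hh}[\lambda] = J_0(2\pi B_h T \lambda)$ and assembles the Toeplitz fading-plus-noise correlation matrix that Section III-A uses for the quasi-static model. The function names are hypothetical, and the power-series evaluation of $J_0$ is an assumption made to keep the sketch dependency-free; a library Bessel routine would normally be preferable.

```python
import math

def bessel_j0(x, terms=40):
    """J_0(x) via its power series; adequate for moderate |x| only."""
    total, term = 0.0, 1.0
    q = (x / 2.0) ** 2
    for k in range(terms):
        total += term
        term *= -q / ((k + 1) ** 2)   # ratio of consecutive series terms
    return total

def clarke_autocorrelation(lag, bh_t):
    """psi_hh[lag] = J_0(2 pi B_h T lag)."""
    return bessel_j0(2.0 * math.pi * bh_t * lag)

def qsfc_correlation_matrix(n, n_t, bh_t, noise_var):
    """toeplitz{psi[0], psi[N_T], ..., psi[(N-1) N_T]} + sigma_n^2 I_N."""
    row = [clarke_autocorrelation(m * n_t, bh_t) for m in range(n)]
    return [[row[abs(i - j)] + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]
```

For example, `qsfc_correlation_matrix(10, 3, 0.03, 0.1)` would correspond to a window of $N = 10$ matrix symbols with $N_T = 3$ and $B_h T = 0.03$; the noise variance 0.1 is an arbitrary illustrative value.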

While the fading channel model (2) will be used for analytical purposes in Section IV and performance evaluation in Section V, the derivation of low-complexity DSTM detectors requires the assumption of a quasi-static fading channel (QSFC), i.e., the channel is assumed constant during $N_T$ consecutive modulation intervals (e.g., [6], [10]). Defining the QSFC matrix $\mathbf{H}[k]$ with $h_{i,j}[kN_T]$ in row $i$ and column $j$, $1 \le i \le N_T$, $1 \le j \le N_R$, the QSFC is described by

$$\mathbf{R}[k] = \mathbf{S}[k]\,\mathbf{H}[k] + \mathbf{N}[k]. \tag{3}$$

The $N N_T \times N_R$ matrix $\bar{\mathbf{R}}[k]$ of $N$ received matrix symbols can be expressed as

$$\bar{\mathbf{R}}[k] = \bar{\mathbf{S}}_D[k]\,\bar{\mathbf{H}}[k] + \bar{\mathbf{N}}[k] \tag{4}$$

with the $N N_T \times N N_T$ block-diagonal matrix $\bar{\mathbf{S}}_D[k] = \mathrm{diag}\{\mathbf{S}[k-N+1], \ldots, \mathbf{S}[k]\}$ and the $N N_T \times N_R$ matrices $\bar{\mathbf{H}}[k] = [\mathbf{H}^T[k-N+1], \ldots, \mathbf{H}^T[k]]^T$ and $\bar{\mathbf{N}}[k] = [\mathbf{N}^T[k-N+1], \ldots, \mathbf{N}^T[k]]^T$.

We would like to point out that the QSFC model considered in (3) and (4) is different from the often used block fading model [1]–[4], [17], which stipulates that the channel remains constant during the transmission of $N$ consecutive DSTM symbols.

III. TREE-SEARCH MULTIPLE-SYMBOL DIFFERENTIAL DECODING

In this section, we propose and design tree-search algorithms for efficient MSDD. To this end, we provide a suitable representation of the ML MSDD decision rule in Section III-A, based on which the tree-search decoding algorithms are developed in Section III-B. We then present inner decoders, which are nested into the outer tree-search decoder and optimized for different DSTM constellations in Section III-C.

    A. ML MSDD

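ML MSDD processes blocks $\bar{\mathbf{R}}[k]$ of $N$ consecutively received matrix symbols to find estimates for the $N-1$ data symbols $\bar{\mathbf{V}}[k] = [\mathbf{V}^T[k-N+2], \ldots, \mathbf{V}^T[k]]^T$ corresponding to the $N$ transmit symbols $\bar{\mathbf{S}}[k] = [\mathbf{S}^T[k-N+1], \ldots, \mathbf{S}^T[k]]^T$. In analogy to the single-antenna case (e.g., [7]), consecutively processed blocks $\bar{\mathbf{R}}[k]$ overlap by at least one matrix symbol, i.e., the observation window of length $N$ moves forward by at most $N-1$ symbols at a time. For the sake of readability, in the following we omit the reference $[k]$ to time and address submatrices of the earlier block-matrices via subscripts, with $\mathbf{X}_i$ being the $i$th submatrix of $\bar{\mathbf{X}}$ and $1 \le i \le N$ (or $N-1$).

Using the QSFC model (4) and the fact that $\bar{\mathbf{S}}_D$ is a unitary matrix, the ML MSDD decision rule can be derived as [13, Section V]

$$\hat{\bar{\mathbf{V}}} = \operatorname*{argmin}_{\bar{\mathbf{V}} \in \mathcal{V}^{N-1}} \left\{ \mathrm{tr}\left\{ \bar{\mathbf{R}}^H \bar{\mathbf{S}}_D \left( \mathbf{C}^{-1} \otimes \mathbf{I}_{N_T} \right) \bar{\mathbf{S}}_D^H \bar{\mathbf{R}} \right\} \right\}, \tag{5}$$

where $\mathbf{C} = \boldsymbol{\Psi}_{hh} + \sigma_n^2 \mathbf{I}_N$ and $\boldsymbol{\Psi}_{hh} = \mathrm{toeplitz}\{\psi_{hh}[0], \psi_{hh}[N_T], \ldots, \psi_{hh}[(N-1)N_T]\}$. Using the Cholesky factorization $\mathbf{C}^{-1} = \mathbf{U}^H \mathbf{U}$ with upper triangular $\mathbf{U}$ and $\mathrm{tr}\{\mathbf{X}\mathbf{X}^H\} = \|\mathbf{X}\|^2$, we obtain

$$\hat{\bar{\mathbf{V}}} = \operatorname*{argmin}_{\bar{\mathbf{V}} \in \mathcal{V}^{N-1}} \left\{ \sum_{n=1}^{N-1} \left\| \mathbf{V}_n \mathbf{R}_{n,n} + \mathbf{X}_n \right\|^2 \right\} \tag{6}$$

with

$$\mathbf{X}_n = \mathbf{S}_{n+1} \sum_{j=n+1}^{N} \mathbf{S}_j^H \mathbf{R}_{n,j}, \tag{7}$$

where $\mathbf{R}_{n,j} = u_{n,j} \mathbf{R}_j$, $1 \le n \le N-1$, $n \le j \le N$, and $u_{n,j}$ is the element of $\mathbf{U}$ in row $n$ and column $j$. It is worth pointing out that $u_{n,j} = p^{(N-n)}_{j-n} / \sigma_e^{(N-n)}$, where $p_j^{(n)}$ and $(\sigma_e^{(n)})^2$ denote the $j$th coefficient of the $n$th order linear backward minimum mean-squared error (MMSE) predictor for the discrete-time random process $\mathbf{H}[k] + \mathbf{N}[k]$ and the corresponding error variance, respectively, and $p_0^{(n)} = 1$, $\forall n$, e.g., [18].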
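The step from (5) to (6) can be spelled out as follows. This is a sketch of the intermediate algebra, using only the definitions above, the Kronecker identity $\mathbf{U}^H\mathbf{U} \otimes \mathbf{I}_{N_T} = (\mathbf{U} \otimes \mathbf{I}_{N_T})^H (\mathbf{U} \otimes \mathbf{I}_{N_T})$, and $\mathbf{S}_{n+1}\mathbf{S}_n^H = \mathbf{V}_n$, which follows from (1) for unitary matrices when $\mathbf{V}_n$ denotes the data symbol linking $\mathbf{S}_n$ and $\mathbf{S}_{n+1}$:

$$
\begin{aligned}
\mathrm{tr}\left\{ \bar{\mathbf{R}}^H \bar{\mathbf{S}}_D \left( \mathbf{U}^H\mathbf{U} \otimes \mathbf{I}_{N_T} \right) \bar{\mathbf{S}}_D^H \bar{\mathbf{R}} \right\}
&= \left\| \left( \mathbf{U} \otimes \mathbf{I}_{N_T} \right) \bar{\mathbf{S}}_D^H \bar{\mathbf{R}} \right\|^2
 = \sum_{n=1}^{N} \Big\| \sum_{j=n}^{N} u_{n,j}\, \mathbf{S}_j^H \mathbf{R}_j \Big\|^2, \\
\Big\| \sum_{j=n}^{N} u_{n,j}\, \mathbf{S}_j^H \mathbf{R}_j \Big\|^2
&= \Big\| \mathbf{S}_{n+1}\mathbf{S}_n^H \mathbf{R}_{n,n} + \mathbf{S}_{n+1} \sum_{j=n+1}^{N} \mathbf{S}_j^H \mathbf{R}_{n,j} \Big\|^2
 = \left\| \mathbf{V}_n \mathbf{R}_{n,n} + \mathbf{X}_n \right\|^2, \qquad 1 \le n \le N-1.
\end{aligned}
$$

The second line uses the fact that left-multiplication by the unitary matrix $\mathbf{S}_{n+1}$ leaves the Frobenius norm unchanged. The term for $n = N$ equals $u_{N,N}^2 \|\mathbf{R}_N\|^2$, does not depend on $\bar{\mathbf{V}}$, and can therefore be dropped from the minimization.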

We note that the brute-force search would evaluate (6) for all $2^{(N-1)N_T R}$ candidates $\bar{\mathbf{V}}$ ($\mathbf{S}_N = \mathbf{I}_{N_T}$ can be chosen without loss of optimality), i.e., decoding complexity would be exponential in $N$, $N_T$, and $R$.

B. Efficient Tree-Search Decoding Algorithms: Outer Decoder

It can be observed that the ML metric in (6) is a sum of $N-1$ nonnegative scalar terms

$$\delta_n^2 = \left\| \mathbf{V}_n \mathbf{R}_{n,n} + \mathbf{X}_n \right\|^2, \qquad 1 \le n \le N-1, \tag{8}$$

which through $\mathbf{X}_n$ and (1) depend on symbols $\mathbf{V}_j$, $n+1 \le j \le N-1$. Thus, ML MSDD constitutes a tree-search problem, and $\hat{\bar{\mathbf{V}}}$ from (6) corresponds to the best path in the tree. For an efficient (fast) tree search for MSDD, we concentrate on the application of depth-first search (DFS) and best-first search (BFS) algorithms, which achieve the best performance-complexity tradeoff (see [19] for coherent MIMO detection).

1) Sphere Decoding Algorithm (MSDSD): Sphere decoding algorithms are DFS-type algorithms. We refer to the application of SD to accomplish MSDD as MSDSD (see [14] for DPSK). To formulate the SD algorithm, let us define

$$d_n^2 = \sum_{i=n}^{N-1} \left\| \mathbf{V}_i \mathbf{R}_{i,i} + \mathbf{X}_i \right\|^2 = d_{n+1}^2 + \delta_n^2, \qquad 1 \le n \le N-1, \tag{9}$$

where $d_N^2 = 0$ and $d_1^2$ is the ML metric in (6). Starting at $n = N-1$, the SD algorithm selects candidates for symbols $\hat{\mathbf{V}}_n$ based on tentative decisions $\mathbf{V}_j = \hat{\mathbf{V}}_j$, $n+1 \le j \le N-1$, and continues to decrement $n$ as long as the current metric $d_n$ does not exceed a given maximum metric (radius) $\rho$, such that

$$d_n \le \rho. \tag{10}$$

If the decoder reaches $n = 1$, the metric of the currently best candidate $\hat{\bar{\mathbf{V}}}$ is used to further reduce the size of the search space by updating $\rho = d_1$. If $d_n$ exceeds $\rho$ for any value of $n$, $n$ is incremented and a new candidate for $\mathbf{V}_n$ is examined. If the decoder returns to $n = N$, i.e., if $d_{N-1} > \rho$, it means that there are no further candidates inside the current sphere and that the ML solution $\hat{\bar{\mathbf{V}}}$ has been found. For the ordering of candidates for any $\mathbf{V}_n$ we employ the Schnorr–Euchner (SE) enumeration strategy, i.e., we check candidates in order of increasing $\delta_n$, as this allows for an initialization with $\rho \to \infty$ and a (usually) fast convergence to the ML solution [14], [20], [21].
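The depth-first search with radius update described above can be sketched generically in Python. This is only a minimal illustration under several assumptions: the function `sphere_decode` and its arguments are hypothetical, the branch cost is supplied by a caller-provided `increment` callback standing in for $\delta_n^2$, all quantities are squared metrics, and candidates are ranked by sorting all of them, which mimics the SE ordering but does not reproduce an efficient SE enumerator.

```python
import math

def sphere_decode(num_levels, candidates, increment, radius_sq=math.inf):
    """Depth-first search for the path minimizing the sum of increments.

    increment(decided, cand) must return a nonnegative cost for choosing
    `cand` given the tuple `decided` of earlier decisions (level N-1 first).
    Returns (best_path, best_metric); best_path is None if no path lies
    strictly inside the initial squared radius.
    """
    best_path, best_metric = None, radius_sq

    def descend(decided, metric):
        nonlocal best_path, best_metric
        if len(decided) == num_levels:
            # A complete path inside the sphere: shrink the radius.
            best_path, best_metric = decided, metric
            return
        # Sort-based stand-in for Schnorr-Euchner enumeration.
        ranked = sorted((increment(decided, c), idx)
                        for idx, c in enumerate(candidates))
        for cost, idx in ranked:
            if metric + cost >= best_metric:
                break  # all remaining candidates at this level are worse
            descend(decided + (candidates[idx],), metric + cost)

    descend((), 0.0)
    return best_path, best_metric
```

Because the costs are nonnegative and candidates are visited in increasing order, the first candidate that violates the radius allows the whole level to be abandoned, which corresponds to incrementing $n$ in the description above.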

2) Sphere Decoding With Fano-Type Metric (MSDSD-FM): The sphere decoding structure described earlier is very efficient in finding the ML solution in moderate to high signal-to-noise ratios (SNRs). However, toward lower SNR the normalized predictor coefficients $p_j^{(i)}/\sigma_e^{(i)}$ become very small, and $d_n^2$ becomes less dependent on tentative decisions $\hat{\mathbf{V}}_j$, $n+1 \le j \le N-1$. In other words, $d_n^2$ resembles a sum of $N-n$ CDD metrics, and thus scales almost linearly with $N-n$. It is, therefore, advisable to adjust the metric used in SD to take into account the length $N-n$ of the considered tree path, i.e., to introduce a bias $b$. This idea was first proposed by Fano for sequential decoding of convolutional codes [22], and hence, we refer to it as MSDSD with Fano-type metric (MSDSD-FM). MSDSD-FM uses the same search strategy as MSDSD but with the Fano-type path metric

$$\tilde{d}_n^2 = d_n^2 + (n-1)\,b \tag{11}$$

with $d_n^2$ from (9). We propose to use the expected value of $\delta_n^2$, given $\hat{\mathbf{V}}_j = \mathbf{V}_j$, $n \le j \le N-1$, as bias

$$b = E\left\{ \delta_n^2 \,\middle|\, \hat{\mathbf{V}}_j = \mathbf{V}_j,\ n \le j \le N-1 \right\} = N_T N_R, \tag{12}$$

which leads to a simple implementation. The result in (12) is due to the fact that $-\mathbf{X}_n$ is the MMSE estimate for $\mathbf{V}_n \mathbf{R}_{n,n}$, and that the filter coefficients are normalized by the corresponding error variance [see (6)]. It should be noted that, different from MSDSD, MSDSD-FM is suboptimal in that the ML MSDD solution is not necessarily found. Furthermore, it is worth mentioning that an SD with Fano-type metric was presented for coherent detection of space-time codes in [23].
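The biased path metric in (11) with the bias from (12) amounts to a one-line correction, as the following Python sketch shows. The function names are hypothetical. A path that has descended to level $n$ has accumulated $N-n$ increments and has $n-1$ levels still to go, so adding $(n-1)b$ places paths of different depth on a comparable scale.

```python
def fano_bias(n_t, n_r):
    """Bias b = N_T * N_R from (12)."""
    return float(n_t * n_r)

def fano_path_metric(d_sq, n, n_t, n_r):
    """Fano-type path metric (11): d_n^2 + (n - 1) * b."""
    return d_sq + (n - 1) * fano_bias(n_t, n_r)
```

For a complete path ($n = 1$) the bias vanishes and the Fano-type metric coincides with the ML metric $d_1^2$.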

3) Initial Search Radius: When using the SE enumeration, no initial radius is required, i.e., $\rho_{\mathrm{init}} \to \infty$ can be assumed. Accordingly, our first strategy is to initialize the MSDSD algorithm with a very large value of $\rho_{\mathrm{init}}$. The disadvantage of this strategy is a potentially slow convergence or, equivalently, high complexity in cases where the decoder makes false tentative decisions for symbols $\mathbf{V}_n$ with $n$ close to $N$ before reaching $n = 1$ for the first time. Therefore, our second strategy is to use a relatively small $\rho_{\mathrm{init}}$ for initialization and, in case no solution $\hat{\bar{\mathbf{V}}}$ is found for this radius, to increase it gradually by steps of $\rho_{\mathrm{init}}$ [21]. Since for MSDSD(-FM) the expected metric of the correct solution equals $E\{d_1^2 \mid \hat{\bar{\mathbf{V}}} = \bar{\mathbf{V}}\} = (N-1)N_T N_R$ (see Section III-B.2), we choose the squared start radius $\rho_{\mathrm{init}}^2$ proportional to that value, expressed as

$$\rho_{\mathrm{init}}^2 = (N-1)\,N_T N_R\, c, \qquad c > 0. \tag{13}$$

The value of the constant $c$ influences the overall search time. Simulation results presented in Section V confirm that the use of a finite initial radius is beneficial especially for DSTM with large constellations.
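The second initialization strategy can be sketched as a small wrapper around a radius-limited search. In this Python illustration, `search` is a hypothetical callable that takes a squared radius and returns `None` when no candidate lies inside the sphere; the cap `max_rounds` and the final fall-back to an unbounded search are assumptions added to guarantee termination, not part of the description above.

```python
import math

def initial_radius_sq(n, n_t, n_r, c):
    """Squared start radius (13): (N - 1) * N_T * N_R * c."""
    return (n - 1) * n_t * n_r * c

def search_with_growing_radius(search, n, n_t, n_r, c, max_rounds=50):
    """Call search(radius_sq) with radius m * rho_init, m = 1, 2, ...

    The radius (not its square) grows in steps of rho_init.
    """
    step = math.sqrt(initial_radius_sq(n, n_t, n_r, c))
    for rounds in range(1, max_rounds + 1):
        result = search((rounds * step) ** 2)
        if result is not None:
            return result
    return search(math.inf)  # assumed fall-back: unbounded search
```

A smaller $c$ shrinks the first sphere and speeds up successful searches, at the price of more frequent restarts; this is the tradeoff behind the remark that $c$ influences the overall search time.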

4) Fano Decoding Algorithm (MSDD-Fano): The Fano decoding algorithm is a BFS-type algorithm. It has recently been applied to MSDD of single-antenna DPSK [15] and to coherent MIMO detection [19]. In this paper, we propose its application to MSDD for DSTM using the metric of (11), and refer to the resulting algorithm as MSDD-Fano. We do not repeat the description of the basic Fano algorithm, whose details can be found in [22].

C. Application to Different Signal Constellations: Inner Decoders

MSDSD, MSDSD-FM, and MSDD-Fano break the exponential dependence of decoder complexity on $N$. In order to also break the exponential dependence of decoder complexity on $N_T$ and $R$, it is necessary to give some thought to the search for individual candidate symbols $\hat{\mathbf{V}}_n$. In this regard, it is important to note that minimizing $\delta_n$ is equivalent to the problem of finding the ML solution for CDD. This observation allows us to implement MSDD for DSTM using a nested structure of an outer and $N-1$ identical inner decoders. The outer decoder, which solves (6), initializes an inner decoder at stage $n$, $1 \le n \le N-1$, with matrices $\mathbf{R}_{n,n}$ and $\mathbf{X}_n$ to find a new candidate $\hat{\mathbf{V}}_n$ that minimizes $\delta_n$. It further provides the inner decoder with: 1) a list of candidates that have been examined previously given the same set of tentative decisions $\hat{\mathbf{V}}_j$, $n+1 \le j \le N-1$, and thus are to be excluded from the search; and 2) a start radius $r$ to force $\delta_n < r$ such that (10) is still fulfilled. In our investigations, we considered diagonal (cyclic) and dicyclic group codes [24] and orthogonal [11] and Cayley [4] nongroup codes. Due to space limitations, however, we present details only for diagonal and orthogonal constellations.

1) Diagonal Constellations: Diagonal constellations are defined by the set [1], [2]

$$\mathcal{V}_D = \left\{ \mathrm{diag}\left\{ e^{j\frac{2\pi}{L} c_1}, \ldots, e^{j\frac{2\pi}{L} c_{N_T}} \right\}^{l} \,\middle|\, l \in \{0, \ldots, L-1\} \right\}. \tag{14}$$

A straightforward approach, which we will refer to as full search (FS) inner decoding, is to compute $\delta_n$ for all elements of $\mathcal{V}_D$ and to process them in order of increasing $\delta_n$. In this case, MSDD benefits from the efficiency of the outer tree-search algorithm, but its complexity is still exponential in $N_T$ and $R$.

In order to overcome this exponential dependence of complexity on $N_T$ and $R$, we turn to a more sophisticated strategy using a lattice-based CDD algorithm proposed in [25] (see also [12]). We will refer to this approach as inner lattice decoding (LD). Using the cosine approximation for small arguments, the problem of finding the minimizer for $\delta_n$ can be turned into a (degenerated) $N_T$-dimensional lattice-decoding problem of the form

$$\hat{\mathbf{x}} = \operatorname*{argmin}_{\mathbf{x} \in \mathbb{Z}^{N_T}} \left\{ \left\| \mathbf{B}\mathbf{x} - \mathbf{t} \right\|^2 \right\}, \tag{15}$$

where the $N_T \times N_T$ matrix $\mathbf{B}$ has nonzero entries $b_{i,j}$ only in the first column $j = 1$ and on the main diagonal $i = j$, and $\hat{\mathbf{V}}_n = \mathrm{diag}\{e^{j\frac{2\pi}{L} c_1}, \ldots, e^{j\frac{2\pi}{L} c_{N_T}}\}^{\mathrm{mod}(\hat{x}_1, L)}$ (see [12], [25] for details). Due to the cosine approximation, $\hat{\mathbf{V}}_n$ is not necessarily the optimal solution, but as will be seen in Section V, only small performance degradations result. Since $\hat{\mathbf{x}}$ in (15), and as such $\hat{\mathbf{V}}_n$, can be obtained from a tree search, we propose two tree-search algorithms which are particularly suited for efficient inner decoding.
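To make the diagonal constellation (14) and the FS inner decoding described above concrete, the following Python sketch stores each diagonal matrix as the list of its diagonal entries and ranks all $L$ candidates by the branch metric $\|\mathbf{V}\mathbf{R}_{n,n} + \mathbf{X}_n\|^2$. The function names and the `excluded` argument (standing in for the list of previously examined candidates supplied by the outer decoder) are hypothetical, and the coefficients used in the usage note are an illustrative assumption rather than values taken from this paper.

```python
import cmath
import math

def diagonal_constellation(size, coeffs):
    """Diagonals of diag{exp(j 2 pi c_i / L)}^l for l = 0, ..., L-1, cf. (14)."""
    return [[cmath.exp(2j * math.pi * c * l / size) for c in coeffs]
            for l in range(size)]

def branch_metric(diag, r, x):
    """||V R + X||^2 for a diagonal V given by its diagonal entries.

    r and x are N_T x N_R nested lists of complex numbers.
    """
    return sum(abs(d * r_ij + x_ij) ** 2
               for d, r_row, x_row in zip(diag, r, x)
               for r_ij, x_ij in zip(r_row, x_row))

def full_search(constellation, r, x, excluded=()):
    """FS inner decoding: (metric, l) pairs sorted by increasing metric."""
    return sorted((branch_metric(v, r, x), l)
                  for l, v in enumerate(constellation) if l not in excluded)
```

For instance, `diagonal_constellation(8, [1, 1, 3])` would generate an $L = 8$ constellation for $N_T = 3$ and $R = 1$. The cost of `full_search` grows linearly in $L = 2^{N_T R}$, which is the exponential dependence on $N_T$ and $R$ that inner lattice decoding is meant to avoid.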

Sphere Decoding Algorithm: Due to the sparse lower-triangular structure of $\mathbf{B}$ in (15), a particularly efficient sphere decoder can be applied. The decoder may use the SE strategy for finding candidates for $x_1$, and increase $i$, $2 \le i \le N_T$, as long as

$$\delta_i^2 = |b_{1,1}x_1 - t_1|^2 + \sum_{j=2}^{i} |b_{j,1}x_1 + b_{j,j}x_j - t_j|^2 \le r.$$

Thus, the decoder actually searches only one dimension of the alleged $N_T$-dimensional search space. We found that this implementation of the inner sphere decoder is more efficient than performing a QR-decomposition and subsequent SD as was advocated in [12] for CDD. The reason for this is that in MSDD previously examined candidates are excluded at the very beginning of the search as they correspond to $x_1$.

Stack Decoding Algorithm: One drawback of the SD algorithm is that the search restarts from $i = 1$ and $x_1 = \lfloor t_1/b_{1,1} \rceil$ every time the inner decoder is called to provide the next best candidate $\hat{\mathbf{V}}_n$, given the same tentative decisions $\hat{\mathbf{V}}_j$, $n+1 \le j \le N-1$. The stack algorithm (e.g., [22]) seems to be an interesting alternative for the inner decoder, since it maintains a sorted list of examined paths, and decoding can be continued if the inner decoder is called again.

As the regular stack algorithm is not directly applicable for solving (15), because $x_i$, $1 \le i \le N_T$, are not from a finite set, we propose a modified stack decoder. Instead of replacing the path $\mathbf{x}^{(i)} = [x_1, \ldots, x_i]$ currently at the top of the stack with all its extensions $\mathbf{x}^{(i+1)} = [\mathbf{x}^{(i)}, x_{i+1}]$, we only consider its best extension $\hat{\mathbf{x}}^{(i+1)}$, and due to the sparse structure of the matrix in (15), the next candidate for $x_i$ is provided only when $i = 1$ according to the SE strategy. The search can be terminated without loss of optimality if the metric of the path at the top of the stack exceeds the start radius $r$. As for the regular stack algorithm, the first path of length $N_T$ to reach the top of the stack is the optimal path corresponding to the (next-)best candidate for $\mathbf{V}_n$. It remains to store the stack and to continue with this stack when the next candidate is required. While, theoretically, limitation of the stack size may lead to decoding errors, we found that limiting the maximum stack size to $2L$ does not incur a noticeable performance degradation.
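The modified stack decoder can be sketched as a resumable Python generator. This is a rough illustration under several assumptions: $\mathbf{B}$ and $\mathbf{t}$ are taken to be real-valued, `b_col[j]` and `b_diag[j]` hold the first-column and diagonal entries of row $j+1$ of $\mathbf{B}$ (zero-based), the names `zigzag` and `stack_candidates` are hypothetical, and neither the stack-size limit, the exclusion list, nor the reduction of $x_1$ modulo $L$ is modeled. The caller is expected to stop consuming candidates once the metric exceeds the radius.

```python
import heapq

def zigzag(center):
    """Integers in order of nondecreasing distance from center (SE order)."""
    first = round(center)
    yield first
    step = 1 if center >= first else -1
    offset = 1
    while True:
        yield first + step * offset
        yield first - step * offset
        offset += 1

def stack_candidates(b_col, b_diag, t):
    """Yield (metric, x) with nondecreasing metric, one path per x_1 value."""
    n_t = len(t)
    x1_stream = zigzag(t[0] / b_diag[0])

    def first_node():
        x1 = next(x1_stream)
        return ((b_diag[0] * x1 - t[0]) ** 2, (x1,))

    heap = [first_node()]
    while heap:
        metric, path = heapq.heappop(heap)
        if len(path) == 1:
            # Only at i = 1 is a sibling (next x_1 in SE order) added.
            heapq.heappush(heap, first_node())
        if len(path) == n_t:
            yield metric, path
            continue
        # Best extension only: x_j follows from x_1 by rounding.
        j = len(path)
        x_j = round((t[j] - b_col[j] * path[0]) / b_diag[j])
        cost = (b_col[j] * path[0] + b_diag[j] * x_j - t[j]) ** 2
        heapq.heappush(heap, (metric + cost, path + (x_j,)))
```

Because the generator keeps its heap between calls, requesting the next-best candidate continues the earlier search instead of restarting it, which is the advantage over the inner sphere decoder noted above.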

2) Orthogonal Designs (ODs): A second important class of nongroup DSTM codes are ODs. In particular, we consider DSTM based on Alamouti's code [11], where data symbols $\mathbf{V}$ are taken from the set

$$\mathcal{V}_O = \left\{ \frac{1}{\sqrt{2}} \begin{bmatrix} a & -b^* \\ b & a^* \end{bmatrix} \,\middle|\, a, b \in \sqrt{L}\text{-PSK} \right\}. \tag{16}$$

For ODs, it is relatively straightforward to show that $\delta_n^2$ can be expressed as

$$\delta_n^2 = \gamma_n + \mathrm{Re}\{a_n \alpha_n\} + \mathrm{Re}\{b_n \beta_n\}, \qquad 1 \le n \le N-1, \tag{17}$$

with $\alpha_n = \sqrt{2} \sum_{j=1}^{N_R} \left( r_{n,n,1,j}\, x^*_{n,1,j} + r^*_{n,n,2,j}\, x_{n,2,j} \right)$, $\beta_n = \sqrt{2} \sum_{j=1}^{N_R} \left( r_{n,n,1,j}\, x^*_{n,2,j} - r^*_{n,n,2,j}\, x_{n,1,j} \right)$, and $\gamma_n = \|\mathbf{R}_{n,n}\|^2 + \|\mathbf{X}_n\|^2$, where $r_{n,n,i,j}$ and $x_{n,i,j}$ denote the elements in the $i$th row and $j$th column of $\mathbf{R}_{n,n}$ and $\mathbf{X}_n$, respectively. Apparently, $a_n$ and $b_n$ that minimize $\delta_n^2$ in (17) are independent of each other. Hence, the inner decoding simplifies to two SE enumerations applied to $\sqrt{L}$-PSK symbols $a_n$ and $b_n$, respectively [14].

Consequently, outer and inner decoder for MSDD can be integrated into one decoder searching a $(2N-2)$-dimensional tree of $\sqrt{L}$-PSK symbols. If the MSDSD algorithm from Section III-B.1 is applied, this decoder is a direct extension of single-antenna MSDSD from [14] to DSTM.
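The decoupling of $a_n$ and $b_n$ in (17) can be checked numerically with a short Python sketch. It assumes the sign convention $\mathbf{V} = \frac{1}{\sqrt{2}}\begin{bmatrix} a & -b^* \\ b & a^* \end{bmatrix}$ for the Alamouti matrix and the coefficients $\alpha_n = \sqrt{2}\sum_j (r_{1,j}x^*_{1,j} + r^*_{2,j}x_{2,j})$ and $\beta_n = \sqrt{2}\sum_j (r_{1,j}x^*_{2,j} - r^*_{2,j}x_{1,j})$ that follow from expanding $\|\mathbf{V}\mathbf{R} + \mathbf{X}\|^2$ under that convention; a different but equivalent convention would change signs and conjugates accordingly. The function names are hypothetical, the argument `m` is the PSK size $\sqrt{L}$, and a plain minimum over the PSK alphabet replaces the SE enumeration.

```python
import cmath
import math

def psk(m):
    """Unit-modulus m-PSK alphabet."""
    return [cmath.exp(2j * math.pi * k / m) for k in range(m)]

def od_coefficients(r, x):
    """alpha, beta, gamma of (17) for 2 x N_R matrices r and x."""
    root2 = math.sqrt(2.0)
    cols = list(zip(r[0], r[1], x[0], x[1]))
    alpha = root2 * sum(r1 * x1.conjugate() + r2.conjugate() * x2
                        for r1, r2, x1, x2 in cols)
    beta = root2 * sum(r1 * x2.conjugate() - r2.conjugate() * x1
                       for r1, r2, x1, x2 in cols)
    gamma = sum(abs(v) ** 2 for row in r + x for v in row)
    return alpha, beta, gamma

def od_inner_decode(r, x, m):
    """Minimize (17) by choosing a and b independently."""
    alpha, beta, gamma = od_coefficients(r, x)
    symbols = psk(m)
    a = min(symbols, key=lambda s: (s * alpha).real)
    b = min(symbols, key=lambda s: (s * beta).real)
    return a, b, gamma + (a * alpha).real + (b * beta).real
```

The two independent minimizations over $\sqrt{L}$ symbols each replace a joint search over all $L$ matrices, which is why the inner decoder for ODs is inexpensive.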

It is worth mentioning that recently, a similar description of MSDD for DSTM with ODs has been presented in [26]. There, for the special case of $L = 16$ (ODs with 4PSK symbols) and block fading, the detection problem is formulated as a $4(N-1)$-dimensional search in binary variables. The scheme proposed in [26] can, however, not be extended to arbitrary fading channels as considered in this paper.

IV. SER ANALYSIS AND SUBSET MSDD

In this section, we first derive an approximation for the SER of ML MSDD (Section IV-A). In particular, we consider the individual SERs for the $N-1$ data symbols in each block $\bar{\mathbf{V}}$. Evaluation of these individual SERs suggests that decoding of only a subset of all $N-1$ symbols can be greatly beneficial. The corresponding subset MSDD algorithm is presented in Section IV-B.

    A. SER Analysis

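It is intuitive that due to different correlations between the received samples in $\bar{\mathbf{R}}$, symbol decisions on the $N-1$ data symbols in $\bar{\mathbf{V}}$ are not equally reliable. Especially in relatively fast fading environments it can be expected that symbols located in the center of the observation window can be detected more reliably than those at the edges. In order to substantiate this intuitive reasoning, we consider the individual symbol-error rates $\mathrm{SER}_n$ for $\mathbf{V}_n$, $1 \le n \le N-1$.

The pairwise error probability $\mathrm{PEP}(\bar{\mathbf{S}} \to \hat{\bar{\mathbf{S}}})$ for detecting $\hat{\bar{\mathbf{S}}}$ while $\bar{\mathbf{S}} \ne \hat{\bar{\mathbf{S}}}$ was transmitted can be expressed as [27, Section III]

$$\mathrm{PEP}(\bar{\mathbf{S}} \to \hat{\bar{\mathbf{S}}}) = -\sum_{\text{RH poles}} \mathrm{Res}\left\{ \frac{1}{s} \prod_{i=1}^{N N_T} \left[ 1 + s\,\lambda_i\!\left( \boldsymbol{\Psi}_{rr|S}\,\mathbf{F} \right) \right]^{-N_R} \right\}, \tag{18}$$

where the summation in (18) is taken over all residues corresponding to poles located in the right-hand (RH) side of the complex $s$-plane, $\lambda_i(\mathbf{X})$ denotes the $i$th eigenvalue of $\mathbf{X}$, and

$$\boldsymbol{\Psi}_{rr|S} = \check{\mathbf{S}}_D \left( \check{\mathbf{C}} \otimes \mathbf{I}_{N_T} \right) \check{\mathbf{S}}_D^H, \tag{19}$$

$$\mathbf{F} = \hat{\bar{\mathbf{S}}}_D \left( \mathbf{C}^{-1} \otimes \mathbf{I}_{N_T} \right) \hat{\bar{\mathbf{S}}}_D^H - \bar{\mathbf{S}}_D \left( \mathbf{C}^{-1} \otimes \mathbf{I}_{N_T} \right) \bar{\mathbf{S}}_D^H, \tag{20}$$

$$\check{\mathbf{C}} = \mathrm{toeplitz}\{\psi_{hh}[0], \psi_{hh}[1], \ldots, \psi_{hh}[N N_T - 1]\} + \sigma_n^2 \mathbf{I}_{N N_T}, \tag{21}$$

$$\check{\mathbf{S}}_D = \mathrm{diag}\{\mathbf{S}_{1,:}, \mathbf{S}_{2,:}, \ldots, \mathbf{S}_{N N_T,:}\}, \tag{22}$$

where $\mathbf{S}_{i,:}$ denotes the $i$th row of $\bar{\mathbf{S}}$. Note that $\check{\mathbf{C}}$ is the autocorrelation matrix of the regular fading-plus-noise process [see (2)]. Depending on the multiplicities of the eigenvalues $\lambda_j(\boldsymbol{\Psi}_{rr|S}\mathbf{F})$, analytical or numerical evaluation of (18) may be preferable [27]–[30].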

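With this, the $\mathrm{SER}_n$ for $\mathbf{V}_n$ can be upper bounded using the union bound, averaged over all $L^N$ transmit sequences:

$$\mathrm{SER}_n \le \frac{1}{L^N} \sum_{\bar{\mathbf{S}}} \ \sum_{\hat{\bar{\mathbf{S}}},\ \hat{\mathbf{V}}_n \ne \mathbf{V}_n} \mathrm{PEP}(\bar{\mathbf{S}} \to \hat{\bar{\mathbf{S}}}). \tag{23}$$

Evaluating (23) is computationally tractable only for small constellations and observation window sizes $N$. In order to obtain a simpler approximation for $\mathrm{SER}_n$, we examine the PEP of (18) more closely. It can be shown that it depends on $\bar{\mathbf{S}}$ and $\hat{\bar{\mathbf{S}}}$ only through the matrix

$$\check{\mathbf{P}} \left( \check{\mathbf{C}} \otimes \mathbf{I}_{N_T} \right) \check{\mathbf{P}}^H \left[ \bar{\mathbf{S}}_D^H \hat{\bar{\mathbf{S}}}_D \left( \mathbf{C}^{-1} \otimes \mathbf{I}_{N_T} \right) \hat{\bar{\mathbf{S}}}_D^H \bar{\mathbf{S}}_D - \mathbf{C}^{-1} \otimes \mathbf{I}_{N_T} \right], \tag{24}$$

where we introduced $\check{\mathbf{P}} = \bar{\mathbf{S}}_D^H \check{\mathbf{S}}_D$.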

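1) Group Constellations: For $L$ being a power of two, diagonal (or cyclic) and dicyclic codes [24] fully represent full-rank unitary group constellations. In both cases, $\check{\mathbf{P}}$ is a sparse matrix with a single one in each row. Whereas $\check{\mathbf{P}}$ is constant and independent of $\bar{\mathbf{S}}_D$ for diagonal constellations, there are $2^N$ different matrices $\check{\mathbf{P}}$ in case of dicyclic constellations. This means that the set of $L^N$ matrices $\bar{\mathbf{S}}$ can be partitioned into $K$ equivalence classes $\mathcal{S}_P^k$ with respect to $\check{\mathbf{P}}$, $1 \le k \le K$, where $K = 1$ and $K = 2^N$ for diagonal and dicyclic constellations, respectively. Furthermore, since $\bar{\mathbf{S}}_D^H \hat{\bar{\mathbf{S}}}_D = \mathrm{diag}\{\mathbf{S}_1^H \hat{\mathbf{S}}_1, \ldots, \mathbf{S}_N^H \hat{\mathbf{S}}_N\}$ with $\mathbf{S}_n^H \hat{\mathbf{S}}_n \in \mathcal{V}$, $1 \le n \le N$, for group constellations, the inner sum of (23) is independent of which class representative $\bar{\mathbf{S}} \in \mathcal{S}_P^k$ is chosen. We can, thus, limit the outer summation to only $K \ll L^N$ terms ($K = 1$ or $K = 2^N$).

To reduce the number of terms of the inner sum in (23), we propose to take only the dominating instead of all $L^{N-1} - 1$ relevant error events into account. Following a similar reasoning as in [28] for single-antenna DPSK and utilizing the results of [17] for the design of unitary space-time constellations, these dominant error events are found to correspond to matrix pairs $\{\bar{\mathbf{S}}, \hat{\bar{\mathbf{S}}}\}$ with the highest correlation $\zeta = \|\bar{\mathbf{S}}^H \hat{\bar{\mathbf{S}}}\|$. A brief examination of $\zeta$ reveals that it is maximized by error events with only one nonidentical transmit matrix $\hat{\mathbf{S}}_n \ne \mathbf{S}_n$, chosen such that

$$\zeta_n = \mathrm{Re}\left\{ \mathrm{tr}\left\{ \mathbf{S}_n^H \hat{\mathbf{S}}_n \right\} \right\}, \qquad 1 \le n \le N, \tag{25}$$

is maximized. Collecting all matrices $\hat{\bar{\mathbf{S}}} = [\mathbf{S}_1^T \ldots \mathbf{S}_{n-1}^T\ \hat{\mathbf{S}}_n^T\ \mathbf{S}_{n+1}^T \ldots \mathbf{S}_N^T]^T$ with $\hat{\mathbf{S}}_n \ne \mathbf{S}_n$ maximizing $\zeta_n$ in sets $\hat{\mathcal{S}}_n$, $1 \le n \le N$, we can approximate $\mathrm{SER}_n$ by

$$\mathrm{SER}_n \approx \frac{1}{K} \sum_{\substack{\bar{\mathbf{S}} \in \mathcal{S}_P^k \\ k = 1, \ldots, K}} \left[ \sum_{\hat{\bar{\mathbf{S}}} \in \hat{\mathcal{S}}_n} \mathrm{PEP}(\bar{\mathbf{S}} \to \hat{\bar{\mathbf{S}}}) + \sum_{\hat{\bar{\mathbf{S}}} \in \hat{\mathcal{S}}_{n+1}} \mathrm{PEP}(\bar{\mathbf{S}} \to \hat{\bar{\mathbf{S}}}) \right] \tag{26}$$

for $1 \le n \le N-1$.

2) Nongroup Constellations: For nongroup constellations, e.g., orthogonal [11] and Cayley [4] codes, a simplification of (23) based on equivalence classes with respect to $\check{\mathbf{P}}$ is not possible. Furthermore, it is not guaranteed that the quadruple $Q_n = \{\mathbf{S}_n, \hat{\mathbf{S}}_n, \mathbf{S}_{n+1}, \hat{\mathbf{S}}_{n+1}\}$ with $\hat{\mathbf{S}}_n \ne \mathbf{S}_n$, $\hat{\mathbf{S}}_{n+1} = \mathbf{S}_{n+1}$, and $\hat{\mathbf{S}}_n$ maximizing (25) is admissible, i.e., that there is a $\hat{\mathbf{V}}_n \in \mathcal{V}$ such that $\mathbf{S}_{n+1} = \hat{\mathbf{V}}_n \hat{\mathbf{S}}_n$. Hence, the correlation $\zeta_n$ in (25) is not necessarily an appropriate indicator for the dominant error events for every realization of $\bar{\mathbf{S}}$. However, we found from numerical evaluations that the SER of nongroup codes is approximated quite accurately using (26) with a few candidates $\bar{\mathbf{S}}$ for which $Q_n$ is admissible and $\hat{\mathbf{S}}_n$ chosen according to (25), $1 \le n \le N-1$. For the performance evaluation in Section V, we use this approximation with the $L$ admissible matrices $\bar{\mathbf{S}} = [\mathbf{I}_{N_T}, (\mathbf{V}^{(l)})^T, \mathbf{I}_{N_T}, (\mathbf{V}^{(l)})^T, \ldots]^T$, $1 \le l \le L$.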

    3) Evaluation: At this point, it is illustrative to consider

    SERn for the example of diagonal DSTM with parametersNT= 3, R= 1, BhT = 0.03, and NR= 1. Fig. 1 shows nu-merical results from (26) and simulation results for the SNR

    in terms of10log10(Eb/N0)required to achieve, respectively,SERn = 10

    5 (solid lines) and the average error rate SER =(1/N 1)N1n=1 SERn= 105 (dashed lines) as function ofthe positionn, 1nN 1, for different window sizes N.MSDD is implemented using MSDSD with FS inner decoding

    (MSDSD-FS). Also included is the SER for coherent detection

    assuming perfect CSI.

    First, we observe a good agreement between the SER approximation from (26) and the simulated SER. Second, it can be seen that the individual error rates $\mathrm{SER}_n$ are almost identical for symbols $V_n$, $2 \le n \le N-2$, but significantly deteriorate for symbols at the edges of the observation window. More specifically, there are differences of 5–8 dB in power efficiency when comparing edge and nonedge positions.

    Fig. 1. Required $10\log_{10}(E_b/N_0)$ to achieve $\mathrm{SER} = 10^{-5}$ for position $n$ in observation window of MSDSD-FS. Parameters: Diagonal constellation, $N_T = 3$, $N_R = 1$, $R = 1$, and $B_hT = 0.03$.
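    The averaging used for the dashed curves in Fig. 1 can be illustrated with a minimal Python sketch. The per-position error rates below are hypothetical placeholders chosen only to mimic the qualitative shape described above (worse at the two window edges); they are not values from the paper, and the function name is an assumption made for this example.

```python
def average_ser(ser_per_position):
    """Average SER over window positions: (1/(N-1)) * sum over n of SER_n."""
    return sum(ser_per_position) / len(ser_per_position)


# Hypothetical per-position error rates for N = 10 (positions n = 1, ..., 9):
# the two edge positions are assumed to be ten times worse than the rest.
ser_n = [1e-4] + [1e-5] * 7 + [1e-4]

avg_all = average_ser(ser_n)            # all N - 1 positions
avg_inner = average_ser(ser_n[1:-1])    # positions 2, ..., N - 2 only
edge_share = (ser_n[0] + ser_n[-1]) / sum(ser_n)

print(f"average over all positions:   {avg_all:.2e}")
print(f"average over inner positions: {avg_inner:.2e}")
print(f"fraction of errors at edges:  {edge_share:.0%}")
```

    Under these assumed numbers, the two edge positions account for most of the errors, so the average over the whole window is governed by the edges rather than by the well-protected inner positions.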

    B. Subset MSDD

    The observations from Fig. 1 strongly suggest a variant of MSDD, which we refer to as subset MSDD. For subset MSDD, only the decoding decisions for the $N'-1 \le N-1$ symbols $V_n$ located in the middle of the observation window, i.e., $(N-N')/2 < n$


    TABLE I: SUMMARY OF ALGORITHMS FOR DIFFERENTIAL DETECTION FOR UNITARY DSTM

    Fig. 2. Comparison of various decoders with $N = 10$ (CDD with $N = 2$). Required $10\log_{10}(E_b/N_0)$ to achieve $\mathrm{SER} = 10^{-5}$ versus $B_hT$. Diagonal constellation with $N_T = 3$, $N_R = 1$, $R = 1$. Lines: numerical results; markers: simulation results.

    decoder is applicable, which is denoted by an (additional) suffix "-FS" (all constellations) or "-LD" (only for diagonal constellations), respectively. As benchmark algorithms, we also consider: 1) CDD ($N = 2$); 2) DFDD [5], [12]; and 3) the branch-intersect-detect MSD decoder (MSDD-BID) of [10], which is a suboptimal MSDD algorithm for diagonal constellations. Unless specified otherwise, the channel model according to (2) is applied for simulation.

    A. SER Results

    We present SER results for three illustrative examples. In case of DSTM with diagonal constellations, we used parameters $[c_1, \ldots, c_{N_T}]$ from [2, Table I].

    1) Diagonal Constellation With $R = 1$, $N_T = 3$, $N_R = 1$, and MSDD With $N = 10$: Fig. 2 compares various decoders (with FS inner decoding) in terms of the SNR required to achieve $\mathrm{SER} = 10^{-5}$ as a function of the normalized fading bandwidth $B_hT$. Both numerical results according to Section IV-A (solid lines) and simulation results (markers) are plotted. As reference curves, the SNR for MSDSD-FS under block-fading ($B_hT = 0$) and for coherent detection with perfect CSI are also shown.

    First, we note the good match of the SER approximation from Section IV-A and the simulated SER. The slight fluctuations of the numerically obtained curves are due to the particular fading autocorrelation according to the Bessel function. Second, we observe that CDD-FS suffers from a relatively high error floor already in moderately fast fading with $B_hT \ge 0.006$. The performance of MSDD-BID approaches that of ML MSDD for $B_hT \le 0.01$, but rapidly deteriorates as $B_hT$ grows. MSDSD-FS is considerably more robust to fast fading than MSDD-BID, and also outperforms DFDD by about 2–6 dB depending on the fading bandwidth. Suboptimal MSDD-Fano-FS performs within 0.3–0.7 dB of optimal MSDSD-FS. Finally, it can be seen that subset MSDD (MSDSD-S-FS) significantly improves performance in the fast fading regime. Almost all the gain in power efficiency is already accomplished with $N' = 8$, which corresponds to a moderate complexity increase by a factor of 1.29 compared to MSDD with $N = 10$.

    In Fig. 3, we evaluate the performance with inner LD for an exemplary fading bandwidth of $B_hT = 0.03$. MSDSD, MSDSD-S, and DFDD with $N = 10$ and CDD are considered, and LD inner decoding (dashed lines) is compared with FS inner decoding (solid lines). It can be seen that the cosine-approximation involved in LD causes only small performance degradations in the order of 0.2–0.3 dB regardless of the particular detector. We also observed (not shown in Fig. 3) that even smaller performance losses occur at higher rates $R > 1$, since the cosine-approximation becomes more accurate.

    Fig. 3. Comparison of FS and LD-based inner decoding. SER versus $10\log_{10}(E_b/N_0)$ for MSDSD with $N = 10$, MSDSD-S with $N = 10$ and $N' = 8$, DFDD, and CDD. Diagonal constellation with $N_T = 3$, $N_R = 1$, $R = 1$, and $B_hT = 0.03$. Solid lines: FS; dashed lines: LD.

    Fig. 4. Comparison of various decoders with $N = 6$ (CDD with $N = 2$) and LD-based inner decoding. SER versus $10\log_{10}(E_b/N_0)$ for diagonal constellation with $N_T = 3$, $N_R = 1$, $R = 2$, and $B_hT = 0.03$.
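    The factor of 1.29 quoted for subset MSDD is consistent with a simple count of decisions per window, sketched below in Python. Two assumptions are made here and are not taken from the text: the retained positions are taken to be the symmetric range $(N-N')/2 < n < (N+N')/2$ (only the lower bound survives in this transcript), and the cost of processing one window is taken to be unchanged, so that the cost per decoded symbol scales as $(N-1)/(N'-1)$. The function names are illustrative.

```python
from fractions import Fraction


def subset_positions(N, N_sub):
    """Window positions n whose decisions are kept by subset MSDD.

    Assumption: the kept range is (N - N_sub)/2 < n < (N + N_sub)/2,
    which requires N - N_sub to be even in this sketch.
    """
    if not 2 <= N_sub <= N or (N - N_sub) % 2 != 0:
        raise ValueError("sketch assumes 2 <= N_sub <= N and N - N_sub even")
    lower = (N - N_sub) // 2
    upper = (N + N_sub) // 2
    return list(range(lower + 1, upper))


def complexity_factor(N, N_sub):
    """Per-symbol cost relative to plain MSDD, assuming equal cost per window."""
    return Fraction(N - 1, N_sub - 1)


kept = subset_positions(10, 8)
factor = complexity_factor(10, 8)
print(kept, factor, round(float(factor), 2))
```

    For $N = 10$ and $N' = 8$, this keeps the seven inner positions and gives a ratio of 9/7, which rounds to the 1.29 stated above.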

    2) Diagonal Constellation With $R = 2$, $N_T = 3$, $N_R = 1$, $B_hT = 0.03$, and MSDD With $N = 6$: The power efficiency of MSDSD-FM is compared to optimal MSDSD and MSDD-Fano in Fig. 4. As FS inner decoding is computationally complex for this scenario with $L = 64$, we consider LD in all cases. The respective curves for CDD-LD, DFDD-LD, and MSDD-BID are also included for comparison. MSDSD-FM-LD clearly outperforms CDD-LD and MSDD-BID, which suffer from a high error floor in this fast fading scenario. We further observe that the use of the suboptimal Fano-type metric results in relatively small performance losses of 0.5–1.0 dB. As expected, the performance of MSDSD-FM-LD is very similar to that of MSDD-Fano-LD. Finally, MSDSD-FM-LD and especially MSDSD-FM-S-LD achieve significant gains over DFDD-LD, which has been studied in detail in [12].
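    To give a sense of scale for why FS inner decoding and brute-force MSDD become costly in this scenario, the following Python sketch counts candidates. It assumes the usual relation $L = 2^{N_T R}$ between constellation size, number of transmit antennas, and rate (consistent with $L = 64$ for $N_T = 3$, $R = 2$ quoted above) and that an exhaustive MSDD search over a window of $N$ symbols tests $L^{N-1}$ candidate blocks. The function names are illustrative, and integer $N_T R$ is assumed.

```python
def constellation_size(n_tx, rate):
    """Number of DSTM symbols L, assuming L = 2 ** (N_T * R) with integer N_T * R."""
    return 2 ** (n_tx * rate)


def msdd_candidates(n_tx, rate, window):
    """Candidate blocks of an exhaustive MSDD search: L ** (N - 1)."""
    return constellation_size(n_tx, rate) ** (window - 1)


for n_tx, rate, window in [(3, 1, 10), (3, 2, 6)]:
    L = constellation_size(n_tx, rate)
    total = msdd_candidates(n_tx, rate, window)
    print(f"N_T={n_tx}, R={rate}, N={window}: L={L}, candidates={total}")
```

    Both example configurations from this section lead to search spaces far too large for exhaustive enumeration, which is the motivation for the tree-search decoders compared here.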

    3) Orthogonal Design With $R = 1$, $N_T = 2$, $N_R = 1$, and MSDD With $N = 10$: Fig. 5 shows the performance of DSTM with ODs in terms of the SNR required to achieve $\mathrm{SER} = 10^{-5}$. MSDSD, using the efficient decoder described in Section III-C, is compared with DFDD and CDD ($N = 2$), respectively. Numerical results (see Section IV-A, lines) and simulation results (markers) are plotted. Since for DSTM with ODs the $N_T$ antennas transmit constantly, the QSFC model is only an approximation. In order to illustrate the degradation due to the deviation of the QSFC model from the underlying channel (2), numerical and simulation results assuming the QSFC model are also included in Fig. 5 (dashed lines).

    First, we observe that SERs from the approximation in Section IV-A and simulated SERs closely match also for this nongroup constellation. Furthermore, it can be seen that the performance is limited by channel variations during the transmission of one ST symbol, which are not accounted for in the QSFC model, and thus, cause an error floor regardless of the detection algorithm. MSDSD and DFDD can cope with much faster fading than CDD. The performance gains due to MSDSD over DFDD are between about 1 dB and 4 dB depending on $B_hT$.

    Fig. 5. Comparison of various decoders with $N = 10$ (CDD with $N = 2$). Required $10\log_{10}(E_b/N_0)$ to achieve $\mathrm{SER} = 10^{-5}$ versus $B_hT$. Orthogonal design with $N_T = 2$, $N_R = 1$, and $R = 1$. Fading channel model according to (2) (solid lines) and QSFC (dashed lines). Lines: numerical results; markers: simulation results.

    B. Computational Complexity

    We now compare the computational complexity of the proposed algorithms. In particular, we consider the effects of a finite initial search radius (see Section III-B) and the use of sphere and stack decoding algorithms for inner LD (see Section III-C), and compare the complexity of the proposed MSDD implementations with those of CDD, DFDD, and MSDD-BID. As a measure of complexity, we choose the average number of real-valued flops per decoded symbol. As an example, we present results for diagonal DSTM with $R = 2$, $N_T = 3$, $N_R = 1$, $B_hT = 0.03$, and MSDD with $N = 6$ (see Fig. 4 for performance results).

    1) Initial Radius: To study the effect of the initial radius, we consider MSDSD-FS (the results are very similar for other MSDSD algorithms). Fig. 6 depicts the complexity for different SNRs as a function of the constant $c$ introduced in (13). While complexity increases if the initial radius is chosen too small, substantial complexity savings of up to 60% compared to an infinite initial radius can be obtained by choosing an optimal finite start radius. It is interesting to note that for the optimal parameter $c \approx 1.7$, the transmitted block V lies inside the sphere defined by the initial radius with a probability of about 95% independent of the SNR. It can also be observed that, for this relatively fast fading example, the complexity of MSDSD with infinite initial radius may increase for larger SNR. This is due to the fact that for high SNR the radius after the first tentative decision may be much larger than the metric for the ML decision. A finite start radius, on the other hand, leads to the more intuitive result that the decoding complexity decreases monotonically with SNR. All of the following results assume the optimal finite start radius with $c = 1.7$.
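    The effect of a finite start radius on the number of visited tree nodes can be illustrated with a generic depth-first sphere search. The Python sketch below is not the MSDSD decoder of this paper: the metric, the alphabet, and the target values are toy assumptions chosen only so that each tree level adds a nonnegative increment that depends on the previous symbol, which is the property the pruning rule relies on.

```python
import math


def sphere_search(targets, alphabet, init_sq_radius=math.inf):
    """Depth-first search for the symbol sequence with the smallest metric.

    Toy metric (assumption for illustration): level i adds
    (x_i + 0.5 * x_{i-1} - targets[i]) ** 2, with x_{-1} = 0.
    Returns (best_path, best_metric, visited_nodes); best_path is None
    if no sequence lies strictly inside the initial sphere.
    """
    best = {"path": None, "metric": init_sq_radius}
    visited = 0

    def increment(path, symbol):
        previous = path[-1] if path else 0
        return (symbol + 0.5 * previous - targets[len(path)]) ** 2

    def descend(path, metric):
        nonlocal visited
        if len(path) == len(targets):
            # Leaf reached: shrink the sphere to the metric of this candidate.
            best["path"], best["metric"] = tuple(path), metric
            return
        # Try the most promising symbols first (smallest metric increment).
        for inc, symbol in sorted((increment(path, s), s) for s in alphabet):
            if metric + inc >= best["metric"]:
                break  # remaining symbols at this level are outside the sphere
            visited += 1
            descend(path + [symbol], metric + inc)

    descend([], 0.0)
    return best["path"], best["metric"], visited


targets = [0.9, 2.2, 1.4, 3.1, 0.3]   # hypothetical toy observations
alphabet = [0, 1, 2, 3]

path_inf, metric_inf, nodes_inf = sphere_search(targets, alphabet)
# A finite start radius that is known to contain the best sequence.
path_fin, metric_fin, nodes_fin = sphere_search(targets, alphabet, metric_inf + 0.1)
print(path_inf, nodes_inf, nodes_fin)
```

    In this sketch, a finite start radius that still contains the best sequence never visits more nodes than the search with an infinite radius, while a radius chosen too small returns no sequence at all and would force a restart with a larger radius. This mirrors, in a much simplified setting, the tradeoff discussed above.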


    Fig. 6. Complexity of MSDSD-FS with $N = 6$ versus parameter $c$ [initial squared radius $= (N-1)N_TN_Rc$]. Diagonal constellation with $N_T = 3$, $N_R = 1$, $R = 2$, and $B_hT = 0.03$.

    Fig. 7. Complexity versus $10\log_{10}(E_b/N_0)$ for different MSDSD algorithms with $N = 6$. Diagonal constellation with $N_T = 3$, $N_R = 1$, $R = 2$, and $B_hT = 0.03$. Solid lines: without Fano-type metric; dashed lines: with Fano-type metric. LD with sphere and stack decoder, respectively.
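    As a small worked example of the start radius given in the caption of Fig. 6, the Python sketch below evaluates $(N-1)N_TN_Rc$ for the parameters of this section. The function name is an assumption made for this example; only the formula and the parameter values come from the text.

```python
def initial_squared_radius(window, n_tx, n_rx, c):
    """Initial squared search radius (N - 1) * N_T * N_R * c from the Fig. 6 caption."""
    return (window - 1) * n_tx * n_rx * c


# Parameters of the complexity study: N = 6, N_T = 3, N_R = 1, c = 1.7.
r2 = initial_squared_radius(window=6, n_tx=3, n_rx=1, c=1.7)
print(f"initial squared radius: {r2:.1f}")
```

    Since the radius scales linearly in $c$, the horizontal axis of Fig. 6 can be read directly as a scaled initial squared radius.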

    2) MSDSD versus MSDSD-FM and Inner Decoding With FS versus LD: Fig. 7 compares the different possibilities to reduce the complexity of MSDSD, i.e., MSDSD versus MSDSD-FM and MSDSD(-FM) with inner FS versus inner LD are considered. LD is implemented using sphere and stack decoder, respectively. It can be seen that the complexity of MSDSD decreases rapidly with increasing SNR, since the search quickly terminates for small enough noise. Inner LD leads to a significant complexity reduction, and the stack decoder is clearly advantageous over the sphere decoder. Furthermore, applying the Fano-type metric speeds up the search especially in the lower SNR range, as was anticipated in Section III-B. Combining Fano-type metric with inner LD using the stack algorithm, i.e., MSDSD-FM-LD, yields a tremendous complexity reduction by a factor of about 10 to 50 compared to MSDSD-FS in the range of relevant SNR. This gain comes at the expense of a less than 1 dB loss in power efficiency (see Fig. 4).

    Fig. 8. Complexity versus $10\log_{10}(E_b/N_0)$ for MSDSD-FM-LD, MSDD-Fano-LD, and benchmark algorithms with $N = 6$ (CDD-LD with $N = 2$). Diagonal constellation with $N_T = 3$, $N_R = 1$, $R = 2$, and $B_hT = 0.03$. MSDSD-FM-LD with inner stack algorithm.

    3) Comparison With Benchmark Decoders: Fig. 8 compares the average complexity of MSDSD-FM-LD considered earlier with those of MSDD-Fano-LD and the benchmark detectors CDD-LD, DFDD-LD, and MSDD-BID, respectively. While achieving a similar power efficiency (Fig. 4), MSDSD-FM-LD requires a lower complexity than MSDD-Fano-LD. The complexity of MSDSD-FM-LD is also well in the range of those of DFDD-LD and CDD-LD, which are less power efficient and suffer from a high error floor, respectively. MSDD-BID, which also shows a high error floor for this transmission scenario, is considerably more complex than MSDSD-FM-LD.

    We, thus, conclude that MSDSD-FM-LD and its subset MSDD version with the stack algorithm for inner LD offer the overall best performance-complexity tradeoff.

    VI. CONCLUSION

    In this paper, we have investigated tree-search MSDD. We have proposed a nested decoder structure consisting of an outer and $N-1$ inner tree-search decoders, where the inner decoders efficiently search the signal space of the potentially large DSTM constellation. Decoder designs tailored for diagonal and orthogonal DSTM codes have been given. Considering the underlying continuous MIMO fading channel, a tight SER approximation for MSDD has been obtained, which can be evaluated efficiently. The novel subset MSDD scheme further improves power efficiency of MSDD by discarding less reliable decisions at the edges of the observation window. Numerical and simulation results presented for different DSTM constellations and fading channel scenarios have shown: 1) to what extent MSDD outperforms benchmark decoders, increasing the range of fading bandwidths for which DSTM is applicable; 2) that sphere decoding with a Fano-type metric (MSDSD-FM) achieves the best performance-complexity tradeoff with a complexity comparable to that of DFDD in most cases; 3) that inner stack decoding is favorably applied to provide an ordered sequence of candidate signal points for outer tree-search decoding; and 4) that subset MSDD yields impressive additional performance gains at a very moderate increase in complexity.

    REFERENCES

    [1] B. L. Hughes, "Differential space-time modulation," IEEE Trans. Inf. Theory, vol. 46, no. 7, pp. 2567–2578, Nov. 2000.

    [2] B. M. Hochwald and W. Sweldens, "Differential unitary space-time modulation," IEEE Trans. Commun., vol. 48, no. 12, pp. 2041–2052, Dec. 2000.

    [3] V. Tarokh and H. Jafarkhani, "A differential detection scheme for transmit diversity," IEEE J. Sel. Areas Commun., vol. 18, no. 7, pp. 1168–1174, Jul. 2000.

    [4] B. Hassibi and B. M. Hochwald, "Cayley differential unitary space-time codes," IEEE Trans. Inf. Theory, vol. 48, no. 6, pp. 1485–1503, Jun. 2002.

    [5] R. Schober and L. Lampe, "Noncoherent receivers for differential space-time modulation," IEEE Trans. Commun., vol. 50, no. 5, pp. 768–777, May 2002.

    [6] B. Bhukania and P. Schniter, "Multiple-symbol detection of differential unitary space-time modulation in fast-fading channels with known correlation," in Proc. Conf. Inf. Sci. Syst., Princeton, NJ, Mar. 2002.

    [7] D. Divsalar and M. K. Simon, "Multiple-symbol differential detection of MPSK," IEEE Trans. Commun., vol. 38, no. 3, pp. 300–308, Mar. 1990.

    [8] C. Gao, A. M. Haimovich, and D. Lao, "Multiple-symbol differential detection for space-time block codes," in Proc. Conf. Inf. Sci. Syst., Princeton, NJ, Mar. 2002.

    [9] P. Tarasak and V. K. Bhargava, "Reduced complexity multiple symbol differential detection of space-time block code," in Proc. IEEE Wireless Commun. Netw. Conf., Orlando, FL, Mar. 2002, vol. 1, pp. 505–509.

    [10] T. Cui and C. Tellambura, "Multiple-symbol differential unitary space-time demodulation with reduced-complexity," in Proc. IEEE Int. Conf. Commun., vol. 2, Seoul, Korea, May 2005, pp. 762–766.

    [11] S. M. Alamouti, "A simple transmitter diversity scheme for wireless communications," IEEE J. Sel. Areas Commun., vol. 16, no. 7, pp. 1451–1458, Oct. 1998.

    [12] C. Ling, W. H. Mow, K. H. Li, and A. C. Kot, "Multiple-antenna differential lattice decoding," IEEE J. Sel. Areas Commun., vol. 23, no. 9, pp. 1821–1829, Sep. 2005.

    [13] E. Chiavaccini and G. M. Vitetta, "Further results on differential space-time modulations," IEEE Trans. Commun., vol. 51, no. 7, pp. 1093–1101, Jul. 2003.

    [14] L. Lampe, R. Schober, V. Pauli, and C. Windpassinger, "Multiple-symbol differential sphere decoding," in Proc. IEEE Int. Conf. Commun., Jun. 2004, vol. 2, pp. 787–791.

    [15] P. Pun and P. Ho, "The performance of Fano-multiple symbol differential detection," in Proc. IEEE Int. Conf. Commun., Seoul, Korea, May 2005, pp. 2516–2521.

    [16] R. H. Clarke, "A statistical theory of mobile-radio reception," Bell Syst. Tech. J., vol. 47, pp. 957–1000, Jul. 1968.

    [17] B. M. Hochwald, T. L. Marzetta, T. J. Richardson, W. Sweldens, and R. Urbanke, "Systematic design of unitary space-time constellations," IEEE Trans. Inf. Theory, vol. 46, no. 6, pp. 1962–1973, Sep. 2000.

    [18] G. M. Vitetta and D. P. Taylor, "Maximum likelihood decoding of uncoded and coded PSK signal sequences transmitted over Rayleigh flat-fading channels," IEEE Trans. Commun., vol. 43, no. 11, pp. 2750–2758, Nov. 1995.

    [19] A. D. Murugan, H. El Gamal, M. O. Damen, and G. Caire, "A unified framework for tree search decoding: Rediscovering the sequential decoder," IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 933–953, Mar. 2006.

    [20] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, "Closest point search in lattices," IEEE Trans. Inf. Theory, vol. 48, no. 8, pp. 2201–2214, Aug. 2002.

    [21] O. Damen, H. El Gamal, and G. Caire, "On maximum likelihood detection and the search for the closest lattice point," IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2389–2402, Oct. 2003.

    [22] R. Johannesson and K. Sh. Zigangirov, Fundamentals of Convolutional Coding. Piscataway, NJ: IEEE, Feb. 1999.

    [23] Z. Guo and P. Nilsson, "Reduced complexity Schnorr–Euchner decoding algorithms for MIMO systems," IEEE Trans. Commun., vol. 8, no. 5, pp. 286–288, May 2004.

    [24] B. L. Hughes, "Optimal space-time constellations from groups," IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 401–410, Feb. 2003.

    [25] K. L. Clarkson, W. Sweldens, and A. Zheng, "Fast multiple antenna differential decoding," IEEE Trans. Commun., vol. 49, no. 2, pp. 253–261, Feb. 2001.

    [26] W.-K. Ma, C.-Y. Chi, and P. C. Ching, "Extended differential unitary space-time modulation: A non-coherent scheme with error penalty less than 3 dB," in Proc. Int. Conf. Acoust., Speech Signal Process, Toulouse, France, May 2006, vol. 4, pp. 529–532.

    [27] M. Brehler and M. K. Varanasi, "Asymptotic error probability analysis of quadratic receivers in Rayleigh-fading channels with application to a unified analysis of coherent and noncoherent space-time receivers," IEEE Trans. Inf. Theory, vol. 47, no. 6, pp. 2383–2399, Sep. 2001.

    [28] P. Ho and D. Fung, "Error performance of multiple-symbol differential detection of PSK signals transmitted over correlated Rayleigh fading channels," IEEE Trans. Commun., vol. 40, no. 10, pp. 25–29, Oct. 1992.

    [29] E. Biglieri, G. Caire, G. Taricco, and J. Ventura-Traveset, "Computing error probabilities over fading channels: A unified approach," Eur. Trans. Telecommun., vol. 9, pp. 15–25, 1998.

    [30] S. Siwamogsatham, M. P. Fitz, and J. H. Grimm, "A new view of performance analysis of transmit diversity schemes in correlated Rayleigh fading," IEEE Trans. Inf. Theory, vol. 48, no. 4, pp. 950–956, Apr. 2002.

    Volker Pauli (S'04) received the Diploma degree in electrical engineering from the University of Erlangen, Erlangen, Germany, in 2003.

    He is currently a Research Assistant in the Telecommunications Laboratory, University of Erlangen. He was a Visiting Scholar at the École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, in 2002 and the University of British Columbia, Vancouver, Canada, in 2003 and 2005. His current research interests include noncoherent detection, iterative receiver structures, multiple-input multiple-output (MIMO) systems, and multicarrier modulation and equalization.

    Lutz Lampe (S'98–M'02) received the Diploma and the Ph.D. degree from the University of Erlangen, Germany, in 1998 and 2002, respectively, both in electrical engineering.

    He is currently an Associate Professor in the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada. His current research interests include the areas of communications and information theory applied to wireless and powerline transmission.

    Dr. Lampe has been a Vice-Chair of the IEEE Communications Society Technical Committee on Power Line Communications since 2004. He was General Chair of the 2005 International Symposium on Power Line Communications and its Applications (ISPLC 2005) and a Co-Chair of the General Symposium of the 2006 IEEE Global Telecommunications Conference (Globecom 2006). He is an Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS and has been an Associate Editor for the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY from 2004 to 2007. He was the co-recipient of the Eurasip Signal Processing Journal Best Paper Award 2005, the Best Paper Award at the IEEE International Conference on Ultra-Wideband (ICUWB) 2006, and the Best Student Paper Awards at the European Wireless Conference 2000 and at the International Zurich Seminar 2002. In 2003, he received the Dissertation Award of the German Society of Information Techniques (ITG).