cac mã


  • 8/13/2019 cac m


    PART 1

    INTRODUCTION TO THE TOPIC

    I.1. INTRODUCTION

    Information technology (IT) has been called the industry of all industries. Indeed, almost every field and profession now applies IT to its work to some degree.

    Information security and data compression are essential tasks in IT. Since ancient times, people have protected military and wartime information so that it stayed safe and did not fall into enemy hands. To that end they devised many ways to keep transmitted information both compact and secure. For information to reach the recipient as secretly and compactly as possible, it had to be compressed or written with special symbols agreed on in advance.

    Today, with the progress of science and technology, IT has reached a new level. Every field and profession has to apply IT thoroughly in order to develop as well as it can. The familiar phrase "e-government" alone shows how important it is to bring IT into everyday life. Following the general pattern of social development, IT keeps striving to become better, faster, smaller and more compact. IT and telecommunications are closely related. One of the factors that helps IT develop is the use of security and data compression technology in storage and communication.

    In data transmission, securing and compressing the transmitted data (the information source) are two important problems, and much of the theory of source coding shows how important encoding and compressing data are. Data compression algorithms have existed for a long time; Shannon-Fano, Huffman and Lempel-Ziv-Welch (LZW) coding are regarded as the classics of the field.

    In this thesis I outline the data compression algorithms in common use today and compare how effectively the Shannon-Fano and Huffman codes compress several different kinds of data. The effectiveness of these two common codes is simulated with a program written in C and C++.

    I.2. PURPOSE AND SCOPE OF THE TOPIC

    This project concentrates on the Shannon-Fano and Huffman algorithms and analyses the advantages and disadvantages of each relative to the other by simulating their compression ratios. The project consists of the following parts:

    1. Study the theory of data compression and survey published results on data compression: an overview of source coding methods, their working principles, and their speed and compression ratio.

    2. Classification and applications. The task is to present the content of the source coding algorithms.

    3. The Shannon-Fano and Huffman algorithms. This part analyses the advantages and disadvantages of each algorithm in depth. Shannon-Fano and Huffman are chosen as classic data compression algorithms so that the effectiveness of each can be compared on several different kinds of data.

    4. Overview of the program and the results of the algorithms. This part describes the structure of the program and the problems met while building it. The key tasks are to simulate the algorithms, draw their flowcharts, and write the code.

    I.3. RESEARCH METHOD

    - Present the theoretical basis.

    - Study the algorithms of the Shannon-Fano and Huffman codes.

    - State the coding principles and the coding problem.

    - Run and test the program.

    - Evaluate the advantages and disadvantages.

    PART 2

    CONTENT

    CHAPTER 1

    OVERVIEW OF DATA COMPRESSION

    II.1.1 OVERVIEW OF DATA COMPRESSION

    In computer science and information theory, data compression is the process of encoding information with fewer bits than the unencoded representation, using one method or a combination of methods. This helps avoid overloaded transmission channels and makes transmission more economical. Data compression saves resources such as storage capacity, bandwidth and time. On the other hand, compressed data must be decompressed before it can be used (executed, listened to, viewed and so on), and decompression also consumes resources. A typical example is video: compressed video may require expensive hardware so that decompression is fast enough for viewing. The design of a compression program therefore depends on many factors, such as the degree of compression, the distortion (for lossy compression), and the system resources available for compressing and decompressing the data.

    II.1.2 OVERVIEW OF COMPRESSION CODES

    II.1.2.1. How compression programs work

    Compression programs all work on the same general principle: they exploit repetition in the data, replacing repeated sequences with a shared pointer that is shorter than the sequence itself. This technique is very effective for text, spreadsheets and DBF files (compression of more than 70%), because such data is highly repetitive. Program files (.EXE or .COM) compress less well.

    II.1.2.2. Speed and compression ratio


    Even when all file compression programs use the same algorithm, they behave differently. Each vendor implements the algorithm in its own way to balance two concerns: time and compression ratio. PKZIP is usually faster than other compression programs, and its compression ratio is often better as well. The reliability of compression programs also matters. Compressed files are, in general, rarely corrupted. Note also that compressed file formats are not compatible with one another: if you send a compressed file to someone, that person needs the matching program to decompress it. To address this, all three programs ARC + PLUS, LHA and PKZIP can create self-extracting archives, that is, compressed files in executable form that unpack themselves when run. Programs that convert one compressed format to another have also begun to appear; for example, D'Compress for Windows converts PKZIP, ARC and LHA files to ARJ.

    Compression programs are inexpensive (PKZIP: 47 USD; LHA is free), so they are widely used. Their current weakness is an inconvenient user interface: the user usually has to type a command with many parameters at the DOS prompt to carry out a task. Improvements are under way: ARC + PLUS has a menu interface, and PKZIP has an add-on called PKZIP menu.

    Many file managers for DOS and Windows have started to use compression. Lotus Magellan has used PKZIP since 1990, and Xtree Gold added PKZIP to its file management tools in 1991.

    Compressing files and then unpacking them again before use is cumbersome, which is why disk compression programs such as Stacker and Super Store are fairly widely used. Disk compression programs work on the same principle as file compression, except that they compress and unpack automatically without the user's involvement. Time and compression ratio vary between programs of this type. To unpack 3.5 MB of data, one program takes 12 seconds and another takes 40 seconds. Compression ratios for text files also vary, from 2:1 to 3:1. In short, with a disk compression program the free space on the hard disk appears to roughly double.


    Unpacking and compressing while working with files slows the work down slightly, and with large data files the slowdown is quite noticeable. Disk compression programs run as memory-resident programs, so they both occupy RAM and may conflict with other resident programs. When something goes wrong, a file compression program damages only a few files, whereas a disk compression program can damage the whole disk. Although this happens very rarely, it is enough to make many people wary of using them.

    Installing a disk compression program requires repartitioning the hard disk, and the computer must be booted from disk before the compression program takes effect. Under Windows the uncompressed part has to be fairly large (typically 10 MB is reserved as an uncompressed area and only the rest of the disk is compressed).

    Another headache for the user is deciding what compression ratio to set. With a ratio of 10:1, for example, the compression program reserves many "pointers" to the data, each occupying 2 bytes. It can then easily run out of pointers and report that the disk is full when it is not.

    Finally, removing a disk compression program once it is installed is also somewhat troublesome. Many programs, Double Density for example, have an uninstall function. For the others, the user has to find the hidden files of the compression program and delete them, and sometimes the hard disk has to be reformatted.

    In summary, despite some limitations, data compression is the most economical way to extend hard disk capacity. Compressing data before transmission also saves a good deal of time and money.

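    The compression rate is one of the most important characteristics of any compression method. However, the way it is measured and the results reported in the literature also need careful consideration. In general, the compression rate is defined as follows:

    Compression rate = (1 / r) x 100%

    where r is the compression ratio, defined as r = size of the original data / size of the data after compression. The compression efficiency is then (1 - compression rate) x 100%, with the compression rate expressed as a fraction.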
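The definitions above can be sketched as a small helper. This is a minimal illustration, assuming sizes are given in any common unit; the function name `compression_stats` is hypothetical and not taken from the thesis program.

```python
def compression_stats(original_size, compressed_size):
    """Return (ratio r, compression rate in %, efficiency in %)."""
    r = original_size / compressed_size   # compression ratio r
    rate = 100.0 / r                      # compression rate = (1 / r) x 100%
    efficiency = 100.0 - rate             # efficiency = (1 - rate) x 100%
    return r, rate, efficiency


# A 1000-byte file compressed to 100 bytes has ratio 10:1.
ratio, rate, efficiency = compression_stats(1000, 100)
```

With a 10:1 ratio the compressed data is 10% of the original size, so the efficiency is 90%.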


    In what follows, compression results are given as a compression ratio; for example, 10 to 1 means that 10 parts of original data become 1 part after compression.

    It must also be kept in mind that the measurements for a compression method are valid only for that particular compression run, because the effectiveness of compression depends on the kind of data being compressed. A high compression rate does not by itself mean that a method is more effective than others, because there are other costs such as time, space and even computational complexity. For example, when compression is used for data transmission, the question is whether the compression performance matches the transmission channel.

    II.1.2.3 Types of data redundancy

    As stated above, compression aims to reduce the size of data by removing redundancy. Identifying the nature of each kind of redundancy is very useful for building different compression methods. In other words, compression methods differ because they exploit different kinds of redundancy. There are four main kinds, presented in the following sections.

    II.1.2.3.1. Character distribution

    In a sequence of characters, some characters occur more frequently than others, so the data can be encoded more compactly. Characters with a higher frequency are replaced by binary codewords with few bits; conversely, characters with a lower frequency are encoded by codewords with more bits. This is the essence of Huffman and Shannon-Fano coding.

    II.1.2.3.2. Character repetition

    In some situations, such as in images, a symbol (bit "0" or bit "1") is repeated a number of times in a row. The compression technique used here replaces the repeated run with a new sequence of two components: the number of repetitions and the symbol being encoded. This kind of coding is called run-length coding (RLC).


    II.1.2.3.3. Frequently used patterns

    Certain sequences of symbols may occur with relatively high frequency, so they can be encoded with fewer bits. This is the basis of the dictionary coding methods proposed and refined by Lempel and Ziv in 1977 and 1978, hence the names LZ77 and LZ78. In 1984 Terry Welch made the method more efficient and named it LZW (Lempel-Ziv-Welch). Among the effective compression algorithms based on frequency of use, the Shannon-Fano and Huffman methods must also be mentioned.

    II.1.2.3.4. Positional redundancy

    Because data items depend on one another, knowing the symbol (value) that appears at one position sometimes makes it possible to predict the values at other positions. For example, when an image is represented on a two-dimensional grid, some points in a column of a data block appear at the same position in different rows. Instead of storing the data, it is then enough to store the row and column positions. Compression based on this redundancy is called predictive coding.

    This way of assessing redundancy is purely intuitive and simply expresses that something occurs many times. Image data has these general properties and also properties of its own. For example, some applications do not need all the raw image data, only characteristic information that represents the image, such as edges or homogeneous regions. There are therefore compression methods specific to images that are based on image transforms or on image representation.

    II.1.3 CLASSIFICATION AND APPLICATIONS

    II.1.3.1 By compression principle

    By this criterion, methods are divided into two families:

    II.1.3.1.1 Lossless compression algorithms


    In lossless compression, the data obtained after decompression is identical to the original. The most widely used method is the Lempel-Ziv (LZ) algorithm. DEFLATE, a variant of LZ, is optimized for decompression speed and compression ratio, at the cost of slower compression. DEFLATE is used in PKZIP, GZIP and PNG. LZW (Lempel-Ziv-Welch) is used in the GIF file format. Two other notable LZ variants are LZX, used in Microsoft's CAB file format (Microsoft also uses it in CHM files), and LZMA, used in 7-Zip.

    Lossless compression algorithms are used for files such as executables, text, Word and Excel documents, and so on. Such data cannot tolerate an error of even a single bit, especially program files.

    Basic lossless compression algorithms:

    1. Run-length encoding (RLE).

    2. Dictionary coders.

    3. LZ77 and LZ78.

    4. LZW.

    5. Burrows-Wheeler transform (BWT).

    6. Prediction by partial matching (PPM).

    7. Context mixing (CM).

    8. Entropy encoding.

    9. Huffman coding (Huffman coding is commonly the last stage of a multi-stage file compression process).

    10. Adaptive Huffman coding.

    11. Arithmetic coding.

    12. Shannon-Fano coding.

    13. Range coding.


    II.1.3.1.2 Lossy compression algorithms

    In lossy compression, the decompressed data is not identical to the original, but it must still be useful. For images, audio and video, the limits of the human eye and ear allow a large amount of space to be saved by discarding redundant parts while the perceived quality hardly changes.

    In practice, image, audio and video files stored on computers are usually compressed lossily to save storage and bandwidth. Unlike lossless compression, lossy methods usually degrade quality very quickly when compression and decompression are applied repeatedly.

    Image and audio samples are split into small parts and transformed into another domain. The transform coefficients are quantized and then encoded with Huffman coding or arithmetic coding.

    Previous image and audio samples are used to predict the following samples. The error between the predicted data and the actual data is quantized and then encoded.

    The advantage of lossy over lossless compression is that in many cases it gives a far higher compression ratio than any known lossless algorithm while still preserving acceptable quality. Lossy compression is commonly used for images, audio and video. Audio can be compressed at 10:1 with almost no loss of quality. Video can be compressed at 300:1 with little loss of quality.

    II.1.3.2 By the way compression is carried out

    By this criterion, methods are again divided into two families:

    Spatial methods (Spatial Data Compression): methods in this family compress by acting directly on the samples of the image in the spatial domain.

    Transform methods (Transform Coding): methods that act on a transform of the original image rather than directly on the image as the previous family does.


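    CHAPTER 2

    DATA COMPRESSION METHODS

    II.2.1 LOSSLESS COMPRESSION METHODS

    II.2.1.1. Statistical model

    II.2.1.1.1 The Shannon-Fano algorithm

    Encoding with the Shannon-Fano algorithm proceeds as follows:

    - Step 1: Sort the symbols in decreasing order of frequency.

    - Step 2: Compute the probabilities.

    - Step 3: Recursively split the list into two parts whose total probabilities are as nearly equal as possible. Encode the upper part with bit 0 (or bit 1) and the lower part with bit 1 (or bit 0).

    - Step 4: Draw the tree diagram.

    - Step 5: Compute the entropy, the average number of bits per symbol after coding, and the number of bits needed by ordinary fixed-length coding.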

    Example illustrating the algorithm

    Symbol statistics:

    Symbol   A   B   C   D   E
    Count    15  7   6   5   6

    Encoding:

    Symbol  Count  p_i    log2(1/p_i)  Code  Total bits
    A       15     15/39  1.38         00    30
    B       7      7/39   2.48         01    14
    C       6      6/39   2.70         10    12
    E       6      6/39   2.70         110   18
    D       5      5/39   2.96         111   15

    Average number of bits used (total bits / total number of occurrences):

    R = (30 + 14 + 12 + 18 + 15) / 39 = 89 / 39 ≈ 2.28 bits
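The five steps and the worked example can be reproduced with a short sketch. It is an illustration only, not the thesis program (which is written in C/C++); the names `shannon_fano` and `entropy` are hypothetical, and ties are assumed to be broken by keeping the input order.

```python
import math


def shannon_fano(freqs):
    """Return {symbol: code} built by recursive near-equal splitting."""
    # Step 1: sort by decreasing count (the sort is stable, so ties keep input order).
    symbols = sorted(freqs, key=lambda s: -freqs[s])
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) < 2:
            return
        total = sum(freqs[s] for s in group)
        running, best_cut, best_diff = 0, 1, None
        # Step 3: find the cut that makes the two halves as equal as possible.
        for i in range(1, len(group)):
            running += freqs[group[i - 1]]
            diff = abs(total - 2 * running)
            if best_diff is None or diff < best_diff:
                best_cut, best_diff = i, diff
        for s in group[:best_cut]:
            codes[s] += "0"   # upper part gets bit 0
        for s in group[best_cut:]:
            codes[s] += "1"   # lower part gets bit 1
        split(group[:best_cut])
        split(group[best_cut:])

    split(symbols)
    return codes


def entropy(freqs):
    """Step 5: Shannon entropy in bits per symbol."""
    total = sum(freqs.values())
    return sum(n / total * math.log2(total / n) for n in freqs.values())


freqs = {"A": 15, "B": 7, "C": 6, "D": 5, "E": 6}
codes = shannon_fano(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
```

For the counts in the table this yields the same codes (A=00, B=01, C=10, E=110, D=111) and 89 bits in total, against an entropy of about 2.19 bits per symbol.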

    II.2.1.1.2 The Huffman algorithm

    The Huffman algorithm has a relatively high compression factor, is fairly simple to implement, needs little memory, and can be built on arrays smaller than 64 KB. Its drawback is that the code table must be stored in the compressed file so that the receiver can decode it; compression is therefore efficient only for large files.

    Principle:

    The Huffman method encodes the bytes of the source file with binary codes. It produces variable-length codes, each a set of bits. It is a statistical compression method: characters that occur more often get shorter codes (much like Shannon-Fano).

    Algorithm:

    Compression algorithm:

    - Step 1: Find the two symbols with the smallest weights and merge them into one; the weight of the new symbol is the sum of the weights of the two merged symbols.

    - Step 2: While the list contains more than one symbol, repeat step 1; otherwise go to step 3.

    - Step 3: Expand the last remaining symbol into a binary tree, with the convention that the left branch is coded 0 and the right branch is coded 1.

    Example

    Symbol statistics:

    Symbol   A   B   C   D   E
    Count    15  7   6   5   6

    Encoding:

    Symbol  Probability  Code  Total bits
    A       15/39        1     15
    B       7/39         000   21
    C       6/39         001   18
    E       6/39         010   18
    D       5/39         011   15

    Intermediate nodes of the tree: B + C = 13/39 (prefix 00), E + D = 11/39 (prefix 01), and their merge 24/39 (prefix 0).

    - Average number of bits: 87/39 ≈ 2.23
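The merge steps above can be sketched with a priority queue. This is a minimal illustration under assumptions: the name `huffman_codes` is hypothetical, a counter is used only to break ties between equal weights, and the exact bit patterns may differ from the table because the choice of left and right branch is arbitrary. The code lengths and the total of 87 bits do match.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Return {symbol: code}; symbols are assumed to be strings."""
    tie = count()  # tie-breaker so the heap never compares tree nodes
    heap = [(w, next(tie), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    # Steps 1 and 2: repeatedly merge the two lightest nodes.
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    # Step 3: walk the tree, left branch = 0, right branch = 1.
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


freqs = {"A": 15, "B": 7, "C": 6, "D": 5, "E": 6}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)  # 87 for this input
```

A gets a 1-bit code and the other four symbols get 3-bit codes, so the total is 15 + 3 x 24 = 87 bits, slightly better than the 89 bits of the Shannon-Fano example.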


    Decompression algorithm:

    - Step 1: Read the compressed file bit by bit, walking down the known binary tree until a leaf is reached. Write the character stored at that leaf to the decompressed file.

    - Step 2: While the compressed file is not exhausted, go back to step 1; otherwise go to the next step.

    - Step 3: When the end of the file is reached, the algorithm terminates.
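The decoding loop can be sketched as follows. The sketch assumes the code table of the Huffman example (A=1, B=000, C=001, E=010, D=011) and represents the bit stream as a string of "0" and "1" characters for readability; `build_tree` and `huffman_decode` are hypothetical names.

```python
def build_tree(codes):
    """Rebuild the binary tree as nested dicts keyed by '0' and '1'."""
    root = {}
    for sym, code in codes.items():
        node = root
        for bit in code[:-1]:
            node = node.setdefault(bit, {})
        node[code[-1]] = sym  # a leaf holds the symbol itself
    return root


def huffman_decode(bits, codes):
    root = build_tree(codes)
    out, node = [], root
    for bit in bits:                    # Step 1: follow one branch per bit
        node = node[bit]
        if not isinstance(node, dict):  # reached a leaf
            out.append(node)
            node = root                 # Step 2: restart from the root
    return "".join(out)                 # Step 3: input exhausted


codes = {"A": "1", "B": "000", "C": "001", "E": "010", "D": "011"}
encoded = "".join(codes[ch] for ch in "ABADE")
decoded = huffman_decode(encoded, codes)
```

Because the code is prefix-free, no separator between codewords is needed: each leaf is reached unambiguously.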

    II.2.1.1.3 The run-length algorithm

    The simplest kind of redundancy in a file is a long run of repeated characters. Such runs are common in bitmap graphics files, in the constant data areas of program files, and in some text files.

    For example, consider the following string:

    AAAABBBAABBBBBCCCCCCCCDABCBAAABBBBCCCD

    This string can be encoded more compactly by replacing each run of repeated characters with a single instance of the character together with a count of how many times it is repeated. We would like to say that the string consists of four A's followed by three B's, followed by two A's, followed by five B's, and so on. Compressing a string this way is called run-length encoding. When the runs are long, the savings can be substantial. There are many ways to implement the idea, depending on the characteristics of the application (do the runs tend to be relatively long? how many bits are used to encode the characters?).

    If we know that the string contains only letters, the count can be encoded simply by interleaving digits with the letters. The string above is then encoded as follows: 4A3BAA5B8CDABCB3A4B3CD

    Here "4A" means "four A's", and so on. Note that it is not worth encoding runs of length 1 or 2, because two characters are needed to encode a run.
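A minimal sketch of this digit-and-letter scheme, assuming the input contains only letters. The names `rle_encode` and `rle_decode` are hypothetical; the threshold of 3 follows the text, which leaves runs of length 1 and 2 unencoded.

```python
from itertools import groupby


def rle_encode(text, min_run=3):
    """Replace each run of at least min_run letters by <count><letter>."""
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        out.append(f"{n}{ch}" if n >= min_run else ch * n)
    return "".join(out)


def rle_decode(encoded):
    """Inverse of rle_encode; a letter without a count stands for itself."""
    out, digits = [], ""
    for ch in encoded:
        if ch.isdigit():
            digits += ch
        else:
            out.append(ch * (int(digits) if digits else 1))
            digits = ""
    return "".join(out)


source = "AAAABBBAABBBBBCCCCCCCCDABCBAAABBBBCCCD"
packed = rle_encode(source)  # "4A3BAA5B8CDABCB3A4B3CD"
```

The 38-character example shrinks to 22 characters, and decoding restores the original string.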

    For binary files, a refined version of this method is used to obtain significant savings. The idea is to store only the run lengths, exploiting the fact that runs alternate between 0 and 1 so that the 0's and 1's themselves need not be stored. This assumes that there are few short runs (bits are saved on a run only if the run is longer than the number of bits needed to represent its length in binary), but hardly any run-length encoding method works really well unless most of the runs are long.
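The alternating-run idea can be sketched as below. This is an illustration under one stated assumption: the first stored length always refers to a run of 0's, so a stream that starts with 1 begins with a run length of zero. The function names are hypothetical.

```python
from itertools import groupby


def binary_run_lengths(bits):
    """Store only run lengths; runs alternate 0, 1, 0, 1, ... starting with 0."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    if bits and bits[0] == "1":
        runs.insert(0, 0)  # empty leading run of zeros
    return runs


def expand_run_lengths(runs):
    """Rebuild the bit string from alternating run lengths."""
    return "".join(("0" if i % 2 == 0 else "1") * n for i, n in enumerate(runs))


runs = binary_run_lengths("0001111011")  # [3, 4, 1, 2]
```

Each length still has to be written with a fixed number of bits, which is why short runs cost more than they save.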

    Run-length encoding requires separate representations for the file and for its encoded version, so it cannot be used on every file. This can be a real disadvantage: for example, the character-file compression method proposed above does not work for strings that contain digits. If other characters are used to encode the counts, it does not work for strings containing those characters. Suppose that we must encode any string over a fixed alphabet using only characters from that alphabet. To illustrate, assume that we have only the 26 letters of the alphabet (and the space) to work with.

    To be able to use some letters to represent counts and others to represent the elements of the string being encoded, we must choose one character as the "escape" character. Each occurrence of that character signals that the next two letters form a (count, character) pair, with counts represented by using the i-th letter of the alphabet for the number i. Our example string is then represented as follows, with Q as the escape character:

    QDABBBAAQEBQHCDABCBAAAQDBCCCD

    The combination of the escape character, the count and the repeated character is called an escape sequence. Note that it is not worth encoding runs shorter than four characters, because at least three characters are needed to encode any run.


    If the escape character itself occurs in the input, an escape sequence with a count of 0 (the space character) is used to represent it. Consequently, if the escape character occurs often, the compressed file may end up larger than the original.

    Long runs can be split up and encoded with several escape sequences. For example, a run of 51 A's is encoded as QZAQYA using the scheme above.
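The escape scheme can be sketched as follows, assuming an alphabet of the 26 capital letters plus the space, Q as the escape character, and "Q followed by a space" standing for a literal Q. The names are hypothetical and the sketch is not taken from the thesis program.

```python
import string
from itertools import groupby

ESC = "Q"
LETTERS = string.ascii_uppercase  # count i is written as the i-th letter


def escape_rle_encode(text, esc=ESC, min_run=4):
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        if ch == esc:
            out.append((esc + " ") * n)  # count 0 (space) means a literal escape
            continue
        while n >= min_run:              # long runs are cut into pieces of at most 26
            chunk = min(n, 26)
            out.append(esc + LETTERS[chunk - 1] + ch)
            n -= chunk
        out.append(ch * n)               # short remainder is copied as is
    return "".join(out)


def escape_rle_decode(encoded, esc=ESC):
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] != esc:
            out.append(encoded[i])
            i += 1
        elif encoded[i + 1] == " ":
            out.append(esc)
            i += 2
        else:
            run = LETTERS.index(encoded[i + 1]) + 1
            out.append(encoded[i + 2] * run)
            i += 3
    return "".join(out)


packed = escape_rle_encode("AAAABBBAABBBBBCCCCCCCCDABCBAAABBBBCCCD")
long_run = escape_rle_encode("A" * 51)  # "QZAQYA": 26 + 25 = 51
```

Every literal Q costs two output characters, which illustrates why an input rich in escape characters can grow instead of shrink.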

    Run-length encoding is often applied to bitmap graphics files, because they usually contain large areas of a single color, which appear in the bitmap as bit strings with long runs. In practice it is used in .PCX and .RLE files.

    II.2.1.2. Dictionary model

    II.2.1.2.1 The LZ78 algorithm

    Instead of reporting the position of a repeated phrase in the past, LZ78 numbers all phrases, so that each phrase records the number of an earlier phrase together with one character that makes it differ from that earlier phrase. Each new phrase is therefore an earlier phrase plus one additional character, which guarantees that the new phrase differs from every phrase seen so far.

    Example: suppose we have the following text: aaabbabaabaaabab

    The LZ78 algorithm splits it into phrases as follows:

    Input   a    aa   b    ba   baa  baaa  bab
    Phrase  1    2    3    4    5    6     7
    Output  0+a  1+a  0+b  3+a  4+a  5+a   4+b

    The compressed output is therefore: (0,a); (1,a); (0,b); (3,a); (4,a); (5,a); (4,b)

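    Compression algorithm:

    Step 1: Read one character -> ch; write (0, ch) to the compressed file; add ch to the dictionary as phrase 1; ww := '';

    Step 2: While not eof(f) do

    Begin

    Read the next character -> ch; w := ww + ch;

    If w is in the dictionary then ww := w

    Else begin

    Code(ww, j);   { j = number of phrase ww, 0 if ww is empty }

    Write j and ch to the compressed file.

    Add w to the dictionary.

    ww := '';

    End;

    End;

    Step 3: Stop the program.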
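The compression loop can be written as a short sketch. It follows the usual textbook formulation of LZ78 rather than the thesis program; the name `lz78_encode` is hypothetical, and a leftover phrase at the end of the input is assumed to be flushed as (number of its prefix, last character).

```python
def lz78_encode(text):
    """Return a list of (phrase number, character) pairs."""
    dictionary = {"": 0}  # phrase 0 is the empty phrase
    out, phrase = [], ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the known phrase
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:  # input ended inside a known phrase
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out


pairs = lz78_encode("aaabbabaabaaabab")
# [(0, 'a'), (1, 'a'), (0, 'b'), (3, 'a'), (4, 'a'), (5, 'a'), (4, 'b')]
```

The pairs match the phrase table of the example above.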

    Decompression algorithm:

    Step 1: Read the dictionary information stored in the compressed file; tl := false;

    Step 2: While not eof(f) do

    Begin

    Read the next byte -> b;

    Decode(b, s, t);

    If tl = false then w := w + s

    Else w := ww + s;

    TIMCHU(w, t);   { dictionary lookup ("find word") }

    If t = false then

    Begin

    Write s to the decompressed file; add s to the dictionary

    End

    Else Begin

    ww := s;

    End;

    End;

    Step 3: Stop the program.
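For comparison, here is a minimal sketch of LZ78 decoding in its usual textbook form, in which the decoder rebuilds the phrase list from the (phrase number, character) pairs alone. It is not a transcription of the pseudocode above, and the name `lz78_decode` is hypothetical.

```python
def lz78_decode(pairs):
    """Rebuild the text from (phrase number, character) pairs."""
    phrases = [""]  # phrase 0 is the empty phrase
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch  # earlier phrase plus one new character
        phrases.append(phrase)        # it becomes the next numbered phrase
        out.append(phrase)
    return "".join(out)


pairs = [(0, "a"), (1, "a"), (0, "b"), (3, "a"), (4, "a"), (5, "a"), (4, "b")]
text = lz78_decode(pairs)  # "aaabbabaabaaabab"
```

In this formulation the dictionary does not need to be stored in the compressed file, because encoder and decoder build identical phrase lists.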

    Assessment: in general LZ78 is a fairly good text compression algorithm with a relatively fast running time, but its potential savings are not fully exploited.

    II.2.1.2.2 The LZW algorithm

    The LZW compression algorithm builds a dictionary that stores the patterns occurring frequently in the image. The dictionary is a set of pairs consisting of a word and its meaning. The words are the codewords, arranged in a fixed order; the meaning is a substring of the image data. The dictionary is built while the data is being read. The presence of a substring in the dictionary confirms that the substring has already appeared in the data read so far. The algorithm looks up and updates the dictionary continuously, after every character read from the input.


    Because memory is not unlimited, and to keep lookups fast, the dictionary is limited to 4096 entries, holding at most 4096 codeword values. The maximum codeword length is therefore 12 bits (4096 = 2^12). The dictionary is structured as follows:

    Index  Content
    0      0
    ...    ...
    255    255
    256    256 (Clear Code)
    257    257 (End of Information)
    258    new string
    ...    ...
    4095   new string

    256: the clear code (CC) handles the case where the number of repeated patterns exceeds 4096; when that happens, a CC is sent so that a dictionary can be built for the next part.

    EOI: signals the end of a compressed part.

    - The first 256 codewords, 0 to 255 in order, hold the integers 0 to 255. These are the codes of the 256 basic characters of the ASCII table.

    - Codeword 256 holds a special code, the clear code (CC). The clear code handles the case where the number of repeated patterns in the image exceeds 4096. The image is then treated as several image segments, and the dictionary as a set of sub-dictionaries. At the end of each segment a clear code is sent to signal the end of the old segment and the start of a new one, and the dictionary is reinitialized for the new segment. The clear code has the value 256.

    - Codeword 257 holds the end-of-information code (EOI), whose value is 257. As is well known, a GIF file can contain several images. Each image is encoded separately. The decoder repeats the decoding operation image by image until it meets the end-of-information code, and then stops.

    - The remaining codewords (258 to 4095) hold the patterns that repeat in the image. The first 512 entries of the dictionary are represented with 9 bits. Codewords 512 to 1023 are represented with 10 bits, 1024 to 2047 with 11 bits, and 2048 to 4095 with 12 bits.
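The width rule in the last item can be sketched in a few lines. The helper name `code_width` is hypothetical; it only restates the ranges given above and is not a full GIF bit packer.

```python
CLEAR_CODE = 256          # reinitialize the dictionary
END_OF_INFORMATION = 257  # end of the compressed data
MAX_CODE = 4095           # 12-bit limit


def code_width(code):
    """Number of bits used to write a codeword, per the ranges in the text."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError("codeword out of range")
    # 0..511 -> 9 bits, 512..1023 -> 10, 1024..2047 -> 11, 2048..4095 -> 12
    return max(9, code.bit_length())


widths = [code_width(c) for c in (65, 511, 512, 1024, 4095)]
```

The integer method `bit_length()` returns 10 for 512, 11 for 1024 and 12 for 2048, so `max(9, ...)` reproduces the four ranges directly.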

    It works on the following principles:

    - A character string is a sequence of two or more characters.

    - Every string encountered is remembered and assigned its own token.

    - If the string is encountered again later, it is replaced by its token.

    - The most important part of this compression method is building a very large array that stores the strings encountered so far (this array is called the "dictionary"). As the bytes to be compressed arrive, they are held in an accumulator buffer and compared with the strings already in the dictionary. If the string in the accumulator is not in the dictionary, it is added, and its index in the dictionary becomes its token. If the string in the accumulator is already in the dictionary, its token is written to the output stream in place of the string.

    Compression process:

    LZW starts with a dictionary of 256 characters (when an 8-bit character set is used) and uses them as the standard character set. It then reads 8 bits at a time (for example 't', 'r', ...) and encodes each as the number that is the index of that character in the dictionary.

    Whenever LZW passes over a new substring (say "tr"), it adds that substring to the dictionary.

    Whenever it passes over a substring it has seen before, it reads one more character and appends it to the known substring to form a new substring. The next time LZW meets a substring that already exists, it simply uses the corresponding index in the dictionary.

    The maximum number of entries in the dictionary is usually fixed in advance (say 4096), so that LZW compression does not use up all the memory. The codes of the substrings in this example are therefore 12 bits long (2^12 = 4096). The codes must be longer than the number of bits of a single character (12 versus 8 bits); compression is achieved when many repeated substrings are each replaced by a single code.

    Example: the string "ABCBCABCABCD" is encoded as follows.

    Steps:

    - Step 1: w = NIL;

    - Step 2: While a k-th character can be read from the string:

    - Step 3: If wk exists in the dictionary, then w = wk

    - Step 4: Otherwise add wk to the dictionary, output the code for w, and set w = k

    - Move on to the next character (k = k + 1)


    Step  w    k    wk    Added string  Index  Output
    0     NIL  A    A
    1     A    B    AB    AB            258    65
    2     B    C    BC    BC            259    66
    3     C    B    CB    CB            260    67
    4     B    C    BC
    5     BC   A    BCA   BCA           261    259
    6     A    B    AB
    7     AB   C    ABC   ABC           262    258
    8     C    A    CA    CA            263    67
    9     A    B    AB
    10    AB   C    ABC
    11    ABC  D    ABCD  ABCD          264    262
    12    D    NIL  D                          68

    Output sequence: 65 - 66 - 67 - 259 - 258 - 67 - 262 - 68

    Input size: 12 x 8 = 96 bits.

    Output size: 5 x 8 + 3 x 9 = 67 bits (five codes below 256 counted as 8 bits each, three dictionary codes counted as 9 bits each).

    Compression ratio: 96 / 67 ≈ 1.43
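The four steps and the trace table can be checked with a short sketch. It assumes the dictionary layout described earlier (codes 0 to 255 for single characters, 256 and 257 reserved, new strings from 258) and input characters with code points below 256; the name `lzw_encode` is hypothetical, and the clear code is not emitted in this simplified version.

```python
def lzw_encode(text, first_free=258, max_codes=4096):
    """Return the list of LZW output codes for text."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = first_free
    w, out = "", []              # Step 1: w = NIL
    for k in text:               # Step 2: read the next character k
        wk = w + k
        if wk in dictionary:     # Step 3: extend the current string
            w = wk
        else:                    # Step 4: emit w, remember wk, restart from k
            out.append(dictionary[w])
            if next_code < max_codes:
                dictionary[wk] = next_code
                next_code += 1
            w = k
    if w:                        # flush the last pending string
        out.append(dictionary[w])
    return out


codes = lzw_encode("ABCBCABCABCD")
# [65, 66, 67, 259, 258, 67, 262, 68]
bits = sum(8 if c < 256 else 9 for c in codes)  # 67, using the text's accounting
```

The bit count uses the accounting of the example (8 bits for single characters, 9 bits for dictionary entries); a real GIF encoder would write every code of this short stream with 9 bits.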


    II.2.2 LOSSY COMPRESSION METHODS

    II.2.2.1 MPEG image compression

    MPEG (Moving Picture Experts Group) was established in 1988 to standardize the compression of audio and video signals. MPEG-1 can compress video signals to 1.5 Mbit/s at VHS quality, with stereo audio at 192 kbit/s. It is used to store video and audio on CD-ROM.

    In the 1990s, MPEG-2 was introduced to meet the video compression requirements of television. MPEG-2 can encode television signals at 3-15 Mbit/s and high-definition television at 15-30 Mbit/s. It allows video to be encoded at many different resolutions and can therefore serve many different applications. The many algorithms corresponding to these applications were developed and gathered into a complete set of MPEG standards. Implementing every feature of MPEG-2 in every encoder and decoder is unnecessary, given the complexity of the equipment and the cost in transmission bandwidth. In most cases only a defined subset of the features of MPEG-2 is used; these subsets are called profiles and levels. A profile defines an algorithm (bitstream scalability and color resolution), and a level defines mandatory limits on picture parameters (for example picture size and bit rate).

    MPEG-4 became a standard in 1999 for image compression in digital television, for interactive graphics and video applications (games, videoconferencing), and for interactive multimedia applications (the World Wide Web, or applications that distribute video data such as cable television and Internet video). Today MPEG-4 is a technology standard for producing, distributing and accessing video systems. It helps solve the capacity problem for storage devices, the bandwidth problem for video transmission, or both together.


    MPEG is not a single compression tool. Its strength lies in the fact that it provides a set of standard coding tools that can be combined flexibly to serve a wide range of applications.

    MPEG compression combines four basic techniques: preprocessing, prediction of frame motion at the encoder (temporal prediction), motion compensation at the decoder, and quantisation coding. The preprocessing filters remove from the video signal information that is unnecessary, as well as information that is hard to encode but unimportant to human visual perception. Motion prediction relies on the principle that the pictures in a video sequence are closely related in time: each frame at a given moment is very likely to resemble the frames immediately before and after it. The encoder scans each frame in small parts called macroblocks and detects which macroblocks do not change from one frame to the next. The encoder predicts where a macroblock will appear from its known position and direction of motion. Only the differences between the blocks in the current frame and the predicted blocks (the motion compensated residual) are therefore transmitted to the receiver. The receiver, that is the decoder, already holds in its buffer the information that does not change from frame to frame, and uses it to fill in the missing positions in the reconstructed picture.

    As is well known, video compression works by removing both spatial redundancy (spatial coding) and temporal redundancy (temporal coding). In MPEG, temporal redundancy is removed first (inter-frame compression) by exploiting the similarity between consecutive pictures (inter-frame techniques). This similarity can be used to construct new pictures from the information in pictures already sent (predicted pictures). The encoder therefore only needs to send what has changed relative to the previous pictures, and then applies spatial compression to remove the spatial redundancy within this difference picture. Spatial compression relies on detecting the similarity between neighbouring pixels (intra-frame coding techniques). JPEG applies only spatial compression because it was designed for processing and transmitting still images. JPEG compression can nevertheless be used to compress each picture of a video sequence independently; this application is usually called Motion JPEG. Within one cycle of sending a sequence of pictures, the first picture is compressed by removing spatial redundancy, and the following pictures are compressed by removing temporal redundancy (inter-frame compression). The process is repeated for each sequence of pictures in the video signal.

    The MPEG compression algorithm also relies on the DCT applied to 8x8-pixel blocks to exploit efficiently the spatial redundancy between pixels of the same picture. When there is a strong correlation between pixels of consecutive pictures, that is when two consecutive pictures have overlapping content, inter-frame coding is used together with spatial redundancy prediction to form motion compensated prediction between frames. Many MPEG coding schemes combine temporal motion compensated prediction with a spatial transform to achieve a high compression gain (hybrid DPCM/DCT coding of video).
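To make the 8x8 DCT concrete, here is a minimal sketch of the orthonormal two-dimensional DCT-II written directly from its definition. It is an illustration only: real encoders use fast factorizations, and the name `dct2_8x8` is hypothetical.

```python
import math

N = 8  # block size used by MPEG and JPEG


def dct2_8x8(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of rows."""
    def scale(u):
        return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A perfectly flat block has all its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct2_8x8(flat)
```

For the flat block only `coeffs[0][0]` is non-zero (about 800), which shows why smooth image areas compress well: after quantization almost all coefficients vanish.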

    Most MPEG coding schemes apply subsampling and quantization before encoding. Subsampling reduces the size of the input picture both horizontally and vertically, and thus the number of pixels to be encoded. In some cases subsampling is also applied in time, to reduce the number of pictures in the sequence before encoding. This is regarded as a very basic technique for removing redundancy based on the persistence of human vision. We can usually distinguish changes in brightness better than changes in chromaticity. MPEG coding schemes therefore first split the picture into a Y component (luminance, or brightness plane) and UV components (chrominance, or color planes), that is, one brightness component and two color components. These component video signals are sampled and digitised into discrete pixels in the ratios 4:2:2 and 4:2:0.
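The effect of 4:2:0 subsampling on a chrominance plane can be sketched as follows. The sketch assumes even plane dimensions and uses a plain 2x2 average, which is a simplification chosen for illustration; actual encoders may use other filters and sample positions. The name `subsample_420` is hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks."""
    height, width = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1]
          + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, width, 2)]
        for y in range(0, height, 2)
    ]


u_plane = [[1, 3, 5, 7],
           [1, 3, 5, 7]]
small_u = subsample_420(u_plane)  # [[2.0, 6.0]]
```

For a 4x4 picture, full sampling needs 16 samples for each of Y, U and V (48 in total), while 4:2:0 keeps 16 luminance samples and 4 samples per chrominance plane (24 in total), halving the data before any coding takes place.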

    Motion-compensated prediction is one of the most powerful tools for reducing the temporal redundancy between pictures. Motion compensation rests on estimating the direction of motion between pictures, so that pictures in the video sequence can be approximated from one another. It limits the amount of motion information by using motion vectors to describe the displacement of pixels. The best prediction of a pixel is obtained by motion-compensated prediction from a previously encoded and transmitted picture. Both quantities, the prediction error (magnitude) and the motion vectors (direction of motion), are transmitted to the receiver. Because neighbouring pixels are strongly correlated in space, a single motion vector can serve a whole block of adjacent pixels (MPEG-1 and MPEG-2 use blocks of 16 x 16 pixels).
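One common way to find a motion vector for a block is exhaustive block matching with the sum of absolute differences (SAD). The sketch below illustrates that idea under stated assumptions: it is not the search used by any particular MPEG encoder, and `Frame`, `MotionVector`, `blockSad` and `estimateMotion` are hypothetical names. The block size `n` and the search `range` are parameters.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

struct Frame {
    int width, height;
    std::vector<std::uint8_t> pix;  // row-major luminance samples
    std::uint8_t at(int x, int y) const { return pix[y * width + x]; }
};

struct MotionVector { int dx, dy; long sad; };

// Sum of absolute differences between the n x n block of `cur` at (bx, by)
// and the block of `ref` displaced by (dx, dy).
long blockSad(const Frame& cur, const Frame& ref,
              int bx, int by, int dx, int dy, int n) {
    long sum = 0;
    for (int y = 0; y < n; ++y)
        for (int x = 0; x < n; ++x)
            sum += std::abs(int(cur.at(bx + x, by + y)) -
                            int(ref.at(bx + dx + x, by + dy + y)));
    return sum;
}

// Tries every displacement within +/- range and keeps the one with the
// smallest SAD. Candidates that fall outside the reference are skipped.
MotionVector estimateMotion(const Frame& cur, const Frame& ref,
                            int bx, int by, int n, int range) {
    MotionVector best{0, 0, std::numeric_limits<long>::max()};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int rx = bx + dx, ry = by + dy;
            if (rx < 0 || ry < 0 || rx + n > ref.width || ry + n > ref.height)
                continue;
            long sad = blockSad(cur, ref, bx, by, dx, dy, n);
            if (sad < best.sad) best = {dx, dy, sad};
        }
    }
    return best;
}
```

The residual left after subtracting the best-matching block is what the encoder goes on to transform and quantize; a SAD of zero means the block is predicted perfectly and only the vector needs to be sent.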

    MPEG-2 offers several ways to predict motion. A block can be forward predicted from pictures transmitted before it, backward predicted from pictures transmitted after it, or bidirectionally predicted from both. The prediction methods used for the blocks of a single picture need not be the same and may change from one block to the next. Moreover, the two fields of a block can be predicted in two different ways with independent vectors, or they can share one vector. For each block, the encoder chooses a suitable prediction method and tries to guarantee the best decoded picture quality under a tight bit budget. The parameters that identify the chosen prediction method are sent to the decoder together with the prediction error so that the original picture can be reconstructed almost exactly.

    MPEG uses three different picture types to encode blocks. Intra pictures (I-pictures) are encoded independently, without reference to any other picture. Their compression comes from removing spatial redundancy only; time plays no part in the process. I-pictures are inserted periodically to provide anchor points in the data stream for decoding.

    Predictive pictures (P-pictures) can use the immediately preceding I- or P-picture for motion compensation, and can themselves serve as references for predicting later pictures. Each block of a P-picture is either predicted or intra-coded. Because P-pictures exploit both spatial and temporal compression, they compress considerably better than I-pictures.

    Bidirectionally predictive pictures (B-pictures) can use I- or P-pictures both before and after them for motion compensation, and therefore give the highest compression. Each block of a B-picture can be backward predicted, forward predicted, predicted in both directions, or intra-coded. To allow backward prediction from a later picture, the encoder reorders the pictures from their natural display order into a different transmission order. At the encoder output, B-pictures are therefore sent after the preceding and following pictures they reference. This reordering introduces a delay whose size depends on the number of consecutive B-pictures transmitted.

    I-, P- and B-pictures usually occur in a periodically repeating pattern, which gives rise to the notion of a group of pictures (GOP). An example GOP in natural display order is:

    B1 B2 I3 B4 B5 P6 B7 B8 P9 B10 B11 P12

    After reordering by the encoder, the order on the transmission channel becomes:

    I3 B1 B2 P6 B4 B5 P9 B7 B8 P12 B10 B11

    The structure of a GOP can be described by two parameters: N, the number of pictures in the GOP, and M, the distance between P-pictures. This GOP has N = 12 and M = 3.
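The reordering rule can be sketched in a few lines: every anchor picture (I or P) is sent ahead of the B-pictures that precede it in display order, because those B-pictures need it as a backward reference. This is an illustrative sketch only; `toTransmissionOrder` is a hypothetical name and pictures are represented simply as labels such as "B1".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Converts display order to transmission order. B-pictures are held back
// until the next anchor picture (I or P) has been emitted.
std::vector<std::string> toTransmissionOrder(const std::vector<std::string>& display) {
    std::vector<std::string> out;
    std::vector<std::string> pendingB;
    for (const std::string& pic : display) {
        if (!pic.empty() && pic[0] == 'B') {
            pendingB.push_back(pic);           // wait for the following anchor
        } else {
            out.push_back(pic);                // anchor goes first
            out.insert(out.end(), pendingB.begin(), pendingB.end());
            pendingB.clear();
        }
    }
    // B-pictures with no later anchor in this sequence stay at the end.
    out.insert(out.end(), pendingB.begin(), pendingB.end());
    return out;
}
```

Applied to the display order of the example GOP, the function yields the transmission order I3 B1 B2 P6 B4 B5 P9 B7 B8 P12 B10 B11.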

    BLOCK DIAGRAM OF THE MPEG-2 ENCODER AND DECODER

    Figure 1. Block diagram of the MPEG encoder and decoder

    MPEG-2 encoding

    The encoding process for P-pictures and B-pictures works as follows:

    The macroblocks to be encoded are fed to both the subtractor and the motion estimator. The motion estimator compares each new macroblock with macroblocks that arrived earlier and were stored as reference pictures. It finds the macroblock in the reference picture that most closely resembles the new one, and then computes the motion vector, which describes the horizontal and vertical displacement of the new macroblock relative to the reference picture. Note that the motion vector has a resolution of one half, owing to interlaced scanning.

    The motion estimator also sends these reference macroblocks, usually called predicted macroblocks, to the subtractor, where they are subtracted pixel by pixel from the new macroblock. The result is the prediction error, or residual signal, which captures the difference between the predicted macroblock and the actual macroblock to be encoded.

    The residual signal is transformed with the DCT, and the resulting coefficients are quantized to reduce the number of bits to transmit. The quantized coefficients are passed to the Huffman coder, which reduces the number of bits representing them considerably further. The output of the Huffman coder is combined with the motion vectors and other information (the I, P or B picture type) and sent to the decoder.
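The quantization step is where information is actually discarded. The sketch below shows a plain uniform quantizer with a single step size. It is a deliberately simplified stand-in: MPEG-2 itself uses quantization matrices and additional rules that the text does not describe, and `quantize` and `dequantize` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Maps each coefficient to the nearest multiple of `step` and stores only
// the multiplier. Small coefficients collapse to zero.
std::vector<int> quantize(const std::vector<int>& coeffs, int step) {
    std::vector<int> levels;
    levels.reserve(coeffs.size());
    for (int c : coeffs)
        levels.push_back(static_cast<int>(std::lround(static_cast<double>(c) / step)));
    return levels;
}

// Reconstruction at the decoder: multiply the levels back by the step.
std::vector<int> dequantize(const std::vector<int>& levels, int step) {
    std::vector<int> coeffs;
    coeffs.reserve(levels.size());
    for (int q : levels) coeffs.push_back(q * step);
    return coeffs;
}
```

With an assumed step of 16, the coefficients 100, -37, 12, 3, -2, 0 become the levels 6, -2, 1, 0, 0, 0. The many zeros and small values are what make the subsequent Huffman stage effective.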

    For P-pictures, the DCT coefficients are also fed to a local decoder located inside the encoder. There the residual signal is transformed back with the IDCT and added to the preceding picture to form the reference (predicted) picture. Because the picture data is decoded inside the encoder by this local decoder, the encoder can reorder the pictures and apply the prediction methods described above.

    MPEG-2 decoding

    The decoder reconstructs the picture by running the process in reverse. From the incoming data stream, the motion vectors are extracted and sent to the motion compensator, while the DCT coefficients are sent to the inverse transform (IDCT), which converts the signal from the frequency domain back to the spatial domain. For P-pictures and B-pictures, the motion vectors are combined with the predicted macroblocks to form the reference pictures.

    II.2.2.2 Audio compression methods

    Goals:
    - A compact representation of the sample sequence.
    - Low bit rate.
    - High quality.

    Motivation:
    - Reduce the data rate.
    - Reduce transmission cost (bandwidth).
    - Reduce storage requirements.

    Requirements:
    - Perceptual transparency.
    - Source independence.
    - Multichannel capability.
    - Reasonable delay.

    MPEG-1 compression technique:

    Figure 2. MPEG-1 coding layers
    [Figure 2 labels: MPEG-1; Layers I, II and III; mono and stereo; 32, 44.1 and 48 kHz.]

    Introduction
    - Developed on the basis of the ISO/IEC 11172 standard.
    - Uses the CD-DA sampling frequencies, fs = 32, 44.1 or 48 kHz, with 16 bits per sample.
    - Bit rate: 32 - 768 kbps per channel.
    - Modes: mono, dual-mono, stereo, joint-stereo.
    - Defines the parameters for bit rate, the compressed bitstream, the number of samples per channel given in the header, the frame timing structure, the predictive coding method and the operating modes.

    Characteristics:

    Property            Layer I            Layer II                              Layer III
    Intended use        Consumer devices   Professional and multimedia devices   Professional and multimedia devices
    Bit rate            32-448 kbps        32-384 kbps                           32-320 kbps
    Samples per frame   384 per channel    1152 per channel                      1152 per channel

    [Figure 3 diagram: the input audio samples pass through subband filters 0 to 31, and each subband delivers groups of 12 samples. One group of 12 samples per subband forms a Layer I frame; three such groups form a Layer II or Layer III frame.]

    Figure 3. Audio samples. Layer I frame: 12 x 32 = 384 samples. Layer II and III frames: 12 x 32 x 3 = 1152 samples.

    Architecture:

    Figure 4. Three-layer architecture of MPEG-1

    Basic algorithm
    - The input is split into 32 subbands by a filter bank.
    - 32 PCM samples are taken at a time, producing 32 frequency coefficients at the output. In MPEG-1 Layer I these sets of 32 values are combined into a block of 12 groups of 32 samples.
    - MPEG-1 Layers II and III use 3 such blocks of 12 groups.
    - Bit allocation ensures that all quantization noise stays below the masking thresholds.
    - For each subband, the signal level and the noise level are determined with the psychoacoustic model. The SMR (signal-to-mask ratio) is used to determine the number of bits for quantizing each subband, with the aim of minimizing the data volume.

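    [Figure 4 block diagram, recovered labels. MPEG-1 Layers I and II: the input s(n) feeds a 32-channel polyphase analysis filter bank followed by quantization and a multiplexer (MUX) that outputs the data; in parallel an FFT (Layer I: 512 points, Layer II: 1024 points) drives the psychoacoustic analysis and the dynamic bit allocation, and side information is multiplexed in. MPEG-1 Layer III: the input s(n) feeds a 32-channel polyphase analysis filter bank and an MDCT, followed by a bit-allocation loop with quantization and Huffman coding; in parallel an FFT drives the psychoacoustic analysis, which supplies the SMR (signal-to-mask ratio); side information is coded alongside.]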

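    Bit allocation
    - The procedure that determines the number of bits for each subband.
    - It is driven by the output of the psychoacoustic model.

    Example: after analysis, the levels of the first 16 subbands are:

    Band         1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
    Level (dB)   0   8  12  10   6   2  10  60  35  20  15   2   3   5   3   1

    If the level of subband 8 is 60 dB, it masks 12 dB in subband 7 and 15 dB in subband 9.
    Subband 7 has a level of 10 dB < 12 dB: it is ignored. Subband 9 has a level of 35 dB > 15 dB: it is sent.
    Only the part of the level above the masking threshold needs to be sent, so instead of 6 bits we use only 4 bits to encode it.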
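The numbers in this example can be reproduced with the usual rule of thumb that each quantizer bit covers about 6.02 dB. That rule is an assumption added here, not something the text states, and `bitsForLevel` and `allocateBits` are hypothetical names for an illustrative sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Bits needed to represent a level of `levelDb`, assuming roughly
// 6.02 dB per bit. Levels at or below zero need no bits.
int bitsForLevel(double levelDb) {
    if (levelDb <= 0.0) return 0;
    return static_cast<int>(std::ceil(levelDb / 6.02));
}

// For each subband, only the part of the level above its masking
// threshold is encoded. maskDb[i] is 0 when band i is not masked.
std::vector<int> allocateBits(const std::vector<double>& levelDb,
                              const std::vector<double>& maskDb) {
    std::vector<int> bits(levelDb.size(), 0);
    for (std::size_t i = 0; i < levelDb.size(); ++i)
        bits[i] = bitsForLevel(levelDb[i] - maskDb[i]);
    return bits;
}
```

Under this assumption, subband 9 would need 6 bits for its full 35 dB but only 4 bits for the 20 dB that exceed the 15 dB mask, while subband 7 (10 dB under a 12 dB mask) receives no bits at all.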

    MPEG Layer I: a DCT-type filter with one frame and equal frequency spread in each subband. The psychoacoustic model uses frequency masking only.

    MPEG Layer II: three frames in the filter (previous, current and next), 1152 samples in total. This models a little of the temporal masking.

    MPEG Layer III: uses a better critical-band filter. The psychoacoustic model uses temporal masking and frequency masking, takes stereo redundancy into account, and uses Huffman coding.

    Frame structure:

    Figure 5. MPEG frame structure

    - Header: 12 synchronization bits and 20 bits of system information, including the bit rate indication.
    - CRC: uses the generator polynomial x^16 + x^15 + x^2 + 1.
    - Side Info: contains the bit allocation. Layer I uses 4 bits, linear, for every subband; Layer II uses 4 bits for low-frequency subbands, 3 bits for mid-frequency subbands and 2 bits for high-frequency subbands. The scale factor takes 6 bits per subband and, together with the bit allocation and the coded bits of the subband, determines the sample values. Layer III adds stereo coding information.
    - Bit Reservoir: a supply of bits, holding sample data from the previous 1 or 2 frames.
    - Samples: 32 x 12 samples for Layer I and 32 x 36 samples for Layers II and III.
    - Ancillary Data: additional data.
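The generator polynomial x^16 + x^15 + x^2 + 1 corresponds to the 16-bit constant 0x8005. The sketch below is a generic bitwise CRC-16 with that polynomial, processed most significant bit first. The initial register value and the exact set of frame bits covered by the checksum are not given in the text, so `init` is left as a parameter and `crc16` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bitwise CRC-16 with generator x^16 + x^15 + x^2 + 1 (0x8005),
// most significant bit first, no final XOR.
std::uint16_t crc16(const std::vector<std::uint8_t>& data, std::uint16_t init) {
    std::uint16_t crc = init;
    for (std::uint8_t byte : data) {
        crc ^= static_cast<std::uint16_t>(byte << 8);  // bring the byte into the top of the register
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 0x8000)
                crc = static_cast<std::uint16_t>((crc << 1) ^ 0x8005);
            else
                crc = static_cast<std::uint16_t>(crc << 1);
        }
    }
    return crc;
}
```

A useful property of this form is that appending the two CRC bytes (high byte first) to the message and running the same function again yields zero, which is how a receiver can check a frame without a separate comparison step.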

    MPEG-2 compression technique:
    - Extends MPEG-1 for new applications.
    - Supports a wide range of bit rates, from 32 to 1066 kbps. The sampling frequency can be half that of MPEG-1 (16, 22.05 or 24 kHz).
    - Multichannel capability: the extension bit rate can reach 1 Mbps for high-rate applications, and several channels can be compressed simultaneously.
    - Audio quality depends on the application.
    - Supports dubbing and multilingual commentary in the extension bits (7 channels).
    - MPEG-2 uses intensity coding, crosstalk reduction, inter-channel predictive coding and phantom coding of the center channel to reach a combined bit rate of 384 kbps.
    - An MPEG-2 frame is divided into two parts: the first is MPEG-1 stereo, and the MPEG-2 extension part carries all the remaining surround data.

    Figure 6. MPEG-2 frame structure

    MPEG-2 encoding and decoding:

    Figure 7. MPEG-2 encoding and decoding diagram

    [Diagram labels recovered near Figures 6 and 7. Configurations: MPEG-1 mono/stereo at 32, 44.1 and 48 kHz, Layers I to III; MPEG-2 mono/stereo at 16, 22.05 and 24 kHz, Layers I to III; MPEG-2 multichannel (5 channels) at 32, 44.1 and 48 kHz, Layers I to III. Encoder: channels L, C, R, LS and RS enter a matrix that produces L0 and R0 for the MPEG-1 encoder and T3, T4 and T5 for the MPEG-2 extension encoder. Decoder: the MPEG-1 decoder (L0, R0) and the MPEG-2 extension decoder (T3, T4, T5) feed an inverse matrix that restores L, C, R, LS and RS.]

    CHAPTER 3

    BUILDING A DATA COMPRESSION PROGRAM
    USING SHANNON-FANO AND HUFFMAN CODES

    II.3.1 CHARACTERISTICS OF DATA COMPRESSION ALGORITHMS

    Within the scope of compressing data stored as a file, the commonly used methods above show some noteworthy characteristics. Run-length encoding cannot be applied to many kinds of files, such as program files or database files, because their runs are very short. Applying it to such files does not shrink them and can even make them larger.
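The expansion effect is easy to demonstrate with a minimal run-length coder that stores each run as a (count, value) byte pair. This layout is an assumption chosen for illustration; real run-length formats vary, and `rleEncode` and `rleDecode` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Encodes each run as two bytes: the run length (1..255) and the value.
std::string rleEncode(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
        out.push_back(static_cast<char>(run));
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

// Expands every (count, value) pair back into a run.
std::string rleDecode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        std::size_t count = static_cast<unsigned char>(in[i]);
        out.append(count, in[i + 1]);
    }
    return out;
}
```

A 12-byte input made of two long runs ("AAAAAAAABBBB") shrinks to 4 bytes, whereas a 6-byte input with no repeats ("ABCDEF") doubles to 12 bytes, which is the behaviour described for program and database files.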

    The LZW algorithm has the advantages of a relatively high compression ratio and of not needing to store the code table in the compressed file. Its drawbacks are that it uses a lot of memory and is hard to implement with simple arrays (smaller than 64 KB).

    Other algorithms such as Huffman, Shannon-Fano, LZ77 and LZW can all be used to compress many kinds of files on personal computers.

    The Shannon-Fano algorithm compresses text files relatively well, but because of the nature of the algorithm and its fairly complex requirements it is rarely used.

    The Huffman algorithm offers a relatively high compression ratio, is fairly simple to implement, needs little memory and can be built on arrays smaller than 64 KB. It is commonly used in the final stages of audio, image and video compression.

    From these characteristics, I observe that the Shannon-Fano and Huffman coding algorithms are similar. The differences between them, however, give Huffman and Shannon-Fano different compression efficiencies (entropy coefficients). I will therefore concentrate on studying, simulating and evaluating these two codes in order to clarify their strengths and weaknesses.
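One standard way to quantify the efficiency of a code is to compare its average code word length L with the source entropy H = -sum(p_i * log2(p_i)); the ratio H / L is at most 1. Whether this is exactly what the thesis means by "entropy coefficient" is an assumption. The function names below are hypothetical and the probabilities in the usage note are illustrative values, not thesis data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Source entropy in bits per symbol.
double entropyBits(const std::vector<double>& p) {
    double h = 0.0;
    for (double pi : p)
        if (pi > 0.0) h -= pi * std::log2(pi);
    return h;
}

// Average code word length in bits per symbol for the given lengths.
double averageLength(const std::vector<double>& p, const std::vector<int>& len) {
    double l = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) l += p[i] * len[i];
    return l;
}

// Coding efficiency H / L; 1.0 means the code reaches the entropy bound.
double efficiency(const std::vector<double>& p, const std::vector<int>& len) {
    return entropyBits(p) / averageLength(p, len);
}
```

For probabilities 0.5, 0.25, 0.125, 0.125 and code lengths 1, 2, 3, 3, both H and L equal 1.75 bits, so the efficiency is 1. For less convenient probabilities L exceeds H, and two algorithms can be compared by how close each gets.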

    II.3.2 ALGORITHM ANALYSIS

    II.3.2.1 The Shannon-Fano algorithm:

    The Shannon-Fano code is a coding algorithm used for data compression. It uses the table of occurrence frequencies of the characters to be encoded to build a binary code for those characters such that the size (number of bits) after encoding is as small as possible.

    Computer files are stored as characters of a fixed length of 8 bits. In many files some characters occur far more often than others. It follows that representing frequent characters with only a few bits and rare characters with more bits can shorten the file considerably. Consider encoding the following string: "ABRACADABRA"

    Encoding this string with a 5-bit binary code, in which each letter is represented by its position in the alphabet (A = 00001, B = 00010, and so on), gives the bit sequence:

    0000100010100100000100011000010010000001000101001000001

    To decode this message, we simply read 5 bits at a time and convert them according to the binary encoding defined above. In this standard code, the letter D, which occurs only once, needs as many bits as the letter A, which occurs many times. We could instead assign the shortest bit strings to the most frequent characters. Suppose we assign A = 0, B = 1, R = 01, C = 10 and D = 11; the string is then represented as:

    0 1 01 0 10 0 11 0 1 01 0

    This example uses only 15 bits compared with the 55 bits above, but it is not really a code because it depends on spaces to separate the characters. Without the separators the message cannot be decoded. We can, however, choose code words so that the message can be decoded without separators, for example A = 11, B = 00, C = 010, D = 10 and R = 011. Such code words have the prefix property: no code word is a prefix of another. With these code words the message is encoded in 25 bits as:

    1100011110101110110001111

    This encoded string can be decoded unambiguously without separators. But how do we find the best code table?
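The difference between the two variable-length tables can be checked mechanically. The sketch below encodes a string with a given table and decodes it by growing a candidate code word bit by bit, which recovers the text only when the table is prefix-free. `CodeTable`, `encode` and `decode` are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

using CodeTable = std::map<char, std::string>;

// Concatenates the code words of all characters, with no separators.
std::string encode(const std::string& text, const CodeTable& table) {
    std::string bits;
    for (char ch : text) bits += table.at(ch);
    return bits;
}

// Reads bits one at a time and emits a character as soon as the bits
// collected so far form a code word. Correct only for prefix-free tables.
std::string decode(const std::string& bits, const CodeTable& table) {
    std::map<std::string, char> reverse;
    for (const auto& kv : table) reverse[kv.second] = kv.first;
    std::string out, word;
    for (char b : bits) {
        word += b;
        auto it = reverse.find(word);
        if (it != reverse.end()) {
            out += it->second;
            word.clear();
        }
    }
    return out;
}
```

With the prefix table (A = 11, B = 00, C = 010, D = 10, R = 011) the round trip returns "ABRACADABRA". With the shorter table (A = 0, B = 1, R = 01, C = 10, D = 11) the 15-bit output decodes to a string of A's and B's only, because every single bit already matches a code word.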

    - The first step in building a Shannon-Fano code is to count the occurrences of each character in the file to be encoded (the examples in this section use only a limited set of characters).

    - The next step is to build the code table with the Shannon-Fano algorithm, which proceeds as follows:

    Step 1: Sort the symbols in ascending (or descending) order of frequency. The source statistics are gathered from the file to be encoded.

    Step 2: Split the symbols into 2 groups whose totals are approximately equal (the difference between the 2 group totals is minimal).

    Step 3: Assign the symbol 0 to one group and 1 to the other.

    Step 4: Repeat from step 2 within each group until every group contains a single symbol.

    - Once the Shannon-Fano code table is available, the data can be encoded. The advantage of Shannon-Fano coding is a relatively high compression ratio (the ratio depends on the structure of the file). Its drawback is that the receiver can decode the message only with a code table identical to the sender's, so the compression ratio is poor for small files.

    - To decode the data we need the code table, as noted above. We compare successive groups of bits against the table; each match yields one character of the original data.

    The source-coding program based on the Shannon-Fano code applies the Shannon-Fano algorithm (covered in information theory and data transmission courses) to encode characters of the ASCII table according to their probability of occurrence. Using the resulting code table, strings or lines of text are encoded into sequences of the characters 0 and 1. The encoded characters have different lengths, but overall the encoded string or text is shorter than the unencoded one. It therefore takes less memory to store and less bandwidth to transmit.

    Functions of the system/program:

    Overall, the program is organised as follows:

    1. Declare the constants, global variables and structures used throughout the program.

    typedef struct Node
    {
        char kytu;       /* the character */
        float xacsuat;   /* its probability (or occurrence count) */
        char code[32];   /* its code word as a string of '0' and '1' */
    } Node;

    Node a[256];  /* global array holding one entry per character of the 256-character ASCII table */

    2. Declare the subprograms and their structure:

    a) Function read: reads the file and gathers the occurrence statistics. Characters are read from the file one at a time and their occurrence counts are incremented in array a. Array a stores the number of occurrences of each character in the file; the position in the array is the ASCII code of the character.

    b) Function sapxeptang: sorts array a by the number of occurrences of each character, in ascending order.

    c) Function mahoa: encodes the source with the Shannon-Fano code for all characters that occur at least once in the file. Each character receives its own binary code.

    d) Function giaima: decodes an encoded file using the code table.

    e) Function insapxep: prints the characters, sorted by probability of occurrence, to the screen for inspection.

    f) Function thuchien: does the main work of the program. It encodes the file and saves the result in another file, the compressed output of the program.

    g) Function main: shows the necessary information about the project and calls the subprograms.
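The counting step performed by read can be sketched as follows. This is an illustration rather than the thesis code: `countBytes` is a hypothetical name, and it takes any input stream so that it works on files and in-memory strings alike. Each byte is converted to `unsigned char` before indexing, because a plain `char` may be signed and would give a negative index for codes above 127.

```cpp
#include <array>
#include <cassert>
#include <istream>
#include <sstream>

// Counts how often each of the 256 possible byte values occurs in the stream.
// The array index is the character code, as in the description of read.
std::array<unsigned long, 256> countBytes(std::istream& in) {
    std::array<unsigned long, 256> freq{};  // zero-initialised
    char ch;
    while (in.get(ch))
        ++freq[static_cast<unsigned char>(ch)];
    return freq;
}
```

To count a file, the same function can be called on a `std::ifstream` opened in binary mode. Dividing each count by the total number of bytes gives the probabilities used by the coding step.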

    Example: consider a source with the following 10 symbols:

    X    A     B     C     H     N     U     G     O     M     I
    Px   0.20  0.11  0.08  0.10  0.07  0.09  0.13  0.06  0.12  0.04

    a) Sorting function: this function sorts the characters by the probabilities above, in descending order, in preparation for encoding them with the Shannon-Fano algorithm.

    X    A     G     M     B     H     U     C     N     O     I
    Px   0.20  0.13  0.12  0.11  0.10  0.09  0.08  0.07  0.06  0.04

    b) Encoding function: this function performs the central task of the program. It encodes the characters, entered or given in advance, so that no two code words coincide. The steps are:

    Step 1: Using the sorted order above, split array a into 2 parts such that the difference between the sums of the probabilities in the two parts is minimal.

    Step 2: Append the value 0 to the code field of every character in the first group and the value 1 to that of every character in the second group.

    Step 3: Repeat step 1 on the first group and then on the second group. The recursion stops when a group cannot be split further (it contains a single element).

    After these steps we obtain a code table in which every character is assigned a definite code word, and no two code words coincide.

    Example: the code table after encoding:

    X     A     G     M     B     H     U     C     N     O     I
    Px    0.20  0.13  0.12  0.11  0.10  0.09  0.08  0.07  0.06  0.04
    Code  00    010   011   100   1010  1011  1100  1101  1110  1111

    The sequence of splits is as follows:

    X   Px     Code
    A   0.20   0 0
    G   0.13   0 1 0
    M   0.12   0 1 1
    B   0.11   1 0 0
    H   0.10   1 0 1 0
    U   0.09   1 0 1 1
    C   0.08   1 1 0 0
    N   0.07   1 1 0 1
    O   0.06   1 1 1 0
    I   0.04   1 1 1 1
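The splitting procedure can be sketched recursively as shown here. This is a minimal illustration and not the thesis's mahoa function: `Symbol`, `shannonFano`, `buildShannonFano` and `codeOf` are hypothetical names, and weights are integers (for example probabilities multiplied by 100) so that the comparison of group sums is exact. When two split points are equally good, this sketch keeps the earlier one, which is an assumption the text does not address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <string>
#include <vector>

struct Symbol {
    char ch;
    int weight;        // occurrence count, or probability scaled to an integer
    std::string code;  // filled in by shannonFano
};

// Assigns codes to syms[lo..hi] (inclusive), sorted by descending weight.
void shannonFano(std::vector<Symbol>& syms, std::size_t lo, std::size_t hi) {
    if (lo >= hi) return;  // a single symbol cannot be split further
    long total = 0;
    for (std::size_t i = lo; i <= hi; ++i) total += syms[i].weight;
    long left = 0, bestDiff = std::numeric_limits<long>::max();
    std::size_t split = lo;
    for (std::size_t i = lo; i < hi; ++i) {  // first group is lo..i
        left += syms[i].weight;
        long diff = std::labs(total - 2 * left);  // |right sum - left sum|
        if (diff < bestDiff) { bestDiff = diff; split = i; }
    }
    for (std::size_t i = lo; i <= hi; ++i)
        syms[i].code += (i <= split) ? '0' : '1';
    shannonFano(syms, lo, split);
    shannonFano(syms, split + 1, hi);
}

// Sorts by descending weight, then builds the code table.
std::vector<Symbol> buildShannonFano(std::vector<Symbol> syms) {
    std::stable_sort(syms.begin(), syms.end(),
                     [](const Symbol& x, const Symbol& y) { return x.weight > y.weight; });
    if (!syms.empty()) shannonFano(syms, 0, syms.size() - 1);
    return syms;
}

// Looks up the code word of one character (empty string if absent).
std::string codeOf(const std::vector<Symbol>& table, char ch) {
    for (const Symbol& s : table)
        if (s.ch == ch) return s.code;
    return "";
}
```

Run on the 10-symbol source with weights 20, 13, 12, 11, 10, 9, 8, 7, 6 and 4, the sketch reproduces the table above: the first split separates {A, G, M} (0.45) from the other seven symbols (0.55). The resulting average code length is 0.20 x 2 + 0.36 x 3 + 0.44 x 4 = 3.24 bits per symbol. Note that a source with a single symbol receives an empty code word in this sketch.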

    c) Decoding function: this function uses the code table above to decode a string of the characters 0 and 1. It checks that the input has the required form and that every code word found belongs to the code table built above. The steps are:

    Step 1: Check whether the first character is a code word in the table. If it is, store the decoded character in array d and repeat step 1 starting at the next unchecked position. If it is not, go to step 2.

    Step 2: Check whether the first and second characters together form a code word in the table. If they do, store the decoded character in array d and return to step 1 starting at the next position. If they do not, repeat step 2 with 3, 4, ..., n consecutive positions (n being the number of elements of the input string).

    Step 3: If the last element has been reached without matching any entry of the code table, report the error "Input data is invalid". Otherwise continue until no elements remain in the array.

    Step 4: Print the decoded characters, or the error message, on the screen.

    d) Function main: this function displays information about the author of the project and offers the choice between encoding and decoding a file. It reports on screen whether encoding or decoding has finished and where the encoded or decoded file was saved.
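The decoding steps in item c) amount to trying ever longer bit groups from the current position until one matches the code table, and failing if none does. The sketch below follows that description; `decodeBits` is a hypothetical name, the table is assumed to map code words to characters, and the error message of the original program is replaced by a boolean return value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Decodes `bits` with the given code-word-to-character table.
// Returns false if the input contains a character other than '0' or '1',
// or if some tail of the input matches no code word.
bool decodeBits(const std::string& bits,
                const std::map<std::string, char>& codeToChar,
                std::string& out) {
    out.clear();
    std::size_t pos = 0;
    while (pos < bits.size()) {
        bool matched = false;
        // Try groups of 1, 2, 3, ... bits starting at pos.
        for (std::size_t len = 1; pos + len <= bits.size(); ++len) {
            char b = bits[pos + len - 1];
            if (b != '0' && b != '1') return false;
            auto it = codeToChar.find(bits.substr(pos, len));
            if (it != codeToChar.end()) {
                out += it->second;
                pos += len;
                matched = true;
                break;
            }
        }
        if (!matched) return false;  // ran off the end without a match
    }
    return true;
}
```

With the Shannon-Fano table of the example, the input 00010011 decodes to "AGM", while 0001 fails because the trailing 01 is not a complete code word.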

    II.3.2.2 The Huffman algorithm:

    A binary tree is built from the set of symbols in the message; each symbol is a leaf of the tree. The tree is constructed as follows:

    Choose the two nodes a and b with the smallest probabilities in the set of nodes, where the probability of a is less than or equal to that of b. Create a binary tree with root x, left child a and right child b. The probability of x is the sum of the probabilities of a and b.

    The set of nodes now consists of the remaining nodes (with a and b removed) together with node x. Repeat the process recursively on this set until only one node remains.

    The codes of a and b are obtained by taking the code of x and appending 0 for a and 1 for b. The code of the root is empty.

    In essence, the process builds a binary tree from the set of characters to be encoded, ending with a binary tree whose leaves are those characters. The code of a character is the path from the root to the leaf holding it, with 0 meaning left and 1 meaning right. The idea behind the algorithm is simple: find codes such that characters with a high frequency of occurrence receive short code words (close to the root), so that the average number of bits per encoded character is minimal.

    Example:

    Let the frequencies of the 5 letters A, B, C, D and E be 0.10, 0.15, 0.30, 0.16 and 0.29 respectively.

    A     B     C     D     E
    0.10  0.15  0.30  0.16  0.29

    The Huffman tree is built as follows:

    Figure 8. Steps 1 and 2

    Figure 9. Steps 3 and 4

    Figure 10. Step 5

    The corresponding optimal code is:

    A     B     C     D     E
    010   011   11    00    10
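The construction can be sketched with a priority queue that always yields the two lightest nodes. This is an illustrative sketch and not the thesis's huffman.cpp: `HuffNode` and `huffmanCodes` are hypothetical names, weights are integers (probabilities times 100), and ties are broken by node index, which is an assumption since the text does not define tie-breaking.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct HuffNode {
    int weight;
    char ch;     // meaningful for leaves only
    int left;    // index of the left child, -1 for a leaf
    int right;   // index of the right child, -1 for a leaf
};

std::map<char, std::string> huffmanCodes(const std::vector<std::pair<char, int>>& freq) {
    std::vector<HuffNode> nodes;
    using Item = std::pair<int, int>;  // (weight, node index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    for (const auto& f : freq) {
        nodes.push_back({f.second, f.first, -1, -1});
        pq.push({f.second, static_cast<int>(nodes.size()) - 1});
    }
    // Repeatedly merge the two lightest nodes: the lighter becomes the
    // left child (bit 0), the other the right child (bit 1).
    while (pq.size() > 1) {
        Item a = pq.top(); pq.pop();
        Item b = pq.top(); pq.pop();
        nodes.push_back({a.first + b.first, '\0', a.second, b.second});
        pq.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    std::map<char, std::string> codes;
    if (nodes.empty()) return codes;
    // Walk the tree from the root, extending the path on the way down.
    std::vector<std::pair<int, std::string>> stack;
    stack.push_back({pq.top().second, ""});
    while (!stack.empty()) {
        std::pair<int, std::string> cur = stack.back();
        stack.pop_back();
        const HuffNode& n = nodes[cur.first];
        if (n.left < 0) { codes[n.ch] = cur.second; continue; }
        stack.push_back({n.left, cur.second + '0'});
        stack.push_back({n.right, cur.second + '1'});
    }
    return codes;
}
```

For the five letters of the example the merges are A + B = 0.25, D + 0.25 = 0.41, E + C = 0.59 and finally 0.41 + 0.59 = 1.00, which yields exactly the code table above. Its average length is 3 x (0.10 + 0.15) + 2 x (0.30 + 0.16 + 0.29) = 2.25 bits per character.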

    - Read the text file and store it in array a.
    - Build the Huffman tree used to encode and decode the text.
    - Display the Huffman tree and the Huffman code table on the screen.
    - Encode and decode a line of text.
    - As an extension, encode and decode a text file. The encoded and decoded results are written to 2 other text files.

    Program structure:

    - The file huffman.cpp implements the Huffman tree. It contains the following functions:
    - mahoa_chuoi(): encodes a string.
    - giaima_chuoi(): decodes a string.
    - general(): builds the Huffman tree; used by the functions above.
    - make_table(): creates the code table for the characters; used by the encoding functions.
    - quit(): final operations before returning.

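    General issues in building the program:

    Data structures used in the program:

    Code:

    typedef struct {
        char kytu;            /* the character */
        int tansuat;          /* its frequency (occurrence count) */
        char code[MAXBITS];   /* its code word as a string */
    } hlist;

    - This is the element type of array a[], which holds the set of characters (a[i].kytu), their frequencies (a[i].tansuat) and their code words (a[i].code).

    Code:

    typedef struct {
        char kytu;                      /* the character, for leaf nodes */
        int contrai, conphai, nutcha;   /* indices of left child, right child, parent */
        int tansuat;                    /* total frequency of the leaves below this node */
        int isleft;                     /* 1 = left child, 0 = right child, -1 = root */
    } hnode;

    - This is the element type of array node[], which implements the Huffman tree. Leaf nodes correspond to characters (node[i].kytu, if present). node.contrai, node.conphai and node.nutcha are the indices of the left child, the right child and the parent (-1 if absent). node.tansuat holds the total frequency of the leaves in the node's subtree. node.isleft is 1 if the node is the left child of its parent, 0 if it is the right child, and -1 if it is the root (goc).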

    Building the Huffman tree from the set of characters read from a file:

    - Procedure general()
    - Sort the entries a[i].
    - Initialise node[i] (i = 0 to n-1): node[i].kytu and node[i].tansuat take the values of a[i].kytu and a[i].tansuat after sorting. The remaining fields are set to -1 (undefined).
    - Build the Huffman tree by inserting new nodes while keeping the nodes sorted by ascending frequency.

    Other issues to be solved:

    1. Build the character set (a[i].kytu) and the corresponding frequencies (a[i].tansuat) from a character string (s):
    - Read the first character of the string into a[0].kytu and set a[0].tansuat to 1.
    - Scan the remaining characters of the string. If a character is already present in a[i].kytu, increment a[i].tansuat by 1; otherwise append a new element to the array with frequency 1.

    2. Build the character set of a text file and the corresponding frequencies (analogous to the string case).

    3. Build the Huffman code table (stored in a[i].code): procedure make_table()
    - Visit all node[i] (i from 0 to 2*n-2).
    - For every node that holds a character (node[i].kytu is not NULL), climb to the root; this yields the code word (a[i].code) written in reverse.
    - Reverse a[i].code with the function strrev() to obtain the code word of the character.

    4. Encoding:
    - Read each character of the string (or file) and either display its code word or append it to the encoded file (fileout).

    5. Decoding:
    - Point goc at the root node (goc = 2*n-2).
    - Walk down the Huffman tree: on 0 move to the left child, on 1 move to the right child, until a node whose kytu field differs from -1 is reached.
    - On reaching such a node, display its character and jump back to the root (goc = 2*n-2).
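The array layout described above, with parent links and the isleft flag, can be sketched as follows. Field names follow the hnode structure, but the functions `buildTree`, `codeFor` and `decode` are hypothetical stand-ins for general(), make_table() and the decoding loop. Two deliberate deviations are assumptions of this sketch: leaves are recognised by contrai == -1 rather than by the kytu field, and `std::reverse` replaces strrev(), which is not part of standard C or C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct HNode {
    char kytu;
    int contrai, conphai, nutcha;  // left child, right child, parent (-1 if none)
    int tansuat;                   // frequency
    int isleft;                    // 1 left child, 0 right child, -1 root
};

// Builds 2n-1 nodes: leaves first, then internal nodes; the root is last.
std::vector<HNode> buildTree(const std::vector<std::pair<char, int>>& freq) {
    std::vector<HNode> node;
    for (const auto& f : freq) node.push_back({f.first, -1, -1, -1, f.second, -1});
    for (std::size_t made = 1; made < freq.size(); ++made) {
        int a = -1, b = -1;  // the two lightest nodes that have no parent yet
        for (int i = 0; i < static_cast<int>(node.size()); ++i) {
            if (node[i].nutcha != -1) continue;
            if (a == -1 || node[i].tansuat < node[a].tansuat) { b = a; a = i; }
            else if (b == -1 || node[i].tansuat < node[b].tansuat) { b = i; }
        }
        int parent = static_cast<int>(node.size());
        node.push_back({'\0', a, b, -1, node[a].tansuat + node[b].tansuat, -1});
        node[a].nutcha = parent; node[a].isleft = 1;
        node[b].nutcha = parent; node[b].isleft = 0;
    }
    return node;
}

// Climbs from a leaf to the root, collecting bits in reverse order.
std::string codeFor(const std::vector<HNode>& node, int leaf) {
    std::string code;
    for (int i = leaf; node[i].nutcha != -1; i = node[i].nutcha)
        code += node[i].isleft ? '0' : '1';
    std::reverse(code.begin(), code.end());
    return code;
}

// Walks down from the root: 0 goes left, 1 goes right, leaves emit a character.
std::string decode(const std::vector<HNode>& node, const std::string& bits) {
    const int root = static_cast<int>(node.size()) - 1;
    std::string out;
    int cur = root;
    for (char b : bits) {
        cur = (b == '0') ? node[cur].contrai : node[cur].conphai;
        if (node[cur].contrai == -1) { out += node[cur].kytu; cur = root; }
    }
    return out;
}
```

For the five-letter example the array has 9 nodes with the root at index 8 (2*n-2), the leaf codes match the table given earlier, and the bit string 0101100 decodes to "ACD".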

    II.3.2.3 Algorithm flowcharts:

    II.3.2.3.1 Flowcharts of the program simulating the Shannon-Fano algorithm:

    - Flowchart of the main function:
    [Flowchart: Begin -> introduce the project -> enter the name of the file to encode -> check the file (branches: file error / file OK) -> call read to read the data in the file and gather the source statistics -> call sapxeptang to sort the source by frequency of occurrence -> call insapxep to encode the source and print the code table on the screen -> call thuchien to encode the source and report the compression ratio -> End.]

    - Flowchart of the Shannon-Fano encoding algorithm:
    [Flowchart, partially recovered: Begin -> array a (the source statistics) -> split array a into 2 parts (the first from 0 to i, the second from i+1 to n-1) so that the difference between the probability sums of the two parts is minimal -> test whether the array has a single element -> assign the value 0 to the code field of the element -> (remaining boxes not recoverable).]

    - Flowchart of the Huffman encoding algorithm:
    [Flowchart: Begin -> read the input string from a file -> count the occurrences of each character in the string -> store the characters and their frequencies in a file -> sort the characters by ascending frequency (function general) -> build the character code table from the Huffman tree (function make_table) -> output the encoded string -> End.]

    CHAPTER 4

    EXPERIMENTAL RESULTS AND REMARKS

    II.4.1 SIMULATION RESULTS

    Figure 11. Simulation screen of the Shannon-Fano code for example 1

    Figure 12. Frequency statistics and Shannon-Fano code table for example 1

    Figure 13. Compression ratio of the Shannon-Fano code for example 1

    Figure 14. Introduction screen of the Huffman coding project

    Figure 15. Control menu of the Huffman compression program

    Figure 16. Visual display of the Huffman compression information for example 1

    Figure 17. Introduction screen for compressing the file vidu7.txt with the Shannon-Fano code

    Figure 18. Data statistics for example 7

    Figure 19. Shannon-Fano compression result for example 7

    Figure 20. Interface of the Huffman compression program for example 7

    Figure 21. Data statistics for example 7 with the Huffman code

    Figure 22. Result of the Huffman compression program for example 7

    II.4.2 RESULTS

    Figure 23. Compression ratios of Shannon-Fano and Huffman for 10 different examples

    Series1: Shannon-Fano code (light columns)
    Series2: Huffman code (dark columns)

    Figure 24. Chart comparing the compression ratios of Shannon-Fano and Huffman

    II.4.3 REMARKS AND EVALUATION

    Advantages and disadvantages of the Shannon-Fano and Huffman coding methods

    Advantages:

    - Shannon-Fano and Huffman are lossless compression algorithms, so the data is intact after decompression.

    - Both algorithms compress data by encoding characters so that the more often a character occurs, the fewer bits its code word uses. The stored or transmitted data is therefore considerably smaller than the original.

    - Compression is fast because building the code table takes little time.

    - Neither algorithm uses much memory, and storage is simple.

    - The statistics table and the chart show that both codes compress text files very well, and that the larger the file, the higher the compression ratio (especially for the Huffman code).

    Disadvantages:

    - Compression is relatively good for text files but not effective for other kinds of files.

    - The code table must always accompany the data, so some space has to be set aside for it.

    - The input must always be scanned twice: once to read the data and build the code table, and once to encode it.

    - Small files usually do not compress as well as larger files.

    Evaluating the effectiveness of the Shannon-Fano and Huffman methods:

    For small and medium text files the compression ratios of the 2 codes differ little; the Huffman code usually compresses slightly better than Shannon-Fano, but not significantly.

    For text files containing many identical characters the compression ratios are equivalent.

    For relatively large text files the Huffman code is clearly superior, because the Huffman algorithm reduces the amount of information to be stored more effectively. To encode large text files one should therefore choose the Huffman code, which gives a much better compression ratio than the Shannon-Fano code.

    PART 3

    CONCLUSIONS AND FUTURE WORK

    By simulating data compression with the Shannon-Fano and Huffman codes, I was able to identify some of the strengths and weaknesses of these 2 algorithms through a comparison of the algorithms and their compression ratios. I also surveyed a number of other compression algorithms for different kinds of data, such as text, audio and video files. The achievements and the remaining limitations of this project are listed below.

    Achievements:

    - Successfully simulated the Shannon-Fano and Huffman codes with a C program.

    - Measured the compression ratio of each code, compared the 2 codes and drew conclusions.

    - Studied other compression algorithms.

    - Successfully compressed large text files.

    Limitations:

    - Data wider than 8 bits, such as Vietnamese text with diacritics or special characters, is not yet handled.

    - Data files other than text files cannot yet be compressed.

    - A general, visual comparison of all current compression codes has not been carried out, so no conclusion has been reached on which code is optimal or which is most widely used today.

    - Other kinds of data, such as folders, cannot yet be compressed.

    Future work:

    Within the time allotted for the project I completed the work described in the preceding sections. I will continue with some of the items listed under limitations, such as compressing other file formats and folders, and will study and improve the algorithms to make compression faster, the interface more attractive and the program easier to use in practice.
