Introduction to R

UNIVERSITY OF PIRAEUS, Department of Statistics and Insurance Science, Postgraduate Programme in Applied Statistics. Data Analysis Using Statistical Packages: Introduction to R. Lecture notes. Dimitrios Antzoulakos. Piraeus 2013 (1st edition 2008).

description

lessons for R language for statistics


Contents

2.1 Scalar objects
2.2 Vectors
2.3 Matrices
2.4 Arrays
2.5 Factors
2.6 Lists
2.7 Data frames
3.1 The function plot
3.3 The functions ts.plot, pairs, matplot
5.2.6 Cauchy
6.2.1 t-test, z-test
6.2.3 Bernoulli
6.2.4 Sign test
6.2.5 Wilcoxon Signed-Rank
6.2.6 Wald-Wolfowitz
6.2.7.1 Kolmogorov-Smirnov
6.2.7.3 Q-Q
6.3.5.1 Fisher
6.3.6 Q-Q
6.3.7 Kolmogorov-Smirnov
6.3.8 Sign test
6.3.9 Wilcoxon
6.3.9.1 Wilcoxon Signed-Rank
6.3.9.2 Wilcoxon Rank-Sum, Mann-Whitney U
6.5.1 ANOVA (k samples)
6.5.2 Kruskal-Wallis
6.5.3 Levene (k samples)

1 Introduction

1.1 What is R

R is a language and environment for statistical computing and graphics. R was initially written by Ross Ihaka and Robert Gentleman in the 1990s; since 1997 it has been developed by the R Development Core Team. R is open source software and part of the GNU project¹. R is an implementation of the S language. S was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by Rick Becker, John Chambers and Allan Wilks (starting in 1976). There are two engines of S: the old S engine (S version 3; S-PLUS 3.x and 4.x) and the new S engine (S version 4; S-PLUS 5.x and later). Most code written for S runs in R with few or no changes. The home page of R is http://www.r-project.org.

1.2 Installing R

R is downloaded from a mirror of CRAN², http://cran.r-project.org. On the page http://cran.r-project.org/, under Download and Install R, choose Download R for Windows and then base. This leads to the page of the current version of R (2.14.0 at the time of writing). Choose Download R 2.14.0 for Windows and save the file (for example on the Desktop).

¹ The GNU project was launched in 1984 to develop a complete Unix-like operating system made of free software: the GNU system. Variants of the GNU system that use the Linux kernel are widely used; they are often called Linux, but are more accurately called GNU/Linux. GNU is a recursive acronym for "GNU's Not Unix".
² Comprehensive R Archive Network.
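Run the installer R-2.14.0-win.exe and accept the proposed settings (including those on the Select Components screen). By default R is installed in C:\Program Files\R\R-2.14.0 (for version 2.14.0). R is started by running Rgui.exe³ in C:\Program Files\R\R-2.14.0\bin\i386, or from the shortcut that the installer creates in Windows.

³ GUI: Graphical User Interface.

[Screenshots: the CRAN pages "R-3.0.2 for Windows (32/64 bit)", "Download and Install R" and "R for Windows" (subdirectories base and contrib).]

1.3 Starting, using and quitting R

Starting R opens the RGui window, which has 7 menus and contains the R console. The prompt > shows that R is ready to accept a command. The console is cleared with Edit>Clear Console. A command is executed by pressing Enter.

> 2+3*4
[1] 14

If a command is incomplete, R shows the continuation prompt + until the command is completed. Longer programs are better written in the R Editor, which is opened with File>New script. Commands are sent from the editor to the console by selecting them (or placing the cursor on their line) and pressing Ctrl+R, or through the Edit menu (Run line or selection, Run all).

Previous commands can be recalled at the prompt with the up and down arrow keys. Home (End) moves the cursor to the start (end) of the line. Backspace and Delete remove characters. Graphs are drawn in a separate window (R Graphics). For example,

> curve(sin(x), from=0, to=2*pi)

draws the graph of sin x on the interval [0, 2π]. To quit R type

> q()

(q stands for quit) or choose File>Exit.

1.4 Help

Help on a function is obtained with help(name) or ?name.

> help(curve)
> ?curve

The help page shows, among other things, that curve belongs to the package graphics (see Section 1.6). When only part of a name is known, apropos("part of the name") lists the objects whose names contain it.

> apropos("cu")
 [1] "cummax"       "cummin"       "cumprod"
 [4] "cumsum"       "curve"        "cut"
 [7] "cut.Date"     "cut.default"  "cut.POSIXt"
[10] "cutree"       "dev.cur"      "icuSetCollate"
[13] "is.recursive" "occupationalStatus"

The function help.start opens the R help system in a web browser. Help is also available from the Help menu of RGui: for curve choose Help>R functions (text)..., type curve in the Help on box and press OK; for apropos choose Help>Apropos.... Examples of the use of a function are run with example(name), for instance example(log).

In R, 1.5e3 denotes 1.5 × 10^3 = 1500 and 1.5e-3 denotes 1.5 × 10^-3 = 0.0015. The R manuals are available as *.pdf files at http://cran.r-project.org/ (Manuals); The R Reference Index alone is about 3500 pages long.

1.5 Commands and objects

Names of objects are built from letters, digits, the dot (.) and the underscore (_). R is case sensitive, so b and B are different names. Commands are either expressions or assignments. An expression is evaluated and its value is printed:

> exp(1)+1
[1] 3.718282

An assignment stores a value in an object, whose value is printed by typing its name:

> x <- 7
> x
[1] 7

Several commands can be typed on one line, separated by ;.

1.6 Workspace and command history

When quitting with File>Exit, answering Yes in the Question dialog saves the workspace and the command history. They can also be saved at any time with savehistory(file=".Rhistory") and save.image(file=".RData").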
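In a later R session they are loaded with File>Load History... and File>Load Workspace..., respectively. The history file (*.Rhistory) is a text file that can be opened with any editor (Word, Notepad, etc.). The contents of the R console can be saved in a *.txt file with File>Save to File.... A script in the R Editor is saved as a .R file with Ctrl+S, or under another name with File>Save as.... A .R file is opened with File>Open script.

The objects in the workspace are listed with objects or ls. Objects are removed with rm(name_1, ..., name_k). All objects are removed with Misc>Remove all objects or with rm(list=ls()).

> ls()
[1] "z"

1.7 Packages

The functions and data sets of R are organised in packages. The installed packages are listed with library and the loaded ones with search. Information on a package is shown with library(help=name).

> library(help=survival)

A package is loaded with library(package_name) or with Packages>Load package.... Besides the packages distributed with R, many more are available. With an internet connection, a package is installed with Packages>Install package(s).... Packages are also listed at http://cran.r-project.org (Mirrors, CRAN, Packages, under Software). For example, qcc (Quality Control Charts) and spc (Statistical Process Control) are packages for statistical quality control. The page of a package offers its reference manual (*.pdf) and a *.zip file; once the *.zip file is saved in a directory, the package is installed with Packages>Install package(s) from local zip files. Installed packages are updated with Packages>Update packages....

[Screenshot: CRAN page of the package qcc (Quality Control Charts), version 1.3, published 2008-10-12, author and maintainer Luca Scrucca, licence GPL (≥ 2), depends on R (≥ 2.6).]

1.8 Operators and functions

Table 1.1: Arithmetic operators
+      addition
-      subtraction
*      multiplication
/      division
^      exponentiation
%/%    integer division
%%     remainder of integer division

Table 1.2: Logical and comparison operators
&      and
|      or
!      not
==     equal to
!=     not equal to
<, >   less than, greater than
<=, >= less than or equal to, greater than or equal to

Table 1.3: Mathematical functions
sqrt(x)                       square root
abs(x)                        absolute value
sin(x), cos(x), tan(x)        trigonometric functions (x in radians)
asin(x), acos(x), atan(x)     inverse trigonometric functions
factorial(x)                  x!
choose(n,x)                   binomial coefficient (n choose x)
exp(x)                        exponential function
log(x)                        natural logarithm
log(x,b)                      logarithm to base b (log2(x) and log10(x) for b = 2, 10)
gamma(x)                      gamma function
floor(x)                      largest integer not greater than x
ceiling(x)                    smallest integer not less than x
round(x, digits=n)            rounding to n decimal places
signif(x, digits=6)           rounding to 6 significant digits

1.9 Examples

The following examples illustrate the operators and functions above in R.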
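> 4/0;-3/0                              # plus and minus infinity
[1] Inf
[1] -Inf
> 0/Inf;0/0;exp(-Inf);Inf-Inf           # NaN (Not a Number)
[1] 0
[1] NaN
[1] 0
[1] NaN
> abs(-3.1);abs(5.7)                    # absolute value
[1] 3.1
[1] 5.7
> sqrt(81); 81^(1/2)                    # square root
[1] 9
[1] 9
> exp(1);exp(2)                         # exponential function
[1] 2.718282
[1] 7.389056
> log(exp(1));log(20)                   # natural logarithm
[1] 1
[1] 2.995732
> log2(64);log10(1000)                  # logarithms to base 2 and 10 (I)
[1] 6
[1] 3
> log(64,2);log(1000,10)                # logarithms to base 2 and 10 (II)
[1] 6
[1] 3
> floor(3.7);floor(-3.7);floor(3)       # floor
[1] 3
[1] -4
[1] 3
> ceiling(3.7);ceiling(-3.7);ceiling(3) # ceiling
[1] 4
[1] -3
[1] 3
> round(exp(1), digits=2)               # rounding
[1] 2.72
> signif(exp(1), digits=6)              # significant digits
[1] 2.71828
> 121%/%7; 121%%7                       # integer quotient and remainder
[1] 17
[1] 2
> sin(pi/6);cos(pi/6);tan(pi/6)         # trigonometric functions
[1] 0.5
[1] 0.8660254
[1] 0.5773503
> asin(sqrt(3)/2);pi/3;atan(1);pi/4     # inverse trigonometric functions
[1] 1.047198
[1] 1.047198
[1] 0.7853982
[1] 0.7853982

2 Data structures

2.1 Scalar objects

A scalar object holds a single value, which may be numeric, logical (TRUE, FALSE, NA) or a character string (for example "kostas", "maria"). The function mode returns the type of an object⁵; for a numeric object x, a logical object c and a character object m:

> mode(x); mode(c); mode(m)
[1] "numeric"
[1] "logical"
[1] "character"

2.2 Vectors

A vector is created with the function c⁶.

> v1 <- c(1,2,3,4)
> v1
[1] 1 2 3 4

Values can also be typed at the keyboard with scan. For the vector v2 = (5, 6, 7):

> v2 <- scan()
> v2
[1] 5 6 7

Arithmetic operations and functions are applied to vectors element by element, and several functions summarise a vector.

> v1/2;exp(v1);min(v1);max(v1);range(v2);length(v2);sum(v1); prod(v1); mean(v2); var(v1)
[1] 0.5 1.0 1.5 2.0
[1]  2.718282  7.389056 20.085537 54.598150
[1] 1
[1] 4
[1] 5 7
[1] 3
[1] 10
[1] 24
[1] 6
[1] 1.666667

(var(x) equals sum((x-mean(x))^2)/(length(x)-1), the sample variance of x.)

⁵ The class of an object can be numeric, integer, complex, logical, character, list, matrix, array, factor, ts, data.frame.
⁶ The name c comes from concatenate.

When two vectors of different lengths are combined, the elements of the shorter one are recycled. If the longer length is not a multiple of the shorter one, R issues a warning.

> v1+2*v2     # (1,2,3,4)+2*(5,6,7,5)
[1] 11 14 17 14
Warning message:
In v1 + 2 * v2 :
  longer object length is not a multiple of shorter object length
> v1/v2       # (1/5,2/6,3/7,4/5)
[1] 0.2000000 0.3333333 0.4285714 0.8000000
Warning message:
In v1/v2 :
  longer object length is not a multiple of shorter object length

The function c also joins vectors.

> c(v1,v2,v2^2,2*v1)
 [1]  1  2  3  4  5  6  7 25 36 49  2  4  6  8
> c(c(2,3), c(12,23,34))
[1]  2  3 12 23 34

Sequences of consecutive integers are created with the operator : . For example, 1:10 gives (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) and 10:1 (or rev(1:10)) gives (10, 9, 8, 7, 6, 5, 4, 3, 2, 1). The operator : is evaluated before the arithmetic operators.

> v3 <- 2*1:20
> v3
 [1]  2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40

More general sequences are created with seq.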
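> seq(1,30)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
[23] 23 24 25 26 27 28 29 30
> seq(-1,1,0.2)
 [1] -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0
> seq(2, by=0.5, length=12)
 [1] 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5
> seq(2,10,length=4)
[1]  2.000000  4.666667  7.333333 10.000000

Repeated values are created with rep.

> rep(v1, times=3)
 [1] 1 2 3 4 1 2 3 4 1 2 3 4
> rep(v1, each=3)
 [1] 1 1 1 2 2 2 3 3 3 4 4 4
> rep(v1, c(2,1,2,1))
[1] 1 1 2 3 3 4

A vector is sorted with sort. The function order returns the positions of the elements in increasing order of their values; for a vector v4 of length 6:

> order(v4)
[1] 1 2 5 6 4 3

The i-th element of a vector x is x[i]. For a vector x with x[i] = i/10:

> x[2]+2*x[3]     # 0.2+2*0.3
[1] 0.8
> x[2:5]
[1] 0.2 0.3 0.4 0.5
> x[c(4,6)]
[1] 0.4 0.6

An element is changed by assigning to x[i], through Edit>Data editor..., or with replace. The index inside [ ] may also be a logical condition, which selects the elements of x that satisfy it. For the vector z of the odd numbers from 1 to 19:

> z[z<5 | z>7]
[1]  1  3  9 11 13 15 17 19

Vectors may also hold character strings, and their elements are indexed in the same way.

> mode(v5)
[1] "character"

Time series are stored as objects of class ts.

> mode(t2);class(t2)
[1] "numeric"
[1] "ts"

Table 2.1: Functions for vectors
sum(x)            sum of the elements
cumsum(x)         cumulative sums
prod(x)           product of the elements
cumprod(x)        cumulative products
max(x), min(x)    maximum, minimum
sort(x)           sorted values
range(x)          (min(x), max(x))
length(x)         number of elements
mean(x)           mean
median(x)         median
var(x), sd(x)     variance, standard deviation
skewness(x)       skewness (package moments)
kurtosis(x)       kurtosis (package moments)
cor(x,y)          correlation coefficient
quantile(x)       quantiles
rank(x)           ranks
IQR(x)            interquartile range
cov(x,y)          covariance
summary(x)        summary statistics
mode(x)           type of the object
class(x)          class of the object

2.3 Matrices

2.3.1 Creating matrices

Matrices are created in R with the function matrix.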

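For example, for the matrix

    P = | 2 1 7 |
        | 4 5 3 |

stored in the object P:

> mode(P)
[1] "numeric"
> class(P)
[1] "matrix"

(In R 4.0.0 and later, class(P) returns "matrix" "array".)

> M
     [,1] [,2] [,3]
[1,]    2    1    2
[2,]    4    5    4

A single value r is recycled to fill the matrix:

> r <- 2
> matrix(r, nrow=2, ncol=3)
     [,1] [,2] [,3]
[1,]    2    2    2
[2,]    2    2    2

A vector becomes a matrix when its dimensions are set with dim. Rows and columns are bound together with rbind and cbind. Names are given to the rows and columns with dimnames.

> Q <- matrix(1:20, nrow=4)
> dimnames(Q) <- list(c("R1","R2","R3","R4"), c("C1","C2","C3","C4","C5"))
> Q
   C1 C2 C3 C4 C5
R1  1  5  9 13 17
R2  2  6 10 14 18
R3  3  7 11 15 19
R4  4  8 12 16 20

The same names can be set with rownames and colnames.

> rownames(Q) <- c("R1","R2","R3","R4")
> colnames(Q) <- c("C1","C2","C3","C4","C5")

The functions length and dim, among others, also apply to matrices.

> dim(Q); length(Q); nrow(Q); ncol(Q)
[1] 4 5
[1] 20
[1] 4
[1] 5
> colMeans(Q)
[1]  2.5  6.5 10.5 14.5 18.5
> colSums(Q)
[1] 10 26 42 58 74
> rowMeans(Q)
[1]  9 10 11 12
> rowSums(Q)
[1] 45 50 55 60

The element (i, j) of Q is Q[i,j], the i-th row is Q[i,] and the j-th column is Q[,j]. A negative index leaves out the corresponding rows or columns.

> Q[1,1]+Q[2,2]; Q[2,]; Q[,4]
[1] 7
[1]  2  6 10 14 18
[1] 13 14 15 16
> Q[-2,-4]
   C1 C2 C3 C5
R1  1  5  9 17
R3  3  7 11 19
R4  4  8 12 20

Matrices can be entered or edited with data.entry(name), edit(name) or Edit>Data editor..., which open the Data Editor of R, for example edit(Q).

2.3.2 Operations on vectors and matrices

Table 2.2: Operators and functions for vectors and matrices
%*%      matrix product
%o%      outer product
t        transpose
solve    inverse of a matrix, solution of a linear system
diag     diagonal of a matrix, diagonal matrix
eigen    eigenvalues and eigenvectors⁷

For the vectors u = (1, 2, 3) and v = (2, 3, 4):

> cbind(u)              # u as a column
     u
[1,] 1
[2,] 2
[3,] 3
> rbind(u)              # u as a row
  [,1] [,2] [,3]
u    1    2    3
> u%*%v                 # inner product
     [,1]
[1,]   20
> rbind(u)%*%cbind(v)   # the same product as row times column
   v
u 20
> rbind(u)%*%rbind(v)
Error in rbind(u) %*% rbind(v) : non-conformable arguments
> u%o%v                 # outer product
     [,1] [,2] [,3]
[1,]    2    3    4
[2,]    4    6    8
[3,]    6    9   12
> cbind(u)%*%rbind(v)   # (the same as outer(u,v,"*"))
     [,1] [,2] [,3]
[1,]    2    3    4
[2,]    4    6    8
[3,]    6    9   12

For a vector x, x%*%x is the inner product x'x. R decides whether a vector is treated as a row or as a column so that the product is defined.

> x%*%x
     [,1]
[1,]    6

For two matrices A, B and a vector w:

> A
     [,1] [,2] [,3]
[1,]    2    5    2
[2,]    3    7    4
[3,]    4    6    1
> B
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    5    1    7
[3,]    7    2    8
> w
[1] 1 2 3
> A*A        # element-by-element product (squares of A[i,j])
     [,1] [,2] [,3]
[1,]    4   25    4
[2,]    9   49   16
[3,]   16   36    1
> w%*%A
     [,1] [,2] [,3]
[1,]   20   37   13
> A%*%w
     [,1]
[1,]   18
[2,]   29
[3,]   19
> A%*%B
     [,1] [,2] [,3]
[1,]   41   15   61
[2,]   66   24   96
[3,]   41   20   70
> t(B)       # transpose
     [,1] [,2] [,3]
[1,]    1    5    7
[2,]    3    1    2
[3,]    5    7    8
> solve(A)   # inverse
           [,1]       [,2]       [,3]
[1,] -1.5454545  0.6363636  0.5454545
[2,]  1.1818182 -0.5454545 -0.1818182
[3,] -0.9090909  0.7272727 -0.0909091
> diag(A)    # diagonal of A
[1] 2 7 1
> diag(3)    # 3x3 identity matrix
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1
> diag(w)    # diagonal matrix with diagonal w
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    2    0
[3,]    0    0    3

⁷ λ is an eigenvalue of the matrix A, with corresponding eigenvector x ≠ 0, if Ax = λx. The eigenvalues are the roots of the equation |A − λI| = 0.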
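The definition of eigenvalues and eigenvectors, Ax = λx, can be checked numerically. The sketch below uses a hypothetical symmetric matrix S (not one of the matrices in these notes) and only base R functions; it also checks that the eigenvalues sum to the trace of S.

```r
# Hypothetical 2x2 symmetric matrix, chosen only for illustration
S <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

e <- eigen(S)
lambda <- e$values    # eigenvalues, in decreasing order
V <- e$vectors        # columns are unit-length eigenvectors

# Every column v of V should satisfy S v = lambda v
residual <- S %*% V - V %*% diag(lambda)
max(abs(residual))    # numerically close to 0

# The eigenvalues sum to the trace of S
all.equal(sum(lambda), sum(diag(S)))
```

The sign of each eigenvector is arbitrary, so comparisons are safer on S %*% V than on the entries of V themselves.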
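For a 2×2 matrix C:

> OBJ <- eigen(C)
> sum(OBJ$val)
[1] 10
> diag(OBJ$vec)
[1] -0.4472136 -0.7071068
> attributes(OBJ)
$names
[1] "values"  "vectors"
> C%*%OBJ$vectors[,2]
          [,1]
[1,] -2.828427
[2,] -2.828427
> OBJ$values[2]*OBJ$vectors[,2]
[1] -2.828427 -2.828427

The solution x = A^(-1) b of the linear system Ax = b is computed with solve(A,b). For example, the system

    2x + 3y + 4z = 20
         4y + 3z = 17
     x + 2y      =  5

has the solution (x, y, z) = (1, 2, 3). With A the matrix of coefficients and b the vector of right-hand sides:

> solve(A,b)
[1] 1 2 3

2.4 Arrays

Arrays generalise matrices to k ≥ 3 dimensions. They are created in R with the function array.

> ar <- array(c(1:10,21:30,41:44), dim=c(2,3,4))
> ar[,,1]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> ar[2,,1]
[1] 2 4 6
> ar[,2,1]
[1] 3 4
> ar[1,,]
     [,1] [,2] [,3] [,4]
[1,]    1    7   23   29
[2,]    3    9   25   41
[3,]    5   21   27   43
> ar[,2,]
     [,1] [,2] [,3] [,4]
[1,]    3    9   25   41
[2,]    4   10   26   42

The functions used for matrices also apply to arrays.

> length(ar); dim(ar); mode(ar); class(ar); nrow(ar); ncol(ar)
[1] 24
[1] 2 3 4
[1] "numeric"
[1] "array"
[1] 2
[1] 3
> dimnames(ar) <- list(c("R1","R2"), c("C1","C2","C3"), c("ar1","ar2","ar3","ar4"))
> ar
, , ar1
   C1 C2 C3
R1  1  3  5
R2  2  4  6

, , ar2
   C1 C2 C3
R1  7  9 21
R2  8 10 22

, , ar3
   C1 C2 C3
R1 23 25 27
R2 24 26 28

, , ar4
   C1 C2 C3
R1 29 41 43
R2 30 42 44

2.5 Factors

Categorical data are stored in R as objects of class factor. As an example, consider 20 measurements of log wbc. Each measurement is classified (variable lmh) as Lo if log wbc ≤ 2.1, Med if 2.1 < log wbc ≤ 3, and Hi if log wbc > 3. The variable lmh is categorical (a factor) and the values Lo, Med, Hi are its levels.

log wbc  2.3  1.4  0.6  2.1  2.2  2.4  2.3  2.6  1.0  2.7
lmh      Med  Lo   Lo   Lo   Med  Med  Med  Med  Lo   Med
log wbc  2.2  3.1  2.8  3.7  0.8  3.8  3.9  1.1  2.6  1.7
lmh      Med  Hi   Med  Hi   Lo   Hi   Hi   Lo   Med  Lo

In R, with the measurements in the numeric vector logwbc and the classification in the character vector v:

> v
 [1] "Med" "Lo"  "Lo"  "Lo"  "Med" "Med" "Med" "Med" "Lo"  "Med"
[11] "Med" "Hi"  "Med" "Hi"  "Lo"  "Hi"  "Hi"  "Lo"  "Med" "Lo"
> mode(v)
[1] "character"
> lmh <- factor(v)
> lmh
 [1] Med Lo  Lo  Lo  Med Med Med Med Lo  Med Med Hi  Med Hi  Lo
[16] Hi  Hi  Lo  Med Lo
Levels: Hi Lo Med
> mode(lmh)
[1] "numeric"
> class(lmh)
[1] "factor"

The vector v can also be built from logwbc by logical indexing (vector x): "Lo" is assigned to x[logwbc<=2.1], "Med" to x[logwbc>2.1 & logwbc<=3] and "Hi" to x[logwbc>3].

> x
 [1] "Med" "Lo"  "Lo"  "Lo"  "Med" "Med" "Med" "Med" "Lo"  "Med"
[11] "Med" "Hi"  "Med" "Hi"  "Lo"  "Hi"  "Hi"  "Lo"  "Med" "Lo"

The function tapply applies a function to the values of a vector separately for each level of a factor.

> logwbcmeans <- tapply(logwbc, lmh, mean)
> logwbcmeans
      Hi       Lo      Med
3.625000 1.242857 2.455556
> logwbcsum <- tapply(logwbc, lmh, sum)
> logwbcsum
  Hi   Lo  Med
14.5  8.7 22.1

When the levels have a natural order, an ordered factor is created with ordered.

> lmh <- ordered(lmh, levels=c("Lo","Med","Hi"))
> lmh
 [1] Med Lo  Lo  Lo  Med Med Med Med Lo  Med Med Hi  Med Hi  Lo  Hi  Hi  Lo  Med Lo
Levels: Lo < Med < Hi

2.6 Lists

A list is an ordered collection of objects (its components), which may be of different types. Lists are created with the function list. For a list LstA with the components father.name, mother.name and no.children:

> mode(LstA)
[1] "list"

Lists are joined with c.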
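For a second list LstB, with the components child.names (three character strings) and child.ages:

> LstB$child.ages
[1]  5 11 12
> Lst <- c(LstA, LstB)
> attributes(Lst)
$names
[1] "father.name" "mother.name" "no.children" "child.names" "child.ages"
> names(Lst)
[1] "father.name" "mother.name" "no.children" "child.names" "child.ages"

The i-th component of a list L is L[[i]] or L$object_name.

> length(Lst)
[1] 5
> Lst[[5]]
[1]  5 11 12
> Lst$child.ages
[1]  5 11 12
> Lst[[5]][1:2]
[1]  5 11
> Lst[5]         # a list holding the 5th component
$child.ages
[1]  5 11 12
> sum(Lst[[5]])
[1] 28
> sum(Lst$child.ag)
[1] 28
> sum(Lst[5])
Error in sum(Lst[5]) : invalid 'type' (list) of argument

A function is applied to every component of a list with lapply.

> lapply(LstA,length)
$father.name
[1] 1
$mother.name
[1] 1
$no.children
[1] 1
> lapply(LstB,mean)
$child.names
[1] NA
$child.ages
[1] 9.333333

Lists can also be edited with data.entry(name).

2.7 Data frames

A data frame is a table of data whose columns are variables, possibly of different types, and whose rows are observations. Data frames are created with the function data.frame.

> a <- c("male","female","male","female","male","male")
> b <- c(24,32,45,67,43,21)
> c <- c(181,167,178,170,175,2008)
> d <- c(81,55,75,74,78,95)
> df <- data.frame(SEX=a, AGE=b, HEIGHT=c, WEIGHT=d)
> df
     SEX AGE HEIGHT WEIGHT
1   male  24    181     81
2 female  32    167     55
3   male  45    178     75
4 female  67    170     74
5   male  43    175     78
6   male  21   2008     95
> mode(df); class(df)
[1] "list"
[1] "data.frame"
> names(df)
[1] "SEX"    "AGE"    "HEIGHT" "WEIGHT"
> mode(df$SEX); class(df$SEX)
[1] "numeric"
[1] "factor"
> dim(df); length(df)
[1] 6 4
[1] 4

(Since R 4.0.0, data.frame no longer converts character columns to factors by default, so class(df$SEX) is "character" unless stringsAsFactors=TRUE is given.)

The names of the columns are changed with names.

> names(df) <- c("Col1","Col2","Col3","Col4")
> df
    Col1 Col2 Col3 Col4
1   male   24  181   81
2 female   32  167   55
3   male   45  178   75
4 female   67  170   74
5   male   43  175   78
6   male   21 2008   95

Columns are added to a data frame with cbind and rows with rbind. With e a vector holding the smoking status of the six persons:

> DF <- cbind(df, SMOKE=e)

DF has the columns SEX, AGE, HEIGHT, WEIGHT and SMOKE.

A column (variable) of a data frame is referred to with $, for example df$SEX or DF$SMOKE. To refer to the columns by their names alone, the data frame is attached with attach.

> sum(WEIGHT)
Error: object "WEIGHT" not found
> attach(df)
> sum(WEIGHT)
[1] 458

Attached data frames appear in the output of search (see Section 1.7). A data frame is detached with detach.

> search()
 [1] ".GlobalEnv"        "df"                "package:stats"
 [4] "package:graphics"  "package:grDevices" "package:utils"
 [7] "package:datasets"  "package:methods"   "Autoloads"
[10] "package:base"

The rows of a data frame are sorted with order.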
> df[order(AGE,decreasing=TRUE),]
     SEX AGE HEIGHT WEIGHT
4 female  67    170     74
3   male  45    178     75
5   male  43    175     78
2 female  32    167     55
1   male  24    181     81
6   male  21   2008     95
> df[order(SEX,AGE),]
     SEX AGE HEIGHT WEIGHT
2 female  32    167     55
4 female  67    170     74
6   male  21   2008     95
1   male  24    181     81
5   male  43    175     78
3   male  45    178     75

Two data frames are merged with merge. With df1 a second data frame that holds two more persons and the additional column SMOKE:

> merge(df, df1, all=T)
     SEX AGE HEIGHT WEIGHT SMOKE
1 female  32    167     55  <NA>
2 female  33    163     71     N
3 female  67    170     74  <NA>
4   male  21   2008     95  <NA>
5   male  23    182     57     Y
6   male  24    181     81  <NA>
7   male  43    175     78  <NA>
8   male  45    178     75  <NA>

Elements, rows and columns of a data frame are selected as in matrices.

> df[2,2]
[1] 32
> df[1,]
   SEX AGE HEIGHT WEIGHT
1 male  24    181     81
> df[,"HEIGHT"]
[1]  181  167  178  170  175 2008
> df[SEX=="male",]
   SEX AGE HEIGHT WEIGHT
1 male  24    181     81
3 male  45    178     75
5 male  43    175     78
6 male  21   2008     95

The height 2008 is obviously wrong; the correct value is 192. Assigning to the attached variable HEIGHT changes only a copy, not the data frame:

> HEIGHT[6] <- 192
> HEIGHT
[1] 181 167 178 170 175 192
> df
     SEX AGE HEIGHT WEIGHT
1   male  24    181     81
2 female  32    167     55
3   male  45    178     75
4 female  67    170     74
5   male  43    175     78
6   male  21   2008     95

To change the data frame itself, the column HEIGHT of df must be modified.

> df$HEIGHT[6] <- 192
> df
     SEX AGE HEIGHT WEIGHT
1   male  24    181     81
2 female  32    167     55
3   male  45    178     75
4 female  67    170     74
5   male  43    175     78
6   male  21    192     95

Alternatively, edit(name) opens the data frame in the Data Editor of R.

2.8 Functions of Chapter 2

array, assign, attach, attributes
c, cbind, class, colnames, cor, cos, cov, cumprod, cumsum
data.entry, data.frame, detach, diag, dim, dimnames
edit, eigen
factor
IQR
lapply, length, list
matrix, max, mean, median, merge, min, mode
names, ncol, nrow
order, ordered, outer
prod
quantile
rank, range, rbind, rep, replace, rev, rownames
scan, seq, sd, solve, sort, sum, summary
tapply, ts
var

3 Graphics

3.1 The function plot

The commands demo(graphics), demo(persp) and demo(image) give an idea of the graphical capabilities of R. The basic plotting function of R is plot. The examples use the data set trees, which is distributed with R and is described in help(trees) or ?trees. The data sets available in R are listed with data.

> data()
> help(trees)

The contents of trees are shown below.

> trees
   Girth Height Volume
1    8.3     70   10.3
2    8.6     65   10.3
. . .
30  18.0     80   51.0
31  20.6     87   77.0
> mode(trees); class(trees)
[1] "list"
[1] "data.frame"

The first command below plots Height against the index of the observations; the second is the scatter plot of the pairs (Height, Volume).

> attach(trees)
> plot(Height)
> plot(Height,Volume)

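The two plots of trees can also be produced without attaching the data frame, by referring to the columns with $. This is a minimal sketch that assumes only the data set trees from the datasets package of R; the axis labels are set explicitly because the default labels would otherwise show the full expressions.

```r
# trees is distributed with R (package datasets)
dim(trees)                         # 31 observations, 3 variables

# Height against the index of the observations
plot(trees$Height, ylab = "Height")

# Scatter plot of the pairs (Height, Volume)
plot(trees$Height, trees$Volume, xlab = "Height", ylab = "Volume")
```

Referring to trees$Height avoids leaving an attached copy of the data frame on the search path.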
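The appearance of a graph is controlled by the arguments of plot and by the graphical parameters set with par. The most common ones are listed below.

Table 3.1: Arguments of plot
main="text"                 main title
sub="text"                  subtitle
xlab="text"                 label of the x axis
ylab="text"                 label of the y axis
xlim=c(a,b)                 limits of the x axis (interval [a, b])
ylim=c(a,b)                 limits of the y axis (interval [a, b])
cex=u                       magnification of symbols and text by the factor u
cex.axis, cex.lab,          magnification of the axis annotation, the axis labels,
cex.main, cex.sub           the main title and the subtitle
font=n                      font (n = 1, 2, ..., 5, e.g. 2 = bold, 3 = italic)
font.axis, font.lab,        font of the axis annotation, the axis labels,
font.main, font.sub         the main title and the subtitle
tck=a                       length of the tick marks (|a| < 0.05)
lab=c(n1,n2,n3)             approximate number n1 (n2) of tick marks on the x (y) axis; n3 is the length of the labels
las=n                       orientation of the axis labels (n = 0, 1, 2)
xaxt="n"                    the x axis is not drawn
yaxt="n"                    the y axis is not drawn
frame.plot=FALSE            no frame is drawn around the plot
lty=n                       line type (n = 1, 2, ..., 6)
lwd=a                       line width (a > 0)
asp=n                       aspect ratio y/x
log="w"                     logarithmic scale on the axis w (w = x, y)
log="xy"                    logarithmic scale on both the x and the y axis
pch="c"                     plotting character c (c = a, b, c, ..., A, B, C, ..., 1, 2, ...)
pch=n                       plotting symbol number n (n = 1, 2, 3, ...)
col="name"                  colour by name (the 657 names are listed by colors())
col=n                       colour by number (n = 1, 2, ..., 8)
col.axis, col.lab,          colour of the axis annotation, the axis labels,
col.main, col.sub           the main title and the subtitle
bg="name", bg=n             fill colour of the symbols pch=21:25; background colour in legend

Table 3.2: Values of the argument type
type="p"    points (default)
type="l"    lines
type="b"    both points and lines
type="c"    the lines of type="b" without the points
type="o"    points and lines overplotted
type="h"    vertical lines
type="s"    step function (horizontal segment first)
type="S"    step function (vertical segment first)
type="n"    nothing is plotted

The next example shows some of the values of type. The commands assume two numeric vectors x and y of equal length.

> par(mfcol=c(2,2))
> plot(x,y,main="type=''p'' (default)")
> plot(x,y,type="l", main="type=''l''")
> plot(x,y,type="b", main="type=''b''")
> plot(x,y,type="h", main="type=''h''")
> par(mfcol=c(1,1))

The command par(mfcol=c(2,2)) splits the graphics window into 2 × 2 panels, which are filled column by column. The command par(mfcol=c(1,1)) restores a single panel. Since par returns the previous values of the parameters it changes, they can be stored in an object (oldpar) and restored later.

> plot(x,y)
> plot(x,y, type="b", pch=23, bg="red", main="Title", xlab="x Axis",
+ ylab="y Axis", xlim=c(0,12), ylim=c(0,15), lab=c(13,4,7))

3.2 Low-level plotting functions

The function plot is a high-level function: it creates a new graph (and erases the previous one). Elements are added to an existing graph with low-level functions.

Table 3.3: Low-level plotting functions
points(x,y)               adds points
lines(x,y)                adds lines
text(x,y,label="abc")     writes "abc" at the point (x, y)
segments(x0,y0,x1,y1)     line segment from (x0, y0) to (x1, y1)
abline(a,b)               the line y = a + bx
title("abc")              adds the title "abc"
rug(x)                    marks the values of x on the axis
rect(x0,y0,x1,y1)         rectangle with opposite corners (x0, y0) and (x1, y1)
arrows(x0,y0,x1,y1)       arrow from (x0, y0) to (x1, y1)
legend(x,y,...)           legend at the point (x, y)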
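In the first example x is a vector of values of the horizontal axis, y1 holds the values of 200x e^(-0.04x) and y2 the values of 220x e^(-0.06x).

> y1 <- 200*x*exp(-0.04*x)
> y2 <- 220*x*exp(-0.06*x)
> plot(c(x,x),c(y1,y2), type="n", xlab="x",ylab="y")   #1
> lines(x,y1,lty=1,col="blue")                         #2
> lines(x,y2,lty=2,col="red", lwd=2)                   #3
> title(expression(f(x)==a*x*e^{-b*x}))                #4
> legend(75, 1500, c("a=200, b=0.04", "a=220, b=0.06"), lty=c(1,2),
+ col=c("blue","red"), lwd=c(1,2))                     #5

#1 sets up the axes without plotting anything (type="n"), so that both curves fit in the graph.
#2 adds the graph of f(x) = 200x e^(-0.04x) as a solid blue line (col="blue").
#3 adds the graph of f(x) = 220x e^(-0.06x) as a dashed (lty=2) red (col="red") line of width 2 (lwd=2).
#4 adds the title. The command title("f(x)=a*x*exp(b*x)") would print the text between the quotes literally; mathematical notation requires expression⁸.
#5 adds a legend with its top left corner at the point (75, 1500).

⁸ See help(plotmath), example(plotmath) and demo(plotmath).

The second example adds elements to a scatter plot of two vectors x and y.

> plot(x,y)
> plot(x,y, type="b", pch=23, bg="red", main="Title", xlab="x Axis",
+ ylab="y Axis", xlim=c(0,12), ylim=c(0,15), lab=c(13,4,7))
> grid(nx = NULL, ny = NA, col = "orange", lty = 2)   #1
> text(x,y,labels=y, pos=3, offset=1)                  #2
> points(10,2, pch=23, bg="red")                       #3
> text(11,10,label="Outlier")                          #4
> arrows(11,9.5,10.1,2.5,col=6, lwd=2)                 #5
> abline(0,1, lty=2, col="green")                      #6
> rect(0.5,0.5,8.5,13.5, lty=2)                        #7
> segments(0.5,5,8.5,5)                                #8

#1 adds grid lines at the tick marks of the x axis only (nx=NULL, ny=NA).
#2 writes the value of y next to every point (x, y) (labels=y), above the point (pos=3) and at some distance from it (offset=1).
#3 adds a point at (10, 2).
#4 writes the text Outlier at (11, 10).
#5 draws an arrow from (11, 9.5) to (10.1, 2.5), pointing at the point (x, y) = (10, 2).
#6 adds the line with intercept 0 and slope 1 (abline(a,b) draws y = a + bx).
#7 draws a rectangle.
#8 draws a horizontal line segment.

R also offers functions for interacting with a graph.

> plot(x,y)
> identify(x,y)                          #1
> text(locator(2), c("High", "Low"))     #2
> identify(x,y,labels=y)                 #3
> locator(5, type="l")                   #4

#1 the cursor becomes a + on the graph; clicking near a point prints its index next to it. The selection ends with Stop.
#2 writes the two texts at two positions chosen by clicking. Unlike identify(x,y), locator returns the coordinates of the position of the + cursor, not the nearest point.
#3 labels the selected points with their y values.
#4 joins the positions clicked (5 points) with line segments.

3.3 The functions ts.plot, pairs, matplot

Several time series are plotted on the same graph with ts.plot. R contains the time series (objects of class ts) mdeaths, fdeaths and ldeaths, which give the monthly deaths from lung diseases in the United Kingdom for males, females and both sexes, respectively, from 1974 to 1979.

> class(ldeaths); class(mdeaths); class(fdeaths)
[1] "ts"
[1] "ts"
[1] "ts"
> ts.plot(ldeaths,mdeaths,fdeaths,lty=1:3, xlab="Year", ylab="Deaths")
> legend(locator(1), leg.names, lty=1:3, bg="pink")

Here leg.names is a character vector with the three legend labels, and the legend is placed where the user clicks.

The data frame mtcars of R holds 10 design and performance characteristics, together with fuel consumption, of 32 cars (models of 1973-1974). A matrix of scatter plots for all pairs of variables is drawn with pairs. For the variables mpg (miles per gallon), hp (horsepower) and wt (weight in lb/1000):

> ?mtcars
> class(mtcars)
[1] "data.frame"
> attach(mtcars)
> pairs(cbind(mpg,hp,wt))

Since mtcars is a data.frame, the scatter plots of all its variables are also given by plot(mtcars).

The columns of one matrix are plotted against the columns of another with matplot. The object iris3 of R is an array with the measurements of 4 characteristics on 50 flowers from each of 3 species of iris (setosa, versicolor and virginica).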
> iris3
, , Setosa
      Sepal L. Sepal W. Petal L. Petal W.
 [1,]      5.1      3.5      1.4      0.2
 [2,]      4.9      3.0      1.4      0.2
 . . .
[50,]      5.0      3.3      1.4      0.2

, , Versicolor
      Sepal L. Sepal W. Petal L. Petal W.
 [1,]      7.0      3.2      4.7      1.4
 [2,]      6.4      3.2      4.5      1.5
 . . .
[50,]      5.7      2.8      4.1      1.3

, , Virginica
      Sepal L. Sepal W. Petal L. Petal W.
 [1,]      6.3      3.3      6.0      2.5
 [2,]      5.8      2.7      5.1      1.9
 . . .
[50,]      5.9      3.0      5.1      1.8

> class(iris3)
[1] "array"
> pet.length <- iris3[,3,]
> class(pet.length)
[1] "matrix"
> pet.width <- iris3[,4,]
> class(pet.width)
[1] "matrix"
> matplot(pet.length, pet.width, pch = c("+","x","o"), col=c(3,2,4))
> leg.names <- c("Setosa","Versicolor","Virginica")
> legend(locator(1), leg.names, pch = c("+","x","o"), col=c(3,2,4))

[Figure: pet.width against pet.length, with the symbols + (Setosa), x (Versicolor) and o (Virginica).]

3.4 Functions of Chapter 3

abline, arrows
colors
data, demo(graphics), demo(persp), demo(image)
expression
identify
legend, lines, locator
matplot
pairs, par, plot, points
rect, rug
segments
text, title, ts.plot

4 Descriptive statistics

4.1 Importing data

Data stored in a .txt file are read into R with read.table. As an example, the file Infants.txt is stored in C:/Documents and Settings/DLA/Desktop (68 observations on 9 variables). The contents of Infants.txt are read into the R object a as follows.

> a <- read.table("C:/Documents and Settings/DLA/Desktop/Infants.txt", header=TRUE)
> a
   Ethnic Age     Smoke PreWeight DelWeight BreastFed BthWeight BthLength TimeNut
1   White  29 NonSmoker       115       140        No      3310      45.0      99
2   Black  33 NonSmoker       112       126        No      2650      48.0      64
. . .
68  Black  19 NonSmoker       132       156        No      3360      51.0      84
> class(a)
[1] "data.frame"
> names(a)
[1] "Ethnic"    "Age"       "Smoke"     "PreWeight" "DelWeight" "BreastFed"
[7] "BthWeight" "BthLength" "TimeNut"

If Infants.txt is stored in the working directory (see Section 1.6), the name of the file is enough and the full path can be left out.

Data files created by other statistical packages are read with the functions of the package foreign; the result, stored here in the object b, is again a data frame.

> library(foreign)
> class(b)
[1] "data.frame"

The functions of foreign are described in its reference manual (http://cran.r-project.org/, Packages, Table of available packages, sorted by name, foreign, foreign.pdf).

Many data sets are distributed with R and its packages, and are loaded with data.
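The import with read.table can be tried without the file Infants.txt. The sketch below writes a small, hypothetical data frame (three invented rows, not the Infants data) to a temporary file with write.table and reads it back; tempfile stands in for a fixed path such as the Desktop folder used in the notes.

```r
# Hypothetical miniature data set (not the contents of Infants.txt)
d <- data.frame(Ethnic = c("White", "Black", "White"),
                Age    = c(29, 33, 19),
                Smoke  = c("NonSmoker", "NonSmoker", "LightSmoker"))

f <- tempfile(fileext = ".txt")         # temporary file instead of a fixed path
write.table(d, file = f, row.names = FALSE)

# header = TRUE: the first line of the file holds the variable names
a <- read.table(f, header = TRUE)
class(a)
names(a)

unlink(f)                               # remove the temporary file
```

With header = FALSE (the default when the first line has as many fields as the others) the names would be read as a row of data, so it is worth checking names(a) after every import.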
> data(HUMMER, package="UsingR")   # the data set HUMMER of the package UsingR
> HUMMER
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
2003                           2493 2654 2987 2837 3157 2837 3157
2004 1927 2141 2334 2268 1982 2175 2505 2404 2548 2554 2693 3814
2005 1864 1866 2220 1700 2964 6754 7476 6367 5806 5640 5991 8079
2006 5214 5645

Alternatively, library(UsingR) loads the package and makes its data sets available by name.

An R object is written to an ASCII file with write.table (see also dump and source). For example, the data frame b is saved as follows.

> write.table(b, file="C:/Documents and Settings/DLA/Desktop/car.txt")

4.2 Descriptive statistics: one variable

The variable Smoke of Infants.txt records whether the mother is a non-smoker (NonSmoker), a heavy smoker (HeavySmoker) or a light smoker (LightSmoker).

> attach(a)
> class(Smoke)
[1] "factor"
> length(Smoke)
[1] 68
> Smoke
 [1] NonSmoker   NonSmoker   NonSmoker   LightSmoker NonSmoker   NonSmoker
 [7] NonSmoker   LightSmoker HeavySmoker NonSmoker   NonSmoker   NonSmoker
 . . .
[67] NonSmoker   NonSmoker
Levels: HeavySmoker LightSmoker NonSmoker

Smoke is a categorical variable. Categorical data are usually displayed with bar charts and pie charts. The frequencies of the levels of Smoke are computed with table.

> table(Smoke)
Smoke
HeavySmoker LightSmoker   NonSmoker
         10           9          49

The bar chart is drawn with barplot.

> barplot(table(Smoke))
> barplot(table(Smoke), ylim=c(0,50), col=c(6,7,5), space=1)

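The frequency table and the bar chart can be reproduced without the data file. The sketch below rebuilds a factor from the counts reported above for Smoke (10, 9 and 49); the order of the observations is arbitrary, and the title is the one used later in the notes for the dot chart.

```r
# Factor rebuilt from the reported counts; the order of the values is arbitrary
Smoke <- factor(rep(c("HeavySmoker", "LightSmoker", "NonSmoker"),
                    times = c(10, 9, 49)))

counts <- table(Smoke)                  # absolute frequencies
counts
round(100 * counts / sum(counts), 1)    # relative frequencies in percent

barplot(counts, ylim = c(0, 50), col = c(6, 7, 5), space = 1,
        main = "Mother's Smoking Status")
```

Storing the table in an object avoids recomputing it for every plot that uses the same frequencies.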
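The same frequencies (stored in the object Smoke.counts) are displayed as a dot chart with dotchart and as a pie chart with pie.

> Smoke.counts <- table(Smoke)
> dotchart(Smoke.counts, main="Mother's Smoking Status")
> pie(table(Smoke))
> dotchart(table(Smoke))

The variable TimeNut of Infants.txt (time in minutes spent with a nutritionist) is numeric (measurement data). Its histogram is drawn with hist.

> class(TimeNut)
[1] "integer"
> mode(TimeNut)
[1] "numeric"
> hist(TimeNut)
> min(TimeNut)
[1] 32
> max(TimeNut)
[1] 128
> class <- seq(25,135,10)
> hist(TimeNut, breaks=class, include.lowest=TRUE, right=FALSE,
+ xlab="Time in minutes", main="Time spent with a Nutritionist",
+ xlim=c(20,140), ylim=c(0,14),labels=TRUE, col=c(2:8), las=1)

The classes are closed on the left and open on the right, except for the last one, which is closed on both sides (include.lowest=TRUE, right=FALSE). An estimate of the density of TimeNut is computed with density and can be added to the histogram. The frequency polygon is drawn with simple.freqpoly of the package UsingR.

> hist(TimeNut,prob=TRUE)
> lines(density(TimeNut), lwd=2)
> library(UsingR)
> simple.freqpoly(TimeNut, breaks=seq(25,135,10),
+ include.lowest=TRUE, right = FALSE)

The histogram of TimeNut for the level NonSmoker of the variable Smoke only is given by

> hist(TimeNut[Smoke=="NonSmoker"])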

A stem-and-leaf display is produced with stem (the argument scale controls the length of the display).

> stem(TimeNut, scale=2)

   3 | 2
   3 |
   4 |
   4 | 8
   5 | 0
   5 | 7
   6 | 0034
   6 | 556779
   7 | 01122
   7 | 6779
   8 | 01223444
   8 | 5556778
   9 | 000123
   9 | 668899
  10 | 0002222
  10 | 5555
  11 | 00
  11 | 5
  12 |
  12 | 5568

R also offers stripchart for dot plots. For the variable Age:

> stem(Age)
> stripchart(Age, method="stack", pch=16, col="blue", cex=2, offset=0.5)

A boxplot is drawn with boxplot. For the variable BthWeight:

> hist(BthWeight)
> boxplot(BthWeight, main="Weight of the infant \n in grams",
+ horizontal=TRUE)
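The quantities that boxplot draws can be inspected directly. The following is a small sketch on made-up weights (the vector `w` is hypothetical, not BthWeight); it uses fivenum and boxplot.stats, both in base R.

```r
# Hypothetical weights in grams, with one unusually large value
w <- c(2650, 2900, 3100, 3310, 3360, 3400, 3550, 3700, 5200)

fivenum(w)               # minimum, lower hinge, median, upper hinge, maximum

bs <- boxplot.stats(w)   # the numbers behind the boxplot
bs$stats                 # whisker ends, hinges and median
bs$out                   # points beyond 1.5 * (hinge spread) from the box

boxplot(w, horizontal = TRUE, main = "Hypothetical weights in grams")
```

Here the upper whisker stops at 3700 and the value 5200 is drawn as a separate point, because it lies more than 1.5 times the box length above the upper hinge.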

Stem-and-leaf display of Age (output of stem(Age)):

  1 | 67779999
  2 | 000000001111222223333344444
  2 | 55555555666677778888999
  3 | 012334
  3 | 556
  4 | 1

Numerical summaries of TimeNut are obtained with the functions listed in Table 2.1.

> mean(TimeNut)
[1] 86.17647
> median(TimeNut)
[1] 85.5
> range(TimeNut)
[1]  32 128
> diff(range(TimeNut))
[1] 96
> var(TimeNut)
[1] 374.9833
> quantile(TimeNut, 0.30, type=6)
30%
 77
> quantile(TimeNut, seq(0.1,0.9,0.1))
  10%   20%   30%   40%   50%   60%   70%   80%   90%
 63.7  69.4  77.0  82.8  85.5  90.2  98.0 102.0 106.5
> IQR(TimeNut)   # Q3-Q1
[1] 28.25
> summary(TimeNut)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  32.00   71.75   85.50   86.18  100.00  128.00

4.3 Descriptive statistics: two variables

4.3.1 Two categorical variables

Consider the categorical variables Smoke (Levels: HeavySmoker, LightSmoker, NonSmoker) and Ethnic (Levels: Black, Hispanic, White). The two-way contingency table of Smoke and Ethnic is built with table (frequencies), and prop.table converts it to relative frequencies.

> ct <- table(Smoke, Ethnic)
> prop.table(ct)   # ct/sum(ct)
             Ethnic
Smoke              Black   Hispanic      White
  HeavySmoker 0.04411765 0.01470588 0.08823529
  LightSmoker 0.05882353 0.00000000 0.07352941
  NonSmoker   0.26470588 0.16176471 0.29411765
> class(ct)
[1] "table"

A contingency table can also be entered directly as a matrix:

> x <- matrix(c(18,11,20, 4,0,5, 3,1,6), nrow=3, byrow=TRUE)
> dimnames(x) <- list(c("NonSmoker","LightSmoker","HeavySmoker"),
+ c("Black","Hispanic","White"))
> x
            Black Hispanic White
NonSmoker      18       11    20
LightSmoker     4        0     5
HeavySmoker     3        1     6
> ## alternatively
> ## rownames(x)=c("NonSmoker","LightSmoker","HeavySmoker")
> ## colnames(x)=c("Black","Hispanic","White")
> class(x)
[1] "matrix"

Marginal totals are obtained with margin.table and addmargins.

> margin.table(x,1)   # margin.table(ct,1)
  NonSmoker LightSmoker HeavySmoker
         49           9          10
> # 1: row totals, 2: column totals
> margin.table(x,2)   # margin.table(ct,2)
   Black Hispanic    White
      25       12       31
> addmargins(x)   # addmargins(ct)
            Black Hispanic White Sum
NonSmoker      18       11    20  49
LightSmoker     4        0     5   9
HeavySmoker     3        1     6  10
Sum            25       12    31  68
> prop.table(x,1)   # prop.table(ct,1)
                Black  Hispanic     White
NonSmoker   0.3673469 0.2244898 0.4081633
LightSmoker 0.4444444 0.0000000 0.5555556
HeavySmoker 0.3000000 0.1000000 0.6000000

Three-way contingency tables are built with table and printed in flat form with ftable.
> table(Smoke, Ethnic, BreastFed)
, , BreastFed = No

             Ethnic
Smoke         Black Hispanic White
  HeavySmoker     3        0     2
  LightSmoker     4        0     2
  NonSmoker      10        5     6

, , BreastFed = Yes

             Ethnic
Smoke         Black Hispanic White
  HeavySmoker     0        1     4
  LightSmoker     0        0     3
  NonSmoker       8        6    14

> ftable(table(Smoke, Ethnic, BreastFed))
                     BreastFed No Yes
Smoke       Ethnic
HeavySmoker Black               3   0
            Hispanic            0   1
            White               2   4
LightSmoker Black               4   0
            Hispanic            0   0
            White               2   3
NonSmoker   Black              10   8
            Hispanic            5   6
            White               6  14

Bar charts of a contingency table are drawn with barplot().

> barplot(x, legend.text=TRUE)
> # barplot(ct, legend.text=TRUE)
> barplot(x, beside=TRUE, col=rainbow(3), ylim=c(0,25))
> labs <- rownames(x)   # legend labels: one per row of x (assumed definition)
> legend(locator(1), labs, fill=rainbow(3))

dotchart() gives the corresponding dot chart.

> dotchart(x, main="Dotchart of Smoke vs Ethnic")

4.3.2 One categorical and one numerical variable

The relationship between the categorical variable Ethnic and the numerical variable BthLength can be examined with side-by-side boxplots, strip charts and summaries per level.

> plot(Ethnic, BthLength)   # Ethnic is a factor, so boxplots are drawn
> stripchart(BthLength~Ethnic, pch=1, method="jitter", vertical=TRUE)
> tapply(BthLength,Ethnic,summary)
$Black
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  41.00   47.00   49.00   49.16   52.00   55.00
$Hispanic
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  45.00   47.50   48.50   48.67   50.00   52.50
$White
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   45.0    50.0    51.0    50.9    52.0    60.0

The lattice package implements Cleveland's Trellis graphics concepts. Plots are specified with formulas of the form y ~ x | z (response ~ predictor | condition), or ~ x | z for a single variable; the argument subset restricts the observations used.

> library(lattice)
> bwplot(BthLength~Ethnic|Smoke, main="bwplot 1")
> bwplot(BthLength~Ethnic,subset=(Ethnic=="White"), main="bwplot 2")
> histogram( ~BthLength | Ethnic,
+ subset=(Ethnic=="Black")|(Ethnic=="White"), type="count",
+ nint=8, main="histogram")
> dotplot(BthLength~Ethnic|Smoke, main="dotplot")
> stripplot(Ethnic~BthLength|Smoke, jitter=TRUE, main="stripplot")
> xyplot(BthLength~Age|Smoke, main="xyplot")

4.3.3 Two numerical variables

The file BallParkData.txt holds data on the 30 baseball teams for 2001: League (American, National), ParkBlt (year the park was built), Capacity (capacity of the park), Attend (attendance in 2001) and WinPct (winning percentage). The file is read in R as follows:

> b <- read.table("BallParkData.txt", header=TRUE)   # assuming the file is in the working directory
> b
                   Team   League ParkBlt Capacity Attend WinPct
1        Anaheim-Angels American    1966    45050  24708  0.463
2     Baltimore-Orioles American    1992    48262  38686  0.391
.................................................................
30San-Francisco-Giants National200041341408770.556 > attach(b) > class(b) [1] "data.frame"> names(b) [1] "Team" "League" "ParkBlt""Capacity" "Attend" "WinPct" - . > AgePark plot(AgePark, Capacity, main="Full Data") > identify(AgePark, Capacity, labels=AgePark) [1]39 17 > plot(AgePark[c(-3,-9,-17)],Capacity[c(-3,-9,-17)], main="Reduced Data") R (2013)62 AttendWinPct , League, - ( ifelse). > plot(Attend,WinPct) > plot(Attend,WinPct,pch=as.character(League)) > legend(10000,0.65,legend=c("National","American"), pch=c("N", "A")) > plot(Attend,WinPct,pch=ifelse(League=="National",1,4)) > legend(10000,0.65,legend=c("National","American"), pch=c(1,4)) > library(lattice) > xyplot(WinPct~Attend|League) Attend(5data.frameb) Capacity (4 data.frame b) (League) - . R (2013)63 > LeagueA boxplot(LeagueA,main="American League") > LeagueN boxplot(LeagueN,main="National League") Attend,CapacityWinPct cor. sum-mary. > y is.matrix(y) [1] TRUE > cor(y) AttendCapacityWinPct Attend 1.0000000 0.1891696 0.4985423 Capacity 0.1891696 1.0000000 0.3586474 WinPct 0.4985423 0.3586474 1.0000000 > summary(y) Attend Capacity WinPct Min. : 7935 Min. :33871 Min. :0.3830 1st Qu.:23716 1st Qu.:41631 1st Qu.:0.4278 Median :32616 Median :46782 Median :0.5075 Mean :30062 Mean :47451 Mean :0.5001 3rd Qu.:36907 3rd Qu.:50352 3rd Qu.:0.5527 Max. :43362 Max. :66307 Max. :0.7160 Attend,CapacityWinPct plot ( pairs). - Attend1 0b b + = WinPct lm. R (2013)64 > plot(b[c(5,4,6)]) # > pairs(b[c(5,4,6)]) > lm(Attend~WinPct) Call: lm(formula = Attend ~ WinPct) Coefficients: (Intercept) WinPct 256054997 > lr plot(WinPct,Attend) > abline(lr)
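The pattern used above (fit with lm, keep the result, pass it to abline) can be sketched on made-up data. The vectors `x` and `y` below are hypothetical; the closed-form slope and intercept are computed only to show what lm returns in the simple linear case.

```r
# Hypothetical data, roughly linear in x
x <- c(1, 2, 3, 4, 5, 6)
y <- c(5.1, 7.9, 11.2, 13.8, 17.1, 19.9)

fit <- lm(y ~ x)                # least-squares line y = b0 + b1 * x
b1  <- cov(x, y) / var(x)       # closed-form slope
b0  <- mean(y) - b1 * mean(x)   # closed-form intercept

rbind(lm = coef(fit), by.hand = c(b0, b1))

plot(x, y)
abline(fit)                     # adds the fitted line to the scatter plot
```

Passing the fitted object to abline avoids typing the coefficients by hand and keeps the plot in step with the model.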

. > scatter.smooth(WinPct,Capacity,col="green3") > lines(smooth.spline(WinPct,Capacity),lty=2, col="tomato2") R (2013)65 4.4 4 addmargins, as.character, attach barplot, boxplot, bwplot data, density, diff, dotchart, dotplot, dump ftable hist ifelse, identify legend, lm, locator margin.table names pie, plot, prop.table read.table, read.spss scatter.smooth, smooth.spline, simple.freqpoly, source, stem, stripchart, simple.freqpoly, stripplot, subset table, tapply write.table xyplot R (2013) 67 5 5.1 R sample. > k1 sample(k1,size=10,replace=TRUE)# [1]8 1331 10 11916 18 > sample(k1,size=5,replace=FALSE)# [1] 159 1157 . - prob. > k p sample(k,size=15,prob=p,replace=TRUE) [1] 3 3 1 0 4 4 4 2 1 3 1 0 0 4 3 > sample(0:1,size=10,replace=TRUE,prob=c(0.32,1-0.32))# [1] 0 1 1 1 0 1 1 0 0 0# Bernulli sam-ple set.seed. > r set.seed(136); r1 set.seed(136); r2 help.search("distribution") ( stats).,- , ?_(_). R (2013) 68 > ?Cauchy(stats) . 5.1: R (Rname) etabetashape1, shape2, ncp Binomialbinomsize, prob Cauchycauchylocation, scale Chisquarechisqdf, ncp Exponentialexprate FDistfdf1, df2, ncp GammaDistgammashape, scale Geometricgeomprob Hypergeometrichyperm, n, k Lognormallnormmeanlog, sdlog Logisticlogislocation, scale NegBinomialnbinomsize, prob Normalnormmean, sd Poissonpoislambda TDisttdf, ncp Uniformunifmin, max Weibullweibullshape, scale Wilcoxonwilcox m, n Multinomialmultinomsize, prob d, p, q r R (Rname) , -,(..),(..),- . -dRname(x, ...)# .. x -pRname(q, ...)# .. q -qRname(p, ...)# o p- -rRname(n, ...)# n 5.2.1 Hn p (.) , ( p n B ) R (2013) 69 n x p pxnx fx n x..., , 2 , 1 , 0 , ) 1 ( ) ( = ||.|

\|=. > ?Binomial(stats).) 5 . 0 , 10 ( B - > n sum(dbinom(0:x,size=n,prob=p)) # F(5) [1] 0.6230469 > pbinom(x,size=n,prob=p)# F(5) [1] 0.6230469 > 1-pbinom(x,size=n,prob=p)# 1-F(5)=P(X>5) [1] 0.3769531 > pbinom(x,size=n,prob=p, lower.tail=FALSE)# P(X>5) [1] 0.3769531 > sum(dbinom((x+1):10,size=n,prob=p))# P(X>5)) [1] 0.3769531 ) 5 . 0 , 21 ( B > x1 x2 p1 p2 df colnames(df) df x P(X=x)x P(X=x) 1 0 0.00000048 11 0.16818810 2 1 0.00001001 12 0.14015675 3 2 0.00010014 13 0.09703159 4 3 0.00063419 14 0.05544662 5 4 0.00285387 15 0.02587509 6 5 0.00970316 16 0.00970316 7 6 0.02587509 17 0.00285387 8 7 0.05544662 18 0.00063419 9 8 0.09703159 19 0.00010014 109 0.14015675 20 0.00001001 11 10 0.16818810 21 0.00000048 - ) 7 / 4 , 5 ( B > par(mfcol=c(1,2)) > n points(k,heights,pch=16,cex=0.8, col="red") > # > m heights plot(m,heights,type="s",lab=c(10,10,7),tck=0, lwd=2, col="blue",+ main="B(5, 4/7): ") > # > m1 heights1 points(m1,heights1,pch=1, cex=0.8) > # > m2 heights2 points(m2,heights2, pch=16, cex=0.8, col="red") 5.2.2 H p(.) ( p G ) ... , 2 , 1 , 0 , ) 1 ( ) ( = = x p p x fx . > ?Geometric(stats). ) 2 . 0 ( G - ) (x f . 50 ) 2 . 0 ( G ecdf plot.stepfun. R (2013) 71 > par(mfrow=c(1,3)) > f barplot(f, names=as.character(0:20), xlab="x", ylab="f(x)", main="+ : G(0.2)") > rg1 rg2 plot(ecdf(rg1), verticals=TRUE, do.points=FALSE, main="+ ", sub=" 50 G(0.2)") > plot.stepfun(rg1, main=" ", sub="+ 50 G(0.2)") 5.2.3 2o(.) , (2o N ) < < ||.|

\||.|

\| = xxx f ,21exp21) (2o o t. > ?Normal(stats).) 4 , 10 ( N ) (x f , ) (x F - px . 1000 - ) 4 , 10 ( N ..) (x f) 4 , 10 ( N . R (2013) 72 > par(mfrow=c(2,2)) > curve(dnorm(x, mean = 10, sd = 2),from=4,to=16, xlab="x", ylab="f(x)", + main="Density function") > curve(pnorm(x, mean = 10, sd = 2),from=4,to=16, xlab="x", ylab="F(x)",+ main="Distribution function") > curve(qnorm(x, mean = 10, sd = 2),from=0,to=1, xlab="p",+ ylab=expression(x[p]), las=2, main="Quantiles") > y hist(y, breaks=2.5:17.5, prob=TRUE, ylim=c(0,0.25), xlab="x",+ ylab="Probability", main="Random numbers") > lines(seq(4,16,0.1),dnorm(seq(4,16,0.1), mean = 10, sd = 2)) ) ( k Z k P s s 3 , 2 , 1 = k , ) 1 , 0 ( ~ N Z . > par(mfrow=c(3,1)) #) 1 1 ( s s Z P> x y plot(x,y,type="l") > x1 =-1 & x y1 =-1 & x polygon(x1,y1,col="gray60") > pnorm(1)-pnorm(-1) [1] 0.6827 > options("digits"=4) > text(0,0.1,label="68.26%") > text(2,0.35,label="N(0,1)") > abline(h=0) >> x y plot(x,y,type="l") > x1 =-2 & x y1 =-2 & x polygon(x1,y1,col="gray60") > pnorm(2)-pnorm(-2) [1] 0.9545 > options("digits"=4) > text(0,0.1,label="95.45%") > text(2,0.35,label="N(0,1)") > abline(h=0) >> x y plot(x,y,type="l") > x1 =-3 & x y1 =-3 & x polygon(x1,y1,col="gray60") > pnorm(3)-pnorm(-3) [1] 0.9973 > options("digits"=4) > text(0,0.1,label="99.73%") > text(2,0.35,label="N(0,1)") > abline(h=0) R (2013) 74 ) 100 , 100 ( N ) ( ) ( k Z k P k X k P s s = + s s o o 3 , 2 , 1 = k > options("digits"=6) > mu ?Exponential(stats). 100 4 = -- . > rse par(fig=c(0,1,0,.35)) > boxplot(rse, horizontal=TRUE, xlab="Exponential Sample") > par(fig=c(0,1,0.25,1), new=TRUE) > # y > tmp.hist tmp.hist #tmp.hist$densities ........................... $density [1] 0.1600000 0.1000000 0.0850000 0.0700000 0.0300000 0.0200000 [7] 0.0150000 0.0000000 0.0000000 0.0100000 0.0000000 0.0050000 [13] 0.0050000 ............................. 
> tmp.dens tmp.dens [1] 0.2 > y.max # > hist(rse, ylim=c(0,y.max), prob=TRUE, breaks="FD", col=gray(0.9), main="Exp(1/5)") > x1 y1 lines(x1,y1) > rug(rse) R (2013) 75 5.2.5 a b(.) , ( b a G ) - 0 ,) (1) (/ 1>I= x e xa bx fb x aa. > ?GammaDist(stats)- . ) 1 , 2 ( G . > x plot(x,dgamma(x,shape=2,scale=1), type="l", col="blue", xlab="x",+ ylab="f(x)", main=" : G(a,b)") > lines(x,dgamma(x,shape=2,scale=2), col="red", lty=2) > lines(x,dgamma(x,shape=2,scale=4), col="green", lty=3) > lines(x,dgamma(x,shape=2,scale=8), col="brown", lty=4) > legend(x=10,y=.3,paste("a = 2, b =", c(1,2,4,8)),lty=1:4,+ col=c("blue","red","green","brown")) R (2013) 76 5.2.6 Cauchy Cauchy a b(.) , ( b a C ) - < < += xb a x bx f ,) ] / ) [( 1 (1) (2t. Cauchy > ?Cauchy(stats). ) 10 , 100 ( C . > x rsc plot(x,dcauchy(x,100,10), type="l", xlab="x", ylab="f(x)",+ main=" : C(100,10)") > points(rsc, rep(0,length(rsc))) 5.2.7 ) ..., , , (2 1 kX X X = ' X n , 1p , 2p ,..., kp (. kp p p n M ..., , , , (2 1)) kX X X ..., , ,2 1 n x p p p px x xnx x x fi ixkx xkkk= = = , 1 ,! ! !!) ,..., , (2 12 12 12 1. > ?Multinomial(stats).) 3 . 0 , 5 . 0 , 2 . 0 , 20 ( M R (2013) 77 > n results results [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [1,]742734769 4 1 4 5 4 [2,]9 1178 14 106 1091010 71212 [3,]45 11536742 6 9 9 3 4 ..................................................................

[,999] [,1000] [1,]4 3 [2,] 1010 [3,]6 7 > mean(results1[2,])# X2 [1] 9.918 > hist(results[2,], prob=TRUE)# X2 5.2.8 ) , (2 1X X = ' X 1 , 2 ,,21o ,22o 1X , 2X R x x T x x f e|.|

\|=2 1222 12 1, ,21exp1 21) , ( o to, R (2013) 78 . )] )( ( 2 ) ( ) ( [12112 2 1 1 1222 22121 122 212222122 211 1222 2211 122 o o oo o oo o o o + =(((

||.|

\| ||.|

\| ||.|

\| +||.|

\| =x x x xx x x xT ( function, persp, contour image) > m1 ) ( z Z P >-two.sided } : { } : {2 / 2 / a ar r r r r r ' > s )} ( ), ( min{ 2 z Z P z Z P > s ar ar' R a r R Pa s s ) (a r R Pa s ' > ) ( . Ro runs.testlawstat( tseries). H runs.test runs.test(x, alternative=c("two.sided", positive.correlated", "negative.correlated") x: R (2013) 99 6.10 ( ) 6.3 30 iX( 30 ..., , 2 , 1 = i ) 1 . 0 = a . () 25.25, 25 . 25 iX , iY > < =. 0 25 . 25 , 10 25 . 25 , 1iiiXXY i iX 25 . 25 iXiY i iX 25 . 25 iXiY126.180.9311626.240.991 225.300.0511725.460.211 325.18-0.07-11825.01-0.24-1 424.54-0.71-11924.71-0.54-1 525.14-0.11-12025.270.021 625.440.1912124.22-1.03-1 724.49-0.76-12224.49-0.76-1 825.01-0.24-12325.680.431 925.12-0.13-12426.010.761 1025.670.4212525.500.251 1124.22-1.03-12625.840.591 1226.481.2312726.090.841 1323.97-1.28-12825.21-0.04-1 1425.830.5812926.040.791 1525.05-0.20-13025.23-0.02-1 1Y , 2Y ,..., 30Y 151 = n 1,152 = n -118 = r . 12 ,2 1> n n . Z 3 0.74322335) 1 15 15 ( ) 15 15 () 15 15 15 15 2 ( 15 15 2115 1515 15 218) 1 ( ) () 2 ( 21222 122 12 1 2 1 2 12 12 1= + + |.|

\|++ = + + ||.|

\|++=n n n nn n n n n nn nn nrz0.4573465 0.2286732 2 )} ( ), ( min{ 2 = = > s = z Z P z Z P value p . ( ). R > library(lawstat) > cement attach(cement); names(cement) [1] "x" > runs.test(x, plot.it=TRUE) R (2013) 100 Runs Test - Two sided data:xStandardized Runs Statistic = 0.7432, p-value = 0.4573 A A A A A A A A A A A A A A A0 5 10 15 20 25 30-0.4-0.20.00.20.4B B B B B B B B B B B B B B B runs.test lawstat - . : > xnew runs.test(xnew) Runs Test - Two sided data:xnewStandardized Runs Statistic = 0.7432, p-value = 0.4573 - 1..., , ,2 1 nX X X 2..., , ,2 1 nY Y Y - . ) ( ) ( : ) ( ) ( :1 0x F x F H x F x F HY X Y X= =( 0H x 1H ) x) (x FX ) (x FY - XY . , . 2 1n n + a R (2013) 101 b . R - a b. R . - a } : {ar r r s ar a r R Pa s s ) ( . 6.2.7 - : . - . 6.2.7.1 Kolmogorov-Smirnov nX X X ..., , ,2 1 X ) (x F . - ) ( ) ( : ) ( ) ( :0 1 0 0x F x F H x F x F H = =( 0H x 1H ) x) (0x F - Kolmogorov-Smirnov. nD |} ) ( ) ( {| sup0x F x S Dnxn = < < ) (x Sn . nD . a } : {; a n n nD d d > nd nD , a nD, a - nD .value p ) (n nd D P > . nD ,,- nD . , = = s12 12 2) 1 ( 2 1 ) / ( limid i inne n d D P =s =nii nx X Inx S1) (1) ( R (2013) 102 value p () 100 > n = ~||.|

\|s = s = > = 12 12 2) 1 ( 2 1 ) ( 1 ) (id ni i nn n n n nnend nD P d D P d D P value p . Kolmogorov-Smirnov() . ) ( ) ( : ) ( ) ( :0 , 1 0 0x F x F H x F x F H > =+, ) ( ) ( : ) ( ) ( :0 , 1 0 0x F x F H x F x F H s =, ( 0H x + , 1H , 1H x ) , , )] ( ) ( [ sup0x F x S Dnxn = < < +, )] ( ) ( [ sup0x S x F Dnxn = < < ( } , max{ +=n n nD D D ). , , } : {;+ + +>a n n nD d d , } : {; >a n n nD d d +nd nd +nD nD . +a nD, a nD, a +nD nD , . value p , , ) (+ +>n nd D P) ( >n nd D P . Kolmogorov-Smirnov - ) (0x F -.- 2 . Kolmogorov-SmirnovR ks.test. H ks.test ks.test(x, "string", ... , exact = TRUE or FALSE) x: "string": (.. pnorm, pt, punif) ... : exact: p-value (TRUE) (FALSE) 6.11 (ks.test - ) 5, 6, 7, 8, 9 ) 1 , 7 ( N- . R (2013) 103 > x ks.test(x, "pnorm", mean=7, sd=1, exact = FALSE) One-sample Kolmogorov-Smirnov test data:xD = 0.2413, p-value = 0.9328 alternative hypothesis: two-sided------------------------------------------------------------------------- > ks.test(x, "pnorm", mean=7, sd=1, exact = TRUE) One-sample Kolmogorov-Smirnov test data:xD = 0.2413, p-value = 0.8704 alternative hypothesis: two-sided Kolmogorov-Smirnov- ) 5 , 0 ( U . > x ks.test(x, "punif", min=0.5, max=4.5, exact = FALSE) One-sample Kolmogorov-Smirnov test data:xD = 0.172, p-value = 0.00028 alternative hypothesis: two-sided------------------------------------------------------------------------- > ks.test(x, "punif", min=0, max=5, exact = FALSE) One-sample Kolmogorov-Smirnov test data:xD = 0.0963, p-value = 0.1241 alternative hypothesis: two-sided 1 . 0 = a , ) 5 . 4 , 5 . 0 ( U , ) 5 , 0 ( U . ) 1 , 7 ( N . , / -.. (- ). Lilliefors- R (2013) 104 Kolmogorov-Smirnov ( Lilliefors ). 6.12 (ks.test - ) Kolmogorov-Smirnov ) ( ) ( : ) ( ) ( :0 , 1 0 0x F x F H x F x F H > =+. - . 
value p .> x ks.test(x, "pnorm", mean=10.2, sd=1, alternative="gr", exact = FALSE) One-sample Kolmogorov-Smirnov test data:xD^+ = 0.1198, p-value = 0.01352 alternative hypothesis: the CDF of x lies above the null hypothesis------------------------------------------------------------------------ > ks.test(x, "pnorm", mean=10.3, sd=1, alternative="gr", exact = FALSE) One-sample Kolmogorov-Smirnov test data:xD^+ = 0.154, p-value = 0.0008155 alternative hypothesis: the CDF of x lies above the null hypothesis R (2013) 105 6.2.7.2 Komogorov-Smirnov .( ) Shapiro-Wilk. Shapiro-Wilk ( 50 < n ) 2000 < n(2000 > n - Kolmogorov-Smirnov -). shapiro.test. - . Komogorov-Smirnov, - . ks.test,- . -Komogorov-Smirnov,Lilliefors( MonteCarlo)value p .- lillie.test nortest. nortest4 . : -sf.test: Shapiro-Francia test -ad.test: Anderson-Darling test -cvm.test: Cramer-von Mises test -pearson.test: Pearson chi-square test 20 > n DAgostino-Pearson - ( fBasics, dagoTest). 6.13 (ks.test lillie.test) 7, 8, 9, 10, 11, 12, 13, 16, 20, 25 - . 
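> x <- c(7, 8, 9, 10, 11, 12, 13, 16, 20, 25)
> ks.test(x, "pnorm", mean=mean(x), sd=sd(x), exact=FALSE)
        One-sample Kolmogorov-Smirnov test
data:  x
D = 0.207, p-value = 0.7849
alternative hypothesis: two-sided
-------------------------------------------------------------------------
> shapiro.test(x)
        Shapiro-Wilk normality test
data:  x
W = 0.8942, p-value = 0.1889
-------------------------------------------------------------------------
> library(nortest)
> lillie.test(x)
        Lilliefors (Kolmogorov-Smirnov) normality test
data:  x
D = 0.207, p-value = 0.2602
-------------------------------------------------------------------------
> ad.test(x)
        Anderson-Darling normality test
data:  x
A = 0.4578, p-value = 0.2059
-------------------------------------------------------------------------
> sf.test(x)
        Shapiro-Francia normality test
data:  x
W = 0.8954, p-value = 0.1651
-------------------------------------------------------------------------
> cvm.test(x)
        Cramer-von Mises normality test
data:  x
W = 0.0764, p-value = 0.206
-------------------------------------------------------------------------
> pearson.test(x)
        Pearson chi-square normality test
data:  x
P = 4.4, p-value = 0.2214

None of the tests rejects normality at the usual significance levels. The largest p-value is the one reported by the Kolmogorov-Smirnov test, which is not appropriate when the parameters are estimated from the sample. The value of the test statistic (D = 0.207) is the same for the Kolmogorov-Smirnov and the Lilliefors tests; only the p-values differ.

6.2.7.3 Q-Q plots

A Q-Q plot is a graphical check of whether a random sample $X_1, X_2, \ldots, X_n$ comes from a distribution with distribution function $F(x)$. It consists of the points $(F^{-1}(p_i), x_{(i)})$, $i = 1, 2, \ldots, n$, where $x_{(1)} \le \cdots \le x_{(n)}$ are the ordered observations and

$p_i = (i - a)/(n + 1 - 2a)$, with $a = 3/8$ if $n \le 10$ and $a = 1/2$ if $n > 10$

($F^{-1}(p_i)$ is the $p_i$-quantile of $F$). If the points lie close to the line $y = x$, the sample is consistent with $F(x)$; systematic departures from the line indicate that the sample does not come from $F(x)$.

Q-Q plots are drawn with qqplot, whose basic call is

qqplot(x, y)
x: the quantiles $F^{-1}(p_i)$ of $F$
y: the sample (the observations $x_i$)

The probabilities $p_i$, $i = 1, 2, \ldots, n$, are returned by ppoints(n).

For the normal distribution the plot consists of the points $(\Phi^{-1}(p_i), x_{(i)})$, $i = 1, 2, \ldots, n$, and is drawn with qqnorm, whose basic call is

qqnorm(x)
x: the sample

After qqnorm(x), the call qqline(x) adds to the Q-Q plot the line through the points $(\Phi^{-1}(0.25), x_{0.25})$ and $(\Phi^{-1}(0.75), x_{0.75})$, where $x_{0.25}$ and $x_{0.75}$ are the sample quartiles.

Example 6.14 (Q-Q plot for G(2,2))
We generate 80 random numbers from the $G(2,2)$ distribution and draw (a) the normal Q-Q plot and (b) the Q-Q plot against the $G(2,2)$ distribution.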
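A comparison of this kind can be sketched as follows. This is a minimal, self-contained version and not necessarily the code of the notes: it assumes that G(2,2) means shape 2 and scale 2, the seed is arbitrary, and the names `y`, `p` and `q` are chosen here for illustration.

```r
n <- 80
set.seed(1)                             # arbitrary seed, for reproducibility only
y <- rgamma(n, shape = 2, scale = 2)    # sample (assumed parametrisation of G(2,2))

p <- ppoints(n)                         # p_i = (i - a)/(n + 1 - 2a); a = 1/2 since n > 10
q <- qgamma(p, shape = 2, scale = 2)    # theoretical quantiles F^{-1}(p_i)

par(mfrow = c(1, 2))
qqnorm(y); qqline(y)                    # (a) normal Q-Q plot with reference line
qqplot(q, y, main = "Q-Q plot against G(2,2)")   # (b) gamma Q-Q plot
abline(0, 1, lty = 2)                   # the line y = x
```

In panel (a) the points are expected to bend away from the reference line, since the gamma distribution is skewed to the right, while in panel (b) they should stay close to the line y = x.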
R (2013) 108 > par(mfrow=c(1,2)) > y qqnorm(y);qqline(y) > x qqplot(x, y, main="Gamma distribution with a=2, b=2") > a1 names(faithful) [1] "eruptions" "waiting"> par(mfrow=c(1,2)) > hist(eruptions, seq(1.6,5.2,0.2)) > long 3] > qqnorm(long);qqline(long) R (2013) 109 Q-Q > n x1 x2 plot(x1,x2); qqline(x2,lt=2) 6.16 ( Q-Q ) Q-Q (Pareto),(),(), (laplace), (). R (2013) 110 > library(VGAM) > par(mfcol=c(1,2)) > alpha = 3; k = 3; x = seq(3.001, 8, length=100) > y plot(x,y,type="l",main="Skewedright:Paretodensity,location=0& shape=3") > d par(mfcol=c(1,2)) > x y plot(x, y,type="l", main="Skewed left: Beta density, a=9 & b=1") > d par(mfcol=c(1,2)) > x y plot(x, y,type="l", main="Platykurtic: Uniform density with a=0 & b=100") > d library(VGAM) > par(mfcol=c(1,2)) > x y plot(x, y,type="l", main="Leptokurtic: Laplace density with loc=50 & scale=40") > d par(mfcol=c(1,2)) > x y plot(x,y,type="l",main="Mesokurtic:Normaldensitywithm=50& sd=15") > d = H H , 0 2 1 1 0 2 1 0: : o o = = H H , : 1 : 2221o o=() ||.|

\|+ =||.|

\| + + ||.|

\|+ =2 102 122 221 12 101 12) 1 ( ) 1 ( 1 1n nSY Xn nS n S nn nY XTpo o 22 1 +n nt 0H . t -T,a R (2013) 113 value p p value0 2 1 1: o < H } : {; 22 1a n nt t t + < ) ( t T P s0 2 1 1: o > H } : {; 22 1a n nt t t +> ) ( t T P >0 2 1 1: o = H } : { } : {2 / ; 2 2 / ; 22 1 2 1a n n a n nt t t t t t + +> < |)) | ( 1 ( 2 |) | ( 2 t T P t T P s = > - ( ). - - . - -, . , () 2221210nSnSY XT+ =o |||||.|

\|+ =2221210n nY Xo oo a () ( t w =z ) 0 2 1 1: o < H } : {az w w H } : {az w w >0 2 1 1: o = H } : { } : {2 / 2 / a az w w z w w > < 2 : 2221o o=() 2221210nSnSY XT+ =o vt 0H , R (2013) 114 1) / (1) / () / / (22222121212222 121++=nn snn sn s n sv . a value p - p value0 2 1 1: o < H } : {; at t tv < ) ( t T P s0 2 1 1: o > H } : {; at t tv> ) ( t T P >0 2 1 1: o = H } : { } : {2 / ; 2 / ; a at t t t t tv v> < |)) | ( 1 ( 2 |) | ( 2 t T P t T P s = >2 . 1 2 t.test(6.2.1).H t.test - t.test(x, y, mu=0, var.equal=FALSE TRUE) x: y: mu: 1-2 var.equal: 6.17 (t.test) 6.1(cm)60A .(cm)40 ( HEIGHT.txt) 176.6177.5166.8172.0175.0177.7178.5177.7178.6178.9 182.7175.7175.0176.4176.1178.7177.6179.1177.1175.3 180.3175.7182.6177.6173.8178.0171.4172.7175.7182.9 171.0178.4177.7178.4173.0174.9170.0172.8175.5177.4 0 : 0 :1 0= = B A B AH H 1 R : R (2013) 115 > HA HB attach(HA);names(HA) [1] "HEIGHTA" > attach(HB);names(HB) [1] "HEIGHTB" > t.test(HEIGHTA, HEIGHTB, mu = 0, var.equal = TRUE,+ alternative="two.sided", conf.level=0.95) Two Sample t-test data:HEIGHTA and HEIGHTBt = -2.5545, df = 98, p-value = 0.01217 alternative hypothesis: true difference in means is not equal to 095 percent confidence interval: -2.9347466 -0.3685867sample estimates: mean of x mean of y 174.6683176.3200 value p 95% - ( conf.level=0.95). p-value: > n1 = = =D DH H , 0 2 1 1 0 2 1 0: : o o = = = =D DH H , n SDTD/0o =D DS i i iY X D = , n i s s 1 . 1 nt 0H . t -T,a value p p value0 1: o DH } : {; 1 a nt t t> ) ( t T P >0 1: o =DH } : { } : {2 / ; 1 2 / ; 1 a n a nt t t t t t > < |)) | ( 1 ( 2 |) | ( 2 t T P t T P s = >- , . 
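For the paired-sample test described above, the critical region and the p-value always lead to the same decision. The sketch below illustrates this for the two-sided alternative with hypothesised difference 0; the vectors `x` and `y` and the level `a` are hypothetical values chosen for illustration.

```r
x <- c(12.1, 11.4, 13.0, 12.7, 10.9, 11.8)   # hypothetical first measurements
y <- c(11.6, 11.5, 12.2, 12.0, 10.7, 11.1)   # hypothetical second measurements
a <- 0.05                                    # significance level

d <- x - y                                   # differences D_i = X_i - Y_i
n <- length(d)
t.obs  <- mean(d) / (sd(d) / sqrt(n))        # observed value of T under H0
t.crit <- qt(1 - a/2, df = n - 1)            # upper a/2 point of t with n-1 d.f.
p.val  <- 2 * pt(abs(t.obs), df = n - 1, lower.tail = FALSE)

reject.by.region <- abs(t.obs) > t.crit      # decision from the critical region
reject.by.pvalue <- p.val < a                # decision from the p-value

c(t = t.obs, critical = t.crit, p.value = p.val)
c(region = reject.by.region, pvalue = reject.by.pvalue)
```

The two logical values agree by construction, because |t| exceeds the critical value exactly when the two-sided p-value is below a.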
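The test for $\mu_1 - \mu_2$ with dependent samples is carried out by t.test with the argument paired. The basic call of t.test in this case is

t.test(x, y, mu=0, paired=TRUE)
x: the first sample
y: the second sample
mu: the value of $\mu_1 - \mu_2$ under $H_0$

Example 6.18 (t.test)
The file WEIGHT.txt contains two paired weight measurements (in Kgr) for each of 20 persons: the variable WA (denoted $X$) and the variable WB (denoted $Y$).

 i   X_i   Y_i     i   X_i   Y_i
 1  81.3  81.8    11  77.3  81.0
 2  81.7  80.1    12  80.8  83.3
 3  89.6  84.2    13  71.5  70.0
 4  85.1  82.1    14  84.7  75.5
 5  83.3  75.8    15  75.4  72.4
 6  81.5  81.6    16  81.9  79.9
 7  80.6  76.1    17  79.8  79.0
 8  81.9  79.9    18  82.0  80.7
 9  69.5  70.9    19  94.3  91.9
10  80.2  81.2    20  77.0  76.3

To test $H_0: \mu_1 - \mu_2 = \mu_D = 0$ against $H_1: \mu_1 - \mu_2 = \mu_D \ne 0$ we proceed as follows:

> W <- read.table("WEIGHT.txt", header=TRUE)   # assuming the file is in the working directory
> attach(W);names(W)
[1] "WA" "WB"
> t.test(WA, WB, mu = 0, paired=TRUE, alternative="two.sided")
        Paired t-test
data:  WA and WB
t = 2.5302, df = 19, p-value = 0.02040
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.3084184 3.2615816
sample estimates:
mean of the differences
                  1.785

Direct computation of the p-value:

> D <- WA-WB
> n <- length(D)
> t <- mean(D)/(sd(D)/sqrt(n))
> cat(": p-value=", 2*pt(abs(t), df=n-1, lower.tail = FALSE))
: p-value= 0.02039696

The same result is obtained by applying the one-sample t.test to the differences $d_i = X_i - Y_i$, $i = 1, 2, \ldots, 20$.

> t.test(WA-WB, mu = 0, alternative="two.sided")
        One Sample t-test
data:  WA - WB
t = 2.5302, df = 19, p-value = 0.02040
alternative hypothesis: true mean is not equal to 0

6.3.3 Test for the correlation coefficient

Let $\rho$ be the correlation coefficient of the random variables $X$ and $Y$. We wish to test

$H_0: \rho = 0$ against $H_1: \rho < 0$, or $H_0: \rho = 0$ against $H_1: \rho > 0$, or $H_0: \rho = 0$ against $H_1: \rho \ne 0$.

The test is based on a random sample $(X_i, Y_i)$, $i = 1, \ldots, n$, from the joint distribution of $X$ and $Y$, assumed to be bivariate normal. The test statistic is

$T = \dfrac{R\sqrt{n-2}}{\sqrt{1-R^2}}$, where $R = \dfrac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2 \sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$

is the sample (Pearson) correlation coefficient. Under $H_0$, $T$ follows the $t_{n-2}$ distribution. If $t$ is the observed value of $T$, the critical regions at significance level $a$ and the p-values are:

Alternative        Critical region                                          p-value
$H_1: \rho < 0$    $\{t: t < -t_{n-2;a}\}$                                  $P(T \le t)$
$H_1: \rho > 0$    $\{t: t > t_{n-2;a}\}$                                   $P(T > t)$
$H_1: \rho \ne 0$  $\{t: t < -t_{n-2;a/2}\} \cup \{t: t > t_{n-2;a/2}\}$    $2P(T > |t|) = 2(1 - P(T \le |t|))$

In R the test is carried out with cor.test. The basic call of cor.test is

cor.test(x, y, method=c("pearson", "kendall", "spearman"))
x: the first sample
y: the second sample
method: the correlation coefficient to be used

Example 6.19 (cor.test)
For the data of Example 6.18 we test $H_0: \rho = 0$ against $H_1: \rho \ne 0$.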
R > W attach(W);names(W) [1] "WA" "WB" > cor.test(WA, WB, method="pearson", alternative = "two.sided", + conf.level = 0.9) Pearson's product-moment correlation data:WA and WBt = 6.1735, df = 18, p-value = 7.914e-06 alternative hypothesis: true correlation is not equal to 090 percent confidence interval: 0.6473194 0.9167935sample estimates: cor0.824146 p-value: > r t cat(": p-value=", 2*pt(abs(t), df=n-2, lower.tail = FALSE)) : p-value= 7.914368e-06 (),-t.test ( 6.3.1). 6.3.4 1..., , ,2 1 nX X X 2..., , ,2 1 nY Y Y ) , (21 1 o N) , (22 2 o N , . 0 22211 0 22210: : oooo< = H H , 0 22211 0 22210: : oooo> = H H , 0 22211 0 22210: : oooo= = H H , 222101SSF = 1 , 12 1 n nF 0H . f - R (2013) 120 F,a value p p value0 22211: oo< H } : {1 ; 1 , 12 1a n nF f f < ) ( f F P s0 22211: oo> H } : {; 1 , 12 1a n nF f f > ) ( f F P >0 22211: oo= H } : { } : {2 / ; 1 , 1 2 / 1 ; 1 , 12 1 2 1a n n a n nF f f F f f > < )} ( ), ( min{ 2 f F P f F P > sRvar.test . var.test var.test(x, y, ratio=r) x: y: ratio: x y default r=1 6.20 (var.test) ( ) 202 1= = n nX: 135, 193, 98, 160, 62, 80, 75, 142, 132, 57, 213, 100, 75, 76, 93, 73, 133, 90, 151, 56 Y: 131, 123, 117, 271, 85, 126, 99, 195, 200, 54, 107, 121, 131, 61, 79, 191, 206, 105, 125, 57 2221 12221 0: : o o o o = = H H R : > X Y var.test(X,Y, ratio=1) F test to compare two variances data:X and YF = 0.6376, num df = 19, denom df = 19, p-value = 0.3351 alternative hypothesis: true ratio of variances is not equal to 195 percent confidence interval: 0.2523864 1.6109706sample estimates: ratio of variances 0.6376418 R (2013) 121 p-value: > f df1 lillie.test(X);shapiro.test(X) Lilliefors (Kolmogorov-Smirnov) normality test D = 0.1846, p-value = 0.07256 Shapiro-Wilk normality test W = 0.9087, p-value = 0.06017 ------------------------------------------------------------------------- > lillie.test(Y);shapiro.test(Y) Lilliefors (Kolmogorov-Smirnov) normality test D = 0.2374, p-value = 0.004395 Shapiro-Wilk normality test W = 0.9132, p-value = 
0.07343 R (2013) 122 - . Levene . > library(lawstat) > V G levene.test(V, G, location="mean") classical Levene's test based on the absolute deviations from the mean ( none not applied because the location is not set to median ) data:VTest Statistic = 0.1616, p-value = 0.69 ------------------------------------------------------------------------- > levene.test(V, G, location="median") modified robust Brown-Forsythe Levene-type test based on the ab-solute deviations from the median data:VTest Statistic = 0.18, p-value = 0.6738 (var.test) . 6.3.5 6.3.5.1 Fisher 1..., , ,2 1 nX X X 2..., , ,2 1 nY Y Y ) , 1 (1p B ) , 1 (2p B ,. ==11niiX X ==21niiY Y - . 2 1 1 2 1 0: : p p H p p H < =( 0 : 0 :2 1 1 2 1 0< = p p H p p H ) 2 1 1 2 1 0: : p p H p p H > =( 0 : 0 :2 1 1 2 1 0> = p p H p p H ) 2 1 1 2 1 0: : p p H p p H = =( 0 : 0 :2 1 1 2 1 0= = p p H p p H ) xy XY 2 2 1xx n 1 1n 2yy n 2 2nk k N N R (2013) 123 Fisher - . } , min{ ..., }, , 0 max{ , ) ( ) | (1 22 11k n n k ikNi knini U P k Y X i X P =||.|

\|||.|

\|||.|

\|= = = = + = , U1 1n , 2n , k( ) , , (2 1k n n Hyper ). ) /( ) (2 1 1 1n n kn U E + = . O Fisher X ( ) - . , 2 1 1 2 1 0: : p p H p p H < = . value p p- value 2 1 1: p p H 12 1} , min{1) (=||.|

\|||.|

\|||.|

\|= >kNi kninx X Pk nx i 2 1 1: p p H =12 1} , min{} , 0 max{)] ( ) ( [12 =||.|

\|||.|

\|||.|

\|= s =kNi kninx X P i X P Ik nn k i Fisher -. 2_( 6.4.2) value p . - ) /( ) (2 1 1 1n n kn U E + =) /( ) (2 1 2 2n n kn U E + = 10 Fisher.FisherRfisher.test- fisher.test(m, or) m: 2x2 or: odds ratio [p1/(1- p1)]/[p2/(1- p2)] 6.22 ( Fisher) R (2013) 124 .1410- . 10414 2810 121224 2 1 1 2 1 0: : p p H p p H > = 1) 1 () 1 (: 1) 1 () 1 (:1 22 111 22 10> =p pp pHp pp pH 1p(2p ) o () . > m m [,1] [,2] [1,] 104 [2,]28 > fisher.test(m, alternative="greater", or=1) Fisher's Exact Test for Count Data data:mp-value = 0.01804 alternative hypothesis: true odds ratio is greater than 195 percent confidence interval: 1.435748Infsample estimates: odds ratio8.913675 p-value: > pvalue cat(": p-value=", pvalue, "\n") : p-value= 0.01803742 7 ) (1 = U E5 ) (2 = U E . Fisher 2 2( 6.4.2). 6.3.5.2 , 1n , 2n , R (2013) 125 ,1 1) 1 ( 2 12 1||.|

\|+ =n np pp pZ 2 12 2 1 12211 , , n np n p npnYpnXp++= = = . ( 1n , 2n ) 0H . z , - a value p p value2 1 1: p p H < } : {az z z < ) ( ) ( z z Z P u = s2 1 1: p p H > } : {az z z > ) ( 1 ) ( z z Z P u = >2 1 1: p p H = } : { } : {2 / 2 / a az z z z z z > < |) | 1 ( 2 |) | ( 2 z z Z P u = >)] / 1 ( ) / 1 )[( 2 / 1 ( | |2 1 2 1n n p p + > z , zz 0 2 1> p p||.|

\|+ ||.|

\|+ 2 12 12 11 1) 1 ( 1 121 n np pn np p0 2 1< p p||.|

\|+ ||.|

\|+ + 2 12 12 11 1) 1 ( 1 121 n np pn np p prop.test.Hprop.test : prop.test(x, n, correct=TRUE or FALSE) x: n: correct:p-value(TRUE) (FALSE) 6.23 ( ) 1000 , ,100120,. 1p 2p R (2013) 126 0 : 0 :2 1 1 2 1 0= = p p H p p H . R > prop.test(c(100,120), c(1000,1000), alternative="two.sided",+ conf.level=0.95, correct=FALSE) 2-sample test for equality of proportions without continuity correction data:c(100, 120) out of c(1000, 1000)X-squared = 2.0429, df = 1, p-value = 0.1529 alternative hypothesis: two.sided95 percent confidence interval: -0.0474114820.007411482sample estimates: prop 1 prop 20.10 0.12 p-value: > s1 0 :1= u H } : { } : {2 / 2 / a aw w w w w w ' > s aw aw' W a w W Pa s s ) (a w W Pa s ' > ) ( . W( ) Wilcoxon Rank-Sum 12) 1 () ( ,2) 1 () (+ +=+ +=m n nmW Varm n mW E . 12 / ) 1 (2 / ) 1 () () (*+ ++ + ==m n nmm n m WW VarW E WW R (2013)141 ) 10 ( , > m n ) 1 , 0 ( N 0H . * w * W , value p p value0 :1< u H *) ( *) * ( w w W P u = s0 :1> u H *) ( 1 *) * ( w w W P u = >0 :1= u H |) * (| 1 ( 2 |) * | * ( 2 w w W P u = >A (, ties) ||.|

\| + + + ++ + ===rjj jm n m nt tm nnmm n m WW VarW E WW13) 1 )( () (1122 / ) 1 () () (* ( m n, ) ) 1 , 0 ( N 0H . r , jt -r j s s 1 ( ). W . Mann-Whitney U -W (- ) 2) 1 ( + =m mW U . = =< =nimjj iY X I U1 1) ( , U iX 1Y , iX 2Y , ..., iX mY . 0H *) (12 / ) 1 (2 /) () (* Wm n nmnm UU VarU E UU =+ +== ( m n, ) ) 1 , 0 ( N .WilcoxonRank-Sum- . - (ordinal data). Wilcoxon Rank-Sum - location model (- R (2013)142 ) . t-test . RoWilcoxonRank-Sum wilcox.test.Hwilcox.test wilcox.test(y, x) x: y: wilcox.test . 6.29 (wilcox.test ) - () () . 11 - 12.611.211.49.413.212.0 15.414.114.013.411.3 . 0 : 0 :1 0> = u u H H . 9.411.211.311.4 12.0 12.6 13.2 13.4 14.014.115.41234567891011 6 = n , 5 = m , 41 11 10 9 8 4 3 = + + + + + = w , 26 2 / 6 5 41 = = u . 00831604 2.12 / ) 1 5 6 ( 5 62 / ) 1 5 6 ( 5 2612 / ) 1 (2 / ) 1 (* * =+ + + + =+ ++ + = =m n nmm n m wu w . 0.02230486 *) ( 1 = u = w value p . R > x y wilcox.test(y, x, alternative="greater") Wilcoxon rank sum test data:y and xW = 26, p-value = 0.02597 alternative hypothesis: true location shift is greater than 0 H26(Mann-WhitneyU) value p (50) ( correct TRUE).value p (exact=FALSE) (correct=FALSE) > wilcox.test(y, x, alternative="greater", exact=FALSE, correct=FALSE) Wilcoxon rank sum test data:y and xW = 26, p-value = 0.02230 alternative hypothesis: true location shift is greater than 0 p-value: > u n m ustar cat(": p-value =", 1-pnorm(ustar), "\n") : p-value = 0.02230486 Kolmogorov-Smirnov > ks.test(x, y, alternative="greater") Two-sample Kolmogorov-Smirnov test data:x and yD^+ = 0.8, p-value = 0.03047 alternative hypothesis: the CDF of x lies above that of y 05 . 0 = a , . 6.4 2 6.4.1 2 k iE ,k i s s 1 . : ) ( , , ) ( , ) ( :1 2 2 1 1 0H p E P p E P p E P Hk k = = = 0H R (2013)144 ( 12 1= + + +kp p p ). n - in iE ,k i s s 1( n n n nk = + + + 2 1). = = ====ki ii iki ii iki ii iee oxpected expected e observednpnp nU121212) ( ) ( ) ( 21 k_ 0H ( 5 >inpk i s s 1 ). a- } : {2; 1 a ku u> _u U .value p ) (21u Pk>_ . 
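The goodness-of-fit statistic U and its chi-square approximation with k - 1 degrees of freedom can be computed step by step. The sketch below uses made-up counts for six equally likely categories; `obs` and `p0` are hypothetical, and chisq.test is called only to confirm the hand computation.

```r
obs <- c(22, 17, 20, 26, 22, 13)   # hypothetical observed counts n_i
p0  <- rep(1/6, 6)                 # probabilities p_i specified by H0

n    <- sum(obs)
expd <- n * p0                     # expected counts n * p_i
all(expd >= 5)                     # rule of thumb for the approximation

u     <- sum((obs - expd)^2 / expd)   # observed value of U
p.val <- pchisq(u, df = length(obs) - 1, lower.tail = FALSE)

res <- chisq.test(obs, p = p0)     # the same test with the built-in function
c(by.hand = u, chisq.test = unname(res$statistic))
c(by.hand = p.val, chisq.test = res$p.value)
```

Checking the expected counts first is worthwhile, because chisq.test issues a warning when some of them are small, and a simulated p-value is then the safer choice.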
ip (s ip -) U (- n n pi i/ = ). U - 21 s k_ . R chisq.test. chisq.test chisq.test(x,p=c(p1,...,pk),simulate.p.value=TRUEorFALSE, B=2000) x: (n1,...,nk) p: (p1,...,pk) simulate.p.value: TRUE -MonteCarlop.value(default FALSE) B: 6.30 (chisq.test ) , 186 5 iE ,5 1 s s i , 89, 37, 30, 28 2 . : 02 . 0 , 18 . 0 , 2 . 0 , 2 . 0 , 4 . 0 :1 5 4 3 2 1 0H p p p p p H = = = = = 0H R, ) (i iE P p = , > x px chisq.test(x, p = px) Chi-squared test for given probabilities R (2013)145 data:xX-squared = 5.9519, df = 4, p-value = 0.2028 Warning message: In chisq.test(x, p = p) : Chi-squared approximation may be incorrect p-value: > u cat(": p-value =", 1-pchisq(u, df=length(x)-1), "\n") : p-value = 0.2027684 2_ - 5 >inp i( 5 1 s s i ). - , ( ),MonteCarloHope (1968) simulate.p.value. > sum(x)*px[1] 74.40 37.20 37.20 33.483.72 # np[5]=3.72 chisq.test(x, p = px, simulate.p.value = TRUE, B=2000) Chi-squared test for given probabilities with simulated p-value (based on 2000 replicates) data:xX-squared = 5.9519, df = NA, p-value = 0.2014 -,, ) ( ) ( : ) ( ) ( :0 1 0 0x F x F H x F x F H = = . , , (.. Kolmogorov-Smirnov 6.2.7.1). . 6.31 (chisq.test Poisson) 232 = n- SuperLeague # ( i )012345678 # (in ) 194960473218331 Hope, A. C. A. (1968). A simplified Monte Carlo significance test procedure, J. Roy, Statist. Soc. B 30, 582598. R (2013)146 Poisson 5 . 2 = .R:o in ,8 ..., , 1 , 0 = i , Obs1.) ( i X P pi= = ,7 0 s s i , ) 8 (8> = X P p) 5 . 2 ( ~ P X( Prob1). 1 ...8 1 0= + + + p p p . inp( xp1). - : ) ( , , ) ( , ) ( :1 8 8 1 1 0 0 0H p E P p E P p E P H = = = 0H iE ,7 0 s s i , i 8E - 8 . 0H - ) 5 . 2 ( P . > a Obs1 goals table(goals) goals 01234567819 49 60 47 32 18331> Prob1 ans1 Prob1 Exp1 Obs1 X=00.082084999 19.0437197 19 X=10.205212497 47.6092992 49 X=20.256515621 59.5116240 60 X=30.213763017 49.5930200 47 X=40.133601886 30.9956375 32 X=50.066800943 15.4978188 18 X=60.0278337266.45742453 X=70.0099406172.30622303 X>=8 0.0042466950.98523341 57 < np58 < np 2_ -. 
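We therefore merge the classes 6, 7 and 8 and test

$$H_0: P(E_0) = p_0,\ P(E_1) = p_1,\ \ldots,\ P(E_6) = p_6 \quad \text{versus} \quad H_1: H_0 \text{ is not true},$$

where $p_i = P(X = i)$, $0 \le i \le 5$, $p_6 = P(X \ge 6)$ and $X \sim P(2.5)$. The probabilities, the expected frequencies and the observed frequencies are stored in the vectors Prob, Exp and Obs.

> Prob <- c(Prob1[1:6], sum(Prob1[7:9]))
> Exp <- c(Exp1[1:6], sum(Exp1[7:9]))
> Obs <- c(Obs1[1:6], sum(Obs1[7:9]))
> ans <- data.frame(Prob, Exp, Obs)
> row.names(ans) <- c(paste("X=", 0:5, sep=""), "X>=6")
> ans
           Prob      Exp Obs
X=0  0.08208500 19.04372  19
X=1  0.20521250 47.60930  49
X=2  0.25651562 59.51162  60
X=3  0.21376302 49.59302  47
X=4  0.13360189 30.99564  32
X=5  0.06680094 15.49782  18
X>=6 0.04202104  9.74888   7

Now $np_i \ge 5$ for $0 \le i \le 6$, so the $\chi^2$ approximation can be used.

> chisq.test(x=Obs, p=Prob)

        Chi-squared test for given probabilities

data:  Obs
X-squared = 1.3919, df = 6, p-value = 0.9663

Direct calculation of the p-value:

> chi.obs <- sum((Obs - Exp)^2 / Exp)
> df <- length(Obs) - 1
> cat("p-value=", 1-pchisq(chi.obs, df), "\n")
p-value= 0.9663469

We do not reject the hypothesis that the number of goals follows the Poisson distribution with $\lambda = 2.5$. If $\lambda$ had been estimated from the data, the reference distribution would be $\chi^2_{k-2}$.

6.4.2 χ² test of independence

Let $X$ and $Y$ be two categorical variables with $r$ categories $(A_1, A_2, \ldots, A_r)$ and $k$ categories $(B_1, B_2, \ldots, B_k)$ respectively. We test whether $X$ and $Y$ are independent:

$$H_0: p_{ij} = p_{i\cdot}\, p_{\cdot j},\ 1 \le i \le r,\ 1 \le j \le k \quad \text{versus} \quad H_1: H_0 \text{ is not true},$$

where $p_{ij} = P(A_i \cap B_j)$ is the probability of the cell $(A_i, B_j)$, $p_{i\cdot} = P(A_i) = \sum_{j=1}^{k} p_{ij}$ is the marginal probability of $A_i$ and $p_{\cdot j} = P(B_j) = \sum_{i=1}^{r} p_{ij}$ is the marginal probability of $B_j$ ($\sum_{i=1}^{r} p_{i\cdot} = \sum_{j=1}^{k} p_{\cdot j} = \sum_{i=1}^{r}\sum_{j=1}^{k} p_{ij} = 1$).

From a random sample of $n$ pairs $(X_i, Y_i)$, $1 \le i \le n$, let $n_{ij}$, $1 \le i \le r$, $1 \le j \le k$, be the number of observations in the cell $(A_i, B_j)$ ($\sum_{i=1}^{r}\sum_{j=1}^{k} n_{ij} = n$). The data are arranged in a contingency table:

          B_1     B_2     ...   B_k     Total
A_1       n_11    n_12    ...   n_1k    n_1·
A_2       n_21    n_22    ...   n_2k    n_2·
...
A_r       n_r1    n_r2    ...   n_rk    n_r·
Total     n_·1    n_·2    ...   n_·k    n

The $rk$ frequencies $n_{ij}$, $1 \le i \le r$, $1 \le j \le k$, follow a multinomial distribution with parameters $n$ and $p_{ij}$, so that $E(n_{ij}) = np_{ij}$. The test statistic is

$$U = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{(n_{ij} - np_{ij})^2}{np_{ij}} = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{(\text{observed}_{ij} - \text{expected}_{ij})^2}{\text{expected}_{ij}} = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{(o_{ij} - e_{ij})^2}{e_{ij}},$$

which follows approximately the $\chi^2_{rk-1}$ distribution under $H_0$ (provided the $np_{ij}$ are at least 5). The rejection region at level $\alpha$ is $\{u : u > \chi^2_{rk-1;\alpha}\}$, where $u$ is the observed value of $U$, and the p-value is $P(\chi^2_{rk-1} > u)$.

Since the $p_{i\cdot}$ and $p_{\cdot j}$ are unknown, they are replaced in $U$ by their (maximum likelihood) estimates

$$\hat p_{i\cdot} = \frac{n_{i\cdot}}{n} = \frac{1}{n}\sum_{j=1}^{k} n_{ij},\ 1 \le i \le r, \qquad \hat p_{\cdot j} = \frac{n_{\cdot j}}{n} = \frac{1}{n}\sum_{i=1}^{r} n_{ij},\ 1 \le j \le k.$$

The statistic then becomes

$$U = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{\left(n_{ij} - \dfrac{n_{i\cdot}\, n_{\cdot j}}{n}\right)^2}{\dfrac{n_{i\cdot}\, n_{\cdot j}}{n}}$$

and follows approximately the $\chi^2_{rk-1-(r-1)-(k-1)}$ distribution, that is, the $\chi^2_{(r-1)(k-1)}$ distribution. In R the test is carried out with the function chisq.test.

The chisq.test function
chisq.test(m, correct=TRUE or FALSE, simulate.p.value=TRUE or FALSE, B = 2000)
m: r x k matrix with the observed frequencies
correct: for 2x2 tables, whether to apply the Yates continuity correction, default TRUE (used only when simulate.p.value=FALSE)
simulate.p.value: if TRUE, the p-value is computed by Monte Carlo simulation (default FALSE)
B: number of Monte Carlo replicates

Example 6.32 (chisq.test, test of independence)
A sample of 100 individuals is classified by sex and smoking habit:

          Non-Smoker   Light Smoker   Heavy Smoker
Male          28            8              22
Female        26            2              14

In R:

> sex.smoke <- matrix(c(28, 8, 22, 26, 2, 14), nrow=2, byrow=TRUE)
> rownames(sex.smoke) <- c("Male", "Female")
> colnames(sex.smoke) <- c("Non-Smoker", "Light Smoker", "Heavy Smoker")
> sex.smoke
       Non-Smoker Light Smoker Heavy Smoker
Male           28            8           22
Female         26            2           14
> chisq.test(sex.smoke)

        Pearson's Chi-squared test

data:  sex.smoke
X-squared = 2.9678, df = 2, p-value = 0.2267

Warning message:
In chisq.test(sex.smoke) : Chi-squared approximation may be incorrect

The object returned by the function (a list) contains the following components:

> a <- chisq.test(sex.smoke)
> names(a)
[1] "statistic" "parameter" "p.value"   "method"    "data.name"
[6] "observed"  "expected"  "residuals"
> attach(a)
> observed
       Non-Smoker Light Smoker Heavy Smoker
Male           28            8           22
Female         26            2           14
> expected
       Non-Smoker Light Smoker Heavy Smoker
Male        31.32          5.8        20.88
Female      22.68          4.2        15.12
> residuals      ## (observed - expected) / sqrt(expected)
       Non-Smoker Light Smoker Heavy Smoker
Male   -0.5932356    0.9135003    0.2451053
Female  0.6971345   -1.0734901   -0.2880329

The warning appears because one expected frequency $n_{i\cdot}\, n_{\cdot j}/n$ is smaller than 5 (it equals 4.2), so the $\chi^2$ approximation may not be satisfactory.

Test of homogeneity of r populations with respect to k categories. The data are arranged as follows:

                 Category
Population   1       2       ...   k       Total
1            n_11    n_12    ...   n_1k    n_1·
2            n_21    n_22    ...   n_2k    n_2·
...
r            n_r1    n_r2    ...   n_rk    n_r·
Total        n_·1    n_·2    ...   n_·k    n

Unlike the test of independence, where only the overall sample size $n$ is fixed, here the sample sizes $n_{i\cdot}$ drawn from each population are fixed in advance (for example, $r = 2$ populations classified into $k = 3$ categories). We test

$$H_0: p_{1j} = p_{2j} = \cdots = p_{rj} = p_j,\ j = 1, 2, \ldots, k \quad \text{versus} \quad H_1: H_0 \text{ is not true},$$

where $p_{ij}$ is the probability that an individual of population $i$ ($1 \le i \le r$) belongs to category $j$ ($1 \le j \le k$). For each $i$, $1 \le i \le r$, the $k$ frequencies $n_{i1}, n_{i2}, \ldots, n_{ik}$ follow a multinomial distribution with parameters $n_{i\cdot}$ and $p_{i1}, p_{i2}, \ldots, p_{ik}$. The test statistic is

$$U = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{(n_{ij} - n_{i\cdot}\, p_j)^2}{n_{i\cdot}\, p_j} = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{(\text{observed}_{ij} - \text{expected}_{ij})^2}{\text{expected}_{ij}} = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{(o_{ij} - e_{ij})^2}{e_{ij}},$$

which follows approximately the $\chi^2_{r(k-1)}$ distribution under $H_0$. The rejection region at level $\alpha$ is $\{u : u > \chi^2_{r(k-1);\alpha}\}$, where $u$ is the observed value of $U$, and the p-value is $P(\chi^2_{r(k-1)} > u)$.

Since the $p_j$ ($1 \le j \le k$) are unknown, they are replaced by their (maximum likelihood) estimates

$$\hat p_j = \frac{n_{\cdot j}}{n} = \frac{1}{n}\sum_{i=1}^{r} n_{ij},\ 1 \le j \le k.$$

The statistic then becomes

$$U = \sum_{i=1}^{r}\sum_{j=1}^{k}\frac{\left(n_{ij} - \dfrac{n_{i\cdot}\, n_{\cdot j}}{n}\right)^2}{\dfrac{n_{i\cdot}\, n_{\cdot j}}{n}}$$

and follows approximately the $\chi^2_{(r-1)(k-1)}$ distribution. In R the test is carried out with the function chisq.test.

Example 6.33 (chisq.test, test of homogeneity)
Independent samples of 100 males and 200 females were taken, and the 300 individuals were classified by grade:

          GOOD   VERY GOOD   EXCELLENT   Total
Male       50       30          20        100
Female     50       80          70        200
Total     100      110          90        300

We test whether the distribution of the grade is the same in the two populations. In R:

> m <- matrix(c(50, 30, 20, 50, 80, 70), nrow=2, byrow=TRUE)
> chisq.test(m)

        Pearson's Chi-squared test

data:  m
X-squared = 19.3182, df = 2, p-value = 6.384e-05

Direct calculation of the p-value:

> dimnames(m) <- list(SEX=c("Male", "Female"),
+                     GRADE=c("GOOD", "VERY GOOD", "EXCELLENT"))
> m
        GRADE
SEX      GOOD VERY GOOD EXCELLENT
  Male     50        30        20
  Female   50        80        70
> E <- outer(rowSums(m), colSums(m)) / sum(m)
> E
           GOOD VERY GOOD EXCELLENT
Male   33.33333  36.66667        30
Female 66.66667  73.33333        60
> chi.obs <- sum((m - E)^2 / E)
> cat("p-value =", 1-pchisq(chi.obs, 2), "\n")
p-value = 6.384253e-05

The hypothesis of homogeneity is therefore rejected.

Example 6.34 (chisq.test and prop.test)
In Example 6.23 of Section 6.3.5.2 the data were

            Successes   Failures   Total
Sample 1       100         900      1000
Sample 2       120         880      1000
Total          220        1780      2000

Let $p_1$ and $p_2$ be the proportions in the two populations. The test of $H_0: p_1 - p_2 = 0$ versus $H_1: p_1 - p_2 \ne 0$ can be carried out with prop.test. The argument correct can be FALSE or TRUE.

> prop.test(c(100,120), c(1000,1000), alternative="two.sided",
+ correct=TRUE)

        2-sample test for equality of proportions with continuity correction

data:  c(100, 120) out of c(1000, 1000)
X-squared = 1.8437, df = 1, p-value = 0.1745
alternative hypothesis: two.sided

The result agrees with that of Section 6.3.5.2 (with the continuity correction). The same result is obtained with chisq.test.

> m <- matrix(c(100, 120, 900, 880), nrow=2)
> chisq.test(m, correct=TRUE)

        Pearson's Chi-squared test with Yates' continuity correction

data:  m
X-squared = 1.8437, df = 1, p-value = 0.1745

With the Yates correction the statistic $U$ is

$$U = \sum_{i=1}^{2}\sum_{j=1}^{2}\frac{\left(\left|n_{ij} - \dfrac{n_{i\cdot}\, n_{\cdot j}}{n}\right| - 0.5\right)^2}{\dfrac{n_{i\cdot}\, n_{\cdot j}}{n}}.$$

> E <- outer(rowSums(m), colSums(m)) / sum(m)
> chi.obs <- sum((abs(m - E) - 0.5)^2 / E)
> cat("p-value=", 1-pchisq(chi.obs, 1), "\n")
p-value= 0.1745158

6.4.3 Test of equality of r proportions

Consider $r$ populations with proportions $p_1, p_2, \ldots, p_r$. We test

$$H_0: p_1 = p_2 = \cdots = p_r \quad \text{versus} \quad H_1: H_0 \text{ is not true}.$$

The data form a table with 2 columns and $r$ rows:

Population   Successes   Failures      Trials
1            x_1         n_1 - x_1     n_1
2            x_2         n_2 - x_2     n_2
...
r            x_r         n_r - x_r     n_r

The test statistic is

$$U = \sum_{i=1}^{r}\frac{(x_i - n_i\hat p)^2}{n_i\hat p(1 - \hat p)}, \qquad \hat p = \frac{1}{n}\sum_{i=1}^{r} x_i, \qquad n = n_1 + n_2 + \cdots + n_r,$$

which follows approximately the $\chi^2_{r-1}$ distribution under $H_0$. Here $\hat p$ estimates the common proportion $p$ from all $r$ samples, and $U$ coincides with the $\chi^2$ statistic of the table above. The rejection region at level $\alpha$ is $\{u : u > \chi^2_{r-1;\alpha}\}$, where $u$ is the observed value of $U$, and the p-value is $P(\chi^2_{r-1} > u)$.

In R the equality of $r$ proportions is tested with prop.test (or with chisq.test). The syntax of prop.test is:

prop.test(x, n, correct=TRUE or FALSE)
x: vector with the numbers of successes
n: vector with the numbers of trials
correct: p-value with (TRUE) or without (FALSE) continuity correction

Example 6.35 (prop.test, r Bernoulli populations)
Four groups of patients (397 patients in total) were classified as smokers or non-smokers:

Group   Smokers   Non-smokers   Patients
1          83          3            86
2          90          3            93
3         129          7           136
4          70         12            82

We test whether the proportion of smokers is the same in the 4 populations. In R:

> smokers <- c(83, 90, 129, 70)
> patients <- c(86, 93, 136, 82)
> prop.test(smokers, patients)

        4-sample test for equality of proportions without continuity correction

data:  smokers out of patients
X-squared = 12.6004, df = 3, p-value = 0.005585
alternative hypothesis: two.sided
sample estimates:
   prop 1    prop 2    prop 3    prop 4
0.9651163 0.9677419 0.9485294 0.8536585

Direct calculation of the p-value:

> p <- sum(smokers) / sum(patients)
> numer <- (smokers - patients*p)^2
> denom <- patients*p*(1-p)
> U <- sum(numer/denom)
> df <- length(smokers) - 1
> cat("p-value =", 1-pchisq(U,df), "\n")
p-value = 0.005585477

The hypothesis that the 4 proportions are equal is rejected.

6.5 Comparison of k populations

6.5.1 Comparison of k means: ANOVA

Analysis of variance (ANOVA) is used to compare the means of $k$ populations (treatments). For example, suppose that we wish to compare three treatments with respect to a response measured in Kg. The $n$ available individuals are randomly allocated to the treatments, $n_i$ ($i = 1, 2, 3$) individuals to treatment $i$ ($n_1 + n_2 + n_3 = n$), and $X_{ij}$ ($i = 1, 2, 3$, $j = 1, 2, \ldots, n_i$) denotes the response of individual $j$ under treatment $i$. If $\mu_i$ ($i = 1, 2, 3$) is the mean response under treatment $i$, we test

$$H_0: \mu_1 = \mu_2 = \mu_3 \quad \text{versus} \quad H_1: H_0 \text{ is not true}.$$

In general, the one-way ANOVA model with fixed effects is

$$X_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k, \quad j = 1, 2, \ldots, n_i,$$

where $\mu_i$ is the mean of population $i$ and $\varepsilon_{ij}$ is the random error of $X_{ij}$. We test

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \quad \text{versus} \quad H_1: H_0 \text{ is not true}.$$

Equivalently the model can be written as

$$X_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k, \quad j = 1, 2, \ldots, n_i,$$

with $\mu_i = \mu + \tau_i$ ($i = 1, 2, \ldots, k$). The parameter $\tau_i$ is the effect of treatment $i$ (treatment effect). This model has $k + 1$ parameters instead of $k$ (overparameterized model), so the constraint $\sum_{i=1}^{k} n_i\tau_i = 0$ is imposed. The hypotheses become

$$H_0: \tau_1 = \tau_2 = \cdots = \tau_k = 0 \quad \text{versus} \quad H_1: H_0 \text{ is not true}.$$

The data are arranged as follows:

Sample   Observations                    Total     Mean
1        X_11   X_12   ...   X_1n_1      X_1·      X̄_1·
2        X_21   X_22   ...   X_2n_2      X_2·      X̄_2·
...
k        X_k1   X_k2   ...   X_kn_k      X_k·      X̄_k·
                                         X_··      X̄_··

The ANOVA table ($n = n_1 + n_2 + \cdots + n_k$) is

Source      Df      SS             MS                                    F
Treatment   k - 1   SS_Treatment   MS_Treatment = SS_Treatment/(k - 1)   F = MS_Treatment/MS_Error
Error       n - k   SS_Error       MS_Error = SS_Error/(n - k)
Total       n - 1   SS_Total

where

$$SS_{Treatment} = \sum_{i=1}^{k} n_i(\bar X_{i\cdot} - \bar X_{\cdot\cdot})^2, \qquad SS_{Error} = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{ij} - \bar X_{i\cdot})^2, \qquad SS_{Total} = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(X_{ij} - \bar X_{\cdot\cdot})^2.$$

Under $H_0$,

$$F = \frac{MS_{Treatment}}{MS_{Error}} \sim F_{k-1,\,n-k}.$$

If $f$ is the observed value of $F$, the rejection region at level $\alpha$ is $\{f : f > F_{k-1,\,n-k;\alpha}\}$ and the p-value is $P(F_{k-1,\,n-k} > f)$. The analysis assumes that the $X_{ij}$ are independent random variables with $X_{ij} \sim N(\mu_i, \sigma^2)$. Equivalently, the $\varepsilon_{ij}$ are independent random variables with $\varepsilon_{ij} \sim N(0, \sigma^2)$.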
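The entries of the ANOVA table can be reproduced directly from the sums of squares given above. The following is a minimal sketch with hypothetical responses for $k = 3$ treatments (the values and the labels T1, T2, T3 are made up for illustration); the result is compared with the table produced by anova applied to a linear model fit.

```r
# Hypothetical responses for k = 3 treatments, 4 individuals each
x <- c(62, 60, 63, 59,   63, 67, 71, 64,   68, 66, 71, 67)
g <- factor(rep(c("T1", "T2", "T3"), each = 4))

n <- length(x)
k <- nlevels(g)
n_i    <- tapply(x, g, length)          # sample sizes n_i
xbar_i <- tapply(x, g, mean)            # treatment means

ss_treat <- sum(n_i * (xbar_i - mean(x))^2)   # SS_Treatment
ss_error <- sum((x - ave(x, g))^2)            # SS_Error (ave() gives group means)
ss_total <- sum((x - mean(x))^2)              # SS_Total

f_obs   <- (ss_treat / (k - 1)) / (ss_error / (n - k))
p_value <- pf(f_obs, k - 1, n - k, lower.tail = FALSE)   # P(F_{k-1,n-k} > f)

# Cross-check against R's own ANOVA table
tab <- anova(lm(x ~ g))
cat("F =", f_obs, " p-value =", p_value, "\n")
```

The decomposition $SS_{Total} = SS_{Treatment} + SS_{Error}$ can be verified numerically from the three sums of squares computed in the sketch.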
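In R, ANOVA is carried out with the function aov.

The aov function
summary(aov(x ~ t))
x: vector with the observations
t: vector with the treatment labels (a factor)

Example 6.36 (summary(aov))
The data frame Tire of the package PASWR contains 24 stopping distances (in ft) of a car travelling at 60 mph (variable StopDist). Four types of tire were used, with 6 measurements for each type (variable tire, a factor with levels A, B, C, D). To get a first impression of the data we use the following R commands:

> library(PASWR)
> attach(Tire); names(Tire)
[1] "StopDist" "tire"
> plot(tire, StopDist)
> dotplot(StopDist~tire)

The plots suggest that type C gives the largest stopping distances. The ANOVA gives

> model <- aov(StopDist ~ tire)
> summary(model)
            Df Sum Sq Mean Sq F value   Pr(>F)
tire         3 5673.1 1891.04  5.3278 0.007316 **
Residuals   20 7098.8  354.94

The ANOVA shows that the 4 means are not all equal. To find which means differ we use multiple comparisons, that is, simultaneous confidence intervals with overall confidence coefficient $1 - \alpha$ for all $k(k-1)/2$ pairwise differences of means; a difference is significant when its interval does not contain 0. The Tukey method is available in R as follows.

> TukeyHSD(model, conf.level = 0.95)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = StopDist ~ tire)

$tire
          diff         lwr      upr     p adj
B-A  25.500000  -4.9446409 55.94464 0.1213153
C-A  42.000000  11.5553591 72.44464 0.0049515
D-A  30.666667   0.2220258 61.11131 0.0479540
C-B  16.500000 -13.9446409 46.94464 0.4464584
D-B   5.166667 -25.2779742 35.61131 0.9637307
D-C -11.333333 -41.7779742 19.11131 0.7273681

> plot(TukeyHSD(model))

Types A and C differ significantly (estimated difference 42 ft), with A giving the shorter mean stopping distance.

6.5.2 Kruskal-Wallis test

The Kruskal-Wallis test is the nonparametric alternative to ANOVA for comparing $k$ independent samples. It assumes that the $k$ populations have continuous distributions $F_1, F_2, \ldots, F_k$ of the same shape $F$ (not necessarily normal), which may differ only in location. If this assumption does not hold, the Rust-Fligner modification of the Kruskal-Wallis test can be used. Under the location model we test

$$H_0: \theta_1 = \theta_2 = \cdots = \theta_k \quad \text{versus} \quad H_1: H_0 \text{ is not true},$$

where $\theta_i$ is the median of population $i$. The $k$ samples are combined into one sample of size $n = n_1 + n_2 + \cdots + n_k$ and the observations are ranked from 1 to $n$. Let $R_i$, $i = 1, 2, \ldots, k$, be the sum of the ranks of sample $i$. The test statistic is

$$H = \frac{12}{n(n+1)}\sum_{i=1}^{k}\frac{R_i^2}{n_i} - 3(n+1),$$

which follows approximately the $\chi^2_{k-1}$ distribution under $H_0$ (provided $\min\{n_1, n_2, \ldots, n_k\} \ge 5$). The p-value is $P(\chi^2_{k-1} > h)$, where $h$ is the observed value of $H$. When there are ties the statistic is corrected to

$$H_t = \frac{H}{1 - \displaystyle\sum_{i=1}^{r}\frac{t_i(t_i^2 - 1)}{n(n^2 - 1)}},$$

where $r$ is the number of groups of tied observations and $t_i$ is the number of tied observations in group $i$, $1 \le i \le r$. Without ties ($t_i = 1$) we have $H_t = H$.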
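The statistic $H$ and its tie-corrected version $H_t$ can be computed step by step from the ranks. The sketch below uses hypothetical data (three groups of five observations, with some tied values inserted on purpose) and compares the hand calculation with R's built-in kruskal.test, which applies the same tie correction.

```r
# Hypothetical data: 3 groups of 5 observations, with some tied values
x <- c(2.1, 3.4, 3.4, 1.9, 2.6,
       4.0, 3.4, 5.2, 4.8, 4.4,
       2.8, 2.1, 3.0, 2.5, 3.7)
g <- factor(rep(1:3, each = 5))

n   <- length(x)
rk  <- rank(x)                       # midranks are assigned to ties
R_i <- tapply(rk, g, sum)            # rank sums R_i
n_i <- tapply(rk, g, length)

H <- 12 / (n * (n + 1)) * sum(R_i^2 / n_i) - 3 * (n + 1)

t_i <- as.vector(table(x))           # sizes of tie groups (1 if untied)
H_t <- H / (1 - sum(t_i * (t_i^2 - 1)) / (n * (n^2 - 1)))

p_value <- pchisq(H_t, df = nlevels(g) - 1, lower.tail = FALSE)

# Cross-check with the built-in function
kw <- kruskal.test(x, g)
cat("H =", H, " H_t =", H_t, " p-value =", p_value, "\n")
```

Untied observations contribute $t_i(t_i^2 - 1) = 0$, so only the tied values affect the correction.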

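The Kruskal-Wallis test is the generalization of the Wilcoxon Rank-Sum test from two to $k$ samples.

In R the Kruskal-Wallis test is carried out with the function kruskal.test.

The kruskal.test function
kruskal.test(list(x1, x2,...,xk))
xi: vector with the observations of sample i
or
kruskal.test(x, t)
x: vector with all the observations
t: vector with the group labels (a factor)

Example 6.37 (kruskal.test)
The following 80 observations come from 4 groups of 20 observations each.

Group 1:  6 1 2 0 0 1 1 3 1 2 1 2 4 2 1 1 1 3 7 1
Group 2:  3 2 1 2 1 6 2 1 1 2 1 1 2 3 2 2 3 2 5 2
Group 3:  2 1 2 3 2 2 4 3 2 3 2 5 1 1 3 7 6 2 2 2
Group 4:  2 1 1 3 1 2 1 6 1 1 0 1 1 1 1 2 2 1 5 4

We first inspect the data graphically.

> x1 <- c(6,1,2,0,0,1,1,3,1,2,1,2,4,2,1,1,1,3,7,1)
> x2 <- c(3,2,1,2,1,6,2,1,1,2,1,1,2,3,2,2,3,2,5,2)
> x3 <- c(2,1,2,3,2,2,4,3,2,3,2,5,1,1,3,7,6,2,2,2)
> x4 <- c(2,1,1,3,1,2,1,6,1,1,0,1,1,1,1,2,2,1,5,4)
> x <- c(x1, x2, x3, x4)
> d <- rep(1:4, each=20)
> t <- factor(d)
> plot(t, x, horizontal=TRUE)
> library(lattice)
> densityplot(~x|t, layout=c(1,4))

The 4 distributions appear to have a similar shape. The Kruskal-Wallis test gives

> kruskal.test(list(x1,x2,x3,x4))

        Kruskal-Wallis rank sum test

data:  list(x1, x2, x3, x4)
Kruskal-Wallis chi-squared = 7.8654, df = 3, p-value = 0.04888

$H_0$ is rejected at significance level 0.05.

6.5.3 Levene's test for the equality of k variances

To compare the variances of $k$ populations we test

$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 \quad \text{versus} \quad H_1: H_0 \text{ is not true}.$$

Levene's test is used for comparing the variances of $k$ populations (another option is Bartlett's test). With the notation of Section 6.5.1 the test statistic is

$$W = \frac{(n - k)\displaystyle\sum_{i=1}^{k} n_i(\bar Z_{i\cdot} - \bar Z_{\cdot\cdot})^2}{(k - 1)\displaystyle\sum_{i=1}^{k}\sum_{j=1}^{n_i}(Z_{ij} - \bar Z_{i\cdot})^2}, \qquad Z_{ij} = |X_{ij} - m_i|$$

($i = 1, 2, \ldots, k$, $j = 1, 2, \ldots, n_i$). The quantity $m_i$ is (a) the mean of sample $i$ ($m_i = \bar X_{i\cdot}$), (b) the median of sample $i$, or (c) the trimmed mean of sample $i$. The 10% trimmed mean is the mean of the observations that remain after removing the smallest 10% and the largest 10% of the sample. Choice (a) gives the original Levene test, choice (b) the Brown-Forsythe Levene-type test and choice (c) the Levene test based on the trimmed mean. The Brown-Forsythe version is a robust modification of Levene's test.

Under $H_0$ the statistic $W$ follows approximately the $F_{k-1,\,n-k}$ distribution. $H_0$ is rejected at level $\alpha$ in the region $\{w : w > F_{k-1,\,n-k;\alpha}\}$, where $w$ is the observed value of $W$. The p-value is $P(F_{k-1,\,n-k} > w)$.

In R the test is carried out with the function levene.test of the package lawstat, for $k$ populations (see also Section 6.3.4).

The levene.test function
levene.test(x, t, location=c("median", "mean", "trim.mean"))
x: vector with the observations
t: vector with the group labels
location: the quantity $m_i$ used to compute the absolute deviations

Example 6.38 (levene.test)
With the data of Example 6.36 we test the equality of the 4 variances. The ANOVA of Example 6.36 assumes that the 4 populations have equal variances. In R:

> library(PASWR);
> library(lawstat)
> attach(Tire); names(Tire)
-------------------------------------------------------------------------
> levene.test(StopDist, tire, location="mean")

        classical Levene's test based on the absolute deviations from the
        mean ( none not applied because the location is not set to median )

data:  StopDist
Test Statistic = 0.9896, p-value = 0.4178
-------------------------------------------------------------------------
> levene.test(StopDist, tire, location="median")

        modified robust Brown-Forsythe Levene-type test based on the absolute
        deviations from the median

data:  StopDist
Test Statistic = 0.9789, p-value = 0.4224

The hypothesis of equal variances is not rejected.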
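Levene's statistic $W$ is the one-way ANOVA $F$ statistic applied to the absolute deviations $Z_{ij}$, and this can be checked with base R alone. The sketch below is an illustration under stated assumptions: the data are hypothetical (three groups A, B, C of five observations) and the median is used for $m_i$, which corresponds to the Brown-Forsythe version.

```r
# Hypothetical data: 3 groups of 5 observations
x <- c(4.1, 5.0, 6.2, 5.5, 4.7,
       7.9, 3.2, 6.1, 9.0, 5.4,
       5.1, 5.3, 4.9, 5.6, 5.0)
g <- factor(rep(c("A", "B", "C"), each = 5))

n <- length(x)
k <- nlevels(g)

m_i <- ave(x, g, FUN = median)       # group medians (Brown-Forsythe choice)
z   <- abs(x - m_i)                  # Z_ij = |X_ij - m_i|

n_i    <- tapply(z, g, length)
zbar_i <- tapply(z, g, mean)

W <- (n - k) * sum(n_i * (zbar_i - mean(z))^2) /
     ((k - 1) * sum((z - ave(z, g))^2))
p_value <- pf(W, k - 1, n - k, lower.tail = FALSE)   # P(F_{k-1,n-k} > w)

# The same value is the F statistic of a one-way ANOVA on z
F_anova <- anova(lm(z ~ g))[1, "F value"]
cat("W =", W, " p-value =", p_value, "\n")
```

Replacing FUN = median by FUN = mean in the call to ave gives the original Levene statistic.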
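6.6 Linear regression

The linear regression model is

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,$$

where $Y$ is the response, $X_i$ ($1 \le i \le p$) are the predictors and $\varepsilon \sim N(0, \sigma^2)$ is the random error. The following table shows how some models are written as formulas in R.

Model                                                     R formula
Y = β0 + β1 x + ε                                         y ~ x   or   y ~ 1 + x
Y = β0 + ε                                                y ~ 1
Y = β1 x + ε                                              y ~ 0 + x   or   y ~ -1 + x   or   y ~ x - 1
Y = β0 + β1 x + β2 x^2 + ε                                y ~ 1 + x + I(x^2)
Y = β0 + β1 x1 + β2 x2 + ε                                y ~ x1 + x2
Y = β0 + β1 x1 + β2 x2 + β3 x1 x2 + ε                     y ~ x1 + x2 + x1:x2   or   y ~ x1*x2

6.6.1 Simple linear regression

We fit the model $Y = \beta_0 + \beta_1 x + \varepsilon$ to the following data.

x_i:   140    145    150    155    160    165
y_i:  14.1   14.4   15.1   18.1   18.3   20.5

The fitted regression line is $\hat y = -23.946 + 0.2669x$. In R the model is fitted with the function lm.

> x <- c(140, 145, 150, 155, 160, 165)
> y <- c(14.1, 14.4, 15.1, 18.1, 18.3, 20.5)
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
   -23.9457       0.2669

> v <- lm(y~x)
> summary(v)

Call:
lm(formula = y ~ x)

Residuals:
      1       2       3       4       5       6
 0.6857 -0.3486 -0.9829  0.6829 -0.4514  0.4143

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -23.94571    5.65568  -4.234  0.01333 *
x             0.26686    0.03703   7.207  0.00197 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7745 on 4 degrees of freedom
Multiple R-squared: 0.9285,     Adjusted R-squared: 0.9106
F-statistic: 51.94 on 1 and 4 DF,  p-value: 0.001965

The output contains the following:

The residuals $e_i = y_i - \hat y_i$.

For $\beta_0$ (row (Intercept)) and $\beta_1$ (row x): the estimates (column Estimate), their standard errors (column Std. Error) and the tests of $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \ne 0$ and of $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \ne 0$. The test statistic (t value = Estimate / Std. Error) follows the $t_{n-2}$ distribution under $H_0$, and its p-value is given in the column Pr(>|t|).

The estimate of $\sigma$ (Residual standard error), obtained from $\hat\sigma^2 = s^2 = \sum e_i^2/(n-2) = SSE/(n-2)$.

The coefficient of determination $r^2 = SSR/SST$ (Multiple R-squared) and its adjusted version (Adjusted R-squared).

The test of $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \ne 0$ through the analysis of variance. The test statistic (F-statistic $= SSR/[SSE/(n-2)]$) follows the $F_{1,\,n-2}$ distribution under $H_0$, and its p-value is reported.

The analysis of variance table is produced by the function anova.

> anova(v)
Analysis of Variance Table

Response: y
          Df  Sum Sq Mean Sq F value   Pr(>F)
x          1 31.1556 31.1556  51.938 0.001965 **
Residuals  4  2.3994  0.5999
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The rows of this table correspond to

Source           DF      SS     MS                  F
Regression       1       SSR    SSR/1               SSR/s^2
Residual Error   n - 2   SSE    s^2 = SSE/(n - 2)

Confidence intervals for the mean of $Y$ and prediction intervals for a new observation at a given value of $x$ are obtained with the function predict.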
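The quantities reported by summary and anova for the simple linear regression can be reproduced from the usual least squares formulas. The sketch below uses the six $(x_i, y_i)$ pairs of this section; the closed-form expressions $\hat\beta_1 = S_{xy}/S_{xx}$ and $\hat\beta_0 = \bar y - \hat\beta_1\bar x$ are the standard ones and are assumed here, since the notes do not state them explicitly.

```r
# Data of Section 6.6.1
x <- c(140, 145, 150, 155, 160, 165)
y <- c(14.1, 14.4, 15.1, 18.1, 18.3, 20.5)
n <- length(x)

# Least squares estimates from the closed-form expressions
sxx <- sum((x - mean(x))^2)
sxy <- sum((x - mean(x)) * (y - mean(y)))
b1  <- sxy / sxx
b0  <- mean(y) - b1 * mean(x)

fitted_y <- b0 + b1 * x
e   <- y - fitted_y                    # residuals e_i
sse <- sum(e^2)                        # SSE
ssr <- sum((fitted_y - mean(y))^2)     # SSR
s   <- sqrt(sse / (n - 2))             # residual standard error
r2  <- ssr / (ssr + sse)               # SSR / SST
f_obs <- ssr / (sse / (n - 2))         # F statistic on 1 and n - 2 df

# Cross-check with lm()
fit <- lm(y ~ x)
cat("b0 =", b0, " b1 =", b1, " s =", s, " R^2 =", r2, " F =", f_obs, "\n")
```

In simple linear regression the F statistic equals the square of the t value of the slope, which explains why the two p-values in the output coincide.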
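For $x = 153$, the 95% confidence interval for the mean of $Y$ and the 95% prediction interval are:

> predict(v, newdata=data.frame(x=153), interval="confidence",
+ level=0.95)       ### confidence interval for the mean of Y
       fit      lwr      upr
1 16.88343 16.00404 17.76282
> predict(v, newdata=data.frame(x=153), interval="prediction",
+ level=0.95)       ### prediction interval
       fit      lwr      upr
1 16.88343 14.56020 19.20666

The function plot produces diagnostic plots for the fitted model.

> par(mfrow=c(2,2))
> plot(v, which = 1:4)

The first plot (Residuals vs Fitted) shows the points $(\hat y_i, e_i)$, $1 \le i \le n$. If the model is adequate, the points should be scattered around zero without any systematic pattern. The second plot (Normal Q-Q) is the Q-Q plot of the standardized residuals against the normal distribution and is used to check the normality assumption. The third plot (Scale-Location) shows the square roots of the absolute standardized residuals against the fitted values and is used to check whether the variance is constant. The fourth plot (Cook's distance) is used to detect influential observations (here observations 1 and 6).

6.6.2 Multiple linear regression

We fit the model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$ ($p = 3$) to the following data.

  x1     x2     x3      Y
12.0   35.4   77.6   85.7
11.0   34.2   83.0   93.6
14.0   29.9   71.6   81.0
21.0   32.5   75.3   75.2
27.0   39.8   80.4   83.1
19.0   26.6   69.5   71.7
22.0   37.2   68.0   79.9
17.0   29.2   65.0   71.4
30.0   35.8   87.2   80.1
29.0   33.0   76.8   67.8
 5.0   31.0   66.3   88.2
25.0   28.6   85.1   78.9
23.0   37.7   78.9   88.7
19.0   30.8   78.2   81.8
15.0   33.7   73.0   84.0

The response $Y$ was recorded on 15 occasions together with $x_1$ (in Km/h), $x_2$ (in °C) and $x_3$ (in %). The fitted model is $\hat y = 21.5 - 0.929x_1 + 1.001x_2 + 0.582x_3$.

> x1 <- c(12, 11, 14, 21, 27, 19, 22, 17, 30, 29, 5, 25, 23, 19, 15)
> x2 <- c(35.4, 34.2, 29.9, 32.5, 39.8, 26.6, 37.2, 29.2, 35.8, 33.0,
+ 31.0, 28.6, 37.7, 30.8, 33.7)
> x3 <- c(77.6, 83.0, 71.6, 75.3, 80.4, 69.5, 68.0, 65.0, 87.2, 76.8,
+ 66.3, 85.1, 78.9, 78.2, 73.0)
> y <- c(85.7, 93.6, 81.0, 75.2, 83.1, 71.7, 79.9, 71.4, 80.1, 67.8,
+ 88.2, 78.9, 88.7, 81.8, 84.0)
> lm(y~x1+x2+x3)

Call:
lm(formula = y ~ x1 + x2 + x3)

Coefficients:
(Intercept)           x1           x2           x3
    21.4790      -0.9290       1.0009       0.5824

More detailed results are obtained with summary.

> m <- lm(y~x1+x2+x3)
> summary(m)

Call:
lm(formula = y ~ x1 + x2 + x3)

Residuals:
    Min      1Q  Median      3Q     Max
-5.2573 -0.7993  0.2104  1.6626  4.9027

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  21.4790    10.5795   2.030  0.06722 .
x1           -0.9290     0.1331  -6.978 2.34e-05 ***
x2            1.0009     0.2343   4.271  0.00132 **
x3            0.5824     0.1418   4.107  0.00174 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1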