Internship Presentation


  • Vanderbilt University Medical Center SRC Presentation. Vincent Kokouvi Agboto, Assistant Professor/Director of Biostatistics, Meharry Medical College; Assistant Professor of Biostatistics, Vanderbilt University Medical Center

  • Introduction to Experimental Designs in Biological and Clinical Settings.

  • Overview

    1. Introduction
    2. Examples of Classical Designs
    3. Optimal Experimental Design
    4. Other Design Issues
    5. Conclusion

  • 1. Introduction

    Experiment: An investigation in which the investigator applies treatments to experimental units and then observes the effects of those treatments by measuring one or more responses.

  • 1. Introduction

    Treatment: Set of conditions applied to experimental units in an experiment.
    Experimental unit: Physical entity to which a treatment is randomly assigned and independently applied.

  • 1. Introduction

    Response variable: Characteristic of an experimental unit that is measured after treatment and analyzed to assess the effects of treatments on experimental units.
    Observational unit: Unit on which a response variable is measured.

  • 1. Introduction

    Experimental design procedure: Decisions made before data collection.
    Basic idea: Appropriate selection of the values of the control variables.
    Three fundamental concepts of experimental design (R. A. Fisher): randomization, blocking, replication.

  • 1. Introduction

    Important stages of experimental research:
     - Background of the experiment
     - Choice of factors
     - Reduction of error
     - Choice of model
     - Design criterion and size of the design
     - Choice of an experimental design
     - Conduct of the experiment
     - Analysis of the data

  • 1. Introduction

    Classical (standard) designs.
    Optimal experimental design: The alternative to use when the standard designs do not provide adequate answers.

  • 2. Examples of Classical Designs

    Example 1: Soil moisture and gene expression in maize seedlings.
    Example 2: Drug and feed consumption and gene expression in rats.
    Example 3: Treatments and gene expression in dairy cattle.

  • Example 1

    Experiment: Effect of three soil moisture levels on gene expression in maize seedlings.
    A total of 36 seedlings were grown in 12 pots, with 3 seedlings per pot.
    Three soil moisture levels (low, medium, high) were randomly assigned to the 12 pots.
    After three weeks, RNA was extracted from the above-ground tissues of each seedling.
    Each of the 36 RNA samples was hybridized to a microarray slide to measure gene expression.

  • Example 1 (continued)

    Treatment: The three moisture levels.
    Experimental units: The pots, because moisture levels were randomly assigned to pots. A pot containing 3 seedlings is one experimental unit.
    Observational units: The seedlings, because gene expression was measured for each seedling.
    Response variable: Each probe on the microarray slide provides one response variable.
    This is a standard experimental design: the completely randomized design (CRD).
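The randomization in Example 1 can be sketched in a few lines. This is only an illustration: the function name `randomize_crd`, the pot labels, and the seed are hypothetical, and the balanced allocation (4 pots per moisture level) is an assumption that the slides do not state.

```python
import random

def randomize_crd(units, treatments, seed=None):
    """Randomly assign treatments to experimental units in equal numbers (CRD)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    reps = len(units) // len(treatments)
    # One label per unit, then a random shuffle of the labels.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Pots are the experimental units (assumed balanced: 4 pots per level).
pots = [f"pot{i}" for i in range(1, 13)]
assignment = randomize_crd(pots, ["low", "medium", "high"], seed=1)

# Seedlings are observational units: each one inherits its pot's treatment.
seedlings = {(pot, s): level
             for pot, level in assignment.items()
             for s in (1, 2, 3)}
```

The key point is that randomization happens once per pot, not once per seedling, so the 36 measurements come from only 12 independently treated units.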

  • Example 2

    Experiment: Gauge the effects of a drug and of feed consumption on gene expression in rats.
    A total of 40 rats were housed in individual cages.
    Half of them received a calorie-restricted diet (R); the other half had access to full feeders, so their calorie intake was unrestricted (U).
    Within each diet group, four doses of an experimental drug (1, 2, 3, 4) were assigned to the rats, with 5 rats per dose within each diet group.

  • Example 2 (continued)

    At the conclusion of the study, gene expression was measured for each rat using microarrays.

  • Example 2 (continued)

    Treatment factors: Diet and Drug.
    Factor Diet: (R, U); Factor Drug: (1, 2, 3, 4).
    Each combination of diet and drug is a treatment: R1, R2, R3, R4, U1, U2, U3, U4.
    Each rat is both an experimental unit and an observational unit.
    Response variable: Each probe on the microarray slide.
    This is a full factorial treatment design, because all possible combinations of diet and drug were considered.
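As a rough sketch of the full factorial structure in Example 2, the eight treatments can be enumerated as the Cartesian product of the two factors. The function `assign_rats` and the rat numbering are hypothetical, and the random assignment of rats to diet and dose is an assumption; the slides only give the group sizes.

```python
import itertools
import random

DIETS = ["R", "U"]        # restricted, unrestricted
DOSES = [1, 2, 3, 4]      # four drug doses

# Full factorial: every diet is combined with every dose.
treatments = [f"{diet}{dose}" for diet, dose in itertools.product(DIETS, DOSES)]

def assign_rats(rats_per_treatment=5, seed=None):
    """Randomly assign rats 1..N to the treatments, equal numbers per treatment."""
    n_rats = rats_per_treatment * len(treatments)
    rats = list(range(1, n_rats + 1))
    random.Random(seed).shuffle(rats)
    labels = [t for t in treatments for _ in range(rats_per_treatment)]
    return dict(zip(rats, labels))

assignment = assign_rats(seed=7)
print(treatments)  # ['R1', 'R2', 'R3', 'R4', 'U1', 'U2', 'U3', 'U4']
```

With 5 rats per treatment this uses all 40 rats, and half of them end up on each diet.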

  • Example 3

    Experiment: Study the effects of 5 treatments (A, B, C, D, E) on gene expression in dairy cattle.
    A total of 25 GeneChips and 25 cows, located on 5 farms with 5 cows on each farm, are available for the experiment.
    Which of the following designs is better from a statistical standpoint?

  • Example 3 (continued)

    Design 1: To reduce variability within treatment groups, randomly assign the 5 treatments to the 5 farms so that all 5 cows on any one farm receive the same treatment. Measure gene expression using one GeneChip for each cow.
    Design 2: Randomly assign the 5 treatments to the 5 cows within each farm so that all 5 treatments are represented on each farm. Measure gene expression using one GeneChip for each cow.

  • Example 3 (continued)

    Farm      Design 1     Design 2
    Farm 1    B B B B B    A B E D C
    Farm 2    D D D D D    E D A C B
    Farm 3    A A A A A    C D E A B
    Farm 4    E E E E E    A B E C D
    Farm 5    C C C C C    C A D B E

  • Example 3 (continued)

    Observational units: Cows in both designs.
    Experimental units: Farms in Design 1 and cows in Design 2.
    Design 2 is a randomized complete block design (RCBD), with the group of 5 cows on a farm serving as a block of experimental units.
    Design 1 has no replication because there is only 1 experimental unit for each treatment. Design 2 has 5 replications per treatment.

  • Example 3 (continued)

    Design 2 is by far the better design.
    We can compare treatments directly among cows that share the same environment.
    With Design 1, it is impossible to separate differences in expression due to treatment effects from differences in expression due to farm effects.
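The difference between the two layouts in Example 3 can be illustrated with a short randomization sketch. The function names and the seed are hypothetical; the code only generates layouts of the two kinds described and counts how many farms each treatment appears on.

```python
import random

TREATMENTS = ["A", "B", "C", "D", "E"]
FARMS = [f"Farm {i}" for i in range(1, 6)]

def design_1(rng):
    """One treatment per farm: the farm is the experimental unit."""
    order = TREATMENTS[:]
    rng.shuffle(order)
    return {farm: [t] * 5 for farm, t in zip(FARMS, order)}

def design_2(rng):
    """RCBD: an independent random permutation of all treatments within each farm."""
    layout = {}
    for farm in FARMS:
        order = TREATMENTS[:]
        rng.shuffle(order)
        layout[farm] = order
    return layout

def farms_per_treatment(layout):
    """Number of distinct farms on which each treatment appears."""
    return {t: sum(t in cows for cows in layout.values()) for t in TREATMENTS}

rng = random.Random(2024)
d1, d2 = design_1(rng), design_2(rng)
print(farms_per_treatment(d1))  # every treatment appears on exactly 1 farm
print(farms_per_treatment(d2))  # every treatment appears on all 5 farms
```

In the first layout each treatment is tied to a single farm, so a treatment difference cannot be told apart from a farm difference; in the second, every treatment is observed on every farm.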

  • 3. Optimal Experimental Design

    3.1. Motivating Example
    3.2. Comments on Orthogonal Designs
    3.3. Some Examples of Non-Orthogonal Designs
    3.4. Optimal Designs

  • 3.1. Motivating Example

    Suppose that the yield Y is linearly related to temperature X, whose range is [50, 150]: Y = a + bX.
    If we want to conduct experiments at two points, which of the following will we choose?
     - Design 1: at 50 and 150
     - Design 2: at 70 and 130
     - Design 3: at 90 and 110

  • 3.1. Motivating Example

    What is the optimal design in this case?
    Which is the better design among the three mentioned?

  • 3.1. Motivating Example

    It is Design 1, because it gives the smallest confidence region for the parameters (D-optimality) and also gives the smallest maximum variance of the predicted responses (G-optimality).
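The comparison of the three two-point designs can be checked numerically. The sketch below assumes the straight-line model Y = a + bX from the example, unit error variance, and a grid of integer temperatures over [50, 150]; the helper names are illustrative.

```python
def information(points):
    """X'X for the model Y = a + bX with regression vector f(x) = (1, x)."""
    n = len(points)
    s1 = sum(points)
    s2 = sum(x * x for x in points)
    return [[n, s1], [s1, s2]]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def prediction_variance(x, m):
    """d(x) = f(x)' (X'X)^{-1} f(x), using the closed-form 2x2 inverse."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    return (c - 2 * b * x + a * x * x) / det2(m)

designs = {"Design 1": [50, 150], "Design 2": [70, 130], "Design 3": [90, 110]}
region = range(50, 151)  # assumed evaluation grid over the temperature range

for name, points in designs.items():
    m = information(points)
    d_value = det2(m)                                         # larger is better
    g_value = max(prediction_variance(x, m) for x in region)  # smaller is better
    print(name, d_value, round(g_value, 4))
```

For a two-point design the determinant reduces to the squared distance between the points, which is why spreading the points to the ends of the range wins on both criteria.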

  • 3.2. Comments on Orthogonal Designs

    Pros (many desirable properties):
     - Easy to calculate
     - Easy to interpret
     - Maximum precision (in some sense)
     - Tabled designs widely available

  • 3.2. Comments on Orthogonal Designs

    Cons: Not applicable in the case of
     - Irregular design space
     - Mixture experiments
     - Sample size not a power of 2
     - Mixed qualitative and quantitative factors
     - Fixed covariates
     - Nonlinear models

  • 3.3. Some Examples of Non-Orthogonal Designs

     - 16-run design with 8 two-level factors, with main effects and 6 interactions: BC, CH, BH, DE, EF, DF
     - 12-run mixed-level design with one three-level factor and 9 two-level factors

  • 3.4. Optimal Designs

    Optimal experimental design (OED): The standard alternative when classical designs are not applicable.
    Choice of a particular experimental design: Depends on the experimenter's design criterion (an optimization problem).
    OED reduces the cost of experimentation by allowing statistical models to be estimated with fewer experimental runs; designs are evaluated using statistical criteria.

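  • 3.4. Optimal Designs

    Model: $Y = X\beta + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2 I)$, i.e. $Y \sim N(X\beta, \sigma^2 I)$.
    $X_{n \times p}$: design matrix; $\beta$: unknown $p \times 1$ parameter vector; $\sigma^2$: known.
    $y(x_i) = f(x_i)^\top \beta + \varepsilon_i$, with $X = [f(x_1), \ldots, f(x_n)]^\top$.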

  • 3.4. Optimal Designs

    Design $\xi$: Probability measure over a compact region $\mathcal{X}$, with $\xi(x_i) = w_i$; that is, $\xi$ places weight $w_i$ on the point $x_i$.
    Problem: $n\,\xi(x_i)$ is not necessarily an integer.

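  • 3.4. Optimal Designs

    Approximate design: $\xi = \begin{Bmatrix} x_1 & x_2 & \cdots & x_n \\ w_1 & w_2 & \cdots & w_n \end{Bmatrix}$, where $w_i = \xi(x_i)$ is the weight on $x_i$, with $\int_{\mathcal{X}} \xi(dx) = 1$ and $0 \le w_i \le 1$.
    Exact design: $n\,\xi(x_i)$ must be an integer.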

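  • 3.4. Optimal Designs

    Information matrix of $\xi$: $M(\xi) = \int_{\mathcal{X}} m(x)\,\xi(dx) = \int_{\mathcal{X}} f(x) f(x)^\top \xi(dx) = \sum_i w_i f(x_i) f(x_i)^\top$, where $w_i = \xi(x_i)$; for an $n$-run exact design, $n M(\xi) = X^\top X$.
    Optimality criterion: $\xi^* = \arg\max_{\xi} \Psi(M(\xi))$.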
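A small numerical check of the weighted-sum form of the information matrix may help. The sketch assumes the straight-line regression vector f(x) = (1, x) and represents a design as a hypothetical dictionary mapping each point to its weight; exact rational arithmetic is used so that the integrality condition for exact designs can be tested directly.

```python
from fractions import Fraction

def f(x):
    """Regression vector for a straight-line model (illustrative choice)."""
    return [Fraction(1), Fraction(x)]

def information_matrix(design):
    """Weighted sum of f(x_i) f(x_i)^T over a design given as {point: weight}."""
    if sum(design.values()) != 1:
        raise ValueError("weights must sum to 1")
    p = len(f(0))
    m = [[Fraction(0)] * p for _ in range(p)]
    for x, w in design.items():
        fx = f(x)
        for i in range(p):
            for j in range(p):
                m[i][j] += w * fx[i] * fx[j]
    return m

def is_exact(design, n):
    """True if n times every weight is an integer, so n runs can realize the design."""
    return all((n * w).denominator == 1 for w in design.values())

xi = {50: Fraction(1, 2), 150: Fraction(1, 2)}
M = information_matrix(xi)
print(M)                # entries 1, 100, 100, 12500 as Fractions
print(is_exact(xi, 2))  # True: one run at each point
print(is_exact(xi, 3))  # False: 3 * 1/2 is not an integer
```

Multiplying M by n = 2 gives the matrix X'X of the two-run design at 50 and 150, which matches the relation between the information matrix and X'X for exact designs.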

  • 3.5. Some Useful Criteria

    D-optimality: $\max |X^\top X|$
    A-optimality: $\min \operatorname{trace}\{(X^\top X)^{-1}\}$
    G-optimality: $\min \{\max_x d(x)\}$, where $d(x) = f(x)^\top (X^\top X)^{-1} f(x)$
    V-optimality: $\min \{\operatorname{average}_x d(x)\}$
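The four criteria can be computed directly for a small example. The sketch below is an illustration under stated assumptions: a quadratic model in one factor on [-1, 1], a 21-point grid standing in for the design region, and two hypothetical six-run designs chosen only for comparison. The matrix inverse is a plain Gauss-Jordan routine so that nothing beyond the standard library is needed.

```python
def f(x):
    """Regression vector for a quadratic model in one factor (illustrative choice)."""
    return [1.0, x, x * x]

def information(points):
    """X'X accumulated as the sum of f(x) f(x)' over the design points."""
    p = len(f(points[0]))
    m = [[0.0] * p for _ in range(p)]
    for x in points:
        fx = f(x)
        for i in range(p):
            for j in range(p):
                m[i][j] += fx[i] * fx[j]
    return m

def inverse_and_det(m):
    """Gauss-Jordan elimination with partial pivoting; returns (inverse, determinant)."""
    n = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular information matrix")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        piv = a[col][col]
        det *= piv
        a[col] = [v / piv for v in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a], det

def criteria(points, region):
    """D (maximize), and A, G, V (minimize) for an exact design."""
    inv, det = inverse_and_det(information(points))
    p = len(inv)

    def d(x):
        fx = f(x)
        return sum(fx[i] * inv[i][j] * fx[j] for i in range(p) for j in range(p))

    variances = [d(x) for x in region]
    return {"D": det,
            "A": sum(inv[i][i] for i in range(p)),
            "G": max(variances),
            "V": sum(variances) / len(variances)}

region = [i / 10 for i in range(-10, 11)]
spread = criteria([-1, -1, 0, 0, 1, 1], region)
clustered = criteria([-0.5, -0.5, 0, 0, 0.5, 0.5], region)
print(spread)
print(clustered)
```

In this example the design that uses the ends of the interval does better on all four criteria than the one that clusters its runs near the center.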

  • 3.5. Some Useful Criteria

    D- and A-optimality: Estimation-based criteria.
    G- and V-optimality: Prediction-based criteria.

  • 3.6. Algorithms for Optimal Designs

    The development of efficient computing methods and high-powered computer systems has led to great interest in algorithmic approaches.
    In general: It is difficult to find exact designs analytically.
    Finding exact designs means solving a large nonlinear mixed-integer programming problem.
    In practice: Find designs that are close to the best design (locally optimal); this motivates the introduction of exact design algorithms.

  • 3.6. Algorithms for Optimal Designs

    Typical exact design algorithm steps:
     - Choose an initial feasible design.
     - Modify the solution slightly by exchanging a point in the design for a point in the design space.
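The two steps above can be sketched as a simple greedy point-exchange search for a D-optimal exact design. This is a minimal illustration, not an implementation of any specific published algorithm: the quadratic model, the candidate grid on [-1, 1], the function names, and the first-improvement acceptance rule are all assumptions made for the example.

```python
import random

def f(x):
    """Regression vector for a quadratic model in one factor (illustrative choice)."""
    return [1.0, x, x * x]

def det_information(points):
    """Determinant of X'X by Gaussian elimination with partial pivoting."""
    rows = [f(x) for x in points]
    p = len(rows[0])
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    det = 1.0
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular: the model cannot be estimated
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, p):
            factor = m[r][col] / m[col][col]
            m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return det

def point_exchange(candidates, n_runs, start=None, seed=None, max_passes=100):
    """Greedy exchange: swap one design point for a candidate whenever |X'X| improves."""
    rng = random.Random(seed)
    design = list(start) if start is not None else [rng.choice(candidates)
                                                    for _ in range(n_runs)]
    best = det_information(design)
    for _ in range(max_passes):
        improved = False
        for i in range(len(design)):
            for c in candidates:
                trial = design[:i] + [c] + design[i + 1:]
                value = det_information(trial)
                if value > best + 1e-12:  # accept the first improving exchange
                    design, best, improved = trial, value, True
        if not improved:
            break  # no single exchange helps: a local optimum
    return sorted(design), best

candidates = [i / 10 for i in range(-10, 11)]
start = [-1.0, -0.5, 0.0, 0.0, 0.5, 1.0]
design, value = point_exchange(candidates, n_runs=6, start=start)
print(design, value)
```

Because only improving exchanges are accepted, the search can stop at a local optimum; running it from several random starting designs (via `seed`) and keeping the best result is a common safeguard.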

  • 3.6. Algorithms for Optimal Designs

     - Fedorov algorithm (Fedorov, 1969)
     - Modified Fedorov algorithm (Johnson and Nachtsheim, 1983)
     - K-L exchange algorithm (Donev and Atkinson, 1988)
     - Coordinate exchange algorithm (Meyer and Nachtsheim, 1995)
     - Columnwise-pairwise (CP) algorithm (Li and Wu, 1997)

  • 3.7. Software for the Computation of Optimal Designs

     - SAS
     - JMP
     - Matlab
     - R
     - C++

  • 4. Other Design Issues

     - Supersaturated designs
     - Bayesian designs
     - Model-robust designs
     - Model discrimination designs

  • 5. Conclusion

    All problems are different.
    Statistical knowledge will help improve the design.
    Get involved with the statistician (biostatistician) early in the process.
    Collaborate closely with people who know the background of the study.
    Even the most sophisticated statistical analysis cannot do much to save a study based on a bad design.

  • References

    Agboto, V. (2006). Bayesian approaches to model robust and model discrimination designs. Unpublished Ph.D. dissertation, School of Statistics, University of Minnesota.
    Agboto, V., Nachtsheim, C. & Li, W. (2010). Screening designs for model discrimination. Journal of Statistical Planning and Inference 140(3), 766-780.
    Atkinson, A. C. & Donev, A. N. (1992). Optimum Experimental Designs. Oxford Statistical Science Series 8, 1-328.
    Chaloner, K. & Verdinelli, I. (1995). Bayesian experimental design: A review. Statistical Science 10, 273-304.
    Cook, R. D. & Nachtsheim, C. J. (1980). A comparison of algorithms for constructing exact D-optimal designs. Technometrics 22, 315-324.
    Li, W. & Wu, C. F. J. (1997). Columnwise-pairwise algorithms with applications to the construction of supersaturated designs. Technometrics 39, 171-179.


    Pros (many desirable properties):
     - Easy to calculate
     - Easy to interpret
     - Maximum precision (in some sense)
     - Lots of papers and textbooks in the literature

    $\xi$: Probability measure over a compact design region $\mathcal{X}$ with $\xi(x_i) = w_i$, where $x_i$ is a design point and $w_i$ is the associated weight, for $i = 1, 2, \ldots, n$.

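    Approximate designs are represented by the measure $\xi$ over $\mathcal{X}$. If $\xi$ has trials at $n$ distinct points in $\mathcal{X}$, we write $\xi = \begin{Bmatrix} x_1 & x_2 & \cdots & x_n \\ w_1 & w_2 & \cdots & w_n \end{Bmatrix}$, with $\int_{\mathcal{X}} \xi(dx) = 1$ and $0 \le w_i \le 1$ for all $i$, where $w_i$ is the weight at $x_i$.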

    Find an optimal design: Minimize a measure of the imprecision of the estimates, expressed as a function of the information matrix $M(\xi)$.

    D-optimality: $\max |X^\top X|$ (the most commonly used criterion)
    A-optimality: $\min \operatorname{trace}\{(X^\top X)^{-1}\}$ (the sum of the variances of all parameters)
    G-optimality: $\min \{\max_x d(x)\}$
    V-optimality: $\min \{\operatorname{average}_x d(x)\}$

    Need for an algorithmic approach to construct the optimal designs.
    Exchange procedures: Delete a design point $x_j$ and add a point $x$ (start from a randomly chosen design and improve it iteratively); due to Fedorov.
    K-L exchange algorithm (Donev and Atkinson, 1988).
    k-exchange algorithm: Drop the k worst points (Cook and Nachtsheim).
     - When k = 1: Van Schalkwyk algorithm
     - When k = N: modified Fedorov algorithm (Johnson and Nachtsheim, 1983)
    CP algorithm (Li and Wu).
