Introduction for Clinical Database 陳勁辰2003/06/02.

39
Introduction for Introduction for Clinical Database Clinical Database 陳陳陳 陳陳陳 2003/06/02 2003/06/02

Transcript of Introduction for Clinical Database 陳勁辰2003/06/02.

Page 1: Introduction for Clinical Database 陳勁辰2003/06/02.

Introduction for Clinical Introduction for Clinical DatabaseDatabase

陳勁辰陳勁辰2003/06/022003/06/02

Page 2: Introduction for Clinical Database 陳勁辰2003/06/02.

IntroductionIntroduction

Database management system (DBMS)Database management system (DBMS) Actual study informationActual study information Administrative informationAdministrative information Relational database model Relational database model

Page 3: Introduction for Clinical Database 陳勁辰2003/06/02.

Relational database modelRelational database model

Multiple tablesMultiple tables Entities: rowsEntities: rows Attributes: columnsAttributes: columns Primary keyPrimary key Foreign keyForeign key Principle of table normalizationPrinciple of table normalization

Page 4: Introduction for Clinical Database 陳勁辰2003/06/02.

Relational database modelRelational database model

Example: Infant Jaundice Study: whetheExample: Infant Jaundice Study: whether neonatal jaundice affects neuropsychiar neonatal jaundice affects neuropsychiatric scores at five years of agetric scores at five years of age

Table relations: one-to-many, many-to-Table relations: one-to-many, many-to-many, and one-to-onemany, and one-to-one

Need for a multi-table: repeated measurNeed for a multi-table: repeated measurements on individual subjects.ements on individual subjects.

Page 5: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 6: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 7: Introduction for Clinical Database 陳勁辰2003/06/02.

Jaundice vs non-jaundice: t-test p=0.46

Page 8: Introduction for Clinical Database 陳勁辰2003/06/02.

One-to-manyOne-to-many

Page 9: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 10: Introduction for Clinical Database 陳勁辰2003/06/02.

Normal Form ViolationNormal Form Violation

Repeating columns: violate the First NorRepeating columns: violate the First Normal Form (1NF)mal Form (1NF)

Redundancy in column values: violates tRedundancy in column values: violates the Second Normal Form (2NF)he Second Normal Form (2NF)

Page 11: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 12: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 13: Introduction for Clinical Database 陳勁辰2003/06/02.

Many-to-manyMany-to-many

Page 14: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 15: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 16: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 17: Introduction for Clinical Database 陳勁辰2003/06/02.

One-to-one One-to-one

Page 18: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 19: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 20: Introduction for Clinical Database 陳勁辰2003/06/02.

Referential integrityReferential integrity Referential Integrity in a normalized, relational Referential Integrity in a normalized, relational

databasedatabase Not allow creation of an exam record for a subjNot allow creation of an exam record for a subj

ect who does not already exist in the “Baby” ect who does not already exist in the “Baby” table, and it will not allow assigning an exam ttable, and it will not allow assigning an exam to a doctor who does not already exist in the “o a doctor who does not already exist in the “Doctor” table. Doctor” table.

A subject may not be deleted unless and until A subject may not be deleted unless and until all that subject’s examinations have also beeall that subject’s examinations have also been deleted.n deleted.

One-to-many: parent-childrenOne-to-many: parent-children Forbids the creation of “orphans”Forbids the creation of “orphans”

Page 21: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 22: Introduction for Clinical Database 陳勁辰2003/06/02.

Undesirability of Storing Calculated Undesirability of Storing Calculated Values Values

Inconsistencies result if one of the “raInconsistencies result if one of the “raw-data” fields is updated without updaw-data” fields is updated without updating the calculated fieldting the calculated field

Solution: recalculating the value in a quSolution: recalculating the value in a queryery

Storing calculated values such as “AgeIStoring calculated values such as “AgeInMonths” : violate the Third Normal FonMonths” : violate the Third Normal Form (3NF) rm (3NF)

Page 23: Introduction for Clinical Database 陳勁辰2003/06/02.

Data Dictionaries, Data Types, and Data Dictionaries, Data Types, and DomainsDomains

Data dictionary: a table of information aData dictionary: a table of information about the database itselfbout the database itself

““Metadata”Metadata” Data types: text, number, dates, etc.Data types: text, number, dates, etc. Domain: range of allowed valuesDomain: range of allowed values Object data type: BLOB (Binary Large ObObject data type: BLOB (Binary Large Ob

ject), eg. photo (*.jpeg); cannot be sorteject), eg. photo (*.jpeg); cannot be sorted or soughtd or sought

Page 24: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 25: Introduction for Clinical Database 陳勁辰2003/06/02.

Extracting Data from the Database Extracting Data from the Database (Queries)(Queries)

Structured Query Language or SQLStructured Query Language or SQL A query can join data from two or more tA query can join data from two or more t

ables, display only selected fields, and filables, display only selected fields, and filter for records that meet certain criteriater for records that meet certain criteria

SQL has 3 sublanguages: DDL – Data DefiSQL has 3 sublanguages: DDL – Data Definition Language, DML – Data Manipulatinition Language, DML – Data Manipulation Language, and DCL – Data Control Laon Language, and DCL – Data Control Languagenguage

Page 26: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 27: Introduction for Clinical Database 陳勁辰2003/06/02.

SQLSQL

SELECT Baby.SubjectID, Baby.DOB, Exam.SELECT Baby.SubjectID, Baby.DOB, Exam.ExDateExDate

FROM Baby INNER JOIN Exam ON Baby.SuFROM Baby INNER JOIN Exam ON Baby.SubjectID = Exam.SubjectIDbjectID = Exam.SubjectID

WHERE Exam.ExDate Between #1/1/2010# WHERE Exam.ExDate Between #1/1/2010# And #2/28/2010#And #2/28/2010#

ORDER BY Exam.ExDate;ORDER BY Exam.ExDate;

Page 28: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 29: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 30: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 31: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 32: Introduction for Clinical Database 陳勁辰2003/06/02.
Page 33: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

1.1. Establish the database tables, their rows and columnEstablish the database tables, their rows and columns, and their relationships correctly at the outset.s, and their relationships correctly at the outset.

2.2. Establish and follow naming conventions for columnEstablish and follow naming conventions for columns and tables.s and tables.

3.3. Obtain baseline demographic and clinical informatioObtain baseline demographic and clinical information about members of the study population from existin about members of the study population from existing computer databases.ng computer databases.

4.4. Minimize the extent to which study measurements arMinimize the extent to which study measurements are recorded on paper forms.e recorded on paper forms.

5.5. Follow standard data entry conventions.Follow standard data entry conventions.6.6. Back up the database regularly.Back up the database regularly.

Page 34: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

Establish the database tables, their roEstablish the database tables, their rows and columns, and their relationshipws and columns, and their relationships correctly at the outset. s correctly at the outset.

A poorly organized database makes daA poorly organized database makes data maintenance and retrieval nearly imta maintenance and retrieval nearly impossible. Make sure the data are normpossible. Make sure the data are normalized. Avoid data structures that requialized. Avoid data structures that require duplicate data entry or redundant stre duplicate data entry or redundant storage.orage.

Page 35: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

Establish and follow naming conventions for Establish and follow naming conventions for columns and tables.columns and tables.

Short field names without spaces or underscShort field names without spaces or underscores are convenient for programming, queryiores are convenient for programming, querying, and other manipulations. Instead of spang, and other manipulations. Instead of spaces or underscores, use “IntraCaps” (upper ces or underscores, use “IntraCaps” (upper case letters within the variable name) to disticase letters within the variable name) to distinguish words, e.g. “StudyID”, “FName”, nguish words, e.g. “StudyID”, “FName”, or “ExamDate”. Table names should be sior “ExamDate”. Table names should be singular, e.g. “Baby” instead of “Babies”, ngular, e.g. “Baby” instead of “Babies”, “Exam” instead of “Exams”.“Exam” instead of “Exams”.

Page 36: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

Obtain baseline demographic and clinical informatiObtain baseline demographic and clinical information about members of the study population from exion about members of the study population from existing computer databases.sting computer databases.

Avoid re-entering data which are already available (iAvoid re-entering data which are already available (in digital format) from other sources. In the Infant Jan digital format) from other sources. In the Infant Jaundice Study, the patient demographic data and coundice Study, the patient demographic data and contact information are obtained from the hospital dantact information are obtained from the hospital database. Computer systems can almost always prodtabase. Computer systems can almost always produce character-delimited or fixed-column-width text fuce character-delimited or fixed-column-width text files that the database management system can impiles that the database management system can import.ort.

Page 37: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

Minimize the extent to which study measurements aMinimize the extent to which study measurements are recorded on paper forms.re recorded on paper forms.

Enter data directly into the computer database or mEnter data directly into the computer database or move data from paper forms into the computer databove data from paper forms into the computer database as close to the data collection time as possible. ase as close to the data collection time as possible. When you define a variable in a computer database, When you define a variable in a computer database, you specify both its format and its you specify both its format and its domaindomain or range or range of allowed values. Using these format and domain sof allowed values. Using these format and domain specifications, computer data entry forms give immepecifications, computer data entry forms give immediate feedback about improper formats and values tdiate feedback about improper formats and values that are out of range. The best time to receive this fehat are out of range. The best time to receive this feedback is when the study subject is still on site.edback is when the study subject is still on site.

Page 38: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

Follow standard data entry conventions.Follow standard data entry conventions. Several conventions for data entry and display have Several conventions for data entry and display have

developed over time. Although most users of screen fdeveloped over time. Although most users of screen forms are not aware of these conventions, they have corms are not aware of these conventions, they have come to expect them subconsciously. For example, a ome to expect them subconsciously. For example, a series of mutually exclusive, collectively exhaustive cseries of mutually exclusive, collectively exhaustive choices is usually displayed as an “option group” cohoices is usually displayed as an “option group” consisting of several different “radio buttons”, wherensisting of several different “radio buttons”, whereas choices which are not mutually exclusive are displas choices which are not mutually exclusive are displayed as check boxesayed as check boxes

Page 39: Introduction for Clinical Database 陳勁辰2003/06/02.

Guidelines for Database ManagemGuidelines for Database Management for Clinical Researchent for Clinical Research

Back up the database regularly and chBack up the database regularly and check the adequacy of the back up proceeck the adequacy of the back up procedure by periodically restoring a file frodure by periodically restoring a file from the back up medium.m the back up medium.