A complex ADaM dataset - three different ways to create one

A Complex ADaM dataset? Three different ways to create one.

description

The paper is intended for clinical trial SAS® programmers who create and validate complex ADaM datasets. Some ADaM datasets require complex algorithms, which can involve several data manipulation steps and more than one SDTM dataset. It can be very challenging to create a complex ADaM dataset in accordance with ADaM data structures and standards, and equally challenging to validate it. The paper introduces three ways to create a complex ADaM dataset. The first is to create ADaM directly from SDTM without any intermediate permanent datasets. The second is to create ADaM from SDTM through intermediate permanent datasets such as SDTM+ or ADaM+. The third is to create the final ADaM from SDTM through an intermediate ADaM. The paper discusses the benefits and limitations of each method and shows some examples.

Transcript of A complex ADaM dataset - three different ways to create one

Page 1: A complex ADaM dataset - three different ways to create one

A Complex ADaM dataset? Three different ways to create one.

Disclaimer

Any views or opinions presented in this presentation are solely those of the author and do not necessarily represent those of the company.

11/27/2013 Cytel Inc. 2

Agenda

• Introduction of ADaM dataset
• Three methods for a complex ADaM dataset
• Example
• Benefits of each method
• Limitations of each method
• Considerations
• Conclusion
• Questions & Answers

Introduction of ADaM

• ADaM (Analysis Data Model) is the CDISC standard for analysis datasets.
• Purpose
  • Analysis ready (statistical analysis can be performed with minimal programming)
  • Traceability
• Type
  • ADSL (Subject Level Analysis Dataset)
  • BDS (Basic Data Structure)
    − Special BDS (upcoming)
      • ADTTE (Time to Event Analysis Dataset)
  • ADAE (Adverse Event Analysis Dataset - upcoming)

A complex ADaM dataset

• Can require several algorithms
• Can require several data manipulation steps
• Can be derived from more than one SDTM dataset
• Can be difficult to trace back
• Can be difficult to validate

Three methods to create a complex ADaM dataset

1. SDTM datasets to ADaM datasets
2. SDTM datasets through the intermediate permanent datasets to final ADaM datasets
3. SDTM datasets through the intermediate ADaM datasets to final ADaM datasets

Three Methods Diagram

[Diagram: three paths from SDTM to the final ADaM. Method 1: SDTM to ADaM. Method 2: SDTM to the intermediate permanent datasets (SDTM+, ADaM+) to ADaM. Method 3: SDTM to an intermediate ADaM to the final ADaM.]

Example 1

• A comparison of the average daily drinking rate during the treatment period between placebo and study drug.
• Baseline period: the average daily drinking rate over the 21 days from the hospitalization date.
• Treatment period: the average daily drinking rate over the 42 days from the first study dose. Baseline rate imputation is applied to the following:
  − Subjects who discontinued early
  − Any missing assessment

Key components in the example

• SDTM – SU (Substance Use)
• Final ADaM – ADDR (Drinking Rate Analysis Dataset)
• Parameter – ADDRATE (Average Daily Drinking Rate)

Algorithm for the parameter ADDRATE

• Rb (Baseline rate) = sum of all doses / number of days with drinking data available in the baseline period
• Ra (Actual treatment rate) = sum of all doses / number of days with drinking data available in the treatment period
• Rt (Imputed treatment rate) = (Ra * DAYS + Rb * (42 – DAYS)) / 42, where DAYS is the number of days with drinking data available in the treatment period
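The three rates above can be sketched in a few lines. The deck itself is about SAS programs and shows no code, so this Python version is only an illustration: the function names are hypothetical, and the inputs are the dose totals and day counts that the deck reports for subject 001-01-001 (83.6 over 19 baseline days, 101.2 over 39 treatment days).

```python
def average_rate(total_dose: float, days_with_data: int) -> float:
    """Rb or Ra: sum of all doses divided by the number of days with data."""
    if days_with_data <= 0:
        raise ValueError("at least one day with drinking data is required")
    return total_dose / days_with_data


def imputed_treatment_rate(ra: float, rb: float, days: int,
                           period_days: int = 42) -> float:
    """Rt = (Ra * DAYS + Rb * (42 - DAYS)) / 42.

    Days without drinking data are counted at the baseline rate Rb.
    """
    if not 0 <= days <= period_days:
        raise ValueError("days must lie between 0 and period_days")
    return (ra * days + rb * (period_days - days)) / period_days


# Totals and day counts reported for subject 001-01-001 in the deck.
rb = average_rate(83.6, 19)    # baseline rate
ra = average_rate(101.2, 39)   # actual treatment rate
rt = imputed_treatment_rate(ra, rb, days=39)
print(f"Rb={rb:.2f}  Rt={rt:.2f}")
```

When DAYS equals 42 the second term vanishes and Rt reduces to Ra, which is the case for a subject with no missing assessments.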

Three methods for the example

[Diagram: Method 1: SDTM (SU) to ADaM (ADDR). Method 2: SDTM (SU) to the intermediate permanent datasets SDTM+ (_SU) and ADaM+ (_ADDR) to ADaM (ADDR). Method 3: SDTM (SU) to ADaM (ADSU) to ADaM (ADDR).]

SDTM SU dataset

USUBJID | SUSEQ | SUTRT | SUSTAT | SUDOSE | SUDOSU | SUSTDTC | SUSTDY | VISIT
001-01-001 | 1 | ALCOHOL | | 0 | DRINKS | 2011-02-08 | -21 | Screening
001-01-001 | 2 | ALCOHOL | NOT DONE | | DRINKS | 2011-02-09 | -20 | Screening
001-01-001 | 3 | ALCOHOL | | 5 | DRINKS | 2011-02-10 | -19 | Screening
….
001-01-001 | 21 | ALCOHOL | | 0 | DRINKS | 2011-02-28 | -1 | Screening
001-01-001 | 22 | ALCOHOL | | 0 | DRINKS | 2011-03-01 | 1 | Visit 1
001-01-001 | 23 | ALCOHOL | NOT DONE | | DRINKS | 2011-03-02 | 2 | Visit 1
001-01-001 | 24 | ALCOHOL | | 0 | DRINKS | 2011-03-03 | 3 | Visit 1
001-01-001 | 25 | ALCOHOL | | 2 | DRINKS | 2011-03-04 | 4 | Visit 1
001-01-001 | 26 | ALCOHOL | NOT DONE | | DRINKS | 2011-03-05 | 5 | Visit 1
….
001-01-001 | 58 | ALCOHOL | NOT DONE | | DRINKS | 2011-04-06 | 37 | Visit 3
001-01-001 | 59 | ALCOHOL | | 4 | DRINKS | 2011-04-07 | 38 | Visit 3
001-01-001 | 60 | ALCOHOL | | 0 | DRINKS | 2011-04-08 | 39 | Visit 3
001-01-001 | 61 | ALCOHOL | | 2 | DRINKS | 2011-04-09 | 40 | Visit 3
001-01-001 | 62 | ALCOHOL | | 1 | DRINKS | 2011-04-10 | 41 | Visit 3
001-01-001 | 63 | ALCOHOL | | 4 | DRINKS | 2011-04-11 | 42 | Visit 3

Analysis Dataset Metadata for ADDR

Dataset Name: ADDR
Dataset Description: Drinking Rate Analysis Data
Dataset Location: addr.xpt
Dataset Structure: one record per subject per parameter per analysis visit
Key Variables of Dataset: USUBJID, PARAMCD, AVISITN
Class of Dataset: BDS
Documentation: c-addr.txt

Analysis Variable Metadata including Analysis Parameter value level Metadata for ADDR (1)

Dataset Name | Parameter Identifier | Variable Name | Variable Label | Variable Type | Display Format | Codelist / Controlled Terms | Source / Derivation
ADDR | *ALL* | USUBJID | Unique Subject Identifier | text | $20 | | ADSL.USUBJID
ADDR | *ALL* | SITEID | Site ID | text | $20 | | ADSL.SITEID
ADDR | *ALL* | SEX | Sex | text | $20 | M, F | ADSL.SEX
ADDR | *ALL* | FASFL | Full Analysis Set Population Flag | text | $1 | Y, N | ADSL.FASFL
ADDR | *ALL* | TRTPN | Planned Treatment (N) | integer | 1.0 | 1 = Placebo, 2 = Study Drug | ADSL.TRTPN
ADDR | *ALL* | TRTP | Planned Treatment | text | $20 | Placebo, Study Drug | ADSL.TRTP
ADDR | PARAMCD | PARAMCD | Parameter Code | text | $8 | ADDRATE |
ADDR | *ALL* | PARAM | Parameter | text | $50 | Average Daily Drinking Rate |

Analysis Variable Metadata including Analysis Parameter value level Metadata for ADDR (2)

Dataset Name | Parameter Identifier | Variable Name | Variable Label | Variable Type | Display Format | Codelist / Controlled Terms | Source / Derivation
ADDR | *ALL* | PARAMTYP | Parameter Type | text | $20 | DERIVED |
ADDR | *ALL* | AVISITN | Analysis Visit (N) | integer | 3.0 | 1 = Baseline, 2 = Treatment Period |
ADDR | *ALL* | AVISIT | Analysis Visit | text | $20 | Baseline, Treatment Period | 'Baseline' when SU.VISIT = 'Screening'; 'Treatment Period' when SU.VISIT in ('VISIT 1', 'VISIT 2', 'VISIT 3')
ADDR | *ALL* | AVAL | Analysis Value | float | 8.2 | | Average daily drinking rate within the analysis visit. At Treatment Period, if a patient discontinues early or has missing records, impute with the baseline rate
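The AVISIT and AVISITN derivation in this metadata amounts to a lookup on SU.VISIT. A minimal Python sketch follows; the names are hypothetical, and the case-insensitive match is an assumption made because the SU listing shows 'Visit 1' while the derivation text quotes 'VISIT 1'.

```python
# (AVISIT, AVISITN) keyed by the upper-cased SU.VISIT value.
AVISIT_BY_VISIT = {
    "SCREENING": ("Baseline", 1),
    "VISIT 1": ("Treatment Period", 2),
    "VISIT 2": ("Treatment Period", 2),
    "VISIT 3": ("Treatment Period", 2),
}


def derive_avisit(visit: str) -> tuple[str, int]:
    """Map SU.VISIT to (AVISIT, AVISITN); fail loudly on unmapped visits."""
    key = visit.strip().upper()  # assumption: compare case-insensitively
    if key not in AVISIT_BY_VISIT:
        raise ValueError(f"unmapped SU.VISIT value: {visit!r}")
    return AVISIT_BY_VISIT[key]


print(derive_avisit("Screening"))
print(derive_avisit("Visit 3"))
```

Raising on an unmapped visit, rather than returning a blank, makes an unexpected visit name surface during development instead of during validation.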

Analysis Variable Metadata including Analysis Parameter value level Metadata for ADDR (3)

Dataset Name | Parameter Identifier | Variable Name | Variable Label | Variable Type | Display Format | Codelist / Controlled Terms | Source / Derivation
ADDR | *ALL* | ABLFL | Baseline Record Flag | text | $1 | Y | 'Y' at AVISIT = "Baseline"
ADDR | *ALL* | BASE | Baseline Value | float | 8.2 | | AVAL of AVISIT = "Baseline"
ADDR | *ALL* | CHG | Change from Baseline | float | 8.2 | | AVAL - BASE
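The ABLFL, BASE and CHG derivations can be sketched as a single pass over the records. This is an illustrative Python version with a hypothetical function name. It leaves BASE and CHG empty on the baseline record, which is an assumption taken from the ADDR listings in this deck, and it uses the AVAL values from those listings.

```python
def add_baseline_and_change(records):
    """Return copies of BDS-style records with ABLFL, BASE and CHG filled in."""
    # One baseline AVAL per subject (one parameter in this example).
    base_by_subject = {
        r["USUBJID"]: r["AVAL"] for r in records if r["AVISIT"] == "Baseline"
    }
    out = []
    for r in records:
        row = dict(r)  # do not modify the caller's records
        if row["AVISIT"] == "Baseline":
            row.update(ABLFL="Y", BASE=None, CHG=None)
        else:
            base = base_by_subject.get(row["USUBJID"])
            chg = None if base is None else round(row["AVAL"] - base, 2)
            row.update(ABLFL="", BASE=base, CHG=chg)
        out.append(row)
    return out


addr = add_baseline_and_change([
    {"USUBJID": "001-01-001", "AVISIT": "Baseline", "AVAL": 4.40},
    {"USUBJID": "001-01-001", "AVISIT": "Treatment Period", "AVAL": 2.72},
    {"USUBJID": "001-01-002", "AVISIT": "Baseline", "AVAL": 4.26},
    {"USUBJID": "001-01-002", "AVISIT": "Treatment Period", "AVAL": 3.10},
])
for row in addr:
    print(row["USUBJID"], row["AVISIT"], row["ABLFL"], row["BASE"], row["CHG"])
```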

1st method: SDTM to ADaM

[Diagram: SDTM (SU) to ADaM (ADDR)]

Final ADaM dataset of ADDR

USUBJID | FASFL | TRTP | PARAMCD | PARAM | AVISIT | ABLFL | AVAL | BASE | CHG
001-01-001 | Y | Study Drug | ADDRATE | Average Daily Drinking Rate | Baseline | Y | 4.40 | |
001-01-001 | Y | Study Drug | ADDRATE | Average Daily Drinking Rate | Treatment Period | | 2.72 | 4.40 | -1.68
001-01-002 | Y | Placebo | ADDRATE | Average Daily Drinking Rate | Baseline | Y | 4.26 | |
001-01-002 | Y | Placebo | ADDRATE | Average Daily Drinking Rate | Treatment Period | | 3.10 | 4.26 | -1.16

Key points to note:
• Row 2: There are 3 missing assessments during the treatment period for subject 01-001, so the baseline rate imputation method was applied as follows:
  (2.60 * 39 + 4.40 * (42 - 39)) / 42 = 2.72
• Row 4: There are no missing assessments during the treatment period for subject 01-002.

2nd method: SDTM to intermediate permanent datasets to ADaM

[Diagram: SDTM (SU) to the intermediate permanent datasets SDTM+ (_SU) and ADaM+ (_ADDR) to ADaM (ADDR)]

Intermediate permanent dataset – SDTM plus _SU (1)

USUBJID | SUSEQ | SUTRT | SUSTAT | SUDOSE | SUDOSU | SUSTDTC | SUSTDY | VISIT | _HOSEQ | _SDSEQ
001-01-001 | 1 | ALCOHOL | | 0 | DRINKS | 2011-02-08 | -21 | Screening | 1 |
001-01-001 | 2 | ALCOHOL | NOT DONE | | DRINKS | 2011-02-09 | -20 | Screening | |
001-01-001 | 3 | ALCOHOL | | 5 | DRINKS | 2011-02-10 | -19 | Screening | 2 |
….
001-01-001 | 21 | ALCOHOL | | 0 | DRINKS | 2011-02-28 | -1 | Screening | 19 |
001-01-001 | 22 | ALCOHOL | | 0 | DRINKS | 2011-03-01 | 1 | Visit 1 | | 1
001-01-001 | 23 | ALCOHOL | NOT DONE | | DRINKS | 2011-03-02 | 2 | Visit 1 | |
001-01-001 | 24 | ALCOHOL | | 0 | DRINKS | 2011-03-03 | 3 | Visit 1 | | 2
001-01-001 | 25 | ALCOHOL | | 2 | DRINKS | 2011-03-04 | 4 | Visit 1 | | 3
001-01-001 | 26 | ALCOHOL | NOT DONE | | DRINKS | 2011-03-05 | 5 | Visit 1 | |
….
001-01-001 | 58 | ALCOHOL | NOT DONE | | DRINKS | 2011-04-06 | 37 | Visit 3 | |
001-01-001 | 59 | ALCOHOL | | 4 | DRINKS | 2011-04-07 | 38 | Visit 3 | | 35
001-01-001 | 60 | ALCOHOL | | 0 | DRINKS | 2011-04-08 | 39 | Visit 3 | | 36
001-01-001 | 61 | ALCOHOL | | 2 | DRINKS | 2011-04-09 | 40 | Visit 3 | | 37
001-01-001 | 62 | ALCOHOL | | 1 | DRINKS | 2011-04-10 | 41 | Visit 3 | | 38
001-01-001 | 63 | ALCOHOL | | 4 | DRINKS | 2011-04-11 | 42 | Visit 3 | | 39

Intermediate permanent dataset – SDTM plus _SU (2)

• _HOSEQ is the sequence number of non-missing drinking assessments from the hospitalization date (2011-02-08).
• _SDSEQ is the sequence number of non-missing drinking assessments from the first dose date (2011-03-01).
• When SUSTAT = 'NOT DONE', _HOSEQ and _SDSEQ are not incremented.
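The two counters can be sketched as running sequence numbers that skip NOT DONE records. The Python below is an illustration for a single subject with a hypothetical function name. Using the sign of SUSTDY to separate the baseline period from the treatment period is an assumption, and the input is only the handful of SU rows that the listing shows in full (the elided rows are left out).

```python
def add_sequence_numbers(su_records):
    """Add _HOSEQ and _SDSEQ to one subject's SU records."""
    ho_seq = 0  # done assessments counted from the hospitalization date
    sd_seq = 0  # done assessments counted from the first dose date
    out = []
    for rec in sorted(su_records, key=lambda r: r["SUSEQ"]):
        row = dict(rec, _HOSEQ=None, _SDSEQ=None)
        if rec.get("SUSTAT") != "NOT DONE":
            if rec["SUSTDY"] < 0:  # assumption: negative study day = baseline
                ho_seq += 1
                row["_HOSEQ"] = ho_seq
            else:
                sd_seq += 1
                row["_SDSEQ"] = sd_seq
        out.append(row)
    return out


# Excerpt of the SU listing: SUSEQ 1-3 and 22-26 only.
su = [
    {"SUSEQ": 1, "SUSTAT": "", "SUDOSE": 0, "SUSTDY": -21},
    {"SUSEQ": 2, "SUSTAT": "NOT DONE", "SUDOSE": None, "SUSTDY": -20},
    {"SUSEQ": 3, "SUSTAT": "", "SUDOSE": 5, "SUSTDY": -19},
    {"SUSEQ": 22, "SUSTAT": "", "SUDOSE": 0, "SUSTDY": 1},
    {"SUSEQ": 23, "SUSTAT": "NOT DONE", "SUDOSE": None, "SUSTDY": 2},
    {"SUSEQ": 24, "SUSTAT": "", "SUDOSE": 0, "SUSTDY": 3},
    {"SUSEQ": 25, "SUSTAT": "", "SUDOSE": 2, "SUSTDY": 4},
    {"SUSEQ": 26, "SUSTAT": "NOT DONE", "SUDOSE": None, "SUSTDY": 5},
]
plus_su = add_sequence_numbers(su)
for row in plus_su:
    print(row["SUSEQ"], row["_HOSEQ"], row["_SDSEQ"])
```

The plus dataset keeps every input record, including the NOT DONE ones, so its record count matches the source SU dataset.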

Intermediate permanent dataset – ADaM plus _ADDR (1)

USUBJID | TRTP | PARAM | AVISIT | ABLFL | AVAL | BASE | CHG | _TOTAL | _DAYS | _AVAL
001-01-001 | Study Drug | Average Daily Drinking Rate | Baseline | Y | 4.40 | | | 83.6 | 19 | 4.40
001-01-001 | Study Drug | Average Daily Drinking Rate | Treatment Period | | 2.72 | 4.40 | -1.68 | 101.2 | 39 | 2.60
001-01-002 | Placebo | Average Daily Drinking Rate | Baseline | Y | 4.26 | | | 89.4 | 21 | 4.26
001-01-002 | Placebo | Average Daily Drinking Rate | Treatment Period | | 3.10 | 4.26 | -1.16 | 130.2 | 42 | 3.10

Plus variables
• _TOTAL (Sum of doses per visit) = sum(SUDOSE)
• _DAYS (Number of non-missing drinking days per visit) = count of records with missing SUSTAT (that is, assessments that were done), or last._HOSEQ or last._SDSEQ within AVISIT
• _AVAL (Actual treatment rate) = _TOTAL / _DAYS
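The three plus variables are a grouped sum, a grouped count and their ratio. The Python sketch below illustrates this with a hypothetical function name. Its input is only an excerpt of the SU rows shown in the deck, so the resulting numbers are not the subject's real totals.

```python
from collections import defaultdict


def summarize_by_avisit(records):
    """Compute _TOTAL, _DAYS and _AVAL per (USUBJID, AVISIT)."""
    totals = defaultdict(float)
    days = defaultdict(int)
    for r in records:
        if r["SUSTAT"] == "NOT DONE":
            continue  # not-done assessments add neither dose nor day
        key = (r["USUBJID"], r["AVISIT"])
        totals[key] += r["SUDOSE"]
        days[key] += 1
    return {
        key: {"_TOTAL": totals[key], "_DAYS": days[key],
              "_AVAL": totals[key] / days[key]}
        for key in totals
    }


# Excerpt only: SUSEQ 1-3 (baseline) and 22-26 (treatment period).
SUBJ = "001-01-001"
excerpt = [
    {"USUBJID": SUBJ, "AVISIT": "Baseline", "SUSTAT": "", "SUDOSE": 0},
    {"USUBJID": SUBJ, "AVISIT": "Baseline", "SUSTAT": "NOT DONE", "SUDOSE": None},
    {"USUBJID": SUBJ, "AVISIT": "Baseline", "SUSTAT": "", "SUDOSE": 5},
    {"USUBJID": SUBJ, "AVISIT": "Treatment Period", "SUSTAT": "", "SUDOSE": 0},
    {"USUBJID": SUBJ, "AVISIT": "Treatment Period", "SUSTAT": "NOT DONE", "SUDOSE": None},
    {"USUBJID": SUBJ, "AVISIT": "Treatment Period", "SUSTAT": "", "SUDOSE": 0},
    {"USUBJID": SUBJ, "AVISIT": "Treatment Period", "SUSTAT": "", "SUDOSE": 2},
    {"USUBJID": SUBJ, "AVISIT": "Treatment Period", "SUSTAT": "NOT DONE", "SUDOSE": None},
]
summary = summarize_by_avisit(excerpt)
for key, values in summary.items():
    print(key, values)
```

A period in which every assessment is NOT DONE produces no entry at all, so a real program would still need a rule for that case.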

Intermediate permanent dataset – ADaM plus _ADDR (3)

USUBJID | TRTP | PARAM | AVISIT | ABLFL | AVAL | BASE | CHG | _TOTAL | _DAYS | _AVAL
001-01-001 | Study Drug | Average Daily Drinking Rate | Baseline | Y | 4.40 | | | 83.6 | 19 | 4.40
001-01-001 | Study Drug | Average Daily Drinking Rate | Treatment Period | | 2.72 | 4.40 | -1.68 | 101.2 | 39 | 2.60
001-01-002 | Placebo | Average Daily Drinking Rate | Baseline | Y | 4.26 | | | 89.4 | 21 | 4.26
001-01-002 | Placebo | Average Daily Drinking Rate | Treatment Period | | 3.10 | 4.26 | -1.16 | 130.2 | 42 | 3.10

Key points to note:
• Rows 2 and 4: in the treatment period, the AVAL algorithm is (_AVAL * _DAYS + BASE * (42 - _DAYS)) / 42
• Row 2: (2.60 * 39 + 4.40 * (42 - 39)) / 42 = 2.72
• Row 4: (3.10 * 42 + 4.26 * (42 - 42)) / 42 = 3.10

3rd method: SDTM to intermediate ADaM to ADaM

[Diagram: SDTM (SU) to ADaM (ADSU) to ADaM (ADDR)]

Intermediate ADaM dataset of ADSU (1)

USUBJID | PARAMCD | AVAL | ADT | AVISIT | VISIT | DTYPE | ASEQ | SUSEQ
001-01-001 | DDRATE | 0 | 2011-02-08 | Baseline | Screening | | 1 | 1
001-01-001 | DDRATE | 5 | 2011-02-10 | Baseline | Screening | | 2 | 3
….
001-01-001 | DDRATE | 0 | 2011-02-28 | Baseline | Screening | | 19 | 21
001-01-001 | DDRATE | 4.4 | | Baseline | | AVERAGE | 20 |
001-01-001 | DDRATE | 0 | 2011-03-01 | Treatment Period | Visit 1 | | 21 | 22
001-01-001 | DDRATE | 4.4 | 2011-03-02 | Treatment Period | Visit 1 | BLCF | 22 | 23
001-01-001 | DDRATE | 0 | 2011-03-03 | Treatment Period | Visit 1 | | 23 | 24
001-01-001 | DDRATE | 2 | 2011-03-04 | Treatment Period | Visit 1 | | 24 | 25
001-01-001 | DDRATE | 4.4 | 2011-03-05 | Treatment Period | Visit 1 | BLCF | 25 | 26
….
001-01-001 | DDRATE | 4.4 | 2011-04-06 | Treatment Period | Visit 3 | BLCF | 57 | 58
001-01-001 | DDRATE | 4 | 2011-04-07 | Treatment Period | Visit 3 | | 58 | 59
001-01-001 | DDRATE | 0 | 2011-04-08 | Treatment Period | Visit 3 | | 59 | 60
001-01-001 | DDRATE | 2 | 2011-04-09 | Treatment Period | Visit 3 | | 60 | 61
001-01-001 | DDRATE | 1 | 2011-04-10 | Treatment Period | Visit 3 | | 61 | 62
001-01-001 | DDRATE | 4 | 2011-04-11 | Treatment Period | Visit 3 | | 62 | 63
001-01-001 | DDRATE | 2.72 | | Treatment Period | | AVERAGE | 63 |

Intermediate ADaM dataset of ADSU (2)

• 'NOT DONE' data from SU were not included in ADSU.
• At the baseline visit, we only include 19 records for 01-001. We used DTYPE = 'AVERAGE' to derive the average of assessed doses at ASEQ = 20.
• At the treatment period visit, we only include 39 records. We used DTYPE = 'AVERAGE' to derive the average of assessed doses at ASEQ = 63.
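The derived rows in ADSU can be sketched as follows. This Python illustration follows the ADSU listing, which shows baseline-rate rows flagged DTYPE = 'BLCF' for the treatment-period days without an assessment, followed by a DTYPE = 'AVERAGE' row. The function name and the short dose series are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
def build_period_rows(daily_doses, baseline_rate=None):
    """Build the rows of one analysis visit plus a DTYPE='AVERAGE' row.

    daily_doses: doses in date order, with None for a NOT DONE day.
    baseline_rate: pass it for the treatment period so that NOT DONE days
    become BLCF rows; leave it out at baseline, where such days are dropped.
    """
    rows = []
    for dose in daily_doses:
        if dose is not None:
            rows.append({"AVAL": dose, "DTYPE": ""})
        elif baseline_rate is not None:
            rows.append({"AVAL": baseline_rate, "DTYPE": "BLCF"})
    if not rows:
        raise ValueError("no assessed days in this period")
    average = sum(r["AVAL"] for r in rows) / len(rows)
    rows.append({"AVAL": average, "DTYPE": "AVERAGE"})
    return rows


# Hypothetical 3-day baseline and 4-day treatment period.
baseline_rows = build_period_rows([2, None, 4])
rb = baseline_rows[-1]["AVAL"]                   # (2 + 4) / 2
treatment_rows = build_period_rows([0, None, 2, 4], baseline_rate=rb)
rt = treatment_rows[-1]["AVAL"]                  # (0 + rb + 2 + 4) / 4
print(rb, rt)
```

Averaging over observed and BLCF rows gives the same value as the imputation formula (Ra * DAYS + Rb * (N - DAYS)) / N, where N is the length of the period; the AVERAGE row is the one the final ADDR record can point back to through SRCDOM and SRCSEQ.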

Final ADaM dataset of ADDR

USUBJID | TRTP | PARAM | AVISIT | ABLFL | AVAL | BASE | CHG | SRCDOM | SRCSEQ
001-01-001 | Study Drug | Average Daily Drinking Rate | Baseline | Y | 4.40 | | | ADSU | 20
001-01-001 | Study Drug | Average Daily Drinking Rate | Treatment Period | | 2.72 | 4.40 | -1.68 | ADSU | 63
001-01-002 | Placebo | Average Daily Drinking Rate | Baseline | Y | 4.26 | | | ADSU | 22
001-01-002 | Placebo | Average Daily Drinking Rate | Treatment Period | | 3.10 | 4.26 | -1.16 | ADSU | 65

Key points to note:
• All the records come from ADSU.
• Great data point traceability.

Example 2: Intermediate Time to Event permanent ADaM plus dataset

USUBJID | TRTP | PARAM | AVAL | STARTDT | ADT | CNSR | EVNTDESC | _DSDECOD | _DSDTC | _SVXSTDTC | _AEXDT
001-01-001 | Study Drug 1 | Death | 157 | 2011-01-04 | 2011-06-10 | 1 | COMPLETED THE STUDY | COMPLETED THE STUDY | 2011-06-10 | 2011-06-10 | 2011-05-04
001-01-002 | Study Drug 2 | Death | 116 | 2011-02-01 | 2011-05-28 | 1 | LOST TO FOLLOW-UP | LOST TO FOLLOW-UP | 2011-05-28 | 2011-05-28 | 2011-05-01
001-01-003 | Study Drug 2 | Death | 88 | 2011-02-05 | 2011-05-04 | 0 | DEATH | DEATH | 2011-05-04 | 2011-05-04 | 2011-05-04
001-01-004 | Study Drug 1 | Death | 102 | 2011-03-20 | 2011-06-30 | 1 | ONGOING | | | 2011-06-30 | 2011-06-04
001-01-005 | Study Drug 1 | Death | 101 | 2011-03-26 | 2011-07-05 | 1 | ONGOING | | | 2011-07-01 | 2011-07-05

AVAL = ADT – STARTDT

Plus variables
• _DSDECOD = DS.DSDECOD when DS.DSCAT = "DISPOSITION EVENT"
• _DSDTC = DS.DSDTC when DS.DSCAT = "DISPOSITION EVENT"
• _SVXSTDTC = Last study visit date
• _AEXDT = Last AE date
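The AVAL derivation here is a plain date difference with no added day, which can be checked against the listed values. A minimal Python sketch follows; the function name is hypothetical, and the dates are assumed to be ISO 8601 strings written with ASCII hyphens.

```python
from datetime import date


def time_to_event_days(startdt: str, adt: str) -> int:
    """AVAL = ADT - STARTDT, in days."""
    return (date.fromisoformat(adt) - date.fromisoformat(startdt)).days


# STARTDT, ADT and the listed AVAL for the five subjects in the example.
listed = [
    ("2011-01-04", "2011-06-10", 157),
    ("2011-02-01", "2011-05-28", 116),
    ("2011-02-05", "2011-05-04", 88),
    ("2011-03-20", "2011-06-30", 102),
    ("2011-03-26", "2011-07-05", 101),
]
for startdt, adt, aval in listed:
    print(startdt, adt, time_to_event_days(startdt, adt), aval)
```

Keeping plus variables such as _DSDTC, _SVXSTDTC and _AEXDT next to ADT lets a validator see which candidate date a record's ADT matches.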

1st method: SDTM to ADaM

The benefits are
• Simple process
The limitations are
• A lack of data point traceability (traceability will be provided with Define.xml)
• Difficult to troubleshoot issues if the development SAS programmer and the validation SAS programmer do not agree on issues in the final ADaM dataset.

2nd method: SDTM through intermediate permanent datasets to final ADaM

The benefits are
• Easy to follow each step and to validate
• Flexibility in the data structure of the intermediate datasets (a programmer does not need to follow CDISC standards in the intermediate permanent datasets)
The limitations are
• A lack of data point traceability, especially for the reviewers.

Business rules for plus datasets

• Plus datasets
  • Same SAS program as the final ADaM dataset development program: we do not have separate dataset programs for the intermediate permanent datasets.
  • Same number of records: we keep the same number of records between SDTM datasets and SDTM plus datasets, and between ADaM datasets and ADaM plus datasets.
  • Naming convention: the prefix '_' followed by the name of the original SDTM or final ADaM dataset
• Plus variables
  • Temporary variables are named with the prefix '_'.
  • No standard for plus variables: we assign labels, but do not follow any CDISC standards.
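Two of these rules (same number of records, '_' prefix on plus variables) are mechanical enough to check automatically. The Python sketch below is one possible check with a hypothetical function name. The deck does not describe such a tool, and the sample rows are illustrative only.

```python
def check_plus_dataset(base_rows, plus_rows):
    """Return a list of rule violations for a plus dataset."""
    problems = []
    # Rule: same number of records as the SDTM or final ADaM dataset.
    if len(base_rows) != len(plus_rows):
        problems.append(
            f"record count differs: {len(base_rows)} vs {len(plus_rows)}"
        )
    # Rule: every variable added by the plus dataset starts with '_'.
    base_vars = set().union(*(row.keys() for row in base_rows))
    plus_vars = set().union(*(row.keys() for row in plus_rows))
    for name in sorted(plus_vars - base_vars):
        if not name.startswith("_"):
            problems.append(f"plus variable without '_' prefix: {name}")
    return problems


su = [{"USUBJID": "001-01-001", "SUSEQ": 1},
      {"USUBJID": "001-01-001", "SUSEQ": 2}]
good_plus = [{"USUBJID": "001-01-001", "SUSEQ": 1, "_HOSEQ": 1},
             {"USUBJID": "001-01-001", "SUSEQ": 2, "_HOSEQ": None}]
bad_plus = [{"USUBJID": "001-01-001", "SUSEQ": 1, "HOSEQ": 1}]

print(check_plus_dataset(su, good_plus))
print(check_plus_dataset(su, bad_plus))
```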

3rd method: SDTM through ADaM to final ADaM

The benefits are
• Easy to follow each step
• Great data point traceability
The limitations are
• Need to create and validate all ADaM datasets, including the intermediate ADaM datasets
• Little flexibility, because the intermediate datasets must themselves be ADaM datasets

Consideration

Datasets which will be submitted
• SDTM to ADaM method
  1. SDTM
  2. Final ADaM
• SDTM through the intermediate permanent datasets to ADaM method
  1. SDTM
  2. Final ADaM
• SDTM through ADaM to ADaM method
  1. SDTM
  2. Intermediate ADaM
  3. Final ADaM

Conclusion

• Three methods for a complex ADaM dataset
  1. SDTM datasets to ADaM datasets
  2. SDTM datasets through the intermediate permanent datasets to final ADaM datasets
  3. SDTM datasets through the intermediate ADaM datasets to final ADaM datasets
• More options for complex ADaM dataset creation
• The analysis will dictate which method to use