- Outline
- Statistical Inference, Hypothesis Testing, Type 1 Error, Power, and Sample Size
- Randomization and Blinding
- Multiplicity in Clinical Trials
- Types of Endpoints
- Hypothesis testing
- Start with a null hypothesis, which usually states that there is no difference between treatment groups
- The P-value is the probability of observing data at least as extreme as those observed if the null hypothesis were true, i.e., by chance alone
- A small P-value is evidence against the null hypothesis
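To make the definition of a P-value concrete, here is a minimal sketch of an exact permutation test for a difference in means. It is one illustrative way to compute a P-value, not the method of any particular trial; the function name and the outcome values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation P-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    total = 0
    # Under the null hypothesis the group labels are exchangeable, so every
    # way of assigning n_a patients to group A is equally likely.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Count relabelings at least as extreme as the observed difference.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical outcome values, for illustration only
treatment = [5.1, 4.9, 6.2, 5.8]
placebo = [4.0, 4.2, 3.9, 4.5]
p_value = exact_permutation_p_value(treatment, placebo)
```

The returned value is the fraction of label assignments that produce a difference at least as large as the one observed, which is exactly "how often chance alone would do this" under the null hypothesis.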
- Statistical significance and clinical significance
- Statistical significance may simply indicate that the difference is distinguishable from zero; it does not necessarily make the difference clinically significant
- Power and Sample Size
- Trials are typically designed with 80 to 90% power to detect a small but clinically meaningful effect
- Consider what treatment effect size we want to detect
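The link between power, effect size, and sample size can be sketched with the standard normal-approximation formula for comparing two means with equal group sizes. This is a rough planning sketch under those assumptions; the function name and the example numbers are hypothetical, and a real trial would use validated software.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sd, power=0.80, alpha=0.05):
    """Approximate patients per group for a two-sided test of two means.

    effect: smallest treatment difference worth detecting (assumed)
    sd:     common standard deviation of the outcome (assumed)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# Hypothetical design: detect a 5-unit difference when the SD is 10
n_80 = n_per_group(effect=5, sd=10, power=0.80)
n_90 = n_per_group(effect=5, sd=10, power=0.90)
```

Raising power from 80% to 90%, or halving the effect size to be detected, increases the required sample size, which is why the target effect has to be chosen before the trial is sized.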
- Randomization and Blinding
- Bias is a systematic error
- Blinding is a technique often utilized to reduce bias
- Random error is unpredictable, e.g., innate differences between patients
- Estimates and confidence intervals
- We want the results of the study to allow us to draw inferences about the population from the sample
- The power of a study is the sponsor’s risk, so the sponsor decides whether to design for 90% or 80% power
- Multiplicity
- Clinical trials often include multiple hypothesis tests, which increases the chance of obtaining at least one significant P-value
- Example: PROMISE-1 Key Efficacy Results
- One can control the Type 1 error rate for the experiment by applying a multiplicity correction
- More statistical tests increase the likelihood of Type 1 errors; with enough tests, at least one Type 1 error becomes very likely
- Clinical trials have numerous statistical tests, providing opportunities for a Type 1 error to occur
- Sources of Multiple Tests: multiple endpoints, multiple dose groups, multiple timepoints
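How quickly the chance of at least one Type 1 error grows can be sketched with a short calculation. It assumes the tests are independent and that every null hypothesis is true, which is a simplification; the function name is hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    # Each test avoids a Type 1 error with probability (1 - alpha).
    return 1 - (1 - alpha) ** n_tests


# Unadjusted error rates for 1, 5, 10, and 20 tests at alpha = 0.05
rates = {k: familywise_error_rate(k) for k in (1, 5, 10, 20)}
```

With a single test the rate equals alpha, but with ten unadjusted tests it is already about 40%.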
- Controlling the Experiment-Wise Error Rate
- Goal in trials: requiring P < 0.05 ensures that the Type 1 error rate for a given test is at most 5%
- Main Approaches for Adjusting for Multiplicity
- Bonferroni Method
- Improved Bonferroni Methods
- Fixed-Sequence Method
- Gate Keeping Approaches, Fallback Approaches, Combination Approaches, etc.
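The first three approaches listed above can be sketched in a few lines. Holm's step-down procedure is used here as one example of an improved Bonferroni method; that choice, the function names, and the P-values are assumptions for illustration rather than details taken from the slides.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose P-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down: test the smallest P-value first, relaxing the bar."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break  # once one test fails, all larger P-values fail too
    return rejected


def fixed_sequence(p_values, alpha=0.05):
    """Test in the prespecified order at full alpha; stop at first failure."""
    rejected = [False] * len(p_values)
    for i, p in enumerate(p_values):
        if p > alpha:
            break
        rejected[i] = True
    return rejected


# Hypothetical P-values for three endpoints
p = [0.01, 0.02, 0.04]
bonf_result = bonferroni(p)
holm_result = holm(p)
seq_result = fixed_sequence([0.01, 0.06, 0.02])
```

On these hypothetical values, Bonferroni rejects only the first hypothesis while Holm rejects all three, and the fixed-sequence method stops at the second endpoint even though the third P-value is small.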
- Regulatory Claims
- Draft guidance from FDA in January 2017 states that primary endpoints should be statistically significant before proceeding to the secondary endpoints
- This means claims cannot be established for secondary endpoints until statistical significance is shown for the primary endpoints
- Requires strong control of the Type 1 error rate to demonstrate additional effects
- Adjusted P-Values
- When an analysis produces multiple P-values, they can be adjusted for multiplicity so that each adjusted P-value is compared directly with the overall alpha
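As a minimal sketch of what an adjusted P-value looks like, the functions below compute Bonferroni-adjusted and Holm-adjusted values. The slides do not say which adjustment is meant, so both methods, the function names, and the input values are assumptions.

```python
def bonferroni_adjusted(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm_adjusted(p_values):
    """Holm-adjusted P-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The multiplier shrinks as smaller P-values are dealt with first;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw P-values
raw = [0.01, 0.02, 0.04]
bonf_adj = bonferroni_adjusted(raw)
holm_adj = holm_adjusted(raw)
```

A hypothesis is rejected when its adjusted P-value is at most the overall alpha, which gives the same decisions as comparing raw P-values against adjusted thresholds.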
- Multiple Endpoints vs. Co-Primary Endpoints
- No multiplicity adjustment is needed for co-primary endpoints, i.e., endpoints that must all be statistically significant
- This situation might arise as a requirement from FDA, not necessarily the sponsor
- Graphical Approach for Adjusting for Multiplicity
- Divide alpha among the hypothesis tests (e.g., into thirds for three tests) and compare each P-value with its share of alpha
- Arrows show where a test's share of alpha goes once its hypothesis is rejected
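The alpha-passing idea can be sketched as a small weighted graph: each hypothesis holds a share of alpha, and each arrow carries a fraction of that share to another hypothesis after a rejection. This follows the commonly published update rules for sequentially rejective graphical procedures; the function name, the weights, and the arrow values are assumptions, not the graph from the slides.

```python
def graphical_procedure(p_values, weights, transitions, alpha=0.05):
    """Sequentially rejective graphical test; returns a list of rejections.

    weights[i]        : initial share of alpha for hypothesis i (sums to <= 1)
    transitions[i][k] : fraction of i's alpha passed to k when i is rejected
    """
    m = len(p_values)
    w = list(weights)
    g = [list(row) for row in transitions]
    active = set(range(m))
    rejected = [False] * m

    while True:
        # Find an active hypothesis whose P-value meets its current local alpha.
        j = next((i for i in sorted(active) if p_values[i] <= w[i] * alpha), None)
        if j is None:
            return rejected
        rejected[j] = True
        active.remove(j)

        # Pass the alpha of the rejected hypothesis along its outgoing arrows.
        new_w = {l: w[l] + w[j] * g[j][l] for l in active}
        # Re-route arrows that used to pass through the rejected hypothesis.
        new_g = {}
        for l in active:
            for k in active:
                denom = 1 - g[l][j] * g[j][l]
                if l == k or denom <= 0:
                    new_g[(l, k)] = 0.0
                else:
                    new_g[(l, k)] = (g[l][k] + g[l][j] * g[j][k]) / denom
        for l in active:
            w[l] = new_w[l]
            for k in active:
                g[l][k] = new_g[(l, k)]
        w[j] = 0.0


# Hypothetical graph: alpha split in thirds, each arrow passes half to the others
thirds = [1 / 3, 1 / 3, 1 / 3]
arrows = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
result = graphical_procedure([0.01, 0.02, 0.04], thirds, arrows)
```

With equal weights and symmetric arrows the procedure behaves like Holm's method; putting all the weight on the first hypothesis with a single chain of arrows reproduces the fixed-sequence method.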
- Types of Endpoints
- Continuous
- Treatment effect measured as mean difference between groups
- Simple Analysis Approach: T-Test
- Ordinal
- Includes rating scales, e.g., the happy-to-sad faces used in hospital pain rating scales
- Model-Based Analysis
- Dichotomous/Binary
- Example: in oncology studies, the endpoint may be whether or not the tumor shrinks
- Simple Analysis Approach: Chi-Square Test
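For a binary endpoint compared between two arms, the chi-square test can be sketched directly from the 2x2 table of responders and non-responders. This sketch uses Pearson's statistic without a continuity correction; the function name and the counts are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(resp_a, n_a, resp_b, n_b):
    """Pearson chi-square test comparing two response proportions."""
    a, b = resp_a, n_a - resp_a  # arm A: responders, non-responders
    c, d = resp_b, n_b - resp_b  # arm B: responders, non-responders
    n = n_a + n_b
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical trial: 30/100 responders on treatment vs. 15/100 on placebo
stat, p_value = chi_square_2x2(30, 100, 15, 100)
```

With small expected cell counts, an exact test is usually preferred over this approximation.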
- Survival or Time-To-Event
- Most Common Analysis: Cox Regression
- Composite Endpoints
- Combine multiple component measures into a single overall score
- Kaplan-Meier Curves
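A Kaplan-Meier curve is a step function that drops at each observed event time while accounting for patients who were censored. The sketch below computes the estimated survival probability at each event time; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return (time, survival probability) pairs at each event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for x in times if x >= t)
        # Events observed exactly at time t
        n_events = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up (months); 0 marks a censored patient
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients leave the risk set without causing a drop in the curve, which is how the method uses partial follow-up instead of discarding it.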
- Continuous
- Responder Analysis
- We want to detect small differences that might be clinically significant, especially for continuous variables, e.g., blood pressure, ACR 20 scale for rheumatoid arthritis, etc.
- Clinical and Surrogate Endpoints
- A clinical endpoint is a measure of how a patient feels, functions, or survives, whereas a surrogate endpoint is believed to predict clinical benefit and is often used in place of a clinical endpoint
- Missing Data and Bias
- The most common reason for missing data is early discontinuation: patients drop out, so outcomes that ideally would have been measured through the end of the study can no longer be collected
- Missing data provide an opportunity for bias to play a role
- Statistical techniques might be used to adjust for bias
- Handling Missing Data
- No technique is perfect!
- The ideal scenario is to follow everyone enrolled in a clinical trial from start to finish
- Thank You!