Methods and Applications of Statistics in Clinical Trials, Volume 1: Concepts, Principles, Trials, and Designs
About this ebook

A complete guide to the key statistical concepts essential for the design and construction of clinical trials

As the newest major resource in the field of medical research, Methods and Applications of Statistics in Clinical Trials, Volume 1: Concepts, Principles, Trials, and Designs presents a timely and authoritative review of the central statistical concepts used to build clinical trials that obtain the best results. The reference unveils modern approaches vital to understanding, creating, and evaluating data obtained throughout the various stages of clinical trial design and analysis.

Accessible and comprehensive, the first volume in a two-part set includes newly written articles as well as established literature from the Wiley Encyclopedia of Clinical Trials. Illustrating a variety of statistical concepts and principles such as longitudinal data, missing data, covariates, biased-coin randomization, repeated measurements, and simple randomization, the book also provides in-depth coverage of the various trial designs found within Phase I–IV trials. Methods and Applications of Statistics in Clinical Trials, Volume 1: Concepts, Principles, Trials, and Designs also features:

  • Detailed chapters on the type of trial designs, such as adaptive, crossover, group-randomized, multicenter, non-inferiority, non-randomized, open-labeled, preference, prevention, and superiority trials
  • Over 100 contributions from leading academics, researchers, and practitioners
  • An exploration of ongoing, cutting-edge clinical trials on early cancer and heart disease, mother-to-child human immunodeficiency virus transmission trials, and the AIDS Clinical Trials Group

Methods and Applications of Statistics in Clinical Trials, Volume 1: Concepts, Principles, Trials, and Designs is an excellent reference for researchers, practitioners, and students in the fields of clinical trials, pharmaceutics, biostatistics, medical research design, biology, biomedicine, epidemiology, and public health.

Language: English
Publisher: Wiley
Release date: Mar 5, 2014
ISBN: 9781118595916

    Methods and Applications of Statistics in Clinical Trials, Volume 1 - N. Balakrishnan

    Chapter 1

    Absolute Risk Reduction

    Robert G. Newcombe

    1.1 Introduction

    Many response variables in clinical trials are binary: the treatment was successful or unsuccessful; the adverse effect did or did not occur. Binary variables are summarized by proportions, which may be compared between different arms of a study by calculating either an absolute difference of proportions or a relative measure, the relative risk or the odds ratio. In this article we consider several point and interval estimates for the absolute difference between two proportions, for both unpaired and paired study designs. The simplest methods encounter problems when numerators or denominators are small; accordingly, better methods are introduced. Because confidence interval methods for differences of proportions are derived from related methods for the simpler case of the single proportion, which itself can also be of interest in a clinical trial, this case is also considered in some depth. Illustrative examples relating to data from two clinical trials are shown.

    1.2 Preliminary Issues

    In most clinical trials, the unit of data is the individual, and statistical analyses for efficacy and safety outcomes compare responses between the two (or more) treatment groups. When subjects are randomized between these groups, responses of subjects in one group are independent of those in the other group. This leads to unpaired analyses. Crossover and split-unit designs require paired analyses. These have many features in common with the unpaired analyses and will be described in the final section.

    Thus, we study n1 individuals in group 1 and n2 individuals in group 2. Usually, all analyses are conditional on n1 and n2. Analyses conditional on n1 and n2 would also be appropriate in other types of prospective studies or in cross-sectional designs. (Some hypothesis testing procedures such as the Fisher exact test are conditional also on the total number of successes in the two groups combined. This alternative conditioning is inappropriate for confidence intervals for a difference of proportions; in particular in the event that no successes are observed in either group, this approach fails to produce an interval.) The outcome variable is binary: 1 if the event of interest occurs, 0 if it does not. (We do not consider here the case of an integer-valued outcome variable; typically, this involves the number of episodes of relapse or hospitalization, number of accidents, or similar events occurring within a defined follow-up period. Such an outcome would instead be modeled by the Poisson distribution.) We observe that r1 subjects in group 1 and r2 subjects in group 2 experience the event of interest. Then the proportions having the event in the two groups are given by p1 = r1/n1 and p2 = r2/n2. If responses in different individuals in each group are independent, then the distribution of the number of events in each group is binomial.

    Several effect size measures are widely used for comparison of two independent proportions:

    Difference of proportions p1 − p2

    Ratio of proportions (risk ratio or relative risk) p1/p2

    Odds ratio (p1/(1 − p1))/(p2/(1 − p2))
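
    To make these definitions concrete, the following minimal Python sketch (the function name and the counts in the example are illustrative, not taken from any trial discussed here) computes all three measures from observed counts:

```python
def effect_measures(r1, n1, r2, n2):
    """Return (risk difference, risk ratio, odds ratio) for two independent groups."""
    p1, p2 = r1 / n1, r2 / n2
    risk_difference = p1 - p2                       # absolute risk reduction
    risk_ratio = p1 / p2                            # relative risk; requires p2 > 0
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))  # requires 0 < p1, p2 < 1
    return risk_difference, risk_ratio, odds_ratio

# Illustrative counts: 14/29 events in group 1, 8/30 in group 2
print(effect_measures(14, 29, 8, 30))
```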

    In this article we consider in particular the difference between two proportions, p1 − p2, as a measure of effect size. This is variously referred to as the absolute risk reduction, risk difference, or success rate difference. Other articles in this work describe the risk ratio or relative risk and the odds ratio. We consider both point and interval estimates, in recognition that confidence intervals convey information about magnitude and precision of effect simultaneously, keeping these two aspects of measurement closely linked [1]. In the clinical trial context, a difference between two proportions is often referred to as an absolute risk reduction. However, it should be borne in mind that any term that includes the word reduction really presupposes that the direction of the difference will be a reduction in risk—such terminology becomes awkward when the anticipated benefit does not materialize, including the nonsignificant case when the confidence interval for the difference extends beyond the null hypothesis value of zero. The same applies to the relative risk reduction, 1 − p1/p2. Whenever results are presented, it is vitally important that the direction of the observed difference should be made unequivocally clear. Moreover, sometimes confusing labels are used, which might be interpreted to mean something other than p1 − p2; for example, Hashemi et al. [2] refer to p1 − p2 as attributable risk. It is also vital to distinguish between relative and absolute risk reduction.

    In clinical trials, as in other prospective and cross-sectional designs already described, each of the three quantities we have discussed may validly be used as a measure of effect size. The risk difference and risk ratio compare two proportions from different perspectives. A halving of risk will have much greater population impact for a common outcome than for an infrequent one. Schechtman [3] recommends that both a relative and an absolute measure should always be reported, with appropriate confidence intervals.

    The odds ratio is discussed at length by Agresti [4]. It is widely regarded as having a special preferred status on account of its role in retrospective case-control studies and in logistic regression and meta-analysis. Nevertheless, it should not be regarded as having gold standard status as a measure of effect size for the 2 × 2 table [3,5].

    1.3 Point and Interval Estimates for a Single Proportion

    Before considering the difference between two independent proportions in detail, we first consider some of the issues that arise in relation to the fundamental task of estimating a single proportion. These issues have repercussions for the comparison of proportions because confidence interval methods for p1 − p2 are generally based closely on those for proportions. The single proportion is also relevant to clinical trials in its own right. For example, in a clinical trial comparing surgical versus conservative management, we would be concerned with estimating the incidence of a particular complication of surgery such as postoperative bleeding, even though there is no question of obtaining a contrasting value in the conservative group or of formally comparing these.

    The most commonly used estimator for the population proportion π is the familiar empirical estimate, namely, the observed proportion p = r/n (we also write q = 1 − p). Given n, the random variable R denoting the number of subjects in which the response occurs has the binomial B(n, π) distribution, with Pr[R = r] = {n!/(r!(n − r)!)} π^r (1 − π)^(n−r). The simple empirical estimator is also the maximum likelihood estimate for the binomial distribution, and it is unbiased—in the usual statistical sense that the expectation of R given n satisfies

    E[R | n] = nπ, so that E[p | n] = π.

    However, when r = 0, many users of statistical methods are uneasy with the idea that p = 0 is an unbiased estimate. The range of possible values for π is the interval from 0 to 1. Generally, this means the open interval 0 < π < 1, not the closed interval 0 ≤ π ≤ 1, as usually it would already be known that the event sometimes occurs and sometimes does not. As the true value of π cannot then be negative or zero but must be greater than zero, the notion that p = 0 should be regarded as an unbiased estimate of π seems highly counterintuitive.

    Largely with this issue in mind, alternative estimators known as shrinkage estimators are available. These generally take the form p̃ = (r + ψ)/(n + 2ψ) for some ψ > 0. The quantity ψ is known as a pseudo-frequency. Essentially, ψ observations are added to the number of successes and also to the number of failures. The resulting estimate p̃ is intermediate between the empirical estimate p = r/n and ½, which is the midpoint and center of symmetry of the support scale from 0 to 1. The degree of shrinkage toward ½ is great when n is small and minor for large n. Bayesian analyses of proportions lead naturally to shrinkage estimators, with ψ = 1 and ψ = ½ corresponding to the most widely used uninformative conjugate priors, the uniform prior B(1,1) and the Jeffreys prior B(½, ½).
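
    As a numerical illustration of the ψ-parametrization just described (a sketch; the function name is illustrative): with ψ = 1 and r = 0 out of n = 20, the estimate is (0 + 1)/(20 + 2) ≈ 0.045 rather than 0.

```python
def shrinkage_estimate(r, n, psi=1.0):
    """Shrinkage (pseudo-frequency) estimator (r + psi) / (n + 2*psi)."""
    return (r + psi) / (n + 2 * psi)

print(shrinkage_estimate(0, 20))       # 1/22 = 0.0455, pulled away from 0 toward 1/2
print(shrinkage_estimate(0, 20, 0.5))  # 0.5/21 = 0.0238, the Jeffreys-prior posterior mean
```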

    It also is important to report confidence intervals, to express the uncertainty due to sampling variation and finite sample size. The simplest interval, p ± z × SE(p), where SE(p) = √(pq/n), remains the most commonly used. This is usually called the Wald interval. Here, z denotes the relevant quantile of the standard Gaussian distribution. Standard practice is to use intervals that aim to have 95% coverage, with 2.5% noncoverage in each tail, leading to z = 1.9600.
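
    A minimal sketch of the Wald interval (the function name is illustrative; the example exhibits the boundary violation discussed below):

```python
from math import sqrt

def wald_interval(r, n, z=1.96):
    """Simple Wald interval p ± z*sqrt(p*(1-p)/n); the limits can overshoot [0, 1]."""
    p = r / n
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

print(wald_interval(1, 14))  # lower limit is negative, a boundary violation
```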

    Unfortunately, confidence intervals for proportions and their differences do not achieve their nominal coverage properties. This is because the sample space is discrete and bounded. The Wald method for the single proportion has three unfavorable properties [6–9]. These can all be traced to the interval’s simple symmetry about the empirical estimate.

    1. The achieved coverage is much lower than the nominal value. For some values of π, the achieved coverage probability is close to zero.

    2. The noncoverage probabilities in the two tails are very different. The location of the interval is too distal—too far out from the center of symmetry of the scale, ½. The noncoverage of the interval is predominantly mesial.

    3. The calculated limits often violate the boundaries at 0 and 1. In particular, when r = 0, a degenerate, zero-width interval results. For small non-zero values of r (1, 2, and sometimes 3 for a 95% interval), the calculated lower limit is below zero. The resulting interval is usually truncated at zero, but this is unsatisfactory as the data tell us that 0 is an impossible value for π. Corresponding anomalous behavior at the upper boundary occurs when n − r is 0 or small.

    Many improved methods for confidence intervals for proportions have been developed. The properties of these methods are evaluated by choosing suitable parameter space points (here, combinations of n and π), using these to generate large numbers of simulated random samples, and recording how often the resulting confidence interval includes the true value π. The resulting coverage probabilities are then summarized by calculating the mean coverage and minimum coverage across the simulated datasets.

    Generally, the improved methods obviate the boundary violation problem, and improve coverage and location. The most widely researched options are as follows.

    A continuity correction may be incorporated: p ± {z√(pq/n) + 1/(2n)}. This certainly improves coverage and obviates zero-width intervals but increases the incidence of boundary overflow.

    The Wilson score method [10] uses the theoretical value π, not the empirical estimate p, in the formula for the standard error of p. Lower and upper limits are obtained as the two solutions of the equation p = π ± z × SE(π) = π ± z√(π(1 − π)/n), which reduces to a quadratic in π. The two roots are given in closed form as

    (2np + z² ± z√(z² + 4npq)) / (2(n + z²)).

    It is easily demonstrated [7] that the resulting interval is symmetrical on the logit scale—the other natural scale for proportions—by considering the product of the two roots for π, and likewise for 1 − π. The resulting interval is boundary respecting and has appropriate mean coverage. In contrast to the Wald interval, location is rather too mesial.
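
    The closed form above translates directly into code; a sketch (the function name is illustrative):

```python
from math import sqrt

def wilson_interval(r, n, z=1.96):
    """Wilson score interval: the two roots of p = pi ± z*sqrt(pi*(1-pi)/n).
    The limits always respect the boundaries 0 and 1."""
    p = r / n
    centre = 2 * n * p + z * z
    halfwidth = z * sqrt(z * z + 4 * n * p * (1 - p))
    denom = 2 * (n + z * z)
    return (centre - halfwidth) / denom, (centre + halfwidth) / denom

print(wilson_interval(0, 14))  # lower limit 0, but a sensible non-degenerate upper limit
```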

    The midpoint of the score interval, on the ordinary additive scale, is a shrinkage estimator with ψ = z²/2, which is 1.92 for the default 95% interval. With this (and also Bayesian intervals) in mind, Agresti and Coull [8] proposed a pseudo-frequency method, which adds ψ = 2 to the numbers of successes (r) and failures (n − r) before using the ordinary Wald formula. This is also a great improvement over the Wald method, and is computationally and conceptually very simple. It reduces but does not eliminate the boundary violation problem. A variety of alternatives can be formulated, with different choices for ψ, and also using something other than n + 2ψ as the denominator of the variance.
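
    A sketch of this pseudo-frequency method with ψ = 2, the "add 2 successes and 2 failures" version appropriate for 95% intervals (function name illustrative):

```python
from math import sqrt

def agresti_coull_interval(r, n, z=1.96, psi=2.0):
    """Wald formula applied after adding psi pseudo-successes and psi pseudo-failures."""
    n_adj = n + 2 * psi
    p_adj = (r + psi) / n_adj
    se = sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - z * se, p_adj + z * se

# Non-degenerate even when r = 0; note the lower limit still slightly overshoots 0,
# illustrating that boundary violation is reduced but not eliminated.
print(agresti_coull_interval(0, 14))
```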

    Alternatively, the Bayesian approach described elsewhere in this work may be used. The resulting intervals are best referred to as credible intervals, in recognition that the interpretation is slightly different from that of frequentist confidence intervals such as those previously described.

    Bayesian inference starts with a prior distribution for the parameter of interest, in this instance the proportion π. This is then combined with the likelihood function comprising the evidence from the sample to form a posterior distribution that represents beliefs about the parameter after the data have been obtained. When a conjugate prior is chosen from the beta distribution family, the posterior distribution takes a relatively simple form: it is also a beta distribution. If substantial information about π exists, an informative prior may be chosen to encapsulate this information.

    More often, an uninformative prior is used. The simplest is the uniform prior B(1,1), which assumes that all possible values of π between 0 and 1 start off equally likely. An alternative uninformative prior with some advantages is the Jeffreys prior B(½, ½). Both are diffuse priors, which spread the probability thinly across the whole range of possible values from 0 to 1.

    The resulting posterior distribution may be displayed graphically, or may be summarized by salient summary statistics such as the posterior mean and median and selected centiles. The 2½ and 97½ centiles of the posterior distribution delimit the tail-based 95% credible interval. Alternatively, a highest posterior density interval may be reported. The tail-based interval is considered preferable because it produces equivalent results when a transformed scale (e.g., logit) is used [11].

    These Bayesian intervals perform well in a frequentist sense [12]. Hence, it is now appropriate to regard them as confidence interval methods in their own right, with theoretical justification in the Bayesian paradigm but empirical validation from a frequentist standpoint. They may thus be termed beta intervals. They are readily calculated using software for the incomplete beta function, which is included in statistical packages and also spreadsheet software such as Microsoft Excel. As such, they should now be regarded as computationally of closed form, though less transparent than Wald methods.

    Many statisticians consider that a coverage level should represent minimum, not average, coverage. The Clopper-Pearson exact or tail-based method [13] achieves this, at the cost of being excessively conservative; intervals are unnecessarily wide. There is a trade-off between coverage and width; it is always possible to increase coverage by widening intervals, and the aim is to attain good coverage without excessive width. A variant on the exact method involving a mid-P accumulation of tail probabilities [14,15] aligns mean coverage closely with the nominal 1 − α. Both methods have appropriate location. The Clopper-Pearson interval, but not the mid-P one, is readily programmed as a beta interval, of similar form to Bayes intervals. A variety of shortened intervals have also been developed that maintain minimum coverage but substantially shrink interval length [16,17]. Shortened intervals are much more complex, both computationally and conceptually. They also have the disadvantage that what is optimized is the interval, not the lower and upper limits separately; consequently, they are unsuitable when interest centers on one of the limits rather than the other.
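
    Both the tail-based Bayes intervals and the Clopper-Pearson interval reduce to quantiles of beta distributions, which is why they may be regarded as computationally of closed form. A sketch using SciPy's beta distribution (function names are illustrative; the conventions shown for r = 0 and r = n are the usual ones but are an assumption here):

```python
from scipy.stats import beta

def jeffreys_interval(r, n, alpha=0.05):
    """Tail-based credible interval from the Jeffreys posterior B(r + 1/2, n - r + 1/2)."""
    lo = beta.ppf(alpha / 2, r + 0.5, n - r + 0.5) if r > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, r + 0.5, n - r + 0.5) if r < n else 1.0
    return lo, hi

def clopper_pearson_interval(r, n, alpha=0.05):
    """'Exact' interval; guarantees minimum coverage >= 1 - alpha but tends to be wide."""
    lo = beta.ppf(alpha / 2, r, n - r + 1) if r > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, r + 1, n - r) if r < n else 1.0
    return lo, hi

print(jeffreys_interval(2, 14))
print(clopper_pearson_interval(2, 14))
```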

    Numerical examples illustrating these calculations are based on some results from a very small randomized phase II clinical trial performed by the Eastern Cooperative Oncology Group [18]. Table 1 shows the results for two outcomes, treatment success defined as shrinkage of the tumor by 50% or more, and life-threatening treatment toxicity, for the two treatment groups A and B.

    Table 1: Some Results from a Very Small Randomized Phase II Clinical Trial Performed by the Eastern Cooperative Oncology Group

    Source: Parzen et al. J Comput Graph Stat. 2002; 11: 420–436.

    Table 2 shows 95% confidence intervals for both outcomes for treatment A. These examples show how Wald and derived intervals often produce inappropriate limits (see asterisks) in boundary and near-boundary cases.

    Table 2: 95% Confidence Intervals for Proportions of Patients with Successful Outcome and with Life-Threatening Toxicity on Treatment A in the Eastern Cooperative Oncology Group Trial

    Note: Asterisks denote boundary violations.

    Source: Parzen et al. J Comput Graph Stat. 2002; 11: 420–436.

    1.4 An Unpaired Difference of Proportions

    We return to the unpaired difference case. As described elsewhere in this work, hypothesis testing for the comparison of two proportions takes a quite different form according to whether the objective of the trial is to ascertain difference or equivalence. When we report the contrast between two proportions with an appropriately constructed confidence interval, this issue is taken into account only when we come to interpret the calculated point and interval estimates. In this respect, in comparison with hypothesis testing, the confidence interval approach leads to much simpler, more flexible patterns of inference.

    The quantity of interest is the difference between two binomial proportions, π1 and π2. The empirical estimate is p1 − p2 = r1/n1 − r2/n2. It is well known that, when comparing means, there is a direct correspondence between hypothesis tests and confidence intervals. Specifically, the null hypothesis is rejected at the conventional two-tailed α = 0.05 level if and only if the 100(1 − α) = 95% confidence interval for the difference excludes the null hypothesis value of zero. A similar property applies also to the comparison of proportions—usually, but not invariably. This is because there are several options for constructing a confidence interval for the difference of proportions, which have different characteristics and do not all correspond directly to purpose-built hypothesis tests.

    The Wald interval is calculated as p1 − p2 ± z√(p1q1/n1 + p2q2/n2). It has poor mean and minimum coverage and fails to produce an interval when both p1 and p2 are 0 or 1. Overshoot can occur when one proportion is close to 1 and the other is close to 0, but this situation is expected to occur infrequently in practice. Use of a continuity correction improves mean coverage, but minimum coverage remains low.
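
    A sketch of the unpaired Wald interval (function name illustrative):

```python
from math import sqrt

def wald_difference_interval(r1, n1, r2, n2, z=1.96):
    """Wald interval for p1 - p2; degenerates to zero width when both p1 and p2 are 0 or 1."""
    p1, p2 = r1 / n1, r2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - z * se, (p1 - p2) + z * se
```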

    Several better methods have been developed, some of which are based on specific mathematical models. Any model for the comparison of two proportions necessarily involves both the parameter of interest, δ = π1 − π2, and an additional nuisance parameter γ. The model may be parametrized in terms of δ and π1 + π2, or δ and (π1 + π2)/2, or δ and π1. We will define the nuisance parameter as γ = (π1 + π2)/2.

    Some of the better methods substitute the profile estimate γ̂δ, which is the maximum likelihood estimate of γ conditional on a hypothesized value of δ. These include score-type asymptotic intervals developed by Mee [19] and Miettinen and Nurminen [20]. Newcombe [21] developed tail-based exact and mid-P intervals involving substitution of the profile estimate.

    All these intervals are boundary respecting. The exact method aligns the minimum coverage quite well with the nominal 1 − α; the others align mean coverage well with 1 − α, at the expense of fairly complex iterative calculation.

    Bayesian intervals for p1 − p2 and other comparative measures may be constructed [2,11], but they are computationally much more complex than in the single proportion case, requiring use of numerical integration or computer-intensive methodology such as Markov chain Monte Carlo (MCMC) methods. It may be more appropriate to incorporate a prior for p1 − p2 itself rather than independent priors for p1 and p2 [22]. The Bayesian formulation is readily adapted to incorporate functional constraints such as δ ≥ 0 [22]. Walters [23] and Agresti and Min [11] have shown that Bayes intervals for p1 − p2 with uninformative beta priors have favorable frequentist properties.

    Two computationally simpler, effective approaches have been developed. Newcombe [21] also formulated square-and-add intervals for differences of proportions. The concept is a very simple one. Assuming independence, the variance of a difference between two quantities is the sum of their variances. In other words, standard errors square and add—they combine in the same way that differences in x and in y coordinates combine to give the Euclidean distance along the diagonal, as in Pythagoras’ theorem. This is precisely how the Wald interval for p1 − p2 is constructed. The same principle may be applied starting with other, better intervals for p1 and p2 separately. The Wilson score interval is a natural choice as it already involves square roots, though squaring and adding would work equally effectively starting with, for instance, tail-based [24] or Bayes intervals. It is easily demonstrated that the square-and-add process preserves the property of respecting boundaries.

    Thus, the square-and-add interval is obtained as follows. Let (li, ui) denote the score interval for pi, for i = 1,2. Then the square-and-add limits are

    p1 − p2 − √((p1 − l1)² + (u2 − p2)²)  and  p1 − p2 + √((u1 − p1)² + (p2 − l2)²).

    This easily computed interval aligns mean coverage closely with the nominal 1 − α. A continuity correction is readily incorporated, resulting in more conservative coverage. Both intervals tend to be more mesially positioned than the γ̂δ-based intervals discussed previously.
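
    A sketch of the square-and-add construction, built from the Wilson limits given earlier (function names illustrative):

```python
from math import sqrt

def wilson_interval(r, n, z=1.96):
    """Closed-form Wilson score limits, as sketched earlier."""
    p = r / n
    centre = 2 * n * p + z * z
    halfwidth = z * sqrt(z * z + 4 * n * p * (1 - p))
    denom = 2 * (n + z * z)
    return (centre - halfwidth) / denom, (centre + halfwidth) / denom

def square_and_add_interval(r1, n1, r2, n2, z=1.96):
    """Square-and-add interval for p1 - p2: combine the one-sample Wilson margins
    by squaring and adding, as in Pythagoras' theorem."""
    p1, p2 = r1 / n1, r2 / n2
    l1, u1 = wilson_interval(r1, n1, z)
    l2, u2 = wilson_interval(r2, n2, z)
    lower = (p1 - p2) - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = (p1 - p2) + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper
```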

    The square-and-add approach may be applied a second time to obtain a confidence interval for a difference between differences of proportions [25]; this is the linear scale analogue of assessing an interaction effect in logistic regression.

    Another simple approach that is a great improvement over the Wald method is the pseudo-frequency method [26,27]. A pseudo-frequency ψ is added to each of the four cells of the 2 × 2 table, resulting in the shrinkage estimator (r1 + ψ)/(n1 + 2ψ) − (r2 + ψ)/(n2 + 2ψ).

    The Wald formula then produces the limits

    p̃1 − p̃2 ± z√(p̃1(1 − p̃1)/(n1 + 2ψ) + p̃2(1 − p̃2)/(n2 + 2ψ)),

    where

    p̃i = (ri + ψ)/(ni + 2ψ) for i = 1, 2.

    Agresti and Caffo [27] evaluated the effect of choosing different values of ψ, and they reported that adding ψ = 1 to each cell is optimal here. So here, just as for the single proportion case, in total four pseudo-observations are added. This approach also aligns mean coverage effectively with 1 − α. Interval location is rather too mesial, very similar to that of the square-and-add method. Zero-width intervals cannot occur. Boundary violation is not ruled out but is expected to be infrequent.
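
    A sketch of this interval with ψ = 1, i.e., one pseudo-observation added to each of the four cells (function name illustrative):

```python
from math import sqrt

def agresti_caffo_interval(r1, n1, r2, n2, z=1.96, psi=1.0):
    """Wald formula applied to the shrinkage estimates (r_i + psi) / (n_i + 2*psi)."""
    m1, m2 = n1 + 2 * psi, n2 + 2 * psi
    p1, p2 = (r1 + psi) / m1, (r2 + psi) / m2
    se = sqrt(p1 * (1 - p1) / m1 + p2 * (1 - p2) / m2)
    return (p1 - p2) - z * se, (p1 - p2) + z * se
```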

    Table 3 shows 95% confidence intervals calculated by these methods, comparing treatments A and B in the ECOG trial [18].

    Table 3: 95% Confidence Intervals for Differences in Proportions of Patients with Successful Outcome and with Life-Threatening Toxicity between Treatments A and B in the Eastern Cooperative Oncology Group Trial

    Note: Asterisks denote boundary violations.

    Source: Parzen et al. J Comput Graph Stat. 2002; 11: 420–436.

    1.5 Number Needed to Treat

    In the clinical trial setting, it has become common practice to report the number needed to treat (NNT), defined as the reciprocal of the absolute risk difference: NNT = 1/(p1 − p2) [28,29]. This measure has considerable intuitive appeal, simply because we are used to assimilating proportions expressed in the form of "1 in n," such as a 1 in 7 risk of life-threatening toxicity for treatment A in Table 1.

    The same principle applies to differences of proportions. These tend to be small decimal numbers, often with a leading zero after the decimal point, which risk being misinterpreted by the less numerate. Thus if p1 = 0.35 and p2 = 0.24, we could report p1 − p2 = 0.11 equivalently as an absolute difference of 11% or as an NNT of 9. The latter may well be an effective way to summarize the information when a clinician discusses a possible treatment with a patient. As always, we need to pay careful attention to the direction of the difference. By default, NNT is read as the number needed to treat for (one person to) benefit, or NNTB. If the intervention of interest proves to be worse than the control regime, we report the number needed to harm (NNTH).

    A confidence interval for the NNT may be derived from any good confidence interval method for p1 − p2 by inverting the two limits. For example, Bender [30] suggests an interval obtained by inverting square-and-add limits [21]. But it is when we turn attention to confidence intervals that the drawback of the NNT approach becomes apparent. Consider first the case of a statistically significant difference, with p1 − p2 = +0.25, and 95% confidence interval from +0.10 to +0.40. Then an NNT of 4 is reported, with 95% confidence interval from 2.5 to 10. This has two notable features. The lower limit for p1 − p2 gives rise to the upper limit for the NNT and vice versa. Furthermore, the interval is very skewed, and the point estimate is far from the midpoint. Neither of these is a serious contraindication to use of the NNT.

    But often the difference is not statistically significant—and, arguably, reporting confidence intervals is even more important in this case than when the difference is significant. Consider, for example, p1 − p2 = +0.10, with 95% confidence interval from −0.05 to +0.25. Here, the estimated NNT is 1/0.10 = +10. Inverting the lower and upper confidence limits for p1 − p2 gives −20 and +4. This time, the two limits apparently do not change places. But there are two problems. The point estimate, +10, is not intermediate between −20 and +4. Moreover, the interval from −20 to +4 does not comprise the values of the NNT that are compatible with the data, but rather the ones that are not compatible with it. In fact, the confidence region for the NNT in this case consists of two intervals that extend to infinity, one from +4 to +∞ in the direction of benefit, the other from −20 to −∞ in the direction of harm. It could be a challenge to clinicians and researchers at large to comprehend this singularity that arises when a confidence interval spanning 0 is inverted [31].
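
    The behavior in the two preceding examples can be reproduced directly; a sketch (the function is illustrative, and the nonsignificant case returns two disjoint rays, reflecting the singularity just described):

```python
def nnt_from_risk_difference(d, lower, upper):
    """Invert a risk-difference estimate and its CI to NNT form.
    If the CI spans zero, the confidence region is two rays extending to infinity."""
    nnt = 1 / d
    if lower > 0 or upper < 0:
        # Significant difference: an ordinary interval, with the limits swapping places.
        return nnt, (1 / upper, 1 / lower)
    # Nonsignificant: NNTB from 1/upper to +inf, NNTH from -inf to 1/lower.
    return nnt, ((1 / upper, float("inf")), (-float("inf"), 1 / lower))

print(nnt_from_risk_difference(0.25, 0.10, 0.40))   # NNT 4, CI (2.5, 10)
print(nnt_from_risk_difference(0.10, -0.05, 0.25))  # NNT 10, rays (4, inf) and (-inf, -20)
```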

    Accordingly, it seems preferable to report absolute risk reductions in percentage rather than reciprocal form. The most appropriate uses of the NNT are in giving simple bottomline figures to patients (in which situation, usually only the point estimate would be given), and in labeling a secondary axis on a graph.

    1.6 A Paired Difference of Proportions

    Crossover and split-unit trial designs lead to paired analyses. Regimes that aim to produce a cure are generally not suitable for evaluation in these designs, because in the event that a treatment is effective, there would be a carryover effect into the next treatment period. For this reason, these designs tend to be used for evaluation of regimes that seek to control symptomatology, and thus most often give rise to continuous outcome measures. Examples of paired analyses of binary data in clinical trials include comparisons of different antinauseant regimes administered in randomized order during different cycles of chemotherapy, comparisons of treatments for headache pain, and split-unit studies in ophthalmology and dermatology. Results can be reported in either risk difference or NNT form, though the latter appears not to be frequently used in this context. Other examples in settings other than clinical trials include longitudinal comparison of oral carriage of an organism before and after third molar extraction, and twin studies.

    Let a, b, c, and d denote the four cells of the paired contingency table. Here, b and c are the discordant cells, and interest centers on the difference of marginals:

    (a + b)/n − (a + c)/n = (b − c)/n, where n = a + b + c + d.

    Hypothesis testing is most commonly performed using the McNemar approach [32], using either an asymptotic test statistic expressed as z or chi-square, or an aggregated tail probability. In both situations, inference is conditional on the total number of discordant pairs, b + c.
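
    A sketch of the marginal difference and the asymptotic McNemar statistic in this notation (assuming at least one discordant pair, b + c > 0):

```python
from math import sqrt

def paired_difference_and_mcnemar(a, b, c, d):
    """Return the difference of marginal proportions (b - c)/n and the
    asymptotic McNemar z statistic, conditional on the b + c discordant pairs."""
    n = a + b + c + d
    diff = (b - c) / n
    z = (b - c) / sqrt(b + c)
    return diff, z
```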

    Newcombe [33] reviewed confidence interval methods for the paired difference case. Many of these are closely analogous to unpaired methods. The Wald interval performs poorly. So does a conditional approach, based on an interval for the simple proportion b/(b + c). Exact and tail-based profile methods perform well; although, as before, these are computationally complex. A closed-form square-and-add approach, modified to take account of the nonindependence, also aligns mean coverage with 1 − α, provided that a novel form of continuity correction is incorporated.

    Tango [34] developed a score interval, which is boundary respecting and was subsequently shown to perform excellently [35]. Several further modifications were suggested by Tang, Tang, and Chan [36]. Agresti and Min [11] proposed pseudo-frequency methods involving adding ψ = 0.5 to each cell and demonstrated good agreement of mean coverage with 1 − α. However, overshoot can occasionally occur.

    The above methods are appropriate for a paired difference of proportions. But for crossover and simultaneous split-unit studies, a slightly different approach is preferable. Thus, in a crossover study, if the numbers of subjects who get the two treatment sequences AB and BA are not identical, the simple difference of marginals contains a contribution from period differences. A more appropriate analysis is based on the analysis of differences of paired differences described by Newcombe [25]. The example in Table 4 relates to a crossover trial of home versus hospital physiotherapy for chronic multiple sclerosis [37]. Twenty-one patients were randomized to receive home physiotherapy followed by hospital physiotherapy, and 19 to receive these treatments in the reverse order. Following Hills and Armitage [38] and Koch [39], the treatment effect is estimated as half the difference between the within-subjects period differences in the two treatment order groups. The resulting estimate, +0.1454 and 95% confidence interval, −0.0486 to +0.3238, are very similar but not identical to those obtained by direct application of the modified square-and-add approach [33], +0.1500 and −0.0488 to +0.3339.

    Table 4: Crossover Trial of Home Versus Hospital Physiotherapy: Treating Physiotherapist’s Assessment of Whether the Patient Benefited from Either Type of Treatment

    References

    [1] K. Rothman, Modern Epidemiology. Boston: Little, Brown, 1986.

    [2] L. Hashemi, B. Nandram, and R. Goldberg, Bayesian analysis for a single 2 × 2 table. Stat Med. 1997; 16: 1311–1328.

    [3] E. Schechtman, Odds ratio, relative risk, absolute risk reduction, and the number needed to treat—which of these should we use? Value Health. 2002; 5: 431–436.

    [4] A. Agresti, Categorical Data Analysis, 2nd ed. Hoboken, NJ: Wiley, 2002.

    [5] R. G. Newcombe, A deficiency of the odds ratio as a measure of effect size. Stat Med. 2006; 25: 4235–4240.

    [6] S. E. Vollset, Confidence intervals for a binomial proportion. Stat Med. 1993; 12: 809–824.

    [7] R. G. Newcombe, Two-sided confidence intervals for the single proportion: comparison of seven methods. Stat Med. 1998; 17: 857–872.

    [8] A. Agresti and B. A. Coull, Approximate is better than exact for interval estimation of binomial proportions. Am Stat. 1998; 52: 119–126.

    [9] L. D. Brown, T. T. Cai, and A. DasGupta, Interval estimation for a binomial proportion. Stat Sci. 2001; 16: 101–133.

    [10] E. B. Wilson, Probable inference, the law of succession, and statistical inference. J Am Stat Assoc. 1927; 22: 209–212.

    [11] A. Agresti and Y. Min, Frequentist performance of Bayesian confidence intervals for comparing proportions in 2 × 2 contingency tables. Biometrics. 2005; 61: 515–523.

    [12] B. P. Carlin and T. A. Louis, Bayes and Empirical Bayes Methods for Data Analysis. London: Chapman & Hall, 1996.

    [13] C. J. Clopper and E. S. Pearson, The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934; 26: 404–413.

    [14] H. O. Lancaster, The combination of probabilities arising from data in discrete distributions. Biometrika. 1949; 36: 370–382.

    [15] G. Berry and P. Armitage, Mid-P confidence intervals: a brief review. Statistician. 1995; 44: 417–423.

    [16] H. Blaker, Confidence curves and improved exact confidence intervals for discrete distributions. Can J Stat. 2000; 28: 783–798.

    [17] J. Reiczigel, Confidence intervals for the binomial parameter: some new considerations. Stat Med. 2003; 22: 611–621.

    [18] M. Parzen, S. Lipsitz, J. Ibrahim, and N. Klar, An estimate of the odds ratio that always exists. J Comput Graph Stat. 2002; 11: 420–436.

    [19] R. W. Mee, Confidence bounds for the difference between two probabilities. Biometrics. 1984; 40: 1175–1176.

    [20] O. S. Miettinen and M. Nurminen, Comparative analysis of two rates. Stat Med. 1985; 4: 213–226.

    [21] R. G. Newcombe, Interval estimation for the difference between independent proportions: comparison of eleven methods. Stat Med. 1998; 17: 873–890.

    [22] R. G. Newcombe, Bayesian estimation of false negative rate in a clinical trial of sentinel node biopsy. Stat Med. 2007; 26: 3429–3442.

    [23] D. E. Walters, On the reliability of Bayesian confidence limits for a difference of two proportions. Biom. J. 1986; 28: 337–346.

    [24] T. Fagan, Exact 95% confidence intervals for differences in binomial proportions. Comput Biol Med. 1999; 29: 83–87.

    [25] R. G. Newcombe, Estimating the difference between differences: measurement of additive scale interaction for proportions. Stat Med. 2001; 20: 2885–2893.

    [26] W. W. Hauck and S. Anderson, A comparison of large-sample confidence interval methods for the difference of two binomial probabilities. Am Stat. 1986; 40: 318–322.

    [27] A. Agresti and B. Caffo, Simple and effective confidence intervals for proportions and differences of proportions result from adding 2 successes and 2 failures. Am Stat. 2000; 54: 280–288.

    [28] A. Laupacis, D. L. Sackett, and R. S. Roberts, An assessment of clinically useful measures of the consequences of treatment. N Engl J Med. 1988; 318: 1728–1733.

    [29] D. G. Altman, Confidence intervals for the number needed to treat. BMJ. 1998; 317: 1309–1312.

    [30] R. Bender, Calculating confidence intervals for the number needed to treat. Control Clin Trials. 2001; 22: 102–110.

    [31] R. G. Newcombe, Confidence intervals for the number needed to treat—absolute risk reduction is less likely to be misunderstood. BMJ. 1999; 318: 1765.

    [32] Q. McNemar, Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika. 1947; 12: 153–157.

    [33] R. G. Newcombe, Improved confidence intervals for the difference between binomial proportions based on paired data. Stat Med. 1998; 17: 2635–2650.

    [34] T. Tango, Equivalence test and CI for the difference in proportions for the paired-sample design. Stat Med. 1998; 17: 891–908.

    [35] R. G. Newcombe, Confidence intervals for the mean of a variable taking the values 0, 1 and 2. Stat Med. 2003; 22: 2737–2750.

    [36] M. L. Tang, N. S. Tang, and I. S. F. Chan, Confidence interval construction for proportion difference in small sample paired studies. Stat Med. 2005; 24: 3565–3579.

    [37] C. M. Wiles, R. G. Newcombe, K. J. Fuller, S. Shaw, J. Furnival-Doran, et al., Controlled randomised crossover trial of physiotherapy on mobility in chronic multiple sclerosis. J Neurol Neurosurg Psych. 2001; 70: 174–179.

    [38] M. Hills and P. Armitage, The two-period cross-over clinical trial. Br J Clin Pharmacol. 1979; 8: 7–20.

    [39] G. G. Koch, The use of non-parametric methods in the statistical analysis of the two-period change-over design. Biometrics. 1972; 28: 577–584.

    Further Reading

    [1] Microsoft Excel spreadsheets that implement chosen methods for the single proportion, unpaired and paired difference, and interaction cases can be downloaded from the author’s website: http://www.cardiff.ac.uk/medicine/epidemiology-statistics/research/statistics/newcombe

    [2] The availability of procedures to calculate confidence intervals for differences of proportions is quite patchy in commercial software. StatXact (Cytel Statistical Software) includes confidence intervals for differences and ratios of proportions and odds ratios. These are exact intervals, designed to guarantee minimum coverage 1 − α. The resulting intervals are likely to be relatively wide compared with methods that seek to align the mean coverage approximately with 1 − α.

    Chapter 2

    Accelerated Approval

    Louis Cabanilla and Christopher P. Milne

    2.1 Introduction

    The development and approval of a new drug is a complex and time-consuming procedure that takes many years to accomplish. By the time a drug receives FDA approval, extensive laboratory and clinical work has been performed to ensure that the drug is both safe and effective, and months or years have been spent by the FDA reviewing the drug’s application. Although this level of quality control is a great benefit to people who use the drug, the extensive amount of time spent on development and regulatory submission review represents a barrier for a person with a serious or life-threatening disease for which no treatment exists, or for a person for whom available treatments have failed. In such cases, speedy development and regulatory review are of the utmost importance. A patient with no or few therapeutic options is often willing to accept a higher risk-to-benefit ratio, including the use of a treatment whose efficacy is predicated on indirect measures of the expected clinical benefit.

    2.2 Accelerated Development Versus Expanded Access in the U.S.A.

    In the United States, an emphasis has been placed on making potentially life-saving drugs available as soon as possible. In some instances, this process involves making experimental drugs available to patients who are not enrolled in clinical trials, but more typically it involves programs designed to decrease the time to market for these important drugs. This movement took shape in the middle to late 1980s, when AIDS drugs were being developed but were not made available to the wider patient population quickly enough, which caused outrage among those who risked dying while waiting for the FDA to approve these drugs for marketing.

    In 1987, four major initiatives were drafted to speed drug delivery, cut cost, and make drugs available sooner. Two of these initiatives, Treatment IND and Parallel Track, focused on expanding access of potentially life-saving drugs to specific patient populations prior to approval of the drug. In contrast, the Subpart E and Accelerated Approval programs focused on decreasing the amount of clinical and regulatory review time needed to approve a life-saving drug. Subpart E provides a regulatory framework to grant approval to certain drugs after an extended Phase II trial. Accelerated approval allows for a drug to be approved with restrictions on distribution or with the use of unvalidated surrogate endpoints or measures of indirect clinical benefits to determine efficacy [1].

    The terminology and regulatory implications of the various FDA programs can be confusing, and the programs are often mislabeled in the literature. In understanding the terms, it is worth mentioning that Subpart E, which is also known as Expedited Development, refers to the subpart of Title 21 of the Code of Federal Regulations (CFR) in which it is defined. Likewise, Accelerated Approval is sometimes referred to as Subpart H, as that is the subpart of 21 CFR that created the regulations allowing for this process by the FDA. Since the 1990s, further initiatives such as Fast Track designation, Priority Review, and Rolling Review have been implemented to facilitate the path of life-saving drugs to market.

    2.3 Sorting the Terminology—Which FDA Initiatives Do What?

    Treatment IND (TIND) This program provides for early access to promising drugs for patients with serious or life-threatening illnesses who have no treatment options, or for patients who have failed available therapies. Drug companies are allowed to recoup the cost of producing the drug for these patients, who are not involved in the clinical trials.

    Parallel Track This program is a more extensive version of the TIND program aimed at people who do not qualify for participation in clinical trials. The program was a response to the AIDS epidemic, and it is generally intended for HIV/AIDS treatments or related problems. Drug companies collect basic safety data from these patients and are allowed to recoup the cost of producing the drug.

    Expedited Development (Subpart E) This provision allows for extensive interaction and negotiation with the FDA to move a drug to the market as quickly as possible. This process includes the possibility of expanding a Phase II trial and using that data for a new drug application (NDA)/biologic licensing application (BLA) submission, and then focusing on postmarketing safety surveillance.

    Accelerated Approval (Subpart H) This provision allows drugs to gain approval with distribution restrictions, or more commonly, based on unvalidated surrogate endpoints or an indirect clinical benefit. It allows lifesaving drugs to become available while the company completes a long-term Phase IV study.

    Fast Track This program originated as a provision of the second Prescription Drug User Fee Act (PDUFA II), and it allows the FDA to facilitate the development of and to expedite the review for drugs intended to treat serious or life-threatening illness that have the potential to address unmet medical needs.

    Priority Review This review allows the FDA to allocate more resources on the review of a priority drug (i.e., one that represents a significant advance over those currently on the market).

    Rolling Review Under rolling review, companies can submit sections, or Reviewable Units (RUs), of an NDA or BLA as they complete them, to be reviewed by the FDA. Although an RU may be complete, marketing rights are not granted until all RUs are submitted and approved [1,2].

    2.4 Accelerated Approval Regulations: 21 C.F.R. 314.500, 314.520, 601.40

    The FDA has several programs to expedite the development and approval of life-saving drugs. Because some similarities are observed among the programs that pertain to the eligibility requirements and regulatory language, they are sometimes confused with each other. Although some overlap does occur, they generally affect different segments of the development and application review timeline (see Figure 1).

    Figure 1: Stages of drug development and FDA initiatives. The progressive stages of drug development, and where the initiatives begin/end.

    2.5 Stages of Drug Development and FDA Initiatives

    The intended effect of FDA expedited development programs is to speed development and approval; however, the respective programs focus on disparate aspects of the development and regulatory process. For example, accelerated approval is often confused with fast track designation and priority review; however, they are quite different. The central mechanism of the accelerated approval program is conditional approval based on the use of unvalidated surrogate endpoints or indirect clinical endpoints as evidence of efficacy, or restricted distribution. Fast track designation provides the opportunity for intensive scientific interaction with the FDA and acts as a threshold criterion for rolling review. Priority review is an administrative prioritization scheme implemented by the FDA to give precedence to applications for drugs that represent an improvement to the currently marketed products for a particular disease or condition (also applies to eligible diagnostics and preventatives). Moreover, it should be recognized that the same drug could be a part of accelerated approval, fast track, and other programs at the same time (see Figure 2).

    Figure 2: Accelerated approval and other initiatives. The overlap between accelerated approval and various other initiatives.

    Many accelerated approvals have also benefited from other programs aimed at decreasing the time and resources required to bring crucial drugs to market. In addition, many accelerated approvals are also orphan drugs. Orphan drugs are treatments for rare diseases and conditions, and the FDA designation provides certain economic and regulatory incentives for a drug company to develop them.

    2.6 Accelerated Approval Regulations: 21 CFR 314.500, 314.520, 601.40

    Accelerated approval regulations were promulgated December 11, 1992. The law stipulates the following [3]:

    FDA may grant marketing approval for a new drug product on the basis of adequate and well-controlled clinical trials establishing that the drug product has an effect on a surrogate endpoint that is reasonably likely, based on epidemiologic, therapeutic, pathophysiologic, or other evidence, to predict clinical benefit or on the basis of an effect on a clinical endpoint other than survival or irreversible morbidity.

    Conditions for approval:

      • Possible FDA restrictions on distribution and use by facility or physician, a mandated qualifying test, or procedural administration requirements

      • Promotional materials must be submitted to, and approved by, the FDA

      • Streamlined withdrawal mechanisms if:

        – Anticipated benefit is not confirmed in Phase IV trials

        – Sponsor fails to exercise due diligence in performing Phase IV trials

        – Restrictions on use prove insufficient to ensure safe usage

        – Violations of restrictions on use and distribution occur

        – Promotional materials are false or misleading

        – Other evidence shows that the drug is not safe or effective [3]

    An FDA advisory committee can recommend a drug for accelerated approval based on the set criteria for qualification, which include the soundness of evidence for the surrogate markers. Following an advisory committee recommendation, the FDA can then review the application and approve the drug for marketing. Drugs must be intended for patients with serious or life-threatening illnesses. Moreover, the data used for the approval must show an effect on an unvalidated surrogate endpoint that is reasonably likely to predict a clinical effect; a drug supported by a validated surrogate endpoint would instead proceed through the normal approval process. If a company seeks accelerated approval based on restricted distribution, then it must have clear distribution restriction practices and provider/user education programs in place for the drug to gain approval. An NDA submission with unvalidated surrogate endpoints must still stand up to the scrutiny of the NDA review process, and it can be rejected for any of the reasons a traditional NDA could be rejected, such as safety, efficacy, or concern about product quality. Beyond the issues of approval, all promotional materials for an accelerated approval drug must be submitted to the FDA for approval, and they must be periodically reviewed. This review is another method of ensuring the appropriate understanding and availability of these drugs for both doctors and patients.

    The initial approval process for a drug by the FDA pertains to a particular or limited set of indications for a particular subpopulation of patients. However, it is common that a drug is later found to have a benefit for multiple populations and multiple indications. For a drug that is already approved for use, the expansion to a subsequent indication requires a less comprehensive supplemental NDA (sNDA) or supplemental BLA (sBLA). A supplement can be eligible for accelerated approval status, whether or not the first indication was a traditional approval. Over the past 5 years, more sNDAs have been granted accelerated approval: in the 2000s, 23 sNDA accelerated approvals were granted by the FDA, whereas in the 1990s (1993–1999) only five were approved [4].

    Most accelerated approvals have been granted to sponsors of small molecule drugs (i.e., chemicals), but a significant minority, which has increased over time, have been granted to large molecule drugs (i.e., biologics). This latter group is generally composed of monoclonal antibodies, although designations have also been granted for other biologics such as vaccines or recombinant proteins. Nearly 31% of all accelerated approvals have been for biologics, most for oncology, although several accelerated approvals have been granted for rare diseases such as multiple sclerosis and Fabry’s disease [5]. Increasingly, accelerated approvals have been given to speed the development and availability of preventative vaccines, such as the influenza vaccine for the 2006–2007 flu season [6] (Figure 3).

    Figure 3: Product type of accelerated approvals. Comparison between small molecule and biologic drugs approved by the FDA through Subpart H.

    2.7 Accelerated Approval with Surrogate Endpoints

    The major advantage of the accelerated approval designation is that it can decrease the complexity of late-stage clinical trials using an unvalidated surrogate endpoint, which may decrease the number of patients who must be enrolled, decrease the amount of data being collected, and most of all decrease the time required to conduct the necessary studies [7]. For example, in 1992 ddI (also called Videx or didanosine) was approved based on CD4 levels rather than survival rates in HIV-infected patients, which greatly reduced the Phase III trial time. This endpoint has since been validated; as a result, new drugs that use CD4 levels as a clinical measurement now proceed through the normal NDA process.

    In terms of speed to market, drugs that are designed to increase the survival rate or slow disease progression can take years to properly test, which is generally the case for illnesses such as cancer, HIV, or multiple sclerosis. In such cases, accelerated approval is beneficial in making a drug available to patients while long-term data are collected. The ability to use easily observable and quantifiable data as evidence of clinical effectiveness allows a drug to reach the market much more quickly than using traditional measurement such as overall survival rate [7].

    Surrogate endpoints are a subset of biological markers that are believed to correlate with clinical endpoints in a trial [8]. Although an actual relationship is not guaranteed, surrogate endpoints are based on scientific evidence for clinical benefit—epidemiologic, therapeutic, pathophysiologic, or other [8]. A clinical trial testing the efficacy of a statin, for example, might use a reduction in cholesterol as a surrogate for a decrease in heart disease. Elevated cholesterol levels are linked to heart disease, so this correlation is likely. The benefit of using cholesterol levels is that heart disease often takes decades to develop and would require a time-consuming and expensive trial to test. With this in mind, the FDA might approve the statin for marketing contingent on the sponsor company’s agreement to complete postmarketing commitments to test for a clinical effect (Phase IIIb or Phase IV studies). Phase IIIb and Phase IV studies are distinct from each other in that the former begins after the application submission but before approval, and it typically continues into the postmarketing period; a Phase IV study is a postmarketing study.

    2.7.1 What is a Surrogate Endpoint?

    A surrogate endpoint is a laboratory measurement or physical sign that is used in therapeutic trials as a substitute for a clinically meaningful endpoint that is a direct measure of how a patient feels, functions, or survives and that is expected to predict the effect of the therapy [8].

    2.7.2 What is a Biomarker?

    A biomarker is a characteristic that is measured and evaluated objectively as an indicator of normal biologic or pathogenic processes or pharmacological responses to a therapeutic intervention. Surrogate endpoints are a subset of biomarkers [8].

    2.8 Accelerated Approval with Restricted Distribution

    Although accelerated approval drugs are generally approved on the basis of surrogate endpoints, 21 C.F.R. § 314.520 also allows for approval of a drug based on restrictions on distribution. These restrictions can refer to certain facilities or physicians who are allowed to handle and prescribe the drug, or to certain requirements, tests, or medical procedures that must be performed prior to use of the drug. The drug thalidomide has a restricted accelerated approval for the treatment of erythema nodosum leprosum because the drug is associated with a high risk of birth defects in infants whose mothers are exposed to it. To mitigate this risk, doctors must complete an educational program on the risks and safe usage of thalidomide before they are certified to prescribe it. Few accelerated approvals are based on this restriction, although they have become more common over time. The number of restricted distribution approvals averaged one per year between 2000 and 2005; however, in the first half of 2006 there were four restricted approvals, all of which were supplemental approvals (Figure 4) [4].

    Figure 4: Accelerated approval information through 06/06. Approval type: A comparison between Subpart H approvals with restricted distribution versus surrogate basis.

    2.9 Phase IV Studies/Post Marketing Surveillance

    Although a drug that is given accelerated approval might reach the market, the company that sponsors the drug is required to complete the research necessary to confirm efficacy with a Phase IV study. These studies are intended to bring the quality of the application dossier up to the scientific standards of drugs approved through the traditional NDA/BLA process. Sec. 314.510 [9] stipulates:

    Approval under this section will be subject to the requirement that the applicant study the drug further, to verify and describe its clinical benefit, where there is uncertainty as to the relation of the surrogate endpoint to clinical benefit, or of the observed clinical benefit to ultimate outcome. Postmarketing studies would usually be studies already underway. When required to be conducted, such studies must also be adequate and well-controlled. The applicant shall carry out any such studies with due diligence.

    Some ambiguity exists in this section of the law: until recent years, no actual time frame was given for companies to complete their Phase IV trials and submit their findings to the FDA. Rather, it was expected that companies would use due diligence in completing their research. This nebulous standard has led to some contention about the level of compliance by the drug industry, a question considered in more depth in a subsequent section.

    2.10 Benefit Analysis for Accelerated Approvals Versus Other Illnesses

    Drugs that qualify for accelerated approval are typically intended to treat serious conditions and diseases such as pulmonary arterial hypertension, HIV, malignancies, Fabry’s disease, and Crohn’s disease. Given that the median total development time for all FDA-approved drugs during the 1990s and 2000s was just over 7 years (87 months), drugs under the accelerated approval program have spent considerably less time in the journey from bench to bedside in such critical therapeutic areas as HIV/AIDS and cancer (Figure 5).

    Figure 5: Median development and approval times for accelerated approval drugs. Development and regulatory times by indication for Subpart H.

    To qualify for accelerated approval, a drug must offer a meaningful therapeutic benefit over available therapy, such as:

    Greater efficacy

    A more favorable safety profile

    Improved patient response over existing therapies [9]

    Given the severity of the illnesses and the potential for a meaningful therapeutic benefit, the FDA is generally willing to accept a higher risk-to-benefit ratio for a potential therapy. A drug with a comparatively high risk of side effects, or with less certainty of proof of efficacy, can be approved. For example, a drug seeking traditional approval for the treatment of insomnia would be expected to demonstrate a fully established safety and efficacy profile. Conversely, an accelerated approval drug for multiple sclerosis could gain approval with a likely clinical effect, based on surrogate endpoints. This shift in approval standards occurred because patient advocacy groups and public opinion have made it clear that patients in such situations are willing to bear additional risks. For patients with no other options, a new therapy often represents the only possibility of increasing the length or quality of their lives.

    Likewise, a drug designated as a first-line treatment for a malignancy would face a more rigorous approval process, given the currently available treatments, than a drug designated as a third-line treatment for patients who have failed all conventional options. Within any patient population, a certain percentage will fail to respond to first- or second-line treatments or will become resistant to therapy. This is particularly true of malignancies with high relapse rates and of infections such as HIV, in which the virus develops resistance to treatments over time. In situations such as these, creating the incentive for drug companies to develop second-, third-, and even fourth-line treatments with new mechanisms of action is critical.

    2.11 Problems, Solutions, and Economic Incentives

    Accelerated approval is intended to increase the speed to market for important drugs. Although this shorter timeline benefits patients, it also benefits the company that sponsors the drug. The use of surrogate endpoints can greatly decrease the length and complexity of a clinical trial, allowing a drug company to begin recouping sunk costs earlier and providing returns on sales for a longer period. Without accelerated approval, it is possible that many of these drugs would have been viewed as too risky or expensive to develop in terms of return on investment. Even after approval, sponsor companies face some degree of uncertainty: some practitioners, patients, and third-party payers may consider such approvals conditional and avoid the drugs until no alternative is available. Moreover, insurance companies may refuse to reimburse drugs they regard as experimental, and the resulting financial burden may place the treatment out of reach of patients. Nonetheless, these drugs can be very profitable. In 2004, 20 drugs with accelerated approval ranked among the top 200 best-selling drugs, with combined sales of $12.54 billion (Table 1) [10].

    Table 1: 2004 Sales Figures for Drugs First Approved Under Accelerated Approval: Sales Figures for Various Accelerated Approval Drugs in Billions of Dollars (US)

    A prerequisite for accelerated approval designation is that the drug company will complete a Phase IV trial with due diligence following the marketing of the drug. Once the drug is approved, evidence from a Phase IV trial that fails to support its efficacy may lead to the drug being withdrawn.

    The original text has proven problematic in that due diligence is ambiguous and is understood differently by the various stakeholders. Moreover, despite the wording of the provisions for drug withdrawal, withdrawal is difficult once market rights have been granted. A congressional staff inquiry found that as of March 2005, 50% of outstanding accelerated approval post-marketing studies had not begun, whereas most of the remainder were underway; overall, a significant proportion (26%) of trials was taking longer than expected to complete [11]. The incentive for a company to perform a Phase IV trial is minimal given that the sponsor has already received marketing approval and that completing the trial is expensive and time consuming. As of 2006, no drug had been withdrawn for failure to complete a Phase IV trial, although proposed regulations and guidance within the FDA could change this [12].

    The biggest change to the original procedure for accelerated approval came in 2003, when the FDA reinterpreted the rules governing the number of accelerated approvals granted per oncology indication. Whereas the old practice allowed only one accelerated approval per indication, the new interpretation views a medical need as unmet until the sponsoring company completes a Phase IV trial [13]. This change gives drug companies an incentive to complete Phase IV studies quickly to avoid market competition from other drugs. More recently, the FDA has released guidance suggesting that accelerated approvals could be based on interim analyses of a trial that continues until a final report is completed, rather than on separate Phase IV trials [14]. This approach would potentially decrease the cost of a Phase IV study and alleviate the problem of patient enrollment, which drug companies often cite as a cause of delays. The idea of granting conditional approval, with full approval contingent on Phase IV completion, has also been raised [15].

    Increasing Phase IV follow-through is important
