Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome

Abstract

Background

Designs and analyses of clinical trials with a time-to-event outcome almost invariably rely on the hazard ratio to estimate the treatment effect and implicitly, therefore, on the proportional hazards assumption. However, the results of some recent trials indicate that there is no guarantee that the assumption will hold. Here, we describe the use of the restricted mean survival time as a possible alternative tool in the design and analysis of these trials.

Methods

The restricted mean is a measure of average survival from time 0 to a specified time point, and may be estimated as the area under the survival curve up to that point. We consider the design of such trials according to a wide range of possible survival distributions in the control and research arm(s). The distributions are conveniently defined as piecewise exponential distributions and can be specified through piecewise constant hazards and time-fixed or time-dependent hazard ratios. Such designs can embody proportional or non-proportional hazards of the treatment effect.

Results

We demonstrate the use of restricted mean survival time and a test of the difference in restricted means as an alternative measure of treatment effect. We support the approach through the results of simulation studies and in real examples from several cancer trials. We illustrate the required sample size under proportional and non-proportional hazards, and also the significance level and power of the proposed test. Values are compared with those from the standard approach, which uses the logrank test.

Conclusions

We conclude that the hazard ratio cannot be recommended as a general measure of the treatment effect in a randomized controlled trial, nor is it always appropriate when designing a trial. Restricted mean survival time may provide a practical way forward and deserves greater attention.

Background

Most randomized controlled trials (RCTs) with a time-to-event outcome are designed and analyzed with a target hazard ratio (HR) for the treatment effect in mind. By convention, the HR is usually taken as the hazard function in the research arm divided by that in the control arm, with values < 1 representing a ‘positive’ treatment effect. In advanced cancer with a mortality outcome, for example, a popular choice of target HR is 0.75. This implies a reduction of 25 percent in the instantaneous mortality rate at all times after randomization. According to a standard sample size calculation based on the logrank test, about 510 events are needed to attain power 90 percent to detect such a treatment effect at a two-sided significance level of 5 percent in a trial with equal allocation to control and research arms.
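The order of magnitude of the events figure can be checked with a short Python sketch. It assumes the widely used Schoenfeld-type approximation for the logrank test; the text does not say which formula was used, and the function name `logrank_events` is hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def logrank_events(hr, power=0.9, alpha=0.05, alloc_ratio=1.0):
    """Approximate number of events for a two-sided logrank test.

    Assumption: Schoenfeld-type approximation,
    e = (z_{1-alpha/2} + z_power)^2 / (p1 * (1 - p1) * log(hr)^2),
    where p1 is the proportion of patients allocated to the research arm.
    """
    z = NormalDist().inv_cdf
    p1 = alloc_ratio / (1.0 + alloc_ratio)
    return (z(1 - alpha / 2) + z(power)) ** 2 / (p1 * (1 - p1) * log(hr) ** 2)


events = logrank_events(0.75)  # target HR 0.75, power 0.9, two-sided alpha 0.05
print(ceil(events))
```

Under these assumptions the result is a little over 500 events, in line with the figure of about 510 quoted above.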

For a single HR to make scientific sense, we must assume that proportional hazards (PH) of the treatment effect holds, at least approximately. We have argued previously [1] that when the PH assumption fails, it is misleading to report the treatment effect through the estimated HR, since it depends on follow-up time. A simple example of departure from PH occurs when one group is assigned to immediate surgical treatment and the other to medical treatment. Suppose time of randomization is the origin of the survival time. If surgery increases short-term mortality but confers long-term benefit on the survivors (a reasonable alternative hypothesis), PH does not apply and the HR is a misleading and inappropriate summary.

More technically, we are unconvinced by papers such as Schemper et al [2] where an overall estimate of the HR is considered as an average of time-dependent HRs over the event times, nor by proposed variants based on different but arbitrary weight functions. The main issue is that an average HR is uninterpretable. Under PH, for example, the HR can usefully be applied to the survival function in the control arm to obtain an impression of the survival curve in the research arm. When PH is breached, this property no longer holds. Furthermore, the HR depends on the follow-up time.

It has become apparent in some recently reported trials, e.g. IPASS [3] and ICON7 [4], that gross breaches of the PH assumption can and do occur, even to the extent of observing crossing survival curves, where a local estimate of the log HR changes sign over time. Non-PH may be due to differing biological modes of action of the treatments being compared or, as identified in IPASS, to the presence of differentially responsive sub-populations.

As noted before [1], we are dissatisfied with the HR as a general summary measure. For example, even when PH holds, an HR is not as meaningful clinically as some type of difference in mean survival times or proportions at a fixed time-point: it obscures the absolute difference between the treatments and fails to convey the clinical value of a treatment. (By ‘survival time’ we mean generically time to event, for whatever event is of interest.) Furthermore, early stopping rules that assume PH can produce incorrect decisions if the HR later changes substantially. Also, no single summary of HR or risk difference can adequately describe cases in which the treatment effect changes in direction as follow-up increases.

In our earlier paper [1], we suggested an approach to the analysis of an RCT in which the PH assumption is breached. We proposed to estimate and report the restricted mean survival time (RMST) [5], expressing the treatment effect as the difference in RMST between the randomized arms at a suitable follow-up time, t*. We constructed confidence intervals through the standard error of the difference in RMST. Further experience with the RMST approach in a larger number of trials has given us the impression that when the PH assumption is approximately satisfied, the test of the null hypothesis based on RMST difference often has operating characteristics similar to the logrank test. Specifically, the significance level and power of the two tests appear to be similar.

An advantage of the RMST is that it is valid under any distribution of the time to event in the treatment groups, of which PH models are a (small) sub-class. Furthermore, it is readily interpretable as the ‘life expectancy’ between randomization (t = 0) and a particular time horizon (t = t*). Owing to the current dominance of the HR and its presumed time independence, trial reports often ignore the possibility of non-PH and typically place little emphasis on the extent of follow-up, which should be a key aspect of a trial design and analysis. For example, a treatment effect may exhibit PH in the short term but non-PH over a longer period (e.g. the GOG111 trial, see Reference [1]). It is particularly important to ensure sufficient follow-up when there may be good biological or other reasons to expect the effect of a treatment to vary over time. The primary estimate of the RMST is specifically linked to a chosen t* and this must be made explicit. Naturally, though, as part of the analysis, the treatment effect can be explored over a range of alternative t* values.

In a previous paper [6], we described the implementation of a general method, Assessment of Resources for Trials or ART, for designing a trial allowing for possible non-uniform accrual rates, non-proportional hazards, loss to follow-up and cross-over of patients between treatment arms. With ART, the treatment effect is evaluated using the logrank test, irrespective of whether the design assumes PH or not. A central tool in the approach is the realistic representation of the survival function in each trial arm as a piecewise exponential distribution. Recruitment and follow-up time is divided into several ‘periods’ of equal duration. Recruitment is carried out during a subset of the periods, and all recruited patients are followed up for the remaining periods. The accumulated data are analyzed when the necessary number of events has accrued. In addition to the usual significance level and power, the researcher specifies the survival function in the control and research arm at the ends of selected periods. The piecewise constant hazard function is deduced from these values. At its simplest, the method assumes a simple exponential distribution in each of the control and research arms, characterized by a single, constant hazard or equivalently by the median time to event. Considerable flexibility is available with a piecewise exponential model, allowing a wide range of survival distributions appropriate to the disease in question to be accommodated.
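The step from survival values at the ends of periods to piecewise constant hazards can be sketched in Python as follows. This is a minimal illustration, not the ART implementation; the function names are hypothetical, and it assumes survival is supplied at the end of every period rather than only at selected ones.

```python
from math import exp, log


def piecewise_hazards(surv_at_period_ends, period_length=1.0):
    """Constant hazard in each period implied by survival at the period ends.

    Within period k, S(end) = S(start) * exp(-hazard_k * period_length).
    """
    hazards, previous = [], 1.0
    for s in surv_at_period_ends:
        hazards.append(-log(s / previous) / period_length)
        previous = s
    return hazards


def survival_from_hazards(hazards, period_length=1.0):
    """Inverse operation: survival at the end of each period."""
    survival, cumulative = [], 0.0
    for lam in hazards:
        cumulative += lam * period_length
        survival.append(exp(-cumulative))
    return survival


def hazard_from_median(median):
    """Simple exponential case: constant hazard equivalent to a given median."""
    return log(2.0) / median


# Illustrative (made-up) survival values at the ends of three yearly periods
print(piecewise_hazards([0.80, 0.60, 0.50]))
```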

In this paper, we consider replacing a logrank-based sample size calculation and presentation of results with one based on RMST and its difference between trial arms. The difference in RMST is determined by the survival functions specified for the control and research arms through piecewise exponential distributions, exactly as in ART. Part of the paper is concerned with technical details relating to the calculation of RMST and its standard error under a piecewise exponential model. The results are needed in the sample size calculations. We report a small simulation study comparing the significance level and power of the logrank and RMST tests under a piecewise exponential model with non-proportional or proportional hazards, incorporating staggered entry of patients and varying length of recruitment and follow-up.

In the section ‘Restricted mean survival time (RMST)’, we describe the RMST and the corresponding standard deviation (RSDST) in general terms and specifically for a piecewise exponential distribution. Section ‘A strategy for design and analysis of clinical trials’ discusses our proposed strategy for trial design and analysis. We describe how to do a sample size calculation for a trial using the RMST difference. We also consider the choice of suitable values of t* at the design and analysis stages, and suggest an approach to assessing maturity (readiness for analysis) of accumulating trial data according to the RMST method. Section ‘Examples’ includes limited simulation studies of the significance level and power of hypothesis tests based on the RMST difference under non-PH and PH. We provide examples in real trials. Section ‘Further issues’ makes a qualitative comparison between various measures of a treatment effect and describes results of RMST and logrank analyses in four cancer trials. We finish with a discussion and our conclusions.

Methods

Restricted mean survival time (RMST)

Definition of RMST

The restricted mean survival time, μ say, of a random variable T is the mean of the survival time X = min(T, t*) limited to some horizon t* > 0. It equals the area under the survival curve S(t) from t = 0 to t = t* [5, 7]:

$$\mu = E(X) = E[\min(T, t^*)] = \int_0^{t^*} S(t)\,dt \tag{1}$$

When T is years to death, we may think of μ as the ‘t*-year life expectancy’. In a two-arm clinical trial with survival functions S₀(t) and S₁(t) in the control and research arms, respectively, the difference in RMST between arms, Δ, is given by

$$\Delta = \int_0^{t^*} S_1(t)\,dt - \int_0^{t^*} S_0(t)\,dt = \int_0^{t^*} [S_1(t) - S_0(t)]\,dt$$

i.e. Δ is the area between the survival curves.

Restricted standard deviation of survival time (RSDST)

To compute the variance, var(X), of the restricted survival time X, we need E(X²):

$$E(X^2) = E(T^2 \mid T \le t^*)\,\Pr(T \le t^*) + (t^*)^2\,\Pr(T > t^*)$$

In terms of the survival function S(t), we have Pr(T ≤ t*) = 1 − S(t*) and, integrating by parts,

$$E(T^2 \mid T \le t^*)\,\Pr(T \le t^*) = \int_0^{t^*} t^2 f(t)\,dt = (t^*)^2\,[1 - S(t^*)] - \int_0^{t^*} 2t\,[1 - S(t)]\,dt$$

where f(.) is the density function of T. Hence

$$E(X^2) = (t^*)^2\,[1 - S(t^*)] - 2\int_0^{t^*} t\,[1 - S(t)]\,dt + (t^*)^2 S(t^*) = (t^*)^2 - 2\int_0^{t^*} t\,dt + 2\int_0^{t^*} t\,S(t)\,dt = 2\int_0^{t^*} t\,S(t)\,dt$$

so that

$$\operatorname{var}(X) = \mathrm{RSDST}^2 = E(X^2) - [E(X)]^2 = 2\int_0^{t^*} t\,S(t)\,dt - \left[\int_0^{t^*} S(t)\,dt\right]^2 \tag{2}$$

The restricted standard deviation (RSDST) is √var(X).

RMST and RSDST with the piecewise exponential distribution

Analytic results for RMST and RSDST are available when the survival time has a piecewise exponential distribution. The integrals required in (2) are tractable. Details of the calculations and the results are given in the Appendix: RMST and RSDST for a piecewise exponential distribution.
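As an illustration of why the integrals are tractable, the Python sketch below evaluates (1) and (2) interval by interval for a piecewise exponential distribution. The closed forms used are derived here directly from the definitions and are not copied from the Appendix, so the notation may differ; the function name `rmst_rsdst` is hypothetical and all hazards are assumed to be strictly positive.

```python
from math import exp, sqrt


def rmst_rsdst(hazards, cutpoints, t_star):
    """RMST and RSDST at horizon t_star for a piecewise exponential distribution.

    hazards[k] (> 0) applies from cutpoints[k] up to cutpoints[k + 1];
    cutpoints[0] must be 0 and the last hazard is carried forward to t_star.
    """
    int_s = 0.0   # integral of S(t) over (0, t_star), as in equation (1)
    int_ts = 0.0  # integral of t * S(t) over (0, t_star), as needed in (2)
    s_a = 1.0     # survival at the start of the current interval
    for k, lam in enumerate(hazards):
        a = cutpoints[k]
        if a >= t_star:
            break
        b = cutpoints[k + 1] if k + 1 < len(hazards) else t_star
        w = min(b, t_star) - a          # width of the interval inside (0, t_star)
        decay = exp(-lam * w)
        int_s += s_a * (1 - decay) / lam
        int_ts += s_a * (a * (1 - decay) / lam
                         + (1 - decay * (1 + lam * w)) / lam ** 2)
        s_a *= decay
    return int_s, sqrt(2 * int_ts - int_s ** 2)


# Control-arm yearly hazards quoted for the GOG111-based design in the Examples
control = [0.264, 0.385, 0.425, 0.372, 0.320, 0.280, 0.261, 0.245]
print(rmst_rsdst(control, list(range(8)), t_star=5.0))
```

For a single exponential distribution and a very long horizon, both returned values approach 1/hazard, the familiar mean and standard deviation of the exponential distribution.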

A strategy for design and analysis of clinical trials

The ART approach

The ART approach to trial design [6, 8] is based on specifying (log) HRs and testing their difference from zero with the logrank test. It may be used to design a trial with two or more parallel groups and a time-to-event outcome. ART allows the user to specify a recruitment phase with a predefined pattern of staggered patient entry and a follow-up phase at the end of recruitment, a standard feature of sample size calculations for such trials. Furthermore, among other advanced features, ART supports designs with non-proportional hazards, which are specified by period-specific, time-dependent HRs.

The main design characteristics of ART are as follows:

  1. Power and significance level of a logrank test of the treatment effect (e.g. 0.9 and 0.05);

  2. A number K of notional study periods of equal length in suitable units of real (calendar) time, over which the trial is intended to run;

  3. Recruitment of patients over the first K₁ periods and follow-up of all accrued patients over the remaining K₂ periods, with K₁ > 0, K₂ ≥ 0, such that K = K₁ + K₂;

  4. A relative weight for the number of patients expected to be recruited in each period (since recruitment usually starts slowly and picks up as the trial’s existence becomes better known);

  5. Control-arm survival function specified at some or all of the K periods. An alternative would be to specify the hazard in the control arm at the same time points;

  6. Target HR(s) under the alternative hypothesis. Hazard ratios may be specified as a single overall value (proportional hazards assumption) or for individual periods (time-dependent HR). An alternative to specifying HRs would be to provide the survival function in the research arm as well as in the control arm.

The end result is a complete definition of piecewise exponential models for the data expected under the null and alternative hypotheses. The null hypothesis is that the HRs are all equal to 1. The alternative hypothesis is that they are as given (implicitly or explicitly) in step 6 above.

The ART methodology for a two-arm trial assumes that a logrank test of the null hypothesis is to be used even if the trial has been designed with a time-dependent HR. Although not explicit, the implication is that the main trial result would be reported as the HR with a confidence interval (CI). As we have already discussed, we should not necessarily accept a single HR as an adequate trial summary measure, particularly when the trial has been designed with an expectation of non-proportional hazards.

In what follows, we suggest replacing the ART sample size calculation and presentation of results with one based on RMST and testing the RMST difference between arms. Other than the need to define a suitable time horizon t* for RMST evaluation, the elements of the design remain as described above. We present the approach in the next section.

Sample size for RMST difference

The basis of our approach is the familiar comparison of two means using an unpaired t test. Suppose we are sampling at random from the distribution of a positively valued random variable, T. We sample n₀ patients from the control arm and n₁ patients from the research arm. The total sample size for the trial is n = n₀ + n₁. Define the allocation ratio as r = n₁/n₀.

Suppose the means and variances of T in the control and research arms are μ₀, σ₀² and μ₁, σ₁², respectively. The null hypothesis is H₀ : μ₀ = μ₁, with σ₀² and σ₁² unspecified. The alternative hypothesis is H₁ : μ₀ ≠ μ₁. Suppose we wish to test the null hypothesis with power ω at two-sided significance level α. Let Δ = 0 and Δ = μ₁ − μ₀ ≠ 0 be the difference in RMST under H₀ and H₁, respectively. Following, for example, Reference [9] (p. 332), the required sample size in the control arm is

$$n_0 = \frac{(z_{1-\alpha/2} + z_\omega)^2}{\Delta^2 / (\sigma_0^2 + r^{-1}\sigma_1^2)} \tag{3}$$

hence the total sample size is

$$n = (1 + r)\,\frac{(z_{1-\alpha/2} + z_\omega)^2}{\Delta^2 / (\sigma_0^2 + r^{-1}\sigma_1^2)}$$

where z_p = Φ⁻¹(p) is the inverse standard normal distribution function at probability p. An approximate (typically, conservative) sample size estimate, assuming that σ₀² ≃ σ₁² = σ², is given by

$$n \simeq (1 + r)(1 + r^{-1})\,\frac{(z_{1-\alpha/2} + z_\omega)^2}{\Delta^2 / \sigma^2}$$

We see that n is a minimum when r = 1 (equal allocation) and n increases substantially for larger or smaller r (unequal allocation).

The power, ω, is given by

$$\omega = \Phi\left(\left[\frac{n\,\Delta^2}{(1 + r)(\sigma_0^2 + r^{-1}\sigma_1^2)}\right]^{1/2} - z_{1-\alpha/2}\right)$$
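Equation (3) and the power expression can be evaluated with a few lines of Python. This is a sketch of the formulas as written, with hypothetical function names; the numerical inputs in the usage line (Δ = 0.5 and variances of 4) are made-up values for illustration, not design values from any trial.

```python
from math import sqrt
from statistics import NormalDist


def rmst_sample_size(delta, var0, var1, r=1.0, power=0.9, alpha=0.05):
    """Control-arm and total sample size from equation (3).

    delta: design difference in RMST; var0, var1: variances in the control
    and research arms; r = n1 / n0 is the allocation ratio.
    """
    z = NormalDist().inv_cdf
    zz2 = (z(1 - alpha / 2) + z(power)) ** 2
    n0 = zz2 * (var0 + var1 / r) / delta ** 2
    return n0, (1 + r) * n0


def rmst_power(n, delta, var0, var1, r=1.0, alpha=0.05):
    """Power for total sample size n, obtained by inverting equation (3)."""
    nd = NormalDist()
    var_delta_hat = (1 + r) * (var0 + var1 / r) / n
    return nd.cdf(abs(delta) / sqrt(var_delta_hat) - nd.inv_cdf(1 - alpha / 2))


n0, n = rmst_sample_size(delta=0.5, var0=4.0, var1=4.0)  # illustrative inputs
print(round(n0), round(n), rmst_power(n, 0.5, 4.0, 4.0))
```

Feeding the computed n back into `rmst_power` returns the requested power, which is a quick consistency check on the two expressions.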

In the standard case, the procedure would be to estimate

$$\hat\Delta = \hat\mu_1 - \hat\mu_0, \qquad \operatorname{var}(\hat\Delta) = \mathrm{SE}(\hat\Delta)^2 = \frac{\hat\sigma_0^2}{n_0} + \frac{\hat\sigma_1^2}{n_1}$$

from the data, and test

$$z = \frac{\hat\Delta}{\mathrm{SE}(\hat\Delta)}$$

against a Student’s t or (in larger samples) a normal reference distribution. The usual assumption is that the response variable is normally distributed, N(μ_j, σ_j²), in arm j (j = 0,1). The RMST setting differs from this in the following ways:

  1. The response variable is the restricted time to event, X = min(T, t*). Due to the right truncation of T, the distribution of X is strongly non-normal;

  2. In trials with a time-to-event outcome, T is almost invariably positively skew anyway, sometimes considerably so;

  3. Right-censoring of T affects estimation of Δ̂ and SE(Δ̂).

Our task is to take the standard case described above as our starting point for RMST sample size calculations based on the ART design assumptions, and modify it as necessary. To complete the design, we then address the important question of selecting t*.

Standard error of RMST in the ART setting

Given the ART setup, the sample size calculation given in eqn. (3) implicitly requires the variance of RMST under the piecewise exponential model on which ART is based. Consider a sample T₁,…,T_m with no censoring before t* (there could be censoring after t*). The restricted times to event, X_i = min(T_i, t*) (i = 1,…,m), are an independent, identically distributed sample from some distribution. Notionally, the RMST μ may be estimated as the sample mean μ̂ = m⁻¹ Σ X_i (summing over i = 1,…,m), and its standard error, SE(μ̂), as RSDST/√m, where RSDST is the sample standard deviation.

With censoring of some observations before t*, the RMST is estimated by integration as in (1) and the RSDST as √var(X) in (2). We no longer expect RSDST/√m to be an accurate estimate of SE(μ̂), since it does not reflect the increased uncertainty associated with censoring.

Consider a censored sample X₁,…,X_m with μ = E(X) estimated by integration. (Note that m bears no relation to the number of patients required in a trial.) Write

$$\mathrm{SE}(\hat\mu) = \phi\,\frac{\mathrm{RSDST}}{\sqrt{m}} \tag{4}$$

where ϕ is some positive scaling factor. Clearly ϕ ≃ 1 for samples without censoring before t*, but otherwise ϕ is unknown. In exploring simulated trial data with staggered entry of patients and a fixed follow-up time, we found that ϕ was very close to 1 when patients were recruited over a relatively short period and followed up for a reasonably long length of time. However, ϕ could increase substantially when recruitment was over a longer period with shorter follow-up. This finding accords with intuition, since more censoring and hence greater uncertainty is expected in the latter case.

In general, ϕ in (4) must be estimated under a known (hypothesized) piecewise exponential model. We do this by Monte Carlo simulation, as follows. First, we draw a large random sample of m time-to-event observations from the piecewise exponential distribution of interest and determine

$$\phi = \frac{\sqrt{m}\,\mathrm{SE}(\hat\mu)}{\mathrm{RSDST}}$$

SE(μ̂) is estimated in the usual way within a flexible parametric model [10, 11], using the stpm2 program [12] in Stata. Estimation with a flexible parametric model is more stable than directly with a piecewise exponential model, since sparsely populated time intervals between knots can cause fitting difficulties for the latter model. Note that RSDST is a known function of the design parameters (see Appendix: RMST and RSDST for a piecewise exponential distribution) and does not need to be estimated by simulation.
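The sampling step of this simulation can be sketched in Python as below. The sketch covers only the drawing of censored time-to-event observations; it does not reproduce the flexible parametric fit with stpm2 that supplies SE(μ̂). Uniform staggered entry over the recruitment periods, administrative censoring at the end of the trial, strictly positive hazards, and the function names are all assumptions made for illustration.

```python
import random
from math import log


def draw_event_time(hazards, period_length, rng):
    """Inverse-transform draw from a piecewise exponential distribution.

    hazards[k] (> 0) applies in period k; the last hazard is carried forward.
    """
    target = -log(1.0 - rng.random())  # cumulative hazard reached at the event
    t = 0.0
    for k, lam in enumerate(hazards):
        is_last = k == len(hazards) - 1
        capacity = float("inf") if is_last else lam * period_length
        if target <= capacity:
            return t + target / lam
        target -= capacity
        t += period_length


def simulate_arm(m, hazards, k1, k2, period_length=1.0, seed=1):
    """(observed time, event indicator) pairs for m patients in one arm.

    Patients enter uniformly over the first k1 periods and are censored
    administratively at the end of period k1 + k2.
    """
    rng = random.Random(seed)
    end_of_trial = (k1 + k2) * period_length
    data = []
    for _ in range(m):
        entry = rng.uniform(0.0, k1 * period_length)
        follow_up = end_of_trial - entry
        t = draw_event_time(hazards, period_length, rng)
        data.append((min(t, follow_up), t <= follow_up))
    return data


sample = simulate_arm(10000, [0.264, 0.385, 0.425, 0.372, 0.320, 0.280, 0.261, 0.245], k1=5, k2=3)
print(sum(event for _, event in sample), "events among", len(sample), "patients")
```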

Given estimates of ϕ_j (j = 0,1), we can determine the σ_j² in eqn. (3) through the expression

$$\sigma_j^2 = (\phi_j\,\mathrm{RSDST}_j)^2 \tag{5}$$

All of ϕ_j, σ_j² and n have ‘Monte Carlo error’ due to the simulation. To quantify Monte Carlo error, the simulation is repeated with M independent samples. In each of the M samples, n is determined from (3) via (5). The SE of n over the M samples is √[(sample variance of the n’s)/M]. We choose M such that the SE of n is sufficiently small for practical purposes. Following exploration (not reported) with different choices of m and M, we suggest taking m = 10000 and M = 50 as initial defaults, but m and M can be adjusted to suit circumstances.

Note that the only components of the sample size calculation that change with recruitment (K₁) and follow-up (K₂) times are the ϕ_j. An illustration of this point is given in the section ‘Examples’.

We next describe the estimation of Δ̂ and SE(Δ̂) from trial data.

Estimation of Δ̂ and SE(Δ̂) in trial data

As we discussed in our previous paper [1], several methods of estimating RMST are available, including direct integration of Kaplan-Meier survival curves, a jackknife method, and flexible parametric regression models [10, 11]. We pointed out that direct integration of Kaplan-Meier curves may be unreliable. The jackknife method has the advantage of being non-parametric but the drawback of being relatively slow to compute. Its slowness makes it cumbersome if simulation with many replicates is required. We therefore choose the third method, flexible parametric modelling, which is fast and efficient.

In the context of a randomized trial, it is crucial that an estimation method be predefined, i.e. not requiring the analyst to make data-dependent modelling decisions with the actual trial data. Flexible parametric models are suitable for the purpose since, for example, a cumulative hazards model with 3 d.f. fitted to each treatment arm separately appears to give an adequate fit to a wide variety of survival curves. Proportional hazards is not assumed. This particular model is assumed subsequently in the present paper for both estimation and simulation purposes.

For a given trial dataset, SE(μ̂_j) (j = 0,1) may be calculated by the delta method separately in each arm. Hence

$$\hat\Delta = \hat\mu_1 - \hat\mu_0 \tag{6}$$

$$\mathrm{SE}(\hat\Delta) = \sqrt{\mathrm{SE}(\hat\mu_0)^2 + \mathrm{SE}(\hat\mu_1)^2} \tag{7}$$

A test of the null hypothesis Δ = 0 is made by comparing Δ̂/SE(Δ̂) with a standard normal distribution.
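Once the arm-specific RMST estimates and their standard errors are available, equations (6) and (7) and the test reduce to a few lines of arithmetic. The Python sketch below assumes those four quantities have already been obtained from whatever fitting software is used; the function name and the numbers in the usage line are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rmst_difference_test(mu0_hat, se0, mu1_hat, se1, level=0.95):
    """Difference in RMST (eqn 6), its SE (eqn 7), z statistic, two-sided
    P-value and a normal-theory confidence interval."""
    nd = NormalDist()
    delta_hat = mu1_hat - mu0_hat
    se_delta = sqrt(se0 ** 2 + se1 ** 2)
    z = delta_hat / se_delta
    p_value = 2 * (1 - nd.cdf(abs(z)))
    half_width = nd.inv_cdf(0.5 + level / 2) * se_delta
    return delta_hat, se_delta, z, p_value, (delta_hat - half_width, delta_hat + half_width)


# Made-up estimates: RMST 3.0 yr (SE 0.1) in control, 3.3 yr (SE 0.1) in research
print(rmst_difference_test(3.0, 0.1, 3.3, 0.1))
```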

Choice of t* for the design

In our earlier paper [1], we suggested reporting the RMST and its difference between trial arms, with a CI. For such an analysis, a time (t*) for calculation of the RMST needs to be specified. The ART-based approach to trial design defines a recruitment time (K₁) and minimum follow-up time (K₂) sufficient for recruitment of patients and estimation of their survival curves over the follow-up period of clinical interest. We suggest determining the design value, t*_des, as the t* which (approximately) minimizes the required sample size, n, given K₁, K₂ and the remaining parameters. This may be done by varying t* over the range K₂ (the shortest follow-up time for any patient) to K₁ + K₂ (the maximum possible follow-up time of any patient within the design), and computing n by simulation, as described above.

Choice of t* for analysis and assessment of data maturity

When it comes to the analysis of the trial data, for various reasons the precise data structure that is obtained may differ from the design under which t*_des was calculated. For example, the assumed survival distribution may be wrong, or the pattern of recruitment and follow-up may be at variance with that expected. To maximize power, we determine t* for the final analysis of the data. We call this value t*_final. We also address the question of how to assess data maturity, i.e. determining when the accumulating data are mature enough for the final analysis of the treatment effect.

In a trial designed under proportional hazards of the treatment effect and analysed using a logrank test, the required total number of events, e, is usually taken as the effective sample size. The reason is that, to a good approximation, var(log HR) is proportional to 1/e, so that e is a measure of the amount of information in the data. The accumulating data in such a trial are ‘ready to analyse’ when the observed number of events reaches e. Monitoring the data for maturity is then merely a matter of updating the data periodically and counting the number of events.

How should we apply the principle of monitoring for maturity to trials designed with an RMST outcome? The estimated variance of the treatment effect provides a way forward. Combining (3) with (6), the following relationship holds for a sample size of n under the alternative hypothesis that Δ ≠ 0 is the difference in RMST at some given t*:

$$z_z^2 = \frac{\Delta^2}{\operatorname{var}(\hat\Delta)} \tag{8}$$

where z_z = z_ω + z_{1−α/2}. Note that var(Δ̂) is the variance of the estimated RMST difference at the given t*, whereas Δ is a design value. The planned power is achieved when var(Δ̂) ≃ Δ²/z_z². To determine whether the trial data are mature enough to analyse, we effectively compare the variance of Δ̂ estimated from the current data with the target value, Δ²/z_z². Let us define the ‘percent maturity’ of the accumulating data as

$$\mathrm{pmat} = 100\,\frac{\Delta^2 / z_z^2}{\operatorname{var}(\hat\Delta)} \tag{9}$$

where var(Δ̂) is the estimated variance of the RMST difference in the current data. As more data accumulate, pmat increases; once it reaches 100%, the data are ready for analysis (under the assumptions of the design).

Alternatively, we can invert eqn. (8) and calculate the power, ω_curr, for the current data under the design assumptions as

$$\omega_{\mathrm{curr}} = \Phi\left(\sqrt{\frac{\Delta^2}{\operatorname{var}(\hat\Delta)}} - z_{1-\alpha/2}\right) \tag{10}$$
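Equations (9) and (10) are straightforward to compute once the variance of the estimated RMST difference is available from the current data. The following Python sketch uses hypothetical function names and takes the design difference Δ, the planned power and the significance level as inputs.

```python
from math import sqrt
from statistics import NormalDist


def percent_maturity(delta, var_delta_hat, power=0.9, alpha=0.05):
    """Percent maturity, eqn (9): target variance over current variance."""
    z = NormalDist().inv_cdf
    z_z = z(power) + z(1 - alpha / 2)
    return 100 * (delta ** 2 / z_z ** 2) / var_delta_hat


def current_power(delta, var_delta_hat, alpha=0.05):
    """Power of the current data under the design assumptions, eqn (10)."""
    nd = NormalDist()
    return nd.cdf(abs(delta) / sqrt(var_delta_hat) - nd.inv_cdf(1 - alpha / 2))


# Made-up interim values: design difference 0.5 yr, current variance 0.04
print(percent_maturity(0.5, 0.04), current_power(0.5, 0.04))
```

When the current variance equals the target Δ²/z_z², the first function returns 100 and the second returns the planned power, so the two measures agree on when the data are ready.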

Sometimes, for reasons of confidentiality regarding the accumulating trial data, it is desirable to estimate data maturity or power ignoring information about possible treatment effects. In the absence of censoring in (0, t*), we have

$$\operatorname{var}(\hat\Delta) = \frac{\sigma_0^2}{n_0} + \frac{\sigma_1^2}{n_1}$$

where σ_j² is approximately equal to the squared RSDST at t* in treatment group j and n_j is the sample size (j = 0,1). In the absence of treatment information, we make the simplifying assumption that σ₀² = σ₁² = σ². This could be regarded as an assumption under the null hypothesis of Δ = 0, since there is then no difference between treatments. We have

$$\operatorname{var}(\hat\Delta) = \sigma^2\left(\frac{1}{n_0} + \frac{1}{n_1}\right)$$

By elementary algebra

$$\frac{1}{n_0} + \frac{1}{n_1} = \frac{1}{n}\left(1 + r + 1 + \frac{1}{r}\right) = \frac{1}{n}\left(\frac{r + 1}{\sqrt{r}}\right)^2$$

Hence

$$\operatorname{var}(\hat\Delta) = \frac{\sigma^2}{n}\left(\frac{r + 1}{\sqrt{r}}\right)^2$$

Taking σ²/n as an estimate of var(μ̂), the variance of RMST for the entire dataset, we have

$$\operatorname{var}(\hat\Delta) \simeq \operatorname{var}(\hat\mu)\left(\frac{r + 1}{\sqrt{r}}\right)^2$$
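The treatment-blind approximation can be checked numerically against the two-arm expression. The Python sketch below uses hypothetical function names and made-up values (n = 300, r = 2, σ² = 4) purely to confirm the algebra.

```python
def blinded_var_delta(var_mu_hat, r=1.0):
    """Approximate var of the RMST difference from the pooled var(mu-hat),
    using the factor ((r + 1) / sqrt(r))^2 = (r + 1)^2 / r."""
    return var_mu_hat * (r + 1) ** 2 / r


def two_arm_var_delta(sigma2, n0, n1):
    """var of the RMST difference with a common variance sigma2 in both arms."""
    return sigma2 * (1 / n0 + 1 / n1)


# Made-up check: n = 300 patients allocated 1:2, so n0 = 100 and n1 = 200
n, r, sigma2 = 300, 2.0, 4.0
print(blinded_var_delta(sigma2 / n, r), two_arm_var_delta(sigma2, 100, 200))
```

Both expressions give the same value, as the derivation requires; with equal allocation (r = 1) the factor is 4.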

Finally, to analyse the mature data we must find t*_final. This is done by varying t* over a grid and finding the value that maximizes (9) or (10). If t*_final exceeds t_max, the largest uncensored time to event in the data, we propose restricting it to t_max, to avoid extrapolation of RMST estimates from a flexible parametric model.

Results

Examples

Proportional and non-proportional hazards designs

As a basis for illustration, we constructed designs with PH and non-PH treatment effects based on updated data from the GOG111 trial in advanced ovarian cancer [13]. The overall survival probabilities in the control arm at the end of years 1 to 8 post-randomization were estimated to be 0.771, 0.523, 0.342, 0.236, 0.172, 0.130, 0.100, 0.078, respectively, with corresponding control-arm hazards of 0.264, 0.385, 0.425, 0.372, 0.320, 0.280, 0.261, 0.245. The hazard ratios (research arm/control arm) were estimated to be 0.71 under PH and 0.53, 0.66, 0.74, 0.81, 0.87, 0.93, 0.96, 1.00 under non-PH.

Determining t*_des

We consider determining t*_des for the PH and non-PH models just described. As an example, suppose K₁ = 5 yr, K₂ = 3 yr. We vary t*_des in small steps (0.2 yr) over the interval (K₂, K₁ + K₂) = (3,8) yr and compute n according to eqn. (3). Figure 1 shows the resulting sample sizes with both designs. We see that t* and the design assumptions (PH or non-PH) both influence the sample size quite markedly. To minimize the sample size, the PH design requires a t*_des close to the maximum available follow-up time (8 yr), whereas the non-PH design needs a much smaller t*_des. The t*_des values are 7.5 and 4.3 yr for the PH and non-PH cases, respectively, with associated sample sizes of 461 and 326.

Figure 1. Example of sample size as a function of the time horizon t* for PH (solid lines) and non-PH (dashed lines) trial designs. The designs assume recruitment over K₁ = 5 yr and follow-up over K₂ = 3 yr.

It is apparent that the sample size does not change much for t* near t*_des. For example, for t* within ±1 yr of t*_des, the sample size is never larger than 9 more than the minimum, which is of no practical importance. This flexibility allows the analyst to set a preferred t*_des within a reasonably wide range without incurring a large sample size penalty.

Comparing RMST real logrank based sample dimensions

We turn to a comparison between the RMST and logrank approaches to sample size calculation. We utilized the ART software [8] for Stata to compute the logrank sample sizes and the numbers of events for all approaches. We used specially written Statistics software that, given (K 1,K 2) and the other ARTIST design parameters, finds t desk by varying t  over a user-defined grid as in the previous section. It calculates the sample size by simulation accordance to the methods described in sections ‘Sample size to RMST difference’ and ‘Standard flaws off RMST in one CRAFT setting’. The value of t to and the corresponding sample select were determined by smoothing the (t ,n) relationship using a second degree fractional poly and accounting the downfall. In some cases the sample size graph over the amount t  (3,8) month became monotonic decreasing; we then took the optimal specimen size to be for thyroxin dease =8 yr.

We investigated the impact of choices of K 1 and KELVIN 2 on the t des additionally n values, now for K 1 = 1(1)7 and K 2 = 8 - K 1. Table 1 giving the resulting sample sizes additionally amounts to public fork the POLARITY and non-PH designed. For the PHIL designs, of sample size is similar till that for the logrank test. A t des at or nearness the maximum permissible is needed. Fork the non-PH designed, t desired can near 4 yr; a markedly lower sample size is requires with the RMST approach than the logrank.

Table 1 Example sample size calculations for hypothetical trials with proportional or non-proportional hazards of the treatment effect

Operating characteristics

We performed a small simulation study to check the power and significance level of the proposed test of RMST difference. The set-up is similar to that described in the section ‘Comparing RMST and logrank based sample sizes’, except that we vary the recruitment period (K 1) over 1, 3, 5 and 7 yr, with K 2 = 8 - K 1. Five thousand replicates are simulated for each combination of recruitment period and null hypothesis (true or false). As before, the times to event are simulated according to a piecewise exponential distribution with staggered entry of patients at a uniform rate, and the RMST analysis is performed with t* = K 1/2 + K 2 yr. The sample size is designed to give the test of the RMST difference power of 90 percent to reject the null hypothesis at the 5 percent level. The power and significance level of the logrank test in the non-PH and PH scenarios are also studied without altering the sample size.
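The data-generating step of such a simulation can be sketched as follows. This is a minimal illustration, not the authors' Stata software: the function names, the seed and the hazard value are assumptions chosen for the example. Entry times are uniform over the recruitment period, event times are drawn from a piecewise exponential distribution by inverting the cumulative hazard, and each patient is censored when the trial closes at calendar time K 1 + K 2.

```python
import math
import random


def sample_piecewise_exponential(knots, hazards, rng):
    """Draw one event time; hazards[j] applies between knot j-1 and knot j,
    and the last hazard applies from the last knot onwards."""
    target = rng.expovariate(1.0)      # unit-exponential cumulative hazard to use up
    start = 0.0
    for j, h in enumerate(hazards):
        end = knots[j] if j < len(knots) else math.inf
        capacity = h * (end - start)   # cumulative hazard available in this interval
        if target <= capacity:
            return start + target / h
        target -= capacity
        start = end
    raise ValueError("the last hazard must be positive")


def simulate_arm(n, knots, hazards, k1, k2, rng):
    """Return (time, event) pairs in analysis time with administrative censoring."""
    data = []
    for _ in range(n):
        entry = rng.uniform(0.0, k1)   # calendar time of randomization
        follow_up = k1 + k2 - entry    # longest possible analysis time
        t = sample_piecewise_exponential(knots, hazards, rng)
        data.append((min(t, follow_up), t <= follow_up))
    return data


rng = random.Random(2013)              # arbitrary seed for reproducibility
k1, k2, h = 5.0, 3.0, 0.2              # illustrative values only
arm = simulate_arm(20000, [], [h], k1, k2, rng)
observed = sum(event for _, event in arm) / len(arm)
# For a single exponential the expected event fraction has a closed form.
expected = 1.0 - (math.exp(-h * k2) - math.exp(-h * (k1 + k2))) / (h * k1)
print(round(observed, 3), round(expected, 3))
```

With no knots the simulated event fraction can be compared with the closed-form value, which gives a quick sanity check before moving to designs with several knots.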

The results are shown in Table 2. Two standard errors of an estimated probability of 90 and 5 percent are 0.85 and 0.62 percent, respectively. The significance levels are close to nominal for the logrank and RMST tests in both scenarios. The RMST test maintains power close to its nominal 90 percent level under both non-PH and PH. As expected from the sample sizes given in Table 1, the logrank test under non-PH is underpowered compared with planned levels. Results in Table 1 suggest that the two tests may have similar power under PH; the logrank test is slightly the more powerful.
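The Monte Carlo error quoted above can be checked with the usual binomial standard error. The sketch below assumes 5000 replicates per scenario, which is the value that reproduces the quoted figures; the function name is ours.

```python
import math


def two_se_percent(p, replicates):
    """Two binomial standard errors of an estimated probability, in percent."""
    return 200.0 * math.sqrt(p * (1.0 - p) / replicates)


for p in (0.90, 0.05):
    print(p, round(two_se_percent(p, 5000), 2))
# 0.9 0.85
# 0.05 0.62
```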

Table 2 Operating characteristics of the test of RMST difference

Examples of design based on the SORCE trial in primary kidney cancer

As a further example, we compare RMST- and logrank-based designs for SORCE, an ongoing trial in primary renal cancer coordinated by the MRC Clinical Trials Unit. See http://www.controlled-trials.com/ISRCTN38934710 for a summary of the trial. Only patients with an initial intermediate or poor prognosis according to the Leibovich risk score [14] are eligible. Following surgery for their kidney tumor, patients are randomized to three groups: placebo tablets, one year of treatment with tablets containing the molecular targeted agent sorafenib, or 3 years of sorafenib. We focus on the primary analysis (‘Question 1’) as defined in the trial protocol, namely ‘Does at least one year of treatment with sorafenib increase disease-free survival (DFS) compared with placebo?’.

Patients are randomized in a ratio of 2:3:3 to placebo or to the two sorafenib arms. To answer Question 1, the two sorafenib arms are combined, giving an allocation ratio of r = 6/2 = 3. The sample size calculation was based on the logrank test. It assumed PH with target HR = 0.75, K 1 = 5 years’ recruitment with staggered patient entry and K 2 = 3 years’ follow-up of all recruited patients. Power was set to ω = 0.9 at a two-sided significance level of α = 0.05. Assuming no dropout, therefore, individual patients are followed up for at least 3 years and at most 8 years, depending on when they entered the trial. DFS probabilities at 1, 3, 5, 7, 10 and 13 years after surgery were estimated from values provided by Leibovich et al [14] (see Table 3). The logrank-based sample size for this design is N = 1656 (608 events). The possibility of increasing power by following up patients for up to K 2 = 8 years, giving a total trial length of up to K = 13 years, is also envisaged.

Table 3 Design parameters for the SORCE trial

To be clear, we remind the reader that t* is measured in analysis time, with each patient’s date of entry as the origin (t = 0). By contrast, K 1 and K 2 are measured in trial time, i.e. in calendar time whose origins are the dates of randomization of the first and last patient, respectively.

For both logrank and RMST-based sample size calculations, we fix K 1 = 5 yr and illustrate what happens with K 2 = 3 yr (as per protocol) and with K 2 = 5, 8 yr. We find t des as described in the section ‘Determining t des’.

We also investigate sample size for an alternative design based on non-PH of the treatment effect. The right-hand column of Table 3 shows a hypothetical but plausible pattern of time-dependent HR, representing an initially fairly large treatment effect (HR = 0.65) which disappears (HR = 1.0) by t = 10 years. The sample sizes for the PH and non-PH designs are shown in Table 4.

Table 4 Total sample size ( N ) and t des for hypothetical trials based on the design of SORCE

Several features of Table 4 stand out. Not surprisingly, the sample size depends strongly on the assumed magnitude and pattern of the treatment effect. For the PH designs, the sample size is about 8 percent larger with the RMST approach than with the logrank approach. For the non-PH designs, the sample size is 27 to 42 percent larger for the logrank than for the RMST approach. The RMST approach has a considerable advantage in the latter case, presumably because as the HR gets closer to 1, the power of the logrank test diminishes.

Example of maturity analysis for the MRC RE04 trial

As an example of determining whether trial data are ready for an analysis of RMST, we consider the MRC RE04 trial in metastatic kidney cancer [15]. Following an increase in planned sample size after the start of the trial, the design involved randomization to two arms with allocation ratio r = 1 and a target hazard ratio of 0.8 in the research arm (triple therapy) compared with the control arm (interferon-α only). The main outcome measure was all-cause mortality (time to death from any cause). Based on a previous kidney cancer trial (MRC RE01), median overall survival time in the control arm was expected to be one year. The sample size required for power 90% at a two-sided significance level of 5% was set according to ART methodology to 1100 patients and 845 events (i.e. deaths). This assumed the survival distribution of the previous trial, and K 1 = 4 yr, K 2 = 1 yr. The sample size and events for an RMST design based on the same assumptions are 1108 patients and 848 events, with t des found to be 4.0 yr.
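As a rough cross-check of the logrank event target, one can apply Schoenfeld's well-known approximation for the number of events required under PH. This is not the ART methodology used for the trial, only a standard approximation that we swap in for illustration; the function name and argument names are ours.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hr, alpha=0.05, power=0.90, allocation_ratio=1.0):
    """Approximate number of events for a two-sided logrank test under PH."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    p1 = allocation_ratio / (1.0 + allocation_ratio)   # fraction in the research arm
    p0 = 1.0 - p1                                      # fraction in the control arm
    return (z_alpha + z_beta) ** 2 / (p0 * p1 * math.log(hr) ** 2)


events = schoenfeld_events(hr=0.8)     # RE04 design: HR 0.8, r = 1, 90% power, 5% level
print(math.ceil(events))               # 845
```

Under these assumptions the approximation, rounded up, agrees with the 845 events quoted for the RE04 design. Agreement this close should not be expected in general, since ART allows for features such as staggered entry and non-PH that the approximation ignores.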

Accrual of patients was actually ended when 1006 patients had been recruited. The trial opened for recruitment in April 2001 and closed in August 2006. The data were frozen for final analysis in September 2008, at which point 691 events (deaths) had been observed.

Since the accrual and follow-up periods were longer than originally planned, for an RMST-based maturity assessment we consider a wide range of candidates for t final corresponding to K 1 = 2 yr, K 2 = 5 yr. We estimate the survival curve from the data, ignoring treatment differences. Figure 2 shows the maturity pmat and the power for t* ∈ (2,7) yr. The best choice of t final is 5.4 yr, at which time the maturity pmat = 83 percent and the power is 0.84. The design value, t des = 4.0 yr, is a little low as a candidate for t final , but may still be a reasonable option.

Figure 2

Maturity ( pmat ) and power curves as a function of t* for the RE04 trial. Vertical lines show t final .

Further issues

Comparing measures of the treatment effect

Usually, the HR and its CI are reported, and often, the median survival time and/or estimated survival probabilities at fixed time points are also presented. We propose the use of RMST and statistics derived from it, as just discussed. The ‘absolute’ difference in survival (ADS) at a given time point t is defined as Ŝ1(t) - Ŝ0(t), where Ŝ0(.) and Ŝ1(.) are the estimated survival functions in the control and research arms, respectively. How do these four measures compare on different criteria? A list of criteria and our assessments are given in Table 5. The criteria are framed in such a way that we regard ‘yes’ as advantageous and ‘no’ as disadvantageous.

Table 5 Comparison of four measures of the treatment effect in a trial

In general, difficulties with the HR and with taking the number of events as an index of the ‘maturity’ of the trial data are the following:

  1. In a very large trial, the targeted number of events can occur relatively ‘early’. However, the resulting survival curves may not be a reasonable reflection of the difference between the treatment arms over a clinically relevant time span. In an extreme case, researchers planning trials could use this approach to produce a positive result from early survival experience, ignoring the possible later evolution of the treatment effect.

  2. The same number of events may be seen in a large trial with short follow-up or a small trial with long follow-up. However, these trials are not ‘equivalent’ in the information they carry, nor in the clinical lessons that may be learned from them.

  3. There are many examples where results appear to ‘change’ over time. In reality, of course, the change is an illusion caused by ignoring the time element in reporting the results.

In Table 5, RMST emerges favourably since the only ‘box’ that it fails to ‘tick’ is criterion 7. However, it may be argued that the need to set a specific time point is in fact an advantage, since it explicitly incorporates the time dimension of the trial into the results, which is often neglected, as just discussed. Neither the median nor the HR does this. This aspect is reinforced in criterion 9.

The absolute difference in survival and the difference in median survival time, although often quoted, are weak because each presents only one ‘snapshot’ of the difference in survival functions. They tell us little about the earlier or later survival experience. For example, the survival curves could cross at the median or at some other t* but still show a substantial difference in RMST at t final .

Examples of RMST in analysis of several trials

We compare RMST with Cox/logrank analysis in a further four MRC cancer trials: ASTEC in endometrial cancer [16] (surgery vs. standard therapy randomization), BA06 in advanced bladder cancer [17], ICON4 in ovarian cancer [18] and OE02 in oesophageal cancer [19]. In all cases, the outcome is time to death from any cause (overall survival). We have chosen these particular trials because the estimated treatment effect is approximately the same (around HR = 0.85), yet they have widely varying mortality rates. Note that in the ASTEC trial, mortality in the research arm is actually non-significantly worse than in the control arm. Table 6 presents some results.

Table 6 HR, RMST and derived statistics on survival for four randomized controlled trials in various cancer sites conducted by the Medical Research Council

The logrank and RMST tests of the treatment effect give P-values with a similar interpretation. A possible exception is OE02, for which the RMST test at t* = 10.8 years is not significant at the 5 percent level whereas the Cox test is significant (P = 0.03). However, there is evidence of a non-PH treatment effect in this trial. A hazards model with a time-dependent treatment effect suggests that the hazard ratio is below 1 near t = 0 and tends to approach 1 over time.

Further insight into the behaviour of the logrank and RMST tests is given by Figure 3. We varied t* in 30 equal-sized steps between 1 and t final years. The smooth dashed lines are for the RMST test without truncation (right-censoring) of the data. The corresponding estimates of RMST were obtained at the different values of t* from a flexible parametric model applied to the entire dataset. The other two lines are for the data truncated at each value of t*. The (signed) z-statistic is the log HR divided by its SE from a Cox model, and the RMST difference divided by its SE from the RMST analyses. The z-statistics for the three methods are broadly in agreement in the ASTEC, BA06 and ICON4 trials. With OE02, however, the z-statistic from the Cox model is fairly constant across time, whereas for the RMST tests it diminishes steadily. Presumably the behaviour of the RMST tests is due to the non-PH pattern of the treatment effect in OE02. For the Cox test, the effect of an increasing number of events (effective sample size) might balance the effect of the HR reducing over time.

Figure 3

Evolution over time ( t* ) of z -statistics for RMST (truncated, solid lines; non-truncated, short dashed lines) and Cox (truncated, long dashed lines) tests in four randomized controlled trials in cancer.

A notable feature of Figure 3 is the instability of the z-statistic (hence P-value) for the tests in the truncated data, which seems greater for the Cox test. The ‘significance’ of the tests is subject to the play of chance.

Discussion

The main advantages of our proposed method are interpretability of the RMST difference from a clinical perspective as loss of life expectancy (when the event of interest is mortality), and robustness of the estimator to the proportional hazards assumption. Perhaps the main disadvantages are the difficulty of properly assessing data maturity (readiness for analysis) and the dependence of the test statistic on t*. One could envisage a temptation to choose t* so as to obtain the ‘most significant’ result. An analysis option (not yet explored in detail) to avoid this problem could be to derive an alternative test statistic as the minimum z-value for the test of RMST difference over a sensible range of values of t*. The correct significance level of this statistic could be estimated using permutation-test methodology applied to the treatment assignment variable. However, such an analysis would be secondary to the primary analysis involving a prespecified t*.
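The permutation idea can be sketched as follows. This is an illustration under simplifying assumptions, not a worked-out method: it assumes no censoring before the largest t* (so that RMST is just the mean of min(T, t*)), it uses the largest absolute z over the grid as a two-sided stand-in for the extreme z-value described above, and the toy data are simulated. All function names are ours.

```python
import math
import random
from statistics import mean, variance


def rmst_z(times0, times1, t_star):
    """z-statistic for the RMST difference at t_star (no censoring assumed)."""
    x0 = [min(t, t_star) for t in times0]
    x1 = [min(t, t_star) for t in times1]
    se = math.sqrt(variance(x0) / len(x0) + variance(x1) / len(x1))
    return (mean(x1) - mean(x0)) / se


def max_abs_z(times0, times1, grid):
    """Most extreme z-statistic over a grid of candidate t* values."""
    return max(abs(rmst_z(times0, times1, t)) for t in grid)


def permutation_p_value(times0, times1, grid, n_perm, rng):
    """Permute the treatment labels and recompute the extreme z each time."""
    observed = max_abs_z(times0, times1, grid)
    pooled = list(times0) + list(times1)
    n0 = len(times0)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if max_abs_z(pooled[:n0], pooled[n0:], grid) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


rng = random.Random(7)                                   # arbitrary seed
control = [rng.expovariate(1.0) for _ in range(60)]      # toy data only
research = [rng.expovariate(0.3) for _ in range(60)]     # toy data only
p_value = permutation_p_value(control, research, [1.0, 2.0, 3.0], 199, rng)
print(p_value)
```

Because the whole grid is re-evaluated within each permutation, the resulting P-value accounts for the search over t*, which is the point of the proposal. With censored data the z-statistic would need to be replaced by one based on a proper RMST estimator and its standard error.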

Sample size calculations are potentially fragile, since they depend heavily on assumptions. This is seen in the SORCE example (Table 3) and in other examples. The problem is not specific to the PH assumption. It is hard to know in advance whether or not PH is a likely feature of the data to come, and if not, what a plausible pattern of time-dependent HRs might look like. In some cases, it may be more reasonable to assume that a treatment effect dwindles over time, for example with treatments that are given for a relatively short period after randomization, than for it to remain constant. However, HR patterns between treatments whose modes of action differ (e.g. surgery vs chemotherapy, or targeted agent vs conventional therapy) may be hard to predict. One strategy is to compare the sample sizes arising from different plausible scenarios that include PH and non-PH examples, as we have done for the SORCE example. We would then make an informed choice based on the available evidence and on biological reasoning about the likely treatment effects. Probably, the easiest way to specify a hypothetical treatment-effect pattern is through time-dependent HRs or equivalently through the implied survival curves.

A key element of the discussion of the relative merits of the HR and the difference in RMST as outcome measures concerns relative versus absolute effects. The HR is a relative measure which indicates neither the time to event nor the survival probability in each trial arm. Under the PH assumption, it is independent of time. The RMST difference measures the effect of treatment on the restricted survival time at some t*. The values of RMST within each trial arm are absolute measures of survival time. This dual mode of presentation, as both a relative and an absolute measure, is an important advantage of RMST. In our view, the HR’s lack of any absolute component means the HR is incomplete as an outcome measure. It needs to be accompanied by other statistics, such as the estimated median survival times and/or the survival probabilities at specific time(s), or indeed the RMST.

The HR can seem impressively large even when the absolute effect on the time to event is small. Over-stressing the importance of apparently large relative risks has often been criticized in the medical and popular scientific literature as misleading to patients and physicians. For an example in the context of the benefit of breast cancer screening, see Reference [20] pp. 59 - 60. Here, we consider ASTEC vs. OE02 in Table 6. The absolute log HRs are approximately identical, yet the absolute RMST difference at t* = 5 years is some 3.4 times larger in OE02 than in ASTEC (0.29 vs. 0.09 years, i.e. about 3.5 months vs. 1 month). (These results for t* = 5 years were calculated separately; they are not given in Table 6.) The reason, of course, is that the 5-year survival probability in ASTEC is much larger than in OE02. The RMST difference of 0.09 years at t* = 5 years seen in ASTEC is arguably of little practical importance. In general, statistically significant differences in RMST from randomized trials may appear ‘small’, but they may be more realistic and clinically meaningful than superficially more impressive relative effects on the hazards. See also Royston et al’s [21] proposed graphical comparison of observed and imputed times to event between trial arms, which carries a similar message.

Here we have focused on RMST mainly as a potential design tool, having described the use of RMST in the analysis of trial data in a previous paper [1]. A standard approach to analysis would be to assume PH, test the null hypothesis of no treatment effect using the logrank test, and quote the HR from a Cox model with randomized group as the only covariate. Adjustment for other covariates (e.g. prognostic factors) is readily incorporated. There are many options available for extending the model if non-PH is detected. However, any such adaptation is likely to be data-dependent. The Cox model, whether in basic form or extended, does not readily lend itself to estimating the RMST [1]. An alternative may be to fit a piecewise exponential model with the knots used in the trial design. If too many knots are specified, the model may be over-fitted and the parameter estimates correspondingly unstable. This can happen if there are few events between a neighbouring pair of knots.

A more satisfactory analysis strategy is to use flexible parametric survival models [10-12] to estimate RMST. In summary, the PH subclass of these models incorporates a smooth estimate of the baseline log cumulative hazard function as a restricted cubic spline function of log time. The models readily lend themselves to accurate estimation of RMST and RSDST and to extensions which accommodate time-dependent treatment effects (i.e. non-PH). An advantage is that the hazard functions are more realistic than those from the piecewise exponential model, since they are smooth functions of time rather than step functions. Parameter estimation by maximum likelihood is straightforward. However, to our knowledge flexible parametric models cannot be used on their own to design a trial. Piecewise exponential models are necessary here, since they make it easy to specify the model in terms of survival probabilities and hazards and provide analytic expressions for the RMST and RSDST. Flexible parametric models are unsuitable for exploring hypothetical RMST differences associated with a design with given hazard ratio(s) and control arm survival function.

We have described the calculation of two key values of t*, namely t des and t final . The former is driven by the theoretical structure of the design and the latter by the trial data as recorded. It is important to note that t final does not depend on the treatment effect observed in the data, but on the designed difference in RMST and its observed variance as functions of t*. In particular, t final is not chosen to minimize the P-value for the treatment comparison. It is data-driven only with respect to the variance of the RMST difference. However, the value of t* at which the definitive analysis is carried out may be motivated more by clinical than statistical concerns. A plot of power or maturity against t*, as in Figure 2, may be used to determine if the data are adequate for an analysis using some preferred value of t*. If the data are not sufficient, it may be appropriate to extend the follow-up period and/or recruit more patients. In Figure 2, for example, t final = 5.4 yr has power 84 percent and maturity 83 percent under the PH design assumptions, whereas a lower value, say t* = 4 yr, might be preferred; it has power and maturity slightly reduced to about 80 and 81 percent, respectively.

An important question is whether the RMST-based sample size calculation we propose is robust enough to be put into practice. Tentatively, we believe it is. With designs in which PH is assumed and holds, the logrank- and RMST-based sample size requirements are similar (see Table 1), and the power for a given sample size is likewise similar. For designs with severe non-proportional hazards, sample sizes for logrank- and RMST-based tests can differ markedly. As always with trial design, the key assumptions about data structure and relevant parameters critically affect the required sample size, and some kind of informal sensitivity analysis should always be made.

Conclusions

In summary, we conclude that the HR can often be an inappropriate and insufficient general measure of the treatment effect in an RCT, and also that the logrank test may lack power under some patterns of non-proportional hazards. We propose that wider investigation and use of RMST in the design and analysis of trials with a time-to-event outcome is merited.

Appendix: RMST and RSDST for a piecewise exponential distribution

Assume that the survival time, $T$, has a piecewise exponential distribution with $k+1$ piecewise constant hazards $h_1, \ldots, h_k, h_{k+1}$ on a partition $(\tau_0 = 0, \tau_1], (\tau_1, \tau_2], (\tau_2, \tau_3], \ldots, (\tau_k, \tau_{k+1} = \infty)$ of the time axis. The time points $\tau_1, \ldots, \tau_k$ are known as knots. Say that $t^*$ belongs to the interval $(\tau_k, \tau_{k+1})$, so that $t^* > \tau_k$. In the simplest case ($k = 0$), we have no knots and a single exponential distribution with hazard $h_1$, and $t^* > 0$.

We wish to calculate the RMST and the RSDST at $t^*$. For $j = 0, 1, \ldots, k$ the interval length $\delta_{j+1}$ is

$$\delta_{j+1} = \begin{cases} \tau_{j+1} - \tau_j, & j < k \\ t^* - \tau_k, & j = k \end{cases}$$

The cumulative hazard function $H_j = H(\tau_j)$ at $\tau_j$ ($j = 1, \ldots, k$) equals $\sum_{i=1}^{j} h_i \delta_i$. Let $h_0 = H_0 = 0$. The survival function for $t \in (\tau_j, \tau_{j+1}]$ ($j = 0, 1, \ldots, k$) is

$$S_{j+1}(t) = e^{-H_j} e^{-h_{j+1}(t - \tau_j)}$$

For example, the survival function for $t \in (0, \tau_1]$ is $S_1(t) = e^{-0} e^{-h_1 (t - 0)} = e^{-h_1 t}$, as expected.

The integrated survival function from 0 to $t^* > \tau_k$ (i.e. the RMST) is given by

$$\mu = \int_0^{t^*} S(t)\,dt = \sum_{j=0}^{k} \int_{\tau_j}^{\tau_j + \delta_{j+1}} S_{j+1}(t)\,dt$$

and

$$\int_{\tau_j}^{\tau_j + \delta_{j+1}} S_{j+1}(t)\,dt = \int_{\tau_j}^{\tau_j + \delta_{j+1}} e^{-H_j} e^{-h_{j+1}(t - \tau_j)}\,dt = e^{-H_j} \int_0^{\delta_{j+1}} e^{-h_{j+1} u}\,du = \frac{e^{-H_j}}{h_{j+1}} \left(1 - e^{-h_{j+1}\delta_{j+1}}\right) = e^{-H_j} B_{j+1}$$

where for $j = 0, \ldots, k$

$$B_{j+1} = \frac{1 - e^{-h_{j+1}\delta_{j+1}}}{h_{j+1}} \tag{11}$$

Thus the RMST on $(0, t^*)$ is given by

$$\mu = \int_0^{t^*} S(t)\,dt = \sum_{j=0}^{k} e^{-H_j} B_{j+1}$$

We also require the expectation $E(X_j^2)$ of $X^2$ in the interval $(\tau_j, \tau_{j+1}]$ or $(\tau_k, t^*]$, which is

$$\begin{aligned} E(X_j^2) &= 2 \int_{\tau_j}^{\tau_j + \delta_{j+1}} t\,S_{j+1}(t)\,dt = 2 \int_{\tau_j}^{\tau_j + \delta_{j+1}} t\,e^{-H_j} e^{-h_{j+1}(t - \tau_j)}\,dt \\ &= 2 e^{-H_j} \int_0^{\delta_{j+1}} (t + \tau_j)\,e^{-h_{j+1} t}\,dt \\ &= 2 e^{-H_j} \left[ \int_0^{\delta_{j+1}} t\,e^{-h_{j+1} t}\,dt + \tau_j \int_0^{\delta_{j+1}} e^{-h_{j+1} t}\,dt \right] \\ &= 2 e^{-H_j} \left( A_{j+1} + \tau_j B_{j+1} \right) \end{aligned}$$

where $B_{j+1}$ is as given in (11) and

$$A_{j+1} = \int_0^{\delta_{j+1}} t\,e^{-h_{j+1} t}\,dt = \frac{1}{h_{j+1}^2} \left[ 1 - e^{-h_{j+1}\delta_{j+1}} \left(1 + h_{j+1}\delta_{j+1}\right) \right]$$

Hence

$$\begin{aligned} E(X) &= \sum_{j=0}^{k} e^{-H_j} B_{j+1} \\ E(X^2) &= 2 \sum_{j=0}^{k} e^{-H_j} \left( A_{j+1} + \tau_j B_{j+1} \right) \\ \operatorname{var}(X) &= E(X^2) - \left[E(X)\right]^2 \end{aligned} \tag{12}$$

For a single exponential with hazard $h$, we have $k = 0$, $\tau_0 = 0$, $\delta_1 = t^*$, $h_1 = h$ and therefore

$$A_1 = h^{-2} \left[ 1 - e^{-h t^*} \left(1 + h t^*\right) \right], \quad B_1 = h^{-1} \left(1 - e^{-h t^*}\right), \quad \mu = E(X) = B_1, \quad \sigma^2 = \operatorname{var}(X) = 2 A_1 - B_1^2$$
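The expressions in this appendix translate directly into a few lines of code. The sketch below is our own illustration, not the authors' software; the function name and argument layout are assumptions. It accumulates the terms of (12) interval by interval and, as a small convenience beyond the appendix, ignores any knots at or beyond t* so that t* may fall in any interval. All hazards are assumed to be strictly positive.

```python
import math


def restricted_moments(knots, hazards, t_star):
    """RMST and RSDST at t_star for a piecewise exponential distribution.

    knots   : increasing knot times tau_1, ..., tau_k (may be empty)
    hazards : k + 1 strictly positive hazards h_1, ..., h_{k+1}
    """
    starts = [0.0] + [tau for tau in knots if tau < t_star]
    ends = starts[1:] + [t_star]
    cum_hazard = 0.0                   # H_j at the start of the current interval
    ex = ex2 = 0.0
    for tau, end, h in zip(starts, ends, hazards):
        delta = end - tau
        decay = math.exp(-h * delta)
        b = (1.0 - decay) / h                              # B_{j+1}, equation (11)
        a = (1.0 - decay * (1.0 + h * delta)) / h ** 2     # A_{j+1}
        weight = math.exp(-cum_hazard)                     # exp(-H_j)
        ex += weight * b
        ex2 += 2.0 * weight * (a + tau * b)
        cum_hazard += h * delta
    return ex, math.sqrt(ex2 - ex ** 2)


# Illustrative values only: knots at 1 and 3 yr, t* = 5 yr.
rmst, rsdst = restricted_moments([1.0, 3.0], [0.1, 0.3, 0.2], 5.0)
print(round(rmst, 3), round(rsdst, 3))
```

A useful check on any implementation is that splitting a single exponential at an arbitrary knot, with the same hazard on both sides, leaves the RMST and RSDST unchanged.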

Abbreviations

ART:

Assessment of resources for trials

CI:

Confidence interval

HR:

Hazard ratio

MRC:

Medical Research Council

non-PH:

Non-proportional hazards

PH:

Proportional hazards

RCT:

Randomized controlled trial

RMST:

Restricted mean survival time

RSDST:

Restricted standard deviation of survival time.

References

  1. Royston P, Parmar MKB: The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt. Stat Med. 2011, 30: 2409-2421. 10.1002/sim.4274.

  2. Schemper M, Wakounig S, Heinze G: The estimation of average hazard ratios by weighted Cox regression. Stat Med. 2009, 28: 2473-2489. 10.1002/sim.3623.

  3. Mok TS, Wu YL, Thongprasert S, Yang CH, Chu DT, Saijo N, Sunpaweravong P, Han B, Margono B, Ichinose Y, Nishiwaki Y, Ohe Y, Yang JJ, Chewaskulyong B, Jiang H, Duffield EL, Watkins CL, Armour AA, Fukuoka M: Gefitinib or carboplatin - paclitaxel in pulmonary adenocarcinoma. N Engl J Med. 2009, 361: 947-957. 10.1056/NEJMoa0810699.

  4. Kristensen G, Perren T, Qian W, Pfisterer J, Ledermann JA, Joly F, Carey MS: Result of interim analysis of overall survival in the GCIG ICON7 phase III randomized trial of bevacizumab in women with newly diagnosed ovarian cancer. J Clin Oncol. 2011, 29 (S): LBA5006-

  5. Andersen PK, Perme MP: Pseudo-observations in survival analysis. Stat Methods Med Res. 2010, 19: 71-99. 10.1177/0962280209105020.

  6. Barthel FMS, Babiker A, Royston P, Parmar MKB: Evaluation of sample size and power for multi-arm survival trials allowing for non-uniform accrual, non-proportional hazards, loss to follow-up and cross-over. Stat Med. 2006, 25: 2521-2542. 10.1002/sim.2517.

  7. Irwin JO: The standard error of an estimate of expectation of life, with special reference to expectation of tumourless life in experiments with mice. J Hyg. 1949, 47: 188-189. 10.1017/S0022172400014443.

  8. Barthel FMS, Royston P, Babiker A: A menu-driven facility for complex sample size calculation in randomized controlled trials with a survival or a binary outcome: update. Stata J. 2005, 5: 123-129.

  9. Rosner B: Fundamentals of Biostatistics. 2006, Belmont, CA: Duxbury Press

  10. Royston P, Parmar MKB: Flexible proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat Med. 2002, 21: 2175-2197. 10.1002/sim.1203.

  11. Royston P, Lambert PC: Flexible parametric survival analysis using Stata: beyond the Cox model. 2011, College Station, TX: Stata Press

  12. Lambert PC, Royston P: Further development of flexible parametric models for survival analysis. Stata J. 2009, 9: 265-290.

  13. McGuire WP, Hoskins WJ, Brady MF, Kucera PR, Partridge EE, Look KY, Clarke-Pearson DL, Davidson M: Cyclophosphamide and cisplatin compared with paclitaxel and cisplatin in patients with stage III and stage IV ovarian cancer. N Engl J Med. 1996, 334: 1-6. 10.1056/NEJM199601043340101.

  14. Leibovich BC, Blute ML, Cheville JC, Lohse CM, Frank I, Kwon ED, Weaver AL, Parker AS, Zincke H: Prediction of progression after radical nephrectomy for patients with clear cell renal cell carcinoma. Cancer. 2003, 97: 1663-1771. 10.1002/cncr.11234.

  15. Gore ME, Griffin CL, Hancock B, Patel PM, Pyle L, Aitchison M, James N, Oliver RTD, Mardiak J, Hasan T, Sylvester R, Parmar MKB, Royston P, Mulders PFA: Interferon alfa-2a versus combination therapy with interferon alfa-2a, interleukin-2, and fluorouracil in patients with untreated metastatic renal cell carcinoma (MRC RE04 / EORTC GU 30012): an open-label randomised trial. Lancet. 2010, 375: 641-648. 10.1016/S0140-6736(09)61921-8.

  16. Kitchener H, Swart AM, Qian W: Efficacy of systematic pelvic lymphadenectomy in endometrial cancer (MRC ASTEC trial): a randomised study. Lancet. 2009, 373: 125-136.

  17. International collaboration of trialists on behalf of the Medical Research Council Advanced Bladder Cancer Working Party: Neoadjuvant cisplatin, methotrexate and vinblastine chemotherapy for muscle-invasive bladder cancer: a randomised controlled trial. Lancet. 1999, 354: 533-540.

  18. Parmar MK, Ledermann JA, Colombo N: Paclitaxel plus platinum-based chemotherapy versus conventional platinum-based chemotherapy in women with relapsed ovarian cancer: the ICON4/AGO-OVAR-2.2 trial. Lancet. 2003, 361: 2099-2106.

  19. Medical Research Council Oesophageal Cancer Working Party: Surgical resection with or without preoperative chemotherapy in oesophageal cancer: a randomised controlled trial. Lancet. 2002, 359: 1727-1733.

  20. Gigerenzer G: Reckoning with risk. 2002, London, UK: Penguin Press

  21. Royston P, Parmar MKB, Altman DG: Visualizing length of survival in time-to-event studies: a complement to Kaplan-Meier plots. J Natl Cancer Inst. 2008, 100: 92-97. 10.1093/jnci/djm265.


Acknowledgements

We are grateful to M. Sydes and two reviewers (Y. Wang and S. Wacholder) for their comments, which have helped us to strengthen the manuscript.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Patrick Royston.


Rights and licenses

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Royston, P., Parmar, M.K. Restricted mean survival time: an alternative to the hazard ratio for the design and analysis of randomized trials with a time-to-event outcome. BMC Med Res Methodol 13, 152 (2013). https://doi.org/10.1186/1471-2288-13-152


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/1471-2288-13-152

Keywords