How to calculate the required sample size for the chi-square test, Fisher's exact test, Student's t-test and the log-rank test?

Sample size calculation can be hard for researchers: when a result is not statistically significant, it is difficult to tell whether the sample size was simply too small. Calculating the required sample size in advance is therefore very important.

χ2 test without correction

To compare the survival rate (or efficacy) between a risk/intervention group and a control group, a χ2 test is used. The required sample size can be calculated with the following formula. With a two-tailed significance level (α) of 0.05 and a statistical power (1 – β) of 0.8, Zα/2 is 1.96 and Zβ is 0.84, respectively (Zβ is a one-sided quantile).

\displaystyle N_0 = \frac{\left(Z_{\alpha/2}\sqrt{(1+\phi)\bar{p}(1 - \bar{p})} + Z_\beta\sqrt{\phi p_0(1 - p_0) + p_1(1 - p_1)}\right)^2}{\phi\delta^2}

\displaystyle N_1 = \phi N_0

If the effect size is expressed as an odds ratio (OR) instead of the difference δ, the sample size can be calculated with the formula below.

\displaystyle N_0 = \left(\frac{1 + \phi}{\phi}\right)\frac{(Z_{\alpha/2} + Z_\beta)^2}{(\log{OR})^2\bar{p}(1 - \bar{p})}

\displaystyle N_1 = \phi N_0

\displaystyle N_0 : required sample size of the control group.

\displaystyle N_1 : required sample size of the risk/intervention group.

\displaystyle n_0 : actual number of cases in the control group.

\displaystyle n_1 : actual number of cases in the risk/intervention group.

\displaystyle \phi = \frac{n_1}{n_0}: the ratio of the size of the risk/intervention group to the size of the control group.

\displaystyle p_0 : survival rate or efficacy in control group.

\displaystyle p_1 : survival rate or efficacy in risk/intervention group.

\displaystyle \delta = p_1 - p_0 : effect size, difference between two groups.

\displaystyle \bar{p} = \frac{p_0 + \phi p_1}{1 + \phi}
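The two χ2 formulas above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the function names (`z_values`, `n0_chi_square`, `n0_odds_ratio`) are hypothetical, and the Z values are taken from the standard normal distribution in the standard library instead of the rounded 1.96 and 0.84, so results may differ slightly from hand calculations.

```python
from math import log, sqrt
from statistics import NormalDist


def z_values(alpha=0.05, power=0.8):
    """Return (Z_alpha/2, Z_beta) for a two-tailed alpha and the given power."""
    nd = NormalDist()  # standard normal distribution
    return nd.inv_cdf(1 - alpha / 2), nd.inv_cdf(power)


def n0_chi_square(p0, p1, phi=1.0, alpha=0.05, power=0.8):
    """Control-group size N0 for the uncorrected chi-square test.

    Assumes 0 < p0, p1 < 1 and p0 != p1; phi = n1 / n0.
    """
    z_a, z_b = z_values(alpha, power)
    delta = p1 - p0
    p_bar = (p0 + phi * p1) / (1 + phi)
    numerator = (
        z_a * sqrt((1 + phi) * p_bar * (1 - p_bar))
        + z_b * sqrt(phi * p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return numerator / (phi * delta ** 2)


def n0_odds_ratio(odds_ratio, p_bar, phi=1.0, alpha=0.05, power=0.8):
    """Control-group size N0 when the effect is given as an odds ratio.

    Assumes odds_ratio != 1 and 0 < p_bar < 1.
    """
    z_a, z_b = z_values(alpha, power)
    return (1 + phi) / phi * (z_a + z_b) ** 2 / (
        log(odds_ratio) ** 2 * p_bar * (1 - p_bar)
    )


if __name__ == "__main__":
    n0 = n0_chi_square(p0=0.5, p1=0.7, phi=1.0)
    print("N0 =", n0, "N1 =", 1.0 * n0)  # N1 = phi * N0
```

The functions return unrounded values; in practice each group size would be rounded up to the next integer.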

χ2 test with Yates correction and Fisher exact test

For the χ2 test with Yates correction or Fisher's exact test, N0 has to be corrected by multiplying it by the correction term C below.

\displaystyle C = \frac{1}{4}\left(1 + \sqrt{1 + \frac{2 (1 + \phi)}{\phi N_0 |\delta|}}\right)^2
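As a rough illustration, the correction can be applied to an uncorrected N0 like this. The function name `yates_corrected_n0` is hypothetical, and the input N0 is assumed to come from the uncorrected χ2 formula.

```python
from math import sqrt


def yates_corrected_n0(n0, delta, phi=1.0):
    """Multiply the uncorrected N0 by the correction term C.

    n0    : uncorrected control-group size (must be > 0)
    delta : p1 - p0 (must be non-zero; only its absolute value is used)
    phi   : n1 / n0
    """
    c = 0.25 * (1 + sqrt(1 + 2 * (1 + phi) / (phi * n0 * abs(delta)))) ** 2
    return c * n0


if __name__ == "__main__":
    corrected = yates_corrected_n0(n0=93, delta=0.2, phi=1.0)
    print("corrected N0 =", corrected, "corrected N1 =", 1.0 * corrected)
```

Because the term under the square root is always greater than 1, C is greater than 1 and the corrected sample size is always larger than the uncorrected one.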

Student’s t-test

For Student's t-test, first calculate the standardized effect size (Δ) from the mean of the control group (μ0), the mean of the risk/intervention group (μ1) and the common standard deviation (σ). The sample size is then calculated from Δ as below. The variances of the control group and the risk/intervention group are assumed to be equal.

\displaystyle \Delta = \frac{|\mu_0 - \mu_1|}{\sigma}

\displaystyle N_0 = \left(\frac{1 + \phi}{\phi}\right)\frac{(Z_{\alpha/2} + Z_{\beta})^2}{\Delta^2} + \frac{Z_{\alpha/2}^2}{2(1 + \phi)}

\displaystyle N_1 = \phi N_0
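A minimal Python sketch of the t-test formula above, assuming a common standard deviation `sigma` for both groups. The function name `n0_t_test` is hypothetical, and the Z values come from the standard library's normal distribution, not from the rounded 1.96 and 0.84.

```python
from statistics import NormalDist


def n0_t_test(mu0, mu1, sigma, phi=1.0, alpha=0.05, power=0.8):
    """Control-group size N0 for Student's t-test (equal variances assumed).

    Assumes mu0 != mu1 and sigma > 0; phi = n1 / n0.
    """
    nd = NormalDist()  # standard normal distribution
    z_a = nd.inv_cdf(1 - alpha / 2)
    z_b = nd.inv_cdf(power)
    effect = abs(mu0 - mu1) / sigma  # standardized effect size (Delta)
    return (1 + phi) / phi * (z_a + z_b) ** 2 / effect ** 2 + z_a ** 2 / (
        2 * (1 + phi)
    )


if __name__ == "__main__":
    n0 = n0_t_test(mu0=10.0, mu1=12.0, sigma=4.0, phi=1.0)
    print("N0 =", n0, "N1 =", 1.0 * n0)  # N1 = phi * N0
```

The second term of the formula is a small additive adjustment to the normal approximation; it does not depend on Δ.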

log-rank test

For the log-rank test, the required number of events (e) and the sample size (N) can be calculated with Freedman's method using the formulas below. p0 and p1 are the cumulative survival rates of the control group and the risk/intervention group, respectively, taken from previous research or observed 1 or 2 years after the start of the study. When φ is 1, that is, when both groups have the same sample size, the result is the same as the one described in How to calculate appropriate sample size in Cox proportional hazard analysis with cross tabulation?.

\displaystyle \theta = \frac{\log(p_1)}{\log(p_0)}

\displaystyle e_0 = \frac{1}{(1 + \phi)\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2(Z_{\alpha/2} + Z_\beta)^2

\displaystyle e_1 = \phi e_0 = \frac{1}{1 + \phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2(Z_{\alpha/2} + Z_\beta)^2

\displaystyle e = e_0 + e_1 = \frac{1}{\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2(Z_{\alpha/2} + Z_\beta)^2

\displaystyle N_0 = \frac{e}{(1 - p_0) + \phi(1 - p_1)} = \frac{1}{\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2\frac{(Z_{\alpha/2} + Z_\beta)^2}{(1 - p_0) + \phi(1 - p_1)}

\displaystyle N_1 = \phi N_0

\displaystyle N = N_0 + N_1 = \frac{1 + \phi}{\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2\frac{(Z_{\alpha/2} + Z_\beta)^2}{(1 - p_0) + \phi(1 - p_1)}

\displaystyle N_0 : required sample size of the control group.

\displaystyle N_1 : required sample size of the risk/intervention group.

\displaystyle n_0 : actual number of cases in the control group.

\displaystyle n_1 : actual number of cases in the risk/intervention group.

\displaystyle \phi = \frac{n_1}{n_0} : the ratio of the size of the risk/intervention group to the size of the control group.

\displaystyle p_0 : survival rate or efficacy of control group.

\displaystyle p_1 : survival rate or efficacy of risk/intervention group.
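The log-rank formulas above can be sketched in Python as follows. The function name `log_rank_sample_size` and the dictionary keys are hypothetical, chosen only for this illustration, and the Z values are computed from the standard normal distribution, not taken as the rounded 1.96 and 0.84.

```python
from math import log
from statistics import NormalDist


def log_rank_sample_size(p0, p1, phi=1.0, alpha=0.05, power=0.8):
    """Required events and sample sizes for the log-rank test.

    p0, p1 : cumulative survival rates (0 < p < 1, p0 != p1)
    phi    : n1 / n0
    """
    nd = NormalDist()  # standard normal distribution
    z_sum = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    theta = log(p1) / log(p0)
    # Total number of events e = e0 + e1
    events = (1 / phi) * ((1 + phi * theta) / (1 - theta)) ** 2 * z_sum ** 2
    # Convert events to subjects using the expected event proportions
    n0 = events / ((1 - p0) + phi * (1 - p1))
    n1 = phi * n0
    return {"events": events, "n0": n0, "n1": n1, "n": n0 + n1}


if __name__ == "__main__":
    result = log_rank_sample_size(p0=0.5, p1=0.7, phi=1.0)
    for name, value in result.items():
        print(name, "=", value)
```

The returned values are not rounded; the number of events and each group size would normally be rounded up to the next integer.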

References:
TABLES OF THE NUMBER OF PATIENTS REQUIRED IN CLINICAL TRIALS USING THE LOG RANK TEST

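How to calculate the sample size for the χ2 test, Fisher's exact test, Student's t-test and the log-rank test

Sample size calculation is important, because when a result is not statistically significant, many researchers cannot judge whether an insufficient sample size was the cause. Not every test can be covered here, but this post describes how to calculate the sample size for the main tests that seem most important.

χ2 test

To compare the efficacy or survival rate between a risk/intervention group and a control group, a χ2 test is performed, and the required sample size is calculated with the formula below. With α = 0.05 (two-sided) and 1 – β = 0.8 (one-sided), Zα/2 = 1.96 and Zβ = 0.84 are used in the calculation.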

\displaystyle N_0 = \frac{\left(Z_{\alpha/2}\sqrt{(1+\phi)\bar{p}(1 - \bar{p})} + Z_\beta\sqrt{\phi p_0(1 - p_0) + p_1(1 - p_1)}\right)^2}{\phi\delta^2}

\displaystyle N_1 = \phi N_0

When the effect size can be expressed as an odds ratio, the sample size is given by the formula below.

\displaystyle N_0 = \left(\frac{1 + \phi}{\phi}\right)\frac{(Z_{\alpha/2} + Z_\beta)^2}{(\log{OR})^2\bar{p}(1 - \bar{p})}

\displaystyle N_1 = \phi N_0

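\displaystyle N_0 : required sample size of the control group

\displaystyle N_1 : required sample size of the risk/intervention group

\displaystyle n_0 : actual number of cases in the control group

\displaystyle n_1 : actual number of cases in the risk/intervention group

\displaystyle \phi = \frac{n_1}{n_0}

\displaystyle p_0 : efficacy or survival rate of the control group

\displaystyle p_1 : efficacy or survival rate of the risk/intervention group

\displaystyle \delta = p_1 - p_0 : difference in efficacy or survival rate between the risk/intervention group and the control group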

\displaystyle \bar{p} = \frac{p_0 + \phi p_1}{1 + \phi}

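χ2 test with Yates correction and Fisher's exact test

For the χ2 test with Yates correction or Fisher's exact test, N0 has to be corrected by multiplying it by the correction term C. The linked book adds 1 to the term inside the square root, whereas the textbook by 森實敏夫 does not. The formula used in the sample size calculation on the web, however, is correct.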

\displaystyle C = \frac{1}{4}\left(1 + \sqrt{1 + \frac{2 (1 + \phi)}{\phi N_0 |\delta|}}\right)^2

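Student's t-test

The effect size Δ is calculated from the mean of the control group μ0 and the mean of the risk/intervention group μ1, and the sample size is then obtained from Δ. The variances of the control group and the risk/intervention group are assumed to be equal.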

\displaystyle \Delta = \frac{|\mu_0 - \mu_1|}{\sigma}

\displaystyle N_0 = \left(\frac{1 + \phi}{\phi}\right)\frac{(Z_{\alpha/2} + Z_{\beta})^2}{\Delta^2} + \frac{Z_{\alpha/2}^2}{2(1 + \phi)}

\displaystyle N_1 = \phi N_0

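log-rank test

For the log-rank test, the required number of events e and the sample size N are obtained with Freedman's method using the formulas below. p0 and p1 are cumulative survival rates taken from previous research or observed 1 to 2 years after the start of the trial. When φ = 1, the result is the same as the formula explained in How to calculate appropriate sample size in Cox proportional hazard analysis with cross tabulation?.

The description in the textbook by 森實敏夫 contains an error. In the denominator of the formula for the number of events e, φ is outside the parentheses in Freedman's original paper but inside the parentheses in the textbook. The formula used in the sample size calculation on the web is correct. Freedman's original paper, listed in the references, is not freely available.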

\displaystyle \theta = \frac{\log(p_1)}{\log(p_0)}

\displaystyle e_0 = \frac{1}{(1 + \phi)\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2(Z_{\alpha/2} + Z_\beta)^2

\displaystyle e_1 = \phi e_0 = \frac{1}{1 + \phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2(Z_{\alpha/2} + Z_\beta)^2

\displaystyle e = e_0 + e_1 = \frac{1}{\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2(Z_{\alpha/2} + Z_\beta)^2

\displaystyle N_0 = \frac{e}{(1 - p_0) + \phi(1 - p_1)} = \frac{1}{\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2\frac{(Z_{\alpha/2} + Z_\beta)^2}{(1 - p_0) + \phi(1 - p_1)}

\displaystyle N_1 = \phi N_0

\displaystyle N = N_0 + N_1 = \frac{1 + \phi}{\phi}\left(\frac{1 + \phi\theta}{1 - \theta}\right)^2\frac{(Z_{\alpha/2} + Z_\beta)^2}{(1 - p_0) + \phi(1 - p_1)}

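\displaystyle N_0 : required sample size of the control group

\displaystyle N_1 : required sample size of the risk/intervention group

\displaystyle n_0 : actual number of cases in the control group

\displaystyle n_1 : actual number of cases in the risk/intervention group

\displaystyle \phi = \frac{n_1}{n_0} : the ratio of the number of cases in the risk/intervention group to the number of cases in the control group

\displaystyle p_0 : efficacy or survival rate of the control group

\displaystyle p_1 : efficacy or survival rate of the risk/intervention group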

References:
TABLES OF THE NUMBER OF PATIENTS REQUIRED IN CLINICAL TRIALS USING THE LOG RANK TEST