 Research article
 Open Access
Sample size calculations based on a difference in medians for positively skewed outcomes in health care studies
BMC Medical Research Methodology volume 17, Article number: 157 (2017)
Abstract
Background
In healthcare research, outcomes with skewed probability distributions are common. Sample size calculations for such outcomes are typically based on estimates on a transformed scale (e.g. log) which may sometimes be difficult to obtain. In contrast, estimates of median and variance on the untransformed scale are generally easier to prespecify. The aim of this paper is to describe how to calculate a sample size for a two group comparison of interest based on median and untransformed variance estimates for lognormal outcome data.
Methods
A lognormal distribution for outcome data is assumed and a sample size calculation approach for a two-sample t-test that compares log-transformed outcome data is demonstrated, where the change of interest is specified as a difference in median values on the untransformed scale. A simulation study is used to compare the method with a nonparametric alternative (the Mann-Whitney U test) in a variety of scenarios, and the method is applied to a real example in neurosurgery.
Results
The method attained a nominal power value in simulation studies and compared favourably with a Mann-Whitney U test and a two-sample t-test of untransformed outcomes. In addition, the method can be adjusted and used in some situations where the outcome distribution is not strictly lognormal.
Conclusions
We recommend the use of this sample size calculation approach for outcome data that are expected to be positively skewed and where a two-group comparison on a log-transformed scale is planned. An advantage of this method over usual calculations based on estimates on the log-transformed scale is that it allows clinical efficacy to be specified as a difference in medians and requires a variance estimate on the untransformed scale. Such estimates are often easier to obtain and more interpretable than those for log-transformed outcomes.
Background
In most clinical studies, sample size calculations are important at the study design stage [1–3]. A typical objective of such studies is to test for a difference in the distribution of some outcome of interest between two or more groups using a hypothesis test. A sample size calculation helps to ensure that a study has the correct power to reject the null hypothesis, thereby providing conclusive evidence of a true difference between groups, where such evidence exists. Where a study does not have a high level of power, the probability of rejecting the null hypothesis is low and, as a result, evidence in support of a difference between groups may not be detected.
For studies in which the outcome of interest has a continuous distribution, a two-sample t-test is often used to test the null hypothesis that the mean outcome is the same for two groups. Use of the two-sample t-test relies on an underlying assumption that this outcome is normally distributed. The required sample size is then calculated on the basis of a prespecified minimum clinically significant difference in means between groups and an estimate for the variance of the outcome (or, equivalently, a standardised mean difference), together with a desired hypothesis test power and significance level. However, many health care outcomes are not normally distributed and instead have skewed distributions. For example: quality-of-life measures [4], tumour size or features in cancer patients [5, 6] and time outcomes [7, 8], amongst many others. For such data, nonparametric tests might be considered (for example, a Mann-Whitney U test), although such tests may have reduced power and can require a substantial inflation of the sample size [9, 10]. Alternatively, outcomes may be transformed to obtain a normal distribution and enable the use of a two-sample t-test. For positively skewed outcome data, a common transformation is the natural logarithm of the outcome [11], with a t-test then used to test the null hypothesis that mean values on the log-transformed scale are equal. In this case, standard sample size calculations based on t-tests or Z-tests are performed for the transformed data, using prespecified means and variances on the log scale.
A potential problem with taking this approach can be specifying accurate or appropriate values of standardised mean differences (or, equivalently, group means and standard deviations) for the log-transformed outcomes. Often, it may be more natural for health care practitioners or trialists to have knowledge of such values on the untransformed scale, since these typically represent more clinically relevant or meaningful measures. A natural choice for summarising skewed data on the untransformed scale is the median. Unlike the mean, the median is not unduly influenced by extreme values. Estimates of medians are likely to be more readily available and interpretable than the alternative of specifying means on the log scale. In addition, specifying variances of untransformed values is also likely to be easier for a study’s research team and probably more accurate than the specification of variances on the log scale.
Here, we describe how a sample size can be calculated for a two-group comparison of a lognormal outcome, based on estimates of the median outcome and untransformed standard deviation in each group, and evaluate this approach in a variety of settings. The method is relevant for both randomised trials and observational studies where we plan to test the null hypothesis that log-scale means are equal or, equivalently, the null hypothesis that medians on the untransformed scale are equal. The approach can be extended using usual methods to incorporate other complexities such as clustering, unequal allocation or adjustments for confounding.
Methods
We assume that we have a study for which the outcome of interest is positively skewed and that we wish to perform a hypothesis test that compares outcomes between two independent groups, for example, a parallel-group randomised controlled trial in which outcomes are compared for a placebo group and an active treatment group. Throughout, we define the two groups as groups 1 and 2. Furthermore, for simplicity, sample size calculations are performed assuming that these groups are equally sized. Assuming that each group has size n, we denote T _{ ij } as the positively skewed outcome value for individual i (i∈{1,…,n}) in group j (j∈{1,2}). We assume that, in each group, the primary outcome T _{ ij } has a lognormal distribution. That is
$$T_{ij} \sim \log\mathcal{N}\left(\mu_{j},\sigma_{j}^{2}\right)$$
and hence
$$\log(T_{ij}) \sim \mathcal{N}\left(\mu_{j},\sigma_{j}^{2}\right).$$
To compare groups, the usual null (H_{0}) and alternative (H_{1}) hypotheses tested would be:
$$H_{0}: \mu_{1} = \mu_{2} \quad \text{versus} \quad H_{1}: \mu_{1} \neq \mu_{2}.$$
In words, this denotes a test of the null hypothesis that the log-scale means are equal for groups 1 and 2 against a two-sided alternative. For a lognormal distribution, this test is equivalent to a test of the null hypothesis that the medians of the untransformed outcomes are equal for groups 1 and 2.
This test could be performed using a two-sample t-test between groups with the log-transformed outcome values. A standard sample size calculation for such a test would rely on the specification of a minimal clinically relevant difference in mean log-transformed outcome values between groups, together with an estimate of the variance of the log-transformed primary outcome values for each group. For many outcomes, health care researchers or clinicians may not be familiar with their outcome on the log scale and may find it difficult to specify the requested estimates. In contrast, clinicians may have a more precise idea of the approximate median outcome value for each group together with an appreciation of the variance of the untransformed outcome values, perhaps from pilot studies, clinical observation/expertise or results in relevant literature. Alternatively, an estimate of the median for the control group could be specified, together with an anticipated difference in median values between the groups at the conclusion of the study.
Where outcome data have a lognormal distribution, the mean of the log-transformed outcome can be easily calculated as the natural logarithm of the median on the untransformed scale. However, it is more challenging to recover the variance of the log-transformed outcomes and, in many cases, prespecifying variances of log-transformed outcome data may not be straightforward. We now demonstrate how the sample size calculation for a two-group t-test comparison of a lognormal outcome can be obtained using medians and variances specified for each group on the untransformed scale. We define
$$m_{j} = \text{median}(T_{ij}) \qquad \text{and} \qquad \phi_{j}^{2} = \text{Var}(T_{ij}), \qquad j\in\{1,2\}.$$
It can be shown (see Additional file 1, or [12]) that
$$\text{median}(T_{ij}) = e^{\mu_{j}},$$
due to the symmetry of the distribution of log(T _{ ij }). In other words, for a lognormal random variable, the population-level geometric mean (log-scale mean) is equal to the population median. As a result, on specification of approximate median values of the primary outcome for the two groups, the difference in means on the log scale is written
$$\tau = \mu_{1} - \mu_{2} = \log(m_{1}) - \log(m_{2}) = \log\!\left(\frac{m_{1}}{m_{2}}\right). \qquad (1)$$
Here, τ corresponds to the minimal clinically important difference in the primary outcome on the log scale, but we note that this has been constructed using median values of the untransformed primary outcome, which may be easier to prespecify. The variance of the untransformed outcome for group j is \(\phi ^{2}_{j}\) and, as mentioned previously, it is likely that the variance of an untransformed outcome is easier and more meaningful to prespecify than that of a transformed outcome variable. It can be shown (see ‘Additional file 1’) that the variance of the log-transformed primary outcome for group j, \(\sigma _{j}^{2}\), is related to the variance of the corresponding untransformed primary outcome as follows
$$\sigma^{2}_{j} = \log\left(\frac{1}{2}+\sqrt{\frac{1}{4}+\frac{\phi_{j}^{2}}{m_{j}^{2}}}\right). \qquad (2)$$
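As a quick numerical sanity check (ours, not from the paper), the median identity and the variance relationship in Eq. (2) can be verified by simulation:

```python
import numpy as np

# For lognormal data, the sample median should approximate exp(mu), and
# substituting the untransformed variance into Eq. (2) should recover the
# log-scale variance sigma^2.
rng = np.random.default_rng(0)
mu, sigma = 1.0, 0.5
t = rng.lognormal(mu, sigma, 1_000_000)

median_t = np.median(t)                  # should be close to exp(mu)
phi2 = t.var()                           # untransformed variance
m = np.exp(mu)                           # population median
sigma2_from_eq2 = np.log(0.5 + np.sqrt(0.25 + phi2 / m ** 2))  # Eq. (2)
```

With these parameters, `median_t` is close to e ≈ 2.718 and `sigma2_from_eq2` is close to σ² = 0.25, as Eq. (2) predicts.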
To compare groups, a two-sample t-test is performed using the log-transformed outcome variables. The hypotheses are given by
$$H_{0}: \mu_{1} = \mu_{2} \quad \text{versus} \quad H_{1}: \mu_{1} \neq \mu_{2}. \qquad (3)$$
In other words, the test is of the null hypothesis that the log-scale means are equal. We note that μ _{ j }= log(m _{ j }) (j∈{1,2}) and, as such, this test may be used to test the null hypothesis of equal medians on the untransformed scale. Taking the standard sample size calculation formula for a two-sample t-test with equal group sizes [13] and using (1) and (2), the number of patients per group (n) is given by
$$n = \frac{\left(z_{\frac{\alpha}{2}} + z_{\beta}\right)^{2}\left(\sigma_{1}^{2} + \sigma_{2}^{2}\right)}{\tau^{2}}. \qquad (4)$$
Here, z _{ ε } denotes the value such that \(\mathbb {P}(Z>z_{\epsilon }) = \epsilon \) for a standard normal random variable \(Z\sim \mathcal {N}(0,1)\), so \(z_{\frac {\alpha }{2}}\) and z _{ β } denote quantiles pertaining to a significance level of 100α% and power of 100(1−β)%. To summarise, Eq. (4) allows a sample size calculation to be performed easily for a two-group comparison of untransformed medians or log-scale means on prespecification of untransformed medians m _{1} and m _{2} and untransformed standard deviations ϕ _{1} and ϕ _{2}.
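Equation (4), together with the variance relationship in Eq. (2), is straightforward to implement. The following Python sketch is ours (the function name and defaults are illustrative, not from the paper):

```python
import math
from statistics import NormalDist

def lognormal_sample_size(m1, m2, phi1, phi2, alpha=0.05, power=0.9):
    """Per-group sample size from untransformed medians (m) and SDs (phi).

    Implements Eq. (4), with the log-scale variances obtained via Eq. (2)
    and the log-scale mean difference via Eq. (1).
    """
    def log_var(m, phi):
        # Eq. (2): variance of log(T) from the untransformed median and SD
        return math.log(0.5 + math.sqrt(0.25 + phi ** 2 / m ** 2))

    tau = math.log(m1 / m2)                        # Eq. (1)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_{beta}
    sigma2_sum = log_var(m1, phi1) + log_var(m2, phi2)
    return math.ceil((z_alpha + z_beta) ** 2 * sigma2_sum / tau ** 2)
```

For example, medians of 2 and 1 with an untransformed standard deviation of 1 in both groups give 15 per group at 90% power and a 5% two-sided significance level.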
A common approach taken when conducting sample size calculations for normally distributed outcomes is to assume a common standard deviation for the outcomes in both groups. With the method considered in this paper, an assumption of common standard deviation values for the untransformed outcomes (i.e. ϕ _{1}=ϕ _{2}) would still imply that the standard deviations of the transformed outcomes (σ _{1} and σ _{2} in Eq. (2)) are different, owing to the expected difference between m _{1} and m _{2}. In addition, the formula for the sample size given in Eq. (4) is based on a normal distribution, whereas the hypothesis test of interest is a two-sample t-test. As a result, for smaller sample sizes, it may be sensible to increase the sample size slightly, in line with other sample size calculation methods for a two-sample t-test.
The method presented in Eq. (4) is not only applicable to situations where a comparison of medians between groups is desired. The relationship between the untransformed median and the log-scale mean for a lognormal distribution also implies that the method is useful for situations in which linear regression models are fitted to log-transformed data and used for inference under an assumption that the log-transformed outcomes are approximately normally distributed. Such models are used frequently in medical statistics.
To explore and evaluate the accuracy of Eq. (4) as a method for sample size calculation, we perform a simulation study in which the power of the hypothesis test can be estimated in a variety of scenarios and compared to other common approaches.
Simulation study
To perform the simulation study, prespecified untransformed median values (m _{1} and m _{2}) were chosen together with corresponding untransformed standard deviations ϕ _{1} and ϕ _{2}. Using these parameters, and for a chosen power and significance level, a sample size n was calculated analytically using the method outlined in the “Methods” section, specifically Eq. (4). The aim was to assess whether or not the analytically derived sample size would attain the desired level of power when a two-sample t-test comparing log-transformed outcomes between groups is performed. The null and alternative hypotheses for this test were specified in the previous section (3). In addition, we aimed to compare this test to a Mann-Whitney U test and to a two-sample t-test that compared untransformed outcomes between groups. The algorithm for the simulation process was as follows:

1.
At random, draw n values from the distribution \(\log \mathcal {N}\left (\mu _{1},\sigma _{1}^{2}\right)\) and n values from the distribution \(\log \mathcal {N}\left (\mu _{2},\sigma _{2}^{2}\right)\) where, for j∈{1,2}:
$$\begin{array}{*{20}l} \mu_{j} &= \log(m_{j});\\ \sigma^{2}_{j} &= \log\left(\frac{1}{2}+\sqrt{\frac{1}{4}+\frac{\phi_{j}^{2}}{m_{j}^{2}}}\right). \end{array} $$The two sets of drawn values are denoted (T _{11},…,T _{ n1})^{T} and (T _{12},…,T _{ n2})^{T} respectively.

2.
The corresponding log-transformed values are computed as Y _{ ij }= log(T _{ ij }), producing two sets of log-transformed outcomes (Y _{11},…,Y _{ n1})^{T} and (Y _{12},…,Y _{ n2})^{T}. These are compared using a two-sample t-test of the null hypothesis that μ _{1}=μ _{2} (equivalently, m _{1}=m _{2} on the untransformed scale) against a two-sided alternative and assuming a 5% significance level. The outcome of the test is recorded using a binary variable (1 = ‘reject the null hypothesis’, 0 = ‘retain the null hypothesis’). In addition, a Mann-Whitney U test and a two-sample t-test are performed using the untransformed outcomes for comparative purposes.

3.
Steps 1–2 are repeated N=100,000 times and the power of the corresponding hypothesis test is calculated as the proportion of these repeated tests for which the null hypothesis is rejected.
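Steps 1–3 can be sketched in Python as follows. This is our illustrative code, using far fewer replications than the paper's N=100,000; Welch's unequal-variance t-test is used here, since Eq. (2) implies unequal log-scale variances whenever the medians differ:

```python
import numpy as np
from scipy import stats

def simulate_power(m1, m2, phi1, phi2, n, n_sim=2000, alpha=0.05, seed=1):
    """Empirical power of the two-sample t-test on log-transformed outcomes."""
    rng = np.random.default_rng(seed)

    def log_params(m, phi):
        mu = np.log(m)                                        # log-scale mean
        sigma2 = np.log(0.5 + np.sqrt(0.25 + phi**2 / m**2))  # Eq. (2)
        return mu, np.sqrt(sigma2)

    mu1, s1 = log_params(m1, phi1)
    mu2, s2 = log_params(m2, phi2)
    rejections = 0
    for _ in range(n_sim):
        t1 = rng.lognormal(mu1, s1, n)   # step 1: draw group 1 outcomes
        t2 = rng.lognormal(mu2, s2, n)   #         and group 2 outcomes
        # step 2: t-test comparing log-transformed outcomes
        _, p = stats.ttest_ind(np.log(t1), np.log(t2), equal_var=False)
        rejections += (p < alpha)
    return rejections / n_sim            # step 3: proportion of rejections
```

With the analytically derived n for a given scenario, the returned estimate should sit close to the nominal power.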
Results
Table 1 shows results of the simulation study for various prespecified median outcome values and standard deviations. The column ‘n’ denotes the analytically derived sample size calculated using Eq. (4). The columns ‘log t-test’, ‘MW test’ and ‘t-test’ denote the estimated hypothesis test powers for the t-test on log-transformed outcomes, the Mann-Whitney U test on untransformed outcomes and the t-test on untransformed outcomes, respectively. Examining Table 1, for larger sample sizes the two-sample t-test of the log-transformed outcomes between groups appears to have a power close to the nominal power value (either 0.8 or 0.9). For smaller sample sizes the power is sometimes slightly less than the nominal value. In such situations, it would be advisable to increase the sample size slightly, perhaps by one or two individuals in each group, to ensure that the desired level of power is attained. This issue is likely caused by the fact that the sample size calculation in Eq. (4) uses normal distribution quantiles \(z_{\frac {\alpha }{2}}, z_{\beta }\) while the hypothesis test performed is a two-sample t-test. This would affect any sample size calculation where a normal approximation is used and the sample size is small, and is not specific to the approach taken in this work.
The ‘MW test’ column shows that a Mann-Whitney U test on untransformed outcomes does not attain the expected power, being smaller than the desired level of power in each case. We would expect this and, consequently, suggest a suitable adjustment to the sample size calculation if a Mann-Whitney U test or other nonparametric hypothesis test were used [9, 10]. Furthermore, we note that the Mann-Whitney U test may be viewed as a test of a shift in location of the outcome variable’s probability distribution between groups. As such, it may not be desirable to consider a Mann-Whitney U test in situations where the untransformed variances differ between groups. However, in Table 1 estimated powers from the Mann-Whitney U test are not typically worse for simulation scenarios where ϕ _{1}≠ϕ _{2} when compared to scenarios where ϕ _{1}=ϕ _{2}. Hence, the issue of different variances between groups does not appear to have been too problematic here.
When considering the ‘t-test’ column of Table 1, where the untransformed standard deviation values are equal for groups 1 and 2 (ϕ _{1}=ϕ _{2}), a two-sample t-test of untransformed outcomes consistently fails to attain the desired level of power, even when the sample size is large. However, where the untransformed standard deviation values are different (ϕ _{1}≠ϕ _{2}) we see that there are some scenarios where a power greater than the prespecified power is attained. We note that the two-sample t-test tests the null hypothesis that the untransformed population mean values are equal. Here, for untransformed lognormal outcomes, the population mean for group j is given by \(m_{j}\exp \left ({\sigma ^{2}_{j}}/2\right)\) and hence if m _{1}=m _{2} but ϕ _{1}≠ϕ _{2} (implying that \(\sigma _{1}^{2} \neq \sigma _{2}^{2}\)), the null hypothesis for a two-sample t-test of untransformed outcomes would never be true. This explains why the estimated power can be considerably higher than the prespecified value in this situation. Naturally, we would not usually recommend a two-sample t-test on untransformed outcomes where the data have a lognormal or other positively skewed probability distribution and, in most cases, a histogram of outcome data would indicate that a suitable transformation is required prior to using a t-test.
Overall, the results in Table 1 indicate that the analytical method given in Eq. (4) appears to give correct sample sizes for desired hypothesis test powers based on median values and untransformed standard deviations for outcomes of interest. Some adjustment may be necessary for smaller sample sizes but such adjustment would be recommended with most sample size calculation methods.
Sensitivity to the lognormal distributional assumption
When health outcomes have a positively skewed distribution, the distribution may not be strictly lognormal. Here, we simulate some scenarios to consider the performance of the sample size method where the distribution is skewed but not lognormal. Specifically, we examine situations where the outcome has an Exponential distribution. We note that it can be easily shown that the logarithm of an Exponential random variable is not normally distributed. As an example, Fig. 1 shows probability density functions of an Exp(2) random variable and the natural logarithm of an Exp(2) random variable. If a random variable X∼Exp(λ) then the median of X is equal to log(2)/λ and the standard deviation is 1/λ. The fact that a closed form expression for the median exists is desirable here, since our sample size calculation method (Eq. (4)) relies on prespecified median values that, with an Exponential distribution, can be directly linked to rate parameters. We perform sample size calculations using Eq. (4) for prespecified median and untransformed standard deviation values under a lognormal assumption, but then simulate data from Exponential distributions with the same median and standard deviation values. The simulation algorithm is similar to that outlined previously, except that Exponential distribution rates for groups are calculated as λ _{ j }= log(2)/m _{ j } (where m _{ j } is the prespecified median for group j) and then untransformed values are drawn from an Exp(λ _{ j }) distribution for group j. This simulation process allows the evaluation of the sample size calculation method where the outcome distribution is not strictly lognormal in that we can assess the expected level of power when performing various hypothesis tests.
Results from the simulation study where the outcome data have an Exponential distribution are shown in Table 2. Each row of Table 2 denotes a different simulation scenario and Fig. 2 shows example plots of the distributions of values and logtransformed values for each scenario to demonstrate levels of skewness. Figure 2 shows that the logtransformed outcomes generally exhibit left skewness.
Examining Table 2 we see that the estimated power for each method is below the nominal value of 0.9 for all scenarios. On examining the form of Eq. (4) and the untransformed variances for the Exponential distribution, the poor performance of the methods is unsurprising. The standard deviations used in the sample size calculation in Eq. (4) are too small to reflect the true variability of the log-transformed exponentially distributed data, resulting in sample sizes that are too small to attain the desired level of power under the proposed method. For the Exponential distributions, the untransformed variance in group j is given by
$$\phi_{j}^{2} = \frac{1}{\lambda_{j}^{2}} = \frac{m_{j}^{2}}{(\log 2)^{2}}$$
and, therefore, on substitution into the formula for the log-transformed variance in Eq. (2) we obtain
$$\sigma_{j}^{2} = \log\left(\frac{1}{2}+\sqrt{\frac{1}{4}+\frac{1}{(\log 2)^{2}}}\right) \approx 0.707$$
for all j. It can be shown that the variance of the natural logarithm of an Exponential random variable is π ^{2}/6≈1.645 (see ‘Additional file 1’). This is more than twice the assumed value of \(\sigma _{j}^{2}\) that was used for the sample size calculations given in Table 2, which explains why the estimated powers in Table 2 are low. To amend the sample size calculation in Eq. (4) for this distribution, we substitute \(\sigma _{j}^{2} = \pi ^{2}/6\) into Eq. (4) and recalculate the sample sizes using the formula
$$n = \frac{2\left(z_{\frac{\alpha}{2}} + z_{\beta}\right)^{2}\left(\pi^{2}/6\right)}{\tau^{2}}. \qquad (5)$$
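The adjusted calculation amounts to replacing the log-scale variances in Eq. (4) with the constant π²/6 in both groups; a minimal sketch (our function name, illustrative defaults):

```python
import math
from statistics import NormalDist

def exponential_adjusted_sample_size(m1, m2, alpha=0.05, power=0.9):
    """Per-group sample size from Eq. (5) for Exponential outcomes.

    The log-scale variance is fixed at pi^2/6 in both groups, the exact
    variance of the logarithm of an Exponential random variable.
    """
    z_sum = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    tau = math.log(m1 / m2)              # difference in log medians
    return math.ceil(z_sum ** 2 * 2 * (math.pi ** 2 / 6) / tau ** 2)
```

Because π²/6 is more than twice the variance implied by Eq. (2) in this setting, the sample sizes from Eq. (5) are substantially larger than those from Eq. (4).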
Table 3 shows results from a simulation study, conducted in the same way as that in Table 2, but where the analytical sample sizes have been calculated using Eq. (5). We see that calculation of the sample size using Eq. (5) results in larger sample size values, which we would expect as more accurate estimates of the transformed variance values have been used. As a result, a power close to 0.9 is attained for all simulation scenarios in which a two-sample t-test is performed on the log-transformed outcomes. However, the power is typically higher for the Mann-Whitney U test, thereby suggesting that this test would generally require a smaller sample size than the two-sample t-test on the log-transformed outcomes. This might be expected, since we can see in Fig. 2 that the log-transformed outcomes are negatively skewed and thus the data are generally more suited to analysis using a Mann-Whitney U test. Also, we note that in Table 3 the estimated powers for the t-test using untransformed outcomes are very high. Naturally, this test would not typically be used for exponentially distributed outcome data. In summary, for positively skewed outcomes that are clearly not lognormal, the sample size calculation presented in Eq. (4) should not be used without carefully considering the shape of the distribution of log-transformed outcomes and its variability.
We now present a real example in which the analytical method has been used to obtain a sample size for a two-arm randomised controlled trial in epilepsy surgery, for which the outcome of interest is the time taken for the implantation of electrodes during a surgical procedure.
Application
Some patients with refractory focal epilepsy, a type of epilepsy that is difficult to control using medication alone, may be considered for neurosurgery whereby invasive testing is required to determine the location of epileptic activity in the brain [14]. This procedure is known as stereoelectroencephalography (SEEG) and involves the implantation of SEEG electrodes into a patient’s brain in a surgical procedure. These electrodes are monitored using continual video and electroencephalography monitoring to assess brain activity. A number of different methods exist for the placement of SEEG electrodes and it is unclear which placement method is best [15].
A randomised controlled trial of SEEG electrode placement methods has been funded at the National Hospital for Neurology and Neurosurgery, London (Wellcome Trust grant number WT106882), in which the operative time (time taken for electrode implantation procedure) of the iSYS1 trajectory guidance system is to be compared with the currently used frameless mechanical arm based technique for the placement of SEEG depth electrode bolts in patients undergoing preoperative evaluation for drug resistant focal epilepsy. In brief, the iSYS1 trajectory guidance system [16] uses a robot during part of the surgical procedure for the implantation of the electrodes into a patient’s brain. It is believed that the robot insertion method will result in a significant reduction in the time taken to perform SEEG electrode placement [16, 17].
For this randomised controlled trial, the primary outcome is the time taken (in minutes) for the implantation of an electrode during surgery. Typically, each patient has 8–12 electrodes implanted during a surgical procedure and, as such, we note that a degree of patientlevel clustering is to be expected with regard to the primary outcome in this example. Patients shall be randomised to one of two groups, with equal allocation, and the two randomised groups are:

Group 1: Patients who are randomised to receive manual SEEG electrode placement;

Group 2: Patients who are randomised to receive robotguided SEEG electrode placement.
The trial investigators aimed to estimate the number of patients to recruit to the trial so that a reduction of at least 20% in the median electrode implantation time may be detected when comparing times for electrode implantation between groups. Clearly, implantation times are likely to be positively skewed and, as such, analysis of the primary outcome for this trial shall consist of a two-sample t-test of the null hypothesis of no difference in electrode implantation time between groups, comparing log-transformed implantation times. A 5% significance level and a power of 90% are assumed. For this trial, we first perform a sample size calculation using Eq. (4) where the median electrode implantation time for manual SEEG electrode placement (m _{1}) is specified, together with estimates of the standard deviation of the implantation time for an electrode for both groups (ϕ _{1},ϕ _{2}). The values assumed for the sample size calculation are:
Ignoring the clustering for now, we use Eq. (4) to compute the number of electrodes required in each arm as
This would equate to a sample size of 31 electrodes per group. Since this sample size is fairly small, and in light of the simulation study results, we increase the sample size to 32 electrodes per group. However, we note that electrodes are clustered within patients who undergo surgery and, as such, the effect of within-patient clustering should be accounted for in the sample size calculation. Clustering can be handled easily within our sample size calculation approach.
Accounting for clustering
In a similar way to other sample size calculation methods, we can calculate a design effect that is a function of the intraclass correlation coefficient and the average cluster size, and use this to update the sample size calculation to account for the likely effect of within-patient clustering [13]. The design effect is calculated as:
$$\text{def} = 1 + (m - 1)\times \text{ICC},$$
where m=10 is the average cluster size and ICC = 0.2 is an estimated value for the intraclass correlation coefficient. Here, def=2.8 and the original sample size is inflated by this factor to reflect the within-patient clustering [18], yielding a revised sample size of
$$32 \times 2.8 = 89.6.$$
This implies that each group should contain 90 electrodes, which suggests that 9 patients per group should be recruited to the trial, under the assumption of an average of 10 electrode insertions per patient. In the protocol for the SEEG Electrode Placement Randomised Controlled Trial, the number of patients to be recruited has been increased to 16 per group, to reflect the possibility of patient dropout and variable cluster size.
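The clustering adjustment reduces to a few lines of arithmetic, sketched here with the values used in the trial example:

```python
import math

def design_effect(avg_cluster_size, icc):
    """Standard design effect for clustered data: 1 + (m - 1) * ICC."""
    return 1 + (avg_cluster_size - 1) * icc

deff = design_effect(10, 0.2)               # average of 10 electrodes, ICC = 0.2
n_electrodes = math.ceil(32 * deff)         # inflate the 32-per-group sample size
n_patients = math.ceil(n_electrodes / 10)   # patients needed per group
```

This reproduces the figures above: a design effect of 2.8, 90 electrodes per group and hence 9 patients per group (before the protocol's further inflation to 16 for dropout and variable cluster size).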
Discussion
We have considered a sample size calculation method in which clinical efficacy is measured using median values for a positively skewed outcome that is assumed to follow a lognormal distribution. We chose this approach because, particularly in the case of times and other positively skewed health outcomes, comparisons between groups may be specified based on differences in medians (or, equivalently, differences in geometric means) rather than differences in arithmetic means. Furthermore, information on the variability of the untransformed outcome variable may be easier to estimate and more interpretable than that of a transformed outcome variable. In addition, the approach is applicable to situations in which interest lies in a direct comparison of the log-scale means between groups or for situations in which linear models are fitted to log-transformed outcome data. The example on SEEG electrode placement showed that this approach to sample size calculation could be applicable in clinical practice and also that clustering of outcome data can be handled within the outlined sample size method.
The analytical sample size formula presented (Eq. (4)) was accurate when evaluated using extensive simulation studies. As such, the method appears to be acceptable for estimation of the required sample size for situations in which two groups are compared and the outcome of interest is assumed to have a lognormal distribution. We note that the assumption of a lognormal distribution may not always be appropriate. A simulation study in which outcomes had an exponential distribution indicated that the formula in Eq. (4) provided sample sizes that were generally too small. However, an adjustment to Eq. (4) that used a more precise estimate of the untransformed variance yielded a formula for which the approach outlined appeared to work well. In general, we recommend that caution should be taken when using the approach presented in this paper if it is suspected that outcomes are not lognormal. In such situations we would recommend simulation studies to check that the proposed analytical sample size method is effective. Alternatively, if a different distributional assumption is made for untransformed outcomes, such as a Gamma distribution, then a likelihood ratio test statistic could be constructed based on that distributional assumption and used to calculate a sample size [19]. In some cases, it may be more appropriate to consider a nonparametric hypothesis test of untransformed outcomes, though we note that a parametric test or other analysis that relies on the specification of a probability distribution for outcome variables may be useful as a sensitivity analysis.
Additionally, as with all sample size approaches, the method depends on the prespecified values (i.e. the estimated medians and standard deviations of the untransformed outcomes). As a result, it is important to elicit prespecified values that are applicable to the study at hand. Here, we note that it may be more appropriate and perhaps easier for health care researchers to specify median values and associated standard deviations of the untransformed (clinically interpretable) outcome variable than to consider estimated values for logtransformed outcomes. Furthermore, the median may represent a more robust and intuitive summary of an outcome for a treatment group where the distribution is positively skewed, when compared to the sample mean which may be affected by extreme values.
The sample size calculations were performed under the assumption of equal numbers of individuals in the two groups; they could easily be adapted to situations in which the group sizes are unequal, using standard adjustment methods. Overall, we have presented a simple method for sample size calculation where the outcome of interest is assumed to have a log-normal distribution and a hypothesis test is performed using data from two groups. The method is applicable to problems in which a difference in median values between the groups is of clinical interest and untransformed standard deviations are specified.
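The standard unequal-allocation adjustment mentioned above can be sketched as follows. In this hypothetical Python illustration (again not the authors' code), for an allocation ratio k = n₂/n₁ the factor 2 in the equal-allocation formula is replaced by (1 + 1/k), following Var(difference) = σ²(1/n₁ + 1/n₂); the log-scale variance is recovered from the untransformed median and SD as before.

```python
from math import ceil, log, sqrt
from scipy.stats import norm

def sizes_unequal(m1, m2, sd, k=1.0, alpha=0.05, power=0.90):
    # Group sizes (n1, n2) for allocation ratio k = n2 / n1.
    # The log-scale variance s2 solves v = (e^{s2} - 1) * m1^2 * e^{s2}
    # for the untransformed variance v = sd^2 in group 1.
    s2 = log((1.0 + sqrt(1.0 + 4.0 * (sd / m1) ** 2)) / 2.0)
    delta = log(m2) - log(m1)            # difference in log medians
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    # Var(diff) = s2 * (1/n1 + 1/n2) = s2 * (1 + 1/k) / n1
    n1 = ceil((1.0 + 1.0 / k) * s2 * z**2 / delta**2)
    return n1, ceil(k * n1)

print(sizes_unequal(2.0, 3.0, 2.0, k=1.0))  # equal allocation
print(sizes_unequal(2.0, 3.0, 2.0, k=2.0))  # 2:1 allocation
```

With k = 1 this reduces to the usual equal-allocation formula; a larger k shrinks n₁ but increases the total sample size, as expected for unbalanced designs.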
Conclusions
The sample size approach that has been outlined is applicable to situations in which a comparison between medians or log-transformed means is proposed for positively skewed data, under the assumption that the data have a log-normal distribution. The method relies on pre-specified untransformed median and standard deviation values for each group, which may typically be easier to elicit from clinicians and may be more interpretable. The method may be adjusted to account for situations where the outcome data are positively skewed but not strictly log-normal.
Abbreviations
ICC: Intraclass correlation coefficient
SEEG: Stereoelectroencephalography
References
1. Freiman JA, Chalmers TC, Smith Jr. H, Kuebler RR. The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial: survey of 71 negative trials. N Engl J Med. 1978; 299(13):690–4.
2. Schulz KF, Grimes DA. Sample size calculations in randomised trials: mandatory and mystical. Lancet. 2005; 365(9467):1348–53.
3. Chow SC, Wang H, Shao J. Sample size calculations in clinical research, 2nd ed. London: CRC Press; 2007.
4. Fitzpatrick R, Fletcher A, Gore S, Spiegelhalter D, Cox D. Quality of life measures in health care. I: Applications and issues in assessment. BMJ. 1992; 305(6861):1074–7.
5. Davnall F, Yip CSP, Ljungqvist G, Selmi M, Ng F, Sanghera B, et al. Assessment of tumor heterogeneity: an emerging imaging tool for clinical practice? Insights Imaging. 2012; 3(6):573–89.
6. Wolsztynski E, O'Sullivan F, O'Sullivan J, Eary JF. Statistical assessment of treatment response in a cancer patient based on pre-therapy and post-therapy FDG-PET scans. Stat Med. 2017; 36(7):1172–200.
7. Hougaard P. Fundamentals of survival data. Biometrics. 1999; 55(1):13–22.
8. Stepaniak PS, Heij C, Mannaerts GH, de Quelerij M, de Vries G. Modeling procedure and surgical times for current procedural terminology-anesthesia-surgeon combinations and evaluation in terms of case-duration prediction and operating room efficiency: a multicenter study. Anesth Analg. 2009; 109(4):1232–45.
9. Randles RH, Wolfe DA. Introduction to the theory of nonparametric statistics. New York: Wiley; 1979.
10. Vickers AJ. Parametric versus non-parametric statistics in the analysis of randomized trials with non-normally distributed data. BMC Med Res Methodol. 2005; 5(1):35.
11. Wolfe R, Carlin JB. Sample-size calculation for a log-transformed outcome measure. Control Clin Trials. 1999; 20(6):547–54.
12. Daly LE, Bourke GJ. Interpretation and uses of medical statistics, 5th ed. Oxford: Blackwell Science; 2000.
13. Machin D, Campbell MJ, Tan SB, Tan SH. Sample size tables for clinical studies, 3rd ed. Chichester: Wiley-Blackwell; 2009.
14. de Tisi J, Peacock JL, McEvoy AW, Harkness WFJ, Sander JW, Duncan JS. The long-term outcome of adult epilepsy surgery, patterns of seizure remission, and relapse: a cohort study. Lancet. 2011; 378:1388–95.
15. Enatsu R, Mikuni N. Invasive evaluations for epilepsy surgery: a review of the literature. Neurol Med Chir (Tokyo). 2016; 56(5):221–7.
16. Dorfer C, Minchev G, Czech T, Stefanits H, Feucht M, Pataraia E, et al. A novel miniature robotic device for frameless implantation of depth electrodes in refractory epilepsy. J Neurosurg. 2017; 126:1622–8.
17. Nowell M, Rodionov R, Diehl B, Wehner T, Zombori G, Kinghorn J, et al. A novel method for implementation of frameless stereo-EEG in epilepsy surgery. Neurosurgery. 2014; 10(4):525–34.
18. Frison L, Pocock SJ. Repeated measures in clinical trials: analysis using mean summary statistics and its implications for design. Stat Med. 1992; 11(13):1685–704.
19. Cundell B, Alexander NDE. Sample size calculations for skewed distributions. BMC Med Res Methodol. 2015; 15:28.
Acknowledgments
The authors gratefully acknowledge Prof. John Duncan, Professor of Clinical Neurology at UCL and Clinical Director of the National Hospital for Neurology and Neurosurgery, who provided permission to use and insight into the SEEG electrode placement example. We thank the reviewers and Associate Editor for comments and suggestions that have improved the paper.
Funding
This work was supported by the National Institute for Health Research University College London Hospitals Biomedical Research Centre. AGO was funded partly by the Wellcome Trust, grant number WT106882.
Availability of data and materials
The R code used for the simulation study is available in ‘Additional file 1’.
Author information
Contributions
AGO performed the statistical analyses and drafted the manuscript. GA and JAB provided overall guidance, including revisions and critical comments. All authors read and approved the final version of the manuscript.
Corresponding author
Correspondence to Aidan G. O’Keeffe.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file
Additional file 1
Properties of the log-normal distribution, derivation of the variance of the log-exponential distribution and simulation study R code. Derivation of the relationship between the median of a log-normal random variable and the mean of its log-transform. Derivation of the form of the variance of the log-transformed outcome and that for the log-exponential distribution. R code for the simulation studies presented in this paper. (PDF 164 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
O’Keeffe, A.G., Ambler, G. & Barber, J.A. Sample size calculations based on a difference in medians for positively skewed outcomes in health care studies. BMC Med Res Methodol 17, 157 (2017). doi:10.1186/s12874-017-0426-1
Keywords
 Hypothesis test
 Log-transformation
 Median
 Sample size
 Skewness