Performance metrics for models designed to predict treatment effect

Table 1 An illustration of the calculation of the proposed metrics based on matching patients to assess models predicting treatment effect

	Patient assigned to treatment				Patient assigned to control treatment				Matched pair
Matched patient pair (A)	\(\boldsymbol{p}_{0}\) (B)	\(\boldsymbol{p}_{1}\) (C)	Predicted treatment effect (D = B-C)	Observed outcome (E)	\(\boldsymbol{p}_{0}\) (F)	\(\boldsymbol{p}_{1}\) (G)	Predicted treatment effect (H = F-G)	Observed outcome (I)	\(\boldsymbol{p}_{0}\) (J = F)	\(\boldsymbol{p}_{1}\) (K = C)	Predicted pairwise treatment effect (L = J-K)	Observed pairwise treatment effect (M = E-I)	LOESS curve (N)
1	0.136	0.283	-0.147	1	0.162	0.307	-0.145	1	0.162	0.283	-0.121	0	-0.412
2	0.246	0.343	-0.097	0	0.218	0.319	-0.101	1	0.218	0.343	-0.125	-1	-0.589
3	0.156	0.219	-0.063	1	0.142	0.203	-0.061	0	0.142	0.219	-0.077	1	0.901
4	0.081	0.083	-0.002	0	0.098	0.062	0.036	0	0.098	0.083	0.015	0	-0.081
5	0.345	0.212	0.133	1	0.299	0.171	0.128	0	0.299	0.212	0.087	1	0.937
6	0.421	0.390	0.031	1	0.561	0.255	0.306	1	0.561	0.390	0.171	0	0.190
7	0.364	0.201	0.163	1	0.243	0.164	0.079	1	0.243	0.201	0.042	0	0.217
8	0.264	0.199	0.065	1	0.345	0.278	0.067	0	0.345	0.199	0.146	1	0.707

The calibration metrics are calculated as follows: calibration-in-the-large = abs(mean(M) - mean(N)) ≈ 0.016, E_avg-for-benefit = mean(abs(L - N)) ≈ 0.429, E₅₀-for-benefit = median(abs(L - N)) ≈ 0.378, and E₉₀-for-benefit = quantile(abs(L - N), 0.9) ≈ 0.888. The overall performance metrics are calculated as Cross-entropy-for-benefit \(=-\frac{1}{n_{p}}\sum_{i=1}^{n_{p}}\left[I\left(M_{i}=1\right)\log\left[\left(1-K_{i}\right)J_{i}\right]+I\left(M_{i}=0\right)\log\left[\left(1-K_{i}\right)\left(1-J_{i}\right)+K_{i}J_{i}\right]+I\left(M_{i}=-1\right)\log\left[K_{i}\left(1-J_{i}\right)\right]\right]\approx 1.001\) and Brier-for-benefit \(=\frac{1}{2n_{p}}\sum_{i=1}^{n_{p}}\left[{\left[\left(1-K_{i}\right)J_{i}-I\left(M_{i}=1\right)\right]}^{2}+{\left[\left(1-K_{i}\right)\left(1-J_{i}\right)+K_{i}J_{i}-I\left(M_{i}=0\right)\right]}^{2}+{\left[K_{i}\left(1-J_{i}\right)-I\left(M_{i}=-1\right)\right]}^{2}\right]\approx 0.308\), where \(n_{p}\) is the number of patient pairs, the subscript \(i\) indexes the matched pairs (rows of the table), and \(I(\cdot)\) is the indicator function. Abbreviations: p₀ = P(Y = 1 | W = 0); p₁ = P(Y = 1 | W = 1). The LOESS curve is created by predict(stats::loess(M ~ L)).
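The footnote's arithmetic can be retraced with a short script. The following is a minimal sketch in Python (standard library only), not the authors' code: the variable and function names are illustrative assumptions, the LOESS values are copied from column N rather than refitted, the quantile helper assumes the linear-interpolation rule that is the default of R's quantile(), and log is taken to be the natural logarithm.

```python
import math
from statistics import mean, median

# Columns J, K, M and N of Table 1, one entry per matched pair.
p0_control = [0.162, 0.218, 0.142, 0.098, 0.299, 0.561, 0.243, 0.345]  # J
p1_treated = [0.283, 0.343, 0.219, 0.083, 0.212, 0.390, 0.201, 0.199]  # K
observed = [0, -1, 1, 0, 1, 0, 0, 1]                                    # M
loess = [-0.412, -0.589, 0.901, -0.081, 0.937, 0.190, 0.217, 0.707]     # N (copied, not refitted)

# Predicted pairwise treatment effect, L = J - K.
predicted = [j - k for j, k in zip(p0_control, p1_treated)]


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    s = sorted(values)
    h = (len(s) - 1) * q
    lo = math.floor(h)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (h - lo) * (s[hi] - s[lo])


def outcome_probs(j, k):
    """Predicted probabilities that M equals 1, 0 or -1 for one matched pair."""
    return {
        1: (1 - k) * j,
        0: (1 - k) * (1 - j) + k * j,
        -1: k * (1 - j),
    }


# Calibration metrics.
abs_err = [abs(l - n) for l, n in zip(predicted, loess)]
citl = abs(mean(observed) - mean(loess))
e_avg = mean(abs_err)
e_50 = median(abs_err)
e_90 = quantile(abs_err, 0.9)

# Overall performance metrics, averaged over the n_p matched pairs.
n_p = len(observed)
pairs = list(zip(p0_control, p1_treated, observed))

# Natural logarithm of the probability assigned to the observed category.
cross_entropy = -sum(math.log(outcome_probs(j, k)[m]) for j, k, m in pairs) / n_p

# Squared distance between predicted probabilities and the one-hot observation.
brier = sum(
    (p - (1.0 if m == category else 0.0)) ** 2
    for j, k, m in pairs
    for category, p in outcome_probs(j, k).items()
) / (2 * n_p)

for name, value in [
    ("calibration-in-the-large", citl),
    ("E_avg-for-benefit", e_avg),
    ("E_50-for-benefit", e_50),
    ("E_90-for-benefit", e_90),
    ("cross-entropy-for-benefit", cross_entropy),
    ("Brier-for-benefit", brier),
]:
    print(f"{name}: {value:.4f}")
```

Run on the eight pairs above, the printed values agree with those reported in the footnote up to rounding in the third decimal. Refitting the LOESS curve in another environment would not necessarily reproduce column N exactly, which is why those values are taken from the table.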

ISSN: 1471-2288