Abstract
This is the first paper to approach the problem of bias in the output of a stochastic simulation due to using input distributions whose parameters were estimated from real-world data. We consider, in particular, the bias in simulation-based estimators of the expected value (long-run average) of the real-world system performance. Because the relationship between the input distribution parameters and the output response is (typically) nonlinear, this bias is present even if one employs unbiased estimators of those parameters. To date this bias has been assumed to be negligible because it decreases rapidly as the quantity of real-world input data increases. While true asymptotically, this property does not imply that the bias is actually small when, as is always the case, data are finite. We present a delta-method approach to bias estimation that evaluates the nonlinearity of the expected-value performance surface as a function of the input-model parameters. Since this response surface is unknown, we propose an innovative experimental design to fit a response-surface model that facilitates a test for detecting a bias of a relevant size with specified power. We evaluate the method using controlled experiments, and demonstrate it through a realistic case study concerning a healthcare call centre.
Introduction
In stochastic simulation, the “stochastic” element comes from the input models that drive the simulation. In this paper we focus on parametric input models, that is, probability distributions or stochastic processes whose parameters are estimated from observations of the real-world system of interest. Since we can only ever collect a finite number of observations, error in what the simulation says about the real-world system performance is inevitable. In this paper “response” means the expected value of a simulated output performance measure. Error caused by input modelling can be broken down as $\mathrm{MSE} = \mathrm{Variance} + \mathrm{Bias}^2$; that is, the mean squared error (MSE) due to input modelling is made up of the variability of the simulation response caused by input modelling, known in the literature as input uncertainty variance (IU variance), and the squared bias due to input modelling. Barton (2012) explains that, even in very reasonable simulation scenarios, analysis of the response of interest can be very different when error due to input modelling is included. Barton (2012) was referring to the IU variance, but the same idea holds for the bias due to input modelling. In simulation studies where a large number of replications are completed, effectively driving out the inherent simulation noise caused by random-variate generation, ignoring input modelling uncertainty can lead to overconfidence in the simulation results. Underestimating the error of the simulation response is dangerous, especially when this output may be used to guide important decisions about a real-world system.
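To make the decomposition of input-modelling error into variance and squared bias concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the method developed in this paper: the response function `response`, the threshold `c`, the true mean `theta0` and the sample size `m` are all hypothetical, and the response is taken to be known in closed form, which mimics the situation where simulation noise has been driven out by many replications. Exponential service times with mean `theta0` are observed `m` times, the mean is estimated by the (unbiased) sample mean, and the estimate is plugged into a nonlinear response. The sketch also reports a generic second-order Taylor (delta-method style) approximation of the bias for comparison.

```python
import math
import random
import statistics


def response(theta, c=3.0):
    """Hypothetical response: P(service time > c) for exponential
    service times with mean theta. Nonlinear in theta."""
    return math.exp(-c / theta)


def response_second_derivative(theta, c=3.0):
    """Analytic second derivative of response() with respect to theta."""
    return math.exp(-c / theta) * (c**2 / theta**4 - 2.0 * c / theta**3)


def input_modelling_error(theta0=1.0, m=10, n_datasets=20000, seed=1):
    """Monte Carlo over hypothetical real-world datasets of size m.

    Returns the bias, variance and MSE of the plug-in response, plus a
    second-order Taylor approximation of the bias.
    """
    rng = random.Random(seed)
    true_value = response(theta0)

    plug_in = []
    for _ in range(n_datasets):
        # One "real-world" dataset of m exponential observations.
        data = [rng.expovariate(1.0 / theta0) for _ in range(m)]
        theta_hat = statistics.fmean(data)  # unbiased estimator of theta0
        plug_in.append(response(theta_hat))

    mean_response = statistics.fmean(plug_in)
    bias = mean_response - true_value
    variance = statistics.pvariance(plug_in, mu=mean_response)
    mse = statistics.fmean([(r - true_value) ** 2 for r in plug_in])

    # Second-order Taylor approximation: 0.5 * eta''(theta0) * Var(theta_hat),
    # where Var(theta_hat) = theta0**2 / m for the exponential sample mean.
    delta_bias = 0.5 * response_second_derivative(theta0) * theta0**2 / m

    return {"bias": bias, "variance": variance, "mse": mse,
            "delta_bias": delta_bias}


if __name__ == "__main__":
    for name, value in input_modelling_error().items():
        print(f"{name}: {value:.5f}")
```

In this toy setting the estimated bias is nonzero even though the sample mean is unbiased for `theta0`, and the empirical MSE equals the empirical variance plus the squared bias up to floating-point error. Increasing `m` shrinks both terms, but for small `m` the bias need not be negligible relative to the response itself.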