Abstract
The goal of this paper is to promote the use of Non-Parametric Regression (NPR) for hypothesis testing in hospitality and tourism research. In contrast to linear regression models, NPR frees researchers from the need to impose an a priori specification of the functional form, which allows more flexibility and makes the analysis less vulnerable to misspecification problems. In particular, we discuss a Bayesian approach to NPR using a Gaussian Process Prior (GPP). We illustrate the advantages of this method with an application to internationalization and hotel performance. Specifically, we show how, in contrast to linear regression, NPR decreases the risk of making incorrect hypothesis statements by revealing the true and full relationship between the variables of interest.
Introduction
Despite the growing popularity of non-parametric regression (NPR), its use in the tourism and hospitality literature remains very limited. In this note we highlight the advantages of NPR and illustrate how it can provide a more accurate picture of the true relationship between a set of variables. We show through an example that hospitality researchers may miss important input for hypothesis testing when they estimate only the traditional linear regression model. NPR, like linear regression, estimates mean outcomes for a given set of covariates. Unlike linear regression, however, NPR does not impose a functional form on the regression model a priori and is therefore not subject to the misspecification error that arises from a potentially wrong functional form (Müller, 2012; Mammen et al., 2012).
The linear model (y = β₀ + βx + u) is generally assumed for convenience, not because we truly believe that the relationship is linear in reality. Researchers in the field often model nonlinearities using extensions of the linear model, for example y = β₀ + β₁x + β₂x² + u. This model, however, accommodates only a limited type of nonlinearity, a U or inverted-U shape, and cannot capture more complicated patterns in the data. When more than one regressor is available, nonlinearities are often modeled using interactions: y = β₀ + β₁x + β₂z + β₃xz + u. The interpretation is that the effect of x on y depends on z: ∂E(y)/∂x = β₁ + β₃z. This is, of course, a departure from the simple linear model, whose main assumption is that the effect of x on y is constant across all values of x and of the other explanatory variables. Even so, the interaction model restricts the effect of x on y to vary linearly with z, an assumption that may or may not hold in practice.
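The misspecification argument can be made concrete with a small simulation. The sketch below is illustrative only and is not the estimation procedure used in this paper: the data are synthetic, the sine-shaped true mean is an assumption chosen because neither a line nor a parabola can reproduce it, and the Gaussian process fit uses a squared-exponential kernel with hand-picked hyperparameters (`length_scale`, `variance`, `noise_sd`) rather than a full Bayesian treatment. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption for illustration): the true mean is a sine wave.
n = 200
x = np.sort(rng.uniform(-3.0, 3.0, n))
f_true = np.sin(2.0 * x)
y = f_true + rng.normal(0.0, 0.3, n)


def poly_fit(x, y, degree):
    """Least-squares fit of y on 1, x, ..., x^degree; returns fitted values."""
    X = np.vander(x, degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta


def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential covariance between two sets of scalar inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior_mean(x, y, noise_sd=0.3):
    """Posterior mean of a zero-mean GP at the training inputs.

    Hyperparameters are fixed by hand here (an assumption); in practice
    they would be estimated or given priors.
    """
    K = rbf_kernel(x, x)
    alpha = np.linalg.solve(K + noise_sd**2 * np.eye(len(x)), y)
    return K @ alpha


def rmse(fitted):
    """Root mean squared error against the known true mean function."""
    return float(np.sqrt(np.mean((fitted - f_true) ** 2)))


rmse_linear = rmse(poly_fit(x, y, 1))      # y = b0 + b1*x + u
rmse_quadratic = rmse(poly_fit(x, y, 2))   # y = b0 + b1*x + b2*x^2 + u
rmse_gp = rmse(gp_posterior_mean(x, y))    # nonparametric GP fit

print(f"linear:    {rmse_linear:.3f}")
print(f"quadratic: {rmse_quadratic:.3f}")
print(f"GP:        {rmse_gp:.3f}")
```

Under these assumptions the linear and quadratic fits leave most of the oscillation unexplained, while the GP posterior mean tracks the true curve closely. The gap depends on the chosen true function; when the relationship really is linear, the parametric model would be expected to do at least as well.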