UNIVERSITAT ROVIRA I VIRGILI
DEPARTAMENT D’ECONOMIA

WORKING PAPERS
Col·lecció “DOCUMENTS DE TREBALL DEL DEPARTAMENT D’ECONOMIA - CREIP”

Estimating individual effects and their spatial spillovers in linear panel data models: Public capital spillovers after all?

Karen Miranda
Óscar Martínez-Ibáñez
Miguel Manjón-Antolín

Working paper no. 26-2015

DEPARTAMENT D’ECONOMIA – CREIP
Facultat d’Economia i Empresa

Published by:
Departament d’Economia, www.fcee.urv.es/departaments/economia/public_html/index.html
Universitat Rovira i Virgili, Facultat d’Economia i Empresa, Avgda. de la Universitat, 1, 43204 Reus. Tel.: +34 977 759 811. Email: sde@urv.cat

CREIP, www.urv.cat/creip
Universitat Rovira i Virgili, Departament d’Economia, Avgda. de la Universitat, 1, 43204 Reus. Tel.: +34 977 758 936. Email: creip@urv.cat

Address comments to the Departament d’Economia / CREIP.

Legal deposit: T – 1414 - 2015
ISSN (print edition): 1576 - 3382
ISSN (electronic edition): 1988 - 0820

Estimating individual effects and their spatial spillovers in linear panel data models: Public capital spillovers after all?∗

K. Miranda, O. Martínez-Ibáñez, and M. Manjón-Antolín
QURE-CREIP Department of Economics, Rovira i Virgili University

Abstract

Individual-specific effects and their spatial spillovers are not generally identified in linear panel data models. In this paper we present identification conditions under the assumption that covariates are correlated with the individual-specific effects and derive appropriate GLS and IV estimators for the resulting correlated random effects spatial panel data model. We also illustrate the proposed estimators using a Cobb-Douglas production function specification and US state-level data from Munnell (1990). As in previous studies, we find no evidence of public capital spillovers.
However, public capital does play a role in the positive “outwards” spatial contagion of the individual effects.

Keywords: correlated random effects, spatial spillovers, panel data
JEL Classification: C23, R11

∗ We thank P. Elhorst, A. Pérez-Laborda, Jesus Mur and participants at the ERSA Summer School 2016, 18th INFER annual conference, IAAE 2016 annual conference, 4th WIPE workshop, 40th Simposio de Análisis Económico, and 8th Jean Paelinck seminar on Spatial Econometrics for helpful comments. This research was funded by grants ECO2014-55553-P (Ministerio de Economía y Competitividad) and 2014FI B00301 (Agaur, Generalitat de Catalunya). Usual caveats apply.

1 Introduction

Does public capital have an effect on private output? And if it does, does this effect spill over to nearby geographical areas? Using US state (and/or county) panel data, production function estimates have consistently concluded that public capital and its spatially weighted counterpart are not statistically significant.[1] In contrast, studies using alternative methodologies (e.g., VAR models) seem to suggest otherwise (Pereira and Andraz, 2013).

In this paper we provide production function estimates supporting the existence of public capital spillovers. To be precise, we find no evidence of a direct positive effect of public capital on private output. However, we find evidence of a relation between public capital and the unobserved productivity of the states (i.e., the individual-specific effect of the production function) and its spatial spillover.[2]

To obtain these results, this paper introduces a correlated random effects model (Mundlak, 1978; Chamberlain, 1982) that allows for spatial correlation in the individual effects. To our knowledge, only the random effects model of Kapoor et al. (2007) accounts for this spatial correlation.
In the fixed effects case, Beer and Riedl (2012) advocate using an extension of the Spatial Durbin Model for panel data that includes the spatially weighted individual effects. Ultimately, however, they argue that “it is (...) advisable to remove the spatial lag of the fixed effects from the equation as the inclusion of both, [the individual effects] and [their spatial lags], leads to perfect multicollinearity” (p. 302). Removing the spatial lag of the fixed effects does not generally preclude the consistent estimation of the parameters of the model (see e.g. Halleck Vega and Elhorst, 2015). However, this practice rules out obtaining an estimate of the individual-specific effects (net of the spatially weighted effects).[3]

[1] See e.g. Munnell (1990); Baltagi and Pinnoi (1995); Holtz-Eakin and Schwartz (1995); Garcia-Mila et al. (1996); and Kelejian and Robinson (1997).

[2] As Boarnet (1998, p. 381-382) points out, “[p]ublic capital is provided at a particular place, and if such capital is productive, it enhances the comparative advantage of that location relative to other places”. Also, “productive public capital might shift economic activity from one location to another”.

[3] This is a critical issue, for example, in two-step models that use this estimate as the dependent variable (Combes and Gobillon, 2015). Similarly, obtaining an estimate of the spatial spillovers of the individual-specific effects may be of great interest (e.g., for assessing their geographical distribution, which is what we do in our empirical application).

This raises the question of whether both individual effects and their spatial spillovers can indeed be identified in linear panel data models. In this paper we provide identifying conditions in a model specification that spatially weights both the independent variables and the individual effects.
In particular, we show that there is no identification problem if the covariates are correlated with the individual-specific effects and the individual effects correspond to deviations from the constant term.

Having proved that the model is identified, we then consider the estimation of its parameters under alternative exogeneity assumptions on the explanatory variables. Under the assumption that all the explanatory variables are strictly exogenous (with respect to the idiosyncratic term), we derive a Feasible Generalised Least Squares (FGLS) estimator. We also prove that, regardless of the structure of the variance-covariance matrix of the shocks to the correlation functions, this estimator coincides with the within (fixed effects) estimator when all the explanatory variables are used to construct the correlation functions. Under the assumption that the explanatory variables are predetermined, we propose an Instrumental Variables (IV) estimator to address the endogeneity of the means of the predetermined explanatory variables used to approximate the correlation functions. We also advocate using the backward means of these variables (i.e., the means taken, for each period, over only current and past values) as instruments.

Lastly, we use these estimators and a (correlated random effects) production function specification to address the existence of public capital spillovers. Using the data and (a spatially weighted variant of) the specification used by Munnell (1990), we find that, under strict exogeneity, our FGLS estimates of a Cobb-Douglas production function for the US states over the period 1970 to 1986 are largely consistent with those reported in related studies (both those using this data set, e.g. Baltagi and Pinnoi 1995; Kelejian and Robinson 1997; and those using analogous data sets, e.g. Holtz-Eakin and Schwartz 1995; Garcia-Mila et al.
1996).[4] However, when we explore the possibility that (some of) the explanatory variables are not exogenous, we find evidence of predeterminedness in the public capital. We then estimate the model by IV to find that, under a sequential exogeneity assumption, states with a larger/smaller estimated individual effect tend to have larger/smaller negative spatial spillovers. In particular, we consider both “spill-in” and “spill-out” effects (LeSage and Chih, 2016), although only the spill-out effects turn out to be statistically significant. Also, while the part of the individual effects associated with the private capital produces negative spatial contagion, the part associated with the public capital produces positive spatial contagion. Consistent with previous literature, however, we find no significant spatial spillovers in the public capital.

[4] The data set we employ is publicly available and can be downloaded, for example, from the Ecdat package in R (a standardised binary contiguity spatial weights matrix of the US states is also included in the package).

The rest of the paper is organised as follows. In Section 2 we discuss the identification problem and show that, under mild rank conditions, the correlated random effects model considered is identified. In Section 3 we present appropriate (FGLS and IV) estimators. In Section 4 we present empirical evidence based on the work of Munnell (1990). Section 5 concludes.

2 Specification and identification of the model

2.1 Spatial spillovers and the identification problem

Let us consider the spatial-$(X, \Psi)$ panel data model, that is, the spatial (lag of) $X$ model for panel data with spatially weighted fixed effects:

\[ y = X\beta + WX\gamma + \Psi\mu + W\Psi\alpha + \varepsilon, \tag{2.1} \]

where $y = (y_{11}, \dots, y_{1T}, \dots, y_{N1}, \dots, y_{NT})'$ is the dependent variable (as usual, $i = 1, \dots, N$ denotes cross-sectional, geographical units and $t = 1, \dots, T$ denotes the time dimension), $X = (x_{11}', \dots, x_{1T}', x_{21}', \dots, x_{2T}', \dots, x_{N1}', \dots, x_{NT}')'$ is the $NT \times K$ matrix of explanatory variables (i.e., $x_{it}$ is a row vector of order $K$), and $\varepsilon$ is a zero-mean idiosyncratic error term with assumed variance-covariance matrix $\sigma_\varepsilon^2 I_{NT}$, with $I_{NT}$ being the $NT \times NT$ identity matrix.[5] We assume that neighbourhood relations do not change over time, so the spatial matrix is $W = w \otimes I_T$, with $I_T$ denoting the $T \times T$ identity matrix and $w = [w_{ij}]$ being the $N \times N$ spatial weight matrix that describes the spatial arrangement of the units in the sample. Also, unobservable individual-specific effects are collected in $\Psi\mu$, with $\Psi = I_N \otimes \iota_T$, $I_N$ being the $N \times N$ identity matrix and $\iota_T$ a vector of ones of order $T$.

Notice that, in contrast to the so-called SLX model (Halleck Vega and Elhorst, 2015), this model specification accounts for the spatial weights of the individual effects through the term $W\Psi\alpha$. Thus, the parameters of the model are $\beta$, $\gamma$, $\mu$, $\alpha$ and $\sigma_\varepsilon^2$. This means that $2(K + N) + 1$ parameters need to be estimated.

The main motivation behind the use of this model specification is the estimation of the individual effects and their spatial spillovers, since these often have a meaningful interpretation (Combes and Gobillon, 2015). In particular, following LeSage and Pace (2009), we define the spatial spillovers of the individual-specific effects in terms of the partial derivative of the (conditional expectation of the) dependent variable,

\[ \frac{\partial E[y \mid X, \Psi]}{\partial \Psi_j} = (I_N \mu_j + w\alpha_j) \otimes I_T, \tag{2.2} \]

where $\Psi_j$ is the $j$-th column of $\Psi$ and $j = 1, \dots, N$. The off-diagonal elements of this matrix of partial derivatives represent the spillovers or indirect effects of unit $j$, whereas the diagonal elements of the matrix represent the direct effects of unit $j$. Notice, however, that since these effects are time-invariant, we can, without loss of generality, concentrate on the term in brackets, $I_N \mu_j + w\alpha_j$.
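The term in brackets in (2.2) can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical three-unit, row-standardised contiguity matrix and made-up values for $\mu_j$ and $\alpha_j$; the function name `effects_matrix` is ours and none of the numbers come from the paper.

```python
import numpy as np

def effects_matrix(w, mu_j, alpha_j):
    """Return I_N * mu_j + w * alpha_j, the time-invariant term in (2.2)."""
    n = w.shape[0]
    return mu_j * np.eye(n) + alpha_j * w

# Hypothetical example: three units on a line (1-2-3), row-standardised weights.
w = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])

# Illustrative (made-up) values of mu_j and alpha_j.
E = effects_matrix(w, mu_j=2.0, alpha_j=0.4)

direct = np.diag(E)               # diagonal: mu_j + w_ii * alpha_j
indirect = E - np.diag(direct)    # off-diagonal: w_il * alpha_j
```

With zeros on the diagonal of `w`, every direct effect equals `mu_j`, while the off-diagonal entries scale the weights by `alpha_j`.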
Thus, a generic off-diagonal element of this matrix, $w_{il}\alpha_j$ with $i \neq l$, measures the effect of unit $l$ having the unobservable characteristics of unit $j$ on the dependent variable of unit $i$. Similarly, a generic element of the diagonal of this matrix, $\mu_j + w_{ii}\alpha_j$, measures the effect of unit $i$ having the unobservable characteristics of unit $j$ on the dependent variable of unit $i$. This direct effect reduces to $\mu_j$ for all $i$ when $w_{ii} = 0$, the standard case.[6]

[5] Throughout the paper, we assume that a balanced (complete) panel data set is available. However, results can easily be extended to incomplete panels.

The vector of parameters $\mu$ thus has a neat interpretation. However, the role the spatial weight matrix $w$ and the vector of parameters $\alpha$ play in the spillover effects deserves some further attention. On the one hand, the structure of $w$ provides the definition of neighbourhood. That is, it determines which units are affected by the spillover (the “spill-out” effects), which units are affecting (the “spill-in” effects) and, in distance-based matrices, by how much each unit is affected or affecting. Thus, different spatial weight matrices yield different spatial spillovers. On the other hand, the parameter $\alpha_j$ provides a measure of the spatial contagion of the individual effect of unit $j$ irrespective of the number of neighbours it has and how close/distant they are. That is, $\alpha_j$ provides a measure of the “potentiality of the spatial contagion” associated with the individual effect of unit $j$. We thus refer to $\alpha$ as the “potential” of the spatial spillovers of the individual effects.

We illustrate the calculation and interpretation of these direct and spillover effects in the empirical application of Section 4. In any case, what is clear is that the estimation of these effects requires that of the parameter vectors $\mu$ and $\alpha$ (since $w$ is assumed to be a fixed and/or known matrix). It is therefore critical to determine whether these (and the rest of the) parameters of the model are identified.
Proposition 1 shows that, in general, this is not the case.

Proposition 1. The spatial-$(X, \Psi)$ model for panel data with spatially weighted fixed effects is not identified for any spatial weight matrix $w$.

Proof. See appendix.

Notice that, as Beer and Riedl (2012) argue, the omission of $W\Psi\alpha$ does not preclude the consistent estimation of the parameters of the model. Thus, if the spatial spillovers of the individual-specific effects are of no interest for the application in hand, their suggestion to remove one of the components (i.e., either $\Psi$ or $W\Psi$) is perfectly sensible. This is because the model

\[ y = X\beta + WX\gamma + \Psi\mu^{*} + \varepsilon, \tag{2.3} \]

with $\Psi\mu^{*} = \Psi\mu + W\Psi\alpha$, is observationally equivalent to (2.1).

[6] However, LeSage and Pace (2009) do not recommend reporting unit-level effects but scalar summary measures. Namely, the average of the main diagonal elements (direct effects) and the cumulative sum of the off-diagonal elements from each row, averaged over all rows (indirect effects). In particular, if the spatial weight matrix $w$ is row-standardised and has zeros in the diagonal, these scalar summary measures correspond to $\mu$ and $\alpha$, respectively. We discuss the use of alternative unit-level indirect effects (analogous to the ones proposed by LeSage and Chih 2016) in Section 4.

On the other hand, if the individual-specific effects and/or their spatial spillovers are of some interest, then the identification and estimation of the model need to be discussed. To this end, we go on to propose using a spatial-$(X, \Psi)$ panel data model with correlated random effects. In particular, we show that, under mild assumptions, the model is identified. Later we present appropriate estimators under alternative exogeneity assumptions.

2.2 The Correlated Random Effects Spatial-$(X, \Psi)$ Panel Data Model

Fixed effects models implicitly assume that the individual effects are correlated with the covariates. But they somehow ignore this correlation in the estimation procedure.
In fact, what the within and analogous transformations do (see e.g. Beer and Riedl, 2012) is to wipe out the individual effects so that this correlation is no longer a concern for the consistent estimation of the model. An alternative procedure for obtaining consistent estimates, however, is to incorporate this correlation into the model (Mundlak, 1978; Chamberlain, 1982). This is the approach followed here. In particular, we make use of the correlation between covariates and the (spatially weighted) individual effects to identify the spatial contagion in the individual effects.

Our modelling approach is related to that of Debarsy (2012), who uses a correlated random effects specification to construct an LR test on “the relevance of the random effects approach” (p. 112).[7] Notice, however, that although we both deal with the correlation between individual effects and covariates, our purposes differ markedly: while he seeks to correctly specify this correlation, we use it as a means to identify the spatial contagion in the individual effects. We also differ in the model specification which, although similar, treats the spatial contagion of the individual effects differently. Debarsy (2012) assumes that the individual effects depend on both the explanatory variables and the explanatory variables in their neighbourhood, but there is no spatial contagion in the individual effects. In contrast, we account for the spatial contagion in the individual effects (i.e., both the individual effects and their spatial spillovers are included in the specification) and assume that the individual effects and the “potential” of their spatial spillovers depend on the (mean of the) explanatory variables, which allows us to identify both the individual effects and their spatial spillovers.

[7] Although this is not the aim of this paper, an analogous Wald test could be developed using our model specification and estimation procedures.
These alternative assumptions yield different error component structures: a one-way error in his case, a two-way error in ours (the additional component being a spatially weighted element).[8]

Thus, we assume the following relation between $\mu$, $\alpha$ and the explanatory variables:

\[ \mu = \frac{1}{T}\Psi' X^{*}\Pi_\mu + \upsilon_\mu, \qquad \alpha = \frac{1}{T}\Psi' X\Pi_\alpha + \upsilon_\alpha, \tag{2.4} \]

where $\Pi_\mu$ and $\Pi_\alpha$ are $(K+1) \times 1$ and $K \times 1$ parameter vectors to be estimated, respectively, and $X^{*} = [\iota_{NT} \;\; X]$ contains the covariates and a vector of ones. Notice that a constant term needs to be included in one of the equations in (2.4) to guarantee identification, since the spatial contagion of any common factor in the individual effects $\mu$ (in particular, a constant term) is not identified. Ultimately, this means that we are implicitly assuming that the individual effects correspond to deviations from the constant term. Also, the error terms $\upsilon_\mu$ and $\upsilon_\alpha$ are assumed to be random vectors of dimension $N$ with $\upsilon_\mu \sim (0, \sigma_\mu^2 I_N)$ and $\upsilon_\alpha \sim (0, \sigma_\alpha^2 I_N)$. However, $\upsilon_\mu$ and $\upsilon_\alpha$ are not assumed to be independent, the covariance parameter, $\sigma_{\mu\alpha}$, being such that $E(\upsilon_\mu \upsilon_\alpha') = \sigma_{\mu\alpha} I_N$, with $E$ denoting the mathematical expectation.

[8] Another important difference with the work of Debarsy (2012) is that whereas he analyses the Spatial Durbin Model (as Beer and Riedl 2012 do), our results are derived for the spatial-$(X, \Psi)$ model. This allows us to address the identification and estimation of the model in a linear setting, whereas considering correlated random effects in a Spatial Durbin specification would result in a non-linear model in which identification and estimation are more involved. (Debarsy (2012, p. 115), for example, “assume(s) that all parameters are identified”; see, however, Lee and Yu 2016).

Plugging the equations in (2.4) into the model (2.1) we obtain

\[ y = X\beta + WX\gamma + \frac{1}{T}\Psi\Psi' X^{*}\Pi_\mu + \frac{1}{T}W\Psi\Psi' X\Pi_\alpha + \eta, \tag{2.5} \]

where $\eta = \Psi\upsilon_\mu + W\Psi\upsilon_\alpha + \varepsilon$. Notice that the resulting error component is similar to the one proposed by Kapoor et al. (2007) in that both error components allow for spatial contagion in the individual (random) effects. It is different in that while we assume that the idiosyncratic term is not spatially correlated (and propose an identification strategy that takes into consideration that the individual effects may have spatial effects), Kapoor et al. (2007) assume that the idiosyncratic term is spatially autocorrelated.

It is also interesting to note that our model specification does not impose the existence of spatial contagion in the individual effects. In fact, there is no contagion at all if both $\Pi_\alpha$ and $\sigma_\alpha$ are zero (while there would still be some “random contagion” if $\Pi_\alpha$ is zero but $\sigma_\alpha$ is not). Similarly, the model does not impose correlation between the individual effects and the covariates. In fact, the specification becomes that of a pure random effects model if both $\Pi_\mu$ and $\Pi_\alpha$ are zero (still with spatial contagion if $\sigma_\alpha$ is not zero). Thus, it is the statistical significance of these (sets of) parameters that ultimately determines the existence of spatial contagion in the individual effects and correlation between the individual effects and the covariates.

Finally, in contrast to the spatial-$(X, \Psi)$ model for panel data in (2.1), Proposition 2 shows that the correlated random effects spatial panel data model in (2.5) is generally identified.

Proposition 2. The correlated random effects spatial panel data model in (2.5) is identified if the matrix

\[ \mathcal{X} = \left[ X \;\;\; WX \;\;\; \tfrac{1}{T}\Psi\Psi' X^{*} \;\;\; \tfrac{1}{T}W\Psi\Psi' X \right] \]

has full column rank.

Proof. See appendix.

Notice that since the number of parameters in the model is $4K + 1$ (excluding the $\sigma$'s), $NT \geq 4K + 1$ is a necessary identification condition. Notice also that since $\Psi\Psi' X^{*}$ is an $NT \times (K + 1)$ matrix and $W\Psi\Psi' X$ is an $NT \times K$ matrix, both with row rank equal to $N$, $N \geq 2K + 1$ is an additional necessary identification condition. Further, $X$, $WX$, $\Psi' X^{*}$ and $W\Psi' X$ must have full rank.
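The rank condition of Proposition 2 is straightforward to check numerically. Below is a minimal sketch, assuming simulated covariates, a hypothetical ring-shaped row-standardised weight matrix and data sorted by unit and then by time; the helper `crem_design` and all dimensions are illustrative choices of ours, not part of the paper.

```python
import numpy as np

def crem_design(X, w, T):
    """Stack [X, WX, (1/T) Psi Psi' X*, (1/T) W Psi Psi' X] as in Proposition 2.

    Rows of X are assumed sorted by unit and then by time within unit.
    """
    N = w.shape[0]
    Psi = np.kron(np.eye(N), np.ones((T, 1)))   # NT x N, Psi = I_N kron iota_T
    W = np.kron(w, np.eye(T))                   # NT x NT, W = w kron I_T
    P = Psi @ Psi.T / T                         # replaces values by unit means
    X_star = np.column_stack([np.ones(N * T), X])
    return np.column_stack([X, W @ X, P @ X_star, W @ P @ X]), W, P

# Hypothetical dimensions and simulated covariates.
rng = np.random.default_rng(0)
N, T, K = 8, 4, 2
w = np.zeros((N, N))
for i in range(N):                              # ring contiguity, row-standardised
    w[i, (i - 1) % N] = w[i, (i + 1) % N] = 0.5
X = rng.normal(size=(N * T, K))

design, W, P = crem_design(X, w, T)
rank = np.linalg.matrix_rank(design)            # should equal 4K + 1

# Adding a constant to BOTH correlation functions breaks the rank condition,
# because W iota = iota when w is row-standardised.
both = np.column_stack([design, W @ P @ np.ones(N * T)])
rank_both = np.linalg.matrix_rank(both)
```

In this simulated example the design has full column rank, whereas the variant with a constant in both correlation functions gains a column but no rank, which is the reason a constant enters only one of the equations in (2.4).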
Lastly, time-invariant regressors must be included in either the vector of explanatory variables ($X$ and/or $WX$) or in the vector of determinants of the individual effects and their spatial spillovers ($\Psi\Psi' X^{*}$ and/or $W\Psi\Psi' X$). Otherwise, there is exact multicollinearity between the explanatory variables.

3 Estimation

We start by noticing that consistent estimation of the parameters of the model does not depend on the structure of the error term $\eta$. In particular, assuming that the covariates are strictly exogenous (meaning here that $E(\varepsilon_{it} \mid x_{j1}, x_{j2}, \dots, x_{jT}) = 0$ for all $i, j$), Ordinary Least Squares (OLS) estimates of (2.5) are consistent.[9] Yet a more efficient Generalized Least Squares (GLS) estimator can be derived by accounting for the error components structure of the model.

To be precise, the GLS estimator does not provide efficiency gains in the $\beta$ and $\gamma$ parameters of the correlated random effects model in (2.5). In fact, as shown below (see also Mundlak 1978), the GLS estimator of these parameters coincides with the OLS estimator in (2.5) and the within or fixed-effects estimator of the (observationally equivalent) model in (2.3). This means that, if these parameters are the only ones of interest, efficiency considerations do not justify the use of the GLS estimator. In contrast, the GLS estimator of (2.5) may provide efficiency gains (with respect to OLS) in the $\Pi_\mu$ and $\Pi_\alpha$ parameters of the correlation functions. This is why, under the above strict exogeneity assumption, we propose using an FGLS estimator based on the estimates of the parameters of the variance-covariance matrix of $\eta$.

On the other hand, none of these estimators is consistent if the strict exogeneity assumption does not hold. In particular, the presence of predetermined variables among the regressors ($X$ and $WX$) makes such variables endogenous when they are included among the variables that compose the correlation functions in (2.4).
To obtain consistent estimates, we propose an IV estimator that uses as instruments the means of the endogenous variables taken, for each period, over only current and past values (backward means).

[9] Notice that the spatial structure of our model requires an orthogonality condition involving not only all the time periods (a standard assumption in applied work; see e.g. Wooldridge 2002) but also all the units. Otherwise, we cannot guarantee the exogeneity of $WX$ and $\frac{1}{T}W\Psi\Psi' X$.

Next we discuss the derivation of the proposed estimators in detail.

3.1 GLS estimation under strict exogeneity

Given our initial assumption of spherical disturbances and the stochastic assumptions about the behaviour of $\upsilon_\mu$ and $\upsilon_\alpha$, the error component $\eta = \Psi\upsilon_\mu + W\Psi\upsilon_\alpha + \varepsilon$ has zero mean and variance-covariance matrix given by

\[
\begin{aligned}
\Omega &= \sigma_\mu^2 \Psi\Psi' + \sigma_\alpha^2 W\Psi\Psi' W' + \sigma_{\mu\alpha}\Psi\Psi' W' + \sigma_{\mu\alpha} W\Psi\Psi' + \sigma_\varepsilon^2 I_{NT} \\
&= \Psi\left(\sigma_\mu^2 I_N + \sigma_\alpha^2 ww' + \sigma_{\mu\alpha} w' + \sigma_{\mu\alpha} w\right)\Psi' + \sigma_\varepsilon^2 I_{NT}.
\end{aligned}
\tag{3.1}
\]

Knowledge of this matrix suffices to derive the GLS estimator (see e.g. Wooldridge 2002) of the parameters of the correlated random effects model in (2.5). In particular, Mundlak (1978) proves that, when all the explanatory variables are used to construct the correlation functions and $\eta = \Psi\upsilon_\mu + \varepsilon$, this coincides with the within (fixed effects) estimator of $\beta$ and $\gamma$ in (2.3). Next we generalise this result to any (non-spherical) variance-covariance matrix of $\upsilon_\mu$.

Proposition 3. Consider the following correlated random effects model:

\[ y = \widetilde{X}\lambda + \frac{1}{T}\Psi\Psi'\widetilde{X}^{*}\Pi + \eta \]

with $\widetilde{X}^{*}$ being the matrix $\widetilde{X}$ plus a column of ones, $\lambda$ and $\Pi$ vectors of parameters with the appropriate dimension, and $E(\eta\eta') = \Omega = \Psi\Sigma_\upsilon\Psi' + \sigma_\varepsilon^2 I_{NT}$, where $\Sigma_\upsilon$ is any variance-covariance matrix. The GLS, OLS and within (fixed effects) estimators of $\lambda$ are the same.

Proof. See appendix.

Notice that our model corresponds to $\widetilde{X} = [X \;\; WX]$, $\frac{1}{T}\Psi\Psi'\widetilde{X} = \left[\frac{1}{T}\Psi\Psi' X \;\; \frac{1}{T}W\Psi\Psi' X\right]$, $\lambda = (\beta', \gamma')'$ and $\Sigma_\upsilon = \sigma_\mu^2 I_N + \sigma_\alpha^2 ww' + \sigma_{\mu\alpha} w' + \sigma_{\mu\alpha} w$. Thus, Proposition 3 fully applies.
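The equivalence stated in Proposition 3 can be verified on simulated data. The sketch below is only an illustration under assumptions of ours: the dimensions, the ring-shaped weight matrix, the arbitrary positive definite `Sigma` and the value 1.5 for the idiosyncratic variance are all made up, and the dependent variable is pure noise because the result is an algebraic identity.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 8, 4, 2

# Hypothetical ring contiguity matrix, row-standardised.
w = np.zeros((N, N))
for i in range(N):
    w[i, (i - 1) % N] = w[i, (i + 1) % N] = 0.5

Psi = np.kron(np.eye(N), np.ones((T, 1)))        # Psi = I_N kron iota_T
W = np.kron(w, np.eye(T))                        # W = w kron I_T
P = Psi @ Psi.T / T                              # unit-mean operator
Q = np.eye(N * T) - P                            # within operator

X = rng.normal(size=(N * T, K))
Xt = np.column_stack([X, W @ X])                 # regressors [X, WX]
G = np.column_stack([np.ones(N * T), P @ Xt])    # (1/T) Psi Psi' [iota, X, WX]
Z = np.column_stack([Xt, G])
y = rng.normal(size=N * T)                       # any y works for the identity

# OLS on the augmented model; keep the coefficients on [X, WX].
ols = np.linalg.lstsq(Z, y, rcond=None)[0][:2 * K]

# Within (fixed effects) estimator.
within = np.linalg.lstsq(Q @ Xt, Q @ y, rcond=None)[0]

# GLS with an arbitrary positive definite Sigma_upsilon.
A = rng.normal(size=(N, N))
Sigma = A @ A.T + np.eye(N)
Omega = Psi @ Sigma @ Psi.T + 1.5 * np.eye(N * T)
Oi = np.linalg.inv(Omega)
gls = np.linalg.solve(Z.T @ Oi @ Z, Z.T @ Oi @ y)[:2 * K]
```

The three coefficient vectors agree up to numerical precision, whatever `Sigma` is drawn.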
More generally, the previous proof shows that, regardless of the structure of the variance-covariance matrix of $\upsilon_\mu$ and $\upsilon_\alpha$, the GLS (and OLS) estimator of $\beta$ and $\gamma$ coincides with the within (fixed effects) estimator.

Next we consider the derivation of the feasible version of this GLS estimator. This basically requires a consistent estimate of the vector of parameters $\sigma = (\sigma_\mu^2, \sigma_\alpha^2, \sigma_{\mu\alpha}, \sigma_\varepsilon^2)$. To this end, we notice that each component of $\Omega$ can be written as a linear function of $\sigma$:

\[ E[\eta_{it}\eta_{ls}] = \sigma M_{ilts}, \tag{3.2} \]

where $i, l = 1, \dots, N$ and $t, s = 1, \dots, T$. Also, $E[\eta_{it}\eta_{ls}]$ denotes the mathematical expectation of $\eta_{it}\eta_{ls}$ and $M_{ilts}$ is a $4 \times 1$ vector whose rows are functions of $w$. More specifically,

\[ E[\eta_{it}^2] = \sigma_\mu^2 + \sigma_\alpha^2 \sum_{j=1}^{N} w_{ij}^2 + 2\sigma_{\mu\alpha} w_{ii} + \sigma_\varepsilon^2 \tag{3.3} \]

\[ E[\eta_{it}\eta_{is}] = \sigma_\mu^2 + \sigma_\alpha^2 \sum_{j=1}^{N} w_{ij}^2 + 2\sigma_{\mu\alpha} w_{ii} \quad \text{for } t \neq s \tag{3.4} \]

\[ E[\eta_{it}\eta_{ls}] = \sigma_\alpha^2 \sum_{j=1}^{N} w_{ij}w_{lj} + \sigma_{\mu\alpha}(w_{il} + w_{li}) \quad \text{for } i \neq l \tag{3.5} \]

This allows us to consider the following linear regression to estimate $\sigma$:

\[ \hat{\eta}_{it}\hat{\eta}_{ls} = \sigma M_{ilts} + u_{ilts}, \tag{3.6} \]

where $\hat{\eta}$ is obtained as the residual term of a consistent estimation of the model in (2.5). Given the assumption of strict exogeneity of the covariates, OLS may be used for this purpose.

Under mild conditions, OLS estimation of (3.6) provides consistent estimates of $\sigma$ (denoted by $\hat{\sigma}$) and, with these in hand, we can obtain the FGLS estimates of the model.[10] In particular, $\hat{\sigma}$ allows us to obtain $\Omega_{GLS}$ using (3.1) and, by Cholesky decomposition of its inverse, $\Omega_{GLS}^{-1} = D_{GLS}' D_{GLS}$, the transformation matrix $D_{GLS}$. Finally, OLS estimation of the transformed model

\[ D_{GLS}\, y = D_{GLS} X\beta + D_{GLS} WX\gamma + \frac{1}{T} D_{GLS}\Psi\Psi' X^{*}\Pi_\mu + \frac{1}{T} D_{GLS} W\Psi\Psi' X\Pi_\alpha + D_{GLS}\,\eta, \tag{3.7} \]

provides the FGLS estimates of $\beta$, $\gamma$, $\Pi_\mu$ and $\Pi_\alpha$.

[10] Alternatively, one may impose the positiveness of the variances ($\sigma_\mu^2$, $\sigma_\alpha^2$ and $\sigma_\varepsilon^2$) and that the correlation between $\mu$ and $\alpha$ lies in the $[-1, 1]$ interval ($-1 \leq \sigma_{\mu\alpha} \times (\sigma_\mu^2 \times \sigma_\alpha^2)^{-1/2} \leq 1$), and use e.g. a Non-Linear Least Squares estimator of $\sigma$. Notice that this differs from the approach followed by e.g. Kapoor et al. (2007) in that their estimating equations are non-linear in the parameters of interest and they therefore have to resort to a Generalized Moment estimator (which can also be used here).

Interestingly, we can proceed in an analogous way to deal with error structures with idiosyncratic shocks following autoregressive and moving-average processes. In particular, the main difference with respect to the procedure proposed to deal with spherical disturbances is that the presence of a serial correlation matrix Kronecker (post-)multiplying the last term of equation (3.1) results in additional (autoregressive and moving-average) parameters in (3.2). Alternatively, we can account for the serial correlation in the model by including among the regressors lags of the dependent variable (and possibly of the explanatory variables in $X$ and $WX$). Notice, however, that the presence of the time-invariant components $\upsilon_\mu$ and $\upsilon_\alpha$ in the error term makes the lagged dependent variable endogenous. We thus propose instrumenting this variable using lags of the explanatory variables $X$ (and possibly $WX$).

3.2 Instrumental Variables Estimation Under Sequential Exogeneity

The assumption of strict exogeneity of the covariates is critical to guarantee that the GLS estimators presented in the previous section provide consistent estimates of the parameters of interest. However, in applications the assumption that $\varepsilon_{it}$ is uncorrelated with the covariates in all the time periods may not hold. If for example the values of an explanatory variable in period $t$ are related to past values of the dependent variable (e.g., in $t - 1$), then future values of these explanatory variables (e.g., in $t + 1$) may depend on the values of the idiosyncratic term in $t$, thus breaking the strict exogeneity assumption (see e.g. Wooldridge 2002).
In such circumstances, a sequential exogeneity assumption, $E(\varepsilon_{it} \mid x_{i1}, x_{i2}, \dots, x_{it}) = 0$, seems more appropriate, since it implies that present values of $y_{it}$ do not affect present and past values of $x_{it}$. However, given the spatial structure of our model and following the strict exogeneity case, we instead propose using an “extended sequential exogeneity assumption” involving all the units in the sample. In mathematical terms, $E(\varepsilon_{it} \mid x_{js}) = 0$ for all $i, j$ and $s \leq t$.

Notice also that if (expected) future values of $x_{it}$ depend on $y_{it}$ (i.e., present values of $y_{it}$ affect the expected value of $x_{it+1}$), then the explanatory variables used to construct the correlation functions in (2.4) are endogenous by construction. In other words, the presence of predetermined variables in $X$ and $WX$ means that $\frac{1}{T}\Psi\Psi' X^{*}$ and $\frac{1}{T}W\Psi\Psi' X$ are correlated with the idiosyncratic term $\varepsilon$. Therefore, under sequential exogeneity, the GLS estimators presented in the previous section no longer provide consistent estimates of the parameters of interest. Rather, an IV estimator should be considered for this purpose.

The main challenge IV estimators face in practice is that it is often difficult to find good instruments. In this case, however, the structure of the model provides natural candidates. Namely, the means of the exogenous explanatory variables constructed using values up to period $t$ (rather than using all $T$ values).[11] Let

\[
L_T = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
\frac{1}{2} & \frac{1}{2} & 0 & \cdots & 0 \\
\frac{1}{3} & \frac{1}{3} & \frac{1}{3} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\frac{1}{T} & \frac{1}{T} & \frac{1}{T} & \cdots & \frac{1}{T}
\end{pmatrix}
\]

be the row-standardised lower triangular matrix of ones and $\Gamma = I_N \otimes L_T$ be the transformation matrix that yields the backward-up-to-$t$ mean of the variable (i.e., $\Gamma X$, for example, yields a matrix composed of the means of the exogenous explanatory variables constructed using values up to period $t$).
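The backward means are simple to construct. The following is a minimal sketch, assuming the data are sorted by unit and then by time; the function names and the toy numbers are hypothetical and serve only to show what $\Gamma X$ looks like.

```python
import numpy as np

def backward_mean_matrix(T):
    """Row-standardised lower triangular matrix of ones, L_T."""
    L = np.tril(np.ones((T, T)))
    return L / L.sum(axis=1, keepdims=True)

def backward_means(X, N, T):
    """Gamma X with Gamma = I_N kron L_T (rows sorted by unit, then time)."""
    Gamma = np.kron(np.eye(N), backward_mean_matrix(T))
    return Gamma @ X

# Toy example: one covariate, N = 2 units, T = 3 periods.
X = np.array([[1.0], [2.0], [3.0], [10.0], [20.0], [30.0]])
Z1 = backward_means(X, N=2, T=3)
# Unit 1: 1, (1+2)/2, (1+2+3)/3; unit 2: 10, (10+20)/2, (10+20+30)/3.
```

Each row of `Z1` only uses current and past values of the same unit, which is what makes these means candidates for instruments under the extended sequential exogeneity assumption.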
The matrix of instruments can thus be written as $Z_1 = [\Gamma X \;\; \Gamma WX]$.

Notice that these backward means are exogenous variables under the extended sequential exogeneity assumption. But they are also relevant, since by construction they are correlated with the endogenous explanatory variables $\frac{1}{T}\Psi\Psi' X^{*}$ and $\frac{1}{T}W\Psi\Psi' X$. Notice also that if we use the same explanatory variables to construct both the instruments and the correlation functions (or different variables but the same number), then the model is exactly identified. However, if all the explanatory variables are used to construct the instruments but not all the explanatory variables are used to construct the correlation functions, then the model is overidentified.

[11] In fact, provided that the number of available periods is long enough, one may construct such means using only lagged periods, i.e., one may use values up to period $t - 1$, $t - 2$, etc.

To construct the IV estimator, we follow Hausman and Taylor (1981) and Keane and Runkle (1992). Hausman and Taylor (1981) propose a two-step procedure to estimate linear panel data models with endogenous explanatory variables (with respect to the idiosyncratic term, as well as with respect to the individual effect) that boils down to an initial GLS transformation of the model (using a consistent estimate of the variance-covariance matrix of the error term) and then an estimation of the transformed model by IV. However, Keane and Runkle (1992) show that this procedure may not yield consistent estimates when the instruments are predetermined. This is because the GLS transformation proposed by Hausman and Taylor (1981) results in individual errors that are linear combinations of the errors of the individual in all time periods. To obtain consistent estimates, Keane and Runkle (1992) instead propose using the upper-triangular Cholesky decomposition of the serial correlation matrix (forward filtering) to GLS-transform the model.
In essence, this is the procedure we follow, except that the complex structure of our error term requires a different GLS transformation and alternative orthogonality conditions between the errors and the explanatory variables.

To be precise, we obtain the IV estimates of our model in the following way. First, we transform the model using the projection matrix onto the column space of the matrix consisting of the exogenous variables and the instruments. That is, we multiply the model by the projection matrix $P_Z = Z(Z'Z)^{-1}Z'$, with $Z = \begin{pmatrix} X & WX & Z_1 \end{pmatrix}$. This addresses the endogeneity problem and makes it possible to consistently estimate the transformed model by OLS. However, a more efficient estimation may be obtained if we transform the model to obtain spherical disturbances. To this end, we use these OLS estimates to generate the residuals $\hat{\eta}$ and, after estimating (3.6), obtain $\Omega_{IV}$ and $D_{IV}$ in the same way as we did for $\Omega_{GLS}$ and $D_{GLS}$. In particular, since our instruments are predetermined, we propose using the upper-triangular Cholesky decomposition of the inverse of the variance-covariance matrix (rather than that of the serial correlation matrix used by Keane and Runkle, 1992) to obtain $D_{IV}$. However, given the spatial structure of our model, we need to sort the data first by time and then by units within each time period before computing the upper-triangular Cholesky decomposition of $\Omega_{IV}$.¹² This guarantees that the transformed errors in period $t$ contain elements of $\eta_{is}$ for $s \ge t$ and hence the exogeneity of our instruments in the transformed model.¹³

In the second step of the procedure, we estimate the GLS-transformed model by IV. This means that we again transform the model using the projection matrix $P_Z$, except that now the sorting of the data requires using the matrix $\Gamma = L_T \otimes I_N$ to construct $Z_1$. The transformed model,

$$
P_Z D_{IV}\, y = P_Z D_{IV} X\beta + P_Z D_{IV} W X\gamma + \frac{1}{T} P_Z D_{IV} \Psi\Psi' X^{*}\Pi_\mu + \frac{1}{T} P_Z D_{IV} W\Psi\Psi' X\Pi_\alpha + P_Z D_{IV}\,\eta, \tag{3.8}
$$

is then estimated by OLS.
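The re-sorting and forward-filtering steps can be checked numerically with a short NumPy sketch. Everything in it is an assumption made for illustration: the toy weights matrix, the error covariance of the form $\sigma_\varepsilon^2 I + \Psi\Sigma_\upsilon\Psi'$ with arbitrary values, and the name `F` for the upper-triangular filter (the paper's own transformation is $D_{IV}$, whose exact convention is not reproduced here).

```python
import numpy as np

N, T = 3, 4
rng = np.random.default_rng(0)

# Toy row-standardised spatial weights with a zero diagonal (hypothetical values).
w = rng.random((N, N))
np.fill_diagonal(w, 0.0)
w /= w.sum(axis=1, keepdims=True)

# Permutation from unit-major order (position i*T + t) to time-major order (position t*N + i).
unit_major_index = np.array([i * T + t for t in range(T) for i in range(N)])
P = np.eye(N * T)[unit_major_index]           # y_time = P @ y_unit

# Kronecker forms of W and Psi under each sorting.
W_unit, W_time = np.kron(w, np.eye(T)), np.kron(np.eye(T), w)
Psi_unit = np.kron(np.eye(N), np.ones((T, 1)))
Psi_time = np.kron(np.ones((T, 1)), np.eye(N))
resorting_ok = (np.allclose(P @ W_unit @ P.T, W_time)
                and np.allclose(P @ Psi_unit, Psi_time))

# Toy error covariance in time-major order: sigma2_eps * I + Psi Sigma_v Psi'.
A = rng.normal(size=(N, N))
Sigma_v = A @ A.T + np.eye(N)                 # arbitrary positive-definite block
Omega = 0.5 * np.eye(N * T) + Psi_time @ Sigma_v @ Psi_time.T

# Upper-triangular factor F with F'F = inv(Omega), so that F Omega F' = I.
L = np.linalg.cholesky(np.linalg.inv(Omega))  # lower triangular, inv(Omega) = L @ L.T
F = L.T
whitened_cov = F @ Omega @ F.T
```

Because `F` is upper triangular and the data are sorted by time and then by unit, the transformed error at position $tN + i$ only loads on same-period errors of units $j \ge i$ and on errors of later periods, which is the forward-filtering property the text relies on.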
The IV estimates of $\beta$, $\gamma$, $\Pi_\mu$ and $\Pi_\alpha$ we obtain are not only consistent but also more efficient than the initial IV estimates obtained in the first step of the procedure.

¹² Notice that so far we have followed the standard practice of having the data sorted first by units and then by time within each unit, so e.g. the dependent variable was defined in Section 2 as $y = (y_{11}, \ldots, y_{1T}, \ldots, y_{N1}, \ldots, y_{NT})'$. Here we require that $y = (y_{11}, \ldots, y_{N1}, y_{12}, \ldots, y_{N2}, \ldots, y_{1T}, \ldots, y_{NT})'$ and $X = (x_{11}, \ldots, x_{N1}, x_{12}, \ldots, x_{N2}, \ldots, x_{1T}, \ldots, x_{NT})'$. In particular, notice that this sorting requires using $W = I_T \otimes w$ and $\Psi = \iota_T \otimes I_N$.

¹³ Notice that, in a model without spatial dependence, Keane and Runkle (1992) assume that $E(\eta_{it} \mid z_{is}) = 0$ for $s \le t$. However, the presence of spatially weighted covariates in the model means that, for the proposed instruments, this only holds if the extended sequential exogeneity assumption holds. Notice also that observation $(i, t)$ of the transformed error term, $D_{IV}\eta$, contains the original error terms $\eta_{j,t}$ for $j = i, i+1, i+2, \ldots, N$ as well as $\eta_{j,s}$ for all $j$ and $s > t$. This is why, in the presence of spatial dependence, the orthogonality condition proposed by Keane and Runkle (1992) does not suffice. In contrast, our extended sequential exogeneity assumption guarantees that the proposed instruments are exogenous to the transformed errors, since $E(\eta_{jt} \mid z_{is}) = 0$ for all $j$ and $s \le t$.

Lastly, it is interesting to note that, unlike the strict exogeneity case, the treatment of serial correlation under sequential exogeneity does not follow immediately. In the present case, the presence of lags of the idiosyncratic term $\varepsilon$ makes any predetermined variable in the model endogenous (not only those in the correlation functions). This is, however, not a major issue if the number of lags is small (the order of the moving average is low) and the time dimension of the panel is large.
If the predetermined variable is among the regressors ($X$ and $WX$), we propose using lags of the variable as instruments; and if the predetermined variable is in the correlation function, we similarly propose adjusting the periods used to compute the backward means. Thus, we lose (at least) one period for each additional lagged term in the idiosyncratic error. The problem, of course, is that if the order of the moving average process driving the idiosyncratic term is not smaller than the number of time periods minus one, then there is no room for using lags and backward means as instruments. In particular, by the Wold representation theorem, this situation arises if the idiosyncratic term follows an AR process (of any order), since such a process has a moving average representation of infinite order.

Alternatively, in applications in which strict exogeneity does not hold and there is serial correlation in the model, we may include among the regressors lags of the dependent variable (and possibly of the explanatory variables in $X$ and $WX$) and then apply the two-step procedure previously described. Notice, however, that in this specification the lagged dependent variable is endogenous (for the reasons pointed out in the GLS case). We thus propose extending our matrix of instruments to include lags of the explanatory variables $X$ (and possibly $WX$) to control for the endogeneity of the lagged dependent variable.

4 Public capital spillovers in a production function specification: empirical evidence

In this section we use our correlated random effects specification and the proposed FGLS and IV estimators to empirically address the existence of public capital spillovers. To this end, we use a Cobb-Douglas production function specification and yearly data from Munnell (1990, p. 77) on the 48 contiguous US states over the period 1970 to 1986. The output variable is the gross state product and the inputs include public capital, private capital and labour.
“The unemployment rate is also included [in the regressions] to reflect the cyclical nature of productivity”. All the variables except unemployment are in logs. This data set has the additional interest of having been partially (e.g. Garcia-Mila et al., 1996) or totally (e.g. Baltagi and Pinnoi, 1995) used in a number of studies on the relation between public capital and private output (see also Boarnet, 1998, and Sloboda and Yao, 2008). In particular, some of these studies have used spatial econometrics techniques (e.g. Holtz-Eakin and Schwartz, 1995; Kelejian and Robinson, 1997). This accumulated evidence provides an excellent benchmark for our estimates, which were obtained for a model specification that uses all the inputs and their spatial counterparts to construct the correlation functions ($\mu$ and $\alpha$).

Before proceeding with the estimation, however, we considered the identification of the proposed model. We thus computed $\det(X'X)$ and found that it was indeed positive, which means that our identification condition holds.

We report FGLS estimates of the model in Table 1 (coefficients and variance components). We also report the joint significance LM-tests of each subset of coefficients ($\beta$, $\gamma$, $\Pi_\mu$ and $\Pi_\alpha$). The first thing to notice is that, since our model specification uses all regressors ($X$ and $WX$) to construct the correlation functions ($\mu$ and $\alpha$), the estimates of $\beta$ and $\gamma$ reported in Table 1 correspond to the within estimates of the model. As for the coefficient estimates of the variables that compose the correlation functions, $\Pi_\mu$ and $\Pi_\alpha$, all tend to be statistically significant (both individually and jointly). Lastly, all the variance components $\sigma$ are statistically significant and have reasonable values. This supports our correlated random effects model specification.
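The equivalence between the slope estimates of a correlated random effects regression and the within estimates can be verified on simulated data. The sketch below is a deliberately simplified, non-spatial special case (no $WX$ terms and no spatially weighted means), with hypothetical dimensions and arbitrary true coefficients; it only illustrates the mechanism, not the paper's full model.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T, K = 30, 6, 2
unit = np.repeat(np.arange(N), T)                    # rows sorted by unit, then by time

# Simulated panel (hypothetical): regressors are correlated with the individual effect mu.
mu = rng.normal(size=N)
X = rng.normal(size=(N * T, K)) + mu[unit, None]
y = X @ np.array([0.7, 0.2]) + mu[unit] + rng.normal(scale=0.1, size=N * T)

Psi = np.kron(np.eye(N), np.ones((T, 1)))
P_mean = Psi @ Psi.T / T                             # replaces each value by its unit mean
Q = np.eye(N * T) - P_mean                           # within (demeaning) transformation

# (a) Within estimator: OLS on the demeaned data.
beta_within = np.linalg.lstsq(Q @ X, Q @ y, rcond=None)[0]

# (b) Pooled OLS on X plus the unit means of [1, X] (Mundlak-type regression).
X_star = np.column_stack([np.ones(N * T), X])
R = np.column_stack([X, P_mean @ X_star])
beta_mundlak = np.linalg.lstsq(R, y, rcond=None)[0][:K]
```

The two slope vectors coincide up to floating-point error, which is the Frisch-Waugh-Lovell argument used in the proof of Proposition 3 in the Appendix.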
In particular, given that we reject that $\Pi_\alpha$ is zero, there is evidence of contagion in the individual effects and, given that we reject that $\Pi_\mu$ is zero, there is evidence of correlation between the individual effects and the covariates (fixed effects).

[Insert Table 1]

The FGLS estimates of the $\beta$-coefficients associated with the main regressors ($X$) are in line with those reported in previous studies. More precisely, they are close to those reported by Holtz-Eakin and Schwartz (1995) and Kelejian and Robinson (1997). While our estimate of the elasticity of labour is 0.7, for example, they estimated it to be between 0.6 and 0.9; similarly, our estimate of the elasticity of private capital is 0.2, while their estimates range from 0.06 to 0.2.¹⁴ We further concur with the lack of statistical significance of public capital (see also Baltagi and Pinnoi, 1995; Garcia-Mila et al., 1996) and the statistical significance of the spatially weighted public capital (see the second column of Table 1, $\gamma$). Lastly, the statistical significance of public capital in the correlation function of the individual effects (see the third column of Table 1, $\Pi_\mu$) is consistent with evidence reported by Baltagi and Pinnoi (1995, p. 396) that rejects the orthogonality between regressors and individual effects “only when the public capital stock (is) included in the production function”.

¹⁴ These estimates tend to be smaller than those reported by Baltagi and Pinnoi (1995) and Garcia-Mila et al. (1996), which may suggest that ignoring spatial dependence results in overestimation of the coefficients.

Next we explore the possibility that the explanatory variables are not exogenous but predetermined. Previous related studies have analysed the endogeneity of (some of) the explanatory variables, with mixed results on the endogeneity tests (Baltagi and Pinnoi, 1995; Holtz-Eakin and Schwartz, 1995; Garcia-Mila et al., 1996) and implausible results on the coefficient estimates (Baltagi and Pinnoi, 1995; Holtz-Eakin and Schwartz, 1995). Here we address the predeterminedness of the public capital variable (and its spatially weighted counterpart). This would be the case, for example, if the amount states spend on public capital is related to past values of private output (e.g., because more prosperous states are likely to generate higher tax revenues). Under such circumstances, our previous discussion of the FGLS estimates is flawed, since the variables that compose the correlation functions defining $\mu$ and $\alpha$ become endogenous and the FGLS estimator is no longer consistent.

We thus report results from an IV estimation in Table 2. These were obtained using as instruments backward-up-to-$t$ means of all the explanatory variables and their spatially weighted counterparts. That is, $Z_1$ contains the $\Gamma$-transformations of public capital, private capital, labour and unemployment as well as of their spatially weighted counterparts.¹⁵ At first sight, the IV estimates of the $\beta$- and $\gamma$-coefficients are not substantially different from those obtained by FGLS (perhaps with the exception of the public capital in $\gamma$). In contrast, IV and FGLS estimates of the coefficients associated with the variables that compose the correlation functions, $\Pi_\mu$ and $\Pi_\alpha$, differ substantially. Indeed, a Hausman test between these two estimators strongly rejects the null hypothesis of strict exogeneity (the statistic is 159.42). This supports our tenet that public capital is actually a predetermined variable.

¹⁵ We experimented with other sets of instruments (e.g., without considering unemployment and its spatially weighted counterpart) and found that the coefficient estimates were barely altered.

[Insert Table 2]

We then obtained an estimate of the direct individual effects (i.e., $\mu$) and the “potential” of their spatial spillovers (i.e., $\alpha$) using these IV estimates.
However, we found that while the estimated direct effects were generally statistically significant (both using FGLS and IV estimates), the “potential” of the spatial spillovers of the individual effects was only statistically significant under the strict exogeneity assumption. Under the assumption that capital is a predetermined variable, the estimated $\alpha$’s were not statistically significant.¹⁶ Seeking alternative, more efficient specifications that yielded statistically significant $\alpha$’s, we used a variable-selection procedure that resulted in the specification reported in Table 3.¹⁷ Notice that the (common) estimated model coefficients are not that different from those reported in Table 2. Also, the correlation between the estimated $\mu$’s and $\alpha$’s (obtained from the estimates reported in Table 2 and Table 3) is 0.99 and 0.90, respectively. However, 14 out of the 48 estimated $\alpha$’s are now statistically significant at standard confidence levels.

¹⁶ We use t-statistics to test the statistical significance of the individual effects and their spatial spillovers. In particular, the standard errors were obtained from

$$
\mathrm{Var}(\hat{\mu} - \mu) = \frac{1}{T^2}\Psi' X^{*}\,\Sigma_{\Pi_\mu} X^{*\prime}\Psi + \sigma_\mu^2 I_N - \frac{2}{T}\Psi' X^{*} M_\mu E\left[\eta\upsilon_\mu'\right],
$$

with $E[\cdot]$ denoting the mathematical expectation, $\Sigma_{\Pi_\mu} = \mathrm{Var}(\hat{\Pi}_\mu)$ and $\hat{\Pi}_\mu - \Pi_\mu = M_\mu\eta$. The expression for $\mathrm{Var}(\hat{\alpha} - \alpha)$ is analogous, only differing in the subindices (i.e., using $\Sigma_{\Pi_\alpha}$ instead of $\Sigma_{\Pi_\mu}$, $\sigma_\alpha^2$ instead of $\sigma_\mu^2$, $M_\alpha$ instead of $M_\mu$, and $\upsilon_\alpha$ instead of $\upsilon_\mu$). Under standard assumptions, these are consistent and (approximately) normally-distributed estimators of $\mathrm{Var}(\hat{\mu} - \mu)$ and $\mathrm{Var}(\hat{\alpha} - \alpha)$, respectively. Notice that these estimated variances also allow us to test whether the direct effects of two states are statistically equal (rather than whether the direct effect of one state is statistically equal to zero, as we did).

¹⁷ In particular, we proceeded as follows. We dropped the least significant variable (i.e., that with the highest p-value) in the original specification (that reported in Table 2), re-estimated the model and determined which was the least significant variable in the new specification. We then tested whether these two variables were jointly statistically significant in the original specification. If the null hypothesis of this Wald test was not rejected, we dropped the two variables and considered again the least significant variable in the resulting specification. Then, we constructed a new Wald test for the null that the three variables were not jointly statistically significant in the original specification. We went on dropping variables and testing their joint significance until either the null hypothesis of joint non-significance was rejected or all the variables in the model were statistically significant.

Alternatively, we considered an ad-hoc model selection procedure in which we explored different specifications in terms of instruments (e.g., using squared terms) and/or variables (e.g., dropping unemployment, as in Holtz-Eakin and Schwartz, 1995). This also produced specifications in which the $\alpha$’s were statistically significant and highly correlated with those obtained using the results reported in Table 2, while the coefficients of the model and their statistical significance remained generally unaltered. This was the case, for example, when dropping public capital from $WX$, unemployment from $\Pi_\mu$, and unemployment and its spatially weighted counterpart from the set of instruments.

Given the illustrative aim of this empirical exercise, determining which is the best model specification is clearly beyond the scope of this paper. It is important to stress, however, that few differences were observed among the alternative specifications we considered when we plotted the estimated values of $\mu$ and $\alpha$ on a map of the US states (available from the authors upon request).
Bearing this in mind, we have used the IV estimates reported in Table 3 to analyse the geographical distribution of the estimated direct individual effects and their spatial spillovers. In particular, our spillover effects are based on LeSage and Chih (2016), who define “spill-in” and “spill-out” effects as the cumulated off-diagonal elements of the row- and column-sums of the matrix of marginal effects in (2.2), respectively.

[Insert Table 3]

To be precise, LeSage and Chih (2016) interpret the row-sums of $w\alpha_j$ (i.e., $\sum_{l=1}^{N} w_{il}\alpha_j$, assuming that $w_{ii} = 0$) as the “spill-in effects” on the outcome of unit $i$. Thus, this spill-in effect captures the impact on the outcome of unit $i$ of all the units neighbouring $i$ having the unobserved characteristics of $j$. However, we find it more interesting to report $\sum_{l=1}^{N} w_{il}\alpha_l$, which is the impact on the outcome of unit $i$ of all the units neighbouring $i$ having their own unobserved characteristics (i.e., the impact on the outcome of unit $i$ of the individual effects of the units neighbouring $i$). This is reported in Figure 1a.

Similarly, LeSage and Chih (2016) interpret the column-sums of $w\alpha_j$ (i.e., $\alpha_j\sum_{l=1}^{N} w_{li}$, assuming that $w_{ii} = 0$) as the “spill-out effects” of unit $i$. In particular, this spill-out effect captures the impact of unit $i$ having the unobserved characteristics of $j$ on the outcome of the neighbours of unit $i$. Again, we find it more interesting to report $\alpha_i\sum_{l=1}^{N} w_{li}$, which is the impact on the outcome of the units neighbouring $i$ of the individual effect of unit $i$. Notice, however, that our proposed spill-out effect is the product of $\alpha_i$ (the “potential” of the spatial spillovers of the individual effects) and $\sum_{l=1}^{N} w_{li}$ (which in essence determines the spatial contagion, i.e., which units are affected by the spillover). Since our spill-in effect already reflects the spatial contagion of the individual effects, it seems more interesting to report $\alpha$ rather than $\alpha_i\sum_{l=1}^{N} w_{li}$. This is consequently what we do in Figure 1b.
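The two quantities defined above, $\sum_{l} w_{il}\alpha_l$ and $\alpha_i\sum_{l} w_{li}$, reduce to a matrix-vector product and a scaled column sum. The following NumPy sketch uses a hypothetical three-unit weights matrix and made-up values of $\alpha$; the function names are illustrative only. Note that the second quantity is shown for completeness, since Figure 1b reports $\alpha$ itself.

```python
import numpy as np

def spill_in(w, alpha):
    """For each unit i, sum over l of w[i, l] * alpha[l]."""
    return w @ alpha

def spill_out(w, alpha):
    """For each unit i, alpha[i] times the i-th column sum of w."""
    return alpha * w.sum(axis=0)

# Hypothetical row-standardised weights for three units on a line (zero diagonal).
w = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
alpha = np.array([-0.2, 0.1, -0.4])   # made-up spillover "potentials"

inward = spill_in(w, alpha)     # [0.1, -0.3, 0.1]
outward = spill_out(w, alpha)   # [-0.1, 0.2, -0.2]
```

In this toy example the middle unit receives the average of its neighbours' $\alpha$ values, while its own $\alpha$ is scaled by a column sum of 2 because both neighbours place all their weight on it.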
[Insert Figure 1]

Results indicate that the geographical distribution of the spatial contagion, as measured by the spill-in (Figure 1a) and spill-out (Figure 1b) effects, follows very much the same pattern (with some notable exceptions, such as Nebraska and Nevada). This means that most states show spill-in and spill-out effects of analogous magnitude. However, the “inwards” spatial contagion of the spill-in effects is generally not statistically significant (Colorado, Montana, Texas and Utah being the exceptions). We thus concentrate on the analysis of the spill-out effects which, as previously pointed out, tend to be statistically significant.

Figure 1b shows that, in absolute values, West-Central (from Texas to Kansas, but also North Dakota and Indiana) and West-Mountain (New Mexico and Wyoming) states stand out as the areas with the highest “outwards” spatial contagion. These are thus states with individual effects that strongly and negatively spill over to the neighbouring states. Interestingly, these spill-out effects are generally statistically significant at conventional confidence levels. On the other hand, there are two areas of low spill-out effects (in absolute values): the West-Pacific (California and Washington) and the North-East (New York in the mid-Atlantic and Connecticut, Rhode Island, Massachusetts and Vermont in New England). These are thus states with individual effects that negatively spill over to the neighbouring states, but for which the relative magnitude of these spillovers is small (and, in fact, not statistically significant).
Further, Figure 1c reveals that the states that have the highest estimated values of the direct individual effects are mostly located in the North and West of the country (plus Texas and Louisiana in the South): more precisely, in the East (Illinois, Michigan and Ohio) and West (Nebraska) North Central, the mid-Atlantic (New York and Pennsylvania), the West-Mountain (Montana and Wyoming) and West-Pacific (California and Washington) regions. Figure 1c also shows that the states with the lowest estimated values of the direct individual effects are concentrated in New England (Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island and Vermont), although we also find some states in the West-Mountain (i.e., Idaho, Utah and Nevada) and the South-Atlantic (North and South Carolina) regions.

Lastly, it is interesting to note the overlap between Figure 1a, Figure 1b and Figure 1c. In fact, since the estimated values of the individual effects and their spatial spillovers have opposite signs (see the signs of the $\Pi$-coefficients in Table 1, Table 2 and Table 3), Figure 1 points to a negative relation between $\mu$ and $\alpha$ ($w\alpha$). Importantly, this “proportionality” between the direct and indirect effects is not a feature of the model specification (LeSage and Chih, 2016). Rather, it arises as a genuine characteristic of the data.¹⁸ Therefore, states with larger/smaller estimated direct individual effects tend to have larger/smaller negative spatial spillovers (both spill-in and spill-out). Notice, however, that this is mostly driven by New England states (with small direct individual effects and small negative spatial spillovers) and the central and southern states (with large direct individual effects and large negative spatial spillovers, except for Nebraska, which shows large direct individual effects and small negative spatial spillovers).
The West and North-East states, on the other hand, tend to have large estimated direct individual effects and small negative spatial spillovers.

¹⁸ A simple regression between $\mu$ and $\alpha$, for example, indeed has a negative and statistically significant slope (−0.34, with a p-value of 0.03), whereas a simple regression between $\alpha$ and $w\alpha$ has a positive and statistically significant slope (0.86, with a p-value of 0.00).

To conclude, it is worth noting that negative spillovers that “might shift economic activity from one location to another” have previously been found by, for example, Boarnet (1998, p. 382) and Sloboda and Yao (2008) with respect to the stock of and the investment in public infrastructure, respectively. However, the source of this “crowding out” effect in our model is the unobservable characteristics of the states or “unobserved productivity” (which these studies cannot identify). In fact, consistent with the work of Holtz-Eakin and Schwartz (1995) and Kelejian and Robinson (1997), results reported in Table 2 show that the spatially weighted public capital has a negative coefficient, but is not statistically significant (while private capital is, and has a positive sign). Notice also that both private and public capital are positively related to the states’ unobserved productivity (both variables show positive and statistically significant signs in $\Pi_\mu$). However, the role of these variables in the spatial spillover of the productivity, $\Pi_\alpha$, differs. While the investment in private capital is associated with negative spillovers, the investment in public capital is associated with positive spillovers.

5 Conclusions

In this paper we analyse the problem of estimating individual effects and their spatial spillovers in linear panel data models. In particular, we consider models in which the exogenous regressors are spatially weighted and there is no spatially lagged dependent variable (i.e., the so-called spatial-X model).
We first show that in this model specification the individual effects and their spatial spillovers are not identified for any spatial weight matrix. Under mild assumptions, however, we show that they are identified in a correlated random effects specification. To be precise, we show that there is no identification problem in a spatial-$(X, \Psi)$ panel data model with correlated random effects if certain rank conditions hold and the individual effects correspond to deviations with respect to the constant term.

We then consider the estimation of the parameters of the (identified) model. Under strict exogeneity of the covariates, OLS estimates are consistent. Here, though, we provide more efficient FGLS estimators (at least more efficient with respect to the coefficients of the variables that compose the correlation functions) and propose an IV estimator to tackle situations in which the strict exogeneity assumption may not hold and a sequential exogeneity assumption is upheld. In particular, we suggest using the means of the exogenous explanatory variables constructed using values up to period $t$ as instruments for the endogenous explanatory variables used to construct the correlation function (which, ultimately, are “means-up-to-$T$” of the exogenous variables). Also, dropping the most recent periods used to construct these instrumental variables (i.e., using “means-up-to-$(t-s)$”, with $s$ being a positive integer) may provide further instruments and/or instruments for potentially endogenous regressors.

Lastly, we present results from an empirical application: the estimation of a Cobb-Douglas production function using US state data. We find statistically significant differences between the FGLS and IV estimates, which suggest that the strict exogeneity assumption that sustains the FGLS estimates may not hold because the public capital variable is actually predetermined.
Also, IV (and FGLS) estimates show that the variables that compose the correlation functions, as well as the variance components, all tend to be statistically significant. This supports our correlated random effects model specification.

The geographical distribution of the IV-estimated direct individual effects and their spatial spillovers reveals the existence of three major regions: i) Central and South states, where both direct individual effects and negative spatial spillovers tend to be large; ii) New England states, where both the direct individual effects and negative spatial spillovers tend to be small; and iii) West and North-East states, where the estimated direct individual effects tend to be large and the negative spatial spillovers tend to be small. In addition, both the “inwards” and “outwards” spatial contagion of the individual effects (i.e., the spill-in and spill-out effects) involve negative spillovers, although this sign is mostly associated with the private capital (and labour) and is only statistically significant for the spill-out effects. Public capital, on the other hand, is behind the positive spatial contagion of the individual effects. Consistent with previous literature, however, public capital itself does not seem to convey statistically significant spatial spillovers.

References

Baltagi, B. and Pinnoi, N. (1995). Public capital stock and state productivity growth: Further evidence from an error components model. Empirical Economics, 20(2):351–359.
Beer, C. and Riedl, A. (2012). Modelling spatial externalities in panel data: The spatial Durbin model revisited. Papers in Regional Science, 91(2):299–318.
Boarnet, M. G. (1998). Spillovers and the locational effects of public infrastructure. Journal of Regional Science, 38(3):381–400.
Chamberlain, G. (1982). Multivariate regression models for panel data. Journal of Econometrics, 18(1):5–46.
Combes, P.-P. and Gobillon, L. (2015). The empirics of agglomeration economies.
In Duranton, G., Henderson, V., and Strange, W., editors, Handbook of Urban and Regional Economics, vol. 5A, pages 247–348. Elsevier.
Debarsy, N. (2012). The Mundlak approach in the spatial Durbin panel data model. Spatial Economic Analysis, 7(1):109–131.
Garcia-Mila, T., McGuire, T. J., and Porter, R. H. (1996). The effect of public capital in state-level production functions reconsidered. The Review of Economics and Statistics, 78(1):177–180.
Halleck Vega, S. and Elhorst, J. P. (2015). The SLX model. Journal of Regional Science, 55(3):339–363.
Harville, D. (2008). Matrix Algebra From a Statistician’s Perspective. Springer.
Hausman, J. A. and Taylor, W. E. (1981). Panel data and unobservable individual effects. Econometrica, 49(6):1377–1398.
Henderson, H. V. and Searle, S. R. (1981). On deriving the inverse of a sum of matrices. SIAM Review, 23(1):53–60.
Holtz-Eakin, D. and Schwartz, A. (1995). Spatial productivity spillovers from public infrastructure: Evidence from state highways. International Tax and Public Finance, 2(3):459–468.
Kapoor, M., Kelejian, H. H., and Prucha, I. R. (2007). Panel data models with spatially correlated error components. Journal of Econometrics, 140(1):97–130. Analysis of spatially dependent data.
Keane, M. P. and Runkle, D. E. (1992). On the estimation of panel-data models with serial correlation when instruments are not strictly exogenous. Journal of Business & Economic Statistics, 10(1):1–9.
Kelejian, H. H. and Robinson, D. P. (1997). Infrastructure productivity estimation and its underlying econometric specifications: A sensitivity analysis. Papers in Regional Science, 76(1):115–131.
Lee, L.-F. and Yu, J. (2016). Identification of spatial Durbin panel models. Journal of Applied Econometrics, 31(1):133–162.
LeSage, J. and Chih, Y.-Y. (2016). Interpreting heterogeneous coefficient spatial autoregressive panel models. Economics Letters, 142:1–5.
LeSage, J. and Pace, R. (2009). Introduction to Spatial Econometrics.
Statistics: A Series of Textbooks and Monographs. CRC Press.
Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1):69–85.
Munnell, A. H. (1990). How does public infrastructure affect regional economic performance? In Munnell, A. H., editor, Is There a Shortfall in Public Capital Investment?, pages 69–103. Federal Reserve Bank of Boston.
Pereira, A. and Andraz, J. (2013). On the economic effects of public infrastructure investment: A survey of the international evidence. Journal of Economic Development, 4(38):1–37.
Sloboda, B. W. and Yao, V. W. (2008). Interstate spillovers of private capital and public spending. The Annals of Regional Science, 42(3):505–518.
Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press.

Appendix: Proofs of Propositions

Proof of Proposition 1. The model in (2.1) is not identified for any spatial weight matrix $w$ because $\Psi$ and $W\Psi$ are perfectly collinear. We prove this by showing that $\det\left(\begin{pmatrix}\Psi & W\Psi\end{pmatrix}'\begin{pmatrix}\Psi & W\Psi\end{pmatrix}\right)$ is zero for any spatial weight matrix $w$. Let

$$
A = \begin{pmatrix}\Psi & W\Psi\end{pmatrix}'\begin{pmatrix}\Psi & W\Psi\end{pmatrix} = T\begin{pmatrix} I_N & w \\ w' & w'w \end{pmatrix}. \tag{5.1}
$$

Then, by the Schur complement,

$$
\det(A) = T^{2N}\det(I_N)\det\left(w'w - w'(I_N)^{-1}w\right) = T^{2N}\det\left(w'w - w'w\right) = 0. \tag{5.2}
$$

Proof of Proposition 2. Since the correlated random effects spatial-$(X, \mu)$ panel data model is linear in parameters, it is identified iff $\det(X'X) \neq 0$. If $X$ has full rank, it is easy to show that $\det(X'X) > 0$ (see e.g. Corollary 14.2.14 and Theorem 14.9.4 in Harville, 2008).

Proof of Proposition 3. Let $\lambda_{OLS}$, $\lambda_{w}$ and $\lambda_{GLS}$ be the OLS, within and GLS estimators of $\lambda$. We prove first that $\lambda_{OLS} = \lambda_{w}$. To this end, we start by noting that, by the Frisch-Waugh-Lovell theorem and given that $\Psi\Psi'\Psi\Psi' = T\Psi\Psi'$,

$$
\lambda_{OLS} = (X'M_1X)^{-1}(X'M_1y),
$$

with $M_1 = I_{NT} - \frac{1}{T}\Psi\Psi'X^{*}\left(X^{*\prime}\Psi\Psi'X^{*}\right)^{-1}X^{*\prime}\Psi\Psi'$. Also, let $Q = I_{NT} - \frac{1}{T}\Psi\Psi'$, which satisfies $Q'Q = Q$ and $Q\Psi = 0$. Lastly, since $X' = \begin{pmatrix} 0_{K\times 1} & I_{K\times K}\end{pmatrix}X^{*\prime}$, it can be proved that $X'M_1 = X'Q$. Consequently,

$$
\lambda_{OLS} = (X'QX)^{-1}X'Qy = \lambda_{w}.
$$

This concludes the first part of the proof. Next we prove that $\lambda_{GLS} = \lambda_{OLS} = \lambda_{w}$. To this end, we start by noting that $\lambda_{GLS}$ corresponds to the OLS estimator of the considered correlated random effects model transformed using the (upper triangular part of the) Cholesky decomposition of the inverse of the variance-covariance matrix of $\eta$, $\Omega^{-1} = DD'$. Therefore, by the Frisch-Waugh-Lovell theorem,

$$
\lambda_{GLS} = \left((D'X)'M_2D'X\right)^{-1}\left((D'X)'M_2D'y\right),
$$

with $M_2 = I_{NT} - D'\Psi\Psi'X^{*}\left(X^{*\prime}\Psi\Psi'\Omega^{-1}\Psi\Psi'X^{*}\right)^{-1}X^{*\prime}\Psi\Psi'D$. Also, from equation (19) in Henderson and Searle (1981),

$$
\Omega^{-1} = \left(\sigma_\varepsilon^2 I_{NT} + \Psi\Sigma_\upsilon\Psi'\right)^{-1} = \frac{1}{\sigma_\varepsilon^2}I_{NT} - \frac{1}{\sigma_\varepsilon^2}\Psi\Sigma_\upsilon\Psi'\Omega^{-1} = \frac{1}{\sigma_\varepsilon^2}I_{NT} - \frac{1}{\sigma_\varepsilon^2}\Omega^{-1}\Psi\Sigma_\upsilon\Psi',
$$

which implies that $\Psi\Sigma_\upsilon\Psi'\Omega^{-1} = \Omega^{-1}\Psi\Sigma_\upsilon\Psi'$ and, using again that $\Psi\Psi'\Psi\Psi' = T\Psi\Psi'$, it can be proved that

$$
\Psi\Psi'\Omega^{-1} = \Omega^{-1}\Psi\Psi', \tag{5.3a}
$$

$$
Q\Omega^{-1} = \Omega^{-1}Q = \frac{1}{\sigma_\varepsilon^2}Q. \tag{5.3b}
$$

Lastly, by (5.3a) matrix $M_2$ can be rewritten as

$$
M_2 = I_{NT} - \frac{1}{T}D'\Psi\Psi'X^{*}\left(X^{*\prime}\Omega^{-1}\Psi\Psi'X^{*}\right)^{-1}X^{*\prime}\Psi\Psi'D,
$$

and so $X'DM_2 = X'QD$. Consequently,

$$
\lambda_{GLS} = \left(X'Q\Omega^{-1}X\right)^{-1}\left(X'Q\Omega^{-1}y\right) = \left(X'QX\right)^{-1}\left(X'Qy\right) = \lambda_{OLS} = \lambda_{w},
$$

where the second equality holds by (5.3b).

Table 1: FGLS estimates.

Coefficients              β             γ             Πµ            Πα
Private capital         0.199***      0.260***      0.197***     −0.477***
                       (0.030)       (0.043)       (0.052)       (0.089)
Labour                  0.724***     −0.027        −0.212***      0.101
                       (0.035)       (0.050)       (0.066)       (0.115)
Unemployment rate      −0.002        −0.007***     −0.013         0.035*
                       (0.001)       (0.002)       (0.010)       (0.018)
Public capital         −0.023        −0.129**       0.186***      0.230
                       (0.030)       (0.051)       (0.070)       (0.146)
Joint LM-test         250.07***      17.83***      13.10***       8.28***

Variance components      σ²µ           σ²α           σµα           σ²ε
                        0.0045***     0.0012***     0.0017***     0.0013***
                       (0.0001)      (0.0003)      (0.0001)      (0.0002)

Note: * p-value<0.1; ** p-value<0.05; *** p-value<0.01. The dependent variable is the gross state product. All the variables are in logs, except for the unemployment. Variance components were estimated by OLS.

Table 2: IV estimates.
Coefficients              β             γ             Πµ            Πα
Private capital         0.255***      0.259***      0.351***     −0.601***
                       (0.037)       (0.055)       (0.081)       (0.135)
Labour                  0.676***     −0.045        −0.666***     −0.100
                       (0.059)       (0.078)       (0.132)       (0.217)
Unemployment rate      −0.003        −0.009***      0.009         0.067**
                       (0.002)       (0.003)       (0.015)       (0.029)
Public capital         −0.029        −0.100         0.541**       0.661*
                       (0.125)       (0.163)       (0.230)       (0.347)
Joint LM-test         168.57***      12.40***      12.77***       6.32***

Variance components      σ²µ           σ²α           σµα           σ²ε
                        0.0046***     0.0008***     0.0019***     0.0019***
                       (0.0001)      (0.0003)      (0.0001)      (0.0002)

Note: * p-value<0.1; ** p-value<0.05; *** p-value<0.01. The dependent variable is the gross state product. All the variables are in logs, except for the unemployment. The matrix of instruments consists of backward-up-to-t means of public capital, private capital, labour and unemployment as well as of their spatially weighted counterparts. Variance components were estimated by OLS.

Table 3: IV estimates.

Coefficients              β             γ             Πµ            Πα
Private capital         0.252***      0.419***      0.342***     −0.909***
                       (0.040)       (0.068)       (0.084)       (0.180)
Labour                  0.666***     −0.279***     −0.776***
                       (0.050)       (0.084)       (0.118)
Unemployment rate      −0.011***     −0.008***
                       (0.002)       (0.003)
Public capital                                      0.660***      0.877***
                                                   (0.132)       (0.193)

Variance components      σ²µ           σ²α           σµα           σ²ε
                        0.0044***     0.0026***     0.0018***     0.0015***
                       (0.0001)      (0.0003)      (0.0001)      (0.0002)

Note: * p-value<0.1; ** p-value<0.05; *** p-value<0.01. The dependent variable is the gross state product. All the variables are in logs, except for the unemployment. The matrix of instruments consists of backward-up-to-t means of public capital, private capital, labour and unemployment as well as of their spatially weighted counterparts. Variance components were estimated by OLS.

Figure 1: Estimated individual effects and their spatial spillovers.
(a) Geographical distribution of wα
(b) Geographical distribution of α
(c) Geographical distribution of µ
Note: * p-value<0.1; ** p-value<0.05; *** p-value<0.01.