3.1 Data
We use a strongly balanced panel of 21 Italian NUTS 2 regions with annual data from 2010 to 2020. The time interval is constrained by data availability. The resulting panel is too short for a long-run study and therefore rules out some methodologies, such as dynamic models. Since the main variables of interest are quite stable over time, being the product of structural factors that characterise the socio-economic context of the Italian regions, we carry out a static analysis. As for the level of geographical detail, the main variables of interest are only available at the NUTS 2 level.
In constructing the dataset, we draw on several sources, all of which can be traced back to the Italian National Institute of Statistics (ISTAT). The variables included in the empirical model are selected on the basis of a careful review of the existing literature on the regional determinants of income inequality.
3.2 The model
To empirically examine the linkage between the regional diffusion of organisation-based volunteering and income inequality, we estimate the following model:
$$\mathit{Gini}_{it} = \alpha + \beta_1 \mathit{Volunt}_{it} + \beta_2 X_{it} + \mu_t + u_{it} \tag{1}$$
where $i$ indexes regions and $t$ indexes years. Gini is the Gini index of net household income, which measures income inequality, Volunt is the number of volunteers in voluntary organisations per 100 inhabitants, $X_{it}$ is a vector of control variables, $\mu_t$ is a full set of time dummies, and $u_{it}$ is the disturbance term.
For the dependent variable, we use the Gini index because it is the most widely used measure of income inequality in the literature (Crespo & Hernandez, 2020; Rogerson, 2013). Figure 2 shows its spatial distribution in 2010 and 2020. The highest levels of inequality are recorded in the South of the country. The spatial distribution of the indicator appears to reflect the different levels of economic development that characterise the regions, highlighting the traditional divide between the Centre-North and the South. If inequality is mainly determined by such structural factors, this would also explain why it is broadly stable over the period considered, as shown by the small differences between the spatial distributions observed in 2010 and 2020.
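As a reminder of what the dependent variable measures, the following minimal Python sketch computes an unweighted Gini index from a list of incomes. It is only an illustration: the function name `gini` and the toy incomes are hypothetical, and the regional values used in this study are those published by ISTAT on EU-SILC data, not values computed this way.

```python
def gini(incomes):
    """Unweighted Gini index of a list of non-negative incomes (illustrative)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted formula: G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    ranked = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * ranked / (n * total) - (n + 1) / n


# Hypothetical toy incomes, not survey data.
equal = gini([10, 10, 10, 10])      # everyone has the same income
concentrated = gini([0, 0, 0, 1])   # one person holds all the income
```

With perfectly equal incomes the index is zero, and it rises as income becomes more concentrated. Official statistics additionally use equivalised incomes and survey weights, which this sketch omits.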
The main independent variable is taken from the ISTAT “Multipurpose household survey: aspects of daily life”, which includes a specific item asking respondents whether or not they volunteer in an NPO. At this stage, we only consider volunteering in voluntary organisations, since the focus is on those NPOs that carry out activities of general interest. Following the work of Knack and Keefer (1997), the social capital literature classifies these as “Putnamian organisations” to distinguish them from “Olsonian organisations”. The latter, being oriented towards the pursuit of particularistic interests, could be detrimental to the spread of generalised trust, bringing out what is referred to as “the dark side of social capital” (Baycan & Öner, 2023).[1] Figure 3 shows the spatial distribution of the indicator in 2010 and 2020. Volunteering in voluntary organisations is more widespread in the Centre-North and especially in the North-East, where the two autonomous provinces of Bolzano and Trento are located, areas with a strong tradition of cooperation and solidarity.
Following the existing empirical literature, we include a set of control variables specifically related to socio-demographic and sectoral composition aspects of regional economies, namely:
1. Number of persons employed in the primary sector as a percentage of total employment (Primary_emp).
2. Number of persons employed in the tertiary sector as a percentage of total employment (Tertiary_emp).
3. Dependency ratio, i.e. the ratio of the inactive population (aged 0-14 and 65+) to the active population, included to control for the demographic structure of the regions (Dep_ratio).
4. Government final consumption expenditure on social protection (Welfare).
Table 1 provides a detailed description of the variables included in the model and the main summary statistics.
3.3 Empirical strategy
Regarding the empirical strategy, we first estimate a pooled OLS model with standard errors clustered by region. Pooled OLS may suffer from omitted variable bias: unobserved factors correlated with the main independent variable could affect its relationship with the dependent variable and bias the OLS estimator. For this reason, panel analyses generally rely on fixed effects estimation, which controls for unobserved, time-invariant factors that are correlated with the regressors. In theory, fixed effects estimation would be the preferred choice here, since it is highly plausible that the main independent variable is correlated with unobserved region-specific factors. Indeed, as mentioned above, volunteering may be shaped by institutional and socio-cultural factors that characterise each region and are rooted in its history.
Empirically, however, we do not estimate a region fixed effects model because the dependent variable is very stable over time, as evidenced by its low within-variance.[2] Fixed effects estimation exploits only the variation over time within regions (the within-variance). If the dependent variable varies only slightly within units, little information is left to identify the effects of the regressors, and the fixed effects estimates can become imprecise or uninformative. By eliminating the between-variance, which accounts for much of the total variance, the within transformation discards a large amount of information and reduces the efficiency of the estimates. To partially account for time-invariant unobserved factors, we include in the model dummies for the NUTS-1 macro-regions North-East, North-West, Centre, and South (with Islands as the reference category).
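The argument about within- and between-variance can be made concrete with a small decomposition. The sketch below is a simplified illustration using population standard deviations. The function name `within_between_sd` and the toy values are hypothetical, and the exact formulas behind the statistics reported in the footnote are not stated in the text.

```python
from collections import defaultdict
from statistics import mean, pstdev


def within_between_sd(values, regions):
    """Return (between_sd, within_sd) of a panel variable (population formulas)."""
    by_region = defaultdict(list)
    for value, region in zip(values, regions):
        by_region[region].append(value)
    region_means = {r: mean(vs) for r, vs in by_region.items()}
    # Between: dispersion of the region means around their average.
    between_sd = pstdev(list(region_means.values()))
    # Within: dispersion of each observation around its own region mean.
    within_sd = pstdev([v - region_means[r] for v, r in zip(values, regions)])
    return between_sd, within_sd


# Hypothetical Gini values for two regions over three years (not the study's data).
gini_values = [0.30, 0.31, 0.29, 0.25, 0.26, 0.24]
region_ids = ["A", "A", "A", "B", "B", "B"]
between_sd, within_sd = within_between_sd(gini_values, region_ids)
```

In this toy panel the between component is about three times the within component. A fixed effects estimator would use only the smaller, within part of the variation.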
Table 1. Summary statistics (n = 252)
| Variable | Source | Years | Mean | S.D. | Min | Max |
| --- | --- | --- | --- | --- | --- | --- |
| Gini | ISTAT on EU-SILC data | 2010-2020 | 0.280 | 0.026 | 0.288 | 0.356 |
| Volunt | ISTAT - Multi-purpose household survey: aspects of daily life | 2010-2020 | 10.813 | 4.563 | 5.000 | 27.300 |
| Dep_ratio | ISTAT | 2010-2020 | 55.614 | 4.068 | 46.300 | 66.100 |
| Primary_emp | ISTAT - Territorial accounts | 2010-2020 | 5.011 | 3.329 | 1.221 | 15.511 |
| Tertiary_emp | ISTAT - Territorial accounts | 2010-2020 | 72.036 | 5.196 | 59.982 | 85.433 |
| Welfare | Territorial Public Accounts | 2010-2020 | 313.337 | 202.172 | 138.121 | 1162.077 |
| Sep_waste | ISTAT | 2004-2014 | 32.460 | 17.675 | 3.600 | 71.300 |
| Volunt_lag | ISTAT - Multi-purpose household survey: aspects of daily life | 2001-2011 | 9.774 | 4.619 | 3.800 | 24.500 |
To deal with omitted variable bias and reverse or simultaneous causality (income inequality might itself be a determinant of volunteering), we estimate an instrumental variables model using the two-stage least squares (IV-2SLS) estimator. IV-2SLS estimation requires instruments that are strongly correlated with the endogenous variable and uncorrelated with the disturbance term. Earlier, we stressed that volunteering is strongly influenced by the socio-cultural and institutional characteristics of a region, which determine the propensity of the population to cooperate. Therefore, to isolate the exogenous component of the key independent variable, a proxy for civic capital is a natural candidate instrument. Given the limited availability of time-series data on civic capital at the NUTS-3 level, we use the percentage of waste collected separately out of total waste (Sep_waste) as the first instrument, with the maximum possible lag (6 years). The rationale is that the performance of municipalities in separate waste collection depends heavily on the civic sense of citizens (e.g. Argentiero et al., 2023; Whang & Zang, 2022), so that it can be considered an appropriate proxy for civic capital.
Regarding the exogeneity requirement, we consider it implausible that civic capital has a direct effect on income inequality, especially when the variable is lagged by 6 years, which allows us to rule out contemporaneous shocks that simultaneously affect our variables. Furthermore, the inclusion of NUTS-1 area dummies in the model reduces the scope for omitted factors correlated with the instrument. Civic culture, which is strongly persistent over time (e.g. De Blasio & Nuzzo, 2010; Guiso et al., 2011; 2016), can have a significant impact on the propensity of the population to volunteer, which largely depends on intrinsic motivations shaped by the civic values of the context in which one lives. We therefore believe that its impact on income inequality can only be indirect, i.e. mediated through volunteering. To be able to test the exclusion restriction through an overidentification test, we include a second instrument: the deep lag of the endogenous variable (Volunt_lag), lagged by 9 years, the maximum possible lag. Deep lags of endogenous variables are considered particularly robust instruments because “they guarantee the robustness of the zero-correlation assumption, even when autoregressive components in the endogenous variables are present” (Crociata et al., 2020, p. 92).
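The logic of the two-stage procedure can be sketched on simulated data. The example below is a minimal illustration with NumPy, not the estimation code used in this study. The data-generating process, the coefficient values and the function names `ols` and `two_stage_least_squares` are assumptions made for the example, and the variable names only echo those of the model.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, X_exog, x_endog, Z):
    """2SLS with one endogenous regressor; coefficients ordered as [X_exog, x_endog]."""
    # First stage: project the endogenous regressor on exogenous regressors and instruments.
    first_stage_X = np.column_stack([X_exog, Z])
    x_hat = first_stage_X @ ols(x_endog, first_stage_X)
    # Second stage: replace the endogenous regressor with its fitted values.
    return ols(y, np.column_stack([X_exog, x_hat]))


# Simulated data (assumed for illustration): an unobserved confounder biases OLS,
# while the instrument affects the outcome only through the endogenous regressor.
rng = np.random.default_rng(0)
n = 5000
confounder = rng.normal(size=n)
instrument = rng.normal(size=n)
volunt = 0.8 * instrument + confounder + rng.normal(size=n)
gini = -0.5 * volunt + confounder + rng.normal(size=n)

const = np.ones((n, 1))
beta_ols = ols(gini, np.column_stack([const, volunt]))[1]
beta_iv = two_stage_least_squares(gini, const, volunt, instrument)[1]
```

In this simulation the OLS slope is pulled towards zero by the confounder, whereas the 2SLS slope is close to the assumed value of -0.5. Standard errors computed naively from the second-stage regression would be incorrect, so dedicated IV routines should be used for inference.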
3.4 Results
Before proceeding with the estimations, we check for collinearity among the independent variables. Table 2 shows the correlation matrix and the variance inflation factor (VIF) values. The correlation matrix does not show the high pairwise correlations that would signal near-linear dependence between the variables. This is further supported by the VIF values, which are all below the threshold of 5, indicating that collinearity issues are unlikely.[3]
Table 2. Correlation matrix of continuous explanatory variables
| | Variable | VIF | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Volunt | 4.888 | 1.000 | | | | |
| 2 | Primary_emp | 4.248 | -0.289 | 1.000 | | | |
| 3 | Tertiary_emp | 2.012 | -0.080 | -0.141 | 1.000 | | |
| 4 | Dep_ratio | 2.977 | 0.242 | -0.555 | -0.008 | 1.000 | |
| 5 | Welfare | 3.841 | 0.571 | -0.193 | 0.239 | 0.101 | 1.000 |
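For reference, the VIF of regressor j equals 1 / (1 - R_j^2), where R_j^2 comes from regressing that variable on all the other regressors. Equivalently, the VIFs are the diagonal elements of the inverse of the regressors' correlation matrix. The sketch below uses this identity. The function name `vif` and the toy matrix are hypothetical, and the VIFs in Table 2 presumably reflect the full set of regressors in the model, which this example does not reproduce.

```python
import numpy as np


def vif(X):
    """VIF of each column of X (n_obs x n_vars), from the inverse correlation matrix."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    # The j-th diagonal element of corr^{-1} equals 1 / (1 - R_j^2).
    return np.diag(np.linalg.inv(corr))


# Hypothetical two-regressor example whose pairwise correlation is 0.8.
X_demo = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]])
vifs = vif(X_demo)
```

With two regressors correlated at 0.8, both VIFs equal 1 / (1 - 0.64), roughly 2.78.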
Table 3 presents the estimation results of the OLS model. First, we estimate the model without control variables, area dummies, or time dummies (column 1). Then we gradually include the control variables (column 2), the area dummies (column 3), and finally the time dummies (column 4). In all cases, the independent variable of interest has a negative and statistically significant coefficient. The full specification (column 4) shows good explanatory power, explaining around 76% of the variance in the dependent variable. These results provide preliminary evidence in support of the study’s hypothesis that organisation-based volunteering contributes to reducing income inequality.
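A pooled OLS regression with standard errors clustered by region can be sketched as follows. This is an illustration on simulated data, not the code behind Table 3. The function name `pooled_ols_cluster`, the simulated coefficients and the finite-sample correction (a common choice, assumed here because the text does not specify one) are all assumptions.

```python
import numpy as np


def pooled_ols_cluster(y, X, groups):
    """Pooled OLS coefficients with cluster-robust (sandwich) standard errors."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: outer products of the within-cluster score sums.
    meat = np.zeros((k, k))
    labels = np.unique(groups)
    for g in labels:
        mask = groups == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    # Assumed finite-sample correction: G/(G-1) * (N-1)/(N-K).
    G = len(labels)
    correction = (G / (G - 1)) * ((n - 1) / (n - k))
    cov = correction * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


# Simulated panel with the same dimensions as the study (21 regions, 11 years).
rng = np.random.default_rng(0)
n_regions, n_years = 21, 11
region = np.repeat(np.arange(n_regions), n_years)
volunt = rng.normal(size=n_regions)[region] + rng.normal(size=region.size)
error = 0.02 * rng.normal(size=n_regions)[region] + 0.02 * rng.normal(size=region.size)
gini = 1.0 - 0.1 * volunt + error  # assumed true slope of -0.1
X = np.column_stack([np.ones(region.size), volunt])
beta, se = pooled_ols_cluster(gini, X, region)
```

Clustering lets the disturbances of the same region be correlated over time, which is why the region-level shock in the simulated error term is handled by the standard errors and not by the point estimates.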
As argued earlier, OLS estimates may be biased by endogeneity. To check the robustness of the previous estimates, we present IV-2SLS estimates in Table 4. We first estimate the model including the two instruments separately (columns 1 and 2) and then include them together (column 3). The results confirm the negative and statistically significant coefficient of the main independent variable. Compared to the OLS estimates, the coefficients of this variable are larger in absolute value, which is consistent with attenuation bias due to measurement error in the OLS estimation. No significant changes are observed for the control variables. As for the instruments, both are significantly correlated with the endogenous variable, as shown by the first-stage estimates. The Kleibergen-Paap weak identification F-statistic is in all cases above the threshold of 10 indicated by Staiger and Stock (1997) for an instrument to be considered relevant, so the instruments do not appear to be weak. The Hansen test does not reject the null hypothesis that the instruments are valid, which supports their exogeneity. Finally, the Anderson-Rubin and under-identification tests further support the specification of the model.
Table 3. POLS estimations
Dependent variable: Gini
| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Volunt | -0.150*** | -0.034* | -0.051*** | -0.076*** |
| | (0.028) | (0.019) | (0.017) | (0.019) |
| Primary_emp | | 0.010 | 0.038*** | 0.038*** |
| | | (0.011) | (0.012) | (0.010) |
| Tertiary_emp | | 0.676*** | 0.659*** | 0.561*** |
| | | (0.071) | (0.069) | (0.054) |
| Dep_ratio | | -0.278*** | -0.377*** | -0.542*** |
| | | (0.086) | (0.070) | (0.070) |
| Welfare | | -0.087*** | -0.118*** | -0.110*** |
| | | (0.013) | (0.009) | (0.009) |
| Area dummies | No | No | Yes | Yes |
| Time dummies | No | No | No | Yes |
| R2 | 0.408 | 0.670 | 0.706 | 0.760 |
| Moran’s I (p-value) | | | | 0.360 |
| N. of observations | 231 | 231 | 231 | 231 |
| N. of regions | 21 | 21 | 21 | 21 |
Notes: Standard errors clustered by region in parentheses. Level of significance: 10% (*), 5% (**), and 1% (***). All variables included in the model are log-transformed (natural logarithm). All estimates include a constant term (not shown).
We perform a spatial diagnostic on the residuals of the OLS estimate (Table 3, column 4). Using a row-standardised spatial weights matrix based on the 4 nearest neighbours (KNN with k = 4), Moran’s I does not detect any spatial autocorrelation. For this reason, we do not estimate spatial regression models, which are often a suitable option when working with regional data.[4]
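The diagnostic can be sketched as follows: build a row-standardised k-nearest-neighbour weights matrix from unit coordinates and compute Moran's I on a vector of values such as regression residuals. The function names `knn_weights` and `morans_i`, the use of Euclidean distances between points, and the toy data are assumptions. The p-value reported in Table 3 would additionally require an inference step (for example a permutation test), which is not shown.

```python
import numpy as np


def knn_weights(coords, k=4):
    """Row-standardised k-nearest-neighbour spatial weights matrix."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a unit is not its own neighbour
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(dist[i])[:k]] = 1.0 / k
    return W


def morans_i(values, W):
    """Moran's I of `values` given the spatial weights matrix W."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)


# Toy example (hypothetical): ten units on a line with a smooth gradient of values.
coords = [(float(i), 0.0) for i in range(10)]
W = knn_weights(coords, k=2)
moran_gradient = morans_i(np.arange(10), W)
```

A smooth spatial gradient yields a strongly positive Moran's I, whereas values close to zero, as for the residuals in Table 3, indicate no spatial autocorrelation.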
Table 4. IV-2SLS estimation
| | (1) | (2) | (3) |
| --- | --- | --- | --- |
| Second stage (Dependent variable: Gini) | | | |
| Volunt | -0.145*** | -0.116*** | -0.120*** |
| | (0.051) | (0.024) | (0.023) |
| Primary_emp | 0.044*** | 0.042*** | 0.042*** |
| | (0.011) | (0.008) | (0.009) |
| Tertiary_emp | 0.477*** | 0.513*** | 0.507*** |
| | (0.103) | (0.059) | (0.064) |
| Dep_ratio | -0.588*** | -0.568*** | -0.571*** |
| | (0.088) | (0.070) | (0.072) |
| Welfare | -0.093*** | -0.100*** | -0.099*** |
| | (0.018) | (0.012) | (0.013) |
| Area dummies | Yes | Yes | Yes |
| Time dummies | Yes | Yes | Yes |
| R2 | 0.742 | 0.754 | 0.753 |
| First stage (Dependent variable: Volunt) | | | |
| Sep_waste | 0.197*** | | 0.079*** |
| | (0.054) | | (0.024) |
| Volunt_lag | | 0.615*** | 0.549*** |
| | | (0.106) | (0.115) |
| R2 | 0.828 | 0.876 | 0.881 |
| Weak identification test (F) (Kleibergen-Paap rk Wald F statistic) | 13.396 | 33.558 | 26.221 |
| Underidentification test (p-value) (Kleibergen-Paap rk LM statistic) | 0.007 | 0.010 | 0.020 |
| Weak instrument robust test (p-value) (Anderson-Rubin Wald test) | 0.014 | 0.000 | 0.000 |
| Overidentification test (p-value) (Hansen J statistic) | | | 0.557 |
| N. of observations | 231 | 231 | 231 |
| N. of regions | 21 | 21 | 21 |
Notes: Standard errors clustered by region in parentheses. Level of significance: 10% (*), 5% (**), and 1% (***). All variables included in the model are log-transformed (natural logarithm). All estimates include a constant term (not shown). Due to space limitations, we only report the instrumental variables results for the first stage regression. Full results are available upon request.
We report additional estimates as robustness tests in the appendix. First, we re-estimate the baseline model with a different dependent variable (Table A1): instead of the Gini index, we use the income quintile share ratio (S80/S20), i.e. the ratio of total income received by the 20% of the population with the highest income to that received by the 20% of the population with the lowest income (Income_ineq). Second, we change the independent variable of interest by including volunteers from all organisations and not only from voluntary organisations (Volunt_tot, Table A2). Third, since the Autonomous Provinces of Trento and Bolzano are potential outliers in the distribution of the independent variable of interest, we re-estimate the model after removing them from the sample (Table A3). Finally, given the panel nature of our dataset, we adopt estimation strategies that account for both cross-sectional dependence and serial correlation. We therefore estimate a Prais-Winsten regression with panel-corrected standard errors and a feasible generalised least squares (FGLS) regression, in both cases allowing for a common AR(1) disturbance (Table A4).[5] These robustness tests confirm the results of the previous estimates.[6]
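To make the alternative dependent variable concrete, the sketch below computes an unweighted S80/S20 ratio from a list of incomes. It is only an illustration: the function name `s80_s20` and the toy incomes are hypothetical, and the official indicator is computed from equivalised incomes with survey weights, which are ignored here.

```python
def s80_s20(incomes):
    """Unweighted income quintile share ratio (top 20% over bottom 20%)."""
    xs = sorted(incomes)
    cut = len(xs) // 5  # number of observations in one quintile
    if cut == 0:
        raise ValueError("need at least five incomes")
    bottom = sum(xs[:cut])
    top = sum(xs[-cut:])
    if bottom == 0:
        raise ValueError("bottom quintile has zero total income")
    return top / bottom


# Hypothetical incomes 1, 2, ..., 10: the top two sum to 19, the bottom two to 3.
ratio = s80_s20(range(1, 11))
```

Unlike the Gini index, this ratio depends only on the two tails of the distribution, which is why it serves as a useful complementary measure.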
3.5 Comments
The results of the econometric analysis provide fairly robust evidence in support of the hypothesis derived from the conceptual framework developed in this study. In line with expectations, organisation-based volunteering appears to be an important factor in mitigating income inequality.
These results should be treated with caution and confirmed by further studies designed to address what we consider to be the two main limitations of this empirical work. The first is external validity. The peculiarities of the Italian socio-cultural and institutional context make it difficult to generalise the results. Indeed, volunteering is a phenomenon that depends strongly on the specific characteristics of territories. Therefore, only national studies with the highest possible level of geographical detail can provide further evidence on the causal link between organisation-based volunteering and income inequality. The second is that we did not use some estimation methods that could further explore the causal link, such as dynamic panel models. We conducted a static analysis because the phenomena considered are relatively stable over the time interval studied. A dynamic panel analysis, which would provide a more complete picture of the relationship between volunteering and income inequality, would require a much longer time series in this case.
Concerning the control variables, the results are in line with expectations. The results on the sectoral composition of the production system show that a greater weight of the primary and tertiary sectors is associated with greater income inequality. With reference to the growth of the service sector, for example, it can be argued that it leads to a concentration of skills and a polarisation of the labour market, which in turn can increase inequality (e.g. Antonelli & Tubiana, 2020; Li et al., 2024). We interpret the negative coefficient of the dependency ratio in light of the fact that, in the Italian context, a higher weight of the elderly population means higher household income, mainly thanks to transfers but also to the higher propensity of the elderly to save, which can lead to higher investments (Goldstein & Lee, 2014; Perugini, 2024). Finally, the negative coefficient of the social protection expenditure variable confirms the importance of welfare in reducing inequalities, as recently shown, for example, by Miranda-Lescano et al. (2024).