4.1 Data description
The data were taken from the 2019 wave of the China Household Finance Survey (CHFS), a nationwide survey conducted by the China Household Finance Survey and Research Center of the Southwestern University of Finance and Economics (SWUFE). Samples were selected nationwide using the probability proportional to size (PPS) sampling method. The survey covers 29 provinces and 345 counties, including 1,054 urban and rural communities. The CHFS has a low refusal rate, and the demographic characteristics of its sample are very close to those of the national census. The data are therefore reliable and representative.
The questionnaire covers both families (including individual members) and communities. The family questionnaire collects the basic demographic characteristics, assets and liabilities, income and consumption, insurance and security, employment, and subjective attitudes of urban and rural families, which together reflect the basic situation of each family. The community questionnaire covers the basic situation of the community, including politics, economy, culture, public security, and environmental protection.
Specifically, for household assets, the questionnaire covers not only business assets, houses, cars, and other non-financial assets but also detailed information on the household’s financial assets, including cash, current deposits, time deposits, stocks, bonds, funds, financial products and derivatives, renminbi assets, precious metals, loans, and other financial assets. In terms of household liabilities, the survey covers each category of household assets (agriculture, business, housing, vehicles, non-financial assets, financial assets) and the household's liabilities. This information supports our study of rural household investment in online products. Before the analysis, observations with missing key variables were removed, leaving 7,165 valid observations; this also limits the influence of extreme values on the estimated results.
Table 1 shows the characteristics of the sample. Of the respondents, 48.09% are female. The average age is 48.46 years; 2487 respondents are between 46 and 60 years old, accounting for 34.71%. About 35% of the respondents have 9 years of schooling or fewer, more than half (55.84%) have 10–12 years of schooling (high school), and fewer than 10% have more than 12 years. About 20.52% of respondents reported a household annual income below 30,000 yuan, and 8.68% reported an annual income above 90,000 yuan.
Table 1
Demographic profiles of the respondents
| Category | Frequency | Proportion |
|---|---|---|
| Gender | | |
| Male | 3719 | 51.91% |
| Female | 3446 | 48.09% |
| Age (years) | | |
| < 25 | 396 | 5.52% |
| 25–35 | 825 | 11.51% |
| 36–45 | 2158 | 30.12% |
| 46–60 | 2487 | 34.71% |
| > 60 | 1299 | 18.12% |
| Schooling years | | |
| ≤ 9 years | 2518 | 35.14% |
| 10–12 years | 4001 | 55.84% |
| > 12 years | 646 | 9.02% |
| Household annual income (thousand yuan) | | |
| < 30 | 1470 | 20.52% |
| 30–60 | 3888 | 54.27% |
| 60–90 | 1185 | 16.53% |
| > 90 | 622 | 8.68% |
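As a quick arithmetic check on Table 1, the shares can be recomputed from the reported frequencies. The sketch below does this for the age groups only; the dictionary names are illustrative, and the frequencies and reported percentages are copied from the table. Recomputed shares may differ from the reported ones in the last digit because of rounding.

```python
# Age-group frequencies and reported shares, copied from Table 1.
age_frequencies = {"< 25": 396, "25–35": 825, "36–45": 2158, "46–60": 2487, "> 60": 1299}
reported_shares = {"< 25": 5.52, "25–35": 11.51, "36–45": 30.12, "46–60": 34.71, "> 60": 18.12}

# The group frequencies should add up to the effective sample size.
total = sum(age_frequencies.values())

# Share of each group in percent, rounded to two decimals.
recomputed_shares = {
    group: round(100 * count / total, 2) for group, count in age_frequencies.items()
}

for group in age_frequencies:
    print(group, recomputed_shares[group], reported_shares[group])
```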
4.2 The identification of neighborhood effects
Identifying neighborhood effects poses several challenges. First, it is difficult to define an appropriate ‘neighbor group’ (Chen, Jin, and Yue 2010; Kim 2016)[49];[32]. Second, identification faces three common concerns (Manski 2013; Atefi and Pourmasoudi 2019)[50–51]: (1) The contextual effect, which means that the behavior of an individual varies with the exogenous characteristics of the neighbors. (2) The correlated effect. One type of correlated effect arises when households select neighbors according to their preferences and backgrounds (the ‘self-selection problem’). Rural residents may make similar online investment decisions simply because local house prices attract people with similar income levels to live together, or because they are affected by a common political factor. (3) Simultaneity, also termed the ‘reflection problem’, which means that individuals and their peer group influence each other simultaneously. Personal behavior and neighbors’ behaviors are mutually causal, which creates an endogeneity problem when determining the extent to which peer choices drive individual choices.
To accurately identify the neighborhood effects studied in this paper, we take the following measures in the empirical analysis. First, we include the neighbor characteristics and the village characteristics of the sample as control variables, which helps control for the contextual effect and the self-selection problem. Second, we choose ‘Neighborhood children aged 3–6 years old’ and ‘Neighborhood serious illness suffering’ as instrumental variables and estimate an IV-probit model. Finally, we verify the robustness of the neighborhood effects by using substitute variables, excluding high-income neighbors, constructing false covariates, and running placebo tests.
4.3 Model and variables
Based on the identification strategy above and the characteristics of the data, we construct the following Probit model to identify the neighborhood effects of online financial investment in rural areas:
$$\Pr\left({ofi}_{i}^{C}=1\right)=\Phi \left({\alpha }_{0}+{\alpha }_{1}{Nofi}_{-i}^{C}+{\alpha }_{2}{X}_{i}+{\alpha }_{3}{Y}_{-i}^{C}+{\alpha }_{4}{Z}^{C}+ProDummy\right) \tag{1}$$
where \(\Phi (\cdot )\) is the standard normal cumulative distribution function and \({ofi}_{i}^{C}\) is an indicator for the online financial investment of rural household i in village C (1 = yes; 0 = no). It is taken directly from the questionnaire item ‘Whether your family purchased any online financial investment products?’.
The core explanatory variable is \({Nofi}_{-i}^{C}\) (i.e. the neighborhood effect), which is the average online financial investment among the neighbors of the focal household i. The size and significance of the coefficient \({\alpha }_{1}\) are the focus of this paper. Measuring the core explanatory variable requires defining the scope of the ‘neighborhood’. Following the relevant literature on China, this study defines the ‘neighborhood’ as all the rural households living in the same administrative village. As mentioned above, the household registration (‘Hukou’) system and urbanisation result in little population mobility, so rural residents rarely choose their neighbors according to their preferences. Besides, villagers, particularly in less-developed places, maintain stable and close social relationships by living together for generations (Liu, Sun, and Zhao 2014; Loh and Li 2013)[52–53]. Thus, we calculate the neighborhood effects using the following equation:
$${Nofi}_{-i}^{C}=\left({\sum }_{j=1}^{n}{ofi}_{j}^{C}-{ofi}_{i}^{C}\right)/(n-1) \tag{2}$$
Equation (2) is the measure of neighborhood behavior in our paper. Neighbors’ influence should not contain the effects from one’s own family, so the focal household i is excluded. n is the number of sample families in the village.
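The leave-one-out average in Equation (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `leave_one_out_means`, the village labels, and the toy data are hypothetical, and villages with a single sampled household are given `None` because the neighbor average is undefined when n = 1.

```python
from collections import defaultdict

def leave_one_out_means(households):
    """For each (village_id, value) pair, return the mean of `value`
    over all OTHER households in the same village, as in Equation (2)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for village, value in households:
        totals[village] += value
        counts[village] += 1

    means = []
    for village, value in households:
        n = counts[village]
        if n < 2:
            means.append(None)  # no neighbors in the sample
        else:
            # (village sum minus own value) divided by (n - 1)
            means.append((totals[village] - value) / (n - 1))
    return means

# Hypothetical toy data: (village, online investment dummy)
sample = [("A", 1), ("A", 0), ("A", 0), ("B", 1), ("B", 1), ("C", 0)]
neighbor_share = leave_one_out_means(sample)
```

The same function applies to the neighbor background variables, since they are described as averages over all families in the village except the focal one.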
In addition, to accurately identify the neighborhood effects of online financial investment, we introduce three categories of control variables into the empirical model.
\({X}_{i}\) denotes the first category of control variables, a vector of exogenous characteristics of the respondent or his/her family, including gender, age, education, income, assets, liabilities, stock market participation, third-party payment balance, and consumption expenditure.
\({Y}_{-i}^{C}\) represents the second category of control variables, a vector of neighbors’ background variables. To partially deal with the contextual effect issue, we add neighbors’ age, education, income, assets, liabilities, and online consumption to the baseline regression. The calculation method is similar to formula (2), i.e. the average value of all families except the focal family in the same village.
We address the correlated effect issue in several ways, for example by including province fixed effects (ProDummy in Equation (1)) and controlling for village-level variables (\({Z}^{C}\)), including ‘the economic condition’, ‘the greening situation of the village’, and ‘the aging rate’.
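To make the mapping in Equation (1) concrete, the following sketch evaluates a probit probability and the marginal effect of the neighborhood variable at a given point. It is only an illustration: the function names and every numeric value are hypothetical assumptions, not estimates from this paper, and the control terms are collapsed into generic lists.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF, computed with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def probit_index(alpha0, alpha1, nofi, control_coefs, controls, province_effect=0.0):
    """Linear index inside the probit link of Equation (1)."""
    controls_part = sum(c * v for c, v in zip(control_coefs, controls))
    return alpha0 + alpha1 * nofi + controls_part + province_effect

def probit_probability(index):
    """Probability that the household invests online, given the index."""
    return std_normal_cdf(index)

def marginal_effect_of_nofi(index, alpha1):
    """Derivative of the probability with respect to the neighbor share."""
    return std_normal_pdf(index) * alpha1

# Hypothetical coefficients and covariate values, for illustration only.
idx = probit_index(alpha0=-1.0, alpha1=1.5, nofi=0.19,
                   control_coefs=[0.02, -0.01], controls=[10.0, 48.0])
print(probit_probability(idx), marginal_effect_of_nofi(idx, alpha1=1.5))
```

Because the link is nonlinear, the marginal effect of the neighbor share depends on where the index is evaluated, which is why probit results are usually reported as marginal effects rather than raw coefficients.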
Moreover, we introduce mediating variables using the unary multiple mediation model to explore how neighborhood effects work. The mediating variables are financial knowledge and risk tolerance.
First, we measured financial knowledge using a combination of four questions[1]. These questions cover the respondent's financial calculation ability, knowledge about inflation and risk, and knowledge of financial markets. The validity and reliability of the resulting scale were then tested based on the responses to these questions. The Cronbach's alpha of the scale was 0.865, the KMO value was 0.786, and Bartlett's test was significant. In addition, the rotated component loadings were all greater than 0.682. The scale measuring financial knowledge is therefore valid and reliable. Subsequently, the financial literacy indicator (\({fl}_{i}^{C}\)) was calculated using factor analysis.
Second, we use risk-taking, i.e. the risk a rural household is willing to take, as another mediating variable. We measure risk-taking with the following question on investment choices: “Assume you have some assets to invest in, which type of project would you choose? 1 = Unwilling to take any risk; 2 = Slightly below-average risk, slightly below-average return; 3 = Average risk, average return; 4 = Slightly above-average risk, slightly above-average return; 5 = High risk, high return”. Responses to this question reflect the respondent's risk tolerance, with a larger value indicating higher risk tolerance. This variable is denoted by \({riskt}_{i}^{C}\).
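For readers unfamiliar with the reliability statistic reported for the financial knowledge scale, Cronbach's alpha for k items is \(\alpha =\frac{k}{k-1}\left(1-\frac{\sum_{j}{s}_{j}^{2}}{{s}_{T}^{2}}\right)\), where \({s}_{j}^{2}\) is the variance of item j and \({s}_{T}^{2}\) is the variance of the summed score. The sketch below implements this textbook formula; the function name and the item scores are hypothetical and are not the CHFS responses.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of k lists; each inner list holds one item's
    scores for the same n respondents, in the same order.
    """
    k = len(items)
    n = len(items[0])
    # Summed scale score for each respondent.
    totals = [sum(item[r] for item in items) for r in range(n)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores on four questions for six respondents.
scores = [
    [0, 1, 2, 2, 1, 0],
    [0, 1, 2, 1, 1, 0],
    [1, 1, 2, 2, 0, 0],
    [0, 2, 2, 2, 1, 1],
]
print(cronbach_alpha(scores))
```

Values closer to 1 indicate that the items move together, which is the sense in which the reported alpha of 0.865 supports the internal consistency of the scale.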
Even though the methods above mitigate many identification challenges, an endogeneity threat stemming from simultaneity remains (Bertrand, Luttmer, and Mullainathan 2000; Chen et al. 2008)[55–56]. We implement an IV strategy to address the reflection problem (Angrist 2014)[57]. Following Gaviria and Raphael (2001) and Ling et al. (2018)[58–59], we select two characteristics of neighbors as instruments: ‘Neighborhood children aged 3–6 years old’ and ‘Neighborhood serious illness suffering’. Children are one of the most vulnerable groups, and more children mean higher risk exposure. Moreover, more preschool-aged children (usually 3 to 6 years old in China) bring higher education expenditure, since they have to attend kindergarten. Households with more such children therefore have less demand for online financial investment. Health risk is a common risk faced by rural households, and medical expenditure is a large part of rural household spending. If a family member suffers a serious illness, the household increases its spending on medical treatment and its liquidity holdings against future health risks. This squeezes out financial asset allocation and may decrease the likelihood of purchasing online investment products. These two instrumental variables may thus strongly influence the neighbors’ demand for online investment products without directly affecting the focal household’s propensity to invest online. Besides, in rural China, the birth of children is less susceptible to human intervention, because young couples usually have children naturally after marriage and seldom control the timing of childbirth. The same holds for the health condition of family members. Theoretically, the requirements for IVs are satisfied.
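The logic of the IV strategy can be illustrated with a deliberately simplified linear example. This is not the IV-probit estimator used in the paper: it is a simulated, single-instrument linear model in which an unobserved shock affects both the neighbors' behavior x and the household's outcome y, while the instrument z shifts x only. All names, parameter values, and the data-generating process are hypothetical assumptions.

```python
import random

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, beta=0.5, seed=0):
    """Simulate an instrument z, an endogenous regressor x, and an outcome y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)           # instrument: affects y only through x
        shock = rng.gauss(0, 1)        # unobserved common shock
        xi = 0.8 * zi + shock + rng.gauss(0, 1)
        yi = beta * xi + shock + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

def ols_slope(x, y):
    """Slope from regressing y on x; biased when x is endogenous."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(z, x, y):
    """Single-instrument IV (Wald) estimator of the slope."""
    return covariance(z, y) / covariance(z, x)

z, x, y = simulate()
print(ols_slope(x, y), iv_slope(z, x, y))
```

In this simulation the common shock pushes the OLS slope above the true value of 0.5, while the IV estimate stays close to it, because the instrument is constructed to be unrelated to the shock. The credibility of the paper's instruments rests on the analogous exclusion argument made in the text.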
More descriptions of the variables are shown in Table 2.
Table 2
| Category | Variable | Abbreviation | Item |
|---|---|---|---|
| Dependent variable | Online financial investment | \({ofi}_{i}^{C}\) | Whether your family purchased any online financial investment products? Dummy (1 = yes; 0 = no) |
| Explanatory variable | Neighborhood online financial investment | \({Nofi}_{-i}^{C}\) | Average online financial investment in neighbors’ households (range: 0–1) |
| Instrumental variables | Neighborhood children aged 3–6 years old | \({Nchild}_{-i}^{C}\) | The average number of children aged 3 to 6 years old in neighbors’ households (number) |
| | Neighborhood serious illness suffering | \({Nillness}_{-i}^{C}\) | The proportion of neighbors whose family members have suffered a serious illness |
| Mediators | Financial knowledge | \({fl}_{i}^{C}\) | Calculated by factor analysis based on 4 financial knowledge questions (index) |
| | Risk-taking | \({riskt}_{i}^{C}\) | Assume you have some assets to invest in, which type of project would you choose? (1 = Unwilling to take any risk; 2 = Slightly below-average risk, slightly below-average return; 3 = Average risk, average return; 4 = Slightly above-average risk, slightly above-average return; 5 = High risk, high return) |
| Rural household characteristics | Household payment | \({tpay}_{i}^{C}\) | Annual household payment (10,000 yuan) |
| | Household income | \({tincome}_{i}^{C}\) | Annual household income (10,000 yuan) |
| | Household head age | \({age}_{i}^{C}\) | Age (years) |
| | Gender | \({sex}_{i}^{C}\) | Female = 0, male = 1 |
| | Education | \({edu}_{i}^{C}\) | Number of years of schooling (years) |
| | Total assets | \({asset}_{i}^{C}\) | The value of your household assets (10,000 yuan) |
| | Total debt | \({debt}_{i}^{C}\) | The value of your household debt (10,000 yuan) |
| | Stock purchase | \({stock}_{i}^{C}\) | Whether your family purchased any stock? Dummy (1 = yes; 0 = no) |
| Neighborhood characteristics | Neighborhood gender | \({Nage}_{-i}^{C}\) | The average gender of the neighboring respondents, i.e. the proportion of male respondents in the neighborhood (range: 0–1) |
| | Neighborhood education | \({Nedu}_{-i}^{C}\) | The average number of years of schooling of the neighboring respondents (years) |
| | Neighborhood payment | \({Ntpay}_{-i}^{C}\) | The average household payment of the neighboring respondents (10,000 yuan) |
| | Neighborhood income | \({Nincom}_{-i}^{C}\) | The average household income of the neighbors’ families (10,000 yuan) |
| | Neighborhood debt | \({Ndebt}_{-i}^{C}\) | The average household debt of the neighbors’ families (10,000 yuan) |
| | Neighborhood assets | \({Nasset}_{-i}^{C}\) | The average household assets of the neighbors’ families (10,000 yuan) |
| Village characteristics | The economic condition | \({econd}^{C}\) | How is the village economy developing? (range from 1 to 10; 1 is the worst; 10 is the best) |
| | The greening situation | \({green}^{C}\) | How green is the village? (range from 1 to 10; 1 is the worst; 10 is the best) |
| | The aging rate | \({oldrate}^{C}\) | The proportion of the elderly over the age of 70 in the village (%) |
4.4 Descriptive statistics
The descriptive statistics of the variables used in this empirical study are shown in Table 3:
Table 3
Descriptive statistical results
| Variable | Mean | Standard deviation | Minimum | Maximum | Obs |
|---|---|---|---|---|---|
| \({ofpp}_{i}^{C}\) | 0.19 | 0.38 | 0 | 1 | 7165 |
| \({Nofpp}_{-i}^{C}\) | 0.19 | 0.14 | 0 | 1 | 7165 |
| \({Nchild}_{-i}^{C}\) | 1.56 | 0.47 | 0 | 3 | 7165 |
| \({Nillness}_{-i}^{C}\) | 0.15 | 0.59 | 0 | 1 | 7165 |
| \({fl}_{i}^{C}\) | 0.01 | 0.86 | -1.168 | 1.478 | 7165 |
| \({riskt}_{i}^{C}\) | 2.11 | 2.12 | 1 | 5 | 7165 |
| \({sex}_{i}^{C}\) | 0.48 | 0.50 | 0 | 1 | 7165 |
| \({age}_{i}^{C}\) | 48.46 | 14.1 | 20 | 75 | 7165 |
| \({edu}_{i}^{C}\) | 10.88 | 3.91 | 0 | 22 | 7165 |
| \({income}_{i}^{C}\) | 4.15 | 10.45 | 0.10 | 48.81 | 7165 |
| \({asset}_{i}^{C}\) | 50.28 | 207.90 | 2.10 | 897.30 | 7165 |
| \({tdebt}_{i}^{C}\) | 8.96 | 59.36 | 0 | 406 | 7165 |
| \({stock}_{i}^{C}\) | 0.11 | 0.31 | 0 | 1 | 7165 |
| \({consumption}_{i}^{C}\) | 11.75 | 12.29 | 1.64 | 174.8 | 7165 |
| \({Nage}_{-i}^{C}\) | 14.10 | 3.92 | 0 | 30.43 | 7165 |
| \({Nedu}_{-i}^{C}\) | 3.25 | 1.23 | 0 | 8.543 | 7165 |
| \({Nincome}_{-i}^{C}\) | 4.18 | 2.03 | 0.57 | 12.03 | 7165 |
| \({Ntasset}_{-i}^{C}\) | 50.31 | 42.43 | 8.71 | 201.10 | 7165 |
| \({Ndebt}_{-i}^{C}\) | 8.78 | 5.81 | 0 | 94.18 | 7165 |
| \({econd}^{C}\) | 5.31 | 0.28 | 1 | 10 | 7165 |
| \({green}^{C}\) | 3.94 | 1.89 | 1 | 10 | 7165 |
| \({oldrate}^{C}\) | 0.37 | 0.10 | 0 | 0.656 | 7165 |
[1] The questions are: (1) "Suppose the bank's annual interest rate is 4%. If you deposit $100 in a bank for a fixed term of 1 year, the principal and interest you will get after 1 year will be: 1. less than $104, 2. equal to $104, 3. greater than $104, 4. impossible to calculate"; (2) "Suppose the bank's interest rate is 5% per year and the inflation rate is 8% per year. After saving $100 in the bank for a year, what you can buy will be: 1. more than a year ago, 2. as much as a year ago, 3. less than a year ago, 4. impossible to calculate"; (3) "Which do you think is riskier in general, equity or debt funds? 1. equity funds, 2. debt funds, 3. never heard of equity funds, 4. never heard of debt funds, 5. never heard of either, 6. the same"; (4) "Which do you think is riskier in general, Main Board stocks or GEM stocks? 1. Main Board, 2. GEM, 3. never heard of Main Board stocks, 4. never heard of GEM stocks, 5. never heard of either, 6. the same". If the answer to a question is "impossible to calculate" or "never heard of either", the value is 0. If the answer is incorrect, the value is 1.