Theoretical Background
According to the WHO guidelines, an age-friendly city encourages active aging by optimizing opportunities for health, participation, and security in order to enhance quality of life (8). WHO has proposed 6 determinants of active aging in cities: (1) health and social services, (2) behavioral, (3) personal, (4) physical environment, (5) social, and (6) economic determinants (9). "Active aging" is understood as the desire and ability of older people to integrate physical activity into their daily routines and to engage in economically and socially productive activities (10).
Many methods exist for assessing the age-friendliness of urban spaces (6). Current methods of assessing older people's views of the built environment fall into 3 groups. First, observational audit tools typically aim to capture descriptive, objective data on specific street-level attributes, such as their presence and quality. Second, a well-established tradition of perceived-environment measures uses surveys to collect self-reported data (11, 12). Third, spatial qualitative methods use a more heterogeneous group of tools, comprising techniques such as photo-voice, walk-along interviews, and virtual reality experiments, as exemplified in a recent review of qualitative studies (11, 12).
The objective of this study was to develop a questionnaire for measuring age-friendly urban spaces according to older people's preferences and to determine its psychometric properties. The questionnaire was developed and validated in two phases (Figure 1).
The objective of the first phase was to develop the overall scheme of the questionnaire based on grounded theory (GT) and context characteristics. The items and phrases of the initial questionnaire were extracted and designed in three steps: (1) applying GT (qualitative research and extraction of appropriate phrases through content analysis), (2) conducting a desk study and extracting phrases from it, and (3) designing the initial questionnaire.
The objective of the second phase was to validate the questionnaire developed in phase 1 by assessing its psychometric characteristics: its validity, and its reliability through structural validity, split-half analysis, and Cronbach's α coefficient in SPSS 22. Validity was checked with 3 indicators (content, construct, and face validity), using the Waltz and Bausell content validity index and the Lawshe content validity ratio (13, 14). The study protocol was approved by the Ethics Board of Iran University of Medical Sciences.
Grounded Theory (GT) and Item Extraction
Purposeful sampling was used to achieve maximum variation in age, sex, literacy, physical and mental health status, and socioeconomic status. The participants in the GT study were selected from older people living in Tehran who were frequently present in their neighborhoods' community centers and whose local information was registered in the health department of the community center. The inclusion criteria were (1) being over 65 years of age, (2) being a local resident of the neighborhood, (3) being willing to participate in the study, and (4) providing consent.
The interviews were carried out with 54 older participants who were present in urban outdoor spaces 3-5 times a week. They were recruited in June and July 2018 from different public spaces, such as parks, streets, and squares, in neighborhoods of different socio-economic classes that have active community centres collecting older people's health information (Table 1). The interviews lasted 20 to 45 minutes, depending on the participant's level of interest and cooperation (Table 1).
Moreover, to enhance trustworthiness, a Focus Group Discussion (FGD) with 12 older people (7 women and 5 men) drawn from the interviewees was held in the City Council of District 10 of Tehran Municipality in August 2018.
Table 1- Socio-demographic characteristics of the interview participants.
| Feature | Participants (n = 54) |
| --- | --- |
| Age group | 65-75: 28; 75-85: 26 |
| Gender | Female: 29; Male: 25 |
| Education level | Undergraduate: 19; Graduate: 18; Postgraduate: 17 |
| General health perception (self-reported) | Good health: 24; Moderate health: 12; Poor health: 18 |
| Socio-economic status | Middle-high: 28; Poor-low: 26 |
During the semi-structured interviews (Appendix 1), the participants were asked the following questions: How would you like this place to be? What qualities should this place have so that you would want to spend more time in it? Subsequent questions were asked according to the participants' responses to these two initial open-ended questions. The data were analyzed using Strauss and Corbin's coding method, supervised by two experts from the research team (15). The last five interviews and the FGD were conducted after theoretical saturation had been reached, for greater certainty and validity.
The credibility of the data was assured through peer checking and member checking (16-18). Peer checks were conducted in weekly research team meetings, during which the emerging data were discussed, reviewed, and analyzed by the research group. Member checks were conducted by providing participants with a summary of the analyzed interviews and extracted codes, so that the research team could ask for and incorporate their feedback and corrections. In addition, the quality of public places was appraised through observational field studies that applied urban design techniques for assessing the qualities of public spaces, for instance Jan Gehl's toolbox (19). Confirmability was addressed by considering the opinions of other researchers, and transferability by fully describing all stages of the procedure (18).
Item Finalization
The relevant literature was reviewed to validate the extracted subcategories. In this process, all of the extracted codes were checked against similar concepts in the literature of this domain (Table 2).
The extracted items and the data gathered from the desk study were used to guide item development. The developed questionnaire consisted of three scales: place (functional dimension), place (preferred dimension), and process (environments). All items used in the questionnaire reflected the local experience of older people (Table 2).
The questionnaire was initially designed in the Persian language and then checked by two experts in Persian literature to assure cultural appropriateness. In addition, the questionnaire was piloted on a group of 18 older people, and modifications were made prior to the study.
Table 2- Items extracted from the GT study and the literature reviewed during phase 1.
| Domains/Categories | Scales/Subcategories |
| --- | --- |
| Personal characteristics (socioeconomic status) | Age |
| | Gender |
| | Marital status |
| | Occupation |
| Place (functional dimension) | Density |
| | Amenities (Access to services) |
| | Safety (Traffic) |
| | Aesthetics (design) |
| | Landscaping |
| | Comfort |
| | Environmental cleanness (Visual, air, noise, pollution) |
| Place (preferred dimension) | Security (Crime) |
| | Security (Fear of falling) |
| | Security (Fear of losing/wayfinding) |
| | Aesthetics (experienced environment) |
| Process (environments) | Social environment |
| | Cultural environment |
| | Sense of belonging |
| | Life satisfaction |
As an initial instrument, the frequency-of-use questionnaire was devised on a 5-point Likert-type scale (almost always, often, sometimes, seldom, and never) (Table 3). This scale was selected for its pivotal role in capturing older people's preferences in public spaces and its focus on dynamic interactions between people and the environment (20).
Table 3- Domains, scales, and item numbers of the questionnaire.
| Domains | Scales | Item numbers |
| --- | --- | --- |
| PF: Place (functional dimension) | Density | 9 |
| | Amenities (Access to services) | 10, 11, 12, 13, 14 |
| | Safety (Traffic) | 15, 16 |
| | Aesthetics (Objective) | 26, 27 |
| | Landscaping | 30, 31, 32 |
| | Comfort | 33, 34, 35, 36 |
| | Environmental cleanness (Visual, air, noise, pollution) | 37, 38 |
| PP: Place (preferred dimension) | Security (Crime) | 17, 18, 19 |
| | Security (Fear of falling) | 20, 21, 22, 23 |
| | Security (Fear of losing/wayfinding) | 24, 25 |
| | Aesthetics (Subjective) | 28, 29 |
| PE: Process (environments) | Social environment | 39, 40, 41, 42 |
| | Cultural environment | 43, 44 |
| | Sense of belonging | 45, 46, 47, 48 |
| | Life satisfaction | 49 |
Questionnaire Validation
After pilot testing and revision of the questionnaire, a second pilot test was run for initial validation with 42 older people who had participated in the qualitative phase. After validity and reliability had been considered, the final version of the questionnaire was given to the specified sample of 350 respondents in two neighborhoods.
Questionnaire Validity
Three types of validity were examined to investigate the validity of the questionnaire: content, face, and construct validity.
Content Validity
Lawshe's method was adopted for content validity analysis by calculating the Content Validity Ratio (CVR) (14). The questionnaire items were evaluated by a group of nine experts in landscape architecture, urban design, planning, and gerontology. The experts rated each item as essential, useful, or not necessary. The 3-point rating was then dichotomized into essential versus not essential (useful or not necessary). The revised critical values for Lawshe's CVR, based on the binomial probability distribution, were applied to exclude items rated as not necessary (21). A scale content validity index (S-CVI) was calculated for each scale by averaging the CVR of all retained items in the scale (22, 23). A CVI higher than 0.9 indicates excellent content validity at the scale level (22).
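The CVR and S-CVI calculation described above can be sketched as follows. This is a minimal illustration, not the authors' computation: the item names, the counts of "essential" ratings, and the cut-off passed as `critical_cvr` are all hypothetical, and in practice the cut-off would be taken from the revised critical-value table cited in the text (21).

```python
from statistics import mean


def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR = (n_e - N/2) / (N/2), where n_e experts rated the item essential."""
    half = n_experts / 2
    return (n_essential - half) / half


def scale_cvi(essential_counts: dict, n_experts: int, critical_cvr: float):
    """Keep items whose CVR reaches the critical value, then average their CVRs."""
    cvrs = {item: content_validity_ratio(n_e, n_experts)
            for item, n_e in essential_counts.items()}
    retained = {item: cvr for item, cvr in cvrs.items() if cvr >= critical_cvr}
    s_cvi = mean(retained.values()) if retained else float("nan")
    return retained, s_cvi


# Hypothetical data: number of the 9 experts who rated each item "essential".
essential_counts = {"item_10": 9, "item_11": 8, "item_12": 5}

# critical_cvr=0.7 is an illustrative cut-off, not the tabulated critical value.
retained, s_cvi = scale_cvi(essential_counts, n_experts=9, critical_cvr=0.7)
print(retained, round(s_cvi, 3))
```

With a panel of nine, an item rated essential by all experts has a CVR of 1, while an item rated essential by only five of them has a CVR close to 0 and would be dropped under any reasonable cut-off.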
Face Validity
Initially, 18 older people were asked whether any items of the questionnaire were ambiguous, and any such items were modified. In the quantitative phase, the impact score (frequency × importance) was evaluated by nine experts, who considered the difficulty, inappropriateness, and ambiguity of the phrases. Qualitative face validity was determined by a panel of three urban designers, three urban planners, two gerontologists, and one epidemiologist. These specialists evaluated the level of difficulty, inappropriateness, and ambiguity of the phrases, and their comments were incorporated into the questionnaire.
The impact score was calculated for each question to determine quantitative face validity (Equation 1) (24). For each of the 41 questions, a 5-point Likert scale was used to determine the impact score: strongly agree (score 5), agree (score 4), no idea (score 3), disagree (score 2), and strongly disagree (score 1). After the target group (the 12 FGD participants and 9 health experts) had completed the questionnaire, the face validity of each item was calculated with the impact score equation (Equation 1). Impact scores equal to or greater than 1.5 are considered appropriate (25).
Equation 1: Impact Score = Frequency (%) × Importance value
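A minimal sketch of Equation 1 is given below. It rests on two assumptions that the text does not spell out: frequency is taken as the proportion of raters who scored the item 4 or 5, and importance is taken as the mean rating of the item. The ratings themselves are hypothetical.

```python
def impact_score(ratings, high_threshold=4):
    """Impact score = frequency x importance for one item.

    Assumptions (not stated in the source): frequency is the proportion of
    raters scoring the item >= high_threshold, importance is the mean rating.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    frequency = sum(1 for r in ratings if r >= high_threshold) / len(ratings)
    importance = sum(ratings) / len(ratings)
    return frequency * importance


# Hypothetical ratings from 21 raters (12 FGD participants and 9 experts).
ratings = [5] * 10 + [4] * 6 + [3] * 3 + [2] * 2
score = impact_score(ratings)
print(round(score, 2), "appropriate" if score >= 1.5 else "revise")
```

Under these assumptions, an item that most raters score 4 or 5 clears the 1.5 threshold comfortably, whereas an item that nobody rates above 3 has an impact score of 0.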
Construct Validity
To examine the construct validity and internal consistency of the final questionnaire, a random sample of 350 older people (≥ 65 years old) from different public spaces in the selected district was invited to answer the questionnaire in August and September of 2018. Stratified random sampling was used to improve the representativeness of the sample. The population of older people was divided into nine neighborhoods with different public spaces, called sub-regions, and random samples were drawn from each of the public spaces (parks, community centres) in these sub-regions. The questionnaire took 30–40 minutes to complete. For construct validity, the Kaiser–Meyer–Olkin (KMO) value and Bartlett's test of sphericity were used to test the sampling adequacy and the strength of correlations among the scale items, respectively (26).
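To make the two statistics concrete, the sketch below computes the overall KMO value and Bartlett's chi-square statistic from an item correlation matrix using only the Python standard library. It is an illustration under stated assumptions: the 3 × 3 correlation matrix is hypothetical, the helper names are invented for this example, and the study itself would have obtained these values from statistical software rather than from code like this.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan elimination with partial pivoting; returns (inverse, determinant)."""
    n = len(matrix)
    a = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        p = a[col][col]
        det *= p
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def kmo(corr):
    """Overall KMO: squared correlations relative to squared correlations plus squared partials."""
    inv, _ = invert_with_det(corr)
    p = len(corr)
    r2 = a2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)


def bartlett_sphericity(corr, n_obs):
    """Bartlett's statistic -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Hypothetical correlation matrix for three items; n_obs matches the sample of 350.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
chi2, dof = bartlett_sphericity(corr, n_obs=350)
print(round(kmo(corr), 3), round(chi2, 2), dof)
```

The chi-square statistic is then compared with the chi-square distribution at the given degrees of freedom; a large value relative to that distribution suggests the items are correlated enough for factor analysis.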
We applied partial least squares (PLS) to test the conceptual model. PLS is useful in structural equation modeling for applied research projects, especially when the number of participants is limited and the data distribution is skewed (27). To measure validity in PLS, 3 indicators were adopted: Average Variance Extracted (AVE), Confirmatory Factor Analysis (CFA), and the Fornell and Larcker method (28). Fornell and Larcker introduced the AVE criterion in 1981 to measure convergent validity and proposed a critical value of 0.5; any value above 0.5 indicates acceptable convergence (28). The AVE criterion indicates the average variance shared between each construct and its indicators, and the higher the correlation, the greater the goodness-of-fit. Convergent validity was used as the main criterion for the goodness-of-fit of the measurement model in PLS.
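As a rough illustration of these criteria, the sketch below computes the AVE of each construct as the mean of its squared standardized loadings and then applies the usual form of the Fornell and Larcker comparison, in which the square root of each AVE should exceed the construct's correlations with the other constructs. The loadings and inter-construct correlations are hypothetical, and the function names are invented for this example.

```python
import math


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)


def fornell_larcker_ok(ave_by_construct, correlations):
    """True if sqrt(AVE) of both constructs exceeds every correlation between them.

    correlations maps (construct_a, construct_b) to their latent correlation.
    """
    for (a, b), r in correlations.items():
        if abs(r) >= math.sqrt(ave_by_construct[a]):
            return False
        if abs(r) >= math.sqrt(ave_by_construct[b]):
            return False
    return True


# Hypothetical standardized loadings for the three domains of the questionnaire.
loadings = {
    "PF": [0.72, 0.81, 0.68, 0.75],
    "PP": [0.80, 0.77, 0.70],
    "PE": [0.66, 0.74, 0.79],
}
ave = {name: average_variance_extracted(ls) for name, ls in loadings.items()}

# Hypothetical correlations between the latent constructs.
correlations = {("PF", "PP"): 0.52, ("PF", "PE"): 0.48, ("PP", "PE"): 0.55}

convergent_ok = all(value > 0.5 for value in ave.values())
print(ave, convergent_ok, fornell_larcker_ok(ave, correlations))
```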
Questionnaire Reliability
We evaluated the reliability of the questionnaire through internal consistency, split-half reliability, composite reliability (CR), and item reliability.
Split-Half & Internal Consistency
The split-half method is an improved approach for cases in which it may not be possible to administer the same test twice or to obtain an equivalent form of the test, especially among older adults (29). The items of a test are divided into two matched halves, and the scores on the first and second halves are then calculated (30). The split-half method cannot be applied to heterogeneous questionnaires, because dividing such a questionnaire does not yield equivalent forms. In that situation, one may repeat questions throughout the questionnaire while keeping only the original question in the final form (30).
In this study, the measuring instrument was divided into odd-numbered and even-numbered items, the correlation coefficient between the scores of the two halves was calculated, and the reliability coefficient was obtained from Equation 2. Coefficient α represents the average of all possible split-half estimates.
Equation 2: Reliability coefficient = (Correlation Coefficient * 2) ⁄ (Correlation Coefficient + 1)
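The odd-even split and Equation 2 can be sketched as follows, together with coefficient α for comparison. The response matrix is hypothetical (five respondents, four items), and the function names are chosen for this example only; the study reports running these analyses in SPSS.

```python
import math
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x)
    sy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sx * sy)


def split_half_reliability(responses):
    """Correlate odd- and even-numbered item totals, then apply Equation 2."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r = pearson(odd, even)
    return 2 * r / (1 + r)


def cronbach_alpha(responses):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical Likert responses: one row per respondent, one column per item.
responses = [
    [5, 4, 5, 4],
    [4, 4, 3, 4],
    [3, 2, 3, 3],
    [2, 2, 1, 2],
    [1, 1, 2, 1],
]
print(round(split_half_reliability(responses), 3), round(cronbach_alpha(responses), 3))
```

Equation 2 corrects the half-test correlation upward because each half contains only half of the items, so the uncorrected correlation understates the reliability of the full-length questionnaire.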
Composite Reliability (CR)
In addition to coefficient α, a more recent PLS criterion named "composite reliability", introduced in 1974 (31), was applied. Here, reliability is measured according to the correlations among the indicators of a construct, not in an absolute sense. Accordingly, both criteria were applied to measure reliability in PLS more accurately. A CR value higher than 0.7 for each construct indicates appropriate internal consistency of the measurement model (32).
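A brief sketch of the CR calculation follows. It assumes standardized loadings, so that the error variance of each indicator is one minus its squared loading; the loadings are hypothetical.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Assumes standardized loadings, so each error variance is 1 - loading^2.
    """
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total * total / (total * total + error)


# Hypothetical standardized loadings for one construct.
pf_loadings = [0.72, 0.81, 0.68, 0.75]
cr = composite_reliability(pf_loadings)
print(round(cr, 3), "acceptable" if cr > 0.7 else "below threshold")
```

Unlike coefficient α, which weights all items equally, CR weights each indicator by its loading, which is why the two are often reported side by side.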
Item Reliability (Factor Loading)
Factor loadings are calculated in PLS by analyzing the correlations between a construct and its indicators. A value ≥0.5 indicates that the variance shared between the construct and its indicators is greater than the measurement error variance and that the reliability of the measurement model is acceptable (33).