Research Design
This research used a survey design, which describes empathy quantitatively, that is, in numerical terms [30]. The research proceeded by collecting, classifying, and analyzing the data, drawing conclusions, and reporting the results in order to describe the quality of the I-BES instrument objectively.
A survey is a quantitative research technique in which the researcher interviews a sample or an entire population to determine the population's attitudes, beliefs, behaviors, or characteristics. For this study, a random sample was used, drawn from the population of Lampung, Indonesia. Lampung was chosen because the philosophical values of Lampung culture demonstrate an openness to diversity that supports the development of empathy. The social philosophy of Nengah Nyappur is rooted in the cultural attitudes of the Lampung ethnic group. It can therefore help avoid discriminatory social attitudes that could cause unproductive social rifts and undermine community unity (integration) [31].
Rasch Model
In the 1960s, Georg Rasch developed an analytical method within item response theory (IRT) known as the one-parameter logistic (1PL) model [32], [33]. Ben Wright later popularized this mathematical model [34]. Rasch's model relates persons and items using raw dichotomous data (true and false) that represent students' abilities [33].
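For reference, the dichotomous Rasch (1PL) model is commonly written as below. The notation is an assumption introduced here, not taken from the cited sources: \theta_n is the ability of person n, b_i is the difficulty of item i, and X_{ni} is the scored response (1 = true, 0 = false).

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)},
\qquad
\ln \frac{P(X_{ni} = 1)}{P(X_{ni} = 0)} = \theta_n - b_i
```

The log-odds form on the right shows why person ability and item difficulty can be expressed on one common logit scale: only their difference enters the model.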
In the social sciences, data are usually obtained as raw scores that represent attitudes and opinions about the statements or questions in a given instrument [35]. The instrument is designed around clearly defined variables: the relevant constructs are identified, and items are developed to measure the variables in question. The response options offered generally follow the scoring pattern of classical test theory (CTT). From the perspective of the Rasch model, this established scoring pattern yields a measurement whose results depend on who is being measured (test-dependent scoring), whereas quantitative research in the social sciences aims at objective measurement [36], [37].
According to Mok and Wright [32], objective measurement in the social sciences must meet five criteria: 1. providing a linear measure with equal intervals; 2. conducting an appropriate estimation procedure; 3. finding inappropriate items (misfits) or unusual ones (outliers); 4. overcoming data loss; 5. generating replicable measurements (independent of the parameters studied). Only the Rasch model can satisfy all five conditions. In other words, measurements in the social sciences made with the Rasch model will be of the same quality as measurements in the field of physics [32], [33], [38].
Sample and Data Collection
The participants were guidance and counseling students of STKIP PGRI Bandar Lampung, one hundred and two students in total. Forty-eight students belonged to the 2018-2019 cohort, and fifty-four began their studies in 2020-2021. The cohorts were distinguished with reference to the COVID-19 pandemic, which emerged at the end of 2019. The resulting restrictions on social interaction affected education and required students to attend lectures online once the policy was enacted in mid-2020, around June-July. Students in the 2018-2019 cohort had therefore still attended lectures offline and interacted extensively with their peers.
Meanwhile, students in the 2020-2021 cohort did not attend intensive offline lectures because no policy allowed lectures to be held at 100% classroom capacity. Participants were also categorized by gender: eighty-five female students and seventeen male students. The distribution of participants is detailed in Table 1.
Table 1. Demographic Characteristics of the Respondents
| No | Demographic Characteristics | Respondents: Total | Respondents: % |
| 1 | Gender |  |  |
|  | Male | 17 | 16.83 |
|  | Female | 84 | 83.17 |
| 2 | Region |  |  |
|  | Bandar Lampung | 46 | 45.54 |
|  | Pringsewu | 14 | 13.86 |
|  | Tanggamus | 8 | 7.92 |
|  | Pesawaran | 13 | 12.87 |
|  | Other (Outside Lampung Province) | 20 | 19.80 |
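The percentages in Table 1 can be reproduced from the tabulated counts. The following is a minimal sketch; the variable and function names are illustrative and not part of the study. Both breakdowns sum to 101 respondents, which is the base that the tabulated percentages imply.

```python
# Counts copied from Table 1; names are illustrative assumptions.
gender = {"Male": 17, "Female": 84}
region = {
    "Bandar Lampung": 46,
    "Pringsewu": 14,
    "Tanggamus": 8,
    "Pesawaran": 13,
    "Other (Outside Lampung Province)": 20,
}


def percentages(counts):
    """Return each category's share of the column total, rounded to two decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


gender_pct = percentages(gender)
region_pct = percentages(region)

for label, pct in {**gender_pct, **region_pct}.items():
    print(f"{label}: {pct:.2f}%")
```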
The Basic Empathy Scale in Adult (BES-A)
The Basic Empathy Scale in Adults (BES-A) consists of items derived from the Basic Empathy Scale (BES) [39]. BES-A was developed by Carré et al. (2013) as a revision that addresses the shortcomings of BES, so that it is more comprehensive than the previous version. BES has forty items, whereas BES-A consists of twenty items aimed at adults. Developmentally, the average university student is already an adult, so this scale should be able to measure students' empathy comprehensively.
BES is based on Cohen and Strayer's (1996) definition of empathy as "sharing and understanding the emotional state or context of others as a result of experiencing the emotional state (affective) and understanding the emotions (cognitive) of others." The BES assesses four basic emotions (fear, sorrow, anger, and joy) and measures cognitive and affective empathy more broadly than nonspecific emotional states (e.g., anxiety). In the forty-item scale, half of the items are reverse-worded: twenty items require a positive answer and the other twenty require a negative response. A shortened twenty-item version is also available for use with adults in France. However, several studies have found that the BES, as a two-factor scale, does not reflect newer conceptions of empathy, according to which empathy depends on three components [40], [41].
Modification of BES-A and Transformation to Indonesia Basic Empathy Scale (I-BES)
To develop a comprehensive empathy instrument, and following the recommendations of several previous studies [40], [42], [43], the Indonesia Basic Empathy Scale (I-BES) is built on three main components of empathy: emotional contagion (affective empathy), cognitive empathy, and emotional disconnection. Emotional contagion is a situation in which a person automatically adjusts his or her emotions to the emotions or states of others. Cognitive empathy refers to a person's ability to understand and articulate emotions in response to other people's emotions or situations. Emotional disconnection is a regulatory component that involves self-protection from the anguish, suffering, and emotional impact caused by other people's emotions or situations.
I-BES was analyzed and evaluated for validity and reliability. Since the research was conducted in Indonesia, the items were translated into Indonesian and adapted to the context of the participants. The language change was made to avoid construct-irrelevant variance caused by the participants' language proficiency. The translators included a psychologist, a career counselor, and an English language expert. To adapt the instrument comprehensively to Indonesian culture, the translation sought equivalence of language, concepts, and metrics, taking into account the linguistic expressions and language culture of the local community under study. The translation process involved forward translation, an expert panel, back translation, pre-testing and cognitive interviewing, and final revision.
The instrument initially contained twenty items and was reduced to fifteen based on empirical test results. The fifteen-item version was obtained by testing unidimensionality and by analyzing item fit and item difficulty using the procedures and assumptions of the Rasch model; items that did not meet these criteria were eliminated. The instrument uses a 4-point Likert scale presented as statements, giving sixty statements in total. For each item, participants were asked to select the one of four statements that applied to them. A summary of the correspondence between I-BES and the item taxonomy is provided in Table 2.
Table 2. Summary of the I-BES
| Construct | Control Taxonomy | Number of Items | Item Number |
| Basic Empathy | Emotional Contagion | 5 | 1-5 |
|  | Cognitive Empathy | 5 | 6-10 |
|  | Emotional Disconnection | 5 | 11-15 |
Data Coding
Participants' responses to the items in I-BES were coded using a predetermined Likert scale. The statement describing the most positive behavior received a score of 4, and the statement describing the most negative behavior received a score of 1.
Data Analysis Procedure
The Rasch model was used to analyze the I-BES instrument. WINSTEPS [44], an application for Rasch analysis, was used for the analysis. The Rasch model can be used to determine whether the responses given by participants correspond to their abilities [45]. It estimates person ability and item difficulty by placing the person and item parameters on the same logit scale. The association between the parameters must be linear to satisfy the Rasch criterion of relative invariance [22], [46]–[49].
The term "item model fit" refers to how well an item fits the model. The item mean-square fit statistic (MNSQ) was used to evaluate the extent to which the items reflect the underlying concept [22], [35], [37], [49].
The infit and outfit MNSQ values are used to measure an item's fit to the model. The infit value is sensitive to abnormal response patterns on items that are targeted at the respondents' ability. The outfit value is sensitive to abnormal response patterns regardless of the respondents' ability level [50]. Ideally, the MNSQ value, which indicates that the item is consistent with the Rasch model, is 1.0, and researchers generally accept MNSQ values ranging from 0.6 to 1.4 [34], [50], [51]. An item is considered a misfit if its MNSQ value falls outside this range. A misfit item does not measure what it is supposed to measure [34].
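To make the two statistics concrete, the sketch below computes infit and outfit mean-squares for a single item under the dichotomous Rasch model. This is a simplification for illustration only: I-BES uses a 4-point scale, for which the expected score and variance of the polytomous model take the place of p and p(1 - p). The function names and the assumption that ability and difficulty estimates are already available are hypothetical and do not describe how WINSTEPS is implemented.

```python
import math


def rasch_probability(theta, b):
    """Probability of a positive (1) response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean-squares for one item.

    responses: 0/1 answers of each person to the item
    thetas:    person ability estimates in logits (assumed already estimated)
    b:         item difficulty in logits (assumed already estimated)
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: unweighted mean of squared standardized residuals,
    # so responses from persons far from the item weigh heavily.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square, dominated by persons
    # whose ability is close to the item difficulty.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# One unexpected answer from a high-ability person (theta = 3) on an item of difficulty 0.
infit, outfit = item_fit([1, 0, 1, 0, 0], [0.0, 0.0, 0.0, 0.0, 3.0], 0.0)
print(f"infit = {infit:.2f}, outfit = {outfit:.2f}")
```

In this toy case the single off-target surprise inflates the outfit value much more than the infit value, which is the distinction drawn in the paragraph above.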
Rating scale analysis was used to assess how accurately the scale functions. I-BES is suitable for rating scale analysis because it uses a 4-point Likert scale that assumes the same thresholds for every item, a common characteristic of scales that measure personality or attitudes [52]. The analysis was conducted to determine whether the relative difficulties of the steps within the items were constant and whether the psychological distances between the statement choices, and their positions, were the same [53].
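The shared-threshold assumption can be illustrated with the category probabilities of the Andrich rating scale model. The following is a minimal sketch, not the WINSTEPS implementation; the function name, argument names, and threshold values are hypothetical and chosen only for illustration.

```python
import math


def rsm_category_probabilities(theta, delta, thresholds):
    """Category probabilities under the Andrich rating scale model.

    theta:      person measure in logits
    delta:      item difficulty in logits
    thresholds: Andrich thresholds shared by all items
                (one fewer than the number of categories)
    """
    # Cumulative sums of (theta - delta - tau_j); the lowest category
    # corresponds to an empty sum, i.e. 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical, symmetric thresholds for a 4-category item.
probs = rsm_category_probabilities(theta=0.0, delta=0.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

Because the same thresholds are used for every item, only the item difficulty shifts the set of category curves along the logit scale.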
The difficulty of the items can be determined using Rasch model analysis. The Rasch analysis creates a person-item distribution map that graphically represents the items' difficulty levels and the respondents' ability levels. The Rasch model estimates the item difficulty and respondent ability parameters in logits on the same linear interval scale [35]. Thus, both parameters can be compared directly to determine whether the available items match the respondents' ability.
Reliability was analyzed with the Rasch model to determine the reliability and separation indices for persons and items [54], [55]. The reliability value indicates how well the scale can discriminate between persons and between items, and it lies between 0 and 1 [36]. The separation index indicates how well the spread of persons or items can be distinguished on the measured variable. A separation value of at least 2 is needed to provide adequate separation between persons or items [56], [57].
Unidimensionality analysis is used to determine the construct validity of the instrument. It aims to identify the instrument's ability to measure the range of respondents' abilities [24], [49]. An instrument with a valid construct measures respondents from the highest to the lowest ability, and the analysis checks whether the measured ability is correct. In addition, the unidimensionality analysis examines the variable comprehensively and determines whether the items in the instrument measure what should be measured [37], so that the instrument can provide accurate information. The first value considered is the raw variance explained by measures: the instrument captures the diversity of respondents' abilities if this value is at least 20%. The unexplained variance in the first contrast is acceptable if it does not exceed 15%, which means that the instrument does not contain significant noise. The eigenvalues are also examined: an eigenvalue of unexplained variance above 2 indicates that some items are measuring something other than what should be measured. This result can be seen in the standardized residual contrast plot.
Findings
Item Model Fit
Item-model fit for the I-BES instrument was assessed with the infit and outfit MNSQ values. Misfit items, whose MNSQ values fall outside the range of 0.6 to 1.4, were analyzed further, because items that do not fit provide information about something that should not be measured. Four items were identified as misfits: item 2, item 1, item 5, and item 14. Detailed descriptions are presented in Table 3.
Table 3. Descriptions and Item Fit Statistics
| Entry Number | Item Difficulty | Infit Mean-Square | Outfit Mean-Square | Control Taxonomy |
| 2 | 0.84 | 1.74 | 1.82 | Emotional Contagion |
| 1 | -0.35 | 1.77 | 1.43 | Emotional Contagion |
| 5 | -1.37 | 1.60 | 1.06 | Emotional Contagion |
| 14 | 0.22 | 0.55 | 0.57 | Emotional Disconnection |
Note. MNSQ = mean square
The four items declared to be misfits were removed, leaving eleven items. The item-model fit analysis was then repeated on these eleven items, and item 12 was declared a misfit in this second analysis (Table 4). After item 12 was removed, ten items were declared fit with the model (Table 5).
Table 4. Descriptions and Item Fit Statistics of Item 12
| Item No | Item Difficulty | Infit Mean-Square | Outfit Mean-Square | Control Taxonomy |
| 12 | -1.58 | 1.48 | 0.72 | Cognitive Empathy |
Table 5. Item Fit Statistics of the Final Refit Model
| Item Number | Item Difficulty | Infit Mean-Square | Outfit Mean-Square | Control Taxonomy |
| 3 | 0.55 | 1.08 | 1.08 | Emotional Contagion |
| 4 | 0.70 | 1.01 | 0.90 | Emotional Contagion |
| 6 | -0.04 | 0.85 | 0.85 | Cognitive Empathy |
| 7 | -0.80 | 1.15 | 0.83 | Cognitive Empathy |
| 8 | 0.98 | 1.25 | 1.16 | Cognitive Empathy |
| 9 | 0.20 | 0.65 | 0.71 | Cognitive Empathy |
| 10 | -0.59 | 1.10 | 0.98 | Emotional Disconnection |
| 11 | -0.19 | 1.25 | 1.20 | Emotional Disconnection |
| 13 | 0.42 | 0.85 | 0.83 | Emotional Disconnection |
| 15 | -1.23 | 1.16 | 0.85 | Emotional Disconnection |
After the items whose infit or outfit MNSQ values were below 0.6 or above 1.4 were eliminated, ten items were declared fit. The mean infit and outfit MNSQ values were 1.03 (SD = 0.18) and 0.94 (SD = 0.15), indicating that the items in the instrument fit the model.
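The screening rule can be checked against the values reported in Tables 3, 4, and 5. The sketch below is illustrative: the names are hypothetical, and treating an item as a misfit when either statistic leaves the 0.6 to 1.4 range is an assumption of this sketch that is consistent with the removed items.

```python
# (infit, outfit) mean-squares copied from Tables 3 and 4 (removed items)
# and Table 5 (retained items); names are illustrative assumptions.
LOWER, UPPER = 0.6, 1.4

removed = {2: (1.74, 1.82), 1: (1.77, 1.43), 5: (1.60, 1.06),
           14: (0.55, 0.57), 12: (1.48, 0.72)}
retained = {3: (1.08, 1.08), 4: (1.01, 0.90), 6: (0.85, 0.85),
            7: (1.15, 0.83), 8: (1.25, 1.16), 9: (0.65, 0.71),
            10: (1.10, 0.98), 11: (1.25, 1.20), 13: (0.85, 0.83),
            15: (1.16, 0.85)}


def is_fit(infit, outfit):
    """An item fits only if both mean-squares lie inside the accepted range."""
    return LOWER <= infit <= UPPER and LOWER <= outfit <= UPPER


mean_infit = sum(infit for infit, _ in retained.values()) / len(retained)
mean_outfit = sum(outfit for _, outfit in retained.values()) / len(retained)

print("all retained items fit:", all(is_fit(*v) for v in retained.values()))
print("any removed item fits:", any(is_fit(*v) for v in removed.values()))
print(f"mean infit = {mean_infit:.3f}, mean outfit = {mean_outfit:.3f}")
```

The recomputed means of the ten retained items are close to the reported 1.03 and 0.94.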
Rating Scale
Rating scale analysis was conducted to determine whether each statement on each item could be understood and used by the respondent as intended. The rating scale values are presented in Table 6 as follows.
Table 6. Rating Scale Structures for the I-BES
| Category Label | Observed Average | Expected Average | Infit Mean-Square | Outfit Mean-Square | Threshold |
| 1 | -0.28 | -0.20 | 0.91 | 0.84 | None |
| 2 | 0.32 | 0.25 | 1.08 | 0.98 | 0.26 |
| 3 | 0.86 | 0.82 | 0.91 | 0.86 | -1.14 |
| 4 | 1.59 | 1.62 | 1.18 | 1.08 | 0.87 |
Note. Obsv. Avrg = Observed Average, Expt. Avrg = Expected Average, MNSQ = Mean Square
Respondents are taken not to understand the difference between answer choices when the observed average or the Andrich threshold decreases from one category to the next. In Table 6, the observed averages increase across the categories, but the Andrich threshold decreases at the third category (from 0.26 to -1.14), which suggests that the third choice confused respondents. Under the assumptions of the Rasch model, a confusing choice has to be removed. For a closer examination, Figure 1 shows the category probability curves, which describe the probability of a response in each answer category as estimated from the respondent's score and the item's difficulty level.
The category probability curves show that the curve for category 1 does not intersect the curves for categories 2, 3, and 4 in sequence. The decrease in the threshold value between categories 2 and 3 in Table 6 confirms this discrepancy: respondents did not use the categories as the scale intended. In other words, categories 2 and 3 are often treated as similar, which causes bias. The category to be omitted is category 3, whose threshold value decreases.
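The ordering check described above can be expressed in a few lines. This is a minimal sketch using the observed averages and Andrich thresholds from Table 6; the function and variable names are hypothetical.

```python
# Values copied from Table 6; category 1 has no threshold.
observed_average = {1: -0.28, 2: 0.32, 3: 0.86, 4: 1.59}
andrich_threshold = {2: 0.26, 3: -1.14, 4: 0.87}


def disordered_categories(values):
    """Return the categories whose value is lower than that of the preceding category."""
    categories = sorted(values)
    return [current for previous, current in zip(categories, categories[1:])
            if values[current] < values[previous]]


print("disordered observed averages:", disordered_categories(observed_average))
print("disordered thresholds:", disordered_categories(andrich_threshold))
```

With the tabulated values, the observed averages advance in order, while the threshold of category 3 falls below that of category 2.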
Compatibility of Difficulty Level with Sample
The items' difficulty levels and the respondents' ability levels are illustrated in Figure 2, the variable map. Items and respondents are ordered from top to bottom: respondents with greater empathy and the most difficult items are at the top, while respondents with lesser empathy and the easiest items are at the bottom.
Reliability Measures
Table 7 summarizes the reliability of persons and items after the items declared to be misfits were eliminated.
Table 7. Person and Item Reliability Summary Statistics
| Summary | Average Measure | Average Z-Standard: Infit Mean-Square | Average Z-Standard: Outfit Mean-Square | Standard Deviation | Real root-mean-square deviation | Separation | Reliability |
| Person | 1.08 | 0.1 | 0.0 | 0.62 | 0.53 | 1.17 | 0.58 |
| Item | 0.00 | 0.2 | -0.3 | 0.65 | 0.16 | 4.15 | 0.95 |
Table 7 shows that the person separation is 1.17 and the item separation is 4.15. The person separation is below the minimum of 2, which indicates that the person results are likely to change when measurements are repeated, whereas the item separation exceeds this minimum.
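Separation and reliability are two expressions of the same information: in Rasch measurement the reliability R and the separation index G are commonly related by R = G^2 / (1 + G^2). The sketch below applies this standard relation to the separation values in Table 7; the function names are illustrative.

```python
import math


def reliability_from_separation(separation):
    """Convert a separation index G into a reliability R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


def separation_from_reliability(reliability):
    """Inverse conversion, valid for 0 <= reliability < 1."""
    return math.sqrt(reliability / (1.0 - reliability))


person_reliability = reliability_from_separation(1.17)
item_reliability = reliability_from_separation(4.15)
print(f"person reliability = {person_reliability:.2f}")
print(f"item reliability = {item_reliability:.2f}")
```

Under this relation, the minimum separation of 2 corresponds to a reliability of 0.80.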
Dimensionality
Dimensionality was examined with a Rasch-residual-based principal component analysis (PCA), which identifies whether I-BES can be regarded as unidimensional. Before this analysis, I-BES had been adjusted on the basis of the item-model fit analysis by eliminating the items declared to be misfits. The dimensionality results changed after the adjustment. Table 8 summarizes the PCA results.
Table 8. Summary of PCA Results
|  | First factor Eigenvalue units | First contrast Eigenvalue units |
| Original Scale | 6.4 (29.8%) | 2.0 (9.2%) |
| First Revision | 6.4 (36.6%) | 1.7 (9.6%) |
| Second Revision | 5.4 (35.0%) | 1.6 (10.2%) |
Note. First Revision = items 2, 1, 5, and 14 were removed from the original scale; Second Revision = item 12 was removed from the first revision scale.
Table 8 shows how the first-factor and first-contrast eigenvalue units changed. From the original scale through the second revision, the raw variance explained by measures is reasonable because it stays above 20%. It increased after the first revision (from 29.8% to 36.6%) and decreased only slightly in the second revision (to 35.0%). Thus, the instrument can adequately measure what should be measured, especially in its second revision.
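The unidimensionality criteria stated in the data analysis procedure (at least 20% variance explained by measures, no more than 15% unexplained variance in the first contrast, and a first-contrast eigenvalue not above 2) can be applied to the Table 8 values as follows. The sketch is illustrative and its names are hypothetical.

```python
# (variance explained %, first-contrast eigenvalue, first-contrast %) from Table 8.
pca_results = {
    "Original Scale": (29.8, 2.0, 9.2),
    "First Revision": (36.6, 1.7, 9.6),
    "Second Revision": (35.0, 1.6, 10.2),
}


def meets_unidimensionality_criteria(explained_pct, contrast_eigenvalue, contrast_pct):
    """Apply the three thresholds described in the data analysis procedure."""
    return (explained_pct >= 20.0
            and contrast_pct <= 15.0
            and contrast_eigenvalue <= 2.0)


for version, values in pca_results.items():
    print(version, meets_unidimensionality_criteria(*values))
```

All three versions satisfy the thresholds; the original scale sits exactly at the eigenvalue limit of 2.0, while both revisions fall below it.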