Analyzing The Construct Validity of the Indonesian Basic Empathy Scale using the Rasch model

doi:10.21203/rs.3.rs-3364544/v1


Research Article


This work is licensed under a CC BY 4.0 License

Version 1


Background The COVID-19 pandemic has brought many changes to human social life. Empathy plays an essential role in these changes because it enables individuals to share experiences, needs, and desires. This study aims to evaluate, develop, and modify the Basic Empathy Scale in Adults (BES-A) to adapt it to Indonesian culture, resulting in the Indonesian Basic Empathy Scale (I-BES).

Methods This study used a survey design. The participants were from Indonesia (n = 101; 17 male, 84 female; five regions). The Indonesian Basic Empathy Scale was evaluated using the Rasch model.

Results The results show that the instrument does not have unidimensionality. Based on the item-model fit results, at least five items of the I-BES were omitted because they were deemed inappropriate. The instrument has good separation reliability among items but low separation reliability among persons. The Indonesian Basic Empathy Scale needs further improvement so that its response categories function as intended.

Conclusion This study can be used as teaching material for counseling and guidance groups and for personality theory courses, sharpening the material on the development of empathy. In addition, the instrument for measuring the development of empathy can be used as material for counseling and guidance assessment courses.

Basic empathy scale

empathy

post-traumatic stress disorder

Rasch model

The COVID-19 pandemic has significantly affected human existence. It has altered people's social lives and caused various mental health issues that may exacerbate pre-existing problems [1]. Many events during the COVID-19 pandemic are traumatic and can have emotional effects, such as anxiety, depression, and trauma [2], [3]. Individuals who have difficulty adapting to a traumatic event can face serious consequences, one of which is post-traumatic stress disorder [4], [5].

COVID-19 has become an impetus for individuals to share, understand, and help others through the affective side [6]–[8]. Although the COVID-19 pandemic had a tremendous negative impact, it also had a positive impact that promotes the psychological side of individuals, which is to increase the awareness of empathy [9]. Empathy promotes the individual's motivation to follow all existing regulations to help others, especially those regulations that are in place during the COVID-19 pandemic [10]. Behavior generated by empathy - which is not based on any interests - can positively affect the environment to behave similarly. In other words, behavior that arises from empathy can lead other individuals to empathize and to prosocial behavior [10]–[12].

Empathy enables individuals to relate to the feelings of others. Empathy is a factor that encourages someone to engage in social interactions that are attentive to the well-being of others [13]. In other words, empathy plays a vital role in interpersonal and social relationships. Empathy enables individuals to share experiences, needs, and desires. This process leads people to experience satisfaction in social relationships and become an emotional bridge that can promote positive behavior [14]–[16]. For people who have experienced a traumatic event, emotional encouragement through empathy can help them escape the symptoms of Post Traumatic Stress Disorder (PTSD) [17].

The World Health Organization (WHO) continuously urges every country to continue vaccinating, and many countries have now reached the specified vaccination limit, which suggests that the COVID-19 pandemic may slowly be overcome. However, it is impossible to predict when the endemic phase will begin [18]. This possibility matters because traumatic events such as the COVID-19 pandemic significantly increase empathy. When the COVID-19 pandemic becomes endemic, the factors that trigger increased empathy will also decrease. This does not mean that other traumatic events are needed to elicit empathy in individuals, but rather that efforts must be made to maintain or even increase individuals' capacity for empathy. One such effort is to measure empathy with an instrument in order to obtain information about what individuals need regarding their empathy abilities. In this way, people can be offered the right services to maintain or improve their empathy abilities.

The Basic Empathy Scale in Adults (BES-A) [19] was developed as a revision and reconstruction of the shortcomings of the Basic Empathy Scale (BES) to make it more comprehensive. This instrument is based on the definition of empathy in [20] as "sharing and understanding other people's emotional states or circumstances as a result of experiencing emotional states and comprehending individual emotions."

The empathy instrument was previously declared valid using Classical Test Theory (CTT) [21]. However, validation under CTT has limitations. When two different tests are given to two groups, the results cannot be compared. In addition, the difficulty level and weighting of the questions depend heavily on the measurements taken [22]. Measurement results depend on the characteristics of the test used, item parameters depend on the test takers' ability, and measurement error can only be estimated for groups, not for individuals [23]. The Rasch model is an alternative way to investigate instrument validity. It presents the scale structure of an instrument, making it possible to determine whether the empathy instrument is valid or invalid [24], [25].

The Rasch model can be used to analyze the results of the instrument. It provides accurate results regarding participants and instrument quality by creating a measurement scale with equal intervals. The advantages of the Rasch model are: (1) it can predict missing data; (2) it provides a linear scale with equal intervals; (3) it detects inaccuracies; (4) it provides more accurate estimates; and (5) it provides replicable measurements [26]. This research was conducted to fill the gaps in other instrument analysis models by conducting instrument tests that measure empathy in adolescents using the Rasch model. The BES-A instrument developed by Carré et al. [19] was further adapted and modified to fit the circumstances of respondents in Indonesia. Adjustments were made to the language and context of the statements without eliminating the concepts that Carré had developed. The adaptation and modification process resulted in fifteen items. The instrument therefore had to be analyzed and evaluated to maintain its measurement accuracy. In this context, the developed and modified BES-A is called the Indonesia Basic Empathy Scale (I-BES).

The Rasch model was used to evaluate the validity and reliability of the I-BES. Several studies evaluating instruments have noted that the Rasch model is a powerful tool for determining validity and reliability, even with a small sample size [27], [28]. The Rasch model provides a probabilistic view of the object of measurement. Therefore, a Rasch analysis is not deterministic and can more accurately characterize the observed item. The Rasch model can also address metric disparities between items and difficulties in data intervals [29].

In this article, based on the research problem presented, we aimed to evaluate the I-BES, which was modified and adapted to the Indonesian context, through a comprehensive Rasch model analysis. The results of this analysis will contribute to the evaluation of the questionnaire and inform the literature in different contexts. Specifically, we identified respondents appropriate for the I-BES, identified invalid items, confirmed the construct, and reported the reliability of items and respondents.

Research Design

The methodology used in this research is a survey design. Research using a survey design describes quantitative (numerical) data related to empathy [30]. This research was conducted by collecting, classifying, and analyzing data, drawing conclusions, and writing a report that objectively describes the quality of the I-BES instrument.

A survey is a quantitative research technique in which the researcher interviews a sample or an entire population to determine the population's attitudes, beliefs, behaviors, or characteristics. For this study, a random sample was used, focusing on the population of Lampung, Indonesia. The basis for this choice is that the philosophical values of the Lampung subculture demonstrate an openness to diversity that supports the development of empathy. The social philosophy of Nengah Nyappur in Lampung culture is rooted in the cultural attitudes of the Lampung (ethnic) people. Thus, it can help avoid discriminatory social attitudes that can cause unproductive social rifts in building community unity (integration) [31].

Rasch Model

In the 1960s, Georg Rasch constructed an analytical method within item response theory (IRT) known as 1PL (one-parameter logistic) [32], [33]. Ben Wright later popularized this mathematical model [34]. Rasch developed a model that connects students and items using raw dichotomous data (true and false) representing students' abilities [33].

In the social sciences, data are usually obtained as raw scores, typically representing attitudes and opinions about the statements or questions in a given instrument [35]. The instrument is designed based on satisfactorily defined variables. Subsequently, the relevant constructs are identified, and the items to measure the variables in question are developed. The response options offered generally follow the scoring pattern of classical test theory (CTT). In the context of the Rasch model, this 'settled' scoring pattern is nothing more than a measurement whose results depend on who is being measured (test-dependent scoring), whereas quantitative research in the social sciences is concerned with objective measurement [36], [37].

According to Mok and Wright [32], objective measurement in the social sciences must meet five criteria: (1) providing a linear measure with equal intervals; (2) conducting an appropriate estimation procedure; (3) finding inappropriate items (misfits) or unusual ones (outliers); (4) overcoming data loss; and (5) generating replicable measurements (independent of the parameters studied). Only the Rasch model can satisfy all five conditions. In other words, the quality of measurements in the social sciences made with the Rasch model will be the same as that of measurements in physics [32], [33], [38].

Sample and Data Collection

The participants were guidance and counseling students of STKIP PGRI Bandar Lampung, a total of one hundred and two students. Forty-eight students were in the 2018-2019 cohort, and fifty-four started their studies in 2020-2021. The cohorts were distinguished on the basis of the COVID-19 pandemic, which emerged at the end of 2019. The social interaction restrictions that affected education required students to take lectures online after the policy was enacted in mid-2020, around June-July. Students in the 2018-2019 cohort had still attended lectures offline and interacted extensively with their peers in lectures.

Meanwhile, students in the 2020-2021 cohort did not attend intensive offline lectures because no policy allowed lectures to be held at 100% classroom capacity. In addition, participants were categorized by gender: a total of eighty-five female students and seventeen male students. The distribution of participants is described in Table 1.

Table 1. Demographic Characteristics of the Respondents

No	Demographic Characteristics	Total	%
1	Gender
	Male	17	16.83
	Female	84	83.17
2	Region
	Bandar Lampung	46	45.54
	Pringsewu	14	13.86
	Tanggamus	8	7.92
	Pesawaran	13	12.87
	Other (Outside Lampung Province)	20	19.80

The Basic Empathy Scale in Adults (BES-A)

The Basic Empathy Scale in Adults (BES-A) consists of items derived from the Basic Empathy Scale (BES) [39]. The BES-A was developed by Carré et al. (2013) as a revision and reconstruction of the shortcomings of the BES, making it more comprehensive than its predecessor. The BES has forty items, whereas the BES-A consists of twenty items aimed at adults. Developmentally, the average university student is already in the adult stage, so this instrument should be able to measure empathy comprehensively.

The BES is based on Cohen and Strayer's (1996) definition of empathy as "sharing and understanding the emotional state or context of others as a result of experiencing the emotional state (affective) and understanding the emotions (cognitive) of others." The BES assesses four basic emotions (fear, sorrow, anger, and joy) as well as broader measures of cognitive and affective empathy than nonspecific emotional states (e.g., anxiety). In the forty-item scale, half of the items are reverse-worded: twenty items require a positive answer and twenty require a negative response. A shortened twenty-item version is also available for use with adults in France. However, several studies have found that the BES, a two-factor scale, does not reflect newer conceptions of empathy, according to which empathy depends on three components [40], [41].

Modification of BES-A and Transformation to Indonesia Basic Empathy Scale (I-BES)

To develop a comprehensive empathy instrument, and following the recommendations of several previous studies [40], [42], [43], the Indonesia Basic Empathy Scale (I-BES) is built on three main components of empathy. The instrument measures emotional contagion (affective empathy), cognitive empathy, and emotional disconnection. Emotional contagion is a situation in which a person automatically adjusts his or her emotions to the emotions or states of others. Cognitive empathy refers to a person's ability to understand and articulate emotions based on the effect of other people's emotions or situations. Emotional disconnection is a regulatory component that involves self-protection from the anguish, suffering, and emotional impact caused by other people's emotions or situations.

I-BES is an instrument that was analyzed and evaluated for validity and reliability. Since the research was conducted in Indonesia, the items were translated into Indonesian and adapted to the context of the participants. The language change was made to avoid construct-irrelevant variance caused by the participants' language proficiency. Translators included a psychologist, a career counselor, and an English language expert. For the instrument to be comprehensively adapted to Indonesian culture, the adaptation addressed equivalence of language translation, concepts, and metrics, taking into account the linguistic expressions and language culture of the local community under study. Language translation involved forward translation, expert panel review, back translation, pre-testing and cognitive interviewing, and final revision.

The instrument initially contained twenty items and was reduced to fifteen based on empirical test results. The reduction followed the procedures and assumptions of the Rasch model: the instrument was tested for unidimensionality, item analysis covered item fit and item difficulty, and items that did not meet the criteria were eliminated, leaving fifteen items. The instrument uses a 4-point Likert scale in statement form, giving sixty statements in total. For each item, participants were asked to select the one of four statements that applied to them. A summary of the correspondence between I-BES and item taxonomy is provided in Table 2.

Table 2. Summary of the I-BES

Construct	Control Taxonomy	Number of Items	Item Number
Basic Empathy	Emotional Contagion	5	1-5
	Cognitive Empathy	5	6-10
	Emotional Disconnection	5	11-15

Data Coding

Participants' responses to the items in I-BES were coded using a predetermined Likert scale. The statement describing the most positive behavior received a score of 4, and the statement describing the most negative behavior received a score of 1.

Data Analysis Procedure

The Rasch model was used to analyze the I-BES instrument. Winsteps [44], an application for Rasch analysis, was used for the analysis. The Rasch model can be used to determine whether the responses given by participants correspond to their abilities [45]. The Rasch model estimates person ability and item difficulty by placing the person and item parameters on the same logit scale. The association between the parameters must be linear to satisfy the Rasch criterion of relative invariance [22], [46]–[49].
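For reference, the logit relationship between person and item parameters can be written out explicitly. The notation is conventional and assumed here rather than taken from this paper: θ_n is the ability of person n, δ_i is the difficulty of item i, and τ_k is the k-th category threshold. The second expression is the Andrich rating scale form, which is the form that would apply to a 4-point Likert instrument with thresholds shared across items.

```latex
% Dichotomous Rasch model (symbols are assumed notation, not from the paper)
P(X_{ni}=1) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
\quad\Longleftrightarrow\quad
\ln\frac{P(X_{ni}=1)}{P(X_{ni}=0)} = \theta_n - \delta_i

% Andrich rating scale model for categories k = 1, \dots, m
\ln\frac{P(X_{ni}=k)}{P(X_{ni}=k-1)} = \theta_n - \delta_i - \tau_k
```

Because both parameters enter as a simple difference on the log-odds scale, persons and items can be located on one common linear scale.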

The term "item model fit" refers to how well an item fits the model. The item fit mean-square (MNSQ) statistic was used to evaluate the extent to which the items reflect the underlying concept [22], [35], [37], [49].

The infit and outfit MNSQ values are used to measure the item's fit to the model. The infit value can detect abnormal patterns in respondents' responses to items based on their ability. The outfit value can detect abnormal patterns in respondents' responses to items regardless of their ability level [50]. Ideally, the MNSQ value, which indicates that the item is consistent with the Rasch model, is 1.0, and researchers consider MNSQ values from 0.6 to 1.4 acceptable [34], [50], [51]. The item is considered a misfit if the MNSQ value does not meet these criteria. Misfit items indicate that the item does not measure what it is supposed to measure [34].
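The acceptance rule can be illustrated with a minimal sketch. The function name `is_misfit` is hypothetical, and the assumption that an item is flagged when either statistic leaves the 0.6 to 1.4 range is inferred from the items listed in Tables 3 and 4, not stated as a rule by the authors.

```python
# Acceptable MNSQ range stated in the text.
MNSQ_LOW, MNSQ_HIGH = 0.6, 1.4


def is_misfit(infit, outfit, low=MNSQ_LOW, high=MNSQ_HIGH):
    """Return True if either mean-square statistic falls outside [low, high].

    Flagging on either statistic is an assumption made for this sketch.
    """
    return not (low <= infit <= high and low <= outfit <= high)


# (infit MNSQ, outfit MNSQ) per item, copied from Tables 3 and 4 of this paper.
first_pass = {2: (1.74, 1.82), 1: (1.77, 1.43), 5: (1.60, 1.06), 14: (0.55, 0.57)}
second_pass = {12: (1.48, 0.72)}

all_items = {**first_pass, **second_pass}
flagged = sorted(item for item, (i, o) in all_items.items() if is_misfit(i, o))
print(flagged)  # [1, 2, 5, 12, 14]
```

Item 14 is flagged for values below the lower bound (overfit), whereas the other items exceed the upper bound on at least one statistic.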

Rating scale analysis was used to examine how accurately the response scale functions. I-BES is suitable for rating scale analysis because it uses a 4-point Likert scale that assumes constant thresholds across items, a common characteristic of scales that measure personality or attitudes [52]. The analysis was conducted to determine whether the relative difficulties of steps within the items were constant and whether the psychological distances between statement choices were the same [53].

The difficulty of the items can be determined using Rasch analysis. The analysis creates a distribution map of respondents and items that graphically represents the items' difficulty level and the respondents' ability level. The Rasch model estimates the item difficulty and respondent ability parameters in logits so that they form the same linear interval scale [35]. Thus, both parameters can be compared simultaneously to determine whether the available items match the respondents' ability.

Reliability was analyzed using the Rasch model to determine the reliability and separation index for persons and items [54], [55]. The reliability value indicates how well the scale can discriminate between persons and between items, and it must lie between 0 and 1 [36]. The separation index indicates how much of the distribution of persons or items can be identified in the measured variable. A separation value of at least 2 is needed to provide adequate separation between persons or items [56], [57].
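Separation and reliability are two views of the same quantity. A minimal sketch of the standard conversion, reliability = G² / (1 + G²) for a separation index G, is given below. The function names are hypothetical, and the input values are the separation figures reported in Table 7.

```python
import math


def reliability_from_separation(separation):
    """Convert a separation index G into reliability, G^2 / (1 + G^2)."""
    g_squared = separation ** 2
    return g_squared / (1.0 + g_squared)


def separation_from_reliability(reliability):
    """Inverse relation, G = sqrt(R / (1 - R)), defined for 0 <= R < 1."""
    return math.sqrt(reliability / (1.0 - reliability))


person = reliability_from_separation(1.17)  # person separation from Table 7
item = reliability_from_separation(4.15)    # item separation from Table 7
print(round(person, 2), round(item, 2))     # 0.58 0.95
print(reliability_from_separation(2.0))     # 0.8
```

Under this relation, the minimum separation of 2 corresponds to a reliability of 0.8, and the separation values in Table 7 reproduce the reported reliabilities of 0.58 and 0.95.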

Unidimensionality analysis is used to determine the construct validity of the instrument. It aims to identify the ability of the instrument to measure the range of respondents' abilities [24], [49]. Instruments that have a valid construct measure respondents from highest to lowest ability. This analysis determines whether the respondent's measured ability is correct. In addition, the unidimensionality analysis checks whether the variable is measured comprehensively, that is, whether the items used in the instrument measure what should be measured [37], so that the instrument can provide accurate information. The value to be considered is the raw variance explained by the measures. The instrument captures the diversity of respondents' abilities if this value is at least 20 %. In addition, the unexplained variance in the first contrast is acceptable if it does not exceed 15 %, which indicates that the instrument does not have significant noise. The eigenvalues are also considered: if the eigenvalue of the unexplained variance is above 2, some items are not measuring what should be measured. This result is evidenced in the standardized residual contrast plot.
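The three thresholds above can be expressed as a small check. This is only a sketch: the function name `check_unidimensionality` is hypothetical, the input values are copied from Table 8, and the first column of that table is read as the variance explained by the measures, following the interpretation given in the findings.

```python
def check_unidimensionality(explained_pct, contrast_pct, contrast_eigenvalue):
    """Apply the three thresholds quoted in the text and return each verdict."""
    return {
        "variance_explained_ok": explained_pct >= 20.0,
        "first_contrast_pct_ok": contrast_pct <= 15.0,
        "first_contrast_eigenvalue_ok": contrast_eigenvalue <= 2.0,
    }


# (variance explained %, first-contrast %, first-contrast eigenvalue) from Table 8.
table_8 = {
    "Original Scale": (29.8, 9.2, 2.0),
    "First Revision": (36.6, 9.6, 1.7),
    "Second Revision": (35.0, 10.2, 1.6),
}

results = {name: check_unidimensionality(*row) for name, row in table_8.items()}
for name, verdict in results.items():
    print(name, verdict)
```

On these thresholds alone, all three versions of the scale pass, with the original scale sitting exactly on the eigenvalue boundary of 2.0.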

Findings

Item Model Fit

Item-model fit for the I-BES instrument was assessed using the infit and outfit MNSQ values. Items with MNSQ values outside the 0.6 to 1.4 range were treated as misfits and analyzed further, because items that do not fit provide information about something the instrument is not intended to measure. Four items were identified as misfits: items 2, 1, 5, and 14. Detailed descriptions are presented in Table 3.

Table 3. Descriptions and Item fit Statistics

Entry Number	Item Difficulty	Infit Mean-Square	Outfit Mean-Square	Control taxonomy
2	0.84	1.74	1.82	Emotional Contagion
1	-0.35	1.77	1.43	Emotional Contagion
5	-1.37	1.60	1.06	Emotional Contagion
14	0.22	0.55	0.57	Emotional Disconnection

Note. MNSQ = mean square

The four items declared to be misfits were removed, leaving eleven items. The remaining items were then reanalyzed, and item 12 was declared a misfit and removed. Therefore, ten items were declared to fit the model after two rounds of item-model fit analysis. Table 4 describes item 12, which was declared a misfit in the second analysis after the misfit items from the first analysis were removed. Table 5 displays the items that were declared to fit the model.

Table 4. Description and Item Fit Statistics of Item 12

Item No	Item Difficulty	Infit Mean-Square	Outfit Mean-Square	Control taxonomy
12	-1.58	1.48	0.72	Cognitive Empathy

Table 5. Item Fit Statistics of the Final Refit Model

Item Number	Item Difficulty	Infit Mean-Square	Outfit Mean-Square	Control taxonomy
3	0.55	1.08	1.08	Emotional Contagion
4	0.70	1.01	0.90	Emotional Contagion
6	-0.04	0.85	0.85	Cognitive Empathy
7	-0.80	1.15	0.83	Cognitive Empathy
8	0.98	1.25	1.16	Cognitive Empathy
9	0.20	0.65	0.71	Cognitive Empathy
10	-0.59	1.10	0.98	Emotional Disconnection
11	-0.19	1.25	1.20	Emotional Disconnection
13	0.42	0.85	0.83	Emotional Disconnection
15	-1.23	1.16	0.85	Emotional Disconnection

After eliminating items whose infit or outfit MNSQ values were below 0.6 or above 1.4, ten items were declared fit. The mean infit and outfit MNSQ values were 1.03 (SD = 0.18) and 0.94 (SD = 0.15), respectively, indicating that the items in the instrument fit the model.
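As a quick arithmetic check, the summary statistics can be recomputed from the ten rows of Table 5. The sketch below uses only the Python standard library. The paper does not say whether the reported SD is a population or a sample standard deviation, so both are printed, and small differences from the reported figures may simply reflect rounding in the tabulated values.

```python
from statistics import mean, pstdev, stdev

# Infit and outfit MNSQ values for items 3, 4, 6, 7, 8, 9, 10, 11, 13, 15 (Table 5).
infit = [1.08, 1.01, 0.85, 1.15, 1.25, 0.65, 1.10, 1.25, 0.85, 1.16]
outfit = [1.08, 0.90, 0.85, 0.83, 1.16, 0.71, 0.98, 1.20, 0.83, 0.85]

for label, values in (("infit", infit), ("outfit", outfit)):
    print(
        label,
        "mean =", round(mean(values), 3),
        "population SD =", round(pstdev(values), 3),
        "sample SD =", round(stdev(values), 3),
    )
```

The recomputed means are close to the reported 1.03 and 0.94, and every value lies inside the 0.6 to 1.4 acceptance range.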

Rating Scale

Rating scale analysis was conducted to determine whether each statement on each item could be understood and used by the respondent as intended. The rating scale values are presented in Table 6 as follows.

Table 6. Rating Scale Structures for the I-BES

Category Label	Observed Average	Expected Average	Infit Mean-Square	Outfit Mean-Square	Threshold
1	-0.28	-0.20	0.91	0.84	None
2	0.32	0.25	1.08	0.98	0.26
3	0.86	0.82	0.91	0.86	-1.14
4	1.59	1.62	1.18	1.08	0.87

Note. Obsv. Avrg = Observed Average, Expt. Avrg = Expected Average, MNSQ = Mean Square

Respondents are considered not to understand the difference between answer options if the observed average or the Andrich threshold decreases from one category to the next. In Table 6, the Andrich threshold decreases at the third category, so the third option is suspected of confusing respondents. Under the assumptions of the Rasch model, a confusing category should be removed. For a closer examination, Figure 1 shows the category probability curves, which describe the probability of a response in each answer category, estimated from the respondent's score and the item's difficulty level.

The category probability curves show that the curve for category 1 does not intersect the curves for categories 2, 3, and 4 in sequence. The threshold value in Table 6, which decreases from category 2 to category 3, is a discrepancy indicating that the scale does not function as intended. In other words, categories 2 and 3 are often considered similar, causing bias. The category to be omitted is category 3, which has the decreasing threshold value.

Compatibility of difficulty level with sample

The items' difficulty levels and the respondents' ability levels are illustrated in Figure 2, the variable map. Item difficulty and respondent ability are ordered from highest to lowest. Respondents with greater empathy abilities and the most difficult items are at the top, while respondents with lesser empathy abilities and the easiest items are at the bottom.

Reliability Measures

Table 7 summarizes the reliability of persons and items after the items declared misfits were eliminated.

Table 7. Person and Item Reliability Summary Statistics

Summary	Average Measure	Infit Z-Standard	Outfit Z-Standard	Standard Deviation	Real Root-Mean-Square Deviation	Separation	Reliability
Person	1.08	0.1	0.0	0.62	0.53	1.17	0.58
Item	0.00	0.2	-0.3	0.65	0.16	4.15	0.95

Table 7 shows that the person separation value is 1.17 and the item separation value is 4.15. The low person separation indicates that the results obtained will likely change when repeated measurements are made.

Dimensionality

Dimensionality was assessed with a Rasch-residual-based principal component analysis (PCA), which identifies whether I-BES can be regarded as unidimensional. Before this analysis, I-BES had been adjusted based on the item-model fit analysis by eliminating items declared to be misfits. The dimensionality results changed after the adjustment. Table 8 displays a summary of the PCA results.

Table 8. Summary of PCA Results

	First factor Eigenvalue units	First contrast Eigenvalue units
Original Scale	6.4 (29.8%)	2.0 (9.2%)
First Revision	6.4 (36.6%)	1.7 (9.6%)
Second Revision	5.4 (35.0%)	1.6 (10.2%)

Note. First Revision = items 2, 1, 5, and 14 were removed from the original scale; Second Revision = item 12 was removed from the first revision scale.

Table 8 shows a change in the first-factor and first-contrast eigenvalue units. The first-factor values show that, from the original scale through the second revision, the variance explained by the measures is reasonable because it is above 20 %. The value increased after the first revision and decreased only slightly in the second revision. Thus, the instrument can measure what should be measured reasonably well, especially in its second revised form.

The infit and outfit MNSQ values were used to determine the item-model fit of the I-BES instrument. To obtain better, or "Rasch-compliant" [34], results, the four items that were declared misfits were removed, leaving eleven items. The remaining items were reanalyzed using the same analysis, and one item, item 12, was declared a misfit. Therefore, ten items were declared to fit the model after two rounds of item-model fit analysis. Table 4 describes item 12, which was declared a misfit in the second analysis after the misfit items from the first analysis were removed.

The observed and expected averages increase, but the threshold does not. A scale of 2 to 3 shows that the threshold has decreased. This result indicates that the scale used is still not functioning correctly [44], [58]. However, each scale's MNSQ infit and outfit values range from 0.8 to 1.2. This result shows that the scale used can be well understood and does not generate noise that leads to misinterpretation because the value is below 2.0 [58].
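The ordering argument can be made concrete with a small sketch that checks whether the observed averages and the Andrich thresholds from Table 6 increase monotonically. The function names are hypothetical, and each threshold is assumed to belong to the step into the category on whose row it is listed.

```python
def is_strictly_increasing(values):
    """Return True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def disordered_steps(thresholds, first_category=2):
    """List (previous, current) category labels where the threshold drops."""
    steps = []
    for idx in range(1, len(thresholds)):
        if thresholds[idx] < thresholds[idx - 1]:
            steps.append((first_category + idx - 1, first_category + idx))
    return steps


observed_average = [-0.28, 0.32, 0.86, 1.59]  # categories 1 to 4, Table 6
thresholds = [0.26, -1.14, 0.87]              # rows for categories 2 to 4, Table 6

print(is_strictly_increasing(observed_average))  # True
print(disordered_steps(thresholds))              # [(2, 3)]
```

The observed averages advance with the categories, but the threshold listed for category 3 is lower than the one listed for category 2, which is the disordering described in the text.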

The threshold in Table 6, which decreases from category 2 to category 3, is a discrepancy indicating that the scale does not function as intended. In other words, categories 2 and 3 are often considered similar, leading to bias. To overcome this bias, one of the categories is simplified or omitted [58], so that the scale becomes a 3-point Likert scale. The category to be omitted is category 3, which has the decreasing threshold.

According to the theory of empathy [59], students who understand relationships with others are categorized as being in an empathy learning process if they are at a high level in the personality structure. Conversely, students who only do or act are weak in interacting with others, so they may be categorized as weak in empathy. Weak relationships with others also affect students' social life in the school environment and the community [60].

The difficulty of the items and the respondents' ability levels are arranged from highest to lowest. Respondents with greater empathy abilities and the most difficult items are at the top, while respondents with lower empathy abilities and the easiest items are at the bottom. Figure 2 shows that the items vary in difficulty. Comparing the difficulty of the items with the ability of the respondents shows that most respondents were able to answer each item, which means that the diversity of respondents' abilities is less visible. Thus, the solution is to increase the number of items with different difficulty levels and functions. In this way, the diversity of respondents in terms of the difficulty and function of each item becomes greater.

Nevertheless, no items have the same function, which can be seen from the fact that no items are in the same position [56]. The items retained after the analysis will comprehensively describe individual empathic behavior, starting with identification. They can then inform preventive or curative efforts carried out by educators or a team of experts, including guidance and counseling (BK) staff. The efforts of BK educators are a form of help, understanding, and facilitation of the optimal development of individual potential. Without such support, individuals may commit destructive acts.

The 2020 United Nations Children's Fund (UNICEF) research states that out of 39,8675 students, 41 % of adolescents at age fifteen had low empathy development and engaged in bullying behaviors at least several times a month, which, if left unchecked, would increase the risk of psychological disorders later in life and poor social functioning [61], [62]. This is reinforced by research findings [63] that 367 out of 1167 students had the weakest social interactions, and these students were predicted to experience acts of violence, bullying, and lower quality of learning in tertiary institutions. If students' weak development of empathy is left unchecked, they are at risk for frustration, acts of violence, beatings, sexual harassment, and murder [64].

The most difficult item is item 8 (Q8), with a value of 0.98, and the easiest item is item 15 (Q15), with a value of -1.23. The respondent with the highest ability has a value of 3.29, and the respondent with the lowest ability has a value of -0.57. The difference between the ranges of respondent ability and item difficulty indicates that the I-BES may not provide sufficient information, especially for respondents who scored much higher than the most difficult item. The gap between the most difficult item and the high-ability respondents is greater than 5 %, so the instrument provides weak or less accurate information for respondents whose scores exceed the most difficult item by more than 5 % [58]. This occurs because the difficulty levels of the items are low relative to the abilities of the respondents measured. The number of items needs to be increased to maintain a variety of items that provide accurate information, especially for respondents with high empathy ability.
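The targeting gap described above can be illustrated with a short calculation. This is a minimal sketch, not the authors' analysis: it uses only the four values reported in the text and assumes, for simplicity, the dichotomous Rasch formula, whereas the I-BES uses rating-scale items. The function name `endorsement_probability` is hypothetical.

```python
import math

# Values reported in the text, in logits.
ITEM_MAX, ITEM_MIN = 0.98, -1.23      # Q8 (most difficult), Q15 (easiest)
PERSON_MAX, PERSON_MIN = 3.29, -0.57  # highest and lowest respondent


def endorsement_probability(theta: float, b: float) -> float:
    """Dichotomous Rasch probability of endorsing an item.

    Assumption: the dichotomous form is used only as an illustration;
    a rating-scale model would add category thresholds.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Distance between the top respondent and the hardest item.
upper_gap = PERSON_MAX - ITEM_MAX
# Distance between the lowest respondent and the easiest item.
lower_gap = PERSON_MIN - ITEM_MIN
# Chance that the top respondent endorses even the hardest item.
p_top = endorsement_probability(PERSON_MAX, ITEM_MAX)

print(f"upper gap: {upper_gap:.2f} logits")
print(f"lower gap: {lower_gap:.2f} logits")
print(f"P(top respondent endorses Q8): {p_top:.2f}")
```

Under this simplification, the upper gap is much larger than the lower gap, which is consistent with the observation that the items are too easy for the most able respondents.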

Table 7 shows a person separation of 1.17 and an item separation of 4.15, indicating weak diversity among persons and excellent diversity among items [58]. This result shows that the instrument could not provide diverse information about the respondents' empathy ability. It may be caused by the low difficulty level of the items; in other words, the items are too easy for the respondents, as explained in the discussion of how well the difficulty level suits the sample. Moreover, the person reliability is 0.58 and the item reliability is 0.95. The item reliability can be considered excellent, but the person reliability is relatively weak [58]. This finding indicates that the separation of item difficulties is reliable, but the separation of persons is not. It suggests that the person measures obtained are likely to change with repeated measurements.
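The separation and reliability values reported above are linked by the standard Rasch relation R = G² / (1 + G²), where G is the separation index. The sketch below checks that the reported pairs are mutually consistent; the function names are hypothetical and it is not a reproduction of the software output.

```python
import math


def reliability_from_separation(g: float) -> float:
    """Separation reliability R = G^2 / (1 + G^2)."""
    return g * g / (1.0 + g * g)


def separation_from_reliability(r: float) -> float:
    """Inverse relation G = sqrt(R / (1 - R)), defined for 0 <= R < 1."""
    return math.sqrt(r / (1.0 - r))


# Separation values reported in the text for Table 7.
person_reliability = reliability_from_separation(1.17)
item_reliability = reliability_from_separation(4.15)

print(f"person reliability: {person_reliability:.2f}")
print(f"item reliability:   {item_reliability:.2f}")
```

Rounded to two decimals, both computed values agree with the reliabilities quoted in the paragraph above.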

Table 8 shows that the eigenvalue units of the first factor and of the first contrast changed across the revisions. For the first factor, the variance explained can be considered quite good from the original scale through the second revision, since it is above 20%. It increased after the first revision and decreased in the second revision, although not significantly. Thus, it can be concluded that the instrument is quite capable of measuring what should be measured, especially in the form produced by the second revision.

Relative to the original scale, the first contrast eigenvalue decreased, while its percentage increased in the second revision. This result shows that after the second revision, no items remain in the instrument that do not measure what should be measured. The percentage of the first contrast increased, although not significantly. However, in the second revision the percentage did not exceed 15 %, so the unexplained variance in the contrasts can be considered quite good.

On the original scale, the first contrast eigenvalue is 2.0, indicating the possibility that some items do not measure what they should. This value touches the tolerance limit for the contrast eigenvalue, 2.0, so it is necessary to eliminate or improve the items that do not measure what should be measured. After the first and second revisions, the first contrast eigenvalue decreased and no longer touches 2.0. The adapted instrument therefore consists of items that measure what they should.
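The three dimensionality criteria used in this discussion (explained variance above 20 %, first contrast eigenvalue below 2.0, and first contrast percentage not exceeding 15 %) can be written as a small checklist. This is a sketch under stated assumptions: the function name is hypothetical, and in the example only the eigenvalue of 2.0 for the original scale comes from the text; the two percentages are placeholder values, not results from Table 8.

```python
def check_dimensionality(explained_pct: float,
                         contrast_eigenvalue: float,
                         contrast_pct: float) -> dict:
    """Apply the three cut-offs named in the discussion."""
    return {
        # Variance explained by the first factor should exceed 20 %.
        "explained_variance_ok": explained_pct > 20.0,
        # A first contrast eigenvalue that reaches 2.0 is treated as a warning.
        "first_contrast_eigenvalue_ok": contrast_eigenvalue < 2.0,
        # Unexplained variance in the first contrast should not exceed 15 %.
        "first_contrast_pct_ok": contrast_pct <= 15.0,
    }


# Eigenvalue 2.0 is reported for the original scale; 30.0 and 10.0 are
# hypothetical placeholders used only to make the example runnable.
original_scale = check_dimensionality(explained_pct=30.0,
                                      contrast_eigenvalue=2.0,
                                      contrast_pct=10.0)
print(original_scale)
```

With these inputs only the eigenvalue criterion fails, which mirrors the reasoning that the original scale needed items removed or improved.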

After analysis and evaluation, the I-BES instrument is exceptionally well constructed. The I-BES initially had 15 items and has ten items after analysis. In the first item-model fit analysis, four items (items 1, 2, 5, and 14) were found to be inappropriate, that is, they did not measure what was intended to be measured, so they had to be removed. In the second item-model fit analysis, conducted after eliminating the four misfitting items, one more item, item 12, was found to misfit. As a result, the I-BES contains ten items that fit the model and can measure what should be assessed.

In the response category analysis, the I-BES has a problem in one of the response categories. The third response category is not functioning correctly; its function is still the same as that of the second response category. In other words, respondents still consider answer choices 2 and 3 to be the same. As for the wording, the answer choices are still easy to understand and do not create noise that leads to misinterpretation.
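A response category that does not function separately from its neighbor typically shows up as a threshold that fails to increase from one step to the next. The sketch below flags such steps. The function name and the threshold values are hypothetical illustrations; they are not the values from Table 6.

```python
def disordered_steps(thresholds: list[float]) -> list[int]:
    """Return 1-based positions of thresholds that do not exceed the previous one.

    Ordered thresholds increase strictly from one step to the next.
    """
    return [
        i + 1
        for i in range(1, len(thresholds))
        if thresholds[i] <= thresholds[i - 1]
    ]


# Hypothetical thresholds (logits): the third step falls below the second.
example_thresholds = [-1.20, 0.45, 0.10]
flagged = disordered_steps(example_thresholds)
print(flagged)
```

An empty list means every step advances as expected; any flagged position marks a pair of adjacent categories that are candidates for collapsing.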

The analysis of how well the item difficulty suits the sample identified the most difficult and the easiest items in the revised instrument. Measured in logits, each item has its own function and no two items have equal values, so no item duplicates another. The abilities of the highest and lowest respondents are also known. Measured in logits, the highest respondent ability exceeds the difficulty of the items by well above 5 %. This result indicates that the instrument provides less accurate information about respondents whose ability logit is more than 5 % higher than the item logit. Thus, although the items have different functions, they cannot correctly identify the diversity of the respondents. It is therefore necessary to increase the number of items with different difficulty levels and functions.

In the analysis of separation reliability for persons and items, the person separation is only 1.17. This result confirms the previous discussion: the instrument can distinguish only one group of respondents (1.17, rounded down). The instrument therefore provides only weak information about the diversity of respondents' abilities. Moreover, the item reliability can be considered excellent, but the person reliability is weak. This finding shows that the item difficulty estimates are consistent, whereas the weak person reliability indicates that the person measures are unreliable. If the respondents were to complete the instrument again, their ability estimates would likely change considerably. This result may be caused by inconsistency between the answers respondents gave and their abilities.

In the dimensionality test, the eigenvalue units of the first factor stay above the minimum limit from the original scale through the second revision. For the eigenvalue units of the first contrast, there is evidence in the original scale that some items do not measure what should be measured. After the first and second revisions were made based on the item-model fit results and the dimensionality test was repeated, the items that did not measure what should be measured no longer existed. Therefore, it can be concluded that the instrument has a reasonably good construct and consists of items that measure what should be measured. The results of this study can be used as teaching material in counseling and guidance groups and in personality theory courses to sharpen the material on the development of empathy. In addition, the measurement instrument for the development of empathy can be used as material in counseling and guidance assessment courses.

Acknowledgments

The authors would like to thank all the participants involved in this research.

Authors’ contributions

All authors reviewed the manuscript. ND and DS drafted the main manuscript text. US, NR, and AMN supervised the article. ND, AQ, and DS collected the data and conducted the analyses. AMN edited the manuscript for content. All authors read and approved the final manuscript.

Funding

This research is not funded by the university.

Data Availability

Data are available from the corresponding author upon reasonable request.

Competing interest

The authors declare no competing interests.

Ethics approval and consent to participate

The authors state that this article, Analyzing The Construct Validity of the Indonesian Basic Empathy Scale using the Rasch model, was compiled and created exclusively by the corresponding author and the co-authors and is fully approved by the co-authors. The authors certify that they have no conflict of interest in relation to this work, that this article is not being reviewed by any other journal, and that it contains original data, for whose integrity and accuracy the authors assume responsibility. This research used a sample of students, and informed consent was obtained for data collection to ensure that participants were willing to provide data voluntarily and accurately.

Consent for publication

Not applicable.

References

A. M. Varghese and M. N. Natsuaki, “Coping With the Pandemic: Implementing Social and Emotional Learning in the California K-12 School System,” Policy Insights from Behav. Brain Sci., vol. 8, no. 2, pp. 136–142, 2021, doi: 10.1177/23727322211033003.
F. Arendt, A. Markiewitz, M. Mestas, and S. Scherr, “COVID-19 pandemic, government responses, and public mental health: Investigating consequences through crisis hotline calls in two countries,” Soc. Sci. Med., vol. 265, no. November, p. 113532, 2020, doi: 10.1016/j.socscimed.2020.113532.
T. Wu et al., “Prevalence of mental health problems during the COVID-19 pandemic: A systematic review and meta-analysis,” J. Affect. Disord., vol. 281, no. November 2020, pp. 91–98, 2021, doi: 10.1016/j.jad.2020.11.117.
N. Rusmana, A. Hafina, and D. Suryana, “Group Play Therapy for Preadolescents : Post-Traumatic Stress Disorder of Abstract :,” pp. 213–222, 2020, doi: 10.2174/1874350102013010213.
X. Zhou and X. Wu, “Posttraumatic stress disorder and aggressive behavior in adolescents: A longitudinal and interpersonal functional approach,” Child. Youth Serv. Rev., vol. 114, p. 105027, 2020, doi: 10.1016/j.childyouth.2020.105027.
J. L. Kristeller and T. Johnson, “Cultivating Loving Kindness: A Two-Stage Model of The Effects of Meditation on Empathy, Compassion, and Altruism,” Zygon®, vol. 40, pp. 391–408, 2005, doi: 10.1111/j.1467-9744.2005.00671.x.
X. Qin, F. Yang, Z. Jiang, and B. Zhong, “Empathy Not Quarantined: Social Support via Social Media Helps Maintain Empathy During the COVID-19 Pandemic,” Soc. Media Soc., vol. 8, no. 1, 2022, doi: 10.1177/20563051221086234.
M. Wei, K. Y. H. Liao, T. Y. Ku, and P. A. Shaffer, “Attachment, Self-Compassion, Empathy, and Subjective Well-Being Among College Students and Community Adults,” J. Pers., vol. 79, no. 1, pp. 191–221, 2011, doi: 10.1111/j.1467-6494.2010.00677.x.
G. R. Holt, “The Pandemic Effect: Raising the Bar for Ethics, Empathy, and Professional Collegiality,” Otolaryngol. - Head Neck Surg. (United States), vol. 163, no. 4, pp. 621–622, 2020, doi: 10.1177/0194599820933179.
S. Pfattheicher, L. Nockur, R. Böhm, C. Sassenrath, and M. B. Petersen, “The Emotional Path to Action: Empathy Promotes Physical Distancing and Wearing of Face Masks During the COVID-19 Pandemic,” Psychol. Sci., vol. 31, no. 11, pp. 1363–1373, 2020, doi: 10.1177/0956797620964422.
S. Del Canale et al., “The relationship between physician empathy and disease complications: An empirical study of primary care physicians and their diabetic patients in Parma, Italy,” Acad. Med., vol. 87, no. 9, pp. 1243–1249, 2012, doi: 10.1097/ACM.0b013e3182628fbf.
L. Meiring, S. Subramoney, K. G. F. Thomas, J. Decety, and M. M. Fourie, “Empathy and helping: Effects of racial group membership and cognitive load,” South African J. Psychol., vol. 44, no. 4, pp. 426–438, 2014, doi: 10.1177/0081246314530280.
E. Segal, Social Empathy: The Art of Understanding Others. New York: Columbia University Press, 2018.
M. Hojat et al., “The devil is in the third year: a longitudinal study of erosion of empathy in medical school.[Erratum appears in Acad Med. 2009 Nov;84(11):1616],” Acad. Med., vol. 84, no. 9, pp. 1182–1191, 2009.
P. Nunes, S. Williams, B. Sa, and K. Stevenson, “A study of empathy decline in students from five health disciplines during their first year of training,” Int. J. Med. Educ., vol. 2, pp. 12–17, 2011, doi: 10.5116/ijme.4d47.ddb0.
H. Riess, “The Science of Empathy,” J. Patient Exp., vol. 4, no. 2, pp. 74–77, 2017, doi: 10.1177/2374373517699267.
J. Stephenson and K. Renk, “My First Time Hurt: Using Preschool PTSD Treatment to Address PTSD Symptoms in a Young Girl With a History of Pediatric Cancer,” Clin. Case Stud., vol. 18, no. 2, pp. 87–105, 2019, doi: 10.1177/1534650118815601.
WHO, “Boost COVID-19 Vaccination Coverage: WHO,” World Health Organization, 2022.
A. Carré, N. Stefaniak, F. D’Ambrosio, L. Bensalah, and C. Besche-Richard, “The Basic Empathy Scale in Adults (BES-A): Factor Structure of a Revised Form,” Psychol. Assess., vol. 25, no. 3, pp. 679–691, 2013, doi: 10.1037/a0032297.
D. Cohen and J. Strayer, “Empathy in conduct-disordered and comparison youth,” Dev. Psychol., vol. 32, no. 6, pp. 988–998, 1996, doi: 10.1037/0012-1649.32.6.988.
C. Y. Lin, “Psychometric validation of the Persian bergen social media addiction scale using classic test theory and Rasch models,” J. Behav. Addict., vol. 6, no. 4, pp. 620–629, 2017, doi: 10.1556/2006.6.2017.071.
D. Indihadi, D. Suryana, and A. B. Ahmad, “the Analysis of Construct Validity of Indonesian Creativity Scale Using Rasch Model,” Creat. Stud., vol. 15, no. 2, pp. 560–576, 2022, doi: 10.3846/cs.2022.15182.
D. M. Shapiro, “Testing validity inferences of the Science Motivation Questionnaire II scores using a Rasch analysis framework,” 2019.
H. Muslihin et al., “Analysis of the Reliability and Validity of the Self-Determination Questionnaire Using Rasch Model,” Int. J. Instr., vol. 15, no. 2, pp. 207–222, 2022, doi: 10.29333/iji.2022.15212a.
N. Rusmana, A. Hafina, R. O. Wardhany, and D. Suryana, “Students’ Confidence Instrument Analysis in Poetry Learning through Rasch Model,” Open Psychol. J., vol. 13, no. 1, pp. 289–299, 2020, doi: 10.2174/1874350102013010289.
A. Taufiq, E. S. Yudha, Y. H. Md, and D. Suryana, “Examining the Supervision Work Alliance Scale: A Rasch Model Approach,” Open Psychol. J., vol. 14, no. 1, pp. 179–184, 2021, doi: 10.2174/1874350102114010179.
N. H. Che Lah, Z. Tasir, and N. F. Jumaat, “Applying alternative method to evaluate online problem-solving skill inventory (OPSI) using Rasch model analysis,” Educ. Stud., vol. 00, no. 00, pp. 1–23, 2021, doi: 10.1080/03055698.2021.1874310.
M. Clinton, N. Alayan, and L. El-Alti, “Rasch analysis of lebanese nurses’ responses to the eis questionnaire,” SAGE Open, vol. 4, no. 3, pp. 1–10, 2014, doi: 10.1177/2158244014547182.
D. Andrich, RASCH Model for Measurement - Series Quantitative Applications in the Social Sciences. Perth, Australia: Sage Publications, 1988.
J. W. Cresswell and V. L. P. Clark, “Designing and conducting mixed methods research. 2nd edn Sage Publications Inc,” Thousand Oaks, CA, 2011.
A. Pahrudin and M. Hidayat, “Budaya Lampung dan Penyelesaian Konflik Sosial Keagamaan.” repository.radenintan.ac.id, pp. 1–256, 2007.
B. Sumintono, “Rasch Model Measurements as Tools in Assesment for Learning,” 2018, doi: 10.2991/icei-17.2018.11.
H. Maulana, A. A. Rangkuti, B. Sumintono, and L. D. Utami, “Testing of the Indonesian Version of the Instrument “Teachers’ Sense of Efficacy Scale” Using Rasch Modelling [Pengujian Kualitas Instrumen Teachers’ Sense of Efficacy Scale Versi Bahasa Indonesia Menggunakan Pemodelan Rasch],” ANIMA Indones. Psychol. J., vol. 35, no. 2, 2020, doi: 10.24123/aipj.v35i2.2905.
B. D. Wright and J. M. Linacre, “Reasonable Mean-Square Fit Values,” Rasch Measurement Transactions, 1994.
B. Sumintono and W. Widhiarso, Aplikasi Model RASCH Untuk Penelitian Ilmu-ilmu Sosial. Trim Kominkata, 2015.
T. G. Bond and C. M. Fox, “Applying the Rasch Model: Fundamental Measurement in the Human Sciences,” J. Educ. Meas., 2003, doi: 10.1111/j.1745-3984.2003.tb01103.x.
W. J. Boone, M. S. Yale, and J. R. Staver, Rasch analysis in the human sciences. 2014.
N. Rusmana, D. Suryana, H. S. Kurniasih, and N. Almigo, “The development of speaking Skill’s instrument in elementary school with rasch model analysis,” Univers. J. Educ. Res., vol. 8, no. 7, 2020, doi: 10.13189/ujer.2020.080702.
D. Jolliffe and D. P. Farrington, “Examining the relationship between low empathy and bullying,” Aggress. Behav., vol. 32, no. 6, pp. 540–550, 2006, doi: 10.1002/ab.20154.
P. L. Jackson, E. Brunet, A. N. Meltzoff, and J. Decety, “Empathy examined through the neural mechanisms involved in imagining how I feel versus how you feel pain,” Neuropsychologia, vol. 44, no. 5, pp. 752–761, 2006, doi: 10.1016/j.neuropsychologia.2005.07.015.
K. E. Smith, G. J. Norman, and J. Decety, “The complexity of empathy during medical school training: evidence for positive changes,” Med. Educ., vol. 51, no. 11, pp. 1146–1159, 2017, doi: 10.1111/medu.13398.
B. M. P. Cuff, S. J. Brown, L. Taylor, and D. J. Howat, “Empathy: A review of the concept,” Emot. Rev., vol. 8, no. 2, pp. 144–153, 2016, doi: 10.1177/1754073914558466.
A. Carré, N. Stefaniak, F. D’Ambrosio, L. Bensalah, and C. Besche-Richard, “The basic empathy scale in adults (BES-A): Factor structure of a revised form,” Psychol. Assess., vol. 25, no. 3, pp. 679–691, 2013, doi: 10.1037/a0032297.
J. M. Linacre, “WINSTEPS Rasch Measurement.” Winsteps.com, Chicago, IL, 2005.
G. Rasch, Probabilistic Models for Some Intelligence and Attainment tests. Nielsen & Lydiche, 1960.
T. G. Bond and C. M. Fox, Applying the rasch model: Fundamental measurement in the human sciences: Second edition. 2007.
G. Rasch, “On specific objectivity: An Attempt of Formalizing The Request for Generality and Validity of Scientific Statements,” Danish Yearb. Philos., vol. 14, pp. 58–94, 1977.
R. Schumacker and E. Smith, “Reliability: A rasch perspective,” Educ. Psychol. Meas., vol. 67, pp. 394–409, Jan. 2007.
A. B. A. Ilfiandra, Nadia Aulia Nahirah, Dodi Suryana, “Development and Validation Peaceful Classroom Scale : Rasch Model Analysis,” vol. 15, no. 4, pp. 497–514, 2022.
J. M. Linacre, “Fit diagnosis: infit outfit mean-square standardized,” winsteps.com, 2000.
Y.-S. Lee, J. Grossman, and A. Krishnan, “Cultural Relevance of Adult Attachment,” Educ. Psychol. Meas., vol. 68, no. 5, pp. 824–844, 2008, doi: 10.1177/0013164407313367.
G. Engelhard and S. A. Wind, Invariant Measurement with Raters and Rating Scales. 2019.
S. E. Embretson and S. P. Reise, Item Response Theory for Psychologists. Lawrence Erlbaum Associates Publishers, 2000.
L. Nur, L. A. Nurani, D. Suryana, and A. Ahmad, “Rasch model application on character development instrument for elementary school students,” Int. J. Learn. Teach. Educ. Res., vol. 19, no. 3, 2020, doi: 10.26803/ijlter.19.3.24.
L. Nur, A. Yulianto, D. Suryana, A. A. Malik, M. A. Al Ardha, and F. Hong, “An Analysis of the Distribution Map of Physical Education Learning Motivation through Rasch Modeling in Elementary School,” Int. J. Instr., vol. 15, no. 2, pp. 815–830, 2022, doi: 10.29333/iji.2022.15244a.
H. H. T. Liu and Y. S. Lee, “Measuring Self-Regulation in Second Language Learning: A Rasch Analysis,” SAGE Open, vol. 5, no. 3, 2015, doi: 10.1177/2158244015601717.
T. G. Bond and C. M. Fox, Applying The Rasch Model, Fundamentals Measurement in The Human Sciences, 3rd ed. New York: Routledge, 2015.
W. P. Fisher, “Rating Scale Instrument Quality Criteria,” Rasch Meas. Trans., 2007.
S. C. Keskin, “From what isn’t empathy to empathic learning process,” Procedia-Social Behav. Sci., vol. 116, pp. 4932–4938, 2014.
G. Gini, P. Albiero, B. Benelli, and G. Altoè, “Does empathy predict adolescents’ bullying and defending behavior?,” Aggress. Behav., vol. 33, no. 5, pp. 467–476, 2007, doi: 10.1002/ab.20204.
V. F. Anthony and Z. Dan, “Basic empathy: Developing the concept of empathy from the ground up,” Int. J. Nurs. Stud., 2020.
M. Stone, The Illustrated Slave: Empathy, Graphic Narrative, and the Visual Culture of the Transatlantic Abolition Movement, 1800-1852. JSTOR, 2018.
Á. Orosa-Duarte, R. Mediavilla, A. Muñoz-Sanjose, and ..., “Mindfulness-based mobile app reduces anxiety and increases self-compassion in healthcare students: a randomised controlled trial,” Med. …, 2021, doi: 10.1080/0142159X.2021.1887835.
J. G. Culhane, Sticks and Stones: Defeating the Culture of Bullying and Rediscovering the Power of Character and Empathy, by Emily Bazelon. HeinOnline, 2012.
