In this study, we developed an 18-item original ESS from the PRO-CTCAE and verified the validity and reliability of the ESS in cohort one and the responsiveness of the ESS in cohort two.
The validity and reliability of the ESS were shown in cohort one. The findings were generally favorable and within expectations. Emojis are widely used in social media, which may make end users more receptive to these items as a fun and practical tool. Our study included many patients who used smartphones and emojis, reflecting today's society wherein these have become a part of life. In this regard, future use of emojis between healthcare providers and patients is easy to imagine, particularly if communicating by a conventional method is difficult. Based on our findings, most patients indicated that the ESS was easier to answer than the PRO-CTCAE. Further, they did not feel embarrassment or discomfort while answering the ESS, and the response time was shorter than that for the PRO-CTCAE. Lastly, some patients indicated that they enjoyed answering the questions on the ESS.
Similar to previous reports [17, 22], we received a few comments regarding comprehensibility, particularly difficulty understanding differences in severity. Lee et al. developed a 5-step (1: very happy, 2: happy, 3: fair/average, 4: sad, and 5: very sad) evaluation tool with a smiley face scale for patients with ischemic stroke and noticed that older patients had difficulty telling the difference between the very happy and happy faces, as well as between the very sad and sad faces [17]. Thus, they reduced the number of expressions to 3. Since we anticipated this problem during the development phase of the ESS, we added some text to the emojis. In the next version, we may need to revise the expressions to make them more easily understandable.
We also received a few comments regarding comprehensiveness. In question 3, which assesses content validity, most patients answered that they had no other symptoms to report. However, a small number of patients stated that they experienced additional symptoms, including “pain in the surgical wound,” “itchiness,” and “headache.” In the next version, some informative items will be added to the ESS. “Pain in the surgical wound,” a possible complication of total mastectomy, is referred to as post-mastectomy pain syndrome (PMPS). In a previous report, 50% of survey respondents stated that they were still experiencing PMPS at a mean of 9 years after surgery [23]. “Headache” is also a potentially significant symptom that may be suggestive of distant metastasis. Covering all symptoms, including rare symptoms, is not possible, but we will consider adding new items based on their frequency of occurrence and clinical importance.
For criterion validity, only “Decreased appetite”, one of 18 items (5.6%), was below the standard correlation value of 0.41. We noticed a discrepancy in this item between the PRO-CTCAE and ESS questionnaires. The PRO-CTCAE examines the degree of appetite loss by asking “In the last 7 days, what was the SEVERITY of your DECREASED APPETITE at its WORST?”, whereas the ESS asks “How was your appetite in the past week?” This may have led to the poor agreement between the two scales. In the next version of the ESS, we plan to update the question for “Decreased appetite.”
We confirmed that the test-retest reliability of the ESS was favorable. Only two items, “Vomiting” and “Diarrhea,” showed weak agreement. For “Vomiting,” there was a significant difference in κ coefficient between “F: Frequency” and “S: Severity” in the PRO-CTCAE. “Vomiting” is generally assessed by frequency rather than intensity. As such, compared with “Severity,” “Frequency” may have better agreement with the true outcomes. In the ESS, however, “Vomiting” was expressed as “Severity” and not “Frequency.” Compared with the PRO-CTCAE, the ESS had a much worse κ coefficient for “Frequency” but a similar κ coefficient for “Severity.” Further, “Vomiting” in the ESS had a higher κ coefficient than “Severity” in the PRO-CTCAE. Lastly, “Diarrhea” had poor agreement between the PRO-CTCAE and ESS, which may also be due to the same “Frequency” and “Severity” difference seen for “Vomiting.” These findings, which indicate that emojis were not good indicators of frequency, suggest a need for revisions such as using more suitable text and emoji stickers.
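As a reading aid, the κ coefficient used for test-retest agreement can be sketched as follows. This is a minimal illustration that assumes an unweighted Cohen's κ computed on paired grades from two administrations of the same item; the text does not state whether a weighted or unweighted κ was used, and the function name `cohens_kappa` and the example grades are hypothetical.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two paired lists of categorical grades.

    Assumption: unweighted kappa; the study may have used a weighted variant.
    """
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    n = len(first)
    # Observed agreement: share of patients giving the same grade both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement, from the marginal grade distribution of each administration.
    count_first, count_second = Counter(first), Counter(second)
    expected = sum(count_first[g] * count_second[g] for g in count_first) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when every response is the same grade")
    return (observed - expected) / (1 - expected)


# Hypothetical test and retest grades (0-2) for one item in eight patients.
test_grades = [0, 0, 1, 1, 2, 2, 0, 1]
retest_grades = [0, 0, 1, 2, 2, 2, 0, 0]
kappa = cohens_kappa(test_grades, retest_grades)
```

Because κ corrects for chance agreement, an item on which nearly all patients report the same grade can show a low κ despite high raw agreement, which is worth keeping in mind for infrequent symptoms.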
In cohort two, we treated categorical values as continuous values and assessed associations between the PRO-CTCAE and ESS. We found a generally favorable responsiveness for the ESS between baseline and post-administration. Since the expected side effects vary widely depending on the treatment regimen, each side effect was evaluated on a regimen-by-regimen basis. In this analysis, we assessed the correlation of changes between two time points. To validate responsiveness, we did not use the effect size [(mean at the second time point − mean at the first time point)/SD at the first time point] or the standardized response mean [(mean at the second time point − mean at the first time point)/SD of the change] because we assumed that the symptoms in this cohort were relatively stable with little variability and a small standard deviation. However, a question remains on the appropriateness of replacing categorical variables with continuous variables; i.e., whether each has a guaranteed equal priority [15]. Although some authors perceive responsiveness as the most important characteristic of an evaluative tool, the proper way to assess responsiveness is not apparent [24]. Terwee et al. concluded that different measures of responsiveness lead to different conclusions because they have different objectives [24]. Further discussion is needed to determine the most appropriate method to validate responsiveness.
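The three responsiveness measures discussed above can be compared in a short sketch. It assumes grades are treated as continuous scores, uses the sample standard deviation, and uses a Pearson correlation for the correlation of changes; the type of correlation is not stated in the text, so that choice, the function names, and the example scores are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """(mean at second time point - mean at first time point) / SD at first time point."""
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean of the paired changes divided by the SD of those changes."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


# Hypothetical scores for one symptom in four patients.
ess_baseline, ess_follow_up = [0, 1, 1, 2], [1, 2, 1, 3]
pro_baseline, pro_follow_up = [0, 1, 1, 2], [1, 3, 1, 3]

ess_change = [b - a for a, b in zip(ess_baseline, ess_follow_up)]
pro_change = [b - a for a, b in zip(pro_baseline, pro_follow_up)]

es = effect_size(ess_baseline, ess_follow_up)
srm = standardized_response_mean(ess_baseline, ess_follow_up)
r_change = pearson_r(ess_change, pro_change)
```

Both the effect size and the standardized response mean divide by a standard deviation, so they become unstable when scores or changes vary little, whereas the correlation of changes asks only whether the two instruments move together.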
Without considering dropouts, an estimated sample size of 14 patients was needed for each treatment group in cohort two. However, the relatively small sample size of the paclitaxel group (n = 17) requires a comment. In cohort two, patients were registered from December 2019 to October 2020. At that time, COVID-19 began to spread in Japan, and a reduced number of hospital visits was necessary to control the outbreak. Therefore, many physicians chose to administer docetaxel every 3 weeks for 4 doses rather than paclitaxel every week for 12 doses. Due to COVID-19, extending the enrollment period would not have increased the number of enrolled cases. Nevertheless, the study was completed as planned.
This study had several limitations. First, it was limited to female Japanese patients with breast cancer. Many factors, such as gender, age, type of cancer and disease, religion, cultural background, and social media platform, affect individual use of emojis. As such, each situation may need a different ESS. Second, cohort one had a small sample size. We estimated that 100 cases were needed in this cohort, but we were able to analyze only 70 cases. This small sample size may have introduced bias. Third, as mentioned above, the appropriate approach to validation of responsiveness is uncertain.
In conclusion, we developed an original ESS from the PRO-CTCAE for patients with breast cancer and confirmed the validity, reliability, and responsiveness of the ESS. The next version of the ESS requires clarification of differences in severity and improvement of the item “Decreased appetite.” Development and validation of the ESS using ePRO are also needed.