It is increasingly recognized that inadequate HL is associated with poor health-related knowledge and comprehension and, as a result, with adverse health outcomes [39]. Mounting evidence indicates that general HL is a major individual-level factor affecting a person’s health status [40]. The availability of reliable instruments to measure HL has contributed to the rising awareness of the impact of HL on individuals’ health [22, 26, 41, 42]. Most of these instruments were developed and validated in English and, because it is difficult to develop and validate new instruments de novo in other languages, they are usually translated for use in other languages and cultures. However, it is a challenge to adapt such instruments into a culturally relevant and comprehensible form while maintaining the meaning and intent of the original items [33].
In this study we report on the translation of the NVS instrument into Arabic and its validation, following a process described in international guidelines [33, 35, 36]. CA was 0.58, which is not high for an index of the internal consistency of a test. However, CA only reflects the inter-relatedness of the items within a test [43] (if the items are correlated with each other, the value of CA increases). Because the first four questions are meant to measure numerical skills while the fifth and sixth questions are meant to measure comprehension, such a low CA is not surprising. Furthermore, CA is affected by the length of the test: if the test is very short, like NVS, the value of CA is reduced. In fact, a high CA has not been reported for NVS in any study: CA was 0.69 among 85 participants in Iraq [24], 0.70 in Turkey [28], 0.74 in the UK [17], 0.69 in Spain [26] and 0.76 for NVS-D in the Netherlands [27]. Although a CA of ≥ 0.70 is deemed satisfactory [44], this threshold has been criticized repeatedly [43]. A test with CA = 0.70 still carries a large amount of error, which may exceed 50% of the results of the test.
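The two points above, that CA depends on item inter-relatedness and on test length, can be illustrated with a minimal sketch. It assumes that CA denotes Cronbach's alpha in its usual form and uses the Spearman-Brown formula, which is not part of our analysis and holds only if the added items behave like the existing ones. The function names and the toy scores are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(scores[0])  # number of items
    item_variances = [variance(item) for item in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test made length_factor times as long."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical 0/1 item scores (NOT study data): 4 respondents, 3 items.
# The items are identical, so they are perfectly inter-related.
toy_scores = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha_toy = cronbach_alpha(toy_scores)

# Reported CA of 0.58 projected to a test twice as long (12 items).
projected = spearman_brown(0.58, 2)
print(round(alpha_toy, 2), round(projected, 2))
```

Under these assumptions, doubling the number of items would raise a CA of 0.58 to roughly 0.73, which is consistent with the argument that the brevity of NVS depresses its CA.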
We agree with previous studies that the fifth question, “is it safe for you to eat this ice cream?”, is particularly problematic [45]. This question requires a dichotomous answer (“yes” or “no”), which can easily be answered correctly by guessing. Salgado et al. [45] showed that most of those who answered the question correctly failed to give the reason for their answer, which is requested in the next question. In our study, the fifth question was the one most often answered correctly, but approximately one third of those who answered it correctly failed to explain why in the next question. In Turkey, the fifth question was also the most correctly answered question [28]. As mentioned above, unlike the first four questions, the fifth and sixth questions are meant to evaluate comprehension, and this also undermines the reliability of NVS.
Most previous studies did not report test-retest reliability (external reliability). We assessed external reliability with a test-retest method, using the intraclass correlation coefficient (ICC) from a two-way mixed effects model. The ICC was 0.61, a value that is deemed to be low. Usually, an ICC of > 0.80 is taken to indicate excellent reliability [46].
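As an illustration of the test-retest calculation, the sketch below computes a two-way mixed effects ICC from paired scores. It assumes the single-measurement, consistency form, often written ICC(3,1); the form actually used in our analysis is not specified here. The function name and the toy scores are hypothetical and are not study data.

```python
def icc_consistency(ratings):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    ratings: one row per participant, one column per session (test, retest).
    """
    n = len(ratings)     # participants
    k = len(ratings[0])  # sessions
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical test and retest scores for 4 participants (NOT study data).
toy = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(icc_consistency(toy), 2))
```

In this form, a constant shift between sessions (every participant scoring one point higher at retest) does not lower the ICC, whereas changes in the ranking of participants do.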
We found no association between the NVS-Ar score and the educational level of the parents/guardians. This finding is in agreement with previous studies that showed no link between NVS score and educational level [17, 24, 45], although some studies have reported a link [28]. In the UK, NVS score was reported to have only a weak correlation with educational attainment [17]. Nevertheless, it is known that educational achievement is not a good predictor of literacy skills, as many individuals have literacy skills well below what might be expected from their level of education [17].
Previous studies showed that the HL of parents of children with diabetes is related to the glycemic control of their children [37]; we therefore investigated the functionality of NVS-Ar by assessing the association between the NVS-Ar score of the parents/guardians and the HbA1C of their children with T1D. Our results show no correlation between NVS-Ar score and the HbA1C of the diabetic children. In fact, there was a significant inverse association between the HL of the parents as measured by NVS-Ar (score ≥ 4) and optimal glycemic control in their children. Our results are not surprising, as NVS was previously shown to have limited utility in predicting medication adherence among Portuguese adults [45], although this was attributed to a floor effect (a large number of participants had very low scores). In our study, despite the high average score (median 4.0), i.e. a limited floor effect, NVS-Ar showed limited ability to predict HbA1C, which casts doubt on the predictive value of NVS-Ar in our setting. Al-Jumaili et al. [24] suggested that NVS might not be applicable as an HL test for people in Iraq, who are not accustomed to reading product labels in their daily life. In the Netherlands, NVS was found to be problematic and a new NVS for Dutch people was developed [27].
NVS is attractive because it is brief and covers both reading and numerical skills. However, its drawbacks include low reliability and doubts about its validity. The validity of NVS has generally been investigated by running a parallel HL test such as TOFLA, an approach that does not really show the functionality of the tool. Our findings suggest that NVS is too short to provide reliable results or to serve as a predictive tool for functional HL, at least in our setting.
HL is an increasingly researched area of health care in the West, whereas there is a severe lack of such research in the Arab world. Our study highlights the need for future research on, and international comparison of, HL in Arabic-speaking populations. It presents a rigorous cross-cultural adaptation process for the measurement tool that addressed potential differences in the cultural interpretation and use of language. However, the present study has some limitations. Firstly, because there is no “gold standard” Arabic tool to test HL among adults, the process of validating the Arabic version of the NVS is incomplete. Secondly, to test the feasibility and reliability of the Arabic version, we selected adult caregivers of children under follow-up at the diabetes clinics. These participants may have different HL skills from the general adult population in the country, which might limit the generalizability of our findings. Testing the Arabic version of the NVS in other adult populations could further establish the feasibility and reliability of the instrument.
In conclusion, our findings suggest that NVS is unlikely to be a predictive tool for functional HL in Arabic settings and that there is a need either to properly translate and validate other tools such as TOFLA or to develop a reliable tool de novo. Such work is a prerequisite for initiatives that aim to improve HL in Arab countries.