The development and validation of the questionnaire consisted of two main steps. The first step was item development, in which a thorough review of the literature was conducted to identify available resources that measure knowledge, attitude, and practice of TMC and to find relevant items in previously published surveys. The second step was psychometric validation, comprising (i) content validity using the item-level content validity index (I-CVI) technique, (ii) face validity, (iii) test-retest reliability in 57 Indonesian male respondents in South Australia, with the two administrations 14 days apart, and (iv) exploratory factor analysis (EFA) to explore the pattern of domains, which also served as a pilot study involving 140 participants in West Timor (Fig. 1). Data were analyzed using SPSS version 29 (SPSS, Chicago, USA).
Stage I: Item development
A literature search was conducted to discover available resources on knowledge, attitude, and practice and to identify relevant items in existing questionnaires on TMC. Item development was based on a systematic review of TMC and HIV risk factors in men [36], items identified from the existing literature, and the male circumcision services quality assessment tool kits developed by the WHO [43]. The search for knowledge and attitude items used the keywords “knowledge”, “attitude”, “questionnaire”, “circumcision”, “HIV” and “traditional male circumcision”, which have been used in previous studies [36, 44]. The search covered English-language databases including PubMed, CINAHL, Scopus, and Medline. The researcher (GAA) identified questions measuring knowledge, attitude, and practice in other surveys reported in the literature, including the WHO tool kits, and compiled a list of items. The list was then sent to the other authors (NKF, HAG, PRW) for assessment. Some items were modified because they were judged to be unclear, possibly because they had originally been developed for use in other settings, such as countries in Africa. Based on the feedback from the other authors, the researcher (GAA) iteratively revised the items and adapted them to the context of the study setting. The revised version was discussed further among the authors before content validation by the expert panel.
Stage II: Psychometric validation
Content validity
Content validity refers to the assessment of content relevance and content representativeness, i.e., whether the items generated assess knowledge of and attitudes towards TMC and HIV transmission risk in the target population. It also provides preliminary evidence of construct validity. This study employed the most widely used method, the content validity index (CVI), which is based on the level of agreement among expert panel members [45, 46]. The item-level content validity index (I-CVI) is the number of panel members who rate an item as relevant divided by the total number of panel members. The scale-level content validity index average (S-CVI/Ave) is the sum of the I-CVIs divided by the total number of items. A panel of between 5 and 10 members is suggested for the CVI [46]. An item with a CVI value above 0.79 is considered relevant, a value between 0.70 and 0.79 indicates that the item requires some revision, and a value below 0.70 indicates that the item should be eliminated [46].
To evaluate content validity, we first invited 8 panel members who had published on male circumcision. They were randomly selected on the basis of their publications identified via Google Scholar. Of the 8, 1 was from a university in Australia, 3 were from universities in Africa, 1 was from a university in Canada, 2 were from universities in the US, and 1 was from a university in Switzerland. Two of them provided feedback in the first round. A second round of invitations was issued two months after the first, with the aim of reaching at least 6 panel members for the CVI. A further 5 panel members, 4 from the US and 1 from Denmark, were invited, of whom 1 provided feedback by the deadline. All panel members were asked to evaluate each item’s relevance, clarity, and simplicity. In total, therefore, 3 panel members provided feedback for the CVI, and all the authors (GAA, NKF, HAS, PRW) agreed to proceed with these 3 panel members.
The panel members rated the relevance of each item on a 5-point Likert scale: 1 = highly irrelevant, 2 = not relevant, 3 = neither irrelevant nor relevant, 4 = relevant, and 5 = highly relevant. To calculate the CVI, a rating of 4 or 5 was coded as 1 (relevant) and a rating below 4 was coded as 0 (not relevant). The I-CVI was the proportion of agreement on relevance, i.e., the number of experts giving a rating of 4 or 5 divided by the total number of experts. An item proceeded to the next validation step if the majority of the experts rated it 4 or 5 (relevant) and did not proceed if the majority rated it below 4 (irrelevant).
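As an illustration only, the CVI calculation described above can be sketched in Python. The analysis in this study was run in SPSS; the function names and the ratings below are hypothetical and are not study data.

```python
from typing import Dict, List

RELEVANT_MIN = 4  # ratings of 4 or 5 are coded 1 (relevant), as described in the text


def item_cvi(ratings: List[int]) -> float:
    """I-CVI: experts rating the item 4 or 5, divided by all experts."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(1 for r in ratings if r >= RELEVANT_MIN) / len(ratings)


def scale_cvi_average(i_cvis: List[float]) -> float:
    """S-CVI/Ave: sum of the I-CVIs divided by the number of items."""
    if not i_cvis:
        raise ValueError("at least one item is required")
    return sum(i_cvis) / len(i_cvis)


def classify_by_cutoff(i_cvi: float) -> str:
    """Cut-offs quoted in the text: > 0.79 relevant, 0.70-0.79 revise, < 0.70 eliminate."""
    if i_cvi > 0.79:
        return "relevant"
    if i_cvi >= 0.70:
        return "revise"
    return "eliminate"


def majority_relevant(ratings: List[int]) -> bool:
    """Majority rule described in the text: more than half of the experts rate 4 or 5."""
    return sum(1 for r in ratings if r >= RELEVANT_MIN) > len(ratings) / 2


# Hypothetical ratings from 3 panel members (illustrative, not study data).
ratings_by_item: Dict[str, List[int]] = {
    "item_1": [5, 4, 4],
    "item_2": [4, 3, 5],
    "item_3": [2, 3, 4],
}
i_cvis = {name: item_cvi(r) for name, r in ratings_by_item.items()}
s_cvi_ave = scale_cvi_average(list(i_cvis.values()))
```

With three raters an I-CVI can only take the values 0, 1/3, 2/3, or 1, which is why the sketch keeps the cut-off classification and the majority rule as separate functions.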
Face validity
Face validity concerns the response process of the target participants [47]. This study used the Face Validity Index (FVI) [48]. Participants were asked to evaluate the items with respect to problems, ambiguity, appropriate terminology and grammar, and understandability using a 5-point Likert scale. Scores of 4–5 were categorized as very clear and understandable and recorded as 1. All other scores were recorded as 0, indicating that the item was not understood. The recommended FVI for 10 participants is 0.83 [47]. Items with ambiguous meanings or interpretations detected during face validation were deleted or reformulated. In this phase, the items retained after content validation were translated into Indonesian. Ten men from different backgrounds in Indonesia were asked to evaluate the translated items, because the validated questionnaire will be used to evaluate the knowledge and attitudes of men, particularly in communities practicing TMC in Indonesia.
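The FVI coding can be sketched in the same way. This is a minimal illustration that assumes an item is flagged for deletion or reformulation when its FVI falls below the recommended 0.83; the function names and scores are hypothetical.

```python
from typing import List

FVI_CUTOFF_10_RATERS = 0.83  # recommended FVI for 10 participants [47]


def item_fvi(scores: List[int]) -> float:
    """Item-level FVI: share of participants scoring the item 4 or 5."""
    if not scores:
        raise ValueError("at least one participant score is required")
    coded = [1 if s >= 4 else 0 for s in scores]  # 4-5 -> 1, all other scores -> 0
    return sum(coded) / len(coded)


def needs_rewording(scores: List[int], cutoff: float = FVI_CUTOFF_10_RATERS) -> bool:
    """Assumption: flag the item when its FVI is below the recommended cut-off."""
    return item_fvi(scores) < cutoff


# Hypothetical clarity scores from 10 participants (illustrative, not study data).
clear_item = [5, 5, 4, 4, 5, 4, 3, 5, 4, 4]
unclear_item = [5, 4, 3, 3, 4, 5, 2, 4, 4, 5]
```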
Test-retest reliability
Test-retest reliability is measured by administering the same instrument to the same sample at two different times, on the assumption that the construct under study does not change significantly between the two time points [49, 50]. A high correlation between the scores at the two time points indicates that the instrument is stable over time [49]. The shorter the interval, the higher the correlation between the two tests; the longer the interval, the lower the correlation [50]. Test-retest reliability was assessed using intra-class correlation coefficients (ICCs). An ICC below 0.5 indicated poor reliability, 0.5 to 0.75 moderate reliability, 0.75 to 0.9 good reliability, and greater than 0.90 excellent reliability [51, 52].
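The ICC form used is not specified here, so the following sketch rests on an assumption: it computes ICC(2,1) (two-way random effects, absolute agreement, single measurement) from the two-way ANOVA mean squares and then applies the interpretation bands quoted above. The function names and the handling of values that fall exactly on a boundary are illustrative choices.

```python
from typing import List


def icc_2_1(scores: List[List[float]]) -> float:
    """ICC(2,1) for a table with one row per participant and one column per occasion."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 participants and 2 occasions, no missing cells")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: participants, occasions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc: float) -> str:
    """Bands quoted in the text; boundary handling is an assumption of this sketch."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"
```

Because this form measures absolute agreement, a constant shift between test and retest lowers the coefficient even when the ranking of participants is unchanged.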
In this study, test-retest reliability was assessed with Indonesian men living in South Australia. The researcher was based in South Australia while pursuing his PhD, of which this questionnaire validation was a part, and therefore chose to conduct the test-retest with Indonesians in South Australia before travelling to Indonesia for the field study. This allowed the researcher to focus on the survey during the field study. In addition, because the questionnaire will be used in the Indonesian context, we focused on Indonesian participants whose communities or cultures practice TMC. A Google Forms questionnaire was distributed to 60 Indonesian men aged 18 to 49 years living in South Australia, and completion of the questionnaire was the criterion for inclusion. The first test was distributed on 16th November 2024, and the second test was distributed two weeks after the first. Of the 60 participants, 57 completed the second survey; the three incomplete questionnaires were excluded. A non-parametric test was applied because the scale was not continuous: the Wilcoxon signed-rank test was used to test for a significant difference between test and retest scores.
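A paired Wilcoxon comparison of test and retest scores can be sketched as follows. The sketch assumes the signed-rank form of the test, drops zero differences, and uses a normal approximation with a tie correction; statistical packages may instead report exact p values for small samples. The function name is hypothetical.

```python
import math
from typing import List, Tuple


def wilcoxon_signed_rank(test: List[float], retest: List[float]) -> Tuple[float, float, float]:
    """Return (W+, W-, two-sided p) for paired test and retest scores."""
    if len(test) != len(retest):
        raise ValueError("test and retest must have the same length")
    diffs = [b - a for a, b in zip(test, retest) if b != a]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = average_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation with tie-corrected variance.
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean_w) / math.sqrt(var_w)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_two_sided
```

A large p value is consistent with no systematic shift between the two administrations, which is the pattern expected of a stable instrument.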
The participants were contacted after ethics approval had been obtained from the Research Ethics Committee at Torrens University Adelaide. Participants completed a consent form in the Google form before taking part in the study. All data were anonymous to maintain confidentiality.
Exploratory factor analysis
Factor analysis is a statistical method commonly used during instrument development to identify factors and to summarise the items into a small number of factors [53]. The aim of using exploratory factor analysis (EFA) was to discover the correlations between items and to examine the internal reliability of the instrument. The researcher applied EFA to check that the items within the knowledge domain, and likewise those within the attitude domain, reflect the same underlying construct. Before conducting EFA, we confirmed that the correlation matrix was factorable, using Bartlett’s test of sphericity to check that the correlation matrix was not random and the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy, which was required to be above 0.5 for all variables [54–56]. EFA is a factor analysis method used to examine the relationships among variables without specifying a particular hypothetical model [57]. It is also used to explain the links among the observed factors with or without underlying theoretical processes in mind [58, 59].
The returned questionnaires were entered into SPSS. EFA was used to assess the multidimensional structure of the questions and statements. The p value from the Kolmogorov–Smirnov test was used to check the normality of the data. The suggested sample size for EFA is between 2 and 20 respondents per item [60]. The sample size was calculated using the sample-to-variable ratio (N:p, where N is the number of participants and p is the number of variables or items) recommended for questionnaire validation studies. For this study, a ratio of 10:1 (10 participants per item) was used. Cronbach’s alpha was used to measure internal consistency, and a value within the range of 0.60 to 0.95 was considered acceptable for the questionnaire [61]. Items with a factor loading of 0.5 or above were considered acceptable [62, 63].
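The sample-size rule and the internal consistency check can be illustrated with a short Python sketch. It uses the standard formula for Cronbach’s alpha with sample variances; the function names and the response matrix are hypothetical and are not study data.

```python
from statistics import variance
from typing import List


def required_sample_size(n_items: int, participants_per_item: int = 10) -> int:
    """N:p rule from the text: 10 participants per item by default."""
    return n_items * participants_per_item


def cronbach_alpha(responses: List[List[float]]) -> float:
    """Cronbach's alpha for a table with one row per participant and one column per item."""
    if len(responses) < 2:
        raise ValueError("at least 2 participants are required")
    n_items = len(responses[0])
    if n_items < 2 or any(len(row) != n_items for row in responses):
        raise ValueError("need at least 2 items and no missing responses")
    # Sample variance of each item and of the participants' total scores.
    item_vars = [variance([float(row[j]) for row in responses]) for j in range(n_items)]
    total_var = variance([float(sum(row)) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


def alpha_in_accepted_range(alpha: float, low: float = 0.60, high: float = 0.95) -> bool:
    """Range quoted in the text: 0.60 to 0.95 [61]."""
    return low <= alpha <= high


# Hypothetical responses: 5 participants x 3 items (illustrative, not study data).
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 4],
    [2, 3, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
```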
Data for the pilot test were collected from January to April 2024 using a face-to-face survey. The inclusion criteria were men aged 18–49 years living in West Timor, regardless of their circumcision status. The questionnaire was hosted on Google Forms and the responses were exported to Excel spreadsheets for analysis. Consent was obtained via the Google form, and each participant was informed that participation was voluntary and that they could withdraw before or after the survey. By the completion of the validation, no participant had asked to withdraw.