Citation analysis was performed using CitNetExplorer software. The findings of this section are based on clustering publications according to their citation relationships and analyzing the resulting clustering solutions at the level of individual publications.
In total, 3826 publications were involved in 3661 citation links during the study period. Table 2 provides an overview of citation links in three block periods. The highest numbers of citation links and publications are observed in the third period (2010-2020).
Table 2. Evolution of citation links
| Block Period | Publications | Citation Links | Cl/P* |
|---|---|---|---|
| 1960-2000 | 462 | 113 | 0.24 |
| 2000-2010 | 1388 | 932 | 0.67 |
| 2010-2020 | 2211 | 1157 | 0.52 |
* Citation Links/Publications
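The Cl/P column can be recomputed directly from the two count columns. The following minimal sketch uses the values reported in Table 2; the variable and function names are illustrative and not part of the original analysis.

```python
# Publications and citation links per block period, as reported in Table 2.
table2 = {
    "1960-2000": (462, 113),
    "2000-2010": (1388, 932),
    "2010-2020": (2211, 1157),
}


def links_per_publication(publications, citation_links):
    """Return the Cl/P ratio: citation links divided by publications."""
    return citation_links / publications


# Round to two decimals, matching the precision used in Table 2.
ratios = {
    period: round(links_per_publication(pubs, links), 2)
    for period, (pubs, links) in table2.items()
}

for period, ratio in ratios.items():
    print(f"{period}: Cl/P = {ratio:.2f}")
```

Run on the Table 2 counts, this gives the highest ratio for 2000-2010, even though 2010-2020 has more publications and more citation links in absolute terms.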
The chronological citation network is shown in Figure 2. CitNetExplorer was used to visualize the citation network of documents; by default, it labels each document with the first author's last name. In this figure, the circles represent the documents, and the curved lines represent the citation relationships between them (Van Eck and Waltman 2011).
The map above shows that the main and most cited articles fall into two main themes. In the first theme (left), the main content of the documents was effective methods of organizing information. Most of the articles in this category deal with mining methods, ontologies and their application, and the indexing of resources. Over time, the topics of the articles move from mining and its methods for retrieving information to ontologies and their application to the meaning of information. In this regard, Wilbur and Yang's article is considered a foundational article. The authors provided a new information-theoretical interpretation of term strength, reviewed some of its uses in focusing the processing of documents for IR, and described new results obtained in document categorization (Wilbur and Yang 1996).
In the second theme, the main content of the documents was the application and performance of IR systems in question answering and the analysis of the information behaviors of health professionals. Over time, the thematic content of the documents has shifted from search, text browsing, and search tools to topics such as physicians' clinical answers and information-seeking behaviors. The article by Hersh and Hickam is considered a foundational article in this theme; it discusses the use of IR systems by physicians to answer clinical questions, as well as physician information behavior. Its purpose is to provide a conceptual framework and to apply the results of previous studies to this framework (Hersh and Hickam 1998).
Thematic clustering of documents based on the CitNetExplorer analysis is shown in Figure 3. The analysis categorized the documents into 4 thematic clusters, each containing documents that are strongly related to each other. The results showed that a total of 136 documents were placed in these core clusters, distributed as follows: 48 (35%) in group 1 (blue), 36 (26%) in group 2 (green), 35 (25%) in group 3 (red), and 10 (7%) in group 4 (orange).
The theme of the documents in the blue cluster was “the analysis of physicians' information behavior, IR systems, EBM, and CDSSs”, and that of the green cluster was “EHR and Medical Documents”. The theme of the red cluster was “text mining and indexing”, and that of the orange cluster was “question answering systems”.
As the chart above also shows, most of the core publications were published between 2000 and 2010, which indicates the significant impact of the scientific activities of this decade on the scientific production of the next decade. In other words, most of the core scientific products that created the infrastructure for other IR research in medical sciences were published from 2000 to 2010. As shown in Table 2, the ratio of citation links to publications was also highest in this period (0.67).
Topic networks were built from co-occurrence networks and term maps using VOSviewer software. This representation shows the most important terms in the publications belonging to a cluster and the co-occurrence relationships among these terms. In this section, co-occurrence analysis of words is used to examine thematic trends in the field of IR in medical sciences.
One of the problems at this stage was the existence of different spellings, singular and plural forms, and synonyms of the same concepts when drawing lexical maps. Therefore, to unify the concepts and prevent the dispersion of identical concepts, the researchers first designed a specialized thesaurus for IR in medical sciences to be used in the VOSviewer analysis. Support for such a thesaurus is one of the advantages of VOSviewer. Figure 4 shows a picture of the designed thesaurus used in analyzing the data with VOSviewer.
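The unification step can be illustrated with a short sketch. It assumes a tab-separated thesaurus with a "label" column and a "replace by" column, which is the layout VOSviewer expects for thesaurus files. The entries below are hypothetical examples; the authors' actual thesaurus is not reproduced in the text, and the function names are illustrative.

```python
import csv
import io

# Hypothetical thesaurus entries for illustration only.
# Tab-separated, with the two columns "label" and "replace by".
THESAURUS_TSV = (
    "label\treplace by\n"
    "ontology\tontologies\n"
    "ehr\telectronic health record\n"
    "electronic health records\telectronic health record\n"
    "nlp\tnatural language processing\n"
)


def load_thesaurus(tsv_text):
    """Parse the tab-separated text into a {variant: preferred term} dict."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["label"].strip().lower(): row["replace by"].strip().lower()
        for row in reader
    }


def normalize_keywords(keywords, thesaurus):
    """Map each keyword to its preferred form and drop duplicates, keeping order."""
    normalized = []
    for keyword in keywords:
        key = keyword.strip().lower()
        preferred = thesaurus.get(key, key)
        if preferred not in normalized:
            normalized.append(preferred)
    return normalized


thesaurus = load_thesaurus(THESAURUS_TSV)
print(normalize_keywords(["Ontology", "ontologies", "EHR", "NLP"], thesaurus))
```

Merging variants before counting ensures that singular and plural forms or abbreviations contribute to a single node on the map instead of several small ones.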
The results of this section showed that the documents examined had a total of 10783 keywords. In addition to the author keywords, the Web of Science database provides "KeyWords Plus" terms to give a more accurate overview of the content of articles. Therefore, based on the researchers' experience, both options were selected as the sources of keywords for deeper analysis. For the meaningful drawing of knowledge maps, the minimum number of occurrences was set to 20, and under this condition 116 keywords were selected as frequent keywords. Then, to increase accuracy, irrelevant keywords such as "medicine" were removed from the selected keywords. In the end, 80 keywords remained. In all maps, the weight of each word is based on its frequency of occurrence.
The placement of keywords in clusters and the distance between nodes are based on the co-occurrence of keywords. The size of each circle in a cluster indicates the frequency of that word (Mohammadi et al. 2019; Rezaei and Mohammadi 2018).
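The selection of frequent keywords can be sketched as a simple threshold on document counts followed by a manual exclusion list. The toy data, the threshold of 2, and the function name below are assumptions for illustration; the study itself used a threshold of 20 on the real keyword set.

```python
from collections import Counter


def frequent_keywords(documents, min_occurrences=20, exclude=()):
    """Keep keywords that appear in at least `min_occurrences` documents.

    `documents` is a list of keyword lists, one per publication.
    Keywords listed in `exclude` (judged irrelevant) are dropped afterwards.
    """
    counts = Counter()
    for keywords in documents:
        counts.update(set(keywords))  # count each keyword once per document
    excluded = set(exclude)
    return {
        keyword: n
        for keyword, n in counts.items()
        if n >= min_occurrences and keyword not in excluded
    }


# Toy data (hypothetical), with a low threshold so the effect is visible.
toy_documents = [
    ["medicine", "mesh", "umls"],
    ["medicine", "mesh"],
    ["mesh", "query expansion"],
    ["medicine", "umls", "umls"],
]
selected = frequent_keywords(toy_documents, min_occurrences=2, exclude={"medicine"})
print(selected)
```

In the toy data, "query expansion" falls below the threshold and "medicine" is removed as irrelevant, mirroring the two filtering steps described above.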
After drawing the clusters and examining the keywords, it was found that the analyzed documents fell into five themes: IR technologies and techniques (first cluster), information behaviors and CDSSs (second cluster), indexing and knowledge representation tools (third cluster), knowledge of searching for resources and topics related to databases (fourth cluster), and searching for information on the web (fifth cluster). The first and second clusters had the highest number of keywords, with 30 items each, followed by the third cluster with 10, the fourth with 7, and the fifth with 4 items.
In terms of all three indicators (links, total strength link, and keyword occurrence), the most important keywords in each cluster are as follows: in the first cluster, “Information storage and retrieval”, “IR system”, “Natural language processing”, and “Ontologies”; in the second cluster, “Knowledge”, “Models”, and “Electronic health record”; in the third cluster, “Query expansion”, “MeSH”, “UMLS”, and “Terminology”; and in the fourth cluster, “Bibliographic databases”, “Bibliometric”, “Databases”, and “Literature searching” (Figure 5).
Table 3 provides detailed information on the keywords in each cluster: the number of links of each keyword with other keywords (Link), the Total Strength Link, and the Keyword Occurrence. A link is a co-occurrence relationship between two keywords, and there can be a link between any pair of items. For a given keyword, Link is the number of other keywords with which it co-occurs, and Total Strength Link is the overall strength of all its links. Each link has a strength, expressed as a positive number: the higher the value, the stronger the link. The strength of a link indicates the number of documents in which the two terms occur together. Occurrence is the number of documents in which a keyword appears.
Based on these results, the first cluster, "IR technologies and techniques", had the highest Link (1176), Total Strength Link (7265), and Keyword Occurrence (3228) values. Regarding the Link index, the keywords of the fifth cluster, "web IR", had the lowest number of links (216). However, the keywords of the third cluster had the lowest Total Strength Link (1129) and Keyword Occurrence (420).
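The three indicators can be made concrete with a small sketch that derives them from per-document keyword lists. It assumes full counting, in which every co-occurrence in a document adds 1 to the link strength, as in the definition above. The toy documents and function name are hypothetical, and this is not VOSviewer's own implementation.

```python
from collections import Counter
from itertools import combinations


def keyword_indicators(documents):
    """Return {keyword: (links, total_link_strength, occurrences)}."""
    occurrences = Counter()
    pair_strength = Counter()
    for keywords in documents:
        unique = sorted(set(keywords))
        occurrences.update(unique)
        # Each unordered keyword pair in a document strengthens its link by 1.
        pair_strength.update(combinations(unique, 2))

    links = Counter()
    total_link_strength = Counter()
    for (first, second), strength in pair_strength.items():
        links[first] += 1
        links[second] += 1
        total_link_strength[first] += strength
        total_link_strength[second] += strength

    return {
        keyword: (links[keyword], total_link_strength[keyword], occurrences[keyword])
        for keyword in occurrences
    }


# Toy data (hypothetical): four documents with their keywords.
toy_documents = [
    ["mesh", "umls", "query expansion"],
    ["mesh", "umls"],
    ["mesh", "query expansion"],
    ["medline"],
]
indicators = keyword_indicators(toy_documents)
for keyword, values in sorted(indicators.items()):
    print(keyword, values)
```

In the toy data, "mesh" co-occurs with two other keywords (Link = 2) across four co-occurrences in total (total strength 4) and appears in three documents, while "medline" has an occurrence but no links.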
Table 3. Thematic clusters in IR in medical science and detailed information of keywords based on the three (3) attributes.
| Cluster number (color) | Keyword | Link | TSL* | KOc** |
|---|---|---|---|---|
| 1 (red) | Algorithms | 47 | 164 | 83 |
| | Annotation | 35 | 98 | 24 |
| | Architecture | 33 | 61 | 21 |
| | Big data | 21 | 45 | 29 |
| | Bioinformatics | 30 | 98 | 37 |
| | Biomedical literature | 28 | 71 | 22 |
| | Classification | 56 | 241 | 100 |
| | Content-based image retrieval | 16 | 32 | 23 |
| | Data mining | 39 | 134 | 60 |
| | Image retrieval | 38 | 118 | 45 |
| | Gene ontology | 32 | 74 | 22 |
| | Information extraction | 39 | 174 | 63 |
| | Information retrieval system | 76 | 723 | 279 |
| | Information storage and retrieval | 79 | 2845 | 1562 |
| | Integration | 30 | 63 | 20 |
| | Machine learning | 41 | 151 | 63 |
| | Natural language processing | 59 | 401 | 146 |
| | Networks | 54 | 171 | 59 |
| | Ontologies | 50 | 354 | 147 |
| | Patterns | 28 | 61 | 23 |
| | Recognition | 20 | 36 | 21 |
| | Resources | 49 | 103 | 31 |
| | Search engines | 47 | 116 | 40 |
| | Semantic web | 49 | 212 | 85 |
| | Similarity | 27 | 66 | 29 |
| | Text mining | 50 | 254 | 93 |
| | Text retrieval | 58 | 258 | 58 |
| | Tools | 45 | 141 | 43 |
| | Total | 1176 | 7265 | 3228 |
| 2 (green) | Access to information | 51 | 130 | 41 |
| | Behavior | 40 | 143 | 57 |
| | Clinical question | 44 | 130 | 35 |
| | Communication | 28 | 49 | 27 |
| | Decision making | 36 | 97 | 37 |
| | Decision support systems | 35 | 95 | 35 |
| | Design | 48 | 148 | 51 |
| | Education | 36 | 90 | 47 |
| | Electronic health record | 50 | 219 | 97 |
| | Framework | 53 | 147 | 74 |
| | Impact | 46 | 148 | 56 |
| | Informatics | 65 | 381 | 143 |
| | Information management | 29 | 81 | 29 |
| | Information seeking behavior | 24 | 53 | 20 |
| | Information systems | 34 | 97 | 37 |
| | Knowledge | 71 | 390 | 110 |
| | Management | 44 | 128 | 44 |
| | Medical records | 22 | 71 | 25 |
| | Memory | 8 | 14 | 22 |
| | Models | 61 | 261 | 109 |
| | Needs | 37 | 125 | 37 |
| | Patient care information | 29 | 103 | 24 |
| | Quality | 48 | 230 | 78 |
| | Question | 42 | 147 | 45 |
| | Relevance | 36 | 97 | 35 |
| | Seeking | 41 | 151 | 44 |
| | Support | 36 | 86 | 26 |
| | Technology | 36 | 86 | 32 |
| | Total | 1130 | 3897 | 1417 |
| 3 (blue) | Indexing and abstracting | 40 | 133 | 40 |
| | Controlled vocabularies | 35 | 81 | 26 |
| | Evaluation | 31 | 92 | 33 |
| | Language | 42 | 115 | 41 |
| | MeSH | 52 | 160 | 58 |
| | Performance | 37 | 86 | 40 |
| | Query expansion | 46 | 172 | 71 |
| | Terminology | 46 | 127 | 42 |
| | UMLS | 36 | 111 | 49 |
| | Vocabulary | 26 | 52 | 20 |
| | Total | 391 | 1129 | 420 |
| 4 (yellow) | Bibliographic databases | 38 | 202 | 65 |
| | Bibliometric | 25 | 49 | 20 |
| | Databases | 68 | 529 | 178 |
| | Literature searching | 14 | 62 | 23 |
| | Medline | 62 | 513 | 185 |
| | Search | 65 | 419 | 142 |
| | Strategies | 35 | 125 | 41 |
| | Total | 307 | 1899 | 654 |
| 5 (purple) | Consumer health information | 21 | 71 | 21 |
| | Information | 72 | 598 | 192 |
| | Internet | 62 | 567 | 206 |
| | Web | 61 | 406 | 141 |
| | Total | 216 | 1642 | 560 |
* Total Strength Link
** Keyword Occurrence
We used SciMAT to draw a thematic strategic diagram of the field of IR in medical sciences. After the data were entered into the software, 10530 keywords were retrieved. The difference from the VOSviewer count arises because SciMAT considers only the author keywords and not KeyWords Plus. We then cleaned the keywords by removing unrelated ones and merging synonyms. After this cleaning, 263 keywords remained for analysis.
Figure 6 shows stability measures over three consecutive periods. The circles represent the periods, and the number inside each circle indicates the number of keywords in that period. The horizontal arrow shows the number of keywords shared by two consecutive periods, with the similarity index between them in parentheses. The upper incoming arrow indicates the number of new keywords in a period, and the upper outgoing arrow indicates the number of keywords present in that period but not in the next (Cobo et al. 2011).
The results of this section showed that the number of keywords increased significantly over time: in the 2000-2010 period, it increased 2.38-fold compared to the period before 2000. Similarly, the number of keywords shared between consecutive periods increased from 94, between the period before 2000 and 2000-2010, to 224, between 2000-2010 and 2010-2020. The similarity index grew over time from 0.43 to 0.71. This means that researchers in medical IR have, over time, brought their terms closer together. On the other hand, the findings show that most new keywords (91 keywords) entered the literature and terminology of IR in the medical field during 2000-2010, indicating the growth of new concepts and dramatic changes in the thematic boundaries of the field in this decade. From 2010 to 2020, however, the number of new keywords fell by more than a third compared to 2000-2010 (to 57 keywords), indicating a relative slowdown in the growth rate of the subject domain (Figure 6).
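The overlap measures between two consecutive periods can be sketched with plain set operations. The text does not state which formula underlies the reported similarity index, so the sketch shows two common candidates, the Jaccard index and the inclusion index, as assumptions. The keyword sets and function names are hypothetical.

```python
def jaccard_index(first, second):
    """Shared keywords divided by all distinct keywords of both periods."""
    first, second = set(first), set(second)
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def inclusion_index(first, second):
    """Shared keywords divided by the size of the smaller keyword set."""
    first, second = set(first), set(second)
    smaller = min(len(first), len(second))
    return len(first & second) / smaller if smaller else 0.0


def overlap_summary(previous, current):
    """Counts shown on an overlap map for two consecutive periods."""
    previous, current = set(previous), set(current)
    return {
        "shared": len(previous & current),   # horizontal arrow
        "new": len(current - previous),      # upper incoming arrow
        "dropped": len(previous - current),  # upper outgoing arrow
    }


# Toy keyword sets (hypothetical) for two consecutive periods.
period_1 = {"mesh", "medline", "indexing", "thesaurus"}
period_2 = {"mesh", "medline", "ontologies", "text mining", "umls", "web"}

summary = overlap_summary(period_1, period_2)
print(summary)
print(jaccard_index(period_1, period_2), inclusion_index(period_1, period_2))
```

Whichever measure is used, a rising value across periods means that a growing share of the vocabulary carries over from one period to the next, which is the interpretation given above.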
Figure 7 shows the strategic diagram of scientific topics. In this diagram, the centrality index is on the x-axis and the density index is on the y-axis. The strategic diagram is used to determine and analyze the position of clusters and thematic concepts within a field, to describe the internal relationships and correlations of thematic clusters, and to illustrate their maturity and coherence. Centrality measures the relationship between one thematic area and the other thematic areas; it indicates the importance of a theme, and the larger the index, the more important the cluster is among the existing themes. The density index indicates the strength of the links that connect the words within a cluster (Abdollahzadeh 2019; Cobo et al. 2012).
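The two indices can be illustrated with a minimal sketch. It assumes Callon's definitions as they are commonly given for co-word analysis: link weights are equivalence indices e_ij = c_ij^2 / (c_i * c_j), centrality is 10 times the sum of the weights of links leaving the theme, and density is 100 times the sum of internal link weights divided by the number of keywords in the theme. The text does not spell out these formulas, so treat them, the toy data, and the function names as assumptions.

```python
def equivalence_index(co_occurrence, occurrences_i, occurrences_j):
    """e_ij = c_ij^2 / (c_i * c_j), a normalized co-occurrence weight."""
    return co_occurrence ** 2 / (occurrences_i * occurrences_j)


def centrality_and_density(theme, occurrences, co_occurrences):
    """Return (centrality, density) of a theme.

    `theme` is the set of keywords in the theme, `occurrences` maps each
    keyword to its document count, and `co_occurrences` maps a frozenset of
    two keywords to the number of documents containing both.
    """
    theme = set(theme)
    internal = 0.0  # links with both keywords inside the theme
    external = 0.0  # links with exactly one keyword inside the theme
    for pair, count in co_occurrences.items():
        first, second = tuple(pair)
        weight = equivalence_index(count, occurrences[first], occurrences[second])
        inside = (first in theme) + (second in theme)
        if inside == 2:
            internal += weight
        elif inside == 1:
            external += weight
    centrality = 10 * external
    density = 100 * internal / len(theme)
    return centrality, density


# Toy data (hypothetical): theme {a, b} linked to an outside keyword c.
toy_occurrences = {"a": 4, "b": 4, "c": 2}
toy_co_occurrences = {
    frozenset(("a", "b")): 2,
    frozenset(("b", "c")): 2,
}
centrality, density = centrality_and_density(
    {"a", "b"}, toy_occurrences, toy_co_occurrences
)
print(centrality, density)
```

A theme with strong links to keywords outside it scores high on centrality, while a theme whose own keywords co-occur heavily with each other scores high on density.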
Using the two indicators, centrality and density, the strategic diagram is divided into four quadrants. The topics in the upper right quadrant (first quadrant) are fully developed and are very important for the development of the main research structure in medical science. They are known as motor themes because of their high centrality and density. The placement of topics in this quadrant means that they have the strongest internal coherence and connection and are conceptually very close and related. Topics in the upper left quadrant (second quadrant) are still coherent but peripheral, each consisting of smaller specialized areas of science. Topics in the lower left quadrant (third quadrant) have low density and centrality and mainly reflect emerging or declining scientific topics. Topics in the lower right quadrant (fourth quadrant) are important in a research field but have not yet matured and have the potential to become major topics in the field (Abdollahzadeh 2019; Cobo et al. 2011; Ke et al. 2013; Melcer et al. 2015) (Figure 7).
To explain the situation more accurately, strategic diagrams are also presented based on the number of scientific products and on the citations to the scientific products of the field under study.
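The assignment of themes to quadrants can be sketched as a comparison against the midpoint of each axis. The sketch assumes that the axes cross at the median centrality and median density of all themes; the text does not state where the axes cross, and the theme names and values below are hypothetical.

```python
from statistics import median


def assign_quadrants(themes):
    """Map each theme to a quadrant number (1-4).

    `themes` maps a theme name to a (centrality, density) tuple.
    Quadrants follow the numbering used in the text: 1 = upper right,
    2 = upper left, 3 = lower left, 4 = lower right.
    """
    centrality_mid = median(c for c, _ in themes.values())
    density_mid = median(d for _, d in themes.values())
    quadrants = {}
    for name, (centrality, density) in themes.items():
        central = centrality > centrality_mid
        dense = density > density_mid
        if central and dense:
            quadrants[name] = 1
        elif dense:
            quadrants[name] = 2
        elif not central:
            quadrants[name] = 3
        else:
            quadrants[name] = 4
    return quadrants


# Toy themes (hypothetical) with (centrality, density) values.
toy_themes = {
    "theme A": (8.0, 9.0),
    "theme B": (2.0, 7.0),
    "theme C": (1.0, 2.0),
    "theme D": (9.0, 1.0),
}
quadrants = assign_quadrants(toy_themes)
print(quadrants)
```

With this split, a theme that is both central and dense lands in the first quadrant, while a central but loosely connected theme lands in the fourth.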
Based on the average number of citations to scientific products, the largest clusters include “Similarity measures” (40.41 citations), “Mechanism” (39.37 citations), and “Barriers” (34.82 citations). In the Similarity measures cluster, “Similarity measures” and “distance nodes”, with 11 documents, were the largest nodes, followed by “Sets” and “Topic Models” with 6 documents. In the Mechanism cluster, the largest nodes were “Mechanism” with 15 documents and “Single-molecule magnet” with 3 documents. In the Barriers cluster, the largest nodes were “Complexes” with 14 documents and “Barriers” with 4 documents.
Based on the number of documents, “Medical Informatics” (1281 documents), “Experience” (51 documents), and “Expert Systems” (45 documents) were the largest clusters. In the Medical Informatics cluster, “Medline Search” (224 documents), “Medical Informatics” (211 documents), “Database Management Systems” (189 documents), and “Ontology” (152 documents) were the largest nodes. In the Experience cluster, “Methodology and Experience” (15 documents) and “University Library” (12 documents) were the largest nodes. In the Expert Systems cluster, “Expert Systems” (13 documents), “Conceptual graph” (12 documents), “Interface” (11 documents), and “Case-based reasoning” (11 documents) were the largest nodes (Figure 8).
Analysis of these findings shows that, in the field of IR in medical sciences, the clusters Similarity measures, Expert systems, Concepts, Experience, Answers, and Multi-model IR are in the first quadrant of the strategic diagram. The second quadrant contains the Smartphone, Hybrid, Decision tree, RFID, and Feasibility study clusters. The third quadrant includes the Relational database, Mechanism, Clinical information systems, Medical terminologies, and Barriers clusters. The fourth quadrant contains the Health information exchanges, Metadata, and Medical informatics clusters.