Conceptual coverage
For the symptom and sign concepts (Supplementary material Table 1) both SNOMED-CT and MedDRA provided complete coverage, whilst MeSH covered 88%. FoodEx2, LanguaL and AGROVOC did not cover any of the symptom concepts but did provide superior coverage for food concepts (Table 2). AGROVOC covered 85% of the food concepts, LanguaL had complete coverage and FoodEx2 covered all of the food concepts except for sulphur dioxide. In addition, LanguaL uses the EFSA FoodEx2 coding and classification system for products in the European Union, which validates the legitimacy and effectiveness of FoodEx2.
Table 2
Comparison of six terminologies for the classification and coding of food allergy information.
Terminology | Conceptual coverage: symptom (n = 25) | Conceptual coverage: food (n = 47) | Classification | Concept description | Additional information |
SNOMED-CT | 25 | 42 | Concepts are arranged into a hierarchical structure. | Detailed and unambiguous definition. | Preferred term and synonyms. |
MeSH | 22 | 32 | Concepts are arranged into a logical and detailed hierarchical structure. | Detailed and unambiguous definition. | Provides related concepts (where available) for each term. |
MedDRA | 25 | 6 | Terms are arranged into a 5 tier branching system expanding from very specific to more general concepts. | No definition provided. | Includes a preferred name and related synonyms. |
FoodEx2 | 0 | 46 | Concepts are categorised into one of 21 groups and facets can encode additional detail including ingredients or processing techniques. | Clear description along with common and scientific name | N/A |
LanguaL | 0 | 47 | Concepts are classified systematically based on 14 key terms including product type, food source and cooking method. | Clear description along with common and scientific name | N/A |
AGROVOC | 0 | 40 | Terms are arranged in a hierarchical and non-hierarchical system to classify a range of concepts and to indicate related terms. | No description is provided, but the broader concept is included to provide context. | A list of related concepts is included for some terms. |
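The percentages quoted in the text can be recomputed from the counts in Table 2. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative and not part of the study.

```python
# Number of concepts covered by each terminology, copied from Table 2.
SYMPTOM_TOTAL = 25
FOOD_TOTAL = 47

covered = {
    "SNOMED-CT": {"symptom": 25, "food": 42},
    "MeSH": {"symptom": 22, "food": 32},
    "MedDRA": {"symptom": 25, "food": 6},
    "FoodEx2": {"symptom": 0, "food": 46},
    "LanguaL": {"symptom": 0, "food": 47},
    "AGROVOC": {"symptom": 0, "food": 40},
}


def coverage_percent(count: int, total: int) -> float:
    """Return coverage as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


coverage = {
    name: {
        "symptom": coverage_percent(counts["symptom"], SYMPTOM_TOTAL),
        "food": coverage_percent(counts["food"], FOOD_TOTAL),
    }
    for name, counts in covered.items()
}

for name, pct in coverage.items():
    print(f"{name:10s} symptom {pct['symptom']:5.1f}%  food {pct['food']:5.1f}%")
```

Rounded to whole numbers, the MeSH symptom coverage and the AGROVOC food coverage agree with the 88% and 85% reported in the text.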
Description of concepts
The term definitions provided by each of the coding systems were reviewed and compared with the definitions found in the literature (Supplementary material Tables 2 and 3). MedDRA does not provide formal definitions for the symptom and sign concepts and so fails to provide any additional clarity or description for each term. In contrast, SNOMED-CT and MeSH provide clear and unambiguous descriptions for each concept. This is illustrated in Table 3 using “urticarial rash” and “hazelnut” as an example sign and food respectively. Thus, both MeSH and SNOMED-CT include a detailed definition to describe the sign “Urticarial rash” which is consistent with the key definitions (Supplementary material Table 2). Similarly, FoodEx2, LanguaL and AGROVOC all provide a detailed description for each food term as well as detailing both the common and the Latin name for each food to reduce ambiguity (Supplementary material Table 3).
Table 3. Comparison of the definition of exemplar symptom (urticarial rash) and food (hazelnuts) terms according to the different terminologies.
Terminology | Definition |
Urticarial rash |
SNOMED-CT | “A raised, erythematous papule or cutaneous plaque usually representing short-lived dermal oedema.” |
MeSH | “A vascular reaction of the skin characterised by erythema and wheal formation due to localized increase of vascular permeability. The causative mechanism may be allergy, infection or stress.” |
MedDRA | N/A |
ThRAll approach | “A condition characterized by the development of wheals (hives), angioedema or both” |
Hazelnut |
SNOMED-CT | “Tree nut (substance)” |
MeSH | “A plant genus of the family BETULACEAE known for the edible nuts” |
FoodEx2 | “Tree nuts from the plant classified under the species Corylus avellana L., commonly known as Hazelnuts or Cobnuts or Common hazelnut. The part consumed/analysed is not specified. When relevant, information on the part consumed/analysed has to be reported with additional facet descriptors. In case of data collections related to legislations, the default part consumed/analysed is the one defined in the applicable legislation.” |
LanguaL | “The group includes kernels of the seeds of all species similar to Hazelnuts or similar nuts sharing the same pesticide to the maximum residue level (MRL) as Hazelnuts.” |
AGROVOC | “The fruit of small trees of shrubs of the Corylaceae family. The round-oval nuts are surrounded by a leafy involucre, which comes out easily when the fruit is mature. Remains the nut, with a pericarp, the shell, more or less woody, and depending on varieties. Inside is the seed, covered by a very thin tegument.” |
Where appropriate, MedDRA aggregates and highlights similar terms related to the specific concept and so allows comparable terms to be easily accessed. For example, ‘urticaria rash’ is the preferred name that classifies 33 related concepts including ‘urticaria localized’ and ‘generalised urticarial rash’, which provide varying levels of detail and alternative terms that include a level of clinical interpretation. This is useful but the large number of similar terms may reduce the specificity and level of detail initially identified by a term. In contrast, concepts in SNOMED-CT are associated with a unique Fully Specified Name (FSN); this is the ideal term that a clinician would use in a particular language, dialect or context. SNOMED-CT also identifies relationships to other similar and related concepts and considers the preferred name for ‘urticarial rash’ to be ‘weal’ with ‘wheal’, ‘welt’, ‘hives’ and ‘nettle rash’ being noted as alternative terms. Since concepts are coded in MeSH for the purpose of indexing publication records, each MeSH term can comprise several synonyms; for example, ‘urticaria’ also includes ‘urticarias’ and ‘hives’. All concepts that come under one record are considered equivalent and this is useful when trying to maximise the number of relevant articles identified in a search but not necessarily when identifying synonyms in the context of compiling data on food allergy in the ThRAll database.
LanguaL and FoodEx2 do not provide related concepts since this is not appropriate in the context of a food classification system as there would be no suitable synonyms. In certain cases AGROVOC provides equivalent terms and, where appropriate, states the broader and/or narrower relevant concepts as well as identifying what the product is produced from (i.e. the plant/animal species of origin).
Mechanistic basis to classify an adverse reaction
It was also important to compare the pathway that each of the terminologies used to classify an adverse reaction. Food can induce a range of adverse reactions and, although the symptoms and signs may be similar, the mechanistic bases of allergies, intolerances and sensitivities are completely different. An adverse reaction to food [28] encompasses both immune- and non-immune-mediated adverse reactions (Fig. 1). Sub-types of immune-mediated adverse reactions include those involving the development of food-specific IgE antibody responses. This type of food allergy results in symptoms and signs that appear immediately (in less than 2 h, usually within 30 minutes), sometimes even after ingesting a small dose of the allergen, and can involve multiple organs including the skin and the respiratory, digestive and cardiovascular systems. A second, well-defined type of immune-mediated adverse reaction to food is the non-IgE, T-cell mediated syndrome triggered by ingestion of gluten, known as coeliac disease (CD) [29]. This life-long disease involves sensitivity to gluten, and individuals with CD often present with gastrointestinal signs and symptoms, including diarrhoea, together with weight loss due to the malabsorption of nutrients. In contrast, food intolerance conditions are not immune mediated but nevertheless can be reproducibly induced following ingestion of specific foods. One example is lactose intolerance, in which individuals lack the lactase enzyme that digests lactose. Symptoms appear shortly after drinking milk or consuming dairy and are commonly reported as stomach pain, bloating and diarrhoea. Lactose intolerance differs from milk allergy, which is an IgE-mediated reaction in which symptoms appear within 2 h of consuming milk-containing foods.
The classification of an adverse reaction used in the ThRAll project (Fig. 1) was used to benchmark how the different terminologies classify an IgE-mediated food allergy (Fig. 2). SNOMED-CT considers the trigger of an allergic reaction as the causative food and then links this back to the allergic hypersensitivity. It provides an overview of the process of the reaction but also filters into specific details about the adverse response, including branching to the causative agent and a qualifier for the severity of the reaction. This is consistent with terminology from the World Allergy Organisation (WAO) and the European Academy of Allergy and Clinical Immunology (EAACI) [30]. MedDRA and MeSH classify food allergy from a disease perspective and then acknowledge the response as a consequence of ingesting the problematic food. MeSH also provides a logical and clear flow, using the descriptor “food hypersensitivity” as a type of “immediate hypersensitivity”, which is used as a synonym of IgE-mediated food allergy. This is linked to the causal food and the eliciting symptoms. In contrast, MedDRA uses a tree flow diagram to show how a food allergy is classified, but this is a simpler and less detailed pathway than those of MeSH and SNOMED-CT. Since the ThRAll project considers the reaction from a food and public health perspective, the SNOMED-CT approach to classifying a food allergy was considered the most appropriate.
Figure 3 illustrates how MeSH, MedDRA and SNOMED-CT classify coeliac disease as an example of a non-IgE immune-mediated adverse reaction to food. MedDRA and MeSH both provide multiple ways to classify the pathway of coeliac disease. These terminologies consider this disease as a nutritional or gastrointestinal disorder that leads to malabsorption, which clearly demonstrates that the reaction is triggered by food. MedDRA has a third pathway that classifies coeliac disease from a disease perspective as an autoimmune disorder; this is consistent with the ThRAll approach, which classifies coeliac disease as an immune-mediated reaction. Again, SNOMED-CT approaches the classification of coeliac disease differently from MedDRA and MeSH but still considers it a malabsorption syndrome caused by the ingestion of gluten. Figure 3 demonstrates that coeliac disease is an adverse response with a clear food trigger, yet the pathway to classify this reaction is very different from the classification of a food allergy and so these reactions should be considered separately.
Lastly, Fig. 4 shows the pathways used to describe a non-immune-mediated adverse reaction, taking the classification of lactose intolerance as the example. MedDRA, MeSH and SNOMED-CT all describe lactose intolerance as a metabolic or gastrointestinal disorder which disrupts the absorption of carbohydrates. The ThRAll approach classifies lactose intolerance as a non-immune-mediated reaction due to a disorder of an enzymatic process. The lack of lactase enzyme in patients with lactose intolerance causes the malabsorption of the carbohydrate lactose, which demonstrates the similarities between these classifications.
Classification schemes for symptoms and signs
MedDRA has a logical five-tier structure expanding from very specific to more general concepts. Lowest Level Terms (LLTs) represent the most specialised concepts, including symptoms, medical procedures and personal characteristics. An LLT can be the preferred term (PT) itself or a synonym of the PT [31]. Similar PTs are aggregated into High Level Terms (HLTs) based upon anatomy, pathology, physiology or aetiology. These HLTs are further categorised into High Level Group Terms (HLGTs), which are then grouped into one of 26 System Organ Classes (SOCs) providing the most general classification. This is a logical and methodical organisation system, but concepts are confined to these five levels and further clarification or granularity cannot be expressed beyond the LLT to indicate the severity or manifestation of a symptom. Concepts in SNOMED-CT can vary in their specificity; more general concepts are aggregated together and filter down to more specific terms. Relationships are used to portray a confirmed association between multiple concepts. The branching structure in SNOMED-CT makes it possible to code and represent clinical data at a level of detail that is appropriate to a range of different uses. In contrast, MeSH is a cataloguing system with a slightly different framework from the other terminologies, as terms do not identify clinical phenomena but instead represent a category. MeSH is used to categorise and retrieve records and is organised in a hierarchical structure with 16 primary categories that split into subcategories providing more detailed terms. A letter corresponding to a category and a number representing the hierarchical level provide an identifier for each term.
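The five-level MedDRA roll-up described above can be sketched as a chain of child-to-parent links. This is an illustration only: the LLT and PT names are those quoted earlier in the text, whereas the HLT, HLGT and SOC entries are placeholders and not real MedDRA terms, and the function name `roll_up` is hypothetical.

```python
from typing import Dict, List, Tuple

# MedDRA levels ordered from most specific to most general.
LEVELS = ["LLT", "PT", "HLT", "HLGT", "SOC"]

# Child term -> parent term. The two LLTs and the PT are quoted in the text;
# the HLT, HLGT and SOC names are placeholders, NOT real MedDRA entries.
PARENT: Dict[str, str] = {
    "urticaria localized": "urticaria rash",
    "generalised urticarial rash": "urticaria rash",
    "urticaria rash": "example HLT",
    "example HLT": "example HLGT",
    "example HLGT": "example SOC",
}


def roll_up(llt: str) -> List[Tuple[str, str]]:
    """Walk from a Lowest Level Term up to its System Organ Class."""
    path = [llt]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    if len(path) != len(LEVELS):
        raise ValueError(f"{llt!r} does not resolve through all five levels")
    return list(zip(LEVELS, path))


for level, term in roll_up("urticaria localized"):
    print(f"{level:5s} {term}")
```

Because the chain always has exactly five links, there is no slot in which to attach a severity or manifestation qualifier below the LLT, which is the limitation noted above.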
Classification schemes for food
There are many levels of granularity that need to be considered when describing and encoding a food product, including the origin of the food, the food matrix and processing techniques. For example, when considering a food allergy, we need to identify differences in the frequency or severity of a reaction [1], which may be affected by cooking technique (e.g. dry roasted compared to raw peanuts) or the food matrix/vehicle used (e.g. peanut butter, whole peanut, baked goods containing peanuts). It is useful to have a system with the ability to encode multiple levels of detail depending on the amount of data and information that is available. This is important given that the literature shows that some food allergens are sensitive to food-processing techniques and a high fat content may increase the allergenicity of the protein [32,33]. The structure of these different classification systems reflects their primary use, whether that is to understand nutritional value, physical characteristics or the type of food product.
FoodEx2 is arranged into 21 clearly defined food groups, such that every food aligns to exactly one group. The system is made up of base terms and facets; the base term is defined by a unique five-character alphanumeric code and represents the specific food within the hierarchy. Facets provide additional detail to the base term, such as the origin of the product, its ingredients and the process involved in its preparation. This additional information can be combined with the base term to provide a more detailed and complete description of the food product. This demonstrates the range of granularity and amount of information that has been considered in this classification system, making it useful for encoding allergenic foods and common matrices or derivatives used in oral food challenges.
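The combination of a base term with optional facets can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the class `FoodDescription`, the codes `X0001` and `X0002`, the facet names and the bracketed output format are all hypothetical and are not taken from the FoodEx2 catalogue or its official coding syntax. Only the five-character alphanumeric base-term rule comes from the text.

```python
import re
from dataclasses import dataclass, field
from typing import Dict

# Base-term codes are described in the text as five alphanumeric characters.
BASE_TERM_PATTERN = re.compile(r"[A-Za-z0-9]{5}")


@dataclass(frozen=True)
class FoodDescription:
    """A base term plus optional facets that add detail (hypothetical model)."""

    base_term: str
    facets: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not BASE_TERM_PATTERN.fullmatch(self.base_term):
            raise ValueError(f"invalid base term code: {self.base_term!r}")

    def describe(self) -> str:
        """Render the base term followed by its facets in sorted order."""
        if not self.facets:
            return self.base_term
        detail = "; ".join(f"{k}={v}" for k, v in sorted(self.facets.items()))
        return f"{self.base_term} [{detail}]"


# The codes below are made up for illustration, not real FoodEx2 codes.
raw_peanut = FoodDescription("X0001")
roasted_peanut = FoodDescription("X0001", {"process": "dry roasted"})
peanut_butter = FoodDescription("X0002", {"ingredient": "peanut", "process": "ground"})

for item in (raw_peanut, roasted_peanut, peanut_butter):
    print(item.describe())
```

The same base term can be recorded with or without facets, so a challenge record can carry as much or as little processing detail as the source data provide.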
LanguaL classifies food systematically according to 14 key concepts including product type, cooking method and packing medium. These concepts are used to encode the product and enable almost any food product to be classified to the level of detail that is required. A unique code is provided for each food concept, which can then be translated into multiple languages.
AGROVOC relates concepts in both a hierarchical and a non-hierarchical structure. In the hierarchy, a branching logic is used whereby terms become more specific and precise at each level; for example, ‘nuts’ is the broader concept for ‘hazelnuts’. The non-hierarchical relationships express related concepts where appropriate.
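The two kinds of AGROVOC relationship can be sketched with one mapping for broader concepts and another for related concepts. Only the hazelnuts-to-nuts link comes from the text; the `plant products` level and the related link to the species name are placeholders chosen for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Set

# Hierarchical links: concept -> its broader concept.
BROADER: Dict[str, str] = {
    "hazelnuts": "nuts",        # stated in the text
    "nuts": "plant products",   # placeholder level, not checked against AGROVOC
}

# Non-hierarchical links: concept -> related concepts (placeholder example).
RELATED: Dict[str, Set[str]] = {
    "hazelnuts": {"Corylus avellana"},
}


def broader_chain(concept: str) -> List[str]:
    """Return the broader concepts from nearest to most general."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain


def narrower(concept: str) -> List[str]:
    """Return the concepts whose broader concept is the one given."""
    return sorted(child for child, parent in BROADER.items() if parent == concept)


def related(concept: str) -> List[str]:
    """Return related concepts, or an empty list when none are recorded."""
    return sorted(RELATED.get(concept, set()))


print(broader_chain("hazelnuts"))
print(narrower("nuts"))
print(related("hazelnuts"))
```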
Figure 6 demonstrates how each of the standards classifies hazelnuts as an example of a food concept. The figure shows that LanguaL utilises the logic from FoodEx2 and that each of the terminologies distinguishes between plant and animal products before identifying the relevant concept from a list of key categories and then filtering into the specific species.
Encoding symptom severity
When considering the dose at which an allergen induces an objective reaction, it is useful to encode the severity of that reaction. Thus, when identifying the dose that elicits a reaction in p% of the allergic population (eliciting dose, EDp), it is useful to understand the proportion of individuals who presented with a mild, moderate or severe reaction at this dose. Some studies, such as iFAAM, have collected detailed information on severity, which can also be used to identify criteria for stopping a challenge [16]. This information can be used to identify eliciting doses that present a tolerable level of risk of reaction in allergic individuals, as observed in the Peanut Allergen Threshold study [13]. The classification of reaction severity is inherently subjective, arising from clinical interpretation of signs and symptoms, and many different approaches have been developed and compared [34,35]. SNOMED-CT codes do not always provide sufficient detail to describe severity as well as symptom presence, but this can be addressed by adding a qualifier code (mild, moderate or severe) to fully express the specific, observed sign or symptom. This is illustrated for the iFAAM oral food challenge record, where symptoms have been coded with both the symptom code and the term mild, moderate or severe using the approach of Sampson [36] (Table 4). For example, if a patient experienced one episode of diarrhoea, this would be coded as the SNOMED-CT code for diarrhoea paired with the code for mild severity: 62315008–255604002.
Table 4
Encoding severity alongside symptoms and signs using SNOMED-CT.
Symptom | SNOMED-CT code | Description for mild grade (SNOMED-CT code 255604002) | Description for moderate grade (SNOMED-CT code 6736007) | Description for severe grade (SNOMED-CT code 24484000) |
Pruritus | 418363000 | • Occasional scratching • Continuous scratching for > 2 min at a time | • Hard continuous scratching leading to excoriations • Scratching of palms, soles, genitals, scalp | |
Erythema | 247441003 | • Few areas of faint erythema • Areas of erythema (≤ 50%) | Generalised marked erythema (> 50%) | |
Urticarial rash (wheal) | 247472004 | Up to 10 new hives | Generalised involvement (> 10 new hives) | |
Angioedema | 41291007 | Mild lip oedema | • Significant lip oedema • Whole face oedema | |
Rhinitis | 70076002 | | • Rare bursts, occasional sniffing • < 10 bursts, frequent sniffing or intermediate rubbing of nose; • Long bursts, persistent rhinorrhoea or continuous rubbing | |
Ocular | | Intermittent rubbing of eyes | Continuous rubbing, periocular swelling, reddening | |
Wheezing | 56018004; 272040008 | | | • Expiratory wheezing to auscultation • Inspiratory and expiratory wheezing to auscultation • Use of accessory muscles or audible wheezing |
Gastrointestinal pain; nausea | 21522001; 422587007 | | • Complaints of nausea or abdominal pain • Frequent complaints of nausea or pain with abnormal activity • Notably distressed due to GI symptoms with decreased activity | |
Emesis | 422400008 | | • 1 episode of emesis • > 1 episode of emesis | |
Diarrhoea | 62315008 | | | • 1 episode of diarrhoea • > 1 episode of diarrhoea |
Signs and symptoms recorded in the iFAAM challenge record [16] were graded for severity using the approach of Sampson [36] as mild (Sampson grade 1), moderate (Sampson grades 2 and 3) or severe (Sampson grades 4 and 5) and then encoded using SNOMED-CT.
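The pairing of a symptom code with a severity qualifier code can be sketched as a simple lookup and join. The codes are those listed in Table 4; the function names and the dictionary layout are hypothetical, and the dash-separated pair mirrors the example given in the text rather than any official SNOMED-CT expression syntax.

```python
from typing import Dict, Tuple

# Symptom and severity codes copied from Table 4 (single-code symptoms only).
SYMPTOM_CODES: Dict[str, str] = {
    "pruritus": "418363000",
    "erythema": "247441003",
    "urticarial rash": "247472004",
    "angioedema": "41291007",
    "rhinitis": "70076002",
    "emesis": "422400008",
    "diarrhoea": "62315008",
}
SEVERITY_CODES: Dict[str, str] = {
    "mild": "255604002",
    "moderate": "6736007",
    "severe": "24484000",
}

# En dash, matching the example pair shown in the text.
SEPARATOR = "\u2013"

# Reverse lookups used when decoding a stored pair.
SYMPTOM_NAMES = {code: name for name, code in SYMPTOM_CODES.items()}
SEVERITY_NAMES = {code: name for name, code in SEVERITY_CODES.items()}


def encode(symptom: str, severity: str) -> str:
    """Pair a symptom code with a severity qualifier code."""
    return f"{SYMPTOM_CODES[symptom]}{SEPARATOR}{SEVERITY_CODES[severity]}"


def decode(pair: str) -> Tuple[str, str]:
    """Split a stored pair back into symptom and severity names."""
    symptom_code, severity_code = pair.split(SEPARATOR)
    return SYMPTOM_NAMES[symptom_code], SEVERITY_NAMES[severity_code]


pair = encode("diarrhoea", "mild")
print(pair, decode(pair))
```

An unknown symptom or severity raises `KeyError`, so records that fall outside the Table 4 vocabulary are rejected rather than silently stored.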