Prediction and mechanistic analysis of Drug-Induced Liver Injury (DILI) based on chemical structure

doi:10.21203/rs.3.rs-16599/v1

Download PDF

Research

Prediction and mechanistic analysis of Drug-Induced Liver Injury (DILI) based on chemical structure

https://doi.org/10.21203/rs.3.rs-16599/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 18 Jan, 2021

Read the published version in Biology Direct →

You are reading this latest preprint version

Background: Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard are in silico models which aim at predicting the risk of DILI based on the compound structure. However, these models do yet show sufficient predictive performance or interpretability to be useful for decision making by themselves, the former partially stemming from the underlying problem of labeling the in vivo DILI risk of compounds in a meaningful way for generating machine learning models.

Results: As part of the Critical Assessment of Massive Data Analysis (CAMDA) “CMap Drug Safety Challenge” 2019 (http://papers.camda.info/), chemical structure-based models were generated using the binarized DILIrank annotations. Support Vector Machine (SVM) and Random Forest (RF) classifiers showed comparable performance to previously published models with a mean balanced accuracy over models generated using 5-fold cross-validation inside a 10-fold training scheme of 0.759±0.027 when predicting an external test set. In the models which used predicted protein targets as compound descriptors, we identified the most information-rich proteins which agreed with the mechanisms of action and toxicity of nonsteroidal anti-inflammatory drugs (NSAIDs), one of the most important drug classes causing DILI, and the previously established stress response pathways mediated by mitochondria, p38 MAPK and TP53. In addition, we identified multiple proteins involved in xenobiotic metabolism which could be novel DILI-related off-targets, such as CLK1 and DDR2. Moreover, we derived potential structural alerts for DILI with high precision, including furan and hydrazine derivatives; however, all derived alerts were present in approved drugs and were over specific indicating the need to consider quantitative variables such as dose.

Conclusion: Using chemical structure-based descriptors such as structural fingerprints and predicted protein targets, DILI prediction models were built with a predictive performance comparable to previous literature. In addition, we derived insights on proteins and pathways statistically (and potentially causally) linked to DILI from these models and inferred new structural alerts related to this adverse endpoint.

Biological Chemistry

DILI

mechanistic models

structural alerts

protein target

Drug-induced liver injury (DILI) is a major safety concern and one of the leading causes of drug failure in clinical drug development and market withdrawal, which can be found across nearly all classes of medication (1). DILI may occur either as a hepatitis or cholestatic injury or a mixed form of both and can be further distinguished between intrinsic and idiosyncratic DILI (1). If a drug is hepatotoxic in a dose-dependent manner both in preclinical models and humans (e.g. acetaminophen) it is considered to cause intrinsic DILI. Idiosyncratic DILI, on the other hand, is characterized by the lack of a clear dose-dependency and its rarity (usually less than 1 of 10,000 treated patients develops DILI symptoms). In contrast to intrinsic DILI, idiosyncratic DILI is the result of a patient’s rare combination of genetic and non-genetic risk factors, which is responsible for their susceptibility towards the drug (2). Consequently, in most cases idiosyncratic DILI cannot be detected in preclinical studies (3). The idiosyncratic nature of DILI also impedes its prediction with quantitative structure-activity relationship (QSAR) models, as idiosyncrasy implies that the underlying cause lies beyond inherent compound properties. Due to the low incidence of DILI, revealing causal links between the use of a drug and an observed liver injury is a difficult task (4), which decreases the confidence in provided DILI labels and further complicates the building of QSAR models with high predictivity.

The limited capability of animal models to detect hepatotoxic compounds raises the need for alternative testing strategies including in vitro and in silico models, as well as a better understanding of the underlying biology. Major challenges associated with the prediction of DILI using in vitro approaches lie in identifying relevant assays (5) and extrapolating from assay concentrations to in vivo blood concentrations associated with a hepatotoxic risk (6). Numerous in silico models have been generated based on molecular structure (7–12) and in vitro readouts, such as bioactivity (13) and gene expression (14) in cell culture, which are able to predict DILI better than random, but with a performance not yet sufficient for decision making in practice.

In the case of computational predictions, DILI is often simplified to a classification problem, i.e. separating compounds with or without this annotation in a data set (7–9,11,13). These labels, however, do not provide information on important factors such as dose-dependency or affected patient population, and consequently the practical applicability of such models is limited. While more detailed information on quantitative compound toxicity is difficult to retrieve, the strength of evidence for DILI is often provided in the datasets available. Paying attention to the quality of the data used for model generation has previously been shown to be relevant; for example, Kotsampasakou et al. (2017) (9) demonstrated that better models can be derived with smaller, but higher quality datasets.

The present work is derived from participation in the Critical Assessment of Massive Data Analysis (CAMDA) ‘CMap Drug Safety Challenge’ 2019 (http://papers.camda.info/) where the aim was to develop more predictive models for DILI from various different descriptor spaces. In this study, we retrieved compound hepatotoxicity annotations from the DILIrank (15) and SIDER (16) databases which were used as labels to generate compound-based DILI classifiers. The annotations in DILIrank were assigned by considering DILI-related market withdrawals and warnings in drug labels in combination with assessing causal links between the use of the drug and the occurrence of DILI. The drug is annotated as ‘DILI positive’ in two different severity classes (‘vMost-DILI concern’ and ‘vLess-DILI concern’) only if casual links to DILI could be confirmed. Drugs with existing concern but lack of causal proof were annotated as ‘Ambiguous DILI concern’, whereas drugs without concern were annotated as ‘vNo-DILI Concern’. The task set by the CAMDA challenge was to predict the labels of 55 drugs, which were previously annotated as ‘Ambiguous DILI concern’ and recently re-classified by the FDA. To this end, multiple descriptors were derived of which were used to build classification models for DILI: chemical fingerprints (17) describing the 2D compound structure, as well as Mordred molecular descriptors (18), and predicted protein targets inferred from chemical structure with PIDGIN (19–21). The predictivity of the resulting models was evaluated using two different external test sets.

In addition to predictive performance, we also focused on two practically relevant aspects of DILI prediction, namely the ability of models to extrapolate in chemical space, as well as the interpretation of relevant molecular and biological factors underlying DILI since interpretable models are more trusted, for example by regulatory agencies (22). To gain insights into biological processes, the protein targets with significantly higher binding probability in DILI compounds and the highest information for DILI classification were extracted from the protein target-based machine learning models. Based on these, we identified biological processes associated with DILI labels in the current dataset using the Reactome pathway maps (23) to show that mechanistic understanding of the biology underlying DILI can be obtained from this chemical structure-based feature space.

From the purely chemical side, we derived interpretable structural alerts related to DILI with the Molecular Substructure miner algorithm (MoSS) implementation of graph-based Molecular Fragment miner algorithm (MoFa) (24) and the fragment-based SARpy package (25), which could guide lead optimization to reduce the risk of DILI, as is currently standard practice for other toxicities (26). We then compared the quality of the derived structural alerts against the recent review of DILI related structural alerts by Liu et al. (2015) (27).

Predictive modeling

We first compared the performance of Support Vector Machine (SVM) and Random Forest (RF) models trained using different input descriptors to predict DILI positive compounds. To this end, three datasets with differing levels of DILI label confidence and size were compared (summarized in Table 1): “DILIrank (-vLessConcern)”, comprising DILIrank compounds labelled as either vMostConcern or vNoConcern (high confidence), “DILIrank” which additionally contains compounds from the DILIrank vLessConcern class (low confidence), and “DILIrank (+ SIDER)” which additionally includes inactives from the SIDER database (low confidence).

Table 1 Datasets used to generate predictive DILI models.


Dataset Name	Data samples				Binary class
	vLessConcern (n = 260)	vMostConcern (n = 174)	vNoConcern (n = 227)	SIDER (n = 262)	n(DILI)	n(NoDILI)
DILIrank (-vLessConcern)	-	DILI	NoDILI	-	174	227
DILIrank	DILI	DILI	NoDILI	-	434	227
DILIrank (+ SIDER)	DILI	DILI	NoDILI	NoDILI	434	489

It can be seen in Fig. 1 that models trained using Extended Connectivity Fingerprints of diameter 4 (ECFP4) descriptors show similar predictive performance for both the Leave-One-Cluster-Out cross-validation (LOCO-CV) and the external test set across all datasets for both the RF and SVM algorithms. For example, RF models trained using the DILIrank (-vLessConcern) dataset had a mean balanced accuracy of 0.734 ± 0.044 during cross-validation and 0.746 ± 0.032 for the external test set (Table S1). Secondly, all models achieved higher prediction accuracy than y-scrambling models (Fig. 1), demonstrating they all had a predictive power exceeding that of pure chance (28). Thirdly, for LOCO-CV and external test set a slightly better predictive performance was found using the highest confidence dataset in comparison to the lower confidence datasets (Fig. 1). For example, the SVM models the LOCO-CV mean balanced accuracy decreased from 0.714 ± 0.058 on the DILIrank (-vLessConcern) to 0.671 ± 0.043 (DILIrank) and 0.643 ± 0.045 (DILIrank (+ SIDER)). The mean external test set balanced accuracy decreased from 0.759 ± 0.027 (DILIrank (-vLessConcern)) to 0.697 ± 0.048 (DILIrank) and 0.709 ± 0.036 (DILIrank (+ SIDER)). These three findings were also observed for models trained using Mordred molecular descriptors (18) and protein target descriptors (19,20) (see Table S1, and Figures S1 and S2).

A significant drop in performance is seen for the majority of models on the FDA validation set with a balanced accuracy of below 0.6 (Fig. 1), which indicated that the models were less capable of generalizing to the FDA validation set than to the external test set. These findings were also observed for models trained using Mordred molecular descriptors (18) and protein target descriptors (19,20) indicating the limited generalization of models occurred irrespective of descriptor space. The best performing model across the most metrics was the SVM model trained using the DILIrank (-vLessConcern) dataset which utilized a linear kernel, a C parameter of 0.1, and a ‘one vs. rest’ decision function. This model achieved a mean balanced accuracy of 0.759 ± 0.03 and 0.655 ± 0.00 on the external test set and FDA validation set, respectively, thus demonstrating relatively high predictive power across the two independent test sets compared to all other models generated (Fig. 1 and Table S1).

We next investigated the relationship between compounds’ Tanimoto similarity to 5 nearest neighbors in the training set and classification performance for the external test set (Fig. 2a). This was achieved by generating an SVM model (with the same hyperparameters as the best model noted previously) where within each fold of a Leave-One-Out cross-validation (LOO-CV) scheme a compound’s predicted DILI label and Tanimoto similarity to the training set were retrieved (see Methods). Note that such an analysis for the FDA validation set was not possible as the DILIrank labels for this set of compounds were withheld. It was found that 65% of the compounds with a Tanimoto similarity to their 5 nearest neighbors in the training set between 0.0-0.2 were correctly classified (already comparable to the predictive performance on the FDA validation set for the same model - mean accuracy 0.673 ± 0.000), and this increased to 89% for compounds with 0.4–0.5 Tanimoto Similarity, and subsequently to 100% for compounds with a Tanimoto similarity greater than 0.5 (Fig. 2a). As similar inter-similarity distributions were found between the training dataset and both the external test and FDA validation sets (Fig. 2b), one could have naively anticipated a higher predictive performance (in line with the external test set) for the FDA validation compounds than seen in practice (Fig. 1).

Biological interpretation of protein targets

We next compared the median feature importance of proteins in the RF and SVM model based on the DILIrank (-vLessConcern) dataset which showed the best classification performance with protein target descriptors across LOCO-CV, external test set and FDA validation set (Figure S2). The Pearson correlation between the absolute respective feature importance is low (0.29) indicating that overall they identify different protein targets as being important for DILI classification.

Given that the focus is set on proteins with bioactivity related to DILI risk, we only further examined those that were significantly enriched in DILI-related compounds as determined by a Wilcoxon rank test. This included aldose reductase AKR1B1 which has been linked to APAP-induced oxidative stress and hepatotoxicity (29), the CYP enzymes CYP1A2 and CYP2C9 which are involved in xenobiotic metabolism in the liver (30), and the p38 kinase MAPK11 which is known to mediate stress-related signals in hepatotoxicity (31). Moreover, aldo-keto reductase family 1 member C (AKR1C3) is essential for Phase II drug metabolism pathways and Hypoxia-inducible factor prolyl 4-hydroxylase (P4HTM) inactivation has reported to have a protective role against DILI (32–34).

However, novel proteins were also identified such as Dual specificity protein kinase (CLK1). Interestingly, two of the identified novel proteins, namely Discoidin domain receptor Tyrosine Kinase 1 (DDR1) and Adenosine A1 Receptor (ADORA1), are members of the same protein families as DDR2 and ADORA2A, which are known in liver damage (35,36). In fact, DDR1 and DDR2 possess similar structure and shared functions in collagen-mediated signaling (31), and also the adenosine receptors ADORA1 and ADORA2 share physiologic functions (37,38). Furthermore, ADORA1 has been found to contribute to renal dysfunction associated with acute liver injury in rats, supporting a plausible involvement of this target in DILI (39). A full list of the proteins identified as containing the highest feature importance for classification of the current dataset with the RF and SVM methods and their known or potential links to hepatotoxicity are shown in Table S2.

In a next step, over-represented Reactome pathways (21) were determined among the top protein targets, which were significantly enriched in the DILI positive compounds (false discovery rate (FDR) < 0.05) and showed high feature importance in either the RF or SVM models. While results across different importance thresholds are shown in Figures S3 and S4, representative results of this analysis (with thresholds chosen to be an RF feature importance > 0.0018 and an SVM feature importance > 0.23) are shown in Fig. 4. From both the RF and SVM models, arachidonic acid metabolism and metabolism of lipids were identified as significantly over-represented processes (see Table S3), and the involvement of these two pathways in liver damage has been extensively characterized, especially for injuries induced by acetaminophen (40,41). Moreover, p53 signaling and mitochondrial biogenesis were retrieved by the RF models, which are key regulators of cellular stress response, and which also play a well-established role in DILI (42–44). The only process which is not known in DILI is related to RUNX1, which is involved in hematopoiesis (45) and was over-represented due to links to three proteins (EP300, CSNK2A2, CSNK2B) which are shared with TP53 signaling and lipid metabolism. Although not known in DILI specifically, increased expression of RUNX1 correlates with inflammation, fibrosis and NASH activity score in patients suffering from Non-Alcoholic Steatohepatitis (NASH) (46). In contrast, SVM identifies the Phase I metabolism and biological oxidation pathways as characteristic for DILI from the data (Fig. 4), which has previously been shown to be involved in the generation of toxic metabolites causing DILI (47). Hence, both algorithms prioritize proteins known to be involved in key processes in DILI. The same analysis with the lower-performing models based on the DILIrank and DILIrank (+ SIDER) datasets did not retrieve as many relevant proteins and pathways (results not shown).

Structural alerts

233 MoSS structural alerts (SAs) and 20 SARpy SAs were derived from the DILIrank (-vLessConcern) dataset, of which 23 and 11 were deemed significant (p-value ≤ 0.05), respectively. The number of derived SAs was sensitive to the parameters chosen and reflects the implementation of both algorithms (Methods). The quality of the inferred SAs, alongside 12 derived from the recent review literature of Liu et al. (2015) (5 of them were significant) (27), was assessed using multiple metrics (Table S4), with a particular focus on precision and coverage among compounds labelled as DILI positive (Fig. 5), with a summary of metrics per structural alert source in Table 2. Furthermore, an analysis of the occurrences of SAs in DrugBank (48) approved compounds was conducted.

Overall, for both data- and literature-derived structural alerts a common trade-off between precision and coverage was observed i.e. if a substructure had a high precision it rarely had a high coverage (Fig. 5). For example, the SAs benzene derivative (SARpy) and aniline derivatives (SARpy) had relatively low precision (0.47 and 0.65) relative to the mean precision of 0.85 ± 0.18 for SARpy SAs, but conversely had relatively high coverage (77.59% and 31.61%) relative to the mean coverage of 14.11 ± 22.54 for SARpy SAs. Table 2 affords a comparison between the precision and coverage of structural alerts from each of the three sources analyzed in this study. In particular it was noted that the SAs extracted by Liu et al. (2015) had lower coverage than those generated by MoSS and SARpy, but that SAs from all sources had high precision on average. Furthermore, it was seen that SAs from all methods were found in at least some approved compounds from the DrugBank database.

Table 2 Metrics for DILI-related significant structural alerts (p-value ≤ 0.05).

Source	Mean precision	Mean coverage (%)	Min. compound presence in DrugBank	Max. compound presence in DrugBank
Liu	0.88 ± 0.11	6.09 ± 1.55	25	89
MoSS	1.0 ± 0.0	18.73%±10.73	3	41
SARpy	0.85 ± 0.18	14.11 ± 22.54	11	1400

Mean precision and coverage (%) of significant structural alerts from the MoSS, SARpy, and Liu et al. (2015) indicated that highly precise structural alerts were generated from all three sources, but that these observed varying degrees of coverage. The minimum and maximum presence of structural alerts for each source in DrugBank approved compounds demonstrated that all SAs were found in at least some approved compounds, but that the absolute frequency varies significantly between SAs.

A maximum precision of 1 occurred for 29 out of 39 of the significant structural alerts, of which hydrazine derivative was seen for all three sources and had the highest coverage in DILI positive compounds (7.47%). Thus, despite the fact, for example, that SARpy could generate SAs with a precision of 1, these SAs seen were overly specific and only occurred in a very small proportion of DILI positive compounds.

Predictive modeling

Overall, the most predictive model generated in this study was an SVM model (linear kernel, C = 0.1, ‘one vs. rest’ decision function) trained using the DILIrank (-vLessConcern) dataset, and ECFP4 chemical structure descriptors. In contrast to non-linear models such as those generated using RF, the linear kernel utilized by this SVM model ensured a lack of interaction between input variables which may have facilitated the model’s improved generalization properties. This model ranked third and first for cross-validation (0.714 ± 0.058) and external test set (0.759 ± 0.03), respectively, by balanced accuracy, across a compendium of studies that sought to develop classification models for DILI, despite the significantly smaller dataset used for model training in this study (Table 3). It should be noted that datasets, features, and cross-validation schemes used across these studies vary and therefore performance metrics can hardly be compared directly. Among others, a LOCO-CV scheme was implemented in the present study to avoid predicting compounds with high Tanimoto similarity to the training data, which is more conservative than the random splits used by Kotsampasakou et al. (2017) (Table 3) and translated to a more rigorous evaluation of internal model performance.

Table 3 Comparison of performance indicators to several DILI classification models reported in the literature.


Model algorithm	Number of Compounds	CV scheme	CV Balanced accuracy (%)	CV Sensitivity	external test set Balanced accuracy (%)	external test set Sensitivity	References
RF	996 (541+/455-)	10-fold, random splits	0.645	0.680	0.588	0.536	Kotsampasakou et al. (2017)
SVM	1317 (571+/407-)	5-fold, splitting scheme unknown	0.767	0.948	0.597	0.848	Zhang et al. (2016)
Ensemble of RF and SVM models (5 total)	1241 (683+/558-)	5-fold, splitting scheme unknown	0.701	0.799	0.719	0.909	Ai et al. (2018)
Ensemble of eight different algorithms derived models (8 total)	1254 (636+/618-)	10-fold, splitting scheme unknown	0.783	0.818	0.716	0.773	He et al. (2019)
SVM	401 (174+/227-)	5-fold, Tanimoto similarity based GroupKFold	0.714± 0.06	0.697± 0.08	0.759 ± 0.03	0.724 ± 0.08	Present study

Literature model performance derived from He et al. (2019) (8). External test values quoted for the model developed in the present study are for the external test set. Despite being trained on the fewest compounds (401) and using a conservative LOCO-CV cross-validation scheme, the SVM model developed in the present study demonstrated robust predictivity between cross-validation and external test set.

Two key trends relating to training data quality and model bias were identified. Firstly, a large proportion of models, irrespective of descriptor type, showed consistent performance between cross-validation and the external test set, but then observed a steep drop in performance on the FDA validation set (Figs. 1, S1, and S2) despite showing similar chemical similarity distribution to the training set as seen for the external test set (Fig. 2b). One explanation for this is that both test sets populate different regions of chemical space with the model showing better performance in one area, or that while populating similar regions of chemical space, these are not clearly attributed to one of the DILI classes and intrinsically difficult to predict. An alternative explanation for the lack of generalization could also be attributed to the fact that compounds in the training data and external test set were labeled with higher confidence and hence the model might be able to distinguish between DILI positive and DILI negative compounds well. In contrast, the compounds of the FDA validation set, originally being labeled as ambiguous due to lack of clear evidence, might be inherently more difficult to predict. Deriving accurate DILI labels for compounds is a complex process given the uncertainty of causality assessment and the difficulty in trying to incorporate administration factors such as dose and patient populations. Moreover, phenomena such as idiosyncratic DILI which cannot usually be detected even in preclinical studies and occur only in subpopulations make the task of accurate DILI labeling even harder (3).

Secondly, across all descriptor types (ECFP4, Mordred molecular descriptors, and predicted protein targets) balanced accuracy in the cross-validation and on the external test set decreased as the training dataset was expanded from the high confidence dataset (DILIrank (-vLessConcern)) to either of the lower confidence datasets - DILIrank or DILIrank (+ SIDER) (Figs. 1, S1, and S2). This indicated that the inclusion of compounds from the vLessConcern class from DILIrank i.e. those with lower annotated evidence for DILI risk, as well as inactives derived by text-mining of package label inserts of marketed drugs (SIDER), harmed predictive performance despite increasing the number of training samples. While this is consistent with previous studies (9) which demonstrated that careful data curation can lead to improved performance, it should be noted that the sample size of the external test set and in particular the FDA validation set (49 compounds) were small. This makes it difficult to accurately evaluate model performance and accordingly also to confidently compare models (Fig. 1).

Larger datasets would be required to allow for enhanced fine-grained sampling of chemical space and the establishment of a model applicability domain. In the present study, the poor generalization to the FDA validation demonstrated that the relationship between chemical structure and the propensity to cause DILI is too complex for the model to learn from the small training dataset used (401 compounds). However, it must be noted that even if larger and higher quality datasets were acquired, model predictivity would still be limited as relevant information that may relate to the manifestation of DILI such as the influence of metabolism in the formation of hepatotoxic prodrugs were not considered in the descriptors used in the present study.

Protein targets

From the models which used predicted protein targets as features, we extracted biological processes by incorporating prior knowledge on bioactivity using PIDGIN and the functional contexts of proteins based on Reactome pathway maps. SVM and RF both identified arachidonic acid metabolism which is the mechanism of action and toxicity of NSAIDs, one of the most common causes for DILI (40,49,50). In addition, cellular stress response was identified as over-represented by Random Forest which is mediated via known pathways in DILI namely p53 signaling, and mitochondrial biogenesis which is initiated upon stress - which was in this case represented by the p38 kinases MAPK11 and MAPK12 (31–36,40–44). In contrast, SVM identified processes involved in xenobiotic metabolism, biological oxidations and phase I metabolism, which could contribute to the generation of toxic metabolites (47).

While the inferred biological processes have been known to be associated with DILI, this is not true for many of the proteins identified by feature importance themselves (Table S2, S3) such as CLK1 and DDR2. Given that the analysis was based on target binding probabilities, it can be hypothesized that these proteins might be off-targets directly (or indirectly) involved in the pathogenesis of DILI. The described workflow hence was able to derive functional hypotheses on biological processes from compound DILI annotations, which can subsequently be investigated experimentally.

Structural alerts

In this study, structural alerts related to DILI were derived using the SARpy (25) and MoSS (24) algorithms using the DILIrank (-vLessConcern) dataset. Both MoSS and SARpy derived structural alerts were found to be comparable to those reviewed by Liu et al. (2015) (27) in terms of precision and coverage. It should be noted that in contrast to the structural alerts of SARpy and MoSS which were explicitly derived and subsequently tested on the dataset used in this study, the structural alerts of Liu et. al. (2015) were derived using data from different sources, mainly LiverTox (51).

Table 4 DILI-related significant structural alerts (p-value ≤ 0.05) with the highest precision and coverage (%).

The following quality metrics are shown for the best substructures from MoSS, SARpy, and Liu et al. (2015): Precision, coverage in DILI positive compounds (%), p-value, as well as the number of approved compounds in Drugbank (44) which contain the substructure, alongside an example of such a structure.

Of the significant structural alerts obtained by SARpy, MoSS, and Liu et al. (2015) only hydrazine derivative (NN) was found to overlap between all of them (Table 4) and this obtained a precision of 1. However, a DrugBank (48) database search of the significant structural alerts showed that all of the significant structural alerts derived using MoSS occurred in at least 3 approved drugs, and those from SARpy and by Liu et al. (2015) occurred in at least 10 approved drugs (Table S4). For example, aniline derivative (SARpy) and carbamide derivative (SARpy), were present in 422 and 80 marketed drugs, respectively (Table S4). From the methodological angle it illustrated that whilst structural alerts can be informative about an increased probability of a compound being toxic, the presence of all structural alerts analyzed in this study in DrugBank approved compounds demonstrated they are not diagnostic of DILI in isolation. Administration dose is a key consideration to make when developing therapeutics and is not taken into account when simply screening for the presence of a structural alert. For example, hydrazine derivatives (shared between SARpy, MoSS, and Liu et al. (2015)) can increase muscle, neural, kidney, liver, blood and spleen toxicity (52), however it is present in e.g. procarbazin, which is a registered antineoplastic agent used in Hodgkin’s disease treatment and is an orphan drug for glioma (53). This example demonstrates that it can be beneficial to accept an increased toxicity risk in favor of prolonging the patient's life.

In this study, DILI classifiers were trained using data from the DILIrank and SIDER databases, by employing the SVM and RF algorithms with either ECFP4 fingerprints, Mordred molecular descriptors, or predicted protein targets as chemical structure-derived descriptors. The best predictive performance was seen when using more reliable data (excluding the DILIrank vLessConcern class and SIDER text-mined inactives), underlining the importance of data quality for such approaches. The performance of the best model in cross-validation and external test set achieved comparable performance to models reported in literature. On the other hand, performance on an additional test set provided by CAMDA (http://papers.camda.info/) was much lower, underlining the difficulty of accurately validating DILI models given the low number of labeled compounds. In the present study, the datasets used to evaluate the DILI model were both small, consisting of 80 (for the high confidence dataset) and 51 compounds respectively, with the latter proving much more difficult to predict despite the comparable structural similarities of the two datasets to the model training dataset.

Protein target descriptors achieved inferior predictive performance, but their advantage, compared to Mordred molecular descriptors and ECFP4 fingerprints, is that each individual feature corresponds to a protein and is hence interpretable from a biological perspective (also in combination with pathway annotations). Based on the feature importances in predictive models, it was hence possible to identify known and potentially novel key proteins involved in DILI, as well as important biological processes in drug-induced liver apoptosis, such as mitochondrial stress signaling, and the mechanism of action and toxicity of NSAIDs, which are known to be common causes for DILI.

Moreover, we inferred structural alerts with comparable precision and coverage to previously derived ones. However, due to the high structural diversity of DILI annotated compounds, the derived alerts were found to have rather low compound coverage by themselves. Moreover, all alerts were found to be present in approved drugs further highlighting the challenge in deriving practically useful structural alerts for DILI, and underlining the importance of quantitative factors such as dose when screening compounds for DILI. Hence, overall, this work achieved similar results as seen in previous studies with respect to performance of predicting DILI; on the other hand it introduced the utilization of biologically interpretable predicted protein targets to the field and underlined the importance of large and reliable dataset annotations when developing predictive models for DILI.

Data Preparation

Compounds’ SMILES strings were retrieved from the DILIrank database (1036 compounds) (15), and the SIDER 4.1 database (1430 compounds) (16). Side effects are recorded in SIDER using the preferred terms of the MedDRA (Medical Dictionary for Regulatory Activities), which provides a hierarchical organization of adverse events. Starting with the entire SIDER dataset (SIDER 4.1), all compounds with at least one reported side effect contained in the MedDRA’s System Organ Class hepatobiliary disorders were discarded to keep only drugs for which no liver-related side effects have been reported. The SMILES retrieved from DILIrank and SIDER were standardized using the Python package Standardiser of Atkinson et al. (2016) (54). This involved the removal of counterions and solvents and the neutralization of the remaining fragments if necessary. Moreover, tautomers were standardized according to the rules implemented in the standardizer. Subsequently, compounds that fell into at least one of the following categories were discarded: mixtures of more than one active ingredient, inorganic molecules, metal-organic compounds, and compounds with a molecular weight above 1 kDa. If compounds were present in both the DILIrank and the SIDER dataset, the compound from the SIDER inactive dataset was removed to avoid duplicate entries. The final set contained 923 compounds composed as follows: DILIrank: 174 vMost-DILI-Concern, 260 vLess-DILI-Concern, 227 vNo-DILI-Concern, SIDER: 262 compounds without reported liver-related side effects.

ECFP4 (17) hashed to 2,048 bits were generated using the Python library RDKit (version 2019.03.1.0) (55). 1,189 1D and 2D molecular descriptors were generated using the Python package Mordred (18). For the generation of models the values of the molecular descriptors were scaled to a Gaussian distribution with zero mean and unit variance using the StandardScaler function in the scikit-learn Python library (version 0.21.2) (56). Bioactivity for 1,673 human protein targets was predicted using the PIDGINv3 software (19–21). 10 µM was chosen as the bioactivity cut-off to consider highly and marginally active compounds. To get a prediction for every compound-target pair, no threshold for the applicability domain was applied. For 6 out of the 923 drugs (4 of them from SIDER) no protein target prediction was made, since their structures could not be standardized internally in the PIDGINv3 software.

In order to implement a OCO-CV scheme (to ensure similar compounds were not in different folds), we performed hierarchical clustering of compounds. Based on pairwise Tanimoto similarities calculated using ECFP4, a tree was generated using hierarchical clustering with the Nearest Point Algorithm implemented in SciPy (version 1.2.1) (57). Clusters were generated by cutting the hierarchical tree at a distance of 0.5, which resulted in compounds with a Tanimoto similarity of at least 0.5 being in the same cluster.

Model Generation

Overview

We used the SVM and RF implementations in the scikit-learn Python library (version 0.21.2) to train binary classification models for DILI. Models were developed for all three input feature spaces (fingerprints, protein targets, and Mordred molecular descriptors). In addition, we generated models using different subsets of data considering different DILIrank classes as well as additional inactives from the SIDER database (Table 1).

Model Hyperparameter Grid Search

Firstly, for SVMs (58) we used a classifier as implemented in the sklearn Python library and performed a hyperparameter grid search over the following parameters: Kernel: [‘linear’], Class weight: [‘balanced’], Decision function: [‘one vs. rest’], Shrinking: [‘True’], C: [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1]. Of the possible SVM kernels implemented sklearn, we only evaluated ‘linear’ as it alone allows for easy interpretation of model feature importances. Secondly, for RFs (59) we used a classifier as implemented in the sklearn Python library and performed a hyperparameter grid search over the following parameters: Bootstrap: [‘True’], Class weight: [‘balanced subsample’], Max. Tree depth: [10, 15, None], Min. samples per leaf: [1,2,3], Number estimators: [100, 200, 300, 400, 500, 750, 1000].

Training procedure

The predictive performance of models was evaluated using a 5-fold cross-validation inside a 10-fold training scheme. The original dataset was split into 10 stratified folds using the StratifiedKFold function in scikit-learn with parameters: n_splits=10, and shuffle=True to assess the impact of different training data on model predictive performance. Within each training fold, an internal grid search over model hyperparameters was conducted using an internal 5-fold cross-validation to select the best model per training fold. This best model in each fold, assessed by balanced accuracy, was then used to predict for the holdout external test set. A LOCO-CV scheme was implemented with the GroupKFold function in scikit-learn with clusters only containing compounds with a ECFP4 Tanimoto similarity (60) greater than or equal to 0.5. This cross-validation scheme was utilized irrespective of the descriptor type. In addition, baseline models were trained and evaluated using the same procedure as mentioned, but with prior y-scrambling of the output labels (3 runs of different random scramblings). To further evaluate model predictive performance, an FDA validation set composed of 49 compounds (a subset of 55 - the CAMDA organizers removed 6, of which were unknown to the authors) previously labeled as vAmbiguous-DILI-Concern, but later relabeled as DILI positive or DILI negative by the FDA was used as an additional test set.

To evaluate the relationship between the chemical similarity of a compound to its 5 nearest neighbors in the training set and model correct classification rate a LOO-CV was conducted for the best performing model (SVM, DILIrank (-vLessConcern, (Figure 2a). This required the calculation of the Tanimoto similarities between the training dataset and the left-out compound using ECFP4 fingerprints and predicting its DILI label within each LOO-CV fold. Furthermore, the Tanimoto 5 nearest neighbor inter-similarity of the FDA validation set to the training set was compared to the corresponding similarities of the external test set (Figure 2b). As we previously evaluated the performance of the model (SVM, DILIrank (-vLessConcern)) with 10 distinct external test sets (in a 10-fold cross-validation scheme; see above), the similarities are averaged across all 10 pairs of training and test set.

Balanced accuracy (equation 1) was primarily used to assess the predictive performance of the models. We also utilized specificity to compare models to those previously published in the literature (Table 3). Those metrics were calculated from a confusion matrix consisting of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).

Interpretation of protein targets

Proteins targets with higher binding probability in DILI positive compounds were determined using a one-sided Wilcoxon rank-sum test with a FDR of <0.05. Among those, proteins features with high median importance across the 10 train-test splits were identified for RF and SVM models, respectively. The feature importance for RF models implemented in scikit-learn (56) describes the decrease of node impurity achieved by a feature, averaged over all trees in the forest, as a fraction, so that the importance of all features included in the model sum up to 1 (59). In SVM models using linear kernels the importance of features is reflected by the magnitude of their coefficients describing the hyperplane (58). The sign indicates which class is favored by the presence of a given feature. The values used for further analysis were the median importance of a feature across the 10 train-test splits.

Over-enrichment analysis was performed using the ReactomePA R package (version 1.26.0) (23) and Reactome pathway maps. To this end, Uniprot IDs were mapped to Entrez gene IDs with the biomaRt package (61) and the whole list of PIDGIN target proteins was used as background. Only gene sets with a containing 10 or more genes were considered and p-values were adjusted using the Benjamini-Hochberg procedure. The analysis was performed using various feature importance thresholds scanning across the top quantile of absolute feature importance values.

Structural alerts

Derivation of structural alerts

Two algorithms for structural alert derivation were used, with the DILIrank (-vLessConcern) dataset as input - MoSS and SARpy. MoSS is a graph-based depth-first search method used for chemical substructure mining (24) and we used a KNIME (v3.7.2.) (63) implementation of MoSS in the current study. It derives potential structural alerts as “subgraphs" with only heavy atoms, which are neither SMILES nor SMARTS. Users might decide to approximate the subgraphs with SMARTS in order to match the substructure to molecules (denoted by SMILES). This program searches for frequent molecular substructures and discriminative fragments in a set of molecule graphs. In a graph, a vertex is a representation of an atom and an edge is a representation of a bond. Each vertex has attributes related to atom type, charge, and whether it is a part of an aromatic ring. Edges indicate the bond type. The search starts from the root of graph tree being a single atom and follows recursively through atoms linked to leaf atoms with subsequent bonds. Substructures are then created based on each state of the graph tree and are pruned if the substructure occurrence in the active class is lower than the defined minimum focus support (MFS).

In order to find a discriminative fragment, two thresholds should be defined by the user. The first one is the aforementioned MFS used for pruning and the second one is minimum complement support (MCS) i.e. the substructure occurrence in the inactive class. The following KNIME MoSS settings were chosen: ‘1%’ MFS (the minimum fraction of the fragment-contained chemicals in the DILI positive class - the true positive rate), ‘0.01%’ MCS (the maximum fraction of the fragment-contained drugs in the DILI negative class - the false positive rate). In addition, only substructures in which the number of bonds ranged from ‘2’ to ‘15’ were kept. Pure carbon fragments were ‘ignored’ and ring mining was ‘applied’.

SARpy is a string-based search method used for chemical substructure mining (25). Briefly, structural alerts in the form of SMARTS strings are generated by recursively breaking every combination of bonds working directly on the SMILES strings of the input dataset. Fragments are then internally validated against all compounds in the dataset, and then a reduced set of substructure “rules” is extracted. In this work’s implementation of SARpy (v.1.0) the fragmentize function parameters minAtoms and maxAtoms were set to 2 and 15 respectively, and the ‘target’ (i.e. DILI positive or DILI negative) was set to None. Structural alerts for DILI were extracted using the extract function with the parameters: 5 minHits, 1 minLR, and 0 minPrecision. These settings are identical to those used by Yang et al. (2017) (62), except that a precision threshold was not applied in order to generate a larger compendium of structural alerts to analyze.

Evaluation of structural alerts

Structural alerts’ SMARTS were matched to compounds’ SMILES using the RDKit HasSubstructMatch function in Python (RDKit 2018.09.3.0). Precision (equation 2) and coverage in DILI positive compounds were both used to assess the predictive performance of the structural alerts. In addition, significance measured by p-value was also calculated for each structural alert using the SciPy (version 1.3.0) stats module fisher_exact function with alternative parameter set to ‘greater’.

As previously mentioned, because MoSS utilizes a graph-based search approach it may not consider the slight difference between aromatic and aliphatic atoms, leading to mismatches when matching its substructures to SMILES. For example, in MoSS, ‘N-C’ can match both aminofuran (NC1=CC=CO1) and aminotetrahydrofuran (NC1CCCO1). However, in SMARTS, ‘C’ is different from ‘c’, so RDKit will not match ‘N-C’ with aminofuran, because the carbons are aromatic. Despite this, the significance of these structural alerts for all structural alerts is based on the presence calculated using RDKit.

To investigate a substructure’s presence in already approved and marketed drugs, substructures were matched to compounds in the DrugBank database (48) (v.5.1.4) using the RDKit (version 2018.09.1) (55) HasSubstructMatch function. This involved firstly standardizing compounds’ SMILES using the Python package Standardiser of Atkinson et al. (2016) (54). As some SMILES could not be standardized, this step reduced the total number of DrugBank compounds in the analysis from 2411 to 2136.

ADORA1: Adenosine A1 Receptor; AKR1C3: aldo-keto reductase family 1 member C; CAMDA: Critical Assessment of Massive Data Analysis; CLK1: Dual specificity protein kinase; DDR1: Discoidin domain receptor Tyrosine Kinase 1; DILI: Drug-induced liver injury; ECFP4: Extended Connectivity Fingerprints of diameter 4; FDR: false discovery rate; FN: false positive; FP: false negative; TN: true negatives; TP: true positives; LOCO-CV: Leave-One-Cluster-Out cross-validation; LOO-CV: Leave-One-Out cross-validation; MCS: minimum complement support; MedDRA: Medical Dictionary for Regulatory Activities; MFS: minimum focus search; MoFa: Molecular Fragment miner algorithm; MoSS: Molecular Substructure miner algorithm; NASH: Non-Alcoholic Steatohepatitis; NSAIDs: nonsteroidal anti-inflammatory drugs; P4HTM: hypoxia-inducible factor prolyl 4-hydroxylase; QSAR: Quantitative structure-activity relationship; RF: Random Forest; SA: structural alert; SVM: Support Vector Machine

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Availability of data and material

The datasets generated and/or analyzed during the current study are available in the GitHub repository https://github.com/anikaliu/CAMDA-DILI which will be released upon publication.

Competing interests

The authors declare that they have no competing interests.

Funding

AL was supported by funding from GlaxoSmithKline and PW by funding from the National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs).

Author contributions

MW, PW, AL and AE analyzed the performance of the predictive models generated by MW. AL analyzed protein targets and derived pathways, results of which were supported by literature research performed by AL and DD. The structural alerts analysis was performed by AMB and PW, and was revised by HY. AL, PW, MW, AMB, DD, HY wrote the manuscript. AL and AB supervised the study. All authors read and approved the manuscript.

Acknowledgments

We thank CAMDA and the FDA for organizing the CAMDA CMap Drug Safety Challenge 2019 and for providing us with the opportunity to present our contributions at the CAMDA/ISMB conference.

Author information

Not applicable.

David S, Hamilton JP. Drug-induced Liver Injury. US Gastroenterol Hepatol Rev. 2010 Jan 1;6:73–80.
Mosedale M, Watkins PB. Drug-induced liver injury: Advances in mechanistic understanding that will inform risk management. Clin Pharmacol Ther. 2017 Apr;101(4):469–80.
Chen M, Borlak J, Tong W. Predicting idiosyncratic drug-induced liver injury: some recent advances. Expert Rev Gastroenterol Hepatol. 2014 Sep;8(7):721–3.
Chen M, Vijay V, Shi Q, Liu Z, Fang H, Tong W. FDA-approved drug labeling for the study of drug-induced liver injury. Drug Discov Today. 2011 Aug 1;16(15):697–703.
Wink S, Hiemstra SW, Huppelschoten S, Klip JE, van de Water B. Dynamic imaging of adaptive stress response pathway activation for prediction of drug induced liver injury. Arch Toxicol. 2018 May 1;92(5):1797–814.
Albrecht W, Kappenberg F, Brecklinghaus T, Stoeber R, Marchan R, Zhang M, et al. Prediction of human drug-induced liver injury (DILI) in relation to oral doses and blood concentrations. Arch Toxicol. 2019 Jun 1;93(6):1609–37.
Ai H, Chen W, Zhang L, Huang L, Yin Z, Hu H, et al. Predicting Drug-Induced Liver Injury Using Ensemble Learning Methods and Molecular Fingerprints. Toxicol Sci. 2018 Sep 1;165(1):100–7.
He S, Ye T, Wang R, Zhang C, Zhang X, Sun G, et al. An In Silico Model for Predicting Drug-Induced Hepatotoxicity. Int J Mol Sci. 2019 Jan;20(8):1897.
Kotsampasakou E, Montanari F, Ecker GF. Predicting drug-induced liver injury: The importance of data curation. Toxicology. 2017 Aug 15;389:139–45.
Zhang C, Cheng F, Li W, Liu G, Lee PW, Tang Y. In silico Prediction of Drug Induced Liver Toxicity Using Substructure Pattern Recognition Method. Mol Inform. 2016;35(3–4):136–44.
Zhang H, Ding L, Zou Y, Hu S-Q, Huang H-G, Kong W-B, et al. Predicting drug-induced liver injury in human with Naïve Bayes classifier approach. J Comput Aided Mol Des. 2016 Oct 1;30(10):889–98.
Hong H, Thakkar S, Chen M, Tong W. Development of Decision Forest Models for Prediction of Drug-Induced Liver Injury in Humans Using A Large Set of FDA-approved Drugs. Sci Rep. 2017 Dec 11;7(1):1–15.
Liu J, Mansouri K, Judson RS, Martin MT, Hong H, Chen M, et al. Predicting Hepatotoxicity Using ToxCast in Vitro Bioactivity and Chemical Structure. Chem Res Toxicol. 2015 Apr 20;28(4):738–51.
Wang Z, Clark NR, Ma’ayan A. Drug-induced adverse events prediction with the LINCS L1000 data. Bioinformatics. 2016 Aug 1;32(15):2338–45.
Chen M, Suzuki A, Thakkar S, Yu K, Hu C, Tong W. DILIrank: the largest reference drug list ranked by the risk for developing drug-induced liver injury in humans. Drug Discov Today. 2016 Apr 1;21(4):648–53.
Kuhn M, Letunic I, Jensen LJ, Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016 Jan 4;44(Database issue):D1075–9.
Rogers D, Hahn M. Extended-Connectivity Fingerprints. J Chem Inf Model. 2010 May 24;50(5):742–54.
Moriwaki H, Tian Y-S, Kawashita N, Takagi T. Mordred: a molecular descriptor calculator. J Cheminformatics. 2018 Feb 6;10(1):4.
lhm30. lhm30/PIDGINv3 [Internet]. 2019 [cited 2019 Oct 28]. Available from: https://github.com/lhm30/PIDGINv3
Mervin LH, Afzal AM, Drakakis G, Lewis R, Engkvist O, Bender A. Target prediction utilising negative bioactivity data covering large chemical space. J Cheminformatics. 2015 Oct 24;7(1):51.
Aniceto N, Freitas AA, Bender A, Ghafourian T. A novel applicability domain technique for mapping predictive reliability across the chemical space of a QSAR: reliability-density neighbourhood. J Cheminformatics [Internet]. 2016 Dec 3 [cited 2018 Sep 20];8. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395519/
Idakwo G, Luttrell J, Chen M, Hong H, Zhou Z, Gong P, et al. A review on machine learning methods for in silico toxicity prediction. J Environ Sci Health Part C. 2018 Oct 2;36(4):169–91.
Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, et al. The Reactome Pathway Knowledgebase. Nucleic Acids Res. 2018 04;46(D1):D649–55.
Borgelt C, Meinl T, Berthold M. MoSS: a program for molecular substructure mining. In: Proceedings of the 1st international workshop on open source data mining frequent pattern mining implementations - OSDM ’05 [Internet]. Chicago, Illinois: ACM Press; 2005 [cited 2019 Oct 28]. p. 6–15. Available from: http://portal.acm.org/citation.cfm?doid=1133905.1133908
Ferrari T, Cattaneo D, Gini G, Bakhtyari NG, Manganaro A, Benfenati E. Automatic knowledge extraction from chemical structures: the case of mutagenicity prediction. SAR QSAR Environ Res. 2013 May 1;24(5):365–83.
Limban C, Nuţă DC, Chiriţă C, Negreș S, Arsene AL, Goumenou M, et al. The use of structural alerts to avoid the toxicity of pharmaceuticals. Toxicol Rep. 2018 Aug 31;5:943–53.
Liu R, Yu X, Wallqvist A. Data-driven identification of structural alerts for mitigating the risk of drug-induced human liver injuries. J Cheminformatics. 2015 Feb 11;7(1):4.
Lipiński PFJ, Szurmak P. SCRAMBLE’N’GAMBLE: a tool for fast and facile generation of random data for statistical evaluation of QSAR models. Chem Zvesti. 2017;71(11):2217–32.
Ahmed MME, Al-Obosi JAS, Osman HM, Shayoub ME. Overexpression of Aldose Reductase Render Mouse Hepatocytes More Sensitive to Acetaminophen Induced Oxidative Stress and Cell Death. Indian J Clin Biochem. 2016 Apr;31(2):162–70.
Zanger UM, Schwab M. Cytochrome P450 enzymes in drug metabolism: regulation of gene expression, enzyme activities, and impact of genetic variation. Pharmacol Ther. 2013 Apr;138(1):103–41.
Huang Y, Wu C, Ye Y, Zeng J, Zhu J, Li Y, et al. The Increase of ROS Caused by the Interference of DEHP with JNK/p38/p53 Pathway as the Reason for Hepatotoxicity. Int J Environ Res Public Health [Internet]. 2019 Feb [cited 2019 Oct 28];16(3). Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6388128/
Barski OA, Tipparaju SM, Bhatnagar A. The aldo-keto reductase superfamily and its role in drug metabolism and detoxification. Drug Metab Rev. 2008;40(4):553–624.
Chen W-D, Zhang Y. Regulation of Aldo–Keto Reductases in Human Diseases. Front Pharmacol [Internet]. 2012 [cited 2019 Oct 28];3. Available from: https://www.frontiersin.org/articles/10.3389/fphar.2012.00035/full
Laitakari A, Ollonen T, Kietzmann T, Walkinshaw G, Mennerich D, Izzi V, et al. Systemic inactivation of hypoxia-inducible factor prolyl 4-hydroxylase 2 in mice protects from alcohol-induced fatty liver disease. Redox Biol. 2019;22:101145.
Leitinger B. Discoidin domain receptor functions in physiological and pathological conditions. Int Rev Cell Mol Biol. 2014;310:39–87.
Chiang DJ, Roychowdhury S, Bush K, McMullen MR, Pisano S, Niese K, et al. Adenosine 2A Receptor Antagonist Prevented and Reversed Liver Fibrosis in a Mouse Model of Ethanol-Exacerbated Liver Fibrosis. PLOS ONE. 2013 Jul 18;8(7):e69114.
Huang J, Chen M-N, Du J, Liu H, He Y-J, Li G-L, et al. Differential Expression of Adenosine P1 Receptor ADORA1 and ADORA2A Associated with Glioma Development and Tumor-Associated Epilepsy. Neurochem Res. 2016;41(7):1774–83.
Borea PA, Gessi S, Merighi S, Vincenzi F, Varani K. Pharmacology of adenosine receptors: the state of the art. Physiol Rev. 2018;98(3):1591–625.
Ming Z, Fan Y, Yang X, Lautt WW. Contribution of hepatic adenosine A1 receptors to renal dysfunction associated with acute liver injury in rats. Hepatology. 2006;44(4):813–22.
Suciu M, Gruia AT, Nica DV, Azghadi SMR, Mic AA, Mic FA. Acetaminophen-induced liver injury: Implications for temporal homeostasis of lipid metabolism and eicosanoid signaling pathway. Chem Biol Interact. 2015 Dec 5;242:335–44.
Meng XJ, Wang JL. Arachidonic acid metabolism in galactosamine/endotoxin induced acute liver injury in rats. J Tongji Med Univ Tong Ji Yi Ke Xue Xue Bao. 1994;14(3):169–72.
Huo Y, Yin S, Yan M, Win S, Aung Than T, Aghajan M, et al. Protective role of p53 in acetaminophen hepatotoxicity. Free Radic Biol Med. 2017;106:111–7.
Pessayre D, Fromenty B, Berson A, Robin M-A, Lettéron P, Moreau R, et al. Central role of mitochondria in drug-induced liver injury. Drug Metab Rev. 2012 Feb;44(1):34–87.
Ramachandran A, Visschers RGJ, Duan L, Akakpo JY, Jaeschke H. Mitochondrial dysfunction as a mechanism of drug-induced hepatotoxicity: current understanding and future perspectives. J Clin Transl Res. 2018 May 28;4(1):75–100.
Ichikawa M, Yoshimi A, Nakagawa M, Nishimoto N, Watanabe-Okochi N, Kurokawa M. A role for RUNX1 in hematopoiesis and myeloid leukemia. Int J Hematol. 2013 Jun;97(6):726–34.
Kaur S, Rawal P, Siddiqui H, Rohilla S, Sharma S, Tripathi DM, et al. Increased Expression of RUNX1 in Liver Correlates with NASH Activity Score in Patients with Non-Alcoholic Steatohepatitis (NASH). Cells. 2019;8(10):1277.
Corsini A, Bortolini M. Drug-induced liver injury: the role of drug metabolism and transport. J Clin Pharmacol. 2013 May;53(5):463–74.
Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018 Jan 4;46(D1):D1074–82.
Chang CY, Schiano TD. Review article: drug hepatotoxicity. Aliment Pharmacol Ther. 2007 May 15;25(10):1135–51.
Xue-jun M, Jia-long W. Arachidonic acid metabolism in galactosamine/endotoxin induced acute liver injury in rats. J Tongji Med Univ. 1994;14(3):169–72.
Hoofnagle JH, Serrano J, Knoben JE, Navarro VJ. LiverTox: a website on drug-induced liver injury. Hepatol Baltim Md. 2013 Mar;57(3):873–4.
Hawks A, Hicks RM, Holsman JW, Magee PN. Morphological and Biochemical Effects of 1,2-Dimethylhydrazine and 1-Methylhydrazine in Rats and Mice. Br J Cancer. 1974 Nov;30(5):429–39.
Kim S-H, Yoo H, Chang JH, Kim C-Y, Chung DS, Kim SH, et al. Procarbazine and CCNU Chemotherapy for Recurrent Glioblastoma with MGMT Promoter Methylation. J Korean Med Sci. 2018;33(24):e167.
Atkinson F. flatkinson/standardiser [Internet]. 2019 [cited 2019 Oct 28]. Available from: https://github.com/flatkinson/standardiser
RDKit [Internet]. [cited 2019 Oct 28]. Available from: https://www.rdkit.org/
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12:2825–30.
Jones E, Oliphant T, Peterson P, others. SciPy.org — SciPy.org [Internet]. [cited 2019 Oct 28]. Available from: https://www.scipy.org/
Cortes C, Vapnik V. Support-Vector Networks. In: Machine Learning. 1995. p. 273–297.
Breiman L. Random Forests. Mach Learn. 2001 Oct 1;45(1):5–32.
Bajusz D, Rácz A, Héberger K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J Cheminformatics. 2015 May 20;7(1):20.
Durinck S, Moreau Y, Kasprzyk A, Davis S, De Moor B, Brazma A, et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinforma Oxf Engl. 2005 Aug 15;21(16):3439–40.
Yang H, Li J, Wu Z, Li W, Liu G, Tang Y. Evaluation of Different Methods for Identification of Structural Alerts Using Chemical Ames Mutagenicity Data Set as a Benchmark. Chem Res Toxicol. 2017 19;30(6):1355–64.

Download PDF

Journal Publication

published 18 Jan, 2021

Read the published version in Biology Direct →

Review #2 received at journal
19 Aug, 2020
Reviewer #2 agreed at journal
31 Jul, 2020
Reviewer #1 agreed at journal
29 Jul, 2020
Reviewers invited by journal
28 Jul, 2020
Editor assigned by journal
09 Mar, 2020
Editor invited by journal
08 Mar, 2020
Submission checks completed at journal
05 Mar, 2020
First submitted to journal
27 Feb, 2020

You are reading this latest preprint version

Prediction and mechanistic analysis of Drug-Induced Liver Injury (DILI) based on chemical structure

Status:

Journal Publication

Version 1

Abstract

Figures

Background

Results

Discussion

Conclusions

Methods

Data Preparation

Model Generation

Overview

Model Hyperparameter Grid Search

Training procedure

Interpretation of protein targets

Structural alerts

Derivation of structural alerts

Evaluation of structural alerts

Abbreviations

Declarations

References

Supplementary Files

Status:

Journal Publication

Version 1