External validation results for acute toxicity of Daphnia magna
The 36 organic chemicals assessed in this study represent a diverse array of commercial substances. They include olefins, halides, nitrobenzene, perfluorooctane sulfonate, phenols, aldehydes, diphenyl ether, biphenyl, and phenylamine. Heavy metals are not included in the QSAR prediction. The external validation results for the acute toxicity of Daphnia magna are shown in Table 3.
Table 3
Results of predicted toxicities and categories of Daphnia magna for the validation set
CAS No. | Exp. data | Predicted data (LC50, mg/L) | |
ECOSAR | T.E.S.T. | QSAR Toolbox | Danish | VEGA | Read Across | KATE |
LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. |
120-82-1 | 1.68 | 4 | 2.115 | 4 | 2.88 | 4 | 2.37 | 4 | 1.14 | 4 | 2.69 | 4 | 20.0 | 5 | 0.84 | 3 |
81-15-2 | 0.15 | 3 | 1.33 | 4 | 2.41 | 4 | 1.33 | 4 | 0.46 | 3 | 6.71* | 4 | 0.042* | 2 | 1.6 | 4 |
85535-84-8 | 0.53 | 3 | 0.12* | 3 | 0.89 | 3 | NA | NA | 0.12* | 3 | 0.775* | 3 | NA | NA | 0.059* | 2 |
0075-09-2 | 27.0 | 5 | 127 | 6 | 59.06 | 5 | 146 | 4 | 127 | 6 | 81.92* | 5 | 121 | 6 | 80.0 | 5 |
50-00-0 | 29.0 | 5 | 46.086 | 5 | NA | NA | 46.1 | 5 | 46.086 | 5 | NA | NA | 92.0 | 5 | 13.0 | 5 |
77-47-4 | 0.039 | 2 | 0.24 | 3 | 1.04 | 4 | 0.208 | 3 | 0.24 | 3 | 4.88 | 4 | 1.45 | 4 | 0.027 | 2 |
25637-99-4 | 0.0032 | 1 | 0.004* | 1 | 0.16 | 3 | 0.0035 | 1 | 0.0013* | 1 | 0.477* | 3 | 0.327* | 3 | 0.0026* | 1 |
91-20-3 | 1.96 | 4 | 6.199 | 4 | 8.14 | 4 | 1.85 | 4 | 1.85 | 4 | 2.95 | 4 | NA | NA | 3.3 | 4 |
1763-23-1 | 37.04 | 5 | 16.916 | 5 | NA | NA | NA | NA | 0.385 | 3 | 2.51* | 4 | NA | NA | 26.0 | 5 |
307-35-7 | 130 | 5 | 0.005 | 1 | NA | NA | 0.005 | 1 | 0.385 | 3 | 2.51* | 4 | 43.5* | 5 | 1.6 | 4 |
2795-39-3 | 130 | 5 | 3.035 | 4 | NA | NA | NA | NA | 0.385 | 3 | NA | NA | 405 | 6 | NA | NA |
25154-52-3 | 0.14 | 3 | 0.063 | 2 | 0.36 | 3 | 0.19 | 3 | 0.19 | 3 | NA | NA | 2.46 | 4 | 0.17* | 3 |
84852-15-3 | 0.19 | 3 | 0.084 | 2 | 0.42 | 3 | 0.907 | 3 | 0.712 | 3 | 4.88* | 4 | 2.46 | 4 | 0.17* | 3 |
9016-45-9 | 0.148 | 3 | 1.821 | 4 | 8.94 | 4 | 0.308 | 3 | 1.371 | 4 | 0.171 | 3 | 0.8 | 3 | 1.4* | 4 |
67-66-3 | 29.0 | 5 | 127.56 | 6 | 77.4 | 5 | 144 | 6 | NA | NA | 17.07 | 5 | 21.4 | 5 | 29.0 | 5 |
79-01-6 | 43.01 | 5 | 29.819 | 5 | 36.1 | 5 | 7.96 | 4 | 11.08 | 5 | 5.27* | 4 | 20.6 | 5 | 9.4 | 4 |
R2 | 0.3223 | | 0.2095 | | 0.2277 | | 0.3729 | | 0.1236 | | 0.5766 | | 0.6349 | |
R2AD | 0.1615 | | 0.2095 | | 0.2277 | | 0.1605 | | 0.0400 | | 0.5148 | | 0.4063 | |
Accuracies | 44% | | 50% | | 44% | | 56% | | 25% | | 31% | | 56% | |
* indicates that the chemical is outside the applicability domain (AD); NA indicates that no prediction is available.
Total accuracy measures the fraction of chemicals placed in the correct toxicity category, with a log-scale error within 1; missing predictions are excluded from the analysis. The squared correlation coefficient between the experimental and predicted data was calculated for each model, both over all predictions (R2) and over only those within the AD (R2AD). Based on the predictive power of classification of the entire data set into the six toxicity categories, the tested tools for Daphnia magna can be ranked in the following order, from the highest to the lowest performers: T.E.S.T. > Danish QSAR Database = QSAR Toolbox > KATE > ECOSAR = Read Across > VEGA. The total accuracy of T.E.S.T. is relatively high at 56%, and the R2 of the experimental and predicted values is 0.2095. The accuracies of both the Danish QSAR Database and the QSAR Toolbox are 44%; the corresponding R2 values are 0.3729 and 0.2277, and the R2AD values are 0.1605 and 0.2277. In total, 2, 8, 3, 2, and 5 chemicals are outside the ADs of the Danish QSAR Database, VEGA, Read Across, ECOSAR, and KATE, respectively. The accuracy, R2, and R2AD of KATE are 37%, 0.6349, and 0.4063, respectively, while those of ECOSAR are 31%, 0.3223, and 0.1615. The corresponding values for Read Across are 31%, 0.5766, and 0.5148, and those for VEGA are 25%, 0.1236, and 0.04.
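The category-based accuracy described above can be illustrated with a minimal Python sketch. The category boundaries are an assumption inferred from the LC50 ranges quoted in this section (< 0.01, 0.01 to 0.1, ..., > 100 mg/L), as is the choice of which category a value exactly on a boundary falls into; the function names are hypothetical. Missing predictions are skipped, following the description in the text, and the category lists are copied from the experimental and ECOSAR columns of Table 3.

```python
from bisect import bisect_right

# Assumed category boundaries in mg/L, inferred from the LC50 ranges quoted in
# the text; the source does not list them explicitly.
BOUNDS = [0.01, 0.1, 1.0, 10.0, 100.0]


def toxicity_category(lc50_mg_per_l):
    """Map an LC50 (mg/L) to a category from 1 (most toxic) to 6 (least toxic)."""
    return bisect_right(BOUNDS, lc50_mg_per_l) + 1


def category_accuracy(exp_cats, pred_cats):
    """Fraction of chemicals whose predicted category equals the experimental one.

    A predicted category of None (NA in the tables) is skipped.
    """
    pairs = [(e, p) for e, p in zip(exp_cats, pred_cats) if p is not None]
    if not pairs:
        return float("nan")
    return sum(e == p for e, p in pairs) / len(pairs)


# Experimental and ECOSAR categories for Daphnia magna, copied from Table 3.
exp_cats = [4, 3, 3, 5, 5, 2, 1, 4, 5, 5, 5, 3, 3, 3, 5, 5]
ecosar_cats = [4, 4, 3, 6, 5, 3, 1, 4, 5, 1, 4, 2, 2, 4, 6, 5]

accuracy = category_accuracy(exp_cats, ecosar_cats)
print(f"ECOSAR category accuracy: {accuracy:.0%}")
```

With these inputs, 7 of the 16 predicted categories match the experimental ones. Because each category spans one order of magnitude, a prediction can be wrong by less than 1 log unit and still land in a neighbouring category when the experimental value sits near a boundary.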
The prediction deviations (Fig. 1) of T.E.S.T., Read Across, and VEGA for hexachlorocyclopentadiene (CAS: 77-47-4) and hexabromocyclododecane (CAS: 25637-99-4) exceed 1 log unit, as do those of ECOSAR and QSAR Toolbox for perfluorooctyl sulfonic acid salts and perfluorooctyl sulfonyl chloride (CAS: 307-35-7). The deviations for the other substances fall within 1 log unit. Hexabromocyclododecane is outside the ADs of five in silico methods, and perfluorooctyl sulfonic acid salts and perfluorooctyl sulfonyl chloride are outside the ADs of two in silico methods.
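As a worked illustration of the 1 log unit threshold, the sketch below computes the deviation for hexachlorocyclopentadiene from the Daphnia magna values in Table 3. It assumes the deviation is the absolute difference of the log10-transformed LC50 values; the source does not state whether Fig. 1 plots signed or absolute deviations, and the function name is hypothetical.

```python
import math


def log_deviation(exp_lc50, pred_lc50):
    """Absolute difference between predicted and experimental LC50 in log10 units."""
    return abs(math.log10(pred_lc50) - math.log10(exp_lc50))


# Hexachlorocyclopentadiene (CAS 77-47-4), Daphnia magna, values from Table 3.
exp_lc50 = 0.039
predictions = {"ECOSAR": 0.24, "T.E.S.T.": 1.04, "VEGA": 4.88, "Read Across": 1.45}

for tool, pred in predictions.items():
    dev = log_deviation(exp_lc50, pred)
    verdict = "exceeds" if dev > 1 else "is within"
    print(f"{tool}: {dev:.2f} log units ({verdict} 1 log unit)")
```

Under this assumption, the T.E.S.T., VEGA, and Read Across predictions deviate by more than 1 log unit, while the ECOSAR prediction stays within it.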
External validation results for acute toxicity of Pimephales promelas
The external validation results for the acute toxicity of Pimephales promelas are shown in Table 4. Based on the predictive power of classification of the entire data set into the six toxicity categories, the tested tools for Pimephales promelas can be ranked in the following order, from the highest to the lowest performers: QSAR Toolbox > ECOSAR > Read Across > KATE > T.E.S.T. = Danish QSAR Database > VEGA. The total accuracy of QSAR Toolbox for the six categories is 44%, and its R2 and R2AD values are 0.318 and 0.2156, respectively. The accuracies of ECOSAR and Read Across are 44% and 39%; the corresponding R2 values are 0.2166 and 0.4144, and the R2AD values are 0.4229 and 0.4430. Moreover, 2, 3, 6, 2, 2, and 4 of the chemicals are outside the ADs of the Danish QSAR Database, QSAR Toolbox, VEGA, Read Across, ECOSAR, and KATE, respectively. The accuracy, R2, and R2AD of KATE are 38%, 0.2714, and 0.6793, respectively, while the corresponding values for the Danish QSAR Database are 27%, 0.1495, and 0.2381.
Table 4
Results of predicted toxicities and categories of Pimephales promelas for the validation set
CAS No. | Exp. data | Predicted data (LC50, mg/L) | |
ECOSAR | T.E.S.T. | QSAR Toolbox | Danish | VEGA | Read Across | KATE |
LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. | LC50 | Cat. |
120-82-1 | 0.7 | 3 | 2.822 | 4 | 1.84 | 4 | 1.81 | 4 | 1.36 | 4 | 2.67 | 4 | 15.8 | 5 | 2.4 | 4 |
81-15-2 | 0.2 | 3 | 0.23 | 3 | 0.07 | 2 | 0.198 | 3 | 0.28 | 3 | 1.06* | 4 | 0.273 | 3 | 1.4 | 4 |
85535-84-8 | 100 | 6 | 0.13 | 3 | 6.44 | 4 | 127* | 6 | NA | NA | NA | NA | 192* | 6 | 0.19* | 3 |
0075-09-2 | 330 | 6 | 249 | 6 | 316 | 6 | 284 | 6 | 249 | 6 | 311* | 6 | 31 | 5 | 180 | 6 |
50-00-0 | 23.9 | 5 | 12.5 | 5 | NA | NA | 17.9 | 5 | 12.5 | 5 | 0.776* | 3 | 31 | 5 | 11 | 5 |
77-47-4 | 0.007 | 1 | 0.346 | 3 | 0.33 | 3 | 0.091 | 2 | 0.34 | 3 | 0.242* | 3 | 0.355 | 3 | 0.035 | 2 |
25637-99-4 | 100 | 6 | 0.004* | 1 | 0.045 | 2 | 0.0048* | 1 | 0.004* | 1 | 9.62* | 4 | 40.0 | 5 | 0.011* | 2 |
91-20-3 | 0.96 | 3 | 9.249 | 4 | 7.27 | 4 | 6.31 | 4 | 3.19 | 4 | 6.12 | 4 | NA | NA | 8.8 | 4 |
307-35-7 | 4.7 | 4 | 0.045 | 2 | 0.24 | 3 | 0.0477 | 2 | 0.006 | 1 | 102 | 6 | 654* | 6 | 0.086 | 2 |
2795-39-3 | 9.5 | 4 | 23.6 | 5 | 0.57 | 3 | 74.5 | 5 | 0.006 | 1 | NA | NA | 654 | 6 | NA | NA |
25154-52-3 | 0.128 | 3 | 0.039 | 2 | 0.34 | 3 | 0.454 | 3 | 0.0449 | 2 | 0.139 | 3 | 0.146 | 3 | 0.083 | 2 |
84852-15-3 | 0.135 | 3 | 0.057 | 2 | 1.16 | 4 | 0.5 | 3 | 0.087 | 2 | 0.014* | 2 | 0.17 | 3 | 0.082* | 2 |
9016-45-9 | 5.00 | 4 | 2.239 | 4 | 9.95 | 4 | 0.93 | 3 | 0.021 | 2 | 16.21* | 5 | 0.215 | 3 | 3.5 | 4 |
67-66-3 | 121 | 6 | 242 | 6 | 72.24 | 5 | 41.9 | 5 | NA | NA | 54.14 | 5 | 24.3 | 5 | 68.0 | 5 |
79-01-6 | 44.52 | 5 | 11.21 | 5 | 30.49 | 5 | 45.6 | 5 | 9.95 | 4 | NA | NA | 52.3 | 5 | 23.0 | 5 |
1163-19-5 | 0.183 | 3 | 9.40E-07* | 1 | NA | NA | 9.45E-05* | 1 | 0.0004* | 1 | 0.7998 | 3 | 5.37 | 4 | 7.3E-06* | 4 |
127-18-4 | 8.4 | 4 | 5.4 | 4 | 15.65 | 5 | 7.44 | 4 | 2.86 | 4 | 11.84 | 5 | 0.24 | 3 | 7.1 | 4 |
75-07-0 | 30.8 | 5 | 33.83 | 5 | 36.99 | 5 | 34.3 | 5 | 37.49 | 5 | 31.94 | 5 | 27.4 | 5 | 36.0 | 5 |
R2 | 0.2166 | | 0.291 | | 0.3180 | | 0.1495 | | 0.4123 | | 0.4144 | | 0.2714 | |
R2AD | 0.4229 | | 0.2910 | | 0.2156 | | 0.2381 | | 0.1943 | | 0.2910 | | 0.6793 | |
Six categories accuracy | 44% | | 27% | | 44% | | 27% | | 22% | | 39% | | 35% | |
* indicates that the chemical is outside the applicability domain (AD); NA indicates that no prediction is available.
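The R2 and R2AD rows contrast the fit over all predicted chemicals with the fit restricted to chemicals inside a tool's AD. The source does not state how these coefficients were computed, so the following is only a sketch under stated assumptions: it takes the squared Pearson correlation of log10-transformed LC50 values, and the helper names are hypothetical. The rows are a subset of the ECOSAR column of Table 4, chosen for illustration, so the printed values are not expected to reproduce the table.

```python
import math


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


def log_r_squared(rows):
    """R2 on log10(LC50) for rows of (exp_lc50, pred_lc50, in_ad)."""
    exp_logs = [math.log10(e) for e, _, _ in rows]
    pred_logs = [math.log10(p) for _, p, _ in rows]
    return r_squared(exp_logs, pred_logs)


# (experimental LC50, ECOSAR LC50, inside AD?) for a subset of Table 4 rows.
# A '*' in the table is encoded as in_ad=False.
rows = [
    (0.7, 2.822, True),       # 120-82-1
    (0.2, 0.23, True),        # 81-15-2
    (330.0, 249.0, True),     # 0075-09-2
    (23.9, 12.5, True),       # 50-00-0
    (100.0, 0.004, False),    # 25637-99-4
    (0.183, 9.40e-07, False), # 1163-19-5
    (30.8, 33.83, True),      # 75-07-0
]

r2_all = log_r_squared(rows)
r2_ad = log_r_squared([row for row in rows if row[2]])
print(f"R2 = {r2_all:.3f}, R2AD = {r2_ad:.3f}")
```

In this subset, removing the two out-of-AD chemicals raises the coefficient, which is the pattern one would hope for if the AD flags identify unreliable predictions. The tables show that this does not hold for every tool.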
The deviations (Fig. 2) of the in silico methods for hexabromocyclododecane, hexachlorocyclopentadiene, and perfluorooctyl sulfonic acid salts and perfluorooctyl sulfonyl chloride exceed 1 log unit, similar to the predictions of acute toxicity of Daphnia magna. Hexabromocyclododecane is outside the ADs of five in silico methods, while perfluorooctyl sulfonic acid salts and perfluorooctyl sulfonyl chloride are outside the AD of one in silico method.
Uncertainty of the predictions
To improve regulatory confidence in and acceptance of toxicity predictions made by in silico methods, it is crucial to assess the uncertainty of the predictions and determine to what extent that uncertainty is acceptable. The sources of uncertainty are as follows: quality of the data, mechanisms of action, descriptors (experimentally measured or calculated properties, e.g. log KOW, and the relevance of the descriptors), statistical method, and AD[39]. Currently, no coherent mapping or definition of uncertainty exists with regard to QSAR models[40]. Chemicals in the regulatory data set such as hexabromocyclododecane, hexachlorocyclopentadiene, and perfluorooctyl sulfonic acid salts and perfluorooctyl sulfonyl chloride are not used in the training sets, as some amount of uncertainty is associated with them; the models are therefore inapplicable to them. For Daphnia magna, T.E.S.T. and VEGA do not provide any correct toxicity classifications in the ranges LC50 (median lethal concentration) < 0.01 mg/L, 0.01 mg/L < LC50 < 0.1 mg/L, and LC50 > 100 mg/L, and the same is true of their predictions for Pimephales promelas for LC50 < 0.01 mg/L. Moreover, none of the tools provide correct toxicity classifications for 0.01 mg/L < LC50 < 0.1 mg/L for Pimephales promelas. This shows that some chemicals are mispredicted as being extremely toxic (0.01 mg/L < LC50 < 0.1 mg/L and LC50 < 0.01 mg/L). Notably, the tools provide predictions regarding the acute toxicity of the chemicals to Daphnia magna for LC50 values ranging from less than 0.01 mg/L to greater than 100 mg/L. The tools provide poor predictions of the acute toxicity of the chemicals to Pimephales promelas when LC50 ranges from 0.01 mg/L to 0.1 mg/L. The numbers of correct classifications of acute toxicity for Daphnia magna and Pimephales promelas are shown in Figs. 3 and 4, respectively.
The results show that the predictions for perfluorooctyl sulfonic acid salts, perfluorooctyl sulfonyl chloride, hexachlorocyclopentadiene, decabromodiphenyl oxide, 1,2,4-trichlorobenzene, and musk xylene, all of which are composed of special structures, are poor.