Inference of genetic networks using random forests: performance improvement using a new variable importance measure

doi:10.21203/rs.3.rs-737867/v1

Research

This work is licensed under a CC BY 4.0 License

Version 1

Background: Among the many methods proposed for genetic network inference, this study focuses on those based on random forests. These methods assign a confidence value to every candidate regulation. To our knowledge, all of them compute these values with the standard variable importance measure defined for tree-based machine learning techniques. We believe, however, that this measure has drawbacks when used to infer genetic networks.

Results: We therefore propose an alternative measure, which we call the "random-input variable importance measure," and design a new inference method that uses it in place of the standard measure in the existing random-forest-based inference method. Numerical experiments show that the random-input variable importance measure improves the performance of the existing method by as much as 45.5% with respect to the area under the recall-precision curve (AURPC).

Conclusion: This study proposed the random-input variable importance measure for the inference of genetic networks. Using this measure improved the performance of the random-forest-based inference method. Although we evaluated the measure only on several genetic network inference problems, the experimental results suggest that it will also work well in other applications of random forests.
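The AURPC figure quoted in the Results scores a ranked list of candidate regulations against a known network. The abstract does not state how the authors integrate the curve, so the following is only a minimal sketch under one common convention (the step-wise "average precision" approximation). The function name `aurpc_average_precision` and its arguments are hypothetical and do not come from the paper.

```python
from typing import Sequence


def aurpc_average_precision(confidences: Sequence[float],
                            is_true: Sequence[bool]) -> float:
    """Approximate the area under the recall-precision curve.

    Assumption: the area is taken as average precision, i.e. the mean of
    the precision values at the ranks where a true regulation is found.
    Other conventions (e.g. trapezoidal integration) give slightly
    different numbers.
    """
    if len(confidences) != len(is_true):
        raise ValueError("confidences and is_true must have the same length")
    n_true = sum(1 for t in is_true if t)
    if n_true == 0:
        raise ValueError("at least one true regulation is required")

    # Rank candidate regulations from highest to lowest confidence.
    # Ties keep their input order because Python's sort is stable.
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)

    hits = 0
    area = 0.0
    for rank, i in enumerate(order, start=1):
        if is_true[i]:
            hits += 1
            # Precision at this rank, weighted by the recall step 1/n_true.
            area += (hits / rank) / n_true
    return area


if __name__ == "__main__":
    # Four candidate regulations; the 1st and 3rd ranked ones are real.
    print(aurpc_average_precision([0.9, 0.8, 0.7, 0.6],
                                  [True, False, True, False]))
```

A ranking that places every true regulation ahead of every false one scores 1.0 under this convention, so a relative improvement such as the 45.5% reported above would be computed from two such scores on the same benchmark.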

Bioinformatics

Molecular Biology

genetic network inference

random forest

variable importance measure
