Deep learning architecture optimization with metaheuristic algorithms for predicting BRCA1 / BRCA2 pathogenicity NGS analysis

doi:10.21203/rs.3.rs-1407523/v1

Research Article

https://doi.org/10.21203/rs.3.rs-1407523/v1

This work is licensed under a CC BY 4.0 License

Journal Publication: published 17 Apr, 2022 in BioMedInformatics

Version 1 (preprint)

BRCA1 and BRCA2 are tumor suppressor genes involved in a considerable number of biological processes that regulate the cell replication cycle. A mutation in either of these two genes has a significant probability of causing cancer. We have set up within the platform a machine learning algorithm based on random forest to predict pathogenicity in colorectal cancer, melanoma, lung cancer, and glioma, but this algorithm showed its limits when predicting on more complex genes such as BRCA1 and BRCA2. To help biologists classify tumors, we decided to develop a deep learning algorithm.

A central question when constructing a neural network is how many hidden layers and neurons to use. While the numbers of inputs and outputs are determined by the problem to be solved, the number of hidden layers and the number of neurons in each layer are difficult to choose because there is no pre-established rule, and both influence the performance of the system's predictions. Existing methods for finding the optimal architecture, such as grid search or empirical equations, can be very time-consuming. In this paper, we present the two packages that we have developed, one implementing a genetic algorithm (GA) and the other particle swarm optimization (PSO), to optimize the parameters of the neural network for predicting the pathogenicity of BRCA1 and BRCA2 variants, and we compare the results obtained by the two algorithms.

We trained the deep learning models on datasets collected from our NGS analysis of the BRCA1 and BRCA2 genes, comprising 11,875 variants (BRCA1: 2,632 benign and 2,660 pathogenic; BRCA2: 3,446 benign and 3,137 pathogenic). Our preliminary results show that PSO found the best-performing architecture, in terms of the number of hidden layers and neurons, compared with grid search and GA. The optimal architecture found by the PSO algorithm is composed of 6 hidden layers with 275 hidden nodes and achieves an accuracy of 0.98, a precision of 0.99, a recall of 0.98, and a specificity of 0.99.
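To illustrate the kind of search described above, the following is a minimal sketch of particle swarm optimization over two architecture parameters (number of hidden layers and neurons per layer). It is not the authors' package: the function names, the search bounds, the PSO coefficients, and especially `surrogate_score`, a cheap stand-in for the validation accuracy that training a real network would return, are all assumptions made for illustration.

```python
import random


def surrogate_score(n_layers, n_neurons):
    """Hypothetical stand-in for validation accuracy (NOT the paper's objective).

    A smooth bowl peaking at an arbitrary architecture, so the sketch runs
    without training a network. In practice this would train and evaluate
    a model with the given architecture.
    """
    return 1.0 - 0.01 * (n_layers - 4) ** 2 - 0.00001 * (n_neurons - 120) ** 2


def pso_architecture_search(score, bounds, n_particles=12, n_iters=40,
                            inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize score(*ints) over integer parameters within inclusive bounds."""
    rng = random.Random(seed)
    dims = len(bounds)

    def clip(x, lo, hi):
        return max(lo, min(hi, x))

    def decode(pos):
        # Particles move in continuous space; round to integers for evaluation.
        return tuple(int(round(clip(x, lo, hi))) for x, (lo, hi) in zip(pos, bounds))

    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dims for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]                  # personal bests
    best_val = [score(*decode(p)) for p in positions]
    g = max(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]    # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + pull toward personal
                # best + pull toward the swarm-wide best.
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest_pos[d] - positions[i][d]))
                lo, hi = bounds[d]
                positions[i][d] = clip(positions[i][d] + velocities[i][d], lo, hi)
            val = score(*decode(positions[i]))
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, positions[i][:]
                if val > gbest_val:
                    gbest_val, gbest_pos = val, positions[i][:]

    return decode(gbest_pos), gbest_val


if __name__ == "__main__":
    # Assumed search space: 1-10 hidden layers, 8-512 neurons per layer.
    arch, val = pso_architecture_search(surrogate_score, [(1, 10), (8, 512)])
    print("best (layers, neurons):", arch, "score:", val)
```

Each objective evaluation would correspond to one full training run in a real setting, so the number of particles and iterations directly bounds the compute cost. This is the budget on which a swarm-based search can be compared with an exhaustive grid.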

Bioinformatics

BRCA1

BRCA2

genetic algorithm

particle swarm optimization

deep learning

NGS analysis

neural network

artificial intelligence

Supplementary data not available with this version.
