Background
Biomarker identification is one of the major goals of functional genomics and translational medicine research. The advent of next-generation sequencing (NGS) has led to a sustained, exponential increase in large datasets that have the potential to enable novel biomarker identification for the early diagnosis of complex diseases and/or for patient/disease stratification. Once a biomarker has been identified, a validation study is necessary to assess its value. An appropriate and cost-effective study design is paramount, and calculating the required sample size remains a challenge that needs to be addressed.
Methods
The workflow of our tool, termed PowerTools, is based on the method described by Blaise et al. (2016) [1]. For the given input data sets, a simulation step first draws data from a multivariate normal distribution that accounts for the correlation between variables. Next, datasets of varying sizes are generated by random selection of samples. Depending on the outcome variable, either a classification or a regression mode is available. For binary classification, ANOVA and linear regression tests can be performed, after which performance metrics can be evaluated.
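The steps above can be illustrated with a minimal sketch. It is not the PowerTools code: the function name `simulate_power`, its parameters, the fixed pool size, and the choice to inject the effect as a mean shift in the first `n_effect` variables are all assumptions made for illustration. It simulates correlated data from a multivariate normal distribution fitted to pilot data, randomly selects subsets of different sizes, runs a per-variable one-way ANOVA, and reports true and false positive rates.

```python
import numpy as np
from scipy.stats import f_oneway


def simulate_power(pilot, effect_size, n_effect, sample_sizes,
                   n_repeats=20, alpha=0.05, pool_size=200, seed=0):
    """Illustrative power simulation; requires 0 < n_effect < number of variables."""
    rng = np.random.default_rng(seed)
    n_vars = pilot.shape[1]

    # Fit mean and covariance so simulated data keep the pilot correlations.
    mean = pilot.mean(axis=0)
    cov = np.cov(pilot, rowvar=False)
    sd = np.sqrt(np.diag(cov))

    # Assumption: the first n_effect variables are shifted by effect_size SDs.
    is_true = np.arange(n_vars) < n_effect
    shift = np.where(is_true, effect_size * sd, 0.0)

    # Simulation step: one large pool per group from a multivariate normal.
    controls = rng.multivariate_normal(mean, cov, size=pool_size)
    cases = rng.multivariate_normal(mean + shift, cov, size=pool_size)

    results = {}
    for n in sample_sizes:
        tpr, fpr = [], []
        for _ in range(n_repeats):
            # Generate a dataset of size n per group by random sample selection.
            a = controls[rng.choice(pool_size, size=n, replace=False)]
            b = cases[rng.choice(pool_size, size=n, replace=False)]
            # One-way ANOVA for every variable (column) at once.
            _, pvals = f_oneway(a, b, axis=0)
            called = pvals < alpha
            tpr.append(called[is_true].mean())
            fpr.append(called[~is_true].mean())
        results[n] = {"tpr": float(np.mean(tpr)), "fpr": float(np.mean(fpr))}
    return results


if __name__ == "__main__":
    # Hypothetical pilot data: 50 samples, 6 variables.
    pilot = np.random.default_rng(1).normal(size=(50, 6))
    for n, rates in simulate_power(pilot, 1.0, 2, [5, 20, 40]).items():
        print(n, rates)
```

In this sketch the true positive rate per sample size plays the role of estimated power; a regression mode would replace the ANOVA step with a per-variable linear regression against a continuous outcome.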
Results
We developed an online framework that streamlines power calculations to aid future omics study designs within a translational medicine research context. Our code is freely available on GitHub [2], and we provide a web interface that can be accessed online [3].
Conclusions
PowerTools offers the potential to support the design of appropriate and cost-effective follow-up omics studies.